Just 14 short days ago, I wrote about the August Mozscape index update. Today, as part of our efforts to create shorter deltas between indices, I'm excited to announce that we have our fastest ever time between updates. There's new data right now in the Mozscape API (for which we're still seeking beta testers on the new version), in Open Site Explorer, through the Mozbar, and in your PRO web app.

This current index has the following metrics:

  • 60,852,245,271 (60 billion) URLs
  • 657,072,652 (657 million) Subdomains
  • 153,355,227 (153 million) Root Domains
  • 610,557,978,730 (610 billion) Links
  • Followed vs. Nofollowed - 2.26% of all links found were nofollowed
    • 54.95% of nofollowed links are internal
    • 45.05% are external
  • Rel Canonical - 13.46% of all pages now employ a rel=canonical tag
  • The average page has 70 links on it
    • 59.91 internal links on average
    • 10.57 external links on average

And the following correlations with Google's US search results:

  • Page Authority - 0.34
  • Domain Authority - 0.24
  • MozRank - 0.20
  • Linking Root Domains - 0.24
  • Total Links - 0.20
  • External Links - 0.24

Below is a histogram showing this update's crawling pattern:

2nd August Mozscape Index Crawl Histogram

Basically, this is very good news. We had an outage of our crawler in early June, but the large amounts of crawling performed in late July mean a lot of this index is extremely fresh - in fact, parts of this index are the freshest we've ever had (launched ~20 days after crawling - that's some speedy processing).

Why do Domain Authority & Page Authority Fluctuate?

Every index, we get a lot of questions about why a site's/page's PA/DA goes up or down. The answer's not easy because the inputs vary quite a bit, but basically, four things can cause change in these metrics from index to index:

  1. The site/page gained or lost links, or the links pointing to it became more or less powerful. Your site's link profile may even remain completely unchanged and still see fluctuation in DA/PA because the sites pointing to you have been recalculated to have better or worse metrics.
  2. Google changed things in their ranking algorithm, and thus our models for DA/PA, which measure and attempt to track correlation with Google's rankings, changed too.
  3. The web's link graph changed, and what was "0" (the lowest possible score) is now lower/higher than before and/or what was "100" (the highest possible score) is now higher/lower than before. Essentially, think of this as the goalposts moving because the field's gotten bigger or smaller.
  4. Our web index changed in size/structure as we toss out more spam/junk and crawl more/fewer webpages, potentially biasing against links we were counting or hadn't counted in prior indices.
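
To make the "goalposts moving" idea in point 3 concrete, here's a toy sketch. It is not how DA/PA are actually calculated; the log-scaled min-max formula, the function name scaled_score, and all of the raw values are assumptions made up purely for illustration. It shows how a site whose own raw link strength hasn't changed can still get a lower score when the top of the range grows.

```python
import math


def scaled_score(raw, lowest, highest):
    """Map a raw value onto a 0-100 scale with log-scaled min-max normalization.

    Hypothetical formula for illustration only; not Moz's actual model.
    """
    span = math.log(highest) - math.log(lowest)
    return 100 * (math.log(raw) - math.log(lowest)) / span


# Hypothetical raw "link strength" for one site; identical in both indices.
my_raw = 1_000

# In the first index the strongest site has a raw value of 1,000,000.
before = scaled_score(my_raw, lowest=1, highest=1_000_000)

# In the next index the strongest site has grown to 100,000,000.
after = scaled_score(my_raw, lowest=1, highest=100_000_000)

print(f"before: {before:.1f}, after: {after:.1f}")
```

In this made-up case the site drops from roughly 50 to roughly 37.5 without losing a single link, simply because the field got bigger.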

Thus, it's very hard to know for sure whether an increase in DA/PA for a particular page is entirely tied to your efforts, Google's changes, or changes to the web as a whole. This is why I strongly, strongly recommend tracking your metrics against your competition. For example, in July, I compared several sites to show the delta between their scores in the May and July indices like so:

 

Mozscape Data for Seattle Startups from the May Index Update

Above: May's 165 Billion URL index data

July Mozscape Data

Above: July's 78 Billion URL index data

Comparison of August 1st update data

Above: August 1st's 69 Billion URL index
(please ignore the SEOmoz.org numbers in this one - we had an error that affected our own site in the last index)

Above: August 14th's 61 Billion URL index
(again, please ignore SEOmoz.org numbers. Index error on our part)

This comparative process is done for you inside the PRO web app if/when you set up competitors: 

Domain Authority Over Time

Using the comparison data is a great way to get a sense of whether you're gaining/losing vs. the competition, and it removes a lot of the bias from the macro, index-level modifiers described above. More so than any other methodology, I recommend this technique for judging how your site's metrics are performing, rather than relying on a raw historical perspective.
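
If you'd rather do this comparison by hand, here's a minimal sketch of the arithmetic. The site names and Domain Authority values are hypothetical (they don't come from any real index), and relative_change is just an illustrative helper, not part of the Mozscape API or the PRO web app. The idea is to subtract the average change among your competitors from your own change, so index-wide shifts cancel out.

```python
def relative_change(scores_before, scores_after, my_site):
    """Return (own change, average competitor change, own change net of competitors)."""
    changes = {
        site: scores_after[site] - scores_before[site] for site in scores_before
    }
    # Every site other than mine counts as a competitor.
    others = [delta for site, delta in changes.items() if site != my_site]
    competitor_avg = sum(others) / len(others)
    return changes[my_site], competitor_avg, changes[my_site] - competitor_avg


# Hypothetical Domain Authority values from two consecutive index updates.
previous_index = {"mysite.example": 52, "rival-a.example": 60, "rival-b.example": 47}
current_index = {"mysite.example": 50, "rival-a.example": 56, "rival-b.example": 43}

own, rivals, net = relative_change(previous_index, current_index, "mysite.example")
print(f"own change: {own:+}, competitor average: {rivals:+.1f}, net: {net:+.1f}")
```

In this made-up example the site lost 2 points, but its competitors lost 4 on average, so relative to the field it actually gained ground.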

A Final Note on Index Size

As you can see, the past few indices have been falling in size. This is due to our efforts to make indices faster and more consistent. We hope to remain in the 60-70 billion URL range for the next few indices, and we're relatively close to having our first index produced on our new private cloud. It will take a while, possibly 6 months, to get back up to the 150 billion page indices we had this Spring (which were very, very slow and stale), but the goal is to have an index every 2 weeks that exceeds that size. Exciting stuff, but crazy hard. Luckily, we have a fantastic and growing team of engineers working on it. If you know great minds in the field, we still pay $12,000 referral and signing bonuses, so send 'em our way!

Thanks very much - looking forward to your feedback.