This month we're bringing you a special holiday treat: two Mozscape indices in the month of November! We just released the latest index, and you can now find fresh Mozscape data in Open Site Explorer, the Mozbar, PRO campaigns, and the Mozscape API.

This index is similar in size to the previous Mozscape index, with about 76 billion URLs. The heavy computing AWS machines we moved to in October, detailed in Anthony's blog post, have saved significant amounts of time in our processing schedule thanks to almost no machine failures.

This time saved means more time for the Mozscape engineers to work on exciting projects, like tuning the final configurations in our own private cloud! We've been running a similarly sized index in our private cloud located in Virginia alongside the index releasing today. It's running a bit slower as we continue to tune and dial in the last pieces, but we hope to be running a hybrid processing solution early next year. Running an index in the cloud and an index in our own private cloud means fresher index data for you and our applications!

Here are the metrics for this latest index:

  • 76,668,945,929 (76 billion) URLs
  • 664,205,988 (664 million) Subdomains
  • 136,202,352 (136 million) Root Domains
  • 892,544,725,878 (892 billion) Links
  • Followed vs. Nofollowed
    • 2.31% of all links found were nofollowed
    • 56.61% of nofollowed links are internal
    • 43.39% are external
  • Rel Canonical - 13.91% of all pages now employ a rel=canonical tag
  • The average page has 73 links on it
    • 62.28 internal links on average
    • 10.54 external links on average
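
If you'd like to sanity-check how these figures fit together, here is a minimal Python sketch of the arithmetic. It assumes the 2.31% nofollow share applies to the full 892 billion link count, which the list implies but does not state outright. The function and constant names are ours, for illustration only, and are not part of the Mozscape API.

```python
# Figures copied from the metrics list above.
TOTAL_LINKS = 892_544_725_878
NOFOLLOW_SHARE = 0.0231      # share of all links that are nofollowed
NOFOLLOW_INTERNAL = 0.5661   # share of nofollowed links that are internal
NOFOLLOW_EXTERNAL = 0.4339   # share of nofollowed links that are external
AVG_INTERNAL = 62.28         # average internal links per page
AVG_EXTERNAL = 10.54         # average external links per page


def derived_metrics():
    """Derive a few secondary figures from the published metrics.

    Assumption: NOFOLLOW_SHARE is measured against TOTAL_LINKS.
    """
    nofollowed = TOTAL_LINKS * NOFOLLOW_SHARE
    avg_total = AVG_INTERNAL + AVG_EXTERNAL
    return {
        # Estimated absolute number of nofollowed links in the index.
        "nofollowed_links": nofollowed,
        # Nofollowed internal/external links as a share of ALL links.
        "internal_nofollow_share_of_all": NOFOLLOW_SHARE * NOFOLLOW_INTERNAL,
        "external_nofollow_share_of_all": NOFOLLOW_SHARE * NOFOLLOW_EXTERNAL,
        # The internal and external averages should sum to the headline
        # "73 links per page" figure once rounded.
        "avg_links_per_page": avg_total,
        "internal_share_of_avg_page": AVG_INTERNAL / avg_total,
    }


if __name__ == "__main__":
    for name, value in derived_metrics().items():
        print(f"{name}: {value:,.4f}")
```

Under that assumption, the index holds roughly 20.6 billion nofollowed links, and the internal and external averages sum to 72.82, which rounds to the 73 links per page quoted above.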

And the following correlations with Google's US search results:

  • Page Authority - 0.35
  • Domain Authority - 0.19
  • MozRank - 0.24
  • Linking Root Domains - 0.30
  • Total Links - 0.25
  • External Links - 0.29

This histogram shows the crawl date and freshness of results in this index:

Crawl histogram for the late November Mozscape index

As you can see from the histogram, this index has some pretty fresh data, with a good percentage of it crawled in late October and early November. The freshest data in this index is from 11/10, when we started processing.

As always, we'd love to hear your feedback in the comments - the Big Data team will be reading and responding! And remember, if you're ever curious about when Mozscape is updating, you can check the calendar here. We also maintain a list of previous index updates with metrics here.

Happy data pulling, Mozzers!