It's that time again - the latest Mozscape index is now live! Data is now refreshed across all the SEOmoz applications - Open Site Explorer, the Mozbar, PRO campaigns, and the Mozscape API.
This index finished up in just 13 days, thanks again to all the improvements our Big Data Processing team has been implementing to make our Mozscape processing pipeline more efficient. The team continues to dial out our virtual private cloud in Virginia as well as tweak, tune, and improve the time it takes to process 83 billion URLs.
We've been saying we're close to releasing our first index created on our own hardware - and now we really are! Stay tuned for a deep dive blog post into why and how we built our own private cloud.
This index was kicked off the first week of March, so data in this index will span from late January through February, with a large percentage of crawl data from the last half of February.
Here are the metrics for this latest index:
- 83,122,215,182 (83 billion) URLs
- 12,140,091,376 (12.1 billion) Subdomains
- 141,967,157 (142 million) Root Domains
- 801,586,268,337 (802 billion) Links
- Followed vs. Nofollowed
  - 2.21% of all links found were nofollowed
  - 55.23% of nofollowed links are internal
  - 44.77% are external
- Rel Canonical - 15.70% of all pages now employ a rel=canonical tag
- The average page has 74 links on it
  - 63.56 internal links on average
  - 10.65 external links on average
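For readers who want absolute numbers rather than percentages, here is a small sketch that derives them from the figures listed above. The function name `link_breakdown` and the constant names are just illustrative; only the input numbers come from this index.

```python
# Figures copied from the metrics listed above.
TOTAL_LINKS = 801_586_268_337
NOFOLLOW_SHARE = 0.0221           # 2.21% of all links
NOFOLLOW_INTERNAL_SHARE = 0.5523  # 55.23% of nofollowed links
AVG_INTERNAL_LINKS = 63.56
AVG_EXTERNAL_LINKS = 10.65


def link_breakdown():
    """Derive absolute counts and shares from the published percentages."""
    nofollowed = TOTAL_LINKS * NOFOLLOW_SHARE
    nofollowed_internal = nofollowed * NOFOLLOW_INTERNAL_SHARE
    avg_links = AVG_INTERNAL_LINKS + AVG_EXTERNAL_LINKS
    return {
        "nofollowed": nofollowed,
        "nofollowed_internal": nofollowed_internal,
        "nofollowed_external": nofollowed - nofollowed_internal,
        "avg_links_per_page": avg_links,
        "internal_share_of_page_links": AVG_INTERNAL_LINKS / avg_links,
    }


if __name__ == "__main__":
    for name, value in link_breakdown().items():
        print(f"{name}: {value:,.2f}")
```

By this arithmetic, about 17.7 billion of the links in the index are nofollowed, and the internal and external averages sum to 74.21 links per page, which matches the rounded figure of 74.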
And the following correlations with Google's US search results:
- Page Authority - 0.35
- Domain Authority - 0.19
- MozRank - 0.24
- Linking Root Domains - 0.30
- Total Links - 0.25
- External Links - 0.29
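As a rough illustration of what a correlation like these measures, the sketch below computes a rank correlation between a metric and search position. This post does not say which coefficient is used, so Spearman's rank correlation is an assumption here, and the sample scores and positions are invented purely for the example.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    # Hypothetical data: a metric score for the results at positions 1-5.
    positions = [1, 2, 3, 4, 5]
    scores = [60, 72, 48, 55, 40]
    # Negate positions so that "ranks higher" means a larger number.
    print(spearman(scores, [-p for p in positions]))
```

Positions are negated so that a positive coefficient means higher scores go with better rankings. A real study would average this over many search result pages rather than a single one.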
We always love to hear your thoughts! And remember, if you're ever curious about when Mozscape next updates, you can check the calendar here. We also maintain a list of previous index updates with metrics here.
It never ceases to amaze me how well correlated Page Authority is to Google's organic US search results. In the statistics world, 0.35 isn't much, but in the SEO world it's likely the highest correlated metric we have.
If you're looking for a quick, single metric to judge the value of a page that's much better correlated with higher rankings than Google's PageRank, this is the one you likely want to consider.
Congrats to the Moz Data team and the entire Mozscape crew who put this all together.
Well, thanks to the whole SEOmoz team, who provide data that is as fresh as possible.
SEOmoz has been very regular and fast in updating the Mozscape index this year. Every time important metrics such as Page Authority and Domain Authority are measured for a site, we see an improvement in all of them, which is a great estimate of the effort marketers put into creating and executing an SEO campaign, and even of how to improve it.
Loving how fast this data is being updated... great job, team!
I have noticed that since the pace of these crawls has picked up, Domain Authority and link metric stats for what seems to be 70-80% of the sites I follow (including all competitors) have decreased. Is this a temporary shift due to the more frequent crawls or is this data in the process of finding a new mean due to changes in algorithm/Google SERPs? Thanks!
Great - I think it's fantastic that SEOmoz is turning out index updates so quickly.
@Carin / @Rand, what is your opinion as to why there are such dramatic shifts in the number of subdomains and domains between indices?
For example:
Feb 12: Subdomains: 4.2B | Domains: 160MM
Feb 27: Subdomains: 9.1B | Domains: 149MM
Mar 19: Subdomains: 12.1B | Domains: 142MM
What are your guidelines or rules for sampling?
Thanks,
Michael
Hey Michael,
Yes, this is something we have been closely monitoring over the past few indexes. I called it out in the blog post for our early February index, but it appears our crawlers have discovered a small number of root domains that have a substantial number of subdomains associated with them. In other words, some real spamminess.
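To put a rough number on that, here is a quick sketch using only the rounded figures quoted earlier in this thread. The names `INDEXES` and `subdomains_per_root` are illustrative, and the ratio is just an average, so it says nothing about how the subdomains are actually distributed across root domains.

```python
# (label, subdomains, root domains), rounded figures as quoted in the thread.
INDEXES = [
    ("Feb 12", 4.2e9, 160e6),
    ("Feb 27", 9.1e9, 149e6),
    ("Mar 19", 12.1e9, 142e6),
]


def subdomains_per_root(indexes):
    """Average number of subdomains per root domain for each index."""
    return {label: subs / roots for label, subs, roots in indexes}


if __name__ == "__main__":
    for label, ratio in subdomains_per_root(INDEXES).items():
        print(f"{label}: {ratio:.1f} subdomains per root domain")
```

The average climbs from roughly 26 to roughly 85 subdomains per root domain across the three indexes, which is the pattern you would expect if a small number of root domains were generating a very large number of subdomains.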
Great work, thanks!
There is not much relevant information on this topic on the internet, thanks for sharing some knowledge and very well explained.
It impresses me how quickly you guys are getting with this!
It disappoints me that I ran reports yesterday before the update which need to be re-run now because of this haha :-)
Keep it up guys! (Also the tweet about 20,000 active PRO members is awesome - Also reminds me that for this personal account of mine I need to add a new CC - Typically using business account and not this one)
Hiya! From this post I understand that this Mozscape index reflects changes made until late February?
Is this correct?
Hope to hear from you!
Regards!
Hey! Yep, you'll find data ranging from about 1/20 - 2/28 in this index. You can see from the crawl histogram that a lot of the data was crawled between mid- to late-February.
802 billion links. Blows my mind! Thanks for all the new data - going to run some updated stats now!
Hello, Carin. Would you please share something about the algorithm that computes the correlation between PA and Google search results?
It's a machine learning based algorithm. It basically looks at all the metrics Mozscape can calculate and compares those against a large number of Google SERPs, then tries to create a best-fit algorithm based on those metrics. You can learn a lot more about PA/DA calculations here: https://www.seomoz.org/blog/introducing-seomoz-updated-page-authority-and-domain-authority
thanks so much Rand
The one statistic here that surprises me is that the majority of nofollowed links are internal rather than external, even though it's been over 3 years since Matt Cutts announced that nofollow is not useful for PageRank sculpting.
But nofollow links are useful for keeping search engines away from stuff you don't want crawled: login pages, and fiddly stuff that there's no point in indexing (the thumbs up and down on this page). There are plenty of good reasons to use them internally that don't involve PR sculpting.
Thanks for sharing, really a long-awaited update.
Finally! Been waiting to see the updated DAs for a while now ^^ (I meant I've been waiting as I've seen my clients shoot up and want to see their DA/PA now)
DA should update every index (of course, some sites' DA will remain relatively constant from index to index), and this one is actually our freshest/fastest index ever (I think - only 13 days since our last index)!
Yeh Awesome Rand! I don't see why there are so many thumbs downs at the moment.. Give Keri some more community peoples please! ^^
Many times, thumbs down are an easy way of saying "I disagree with this statement/comment/blog post" without starting or continuing a flame war. Thumbs down can actually be helpful in reducing the number of negative comments that would have happened if someone could not thumb down an item.
In the case of this comment, my guess is that people disagreed with the implied criticism of how long this update took. This is one of the fastest turnarounds we've had for a new index. There are times when we have had to go a couple of months without an update due to hardware problems, so 13 days compared to several weeks is a really fast time.
I didn't mean it like that haha. I meant I've been waiting to check the updated DA for my clients' sites that have shot up since the Panda update on Friday...