The Second February Mozscape Index is Live!

We're continuing the trend of two index releases each month by bringing you the latest Mozscape index release today - only 15 days after our last release on February 12th! The latest Mozscape index took about 11 days to process, with a fairly significant portion crawled at the beginning of February. The crawl data spans about 38 days, so the oldest crawl data dates back to the beginning of January. You can access refreshed data across all of our applications: Open Site Explorer, the MozBar, PRO campaigns, and the Mozscape API.

Our Big Data processing team (Martin York, Douglas Vojir, and Stephen Wood) has been working on some really exciting improvements to our processing code base that reduce the length of time processing takes, and has also begun development on a highly anticipated new Mozscape index feature:

The Mozscape index is created in one continuous batch processing pipeline. A massive amount of crawl data is downloaded first, then sorted and organized, and then the computations and magic are applied. Every so often, files get uploaded in a checkpoint step, so if something catastrophic happens to the index, we'll be able to roll back to a fairly recent step.
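To make the checkpoint idea concrete, here is a minimal sketch of a batch pipeline that saves its state every few steps and resumes from the newest checkpoint after a failure. It is an illustration only, not the Mozscape code: `run_pipeline`, the `step_<n>.json` file naming, and the JSON state format are all assumptions made for this example.

```python
import json
import os
import tempfile
from typing import Callable, List

Step = Callable[[dict], dict]


def run_pipeline(steps: List[Step], state: dict, checkpoint_dir: str,
                 checkpoint_every: int = 2) -> dict:
    """Run steps in order, resuming from the newest checkpoint if one exists.

    Hypothetical layout: a checkpoint named step_<n>.json holds the state
    as it was after the first n steps completed.
    """
    start = 0
    completed_steps = sorted(
        int(name[len("step_"):-len(".json")])
        for name in os.listdir(checkpoint_dir)
        if name.startswith("step_") and name.endswith(".json")
    )
    if completed_steps:
        # Roll back to the most recent checkpoint instead of starting over.
        start = completed_steps[-1]
        with open(os.path.join(checkpoint_dir, f"step_{start}.json")) as fh:
            state = json.load(fh)

    for index in range(start, len(steps)):
        state = steps[index](state)
        completed = index + 1
        if completed % checkpoint_every == 0:
            path = os.path.join(checkpoint_dir, f"step_{completed}.json")
            tmp_path = path + ".tmp"
            with open(tmp_path, "w") as fh:
                json.dump(state, fh)
            # Rename only after the write finished, so a crash mid-write
            # never leaves a half-written checkpoint under the final name.
            os.replace(tmp_path, path)
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        increment = lambda s: {"n": s["n"] + 1}
        print(run_pipeline([increment] * 4, {"n": 0}, workdir))
```

The trade-off in a scheme like this is the one the post describes: checkpointing more often loses less work after a failure, but every checkpoint costs time, which is why speeding up the checkpoint step itself matters.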

Recently the Big Data processing team dug through this checkpointing code to see where they could optimize - and they really optimized! The time needed to checkpoint files varies throughout the pipeline, but the longest checkpointing step used to take about 60 hours to complete… With the optimization from Doug and Martin, this step now takes on average 2.18 hours! Holy time savings!!

The first few steps in processing are dedicated to organizing how the work is going to be distributed across the entire Mozscape processing cluster. The files are broken out into what are called shards, which are then assigned across the entire fleet of machines. Sometimes these shards aren't completely full, which means one machine can finish its work well before another. Martin revisited this code as well to see what type of optimization could be applied. With the help of our master data scientist, Matt Peters, Martin was able to improve the distribution of work, saving around 25% of the time spent processing!

One feature we hear requested fairly often is including HTTPS crawl data in the Mozscape index. Good news - development on this feature has begun, and we hope to have HTTPS data included in the Mozscape index this summer!
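The post does not say how the distribution of shards was improved, so the following is only a generic sketch of the underlying problem: when shards are uneven, a naive assignment leaves some machines idle while others are still working. The example compares round-robin assignment with a greedy "largest shard to the least-loaded machine" heuristic. The function names and shard sizes are hypothetical.

```python
import heapq
from typing import Dict, List


def round_robin(shard_sizes: List[int], machine_count: int) -> Dict[int, List[int]]:
    """Naive baseline: deal shards out to machines in order, ignoring size."""
    assignment: Dict[int, List[int]] = {m: [] for m in range(machine_count)}
    for shard_id in range(len(shard_sizes)):
        assignment[shard_id % machine_count].append(shard_id)
    return assignment


def greedy_balance(shard_sizes: List[int], machine_count: int) -> Dict[int, List[int]]:
    """Greedy heuristic: give the largest remaining shard to the least-loaded machine."""
    heap = [(0, m) for m in range(machine_count)]  # (current load, machine id)
    heapq.heapify(heap)
    assignment: Dict[int, List[int]] = {m: [] for m in range(machine_count)}
    by_size = sorted(range(len(shard_sizes)),
                     key=lambda i: shard_sizes[i], reverse=True)
    for shard_id in by_size:
        load, machine = heapq.heappop(heap)
        assignment[machine].append(shard_id)
        heapq.heappush(heap, (load + shard_sizes[shard_id], machine))
    return assignment


def finish_time(shard_sizes: List[int], assignment: Dict[int, List[int]]) -> int:
    """The batch is done only when the most heavily loaded machine is done."""
    return max(sum(shard_sizes[i] for i in shards)
               for shards in assignment.values())


if __name__ == "__main__":
    sizes = [9, 1, 8, 2, 7, 3]  # made-up shard sizes in arbitrary work units
    print(finish_time(sizes, round_robin(sizes, 2)))
    print(finish_time(sizes, greedy_balance(sizes, 2)))
```

With these made-up sizes, round-robin puts all three large shards on one machine, while the greedy assignment splits the work evenly, so the whole batch finishes sooner even though the total amount of work is unchanged.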

Here are the metrics for this latest index:

82,275,594,589 (82 billion) URLs
9,097,532,641 (9.1 billion) Subdomains
148,991,416 (149 million) Root Domains
829,267,740,331 (829 billion) Links
Followed vs. Nofollowed
- 2.25% of all links found were nofollowed
- 56.08% of nofollowed links are internal
- 43.92% are external
Rel Canonical - 15.43% of all pages now employ a rel=canonical tag
The average page has 73 links on it
- 62.93 internal links on average
- 10.33 external links on average
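As a quick sanity check, the figures above can be tied together with a few lines of arithmetic. This is a sketch using only the numbers listed in this post; the estimate of the absolute number of nofollowed links assumes that the 2.25% share applies to the full 829 billion link count, which the post does not state explicitly.

```python
# Figures copied from the metrics list above.
total_links = 829_267_740_331
internal_links_per_page = 62.93
external_links_per_page = 10.33
nofollow_share = 0.0225           # share of all links that are nofollowed
nofollow_internal_share = 0.5608  # share of nofollowed links that are internal
nofollow_external_share = 0.4392  # share of nofollowed links that are external

# Internal plus external links per page should match the rounded average of 73.
links_per_page = internal_links_per_page + external_links_per_page

# The internal/external split of nofollowed links should cover all of them.
split_total = nofollow_internal_share + nofollow_external_share

# Assumption: the 2.25% applies to the total link count reported above.
estimated_nofollowed_links = total_links * nofollow_share

print(f"Average links per page: {links_per_page:.2f}")
print(f"Nofollow split adds up to: {split_total:.4f}")
print(f"Estimated nofollowed links: {estimated_nofollowed_links / 1e9:.2f} billion")
```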

And the following correlations with Google's US search results:

Page Authority - 0.35
Domain Authority - 0.19
MozRank - 0.24
Linking Root Domains - 0.31
Total Links - 0.25
External Links - 0.29

Crawl histogram for the February 27th Mozscape index

As you can see from the metrics above, there continues to be an increase of subdomains as we have discovered a small number of root domains that have a substantial number of subdomains associated with them.

We always love to hear your thoughts! And remember, if you're ever curious about when Mozscape next updates, you can check the calendar here. We also maintain a list of previous index updates with metrics here.

Comments 28

Joe Robison

2013-02-27T16:44:23-08:00

Great work, glad to see the trend of twice a month. Can't wait until it's once a week!

3 1

- Bojan Jovanović
 
 2013-02-27T17:50:23-08:00
 
 Considering that last index update took 15 days and that Internet is constantly growing, I believe you'll have to wait a bit longer until weekly update become reality.
 
 Congratulations to whole SEOmoz team for their effort.
 
 1 1
 
 - Amir12
 
 2013-02-28T00:46:41-08:00
 
 Welcome and congrats :)
 
 And i thankful to all SEOmoz team who work and share this nice info to all of us.
 
 Thank again
 
 2 0
 
- RankWatch65
 
 2013-03-01T11:54:49-08:00
 
 Great to see the mozscape index being updated on a consistent basis. Kudos to Seomoz team for this. Mozbot is becoming hyperactive and is consistent in delivering the fruitful & updated results to its users on time these days. Not to forget the billions of pages he needs to crawl on the web and the bandwidth used before updating its index.
 
 RankWatch65 edited 2013-03-01T11:57:38-08:00
 4 0
 
Rick Noel

2013-03-06T02:19:28-08:00

Congrats and thanks to the SEOmoz technology team on the performance and efficiency gains.

2 0

Brahmadas R

2013-02-27T22:57:56-08:00

Thanks for sharing, I wish to use this comment for appreciate the whole technical team behind it.

2 0

nashcarlo8

2013-03-01T02:14:59-08:00

Great!

1 0

MattAntonino

2013-02-28T17:56:08-08:00

Off to download some updates - thanks!

1 0

PixelKicks

2013-03-02T02:07:15-08:00

Great to see the continued growth of Mozscape, some of the numbers are staggering :) Keep up the good work.

1 0

VikkyNix

2013-03-05T05:44:27-08:00

Yay! So you guys are gonna do it twice every month now. TU!

1 0

gamebak

2013-03-14T08:44:44-07:00

Congrats!

1 0

Stewert Cough

2013-03-07T01:25:55-08:00

Congrats SEO Moz

1 0

Jon Lisbin

2013-02-28T15:47:20-08:00

Will y'all ever get back up to the ~168 billion URL level we saw last year? The more the merrier, and all that :)

1 0

- Carin Overturf
 
 2013-02-28T15:58:47-08:00
 
 Totally agree - the more the merrier for sure! :)
 Unfortunately, the more data we include, the longer it takes to process... but we're working on solving that problem! Lots of exciting things coming this year!
 
 1 0
 
TextFreeSMS.com

2013-03-20T05:53:41-07:00

Thanks for sharing

1 0

Jeroen Maljers

2013-02-28T11:01:28-08:00

Where can I find more info on what the correlation figures mean? What can I conclude from the Domain Authority correlation of 0.19 with Google's US search results? From what I remember from high school, this feels like a low correlation. Or am I seeing this wrong?

1 0

- Carin Overturf
 
 2013-02-28T12:01:45-08:00
 
 Hey there!!
 Our data scientist, Matt Peters, posted a really informative blog post on what correlations mean last September. You can read up on the details here:
 https://www.seomoz.org/blog/mozscape-correlation-analysis-google-algorithm-changes
 Hope that helps!
 
 2 0
 
 - Sean Lade
 
 2013-02-28T15:12:08-08:00
 
 Thanks for the link Carin. It was good to be able to expand on what the figures mean.
 
 1 0
 
madaboutmedia

2013-02-28T01:47:52-08:00

This is great news, we are currently involved in link removal after getting hit with an unnatural link warning so we can now see how our progress is going every 2 weeks.
Thanks again

1 0

Martijn Oud

2013-02-28T01:31:52-08:00

Always happy to see new data in OSE. Thanks for the fast release!
Is there currently a feature planned to show which backlinks have been changed since the last index update? So I can quickly see what has changed in my campaigns?

1 0

- Carin Overturf
 
 2013-02-28T08:51:43-08:00
 
 Hey there! That is a great suggestion - I'll make sure we have that on the feature list to prioritize!
 Thanks, Carin
 
 2 0
 
SEO.Mayur

2013-02-28T01:24:43-08:00

Thanks for confirming.....

SEO.Mayur edited 2013-02-28T01:26:29-08:00
1 0

Thương Lê

2013-02-28T01:50:24-08:00

Nice news for me.
Thank you for your good work.

1 0

Nick-SEOSpark

2013-02-28T03:35:52-08:00

Thanks for the update and hard work from the team. I have a question. As the update takes 11 days to process, does the data represent (for some websites) the status from about 2 weeks ago? Thanks in advance.

2 1

- Carin Overturf
 
 2013-02-28T08:50:56-08:00
 
 Hey Nick!
 We started this index on February 13th, so the most recent crawl data would be from 2/12.
 Hope that helps!
 Thanks, Carin
 
 1 0
 
Convergence Point Media

2013-02-28T07:03:41-08:00

This is great news! Some Fantastic progression!
Thanks and Congrats to those involved!

1 0

SiteWizard_LLC

2013-02-28T05:28:49-08:00

Nice job. The technology changes at SEOmoz last year have really paid off. It used to take months to get a refresh of the index.

1 0

CyNdie

2013-02-28T14:03:41-08:00

Wow... Tanks, My Inspirasions... :)

1 0

