It's that time again - the November Mozscape index is now available! Check out the Mozscape data that is now fresh in Open Site Explorer, the Mozbar, PRO campaigns, and the Mozscape API.
The November Mozscape index is launching a few days later than scheduled. A miscalculation in the amount of crawl data initially included, combined with the fact that our crawlers are extremely efficient, caused our first index attempt this month to come out at about twice the size of our goal of 77 billion URLs. Had we not made this miscalculation, we would have hit our original release date of 11/5, but restarting the index caused our release date to slip a few days.
Another hiccup we ran into this month was processing time: at 76 billion URLs, this index took a bit longer than our October index, which was only 55 billion URLs. This became glaringly apparent in one specific step of our index processing. Periodically throughout processing, we checkpoint the files that have been processed so we can roll back if something catastrophic occurs (a machine failure, file corruption, etc.). With the larger index this month, these checkpointing steps were taking noticeably longer; in some cases, it took days to checkpoint some of the larger steps. Thankfully, Martin and Brandon, two of the genius engineers on the Mozscape team, came up with a solution that drastically reduced the time spent checkpointing. With Martin's update to the processing software, the time spent in some of these steps was cut from days to just minutes! Once again, taking a step back brought the Mozscape team two steps forward.
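For readers unfamiliar with checkpointing, here is a minimal sketch of the general idea in Python. It is not the Mozscape processing software: the function names, the JSON checkpoint file, and the step list are all assumptions made up for illustration. Each finished step is recorded, so a rerun after a failure skips the work that already completed.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    # os.replace swaps the file in one step when both paths share a filesystem.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or an empty state if no checkpoint exists yet."""
    if not os.path.exists(path):
        return {"completed": []}
    with open(path) as handle:
        return json.load(handle)


def run_pipeline(steps, checkpoint_path):
    """Run (name, function) steps in order, skipping any already checkpointed."""
    state = load_checkpoint(checkpoint_path)
    for name, func in steps:
        if name in state["completed"]:
            continue  # finished in an earlier run, no need to redo it
        func()
        state["completed"].append(name)
        save_checkpoint(checkpoint_path, state)
    return state["completed"]
```

If a step raises an exception, the checkpoint file still lists every step that finished before it, so calling `run_pipeline` again resumes from the failed step instead of starting over.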
The Mozscape team is continuing to make some significant progress finalizing our private cloud solution in Virginia. We are on track to have indices produced in both the AWS cloud and our own private cloud by the end of the year. After a successful test index completed, the first Mozscape index is now in progress, running in our own private cloud. It's an exciting achievement for the Mozscape team!
Here are the metrics for this latest index:
- 76,734,608,461 (76 billion) URLs
- 776,343,422 (776 million) Subdomains
- 134,499,372 (134 million) Root Domains
- 878,838,592,381 (878 billion) Links
- Followed vs. Nofollowed
  - 2.69% of all links found were nofollowed
  - 56.69% of nofollowed links are internal
  - 43.31% are external
- Rel Canonical - 13.65% of all pages now employ a rel=canonical tag
- The average page has 71 links on it
  - 61.28 internal links on average
  - 10.13 external links on average
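To put the percentages in absolute terms, the short calculation below applies them to the link total reported above. The variable names are made up for illustration, and the results are only as precise as the rounded percentages in the list.

```python
# Figures copied from the metrics list above.
total_links = 878_838_592_381
nofollow_share = 0.0269              # 2.69% of all links are nofollowed
internal_share_of_nofollow = 0.5669  # 56.69% of nofollowed links are internal

nofollowed = total_links * nofollow_share
nofollowed_internal = nofollowed * internal_share_of_nofollow
nofollowed_external = nofollowed - nofollowed_internal

# The per-page averages should add up to the rounded "71 links" figure.
avg_links_per_page = 61.28 + 10.13

print(f"Nofollowed links: about {nofollowed / 1e9:.1f} billion")
print(f"  internal: about {nofollowed_internal / 1e9:.1f} billion")
print(f"  external: about {nofollowed_external / 1e9:.1f} billion")
print(f"Average links per page: {avg_links_per_page:.2f}")
```

That works out to roughly 23.6 billion nofollowed links, and the internal and external averages sum to 71.41 links per page, consistent with the rounded figure of 71.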
And the following correlations with Google's US search results:
- Page Authority - 0.36
- Domain Authority - 0.19
- MozRank - 0.24
- Linking Root Domains - 0.30
- Total Links - 0.25
- External Links - 0.29
This histogram shows the crawl date and freshness of results in this index:
The freshest data in this index will be from October 16th (when processing began), and a good portion of the link data will be from late September to mid October. This index will reflect link data that dates back to about mid-September, but the majority of this index will be the first few weeks of October. As we continue to improve on the length of time it takes to process an index, this freshness will keep improving!
Another exciting announcement is our new App Gallery that launched a few weeks ago. Check out all of the great tools our users are building on top of our Mozscape data. If you have a free tool that you would love to see added to this page, submit a request to have it added to the gallery - we'd love to hear about it!
As always, we'd love your feedback. Hope to hear from you in the comments, where the big data team will be reading and responding as usual.
P.S. Remember that if you're ever curious about when Mozscape is updating, you can check the calendar here. We also maintain a list of previous index updates with metrics here.
I misread this as "Movember Nosescape Index is Live!" (which would be pretty cool) - must have mo's on the mind :)
Mozscape is getting huge! Keep up the good work you guys :)
Great. For some reason, it's been 3 months since I built links, and I know they are there, but OSE never shows any new backlinks to my site. It only shows very old ones.
I'm always excited to see a new mozscape update. I think the best is that you guys were able to do it $600,000 cheaper this month. I can't wait to see what happens when you fully migrate to the hybrid structure.
It's interesting, this is the lowest correlation I can remember seeing between Domain Authority and ranking. Is there any work going on with the Domain Authority metric to bring it back in line?
We are working now on improving Page and Domain Authority and hope to have something out near the end of the year/beginning of next year. That said, we noticed that all of our domain metrics dropped and all of the page metrics increased, likely due to Google algorithm shifts in the last few months, so the correlations may still be lower than before. I blogged about this a few months back if you'd like to read more:
https://www.seomoz.org/blog/mozscape-correlation-analysis-google-algorithm-changes
Impressive stats... are there any big changes planned for the future?
Yep - we do have some big changes planned for the future - we've got some really smart engineers working on the Mozscape project!
Cutting down our processing time in order to provide fresher index data is always one of our top priorities. Historically we've been releasing indexes every 4 weeks, but we're working on making those releases more and more often.
The ultimate goal is to work toward more of a live index, but that probably won't be available for 9 months to a year.
Carin, really impressive statistics. Great to hear the news that you are planning a release at least every 4 weeks. I am waiting for that. And please publish an infographic once a month. Once again, thanks a lot, Carin.
Thanks a lot for the index, info for the update and light into your inner workings.
For every update I see, I always wonder how to interpret the correlations of the index to the Google US results. I tried to find some article about it at Seomoz, but couldn't. Please direct me to it, if available.
In finance and other fields, the Correlation coefficient ranges as so:
1: Perfect positive correlation, the two move in lockstep.
0: No correlation, the two will move at random to one another.
-1: Perfect negative correlation, the two move in opposite lockstep.
These are the current index correlations to Google US results:
Page Authority - 0.36
Domain Authority - 0.19
MozRank - 0.24
Linking Root Domains - 0.30
Total Links - 0.25
External Links - 0.29
Looking at the correlation coefficient, I don't really see that much of it. I would think your goal would be to get close to a 1 or perfect correlation, so that the index results are a better mirror of how Google sees our site, right?
Would it be possible to have Seomoz take on this?
Also. Is the negative sign in front of the correlation numbers just a separation character? If so, it would be wise to try to use another character like ":", so that there is no possible confusion.
Thanks a lot,
Yes - those hyphens are just separators. In terms of the values, basically, a correlation with rankings of 1.0 or close to that would suggest that Google's entire algorithm is based only on relatively simplistic link metrics. We use correlation more as an indication that we're crawling good stuff and that our algos for PA and DA are approximating what Google thinks is important in a page's link profile. Hope that helps!
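To make the coefficient concrete, here is a minimal sketch of a rank correlation (Spearman's) between a metric and ranking position. The post does not say which correlation is computed for these numbers, so the choice of Spearman's, the function names, and the sample values below are all assumptions for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either list has no variation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank lists."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical results page: positions 1-5 and a made-up metric for each result.
positions = [1, 2, 3, 4, 5]
metric = [62, 48, 55, 40, 51]

# Negate positions so that "higher metric at a better position" counts as positive.
rho = spearman(metric, [-p for p in positions])
print(rho)  # 0.5 for this made-up data
```

Even in this toy example, where the top result has the highest metric, the coefficient is only 0.5 because the rest of the ordering is mixed. A value of 1.0 would require the metric alone to reproduce the exact ranking order.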
So for the metrics above, Page Authority is the most highly correlated metric and Domain Authority the least correlated metric.
Correct?
Wow.. I'm really surprised that only 13% use canonical
I think the rel=canonical tag isn't really that important, I'd say 87% of websites don't need it. So I guess most web developers/designers (including me) don't really think it's worth the time and effort adding to all our current websites, my sites are ranking fine without it.
Also, I doubt half of the world's website owners even know what the rel=canonical tag is.
Been waiting for that update, because I've done a lot of hard work this month!
I have a question: Is the basic rule for the Mozscape update once a month? Or is it more random and could happen even twice a month?
Hey Yoav!
We plan for a release at least every 4 weeks, unless something knocks us off schedule, like this month. We're continually working to decrease processing time - bigger hardware, improvements to our software, etc. Once an index is complete, we'll release it - so if things run smoothly and it finishes up in 3 weeks, we'll release a week early so you guys can get fresh data!
Awesome! In the far future, do you think the Mozscape index could be as dynamic as weekly updates?
I've tried every index in the book, nothing measures a site's trust and authority as good as MozAuthority.
Great to hear about our authority metrics!!
We do have plans to have index updates as often as weekly, or even more often, in the far future. It's a big task, and not one we can accomplish with the current software we have in place. However, we've got some really amazing Mozscape engineers working on our solution for the future!
Great to hear, can't wait. You should publish an infographic once, showing just how many online marketers use MozAuthority and trust it rather than any other metric!
I think searchengineland can grow his marketing plan because of live links are few in the box.
Great to know the index has been updated.
However I notice when looking at the mozbar, Mozrank says 5.47
When I check other tools that report Mozrank, all the other tools say the Mozrank is 6.87
Why the discrepancy when both use the same data source ?
What tools are you using besides the Mozbar? The Mozbar, Open Site Explorer and the PRO campaigns are all pulling from the latest index, but if the applications you're using are not managed by SEOmoz, there might be a delay while they update their application. Are you still seeing the discrepancies?
Hi,
To answer your question, I am using the MozBar (version 2.3), which reports MozRank as 5.47.
I am comparing that MozBar result with the result from the free MozRank checker here (which reports MozRank as 6.87):
https://moonsy.com/mozrank/
Open Site Explorer also reports 5.47 Mozrank.
So yes I still see the discrepancies even though all three use the same data source.
Any thoughts or ideas are welcomed...you know maybe I am looking too hard into it!
Hey there!
Unfortunately, https://moonsy.com is a third party application that we don't manage - we don't have any control over when they update their application with our new metrics.
The MozBar and Open Site Explorer are SEOmoz managed tools, meaning they will immediately update with a new index release.
Hope that helps clarify - feel free to reach out to our help team, [email protected], if you have more questions!
Thanks,
Carin
Thanks Carin,
That does make sense. My assumption was that the moonsy tool used SEOmoz data, so logically Mozbar, OSE and Moonsy data results should match.
But yes, perhaps Moonsy doesn't update its data. I kind of guessed that, but there's no way to prove it, obviously.
Thanks for your help.
All sounds great. It would be pretty cool to have some of those key metrics charted over time for some perspective. The fact that you have indexed 76,734,608,461 URLs and that 2.69% of those are nofollow is mind-boggling (and will probably be an infographic by morning). However, some perspective on how that compares to previous crawls would be good.
I believe there is a history of those numbers kept here:
https://www.seomoz.org/api/updates
You have to click on the "More" button to get all the text for the older updates.
Yep, Brandon is correct - but that would be an interesting visual, Matt! A graph over time could be a great addition to the Updates page. I'll pass that suggestion along.
Thanks!
Carin
Cheers Brandon and Carin. Something graphical (even really basic) would be good for the interested, but lazy ;)
Awesome job, Thanks to SEO MOZ. Now i can create my own web tool.
What tool are you thinking of making?