When we started reporting the Google “weather” on MozCast, we knew that one number could never paint the entire picture of something as complex as the Google algorithm. Over the last few months, we’ve been exploring other ways to look at ranking data from high altitude, and have reported on metrics like domain diversity and EMD influence. Today, I’m happy to announce that we’re rolling out five of these “top-view” metrics on MozCast, updated daily.
From the new “METRICS” page (top menu), you’ll see five tabs:
Each metric defaults to a 30-day view, but you can also see 60-day and 90-day data. Please note that Y-axes all auto-scale to emphasize daily changes, so make sure to note the scale when interpreting this data. I trust you all to be grown-ups and draw your own conclusions.
So, let’s dive right into the five top-view metrics…
(1) Domain Diversity
The domain diversity graph shows the percentage of URLs across the MozCast data set that have unique subdomains. Put more simply, it’s the number of unique subdomains divided by the number of total URLs/rankings. The more diversity, the less SERP “crowding” – here’s a 30-day view:
Keep in mind that the range over the past 30 days has been pretty narrow (less than 1%), so let’s take a look at the broader, 90-day view:
You can hover over any data point for dates and more precise percentages. Here, you can see that diversity increased when Google rolled out 7-result SERPs (from about 8/12-8/14), but has gradually declined over the past 90 days. When we started collecting data in early April, domain diversity was closer to 61%, but it dropped significantly after the Penguin update (on 4/24).
On September 14, Matt Cutts announced on Twitter that Google had made a change to improve SERP diversity:
We saw a small bump (about 0.4%) from 9/6 to 9/9, but otherwise have no evidence of major improvements. Please keep in mind that this is one data set and one way of measuring “diversity” – I’m not calling Matt a liar, and I’d welcome other analyses and points of view. My goal is to create transparency where we currently have very little of it.
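If you want to play with this idea on your own data, here’s a minimal sketch of the diversity calculation, assuming a flat list of ranking URLs (a hypothetical stand-in for the MozCast data set, not our actual pipeline):

```python
from urllib.parse import urlparse

def domain_diversity(ranking_urls):
    """Unique subdomains divided by total URLs/rankings."""
    subdomains = {urlparse(url).hostname for url in ranking_urls}
    return len(subdomains) / len(ranking_urls)

# Hypothetical day: three rankings, two unique subdomains
rankings = [
    "https://en.wikipedia.org/wiki/Widget",
    "https://www.amazon.com/widgets",
    "https://en.wikipedia.org/wiki/Gadget",
]
print(f"Diversity: {domain_diversity(rankings):.1%}")  # Diversity: 66.7%
```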
(2) SERP Count (“Shrinkage”)
Over a roughly 2-day period in mid-August, Google rolled out 7-result SERPs (for page 1), and our data shows that it impacted roughly 18% of the queries we track. We originally reported this as the number of SERPs with <10 results, but that presented two problems: (1) fewer results made the graph go up – which is a bit confusing, and (2) the metric wouldn’t register further changes in result count. In other words (hat tip to Moz teammate Myron on this one), if all of the 7-result SERPs suddenly changed to 6-result SERPs, our original metric would never show it. So, we’ve replaced that metric with the average result count. Here’s a 60-day view:
In this case, an average drop of 0.5 results is massive, and the graph tells the story pretty well. The 30-day data shows much, much smaller variations, but this metric will help us track any future changes, including a return to 10-result SERPs (if that were to happen).
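For the record, the replacement metric is just a mean over per-SERP result counts. A minimal sketch, with hypothetical numbers rather than our actual schema:

```python
def average_result_count(result_counts):
    """Mean number of page-1 results across all tracked SERPs."""
    return sum(result_counts) / len(result_counts)

# Hypothetical mix: 82% of SERPs show 10 results, 18% show 7
print(average_result_count([10] * 82 + [7] * 18))  # 9.46

# Unlike "SERPs with <10 results", the mean also registers a
# hypothetical shift from 7-result to 6-result SERPs:
print(average_result_count([10] * 82 + [6] * 18))  # 9.28
```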
(3) EMD Influence
The influence of Exact-Match Domains (EMDs) is a hot topic in SEO. Our EMD influence metric shows the percentage of Top 10 rankings that are currently occupied by EMDs. Specifically, if the keyphrase is “buy widgets”, then we consider only “buywidgets.tld” (any TLD) to be an exact match. Here’s the 90-day data:
My recent post goes into more detail, and there are a lot of ways to dig into this data, but we’re seeing a slight uptick in EMD influence over the past three months.
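For anyone who wants to replicate the matching rule, here’s a rough sketch. The naive root-domain split is an illustration only – it ignores two-level TLDs like .co.uk:

```python
from urllib.parse import urlparse

def is_emd(url, keyphrase):
    """True if the root domain equals the keyphrase with spaces
    removed, under any TLD ("buywidgets.com", "buywidgets.net", ...)."""
    host = urlparse(url).hostname or ""
    parts = host.split(".")
    if len(parts) < 2:
        return False
    root = parts[-2]  # naive split; real TLD handling is messier
    return root == keyphrase.replace(" ", "")

print(is_emd("http://buywidgets.net/sale", "buy widgets"))   # True
print(is_emd("http://www.buy-widgets.com/", "buy widgets"))  # False
```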
(4) PMD Influence
Similarly, PMD influence measures the influence of Partial-Match Domains on the Top 10. For the keyphrase “buy widgets”, we count any URL with either “buywidgets” or “buy-widgets” in the subdomain as a partial match. This metric does not include EMDs. Here’s the 90-day view:
In line with the broader history reported earlier, PMDs seem to be steadily declining in influence. Keep in mind that this doesn’t mean that any particular PMD won’t rank (they still hold over 4% of Top 10 rankings) – it just means that their overall impact is trending downward.
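A companion sketch for the PMD rule, reusing the is_emd() helper from the EMD sketch above (again, an illustration rather than our production matcher):

```python
from urllib.parse import urlparse

def is_pmd(url, keyphrase):
    """True if the joined or hyphenated keyphrase appears in the
    subdomain, excluding exact matches (EMDs)."""
    host = urlparse(url).hostname or ""
    joined = keyphrase.replace(" ", "")       # "buywidgets"
    hyphenated = keyphrase.replace(" ", "-")  # "buy-widgets"
    partial = joined in host or hyphenated in host
    return partial and not is_emd(url, keyphrase)  # is_emd from the EMD sketch

print(is_pmd("http://www.buywidgetsonline.com/", "buy widgets"))  # True
print(is_pmd("http://buy-widgets.org/", "buy widgets"))           # True
print(is_pmd("http://buywidgets.com/", "buy widgets"))            # False (EMD)
```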
(5) Daily Big 10
Finally, we have a new metric I haven’t covered in any previous blog post, the “Big 10.” Apologies to college football fans (I’m a former Hawkeye), but I didn’t want to confuse this with the “Top 10.” The Big 10 influence is the percentage of Top 10 rankings accounted for by the ten most powerful subdomains on any given day. This list changes daily, and any single day’s data represents the influence of the Big 10 for that day. Currently, the Big 10 domains account for about 13.6% of Top 10 rankings in our data set:
Below the graph for this metric, we also list the Big 10 subdomains for the most recent day. Like all of the MozCast stats, this list is currently recalculated each morning. Here’s the data from 9/18:
- en.wikipedia.org
- www.amazon.com
- www.youtube.com
- www.facebook.com
- www.ebay.com
- www.walmart.com
- www.webmd.com
- www.yelp.com
- www.overstock.com
- allrecipes.com
Currently, the roughly 9,500 URLs in our data set (Top 7-10 for 1,000 keywords) represent about 5,300 unique subdomains, so the fact that just ten of them take up almost 14% of the real estate is pretty amazing. Wikipedia alone holds 4.6% of the Top 10 URLs that we track (today). There’s a fair amount of movement in the bottom couple of domains, and Twitter dropped out of the Top 10 earlier this year.
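For the curious, the daily Big 10 calculation could be sketched like this, again assuming a flat list of ranking URLs rather than our actual storage:

```python
from collections import Counter
from urllib.parse import urlparse

def big10_influence(ranking_urls):
    """Share of all Top 10 rankings held by that day's ten most
    common subdomains, plus the list itself."""
    hosts = Counter(urlparse(url).hostname for url in ranking_urls)
    top_ten = hosts.most_common(10)
    share = sum(count for _, count in top_ten) / len(ranking_urls)
    return share, [host for host, _ in top_ten]

# Recomputed each morning over the ~9,500 tracked URLs; the churn
# in the #10 spot falls out of most_common() re-ranking daily.
```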
What Would You Like to See?
There are a lot of ways to slice the data and we have quite a few ideas in the pipe, but if there are specific, large-scale metrics you’re interested in, let me know. We’re trying to incorporate community feedback into the product development plan. Also, feel free to make suggestions on the @mozcast Twitter account.
I’d like to quickly thank Devin and Casey for doing the behind-the-scenes work to get this page integrated, and to Devin in particular for turning my single, rambling page of stats into a pretty slick design. Thanks as usual to Dr. Matt Peters for feedback on the math, and to Rand for putting up with dozens of emails and somehow reading them all on top of his other 23 hours/day of work.
Pardon a shameless plug, but if you’d like to hear more about the history of MozCast, I gave an hour-long presentation about it at MozCon in July. The online MozCon videos just went on sale yesterday. Even if you hate me, there’s 16 hours of other great content and you can just fast-forward over my part – I won’t mind, really *sniff.*
Thanks, Dr. Pete. I like your SERP count metric. I'd be interested in seeing an ad count, or some ratio of ads to organic listings over time.
Thanks for the suggestion. We're not currently parsing the ad blocks, but it's certainly possible. I'm looking at ways to capture more data about the overall visual display and CSS, so that we can track feature changes.
*cough* knowledge graph *cough*
Yeah, the Knowledge Graph is crowding out the organic results in a big way, in addition to ads. These changes are huge!
Awesome work, Pete! Mozcast is now 5x more useful!
Being somewhat of a domainer, the EMD and PMD metrics will be something I watch consistently. Really liking what you've done with Mozcast and am looking forward to watching the trends map out over the coming months/years. Thanks for another great tool!
Someday, we'll get you to stop your evil domaining ways ;) Seriously, domain choice is still really important - not just for branding, but for SEO. I just don't want people to over-rely on it - the days of domain buying being your entire SEO strategy are behind us.
Very true, domain buying is definitely not the core of an SEO strategy. You are right. Domain buying is largely about branding. Honestly, I think this is also true for domaining and not just SEO. I don't own any hyphenated domain names and the domains I've placed the most value in are those I consider to be brandable (e.g. thinspirational.com and squatchapps.com).
This is like #RCS! Love it.
I wasn't sure where MozCast was headed, but I can see so many more data build-out opportunities now.
Awesome improvements – good job! Analysis, analysis, analysis (of competitors).
I would be interested in tracking informational queries vs. commercial queries. That might be easier than breaking out specific verticals.
In some ways, it's harder, because you have to quantify that query by query (for 1,000 keywords). There's no good, automated way to tell what's commercial, and sometimes that decision is even a bit subjective.
Hey Dr. Pete, you can track it by measuring the ad volume (number of ads shown) for a certain query. This might be a better way to decide what is considered "commercial" by Google.
I think ads and organic have a lot in common.
Also, it would be great to see future updates (Panda, Penguin, and others) and their impact on queries with high ad volume.
That's an interesting idea - thanks.
How do you guys get away with pummelling Google? How many queries per hour are coming out of the 'plex aimed at Google. Do you have a special arrangement?
I'm sure if I tried this at home there'd be "Oooops - you appear to be an automated scraper" messages.
Obviously, that's a delicate subject, but by the standards of these kinds of systems and at the scale of Google, what we're doing hardly qualifies as "pummeling". It's a relatively small data set, crawled only once per day and spread out over a fairly large amount of time. We try to synchronize when any given keyword is crawled (so that each crawl is roughly 24 hours apart), but the set is crawled over the space of a few hours.
Thanks for not just making the data available, but making it beautiful – awesome data visualization FTW!
Nice new stuff. I think as we get more data, we will see some interesting trends emerge.
For your SERP count, what exactly is included? Is it just plain organic results or also Places, Videos, News... An infographic style SERP graphic on what gets included would be informative.
From what I see, sitelinks have an influence on the shorter SERPs. With your data set, can you confirm/deny that? I'd also like to know if domain diversity correlates with sitelinks. A little side project for you ;-)
Another interesting area of analysis would be rich snippet adoption. Are more author images showing up, or review ratings, breadcrumbs, videos, etc.? I'd hope that graph would have an upward trend.
I also like @Martin Panayotov's suggestion of tracking ads and @evolvingSEO's suggestion on the Knowledge Graph, if only to see percentage changes.
You may also have to start forecasting weather around the world. Walmart (from the Big 10) does not do so well in Australia, mainly because they don't exist here. Local must be an issue – does your crawler anonymise location so it gets a general US set of results? Localised results can get infected with Places entries and be really skewed.
Your idea has opened up our eyes to a lot of new ways to look at things. Thanks.
Sorry, only one question/suggestion per visitor, please ;)
Let's see... currently it is only organic results - we parse those out and save them. The historical data is where the magic comes in right now. If we can think of a new way to crunch the numbers, we can crunch them back to April, which is pretty cool. The next big step is to expand the data set in a way that can be segmented (probably by vertical/category). We don't have other types of results yet, but I can see that down the road.
Data is de-localized and targeted to generic US results. We also took efforts to remove keywords with clear local intent (e.g. "new york restaurants").
The 7-result SERPs almost always occur when the #1 result has sitelinks or other rich data. There's not a clear correlation between diversity and 7/10 results, but there are a lot of ways to count diversity, so it's been tough to tell. There is some very interesting stuff going on with the 7-result SERPs, and it's much more complex than most people realize. We suspect there are two "shards" of the algo in play and are collecting data to back that up (or refute it).
International is on the road map, but not sure when yet. UK and then Australia would probably be first, because we can use the English keyword set. I'm not sure if we'll expand to non-English countries, just from a scope perspective, but might be open (down the road) to partnering with people.
It is amazing that one website (Wikipedia) can hold 4% of the rankings. I realize this is only one data set, but no matter how you measure SERPs, Wikipedia is still going to pull a lot of weight.
Also, I am encouraged to see exact-match domains only showing up 3% of the time. I think we have a long way to go before partial-match domains are affected, but exact-match domains are not the power-player they once were (although they still work for low-competition niches).
Dr. Pete, this is a huge improvement to a tool that was already fun to track. Thanks for working so hard on MozCast.
You are a genius, Dr. Pete! You have clearly put in really hard work. A big thank you from me for doing this for the community. As Barry says, the SEO community is a strongly bonded one.
I was really curious to know more about EMDs, and MozCast has delivered. Your last post was a really good deep dive. I had analyzed so many queries and results before you incorporated this data into MozCast, and I have also started a discussion in the SEOmoz LinkedIn group. I firmly believe, and MozCast suggests the same, that more than three percent of results are dominated by EMDs. I have seen so many people choosing EMDs for selected keywords and performing well without doing much work. That scares me a little, because I think Google will make an algo change, if not in the near future then some other day.
I may be too new to make suggestions, but social media is a hot topic, and MozCast should include social signals. What percentage of SERP results do social media platforms occupy? Maybe you could choose a few top social media platforms for that.
Once again, thanks to the whole MozCast team!
We don't have any current plans for social data within MozCast, but we're definitely pursuing social metrics in the larger SEOmoz family of products. In case you missed the announcement, the FollowerWonk team is now part of SEOmoz, and their social metrics are available for free to PRO members.
Great work! I love the mozcast site.
How do you explain allrecipes.com in the top 10? Does your list of 1,000 keywords contain lots of food-related queries?
A fair amount - yes. The #10 spot changes a lot, though. Other #10 sites on any given day over the past couple of weeks include Yelp, Medicine Net, and Mayo Clinic. There are a lot of consumer-related terms in the list.
Technically, Matt wasn't really lying. He did say that the algo change "improves the diversity," but like you say, it's only by a barely-even-worth-counting 0.4%. It's headed in the right direction, admittedly, but it certainly doesn't sound like normality – and common sense insofar as the SERPs are concerned (if common sense comes into play with the SERPs at all) – has been restored.
Exactly - (1) we don't know what he means by "diversity", and (2) we don't know what he means (or how much) by "improved". What bothers me is that it's a sort of faux transparency. Looking even at this year, the diversity situation appears to have gotten a lot worse - recent improvements maybe make up for 20-30% of the loss after Penguin.
Hmmm... so you think PMDs got targeted due to a higher percentage of them being seen as spammy by Google's quality team? I know my own experiments have shown Google has never been a real fan of "buy-widgets" over "buywidgets". Is there anything in the data that shows details about the types of TLDs or ccTLDs that might have dropped out of the top 10?
There's a lot more data in my recent EMD post - see the last few graphs. PMDs got hit hard by Penguin and never really recovered, but it isn't clear that hyphens are necessarily the kiss of death - both hyphenated and non-hyphenated show the same decline. Dot-com PMDs held a little stronger, but there are a lot of correlations baked into that. I think that the kind of sites that use low-value PMDs also tend to use the kind of linking tactics and anchor text that Penguin smacked down.
My perception of Mozcast just went from "huh?" to "wow!". Thanks for the efforts you've put in to build out this tool and for the post explaining it all. Much appreciated.
One thing we really want to make clear going forward is that there's a lot of data behind this. The weather analogy is fun, and I think it's useful, but it's just a start. I also believe that data is only as good as what you do with it.
Right on. As much as I appreciated the tool (prior to this post), I wasn't sure what to do with Mozcast or how we could use it to help our clients. Now it all makes sense. So again, many thanks.
Very interesting information.
By the way, I missed that Cutts tweet!
This is great! It's so much more meaningful and applicable than just one number. Thanks!
This is great! The more data we have, the more insight we gain into the algo. According to mozcast.com, most of the changes tracked were last Thursday and Friday, just before they announced the official Panda 2.5 release! Looking again today, there is still some fluctuation as a result of the recent update. Thank you for the MozCast 'weather' tool – it ROCKS!!
Pete,
I saw your MozCon presentation. It really inspired me, so I have asked our dev team to create something similar to what you are doing. It will give us and everyone else another data comparison tool. We will put up the analysis on https://www.agencyplatform.com. But our plate is full for now, and it will take us at least a few months to come up with this.
In the meantime, here are some cool ideas for you to implement on MozCast:
1) A subscribe option, so that marketers can get an alert as soon as changes reach a certain threshold. Maybe even characterize what kind of update it is (with reference to Panda or Penguin) in the email?
2) Update subscribers on changes in the UI (track the overall visual display and CSS)?
3) Update subscribers if the average number of first-page results changes (increase or decrease)?
Thanks for this awesome tool! And keep inspiring us!
Thanks - glad you enjoyed the presentation. We're definitely pondering some notification options, although mass email always gets a bit tricky. Also actively toying around with (2) - I built a crude measurement of source-code change, but it's not very effective. I want to start tracking things like vertical results, Knowledge Graph, Local, UI elements, etc. to catch feature changes ASAP. That gets tricky, because now you're diving deep into source code that changes all the time. It's doable, though.
although mass email always gets a bit tricky
Yeah. I would not send emails from the seomoz domain. Mozcast, maybe? But if you do a double opt-in and use a good transactional email vendor, it should not be tricky. Just ensure that subscribers have the option of being notified daily, weekly, or in real time as changes happen, and also the ability to set a maximum number of emails they can receive per day. The last part is important, because on a crazy Google day you do not want your system to send emails every hour ;)
Hey Pete,
Check out Sendgrid for the email stuff.
-Mike
Yeah. I like Sendgrid. We have been using it for close to six months. Can't complain.
Thanks for providing this comprehensive data.
My suggestion: provide more segmented data (local vs. general SERPs), and maybe even break it out by verticals (i.e., industries). This would give us more "local weather", since some vertical SERPs have different numbers of results, ads, and images.
But this is already a great step ahead! Thanks again.
We're looking to segment out some verticals, but we need to expand the data set. Right now, once we carve it up, it's too small to be reliable (statistically speaking), so it's hard to slice and dice. Expansion plans are in the works, but it's probably going to be a couple of months.
Statistically reliable is the name of the game. :)
That's a big step in the right direction! 2 thumbs up.
Is there any intent to apply all of this awesomeness to Bing yet?
It would seem like all of the code assets to do it would be on hand at the 'moz already, and just need to be grafted together. Though maybe not.
We've decided to deep dive into Google for now and focus on really understanding that data and expanding before we tackle Bing. Even if the core logic is similar, the deeper analyses are very different. I don't want to get stuck in a mode where everything we do has to be useful to both, or we may end up with something too generic.
Thanks, dude. Personally, I feel this is great research on Google updates – it's directly relevant to Google's new Panda 3.9.2 update.