At SEOmoz, we compute and track correlations against Google search results with each Mozscape index release. Recently, we've noticed some interesting changes in the page level vs domain level link correlations and decided to investigate. We uncovered some striking differences between the new 7-result SERPs and the standard 10-result SERPs.
Tracking Link Metric Correlations in Mozscape
Before I dive into the data, I want to provide some background information on our data set and methodology. We use correlations against Google search result pages (SERPs) to track algorithm changes and the quality of our Mozscape index. We have published the results of these many times in the past, including the Search Engine Ranking Factors post and in the blog post announcing each monthly index update (see the September update). To summarize the process, we first take a set of keywords and run them through Google to collect the top 10 or 50 results. We then pull the link metrics from Mozscape for each URL in the SERPs (Page Authority, number of linking domains to the domain, sub-domain mozTrust, etc). Then, we compute the Spearman correlation between search position and each metric for each keyword. Finally, the correlations are averaged across all the keywords to produce one number for each metric, the mean Spearman correlation.
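To make the process above concrete, here is a minimal Python sketch of the mean Spearman correlation. It is an illustration under stated assumptions, not the code behind the published numbers: the function names, the `serps` layout (each keyword mapped to a list of per-result metric dicts in rank order), and the sign flip (so that a positive value means larger metric values tend to sit nearer the top) are all assumptions made for the example.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns None if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return None
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mean_spearman(serps, metric):
    """Average the per-keyword Spearman correlation for one metric.

    serps: dict of keyword -> list of dicts, one per result, in rank order
    (assumed layout). The sign is flipped (assumption) so that a positive
    value means higher metric values appear closer to position 1.
    """
    per_keyword = []
    for results in serps.values():
        positions = list(range(1, len(results) + 1))
        values = [r[metric] for r in results]
        rho = spearman(positions, values)
        if rho is not None:  # skip keywords where the metric never varies
            per_keyword.append(-rho)
    return mean(per_keyword) if per_keyword else None
```

Calling `mean_spearman(serps, "page_authority")` once per metric would give one number per metric, which is the quantity plotted in the charts below. The metric key here is hypothetical.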
Since Mozscape includes some 40+ link metrics for each URL, this process results in 40+ correlations. In practice, many of these correlations are similar, since the link metrics themselves are similar. For example, we'd expect the correlation with the number of links to a page to be similar to the correlation with the number of followed links to the page. A conceptually useful way to summarize the data is to group them into page level and domain/sub-domain level metrics. Page level metrics are associated with the actual page itself: Page Authority, number of links to the page, number of domains linking to the page, and mozRank. Domain and sub-domain level metrics measure the link authority of the entire domain, for example the Domain Authority and number of domains linking to the domain. As a concrete example, imagine an unpopular page buried on Wikipedia without a lot of direct links. It will have low page level metrics, but we might expect it to rank simply because the Wikipedia domain has so many links to other pages.
With these preliminaries out of the way, we can dive into the data.
This chart shows a time series of Mozscape correlations on the page and domain/sub-domain levels for all the index updates since November 2011. Focus first on the solid green (page) and blue (domain/sub-domain) lines. Each index update is marked with an "X." Except for a smaller, lower quality index (36 billion URLs) and the larger, experimental 150+ billion URL indices, the two values have been more or less constant over time. The 10,000+ keyword, top 50 result SERP set was updated every two months or so during this time, so both Google algorithm updates and Mozscape releases are represented.
Now focus your attention on the September Mozscape update at the far right. It includes two sets of correlations: the solid line and "X" represent values from SERPs fetched in late June. The dashed lines with filled circles indicate values from a SERP set fetched in mid-September. Everything else remained constant: the keywords did not change, and both sets of link metrics were pulled from the September Mozscape index. However, the correlations jumped in an interesting way. Every page level correlation increased and every domain/sub-domain correlation decreased. I haven't seen this type of behavior since I started tracking these values a year and a half ago, and it was the motivation for the following analysis.
Enter Mozcast data, stage left
I had a suspicion that this jump in values was due to an algorithm change at Google, and wanted to see if I could tease it out of the data. Dr. Pete was kind enough to provide a data dump of the Mozcast SERP history from July 1 to September 15 to do some more analysis. Even though the Mozcast data only includes the top 10 results for 1000 keywords, it provides a daily time series to pinpoint the change. More data FTW!
This chart shows the time series of page and domain/sub-domain correlations from the Mozcast 1000 keywords. The solid blue line is a smoothed version of the raw data (the noisy light dashed line). There are a few things to note here. First, the magnitudes of the correlations are different from the first plot, but the overall trends are the same. The differences are due to the different data sets (1000 vs 10,000+ keywords and top 10 vs top 50). Second, the page level metrics do indeed increase over the time period, with a noticeable increase centered around August 12-14, the days when Google started displaying the 7-result first page (see these two posts for more information about the new 7-result SERP). Finally, the domain/sub-domain metrics decreased during the last two weeks in July, at the same time domain diversity decreased (see the 90 day history of diversity over at Mozcast).
So, how about those 7-result SERPs?
I was intrigued by the idea that the new 7-result SERP might be associated with an algorithm change, so I decided to probe further.
The 7-result SERP was fully rolled out by August 15, leaving a month of data after the change to analyze. This is a histogram of the percent of days from August 15 to September 15 (31 days) that each keyword had 7 results. The important thing to notice here is that it has two spikes at 0% and 100% and not much in between. Put another way, most keywords have either 10 results or 7 results on all days, and only a small portion alternate between the two cases. With this data, I created two cohorts, one with keywords that had 7 results for 30 or 31 of the days and a similar cohort of keywords that had 10 results for 30 or 31 of the days. All told, there are 144 7-result keywords, 808 10-result keywords and 48 flip-floppers.
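The cohort split described above can be sketched in a few lines of Python. This is only an illustration: the function name, the `daily_counts` layout (keyword mapped to one result count per day), and the cohort labels are assumptions made for the example, while the 30-of-31-days threshold comes from the text.

```python
def build_cohorts(daily_counts, min_days=30):
    """Split keywords into 7-result, 10-result and flip-flopper cohorts.

    daily_counts: dict of keyword -> list of first-page result counts,
    one entry per day (assumed layout, 31 days in the post).
    min_days: minimum number of days a keyword must show the same count
    to join that cohort (30 of 31 in the post).
    """
    cohorts = {"7-result": [], "10-result": [], "flip-flopper": []}
    for keyword, counts in daily_counts.items():
        days_with_7 = sum(1 for c in counts if c == 7)
        days_with_10 = sum(1 for c in counts if c == 10)
        if days_with_7 >= min_days:
            cohorts["7-result"].append(keyword)
        elif days_with_10 >= min_days:
            cohorts["10-result"].append(keyword)
        else:
            # Keywords that alternate between the two layouts.
            cohorts["flip-flopper"].append(keyword)
    return cohorts
```

The per-cohort correlations can then be computed by running the same correlation procedure on each list of keywords separately.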
With these groups, it is possible to compute link metric correlations for the 7-result and 10-result keywords separately. The results are striking: the 7-result keywords (red) have near zero domain/sub-domain link correlation, but have a huge page level correlation! On the other hand, the 10-result keywords (green) are much more balanced between page and domain link signals.
Now, we all know that correlation is not causation and these results are only averaged over a small sample of keywords that may not be a representative sample of the entire universe of keywords. In addition, any individual keyword may exhibit different behavior than the average. That being said, if we indulge ourselves and ignore these caveats for a thought experiment, we can revisit our example of the unpopular Wikipedia page without many direct links. This page has amazing domain/sub-domain link metrics but poor page metrics. If this page is competing for a 7-result keyword, all the Wikipedia link authority wouldn't help it rank. On the other hand, if it is competing for a 10-result keyword the Wikipedia link authority will help it rank.
Got any more data for us?
Just a bit more. We can bring in some Adwords data to see if there are any other systematic differences between the 7-result and 10-result keywords.
Here, I've plotted histograms of the Adwords "Competition", the log of the US monthly search volume and the log of the cost per click. As in the prior chart, red lines represent the 7-result keywords and green the 10-result ones. We can see that there are (statistically significant) differences in the competition and CPC, but they have the same search volume. The new 7-result keywords have lower competition and CPC.
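The post does not say which statistical test was used to compare the two cohorts. As one possible way to check a difference like this, here is a hedged Python sketch of a two-sided permutation test on the difference in means. The function name and parameters are hypothetical, and this is a swapped-in technique for illustration, not necessarily the method behind the chart.

```python
import random


def permutation_test(a, b, n_permutations=10000, seed=0):
    """Two-sided permutation test for a difference in means.

    a, b: lists of numbers, e.g. log CPC for the 7-result and 10-result
    keywords (assumed inputs). Returns an estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        # Reassign the cohort labels at random and recompute the difference.
        rng.shuffle(pooled)
        first, second = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(first) / len(first) - sum(second) / len(second))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_permutations + 1)
```

A small p-value for competition and CPC, together with a large one for search volume, would match the pattern described above.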
OK, so what does all this mean?
I'm not certain, so I'll offer a few ideas. I'd like to hear your interpretations and experiences in the comments below.
- It doesn't mean anything. Just like the cat chasing its tail around, you are chasing phantom signals around in a noisy data set. This is possible, but I don't think so.
- Those correlations are just so different, Google must be using a different algorithm for these 7-result and 10-result keywords. Ohhh, now that is tantalizing isn't it? I suppose this too is possible, but not likely either. If they were, then they have been using these two different algorithms long before they rolled out the 7-result SERP since the split in the correlations has existed since at least the beginning of July.
- These 7-result keywords are systematically different in some way than the 10-result ones, and we are seeing symptoms of that in the correlations and the Adwords data. Imagine the process that takes a search query and returns the SERP. The first step very well might be a new classifier that decides whether to return 7 or 10 results before passing the query on to the rest of the ranking model. This classifier takes some inputs - perhaps some information about the link metrics as well as some additional information - and makes a decision. In the process it preferentially selects queries from a part of the keyword space that includes low domain correlations, high page correlations, and low CPC.
A final shout out to Dr. Pete and Jerry Feng
This post wouldn't be complete without acknowledging Dr. Pete and Jerry Feng. Pete graciously provided the Mozcast data used in this analysis as well as encouragement and insight. He also kept my crazy ideas in check. Jerry is SEOmoz's newest data scientist and helped with the initial analysis. He's currently thinking about how to best improve the Page Authority and Domain Authority models.
Nice work and that was really complicated :-) Since the 7 result SERPS are usually for branded terms, it kind of makes sense that the results should be domain level and not page level.
Glad you liked the post!
Unfortunately I don't think the answer is that simple, and that was one of the reasons I looked at the data. There are two pieces of conflicting information. First, if I scan through the list of 7-result keywords, there are quite a few keywords that are not branded, and these are pretty generic search terms. Second, the 7-result SERPs have the lowest domain/sub-domain level correlations (close to zero, the bottom red line in the second to last graph) so their domain wide metrics are not helping them, on average.
Can you give examples of non-branded KWs that yield 7-result SERPs?
"six sigma" returns 7 results for me, and this isn't branded. The first result is a Wikipedia page.
Which is darn puzzling, because that's a high-volume, high-comp KW. There are plenty of decent results on page 2. Has Google actually said what they're doing with 7-result SERPs?
That's interesting, because searching for both six sigma and "six sigma" in the UK returns a 10-result page. I'm not signed in.
Very very interesting analysis, except the lack of axis unit labels in the last graphs makes them quite hard to grasp.
Could you describe them a bit more ?
Thumb up anyway ^^
Thanks for the comment and question. I should have clarified that in the post. I didn't add labels because the values aren't very meaningful, but I'll try to explain it here. I'm more of a visual person, so the important thing to me was the difference in the shapes.
These curves are histograms, so the height of each line represents the proportion of those keywords with each value. For example the bottom panel shows that a much larger proportion of the 7-result keywords have low CPC compared to the 10-result ones. Since the middle panel curves are very similar, the distribution or spread in the search volumes are nearly identical. The top panel shows that a large portion of the 7-result SERPs have low competition compared to the 10-result SERPs.
Warning: the rest of this comment will get exceedingly geeky...
As for the units, the x-axis units correspond to what is plotted in each panel, so from top to bottom they are:
Since these are probability density functions, the area under each curve is 1, and the units on the y-axis are 1 / (x-axis bin width). There are 10 bins in each, so the values in the top panel are 1 / (0.1 competition) and similarly strange units for the remainder.
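For anyone who wants to check the units argument numerically, here is a small Python sketch of a density-normalized histogram. The function name and arguments are made up for the example, and it assumes every value lies between `lo` and `hi`.

```python
def density_histogram(values, lo, hi, n_bins=10):
    """Return (heights, bin_width) for a density-normalized histogram.

    Each height is count / (n * bin_width), so its units are
    1 / (x-axis units) and the bar areas sum to 1.
    Assumes lo <= v <= hi for every v in values.
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp so that v == hi falls into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    n = len(values)
    heights = [c / (n * width) for c in counts]
    return heights, width
```

Summing `height * width` over all bins gives 1 regardless of the x-axis units, which is why the curve heights themselves are hard to interpret directly.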
Makes sense ! Thanks for the explanations.
I've been there before, it's hard to choose between being extra precise with all indexes and numbers even when they won't mean a thing to some people, or dumbing things down to a more readable but imprecise description like
------------------------------------------------------>
Less competition More competition
Hard to have it all^^
Thx again for this nice analysis.
If I'm interpreting the data correctly the volume for the 2 sets of keywords is similar but the cpc and competition is a lot different?
If that is correct then:
A set of keywords brings lower income but has a potential to be a lot more profitable (for google) since the volume is comparable. So what would you do if you were google?
Increase the CTR. Limit the choice with 7 results.
That would be an Evil Google hypothesis now wouldn't it :-)
There is nothing evil about it, just good business.
People have bought the google mantra: "Do no evil", too literally.
In reality Google is a company with shareholders.
This will increase their profits, with no or very little impact on regular user experience.
I run with 100 results for years now so I see no problem with this.
20+ results from the same domain, and lousy results that's something to rant about ;)
Anyway I would call Google hypocritical at most but the same could be said for 99.9% percent of multinational companies.
Hmmm very interesting analysis but It says just nothing - need more data to get some results. It's like analysing Yandex xml in Russia ;)
Thanks for a great analysis! I am wondering though, why a 7 result SERP is better than a 10 result.
Is it more efficient for Google's servers to output 7 results per page, than 10? Faster, and more 'ready for more vertical media' searches?
It definitely makes ranking a bit harder...
It's better for google. Less choice, more ad revenue.
Totally went over my head lol
Same here, had to re-read it a few times :)
Rigid scientific analysis. Nothing less, nothing more! Awesome!
This is definitely the most interesting part of this post. Is Google handling domain diversity differently on 7- and 10-result SERPs?
Wow! I had to read the article several times, and it is fascinating. Will it be possible to predict which keywords will lead to 7- or 10-result SERPs?
Great research, love it!
Lot to chew through there, but while we're on it - 'crystal healer' gives 7 results too.
Not on my serps.
Fascinating stuff!
I'm personally seeing very few of these in Australia so far, but I expect they'll start becoming more common.
So many people's sites have dropped in the rankings after the recent Google algorithm changes. How can they recover? It's a hot issue, and there is a need for a step-by-step guide to avoid a Google penalty.
Hello Matt,
This is a very helpful post. I also do some of my test regarding generic and branded keywords and I find this article truly informative. The search algorithm updates are sometimes hard to decode. There’s a lot of complication but some SEOs are still waiting for more updates before making the move. Again, thanks for sharing!
timely post. i thought this was just a difference between mobile searches and desktop searches. this feels like a difference between longer tail keywords and head terms based on the adwords competition data and what i am seeing with our terms. it would make sense to show 7 results for long tail terms since there is less quality pages on those terms. tx
Great investigation of page level vs domain level link correlations. It is good to know about these changes; I first saw the 7 search results many days ago. You have written a really nice article about it. +1 from my site for this content.
thanks Matt, very informative. I'm starting on some analysis with keywords. can you share what keywords sets that you used? is the kw sets public info?
I have seen in Italy https://www.google.it/#hl=it&output=search&sclient=psy-ab&q=Preventivo+trasloco&fp=6a018a265500e3a8 .
Thanks for this post. It provides great insight on the latest algo update. SEO is constantly changing and thats what makes it exciting and frustrating. I was once flying high on page 1 but once google rolled out the 7 pack I got pushed down to the bottom of page 1 or now page 2. Trying to figure out how to get in the 7 pack. Any ideas?
Nice post, but I want to highlight a post which suggests that
Google Search Results: From 10 To 7 To 4?
Here is the URL: https://www.seroundtable.com/google-four-results-15658.html#disqus_thread
Can you please take a look and give us your thoughts?
Thanks for the link - I hadn't seen that. I haven't seen any 4 result pages and "six sigma" returns 7 results for me now. Smells like an A/B test to me...
Yes Matt you are right
Wow, TONS of data and interesting analysis. Thanks for the post!
You should do an analysis on the 10-result data as though they are 7-result pages. In other words, it could be that 2 algorithms are in use, one for 7-result pages and one for 10-result pages. But, it could also be that 2 algorithms are used on all pages: one for the top n results and another for the rest.
This wouldn't surprise me at all to learn this was the case and that insight was buried if we don't look for it. In fact, I have seen evidence as I look at my own site's behavior on the SERP's that there is some funny business going on between pages that rank very high and pages that are much lower.
One bit of detail I left out of the post, but that you've hit on, is that I did only use the top 7 results from Mozcast for all the correlations, otherwise we'd have a bit of an apples to oranges problem when comparing them to the 7 result SERPs. In general, I've found that using more results increases the correlations. For example, the spread in the metrics between the 1st and the 100th result is much larger than between the 1st and 2nd, so correlations increase.
More to your point about two separate models for the top N vs the rest - that's intriguing isn't it? If I put on my software engineer's cap for a minute, something like this might make sense for the first page vs the remaining pages given the performance/latency requirements of the first page. The first page needs to be returned near instantaneously, but the subsequent pages can have much longer lags. This opens some doors, for instance if the first page results need to be served from memory or the local disk, but the subsequent pages can be fetched from a remote data store.
Haven't noticed any 7-serp results in Australia yet, but will definitely keep an eye out while doing some manual checks during reporting next week.
We all appreciate SEOMoz for explaining in detail the changes in Google Algorithms as well as the SEO/SEM impacts. This post was especially data intensive. Thanks Matt.
This post reminds me of Einstein and his Glorified Unified Field Theory
so you're basically saying it won't stand the test of time? :)
Lots of data and work, yet to be conclusive :0
Is it possible that only queries defined by Google Quality Raters as "Vital" are triggering the 7 SERPs?
Good idea! I just checked the Google quality guidelines that were leaked a few weeks ago. Here's the definition of Vital:
This certainly seems like the case for branded queries, but it doesn't hold for the non-branded ones. For example "six sigma" is a 7-result SERP and the first result is Wikipedia.
I think there's a relationship with dominant interpretation, but it's not direct, if only for practical reasons. Quality Raters can't drive results directly - it just doesn't scale. Quality Raters use "vital" queries and similar concepts to fine-tune the algorithm, and in some cases, this may (I'm speculating) lead to manual overrides (if a major brand isn't ranking #1 for its own site), but Google still prefers to do 99.9% of the heavy lifting with the algorithm.
So, in other words, it may be that many 7-result SERPs are "vital" queries or queries that have a dominant interpretation, but there's still an algorithmic rule in place that triggers it. We don't have a good handle on the brand/non-brand split, but as Matt said, there are definitely some non-brand queries in the mix. Interestingly, too, there are a handful of queries that flip from 7-to-10 and vice versa every couple of days. If it was based on human rating data, you'd expect that to be more fixed.
"Interestingly, too, there are a handful of queries that flip from 7-to-10 and vice versa every couple of days."
Split tests?
Doesn't seem to be a test - seems to be query-dependent. Some keywords never budge and others have flipped back and forth for the last month. I strongly suspect that it's algorithmic, and they're right on the edge of whatever threshold is triggering 7-result SERPs. I'd also speculate that Google has tweaked that 7-result algo a couple of times since launch.
I definitely agree with the idea that there are dominant interpretation indicators, and if these indicators are within a pre-set range for a subset of data points, then the 7 result SERP is triggered. The 'six sigma' example is the first non-branded query I've seen produce 7 results though.
Can you share a couple more, ya know ...for science?
I'd like to analyze a few to see if I can contribute anything meaningful.
Also Matt, fantastic post! I'd love to see more data driven posts from Moz in the future. I will mention though, as I'm sure you're aware, anytime you pull results by top 50 or top 100 per page, your data will be heavily skewed because of the domain crowding issue. So instead of Amazon having 1 result in 10 RPP serps, they might have 7 (or more) results in 50 RPP which would drastically skew your domain level link metrics. I've actually seen some queries produce a SERP with a single domain holding the top 12 positions.
I love data and this still made my head swim! I only saw 7-result SERPs for a day around when Dr. Pete was talking about Shrinkage. Since then I've only noticed 10-result SERPs.
the 7-result keywords (red) have near zero domain/sub-domain link correlation, but have a huge page level correlation! On the other hand, the 10-result keywords (green) are much more balanced between page and domain link signals.
This is the heart of the post. But it is just what we already knew about authority, relevance, and link signals. The Wikipedia example is a good example. But pages that rank well (in this case the red line) are more affected from fluctuations in SERPs, because of huge page level correlation missing link metrics about domain. On the other hand, page from domains that have both, are more likely to still remain in SERPs, so not to be outranked easily.
Tks for sharing this data detective work
I love data detective work. This is fascinating.
Is anyone seeing 7-result SERPs rolling out elsewhere around the world?
I've seen them in Italy and Spain, but not so frequently.
For instance check this with "Scala di Milano" as query:
https://www.google.it/search?&q=scala%20di%2Bmilano&pws=0
I've seen a few in Australia (can't think of any off the top of my head as it was very late night/half asleep browsing).
I agree with you Will, it's always nice to see data detective work and folks who are willing to share.
I've seen this from the first day I heard of the 7-result serp. Some examples of these result in Scandinavia are:
https://www.google.se/search?hl=sv&q=ikea&pws=0
https://www.google.no/search?hl=no&q=vikingskipshuset&pws=0
https://www.google.dk/search?hl=da&q=Glyptotek&pws=0
These types of results occur mostly with brand related queries. So I would say that the 7-result serp is common to see with those types of queries.
I have also seen 7 result SERPs https://www.google.com/search?q=GeoTrust+Wildcard+SSl&pws=0&gl=US
I haven't noticed a change in Denmark, except for brand specific searches, as Carl Joel wrote.
Hey Matt,
Thank you for sharing this data. Is it possible to share some more information about the keywords you used in this analysis? I figured i'd try to translate them to Dutch and see if I get similar results in some way. That could indicate that certain types of queries or words trigger the 7 SERP.
I'm usually all for sharing data, but unfortunately I can't in this case. These are the same keywords used in Mozcast and we'd like to keep them secret to protect that project.
is this saying that the domain name of a site as a ranking signal has been decreased in the 7 SERPS?
I didn't look at the domain names. Dr Pete put up a great post two weeks ago that looks at that in detail:
Are Exact-Match Domains (EMDs) in Decline?
During the back and forth for this post, we looked at the influence of EMDs for queries with 7 results vs. 10 results, and there wasn't really a clear pattern. In some cases, a domain that got the top 2-3 spots in 10 results dropped down to just 1 spot with site-links. At the same time, though, that one spot is now 1/7 of the results. So, some domains went from 2/10 to 1/7 - other queries didn't really change or got less diverse. The end result was a bit of a wash. It feels like diversity is part of this, but we can't find clear evidence in the numbers yet.
these key terms showing 7 result as well:-
1- gatwick airport
2- manchester airport
3-aph parking
4- birmingham
5- purple parking
not sure why but definitely there is a trend...