This past week during the SMX Advanced conference in Seattle, I presented some correlation data alongside Janet Driscoll-Miller, Sasi Parthasarathy of Bing & Matt Cutts of Google. Matt in particular was quite vocal in expressing a desire to see additional data points from our research, primarily around the prominence/visibility of particular elements in the results. This post is intended to help make that available.

2 Tweets from Matt Cutts

I must say that I don't agree with Matt on the importance of the raw visibility/counts over the ranking correlations. My feeling is that SEOs in these spaces are more interested in answering the question - "what features predict a result will rank higher vs. lower on page 1?" - rather than the more straightforward - "does this feature appear more frequently on page 1 at Google or Bing?" However, I certainly agree that both are relevant and interesting.

If you're trying to wrap your head around how to understand this prominence/visibility data vs. our earlier data on the correlation with rankings, here's how we'd best describe it:

  • Correlation w/ rankings data helps to answer the question, "when this feature appears in results on the first page of Google/Bing, who ranks it higher and by what amount?" Those correlation numbers were derived by looking at the likelihood that a result would rank above another when it contained the target attribute.
  • Visibility/prominence of an element helps to answer the question, "is this element more likely to appear on the first page of Google's or Bing's results?" This simply looks at the number of times we saw a result (or multiple results) ranking on page 1 containing the target attribute.
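To make the difference between the two questions concrete, here is a small illustrative sketch. It is not the statistic used in our research; `pairwise_rank_advantage` and `prevalence` are hypothetical helpers, and the two sample SERPs are made up. Each SERP is represented as a list of flags, top result first, where `True` means the result has the feature.

```python
from typing import List, Optional

def pairwise_rank_advantage(flags: List[bool]) -> Optional[float]:
    """Fraction of (with-feature, without-feature) pairs in which the
    result WITH the feature ranks higher. flags[0] is the top result.
    Returns None when the SERP has no such mixed pairs."""
    wins = 0
    pairs = 0
    for i, higher in enumerate(flags):
        for lower in flags[i + 1:]:
            if higher != lower:      # one has the feature, one does not
                pairs += 1
                if higher:           # the higher-ranked one has it
                    wins += 1
    return wins / pairs if pairs else None

def prevalence(flags: List[bool]) -> float:
    """Share of results on the page that have the feature."""
    return sum(flags) / len(flags)

# Two made-up SERPs with identical prevalence but opposite ranking behavior.
top_heavy = [True, True, False, False]
bottom_heavy = [False, False, True, True]

for name, serp in [("top_heavy", top_heavy), ("bottom_heavy", bottom_heavy)]:
    print(name, prevalence(serp), pairwise_rank_advantage(serp))
```

Both sample pages have the feature on half their results, yet in one the feature always sits above the non-feature results and in the other it never does. That is why prevalence and ranking correlation can point in different directions.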

We're looking at the latter one in this post, but before we dive in, there are a few critical items to understand:

  • This isn't correlation data and there's no standard error or deviation numbers here. It's simply how many times we saw the element in the results we gathered, divided by the total number of results (SERPs or URLs depending on the chart) to get a percentage. 
  • This data is from page 1 of the results for 11,351 search queries, gathered from Google's AdWords categories. This means the terms and phrases vary somewhat in search quantity (from sub-100 searches per month to tens or hundreds of thousands) but generally have a commercial focus and intent. They generally don't include brand names, long tail phrases or vanity name searches. Overall, we picked them because they're precisely the kinds of queries most SEOs care about when they're doing competitive SEO for their companies and clients. We also ignore the second result in a SERP from the same domain to avoid effects of indented results (which was important for our earlier statistics, but not those in this post).
  • The results were collected the week of May 31st and thus include post-"Mayday" update SERPs and likely results from after the "Caffeine" launch as well (though Google did not announce exactly when that rollout occurred; it may not have much bearing, as Caffeine is supposedly an infrastructure change rather than an algorithmic one).
  • Each feature has two pie charts: one showing the percentage of SERPs that contained at least 1 URL with this feature, and another showing the percentage of all URLs across all SERPs that had the feature (102,296 URLs for Google and 109,966 for Bing; note that the number of standard web results shown on page 1 varies from SERP to SERP). These are labeled as "(feature) in SERPs" and "(feature) in URLs," respectively.
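For anyone who wants to reproduce the two chart types from raw results, the arithmetic described above can be sketched roughly as follows. The function name `prevalence_percentages`, the `has_feature` callback and the sample URLs are all hypothetical; this is a minimal illustration of the counting, not the code we actually ran.

```python
from typing import Callable, List, Tuple

Serp = List[str]  # one SERP = the list of result URLs on page 1

def prevalence_percentages(
    serps: List[Serp], has_feature: Callable[[str], bool]
) -> Tuple[float, float]:
    """Return (percent of SERPs with at least one matching URL,
    percent of all URLs that match)."""
    serps_with_feature = sum(
        1 for serp in serps if any(has_feature(url) for url in serp)
    )
    total_urls = sum(len(serp) for serp in serps)
    urls_with_feature = sum(
        1 for serp in serps for url in serp if has_feature(url)
    )
    pct_serps = 100.0 * serps_with_feature / len(serps)
    pct_urls = 100.0 * urls_with_feature / total_urls
    return pct_serps, pct_urls

# Made-up sample: two tiny SERPs, one of which contains a .org result.
sample_serps = [
    ["http://a.com/", "http://b.org/", "http://c.com/"],
    ["http://d.net/", "http://e.net/"],
]
in_serps, in_urls = prevalence_percentages(sample_serps, lambda u: ".org/" in u)
print(f"(feature) in SERPs: {in_serps:.1f}%  (feature) in URLs: {in_urls:.1f}%")
```

Note that the "in SERPs" number can only go up as more results are shown per page, which is one reason the two engines' differing result counts matter when comparing charts.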

In gathering this data, we did not optimize to share it in this fashion. In fact, Ben & I both feel that if we wanted to do it this way, we should gather the first 3-5 pages of results, not just the 1st page. That way, one could compare the counts on page 1 with the counts on page 2. However, since we've got the data and Matt, Sasi and several other folks expressed interest, we're sharing anyway. Hopefully in the future we can do more on this front.

Let's dive in!


Exact Match Domains

These are domains that precisely matched the keywords in the query - e.g. for the query "dog collars" only a domain that matched *.dogcollars.* would be included.
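As a rough sketch of what a *.dogcollars.* style check could look like, the snippet below compares the query (with spaces removed) against each dot-separated label of the hostname. `is_exact_match_domain` is a hypothetical helper, and treating any non-TLD label as a candidate is an assumption on my part rather than a description of the exact matching logic used for the charts. URLs need a scheme (e.g. `http://`) for `urlparse` to find the hostname.

```python
from urllib.parse import urlparse

def is_exact_match_domain(url: str, query: str) -> bool:
    """True if some hostname label equals the query with spaces removed."""
    target = query.lower().replace(" ", "")
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    # Skip the last label (the TLD) so it is never compared to the query.
    return bool(target) and target in labels[:-1]

print(is_exact_match_domain("http://www.dogcollars.com/leather", "dog collars"))
print(is_exact_match_domain("http://www.mydogcollars.com/", "dog collars"))
print(is_exact_match_domain("http://dog-collars.net/", "dog collars"))
```

Only the first URL counts as an exact match here; the other two contain extra characters or a hyphen.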

Exact Match Domains in SERPs 

Exact Match Domains in URLs

You can see that Bing has slightly more exact match domains appearing in at least one result of the SERPs we collected and in the overall count of results (all the URLs from all the SERPs).

Exact Match .com Domains

Similar to exact match domains, exact match .com domains had to contain the exact query in the domain name and have a .com TLD extension.

Exact Match .com Domains in the SERPs

Exact Match .com Domains in URLs

Again, Bing showed a slight preference for displaying results from these sites in the SERPs and URLs we observed.

Exact Match .net Domains

As above, but replace ".com" with ".net."

Exact Match .net Domains in the SERPs

Exact Match .nets in URLs

The two engines are much closer in the number of total URLs we saw with .net exact match, but Bing is showing a preference in the SERPs count.

Exact Match .org Domains

In the .org TLDs, we start to see a bit of what we observed in the ranking correlation data:

Exact Match .orgs in the SERPs

Exact Match .orgs in URLs

This is the first exact match domain TLD where Google actually had more SERPs containing a result of this type. Bing, however, had very slightly more URLs with this feature.

Exact Hyphenated Match Domains

One of Matt Cutts' complaints centered around how Google vs. Bing handled exact hyphenated match domains. When we observed them in ranking correlations, it appeared that, when Google listed them, they would rank them higher than Bing did when they appeared on that first page of results. However...

Exact Hyphenated Match Domains in the SERPs

Exact Hyphenated Match Domains in URLs

As I called out in the presentation and the prior post, Bing has quite a few more SERPs where exact hyphenated match domains appear and somewhat more URLs, too. This is another data point that should make us all think carefully about the fallacy of presuming correlation = causation. Bing might have a preference for exact hyphenated match domains, but the ranking correlations suggest to me there's more going on here - maybe something to do with anchor text or where those types of sites tend to get links or something else we haven't considered?

It's critical to keep in mind that we're just looking at individual factors here - not trying to explain why they exist or correlate (at least, not in the data).

Results that Include All Keywords in the Domain Name

Here we looked for domains that contained the keyword query in the domain, even if the match wasn't exact. For example, mydogcollar.com would now match for the phrase "dog collar."
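A looser check like this one might be sketched as below. I'm assuming here that "contains the keywords" means every word of the query appears somewhere in the hostname (minus the final TLD label); `all_keywords_in_domain` is a hypothetical name, and cleanly separating the root domain from any subdomain would need a public suffix list, which this sketch skips.

```python
from urllib.parse import urlparse

def all_keywords_in_domain(url: str, query: str) -> bool:
    """True if every query word is a substring of the hostname (TLD dropped)."""
    host = (urlparse(url).hostname or "").lower()
    name = host.rsplit(".", 1)[0]  # drop the final label, e.g. ".com"
    words = query.lower().split()
    return bool(words) and all(word in name for word in words)

print(all_keywords_in_domain("http://mydogcollar.com/", "dog collar"))
print(all_keywords_in_domain("http://www.catcollars.com/", "dog collar"))
```

The first URL matches because both "dog" and "collar" appear in "mydogcollar"; the second does not because "dog" is missing.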

All Keywords in the Domain Name in the SERPs

All Keywords in the Domain Name in URLs

Again, it's Bing that shows a higher number of these types of domains in their results.

Results that Include All Keywords in the Subdomain Name

We've previously shown some data suggesting that subdomains might have some ranking influence, but not as much as root domains (this was done using our rank modeling / machine learning process). Here's some raw data on the number of times we observed keyword matching subdomains:

Contains all Keywords in the Subdomain in SERPs

Contains all Keywords in the Subdomain in URLs

Perhaps not surprisingly, Bing again is showing more of these results in their SERPs and individual URLs.

.com Domains

For this feature and all the TLDs below, we're just looking at any URL that has the domain extension.
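Checking for a domain extension is the simplest of these tests. A minimal sketch, assuming a hypothetical `has_tld` helper and URLs that include a scheme, might look like this:

```python
from urllib.parse import urlparse

def has_tld(url: str, tld: str) -> bool:
    """True if the URL's hostname ends with the given extension."""
    host = (urlparse(url).hostname or "").lower()
    suffix = "." + tld.lower().lstrip(".")
    return host.endswith(suffix)

print(has_tld("http://www.example.org/page", "org"))
print(has_tld("http://example.com/about.org", ".org"))
```

Looking only at the hostname keeps a path such as `/about.org` from being miscounted as a .org domain.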

.com Domains in the SERPs

.com Domains in URLs

It looks like Bing has very slightly more .coms in their results vs. Google.

.org Domains

Let's see what happens for .org domains, recalling Google's apparent preference for them in the ranking correlations.

.org Domains in the SERPs

.org Domains in URLs

Oddly, Bing again seems to have more .org pages in the SERPs and URLs.

.net Domains

URLs with .net probably won't surprise you much:

.net Domains in the SERPs

.net Domains in URLs

Yet again, Bing is showing a small number more than their Googly competitors.

.edu Domains

Recall how, in the correlation data, the numbers were small(ish) but negatively correlated? Let's see what the number of results shows: 

.edu Domains in the SERPs

.edu Domains in URLs

True to the stereotype, Google is slightly ahead on number of .edu domains in the SERPs & URLs.

.gov Domains

Given the previous charts, this one likely won't surprise you:

.gov Domains in the SERPs

.gov Domains in URLs

Google has more .edus and more .govs, too.

Keywords in the Title Element

Not surprisingly, nearly every set of SERPs had at least one result where the title tag contained the keywords:

Keywords in Titles in the SERPs

Keywords in Titles in URLs

Bing shows up with more results where the title tag matches the keywords. One thing that is worth mentioning is that we didn't observe the titles the engines chose to show, but rather the page titles from the results themselves. Hence, if a result was showing a DMOZ title or a brand title (which Google will sometimes insert), we ignored those and just looked at the title element on the page itself.
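To illustrate reading the title from the page itself rather than from the SERP snippet, here is a small sketch using Python's standard-library HTML parser. `TitleExtractor` and `title_contains_keywords` are hypothetical names, and requiring every query word to appear in the title is my assumption about what "contains the keywords" means, not a statement of the exact rule behind the charts.

```python
from html.parser import HTMLParser

class TitleExtractor(HTMLParser):
    """Collects the text inside the page's <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def title_contains_keywords(html: str, query: str) -> bool:
    """True if every query word appears in the page's own title element."""
    parser = TitleExtractor()
    parser.feed(html)
    title = parser.title.lower()
    words = query.lower().split()
    return bool(words) and all(word in title for word in words)

page = (
    "<html><head><title>Leather Dog Collars | Shop</title></head>"
    "<body><h1>Welcome</h1></body></html>"
)
print(title_contains_keywords(page, "dog collars"))
print(title_contains_keywords(page, "cat collars"))
```

The same pattern could be pointed at `h1` elements or `alt` attributes, though real-world HTML is messy enough that a production crawler would want a more forgiving parser.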

Keywords in the URL

This one actually surprised me, if only because there were even fewer results with keywords in the URL than in the title! 

Keywords in the URL in the SERPs

Keywords in the URL in URLs

Bing again has more results with keyword-matching URLs, though remember that some of that is probably from keyword matching domains, too.

Keywords in the H1

The ranking correlations suggested that the H1 tag isn't much of a differentiator, yet lots of people still swear by them:

Keywords in the H1 in the SERPs

Keywords in the H1 in URLs

The results would bear out that this is a much less frequent item than URLs or Titles for those ranking on page 1. Bing seems to show more of them than Google, though.

Keywords in the Alt Attribute

Alt attributes looked interesting last fall when we collected ranking information and once again proved worth a look in the correlation data from SMX Advanced. Let's see what the raw counts show:

Keywords in the Alt Attribute in the SERPs

Keywords in the Alt Attribute in URLs

Bing is showing slightly more of these, but if the positive correlation means something, these numbers certainly suggest there's lots of opportunity left for good alt attribute practices.

Homepages

Who lists homepages vs. deep pages in the results more?

Homepages in the SERPs

Homepages in URLs

My word! It's Google by a good margin. Bing's show of internal pages actually surprises me a bit, though perhaps that's an old stereotype I need to abolish.

And with that, we're done!


One important point to notice is that I've not included data on link results, as these would be hard to interpret and likely non-useful. Every page of results had pages with links to them and nearly every individual ranking URL also had links (a good sign for Linkscape's index, but not super valuable as a data point). There were a few other data pieces like this that wouldn't make sense here (keyword prominence in the body tag, word tokens in the body tag, domain name length, etc) and have thus been excluded.

I've done less analysis on these results in general, as I think the data is a bit less ideal for the purpose, but it's still interesting and hopefully, illustrative of general prominence. I look forward to seeing your interpretations and discussion!

p.s. If you email Ben at SEOmoz dot org, he will send you a TSV with a lot of numbers: for each query, the metrics for each result that we used in these posts. You can also find raw results in a public Google spreadsheet doc here. Feel free to play around and let us know if you see anything else cool and interesting.