correlation does not equal causation

Today I'm going to make a crazy claim—that in modern SEO, there are times, situations, and types of analyses where correlation is actually MORE interesting and useful than causality. I know that sounds insane, but stick with me until the end and at least give the argument a chance. And for those of you who like visuals, our friend AJ Ghergich and his intrepid team of designers created some nifty graphics to accompany the piece.

Once upon a time, SEO professionals had a reasonable sense of many (or perhaps even most) of the inputs into the search engine's ranking systems. We leveraged our knowledge of how Google interpreted various modifications to keywords, links, content, and technical aspects to hammer on the signals that produced results.

But today, there can be little argument—Google's ranking algorithm has become so incredibly complex, nuanced, powerful, and full-featured, that modern SEOs have all but given up on hammering away at individual signals. Instead, we're becoming more complete marketers, with greater influence on all of the elements of our organizations' online presence.

Web marketers operate in a world where Google:

  • Uses machine learning to identify editorial endorsements vs. spam (e.g. Penguin)
  • Measures and rewards engagement (e.g. pogo-sticking)
  • Rewards signals that correlate with brands (and attempts to remove/punish non-brand entities)
  • Applies thousands of immensely powerful and surprisingly accurate ways to analyze content (e.g. Hummingbird)
  • Punishes sites that produce mediocre content (intentionally or accidentally) even if the site has good content, too (e.g. Panda)
  • Rapidly recognizes and accounts for patterns of queries and clicks as rank boosting signals (e.g. this recent test)
  • Makes 600+ algorithmic updates each year, the vast majority of which are neither announced nor known by the marketing/SEO community

[Graphic: how Google works]

Given this frenetic ecosystem, the best path forward isn't to build exclusively to the signals that are recognized and accepted as having a direct impact on rankings (keyword-matching, links, etc.). Those who've previously pursued such a strategy have mostly failed to deliver long-term results. Many have found their sites in serious trouble due to penalization, more future-focused competitors, and/or a devaluing of their tactics.

Instead, successful marketers have been engaging in the tactics that Google's own algorithms are chasing—popularity, relevance, trust, and a great overall experience for visitors. Very frequently, that means looking at correlation rather than causation.

[Via Moz's 2013 Ranking Factors - the new 2015 version is coming this summer!]

We'll engage in a thought experiment to help highlight the issue:

Let's say you discover, as a signal of quality, Google directly measures the time a given searcher spends on a page visited from the SERPs. Sites with pages searchers spend more time on get a rankings boost, while those with quick abandonment find their pages falling in the rankings. You decide to press your advantage with this knowledge by using some clever hacks to keep visitors on your page longer and to make clicking the back button more difficult. Sure, it may suck for some visitors, but those are the ones you would have lost anyway (and they would have hurt your rankings!), so you figure they're not worth worrying about. You've identified a metric that directly impacts Google's algorithm, and you're going to make the most of it.

Meanwhile, your competitor (who has no idea about the algorithmic impact of this factor) has been working on a new design that makes their website content easier, faster, and more pleasurable to consume. When the new design launches, they initially see a fall in rankings, and don't understand why. But you're pretty sure you know what's happened. Google's use of the time-on-site metric is hurting them: visitors now get the information they want from your competitor's new design faster than before, so they leave more quickly, and the site's rankings suffer. You cackle with delight as your fortune swells.

But what happens long term? Google's quality testers see diminished happiness among searchers. They rework their algorithms to reward sites that successfully deliver great experiences more quickly. At the same time, competitors gain more links, amplification, social sharing, and word of mouth because real users are deriving more positive experiences from their site than yours. You found an algorithmic loophole and exploited it briefly, but by playing the "where's Google weak?" game rather than the "where's Google going?" game, you've ultimately lost.

Over the last decade, in case after case of marketers optimizing for the causal elements of Google's algorithm, this pattern of short-term gain leading to long-term loss has repeated itself. That's why, today, I suggest marketers think about what correlates with rankings as much as what actually causes them.

If many high-ranking sites in your field are offering mobile apps for Android and iOS, you may be tempted to think there's no point in considering an app strategy just for SEO because, obviously, having an app doesn't make Google rank your site any higher. But what if those mobile apps are leading to more press coverage for those competitors, and more links to their sites, and more direct visits to their webpages from those apps, and more search queries that include their brand names, and a hundred other things that Google maybe IS counting directly in its algorithm?

And if many high-ranking sites in your field run TV ads, you may be tempted to think that it's useless to investigate TV as a channel because there's no way Google would reward advertising as a signal for SEO. But what if those TV ads drive searches and clicks, which could lead directly to rankings? What if those TV ads create brand-biasing behaviors through psychological nudges that lead to greater recognition and a higher likelihood that searchers will click on, link to, share, talk about, write about, or buy from your TV-advertising competitor?
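For readers who want to try this kind of check themselves, here is a minimal sketch of measuring whether a feature (such as having a mobile app) tends to show up alongside higher rankings. It uses Spearman's rank correlation, which is one common choice for ranked data and not necessarily the only sensible one. The numbers in `positions` and `has_app` are made up purely for illustration, and the function names are hypothetical helpers rather than part of any SEO tool.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical, invented data for one search query: the position of each
# result (1 = top) and whether that site offers a mobile app (1 = yes).
positions = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
has_app = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]

# Negate positions so that a larger value means "ranks higher".
rho = spearman(has_app, [-p for p in positions])
print(f"Spearman correlation between having an app and ranking higher: {rho:.2f}")
```

A positive value here would only say that apps and higher rankings tend to appear together in this sample. It says nothing about why, which is exactly the point: the correlation is a prompt to go investigate the second-order effects, not proof of a ranking signal.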

Thousands of hard-to-identify, individual signals, mashed together through machine learning, are most likely directly responsible for your competitor's website outranking yours on a particular search query. But even if you had a list of the potential inputs and the mathematical formulas Google's process considers most valuable for that query's ranking evaluation, you'd be little closer to competently beating them. You may feel smugly satisfied that your own SEO knowledge exceeded that of your competitor, or of their SEO consultants, but smug satisfaction does not raise rankings. In fact, I think some of the SEO field's historic obsession with knowing precisely how Google works and which signals matter is, at times, costing us a broader, deeper understanding of big-picture marketing*.

Time and again, I've seen SEO professionals whom I admire, respect, and find to be brilliant analysts of Google's algorithms lose out to marketers who are far less steeped in SEO but who combine that big-picture knowledge with more basic, fundamental SEO tactics. While I certainly wouldn't advise anyone to learn less about their field or give up their investigation of Google's inner workings, I am and will continue to strongly advise marketers of all specialties to think about all the elements that might have a second-order or purely correlated effect on Google's rankings, rather than just concentrate on what we know to be directly causal.

-----------------

* No one's guiltier than I am of obsessing over discovering and sharing Google's operations. And I'll probably keep being that way because that's how obsession works. But I'm trying to recognize that this obsession isn't necessarily connected to being the most successful marketer or SEO I can be.