The vast majority of search marketers operating in the organic space at least lay claim to "following the latest algorithms" at the search engines, and in 90% of the client pitches I've ever heard (or made, for that matter), the subject comes up at least once. However, I think this is still a topic about which there's not a lot of true understanding, and for those new to the field, it's probably the most daunting aspect of the work. So, to help ease some pain, I figured I'd address many of the most common questions about keeping up with the search engines' ever-changing mathematical formulas that rank search results.
What is an Algorithm? How does it apply to the Search Results at Google, Yahoo! & MSN/Live?
An algorithm is just a complex equation (or set of equations) that, in the search engines' case, performs a sorting task. Here's an example of an exceptionally simple search engine algorithm:
Rank = Number of Terms * Number of Links to Page * Number of Trusted Links
In the example above, the engine ranks pages on the basis of three simple factors - the number of times the search term appears on the page, the number of links to that page and the number of "trusted" links to the page. In reality, Google has said that their algorithm contains more than 200 individual elements used to determine rank (ranking factors). The ranking factors in search engine algorithms come in two primary varieties (and dozens of offshoots) - query-dependent factors and query-independent factors.
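To make the toy formula above concrete, here's a minimal sketch (in Python, with made-up pages and numbers) of how such an algorithm acts as a sorting mechanism. Nothing here reflects a real engine - it just shows factors going in and an ordered result coming out.

```python
# A toy illustration of the simple formula above - not how any real
# engine works, just "algorithm as sorting mechanism" in miniature.
def toy_rank(term_count, link_count, trusted_link_count):
    """Score a page using Rank = terms * links * trusted links."""
    return term_count * link_count * trusted_link_count

# Hypothetical pages with invented numbers
pages = {
    "example.com/widgets":      {"terms": 12,  "links": 40, "trusted": 3},
    "example.org/widget-guide": {"terms": 5,   "links": 90, "trusted": 10},
    "example.net/spam-widgets": {"terms": 300, "links": 2,  "trusted": 0},
}

# Sort pages by score, highest first - these are the "search results"
ranked = sorted(
    pages.items(),
    key=lambda item: toy_rank(item[1]["terms"], item[1]["links"], item[1]["trusted"]),
    reverse=True,
)
for url, f in ranked:
    print(url, toy_rank(f["terms"], f["links"], f["trusted"]))
```

Note how the page stuffed with keywords but lacking trusted links scores zero - even a toy formula can encode a quality signal.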
Query-dependent factors are part of the sorting mechanism that's executed when your search is submitted to the engine. The search engines don't know what you're about to search for, so there are many variables they can't pre-calculate and must compute on demand. These include identifying pages that contain the keywords you've searched for, calculating keyword-based relevance and collecting any geographic or personalized data about you in order to serve a more targeted result. To help preserve resources, the search engines do cache an enormous number of their most popular search results at regular intervals, so as not to force these computations more than is necessary.
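As a rough illustration of the query-dependent side, here's a small hypothetical sketch: the relevance score depends on the query string, so it can only be computed once the query arrives, and results for popular queries are cached to avoid recomputation. The tiny "index" and the scoring are invented for the example.

```python
# A sketch of query-dependent scoring: these numbers can't be
# pre-computed because they depend on what the user types.
from collections import Counter

# Hypothetical tiny "index": page URL -> words found on the page
index = {
    "example.com/seattle-coffee": "best coffee shops in seattle".split(),
    "example.com/coffee-history": "a short history of coffee roasting".split(),
}

query_cache = {}  # popular results cached so they aren't recomputed

def query_dependent_score(query, url):
    """Naive keyword-frequency relevance, computed only at query time."""
    words = Counter(index[url])
    return sum(words[term] for term in query.split())

def search(query):
    if query in query_cache:  # serve popular queries from the cache
        return query_cache[query]
    results = sorted(index, key=lambda url: query_dependent_score(query, url), reverse=True)
    query_cache[query] = results
    return results

print(search("coffee seattle"))
```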
Query-independent factors are pieces of information a search engine knows about a given site or page before a query is ever executed. The most famous example is Google's PageRank, which purports to measure the global popularity of a web document, based on the links that point to it. Other factors might include TrustRank (a trust-based link metric), domain association (the website a piece of content is hosted on), keyword frequency (or term weight) and freshness.
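PageRank itself is the canonical query-independent factor, and a simplified version is easy to sketch. The link graph below is invented and the math is the bare-bones power-iteration form of the original formulation - real systems run something far more elaborate, at web scale, well before any query is issued.

```python
# A bare-bones PageRank-style calculation - the best-known
# query-independent factor. The link graph is invented.
damping = 0.85
links = {               # page -> pages it links out to
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
}
ranks = {page: 1.0 / len(links) for page in links}

for _ in range(30):     # power iteration until scores settle
    new_ranks = {}
    for page in links:
        incoming = sum(
            ranks[other] / len(outs)
            for other, outs in links.items() if page in outs
        )
        new_ranks[page] = (1 - damping) / len(links) + damping * incoming
    ranks = new_ranks

print(ranks)  # these scores exist before any query is ever run
```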
Algorithms directly impact the search results by acting as the engines' sorting mechanism. The reason you see SEOmoz's blog post ranking below the Google technology page and above the AMS.org page in the screenshot below is because Google's algorithm has sorted it thusly.
Last year, I wrote a post taking a rough guess at the macro-factors that might make up Google's algorithm, which might serve as a helpful example of how to think about them in a non-technical fashion.
Why do SEOs Need to Pay Attention to Search Algorithms?
Because that's how the search engines rank documents in the results, of course!
Seriously, though, if you're a professional SEO trying to garner more search traffic, a detailed understanding of the search engine algorithms and a thorough study of the factors that impact them are vital to your job performance. When I imagine that a time machine whisked my 2002 self forward to 2008, the litany of tragic SEO mistakes I might make probably dwarfs any value I might have brought to my 2002 campaigns. In the 6 years since I first learned about the practice of influencing the search results, the algorithms have changed to an enormous extent. Let's take a quick look at some of the algorithmic evolution we've seen in the past 6 years:
- Inherent Trust in Link Metrics
In 2002, PageRank (yes, the little green toolbar kind) was still a kingmaker in the world of search engine rankings. With a heap of anchor text and lots of green fairy dust, you could rank for virtually anything under the sun. When "trust" entered the picture, raw link juice mattered less and "trusted" link sources mattered more - today they're a strong element of link evaluation.
- Domain Trust Over The Importance of Individual Pages
The search engines have all developed some formula for weighting a domain's "strength," and all content on that domain benefits from its host. In 2002, we saw very little of this phenomenon, and individual pages were equally powerful with little regard for their host domain.
- Temporal Analysis of Link Growth
It was 2005 when Google's patent - Information Retrieval Based on Historical Data - first opened the eyes of SEOs everywhere to Google's use of temporality in link evaluation to detect potential spam and manipulation. Some attribute elements mentioned in this patent to the infamous Nov. 2003 "Florida" update, when so many affiliate and early SEO'd sites lost their rankings.
- Spam Identification by Anchor Text Pattern Evaluation
Believe it or not, there was a time when having 50,000 instances of exactly the same anchor text gave you nothing but good rankings. Today, the engines are far more likely to take a very suspicious look at anyone whose link profile stands out so unnaturally.
- Sandboxing of New Websites
I think the first elements of the Sandbox became noticeable in March of 2004, and launching a new site hasn't been the same since. With a harsh clampdown on new domains targeting commercial keywords that didn't acquire a strong, trusted link profile quickly, Google eliminated an enormous amount of spam from their index (and made it a pain in the butt to help new sites and brands with SEO).
- Fixing Blog Comment Spam
In 2003, when I had a client who wanted to rank for a particular e-commerce term, I talked to a friend in the UK, acquired 8,000 or so links over the next 3 weeks and ranked #1 the next month. In that heyday of wondrously powerful blog links, the comment spammers would always say - "Hey, 10,000 bloggers can't be wrong!" As with all things too-good-to-be-true, this tactic largely died as nofollow and intelligent algorithms found ways to detect which comment links to count.
- Crackdown on Reciprocal Link Tactics
Even as recently as 6 months ago, thousands of sites in the real estate field relied on a relatively simplistic reciprocal link exchange scheme. No more, though - hundreds of those sites never got their rankings back, and real estate SEO has a vastly different look than it did in 2007.
These are just a few of the many changes to the algorithms over the last 6 years, and only by paying attention and staying ahead of the curve could we hope to provide our clients and our own projects with the best consulting and strategic advice possible. Keeping up with algorithmic changes, particularly those that validate new techniques or invalidate old ones, is not just essential to good SEO; it's the responsibility of anyone whose job is to market to the search engines.
How can we Research and Keep Up with the Latest Trends in Algorithmic Evolution?
There are a few good, simple tactics that enable nearly anyone to keep up with the algorithms of the major engines. They include:
#1 Maintaining several websites (or at least having access to campaign & search visitor data) provides some of the best information you can use to make informed decisions. By observing the trends in how the search engines rank and send traffic to different types of sites based on their marketing and content activities, you'll be able to use intuitive reasoning to form hypotheses about where the engines are moving. From there, testing, tweaking and re-evaluating will give you the knowledge you seek.
#2 Reading the following excellent sources for information on a regular basis will give you a big leg up in the battle for algorithmic insight:
- SEO By The Sea - Bill Slawski's Blog regularly examines patent applications and IR papers for clues as to where the search engines might go next.
- SEO Book - No one has a better pulse on successful strategies for targeting search engines than Aaron Wall.
- The Google Cache - Virante regularly performs high-quality search engine testing, and this is where you'll find the data they leak publicly.
#3 Running tests using nonsense keywords and domains (and controlling for external links) also gives terrific A/B test evaluation data of what factors matter more or less to the engines' algos. I've described this testing process in more detail here in the Beginner's Guide.
How do we Apply the Knowledge Learned from Research to Real-World Campaigns?
The same way we apply any piece of knowledge that's primarily theory - by testing and iterating. If you see strong evidence or hear from a trusted source that linking in content provides more SEO value than linking in div elements or top-level menu navigation, you might give this a try by taking a single section of your site and instituting Wikipedia-like interlinking on content pages. If, after a month, you can observe that the engines (or a single engine) have crawled all those pages and your traffic from that source rose more than normal, you might consider the effect "plausible" and try the same thing on other sections of the site.
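For instance, the "observe after a month" step might look something like this minimal sketch - the daily visit figures are invented, and the 10% threshold is just a stand-in for whatever your normal week-to-week variation happens to be.

```python
# A minimal sketch of the "observe, then decide" step described above.
# The visit counts are invented; in practice they'd come from your
# analytics package for the section of the site you changed.
before = [120, 131, 118, 125, 140, 122, 128]   # daily search visits, pre-change
after  = [150, 162, 141, 158, 170, 149, 166]   # daily search visits, post-change

baseline = sum(before) / len(before)
current  = sum(after) / len(after)
lift = (current - baseline) / baseline * 100

# A rough rule of thumb, not a statistical test: treat the effect as
# "plausible" only if the lift clearly exceeds normal weekly noise.
print(f"Average lift: {lift:.1f}%")
if lift > 10:
    print("Plausible - try the same change on another section of the site.")
```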
Alternatively, you can test in the nonsense-word environments described above. This gives less realistic feedback, but doesn't endanger anything on your sites, either :)
All in all, keeping up with the algorithms parallels any other optimization strategy - tax deductions, faster routes to work, better ways to chop onions, etc. Read, research, test and if you experience positive results, implement.
There's plenty more to the practice of algorithmic research and evaluation, but we'll save that for another post. In the meantime, I'd love to hear your thoughts on algo studies and the value you receive from them.
Rand,
Great post. Definitely interesting how things have changed over the years. Back in 2000-2003, I dominated the SERPs due to my manipulation of Amazon's community features. It was a bit spammy but it got results and it got results fast.
During the same time period I utilized comment spamming on DPReview. I did it delicately and made sure I became a member of the community but . . . it was still comment spamming. ;-)
Even last year at OneCall I implemented a 301 redirect to make it more difficult (if not impossible) for Google to know that 90% of OneCall's inbound links are from their affiliate network.
You need to keep an edge but you also need to adapt to the times of today. What worked 8 years ago . . . doesn't work today. You need to find the new edge and utilize it. I am surprised that so few SEOs mention the community sites that still do pass PageRank and that still allow anchor text manipulation of the links from their site to your site.
There are always a million ways to use white hat (and a few grey hat) tactics to outperform your competition. Stay away from the black hat methods as the penalty isn't worth the temporary reward. However, utilizing a ton of white hat tactics with a couple of grey hat tactics to keep an edge? Yeah, go for it!
Again, great post.
Brent D. Payne
hahah! Look at this guy spamming up the comments. Look how many thumbs down he got! Brilliant.
This just comes down to "what an SEO consultant should do on a regular basis," as you mentioned in your point #2. Running tests is just a consequence of what you can read and analyse.
I am about to leave my company, and in my interviews, one thing I have been pointing out is that I need at least an hour and a half every day to read the news and forums, in order to keep in touch with the "real world" and stay up to speed, so that I can get the best results for my clients, since every algorithm is evolving every day.
The only one that got the point made me a realistic offer. The others did not understand why I would "lose" some time in the day when I could be "working".
An SEO consultant is still something of a fairy-tale, mysterious character: everybody wants to have one in house and get the best SERP results, but no one really knows how we work and achieve those goals...
Agreed . . . very few companies 'get it'.
I'm fortunate to have found one that finally does.
Brent
This one seems really interesting to me. I've heard about this and (I think) seen it, but I would like to know how they are actually doing it...
I agree...some comments are obviously spam, but others are less so. I suppose a combination of posting length and relevant keywords will be applied (one sentence with no relevant keywords or phrases will obviously carry very little weight compared to a sizeable reply of several paragraphs littered with keywords)?
Is this comment spamming?
It sure is spam! The postings are ancient though
That was me . . . back in the day before nofollow attributes. Those were the good ole days. ;-)
Drove a ton of traffic both to and from Phil Askey's site. Amazon bought them recently. Phil will still attest to me being a factor in getting the site visibility. I would UGC spam Amazon.com about DPreview and in turn I'd go to DPReview and comment spam about Viking products.
The funny thing is though, nobody got pissed about it. I did it in a way that engaged the community and they didn't view it as spam.
Speaking of which . . . I need to get my Amazon account cranking again.
Fun times!!
I suspect that comment spam is fairly easy to identify - it tends to be repetitious. If you run your own blog, look at the duplicate postings you get in your spam trap. I'll guess that SE's identify the same posting in multiple places. The value of any link weight associated could be gradually reduced to zero with increasing frequency of repetition.
If you compare spam postings across several blogs, then you can see patterns where very similar postings are made, but with minor character variation. This may be to mark the source (often seems tied to identity) or may be to try and vary the comment enough in the hopes of evading downgrading. As a wild notion, you could hook up Akismet or other community based spam rejection tool, to the SE. That way you could use the power of many to clean the comments up :)
Repetitious indeed, and quite often totally off track, so it's easy for a sentient being to pick up on. Algos are gonna struggle a bit more!
The algorithms are designed to mimic human behaviour as far as possible, in order to return relevant organic search results. As SEOs study the algorithms and evolve ways and techniques to benefit from this knowledge, so the loopholes get closed (and new ones inadvertently created) by the next algorithm update.
It is a game of hide and seek, thrust and counterthrust, with each significant gain by one side soon offset by the focused intent of the other. Any kind of insight into this arcane world is of great value, as only the knowledge that this insight can bring may lead to an advantage in this game.
Following the algorithm and its changes IS integral to our business, but the vast majority of my techniques and methods are built around what the search engine folks simply tell me.
I focus far more of my energy on doing things right and making sites good. To be completely honest, I really only consider the algorithm when I hear around the grapevine that it's changing and could possibly negatively affect me, and it really hasn't so far.
Maybe that's just me.
I really enjoyed this post. I've often wondered how SEO would have been changed without the Sandbox update. Right up until then optimizers were able to pitch a client on a new site and deliver (fast or your order is free!). I think that the sandbox took a while for many SEOs to come to terms with and probably helped darken the dubious cloud that already surrounded the industry when so many contracts were unfulfilled.
I totally agree with you that working across many sites helps you see a larger picture of what works. It is interesting how certain tactics work much better in specific niches.
I think that A/B nonsense tests are fun to study as well (though I've seen conclusions drawn that are not very relevant to real sites).
I will say that it is possible to practice SEO today without giving much thought to algorithms. You could follow a list of best practices and still be successful to some degree -- but you would be a really boring person!
Personal testing is everything. A lot of people's tests will yield different results depending on different variables so you have to go with what works for you, not what works for someone else. Great post.
I think what Rand is trying to get at is this: if you're going to bill yourself as a GM Certified mechanic, you'd best be working on a few GMs a day. I think judd.exeley and BottomTurn got it right. SEO is only a small part of a much larger job description. Thanks for the checklist of sites to visit weekly. Much appreciated,
Rand, awesome information and resources. I'm still learning this stuff, but at an even more exponential rate thanks to your site and others like Aaron Wall's, etc.
I think, along with the A/B testing, one of the most important things is to have several live sites in different niches that will react differently to changes in the way search engines do things, sometimes flagging results. I'm also learning when things just don't work.
For example, I was at a friend's place who had limited computer/internet access, so I made a craigslist ad for him on my iPhone after taking pictures of what he was selling with it. Turns out that you can't upload images to craigslist from the iPhone as it is (unless you did something custom) - it simply doesn't have the option to upload the file on the site. I just used a Yahoo account and the iMac that was available on dialup to finish the now lengthy process, but these things are all emerging and it's good to figure out what does and doesn't work on your own.
I'll be reading more often now.
Aaron (Kronis) - Wpromote SEO
Really great article. I'd like to hear more about comment spamming, though.
Rand, Thanks for the trip down memory lane. Elicits fond memories of the easier days.
As Google continues to apply their boundless resources toward eliminating any manipulative practices, I often think the energies spent trying to garner a fleeting moment of free SERP traffic may be better spent procuring better paid traffic. After all, PPC is where the Big G wants you to be.
Great post, thanks Rand.
"...calculating keyword-based relevance..."
Do you not think that the search engines might know something about this already? Especially for common searches? Or are you talking about relevance to the other independent factors?
Also, I've heard that a lot of Python is used at Google. Do you know if this is true?
Thanks Rand for the post,
Tom
From what I understand, Python is one of three main languages at Google, along with Java and C++.
Thanks Rand, that's a top priority for us SEOs - we must know how search technology and its algos evolve.
Aside from knowing how the algorithm evolves, we must also learn how to adapt to the changes made to the SEs' systems.
This is possibly the single-dumbest comment I have ever read. I would prefer to punch your face, but due to the limitations of the internet... I will settle for thumbing you down.