We've had a lot of discussions recently about SEO as a Science. Unfortunately, these discussions sometimes devolve into arguments over semantics or which approach is the "best" in all situations. I'd like to step back for a few moments today and talk about the wider world of SEO evidence. While not all of these types of evidence are "science" in the technical sense, they are all important to our overall understanding. We need to use the best pieces of all of them if we ever hope to develop a mature science of SEO.
The Fundamental Assumption
All science rests on a fundamental assumption, long before any hypothesis is proposed or tested. The fundamental assumption is that the universe is orderly and follows rules, and that through observation and experimentation we can determine those rules. Without an orderly universe, science would be impossible (as would existence, most likely). A related assumption is that these rules are relatively static – if they change, they change very slowly. Our view of the universe may change dramatically, resulting in paradigm shifts, but the underlying rules remain roughly the same.
The advantage we have as SEOs is that we know, for an absolute fact, that our universe is orderly. Like Neo, we have seen The Matrix. The Algorithm consists of lines of code written by humans and running on servers.
The disadvantage for SEO science is that the rules governing our universe are NOT static. The algorithm changes constantly – as often as 400 times per year. This means that any observation, any data, and even any controlled experiment could turn out to be irrelevant. The facts we built our SEO practices on 5 or 10 years ago are not always valid today.
(1) Anecdotal Evidence
All science begins with observation. In SEO, we make changes to sites every day and measure what happens. When rankings rise and fall, we naturally try to figure out why and to tie those changes to something we did in the past. Although it isn't "science" in the technical sense, the evidence of our own experience is very important. Without observing the universe and creating stories to explain it, we would never learn anything from those experiences.
PROS – Anecdotal evidence is easy to collect and it's the most abundant form of evidence any of us have. It's the building block for just about any form of scientific inquiry.
CONS – Our own experiences are easily affected by our own biases. Also, no single experience can ever tell the whole story. Anecdotal evidence is just a starting point.
(2) Prophetic Evidence
SEOs have a unique type of available evidence. Every once in a while, a prophet will descend from the Mountain Top (or Mountain View), shave his head, and speak the words of the Google Gods. Whether or not we choose to believe these prophets, the fact remains that there are people who have seen and written the Algorithm, and those people have access to facts that the rest of us don't. Their statements (and our ability to critically reconcile those statements) are an important part of the overall puzzle.
PROS – The prophets are as close to objective reality as we're ever going to get. They have direct insight into the algorithm.
CONS – The prophets don't have a vested interest in telling us the whole truth. Their messages can be cryptic and even misleading.
(3) Secondhand Evidence
When you hear "secondhand" evidence, you may naturally think of the extreme examples, like hearsay and urban legends:
My cousin's neighbor's stylist said that she once changed all of her META tags to "sex poker sex poker sex" and her site immediately jumped to #1 on Google!
To be fair, though, secondhand evidence also includes the legitimate science that came before us and the experiences of our peers. If we were forced to confirm and replicate every single conclusion for ourselves, we would never make any progress. Ultimately, we build on the reliable conclusions of other experts, past and present.
PROS – Secondhand evidence is the foundation for scientific progress.
CONS – Sometimes, experts are wrong, and you have to learn how to tell the difference, especially in a field as young as SEO.
(4) Experimental – "The Wild"
Experimentation is the heart of Capital-S Science. The most basic experiments happen something like this:
- You form a hypothesis ("Adding keywords to my title tag will improve rankings").
- You make a change to test that hypothesis.
- You measure the outcome and find out if you were right.
Most SEO experimentation, by its nature, occurs in the "wild". We have to put our sites out in the world, and we often have to use existing sites that are already complicated and changing.
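For instance, here is a minimal sketch of what the "measure the outcome" step might look like for the title-tag hypothesis above, assuming you've already pulled daily ranking positions from whatever rank tracker you use (the helper name and all of the numbers are invented for illustration):

```python
from statistics import mean

def evaluate_title_tag_test(positions_before, positions_after):
    """Compare average daily Google positions gathered before and after
    the title-tag change. Lower numbers are better: position 1 is the
    top result. The positions come from whatever rank tracker you use."""
    before, after = mean(positions_before), mean(positions_after)
    print(f"Avg position before: {before:.1f}  |  after: {after:.1f}")
    if after < before:
        print("Consistent with the hypothesis (though not proof of it).")
    else:
        print("No improvement observed; the hypothesis takes a hit.")

# Made-up numbers: a week of tracked positions on each side of the change.
evaluate_title_tag_test([18, 17, 19, 18, 18, 17, 18],
                        [14, 15, 13, 14, 12, 13, 14])
```

Even this crude comparison forces you to state the hypothesis up front and measure against it, which is the whole point; the noise described in the CONS below is why you'd want to repeat it before believing it.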
PROS – By directly forming and testing a hypothesis, we can start to determine causality. We can also repeat the process, helping to validate what we've learned.
CONS – Using existing sites in the wild introduces a lot of extra noise. Often, our sites have to keep changing (even during the experiment), and Google is always changing. There's also a fair amount of risk – if we change our bread-and-butter sites to test SEO theories, mistakes can be costly.
(5) Experimental – Controlled
This is the classic SEO experiment, where we register one or more new domain names and build sites from the ground up. We can even introduce a control group, by building both sites up to Step X and then only changing one of the sites after that point. Even then, it might be best to call these experiments "semi-controlled," since the Google algorithm can still change and we can't always control outside influences (like someone accidentally linking to one of the sites).
PROS – This approach is about the best we can do, in terms of control, and it separates out a lot of confounding factors.
CONS – The artificial sites we set up in these experiments (often using nonsense words) aren't always representative of real, complex sites. In addition, these experiments are usually conducted on a sample of just one or very few sites, to save time and money. Statistical significance can be very difficult to achieve.
(6) Correlational Evidence
Sometimes, either we can't separate out the variables involved in a complex situation (like the 200+ factors Google uses in its ranking model) or direct experimentation would be impossible or unethical. For example, let's say you want to understand how smoking affects mortality. You can't take 1000 5-year-olds, force them to smoke for 70 years, and compare them to 1000 non-smoking 5-year-olds. In these cases, you take a very large data set and look at the correlations. In other words, if I look at 1000 smokers and 1000 non-smokers, how likely is each group to die at a certain age? Correlation can help you understand how changes in X (smoking, in this case) co-occur with changes in Y (mortality).
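As a toy illustration, here's a minimal sketch of a rank correlation (Spearman) between one hypothetical factor - say, linking root domains - and ranking position across ten pages; every number below is invented for the example:

```python
from statistics import mean

def rank(values):
    """Return the rank of each value (1 = smallest), averaging ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank for tied values
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    rx, ry = rank(x), rank(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Invented data: linking root domains vs. ranking position for ten pages.
linking_domains = [120, 45, 300, 80, 15, 200, 60, 10, 150, 95]
ranking_position = [3, 12, 1, 8, 25, 2, 10, 30, 5, 7]
print(f"Spearman correlation: {spearman(linking_domains, ranking_position):.2f}")
```

A strongly negative value would say that pages with more linking domains tend to occupy better (lower-numbered) positions, but, as the CONS below point out, it still wouldn't tell us which way the causation runs.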
PROS – Correlation can help us mathematically find relationships when direct experimentation is impossible or impractical. These techniques can also help model complex situations where multiple variables are affecting the same outcome.
CONS – Correlation does not imply causation. We don't know if changes in X cause changes in Y or if they just happen to co-occur (maybe even due to a Factor Z affecting them both).
(7) Large-scale Simulation
If we can collect enough data, we can build a model of the universe and test hypotheses against that model. Now that large-scale indexes are being built to mimic Google (including our own Linkscape and indexes like Majestic), it only stands to reason that we'll eventually be able to run experiments directly against these models. Although the conclusions we draw from these simulations are only as good as the models themselves, simulation data can help us both improve models and conduct something closer to a laboratory test than is usually possible in SEO.
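To make the idea concrete, here's a toy sketch (nothing like the real Linkscape models, just an illustration) that builds a tiny link graph, runs a simplified PageRank-style power iteration over it, and then reruns the calculation after adding one hypothetical link to see how the scores move:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Very simplified PageRank over `links`, a dict mapping each page to
    the list of pages it links to. Returns a dict of scores."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    n = len(pages)
    scores = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / n for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * scores[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling page with no outlinks: spread its score evenly.
                for t in pages:
                    new[t] += damping * scores[page] / n
        scores = new
    return scores

# Toy model: A and B link to C; D links only to A.
graph = {"A": ["C"], "B": ["C"], "C": [], "D": ["A"]}
baseline = pagerank(graph)

# "Experiment": what happens to page D's score if C starts linking to it?
graph_after = {**graph, "C": ["D"]}
changed = pagerank(graph_after)

print(f"D before: {baseline['D']:.3f}, D after: {changed['D']:.3f}")
```

The toy numbers don't matter; what matters is that, against a model we control, the whole "experiment" can be rerun in milliseconds with certainty that nothing else changed in between.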
PROS – Simulations can be controlled. Unlike Google, we know whether we've changed the model or not. Experiments can also be run very quickly and on a very large scale.
CONS – The result of any simulation is only as good as the model it's built on, and our models are still in their infancy.
Which One Is The Best?
Any type of evidence, including controlled experimentation, has limits. In a field like SEO, where the Google algorithm is constantly changing, relying too much on any one type of evidence can either stall progress or lead us to bad conclusions (or, in some cases, both). Understanding every available source of evidence not only helps us paint a broader, more comprehensive picture, but it also helps us cross-test our hypotheses and prevent mistakes. SEO science is a young and constantly changing field, and, at least for now, SEO scientists will need to adapt quickly.
What a really well-written post, Dr. Pete. I especially liked the reference to the "prophet" who shaved his head :) Poor guy. Between this post and your last post he just can't get any rest :-)
Your point #3 about secondhand evidence is so dead on. Both the hearsay aspect of it and the experiences of peers we trust.
Along the lines of the hearsay, one of my clients was convinced that all he needed to do was to change a few words of his website every week so he could "get really high in Google". He heard this from a customer of his "that is doing it and is at the top of Google!"
Along the lines of the trusted peers, I consider the moz posts (both Youmoz and Main Blog) a one-stop shop full of excellent peer advice. I've grown to depend on it heavily for my SEO growth.
You don't think Dr. Pete was kinda referring to Matt Cutts in his "prophet" description, do you? ;)
Thanks for the post. I see a lot of companies, particularly web designers, who advertise SEO as though it was a kind of magic and only they have the potion. I think they do a disservice to the industry.
SEO, like any science, is time-consuming, involves a fair amount of number crunching, and ultimately some common sense. I think there are too many people out there who undermine its legitimacy - far too much hearsay and urban legend.
This is particularly timely with all of the back and forth over test methods and validity lately. No one method is perfect, and it's always easy to say "I could have done it better" about someone else's work. In the long run, I think everyone benefits when more people are willing to go out on a limb and present some type of evidence as opposed to just unsupported anecdotes. With some background on the method used, everyone can come to their own conclusions about validity.
100% agree - it's healthy to debate and even to criticize when we're doing it wrong. If 99% of us are perpetually nitpicking details, though, and only 1% are actually collecting data, the "science" of SEO will go nowhere fast.
Generally speaking, I agree with both of you.
What I can't stand (as it wastes everyone's time and distracts us from more important things) is when anecdotal events are taken and presented as scientific evidence for some kind of SEO theory, but when you dig deeper you see that what's really behind the self-proclaimed scientific experiment is an attempt to demonstrate a preconceived thesis.
All those kinds of experiments do is make the science of SEO more similar to Astrology than to Astronomy (to use a metaphor), or they're like when so-called scientists make elaborate calculations to demonstrate that the Great Pyramid was built by the Gods.
Very good post Peter, as always.
About prophets... being cryptic is part of their nature as prophets, so many times - as was normal with the Sibyl - what they say can mean one thing and its opposite. The problem is that relying on prophets can lead to blind faith in everything that comes out of their inspired mouths.
About science... in a world where science itself is no longer so clearly based on Newton, as quantum theory makes us understand that those unquestioned rules may not be so true, maybe the right "scientific" approach for finding order in SEO "disorder" is Chaos Theory, because - as you say - conducting experiments in a truly aseptic environment is almost impossible, and the noise and constant rule changes of the algorithm blur almost any firm evidence; in other words, a scientific result of an SEO experiment can only be valid at a very general scale and for a short frame of time. Luckily we see from experience (which is the basis of any scientific method) that there are constants, and those are what give a base to our work, even if I see us SEOs more like alchemists than scientists in white coats.
My conclusion:
conducting scientific experiments in SEO environments is possible, but really complex to pull off, because to determine anything clearly you need to count on big resources, first to carry out the experiments and then to run counter-experiments in order to confirm the results.
Maybe this is where the academic world and the professional one could find a common field in which to cooperate, also because that would mean giving everybody access to the raw data of the experiments themselves (maybe not such a good solution for the prophets...)
What a great way of summarizing the methods we use to form conclusions in the SEO industry. I especially like that you explored the pros and cons of each method. It's easy to forget that there is no "one answer" to complicated questions.
I have found that #4, wild experiments, has helped me grow the most as an SEO and achieve success.
A few reasons I think this is true:
1. Requires action
2. Develops experience
3. Exposes you to new things
One area where I think we as an SEO industry can evolve is openly sharing the data and observations from our wild experiments. This creates great discussion and constructive criticism.
#7 is definitely very attractive (and a priceless asset to whoever gets there first!). When do you think SEO is going to enter that stage?
I can't speak for the industry as a whole, but SEOmoz does do some pretty complex modeling using the Linkscape data. It's not public, but Ben has developed a limited ability to forecast how a given change might impact a site's ranking power. You're going to see some of that functionality gradually appearing in Moz tools - it's just a question of when.
You're going to see some of that functionality gradually appearing in Moz tools
These are pretty exciting times to be a PRO member.
Thanks for answering! That's pretty exciting. I look forward to the rollout of these new tools and agree with goodnewscowboy, def. a good time to be a PRO member.
Whenever there are enough funds to conduct those kinds of experiments (IMHO)... even though experiments done by SEOmoz and others in the recent past already aim in that direction.
Or whenever science-minded SEOs put their strengths together - maybe creating some sort of Science of SEO Foundation - and start by first creating a methodology and then experimenting free from any ties to day-to-day business.
[edit: ok, I've read too much Isaac Asimov in my past]
So often we SEOs just come looking for one-off tactics. But we seem to focus much less on the actual process of doing. Thanks, Pete, for painting the process with the brush of science.
Good summary Pete. It is nice to see a smart approach to the ways we can collect evidence.
There are so many lurking hidden factors in doing testing and so many people that wrongly assume that controlled tests are the only valid way to collect information.
Good job.
Really interesting way of looking at SEO, Pete. I'm not sure I'm willing or even able to divide my thoughts and views of SEO into such well-defined pigeonholes, but this does indicate the nature of the beast: that you have to be a jack-of-all-trades in your analysis, to think laterally and take every angle on board with all the pros & cons that may entail.
I don't think these seven areas are confined to SEO. You could apply these different ways of gathering evidence to any kind of statistical or analytical task.
Yes indeed, hence thinking laterally.
There are a few factors that SEO ensures in order to provide a high level of value to users, so that the PR of the website increases.
Then there are the sites that rank No. 1 contrary to what all the evidence would suggest. I love the article... but there is an SEO dark matter that still mystifies. The truth is out there.
The problem with viewing SEO as a science is that it's too Google-dependent; without Google, SEO wouldn't be much of SEO at all.
But nice article, and interesting points.
I like the portrayal of the pros & cons for each SEO technique. The "hearsay" comment is very true: clients who think they know how to get their own site onto page one - usually through some unauthorized tip that isn't very honest SEO - need guidance. This post just proves that SEO is a craft and there are a variety of paths to achieve what you want.
This is a superb post which really does look thoroughly at the methods that can be undertaken in modern-day SEO. It also reflects the wide number of ways in which campaigns can be run, proving there really is no perfect route to on-page success.
Good post Pete, the part about the prophet made me chuckle. My normal route of evidence is a few sites of my own where I might try things out first, wait for a change (if any), then apply any changes to my customers' sites if needed.
Wonderful post. As someone who has a strong background in science (wrapping up a Biology major in the near future), I really appreciate how you stuck to the tenets of the scientific method. If only we could develop some sort of SEO laboratory. Maybe the LHC has some space they don't need.
I liked how this was ordered. One can start with the prophetic message, then form a hypothesis based on that message, then test it through the various experiments.
That is what I liked about this article too. Until now I have not seen anyone explain SEO in such an understandable manner. Thanks for everything. :)
Great piece Pete.
However, the bald prophet who tells us how to optimise our sites is a bit biased... I don't tend to believe a lot of what he says :)
Very well-written article; each point you made makes complete sense, and to do SEO you have to try various strategies & techniques.
Large-scale simulation is the coolest, I think... and the reason I'm a PRO member :) Can't wait until y'all release some new stuff. Build more APIs for me to mash up with in SEO Site Tools, mmmk?
Keep up the awesome work!
Interesting insight on this. I haven't really thought of it in these terms or broken it down in this way before.
What I'm really interested in is predicting how the algorithm is going to change next. I think if all data on changes was collected from the past ten years, we'd be able to find a pattern there and figure out what they're going to do next. I mean, humans, machines, everything can have some kind of pattern, right?
I'm gonna bust out my magic markers and start drawing on my walls and windows "A Beautiful Mind" style...
It would certainly be an interesting exercise. The trick of all of this is that the algorithm is ultimately reactive. Google builds something, it works ok, people figure out how to game it, then they change the rules. Some of what they do is evolutionary, moving forward as the technology and infrastructure allows (Caffeine is a good example), but the rest is reactionary, responding to how people use the system (e.g. the "Mayday" update).
Hm...so we're reacting to Google's changes, but Google's changing depending on our usage. So, really, the way we use Google influences its changes.
So, if everyone deleted their links to Facebook, Facebook would no longer rank as highly as it does now. BUT if Google KNEW that Facebook was still the most relevant site, it would change its algorithm to make sure that Facebook still ranked well.
So, if to an extent, our actions dictate what Google does next...now what?
So, if to an extent, our actions dictate what Google does next...now what?
You ply the prophet with drinks at the next conference in the hopes of loosening his tongue.
Just kidding...really.
As something of a novice (only been doing web stuff for what, 15 years?), I see what these guys at SEOmoz do as not only incredibly important to those of us who do this stuff for a living, but engaging as well.
Talking about "Best" and more... Let's be frank. 80% of our clients as web developers (I am NOT an SEO specialist) don't care about or appreciate that we stress inbound links, H1 tags, titles, keywords, blogs, social media... yada.
My biggest issue is not LEARNING what is best for my clients... or implementing it... it's convincing them to use what I know, and pay for it!
Thoughts??
I'm not sure anyone would disagree - these are just different topics at different levels of the process. I do think that they can be complementary. Sometimes, our evidence amounts to "Well, so-and-so said so!" To everyone else in the SEO community who knows that person by reputation, it may mean something. To the client, not so much. Being able to speak to the data behind these points and communicate the impact they have can go a long way.
Truly awesome article Dr Pete.
Hi Dr. Pete,
That's a very good summary you have provided here. The important thing is that you delivered it with an edge of humor and also related it to other, simpler things, which provided better understanding. SEO is a field where effective implementation is all that matters. Apart from that, how innovative you are in your implementation will also play a vital role.
[edit: link removed]