Last week, while the SEO world was distracted by revelations that Google was blocking keyword referral data and by nostalgic mania over MC Hammer’s search engine, Search Engine Land published a leaked Google document outlining Google's official guidelines for quality raters. I read the 125-page document out of curiosity, and I decided to share some of the valuable insights it offers into the mind of Google.
Sorry, No Secrets Here
If you’re looking for SEO “secrets,” you’ll be disappointed by this post. Although this is an internal document, and Google may not be happy about it being leaked, you won’t find a smoking gun here. What you will find is a training manual on Google’s philosophy of quality. The key to proactive SEO is to understand how Google thinks. If you only chase the algorithm, you’ll always be reacting to changes after they happen. Since the document in question is proprietary, I’m not going to link directly to copies of the document or quote large chunks of it. I’m writing this post because I sincerely believe that understanding Google’s philosophy of quality is a fundamentally “white hat” proposition.
What Is A Quality Rater?
Quality raters are Google’s fact checkers – the people who work to make sure the algorithm is doing what it’s supposed to do. Data from quality raters not only serves as quality control on existing SERPs, but it helps validate potential algorithm changes. When you consider that Google tested over 13,000 algorithm changes last year, it’s a pretty important job.
This particular document focuses on rating SERP quality based on specific queries. Essentially, a rater reviews the sites returned by a given query and evaluates each result based on relevance. Raters also flag sites that they consider to be spam. One last note: Google’s philosophy is not always reflected in the algorithm. The algorithm is an attempt to code quality into rules, and that attempt will always be imperfect. The document, for example, says almost nothing about back-link count, unique linking domains, linking C-blocks, etc. Those are all metrics that attempt to quantify relevance.
Here are 16 insights into the human side of Google's quality equation, in no particular order…
(1) Relevance Is A Continuum
I think the biggest revelation of the document, in a broad sense, is that Google’s view of relevance is fairly sophisticated and nuanced. Raters are instructed to rate relevance along a continuum with 5 options: “Vital”, “Useful”, “Relevant”, “Slightly Relevant”, and “Off-topic”. Of course, there is always a certain amount of subjectivity to ratings, but Google provides many examples and detailed guidelines.
(2) Relevance & Spam Are Independent
Relevance is a rating, but spam is a flag. So, in Google’s view, a site can be useful but spammy, or it can be irrelevant but still spam-free. I think we see some of that philosophy in the algorithm. Content is relevant or irrelevant, but spam is about tactics and intent.
(3) The Most Likely Intent Rules
Some queries are ambiguous – “apple”, for example, can mean a lot of things without any context. Google instructs raters to, in most cases, use the dominant interpretation. What’s interesting is that their dominant interpretations often seem to favor big brands. In specific examples, the dominant interpretation of “apple” is Apple Computers and the dominant interpretation of “kayak” is the travel site Kayak.com.
Other interpretations (like “apple” the fruit or “kayak” the mode of transportation) automatically get lower relevance ratings if there’s a dominant interpretation. I think the notion of a dominant interpretation makes some sense, and it may be necessary for a rater to do their job, but it’s also highly subjective. In some cases, I just didn’t agree with Google’s examples, and I felt that the dominant interpretation unfairly penalized legitimate sites. Most people may want to buy an iPad when they type “apple”, but a site that specializes in online organic apple sales is still highly relevant to the ambiguous query, in my opinion.
(4) Some Results Are “Vital”
The “Vital” relevance rating is a special case. Any official entity – a company, an actor/actress, a politician, etc. – can have a vital result. In most cases, this is their official home-page. Only a dominant interpretation can be vital – Apple Vacations will never be the vital result for “apple” (sorry, Apple Vacations; I don’t make the rules). I suspect this is a safety valve for checking the algorithm – if “vital” results don’t appear for entity searches, many people would question Google’s results, even if the SEO efforts of those entities don’t measure up.
Social profiles can also be vital, if those profiles are for individuals or small groups. So, a politician, actress or rock band could have multiple “vital” pages (their home-page, their Facebook page, and their Twitter profile, for example). Interestingly, Google specifically instructs that social media profiles for companies cannot be considered vital.
(5) Generic Queries Are Never Vital
Obviously, Walmart.com is a vital result for the query “walmart”, but Couches.com is not a vital result for the query “couches”. An exact-match domain doesn’t automatically make something vital, and some queries are inherently generic.
(6) Queries Come in 3 Flavors
Query intent can be classified, according to Google, as Action (“Do”), Information (“Know”) or Navigation (“Go”). Like ice cream, queries can come in more than one flavor (although Neapolitan ice cream should never substitute banana for vanilla). This Do/Know/Go model comes up a lot in the document and is a pretty useful structure for understanding search in general. Relevance is determined by intent – if a query is clearly action-oriented (e.g. “buy computer”), then only an Action (”Do”) result can be highly relevant.
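For illustration only, here's a minimal sketch of how a naive, keyword-triggered Do/Know/Go classifier might look. To be clear, this is my own toy example – the trigger words and the function are invented, and the document says nothing about how (or whether) Google codes this model; raters apply it by human judgment.

```python
# Toy illustration of the Do/Know/Go query-intent model.
# The trigger words below are my own guesses, not Google's.

ACTION_WORDS = {"buy", "download", "order", "sign", "install", "book"}
INFO_WORDS = {"how", "what", "why", "who", "when", "history", "reviews"}
NAV_HINTS = (".com", ".org", ".net", "login", "homepage", "official site")

def classify_query(query: str) -> str:
    """Return a rough Do/Know/Go label for a query string."""
    words = query.lower().split()
    if any(w in ACTION_WORDS for w in words):
        return "Do"    # action intent, e.g. "buy computer"
    if any(hint in query.lower() for hint in NAV_HINTS):
        return "Go"    # navigational intent, e.g. "walmart.com login"
    if any(w in INFO_WORDS for w in words):
        return "Know"  # informational intent, e.g. "how do ice rinks work"
    return "Know"      # default: ambiguous queries lean informational

if __name__ == "__main__":
    for q in ("buy computer", "walmart.com", "how do ice rinks work"):
        print(q, "->", classify_query(q))
```

A real classifier would obviously lean on query logs and click behavior rather than keyword lists, but even this crude version shows why a single query can carry more than one flavor.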
(7) Useful Goes Beyond Relevance
This is wildly open to interpretation, but Google says that “useful” pages (the top rating below “vital”) should be more than just relevant – they should also be highly satisfying, authoritative, entertaining, and/or recent. This is left to the rater’s discretion, and no site has to meet all of these criteria, but it’s worth noting that relevance alone isn’t always enough to get the top ratings.
(8) Relevance Implies Language Match
If a search result clearly doesn’t match the target language of the query, then in most cases that result is low-relevance. Likewise, if a query includes or implies a specific country, and the result doesn’t match that country, the result isn’t relevant.
(9) Local Intent Can Be Automatic
Even if a query is generic, it can imply local intent. Google gives the example of “ice rink” – a query for “ice rink” should return local results, and clearly non-local results should be rated as off-topic or useless. This applies whether or not the location is in the query. Again, expect Google to infer intent more and more, and local intent is becoming increasingly important to them.
(10) Landing Page Specificity Matters
A good landing page will fit the specificity of the query. A detailed product page, for example, is a better match to a long-tail query for a specific item. On the other hand, if the query is broad, then a broader resource may be more relevant. For example, if the query is “chicken recipes”, then a page with only one recipe isn’t as relevant as a list of recipes.
(11) Misspellings Are Rated By Intent
If a query is clearly misspelled, the relevance of the results should be based on the user’s most likely intent. In the old days, targeting misspellings was a common SEO practice, but I think we’re seeing more and more that Google will automatically push searchers toward the proper spelling. It’s likely Google is only going to get more aggressive about trying to determine intent and even pushing users toward the dominant intent.
(12) Copied Content Can Be Relevant
This may come as a surprise in a Post-Panda world, but Google officially recognizes that copied content isn’t automatically low quality, as long as it’s well-organized, useful, and isn’t just designed to drive ad views. Again, this is a bit subjective, and it’s clear that you have to add value somehow. A site with nothing but copied content (whether legitimately syndicated or scraped) isn’t going to gain high marks, and a site that’s only using copied content to wrap ads around it is going to be flagged as spam.
(13) Some Queries Don’t Need Defining
Dictionary or encyclopedia pages are only useful if a query generally merits definition or more information. If most users understand the meaning of the query word(s) – Google gives the example of “bank” – then a dictionary or encyclopedia page is not considered useful. Of course, tell that to Wikipedia.
(14) Ads Without Value Are Spam
One quote stood out in the document – “If a page exists only to make money, the page is spam.” Now, some business owners will object, saying that most sites exist to make money, in some form. When Google says “only to make money”, they seem to be saying money-making without content value. It’s ok to make money and have ads on your page, as long as you have content value to back it up. If you’ve just built a portal to collect cash, then you’re a spammer.
(15) Google.com Is Low Relevance
By Google’s standards, an empty search box with no results displayed is off-topic or useless. Ironic, isn’t it? Joking aside, the document does suggest that internal search results pages can be relevant and useful in some cases.
(16) Google Raters Use Firefox
I said no secrets, but I guess this is a little bit of inside information. Google raters are instructed to use Firefox, along with the web developer add-on. Do with that as you will.
Knowing Is 53.9% of The Battle*
So, there you go – 16 insights into the mind of Google. Advanced SEO, in my opinion, really comes down to understanding how Google thinks, and how they translate their values and objectives into code. You can lose a lot of time and money only making changes when you've lost ranking – really understanding the mind of Google is the best way to future-proof your SEO efforts.
*I always wondered what the other half was – blowing stuff up, apparently.
Love this! After reading the "leaked" document I found it interesting how the quality raters get their assignments, apparently by logging into a Mechanical-Turk style interface. It's reasonable to think a busy quality rater could examine dozens of queries a day, and a small army could process thousands of queries or more a week.
We often think of Google as an anonymous algorithm, so it's strange to consider actual folks looking at your site. The higher you rank for a competitive term, the more likely someone, somewhere at Google will put actual eyeballs on you. Another reason to build for users, and not robots.
Well said. :) I printed and saved the handbook. While reading it, I stepped back and set aside what I already know, approaching it as a trainee would – "through the eyes of a quality rater."
True! I think it's a good approach because, no matter how precise Google's algorithm is, it can never beat actual eyeballs.
I strongly suspect that Google would've once balked at the idea of having an army of 1000s to do this. The post-IPO reality, though, was that trusting the algorithm just wasn't good enough. If www.ibm.com didn't come up for a search on "IBM", Google would be doubted and shareholders would get nervous. They couldn't just say "Well, the site doesn't have enough links". Someone has to make sure the black box does what most people think it should.
Exactly. Web development is still pretty chaotic, and search engines are up against so much non-standard junk that an army of reviewers is a necessity.
Confession: I downloaded and scanned the leaked document, filing it for later review. Thank you, Pete, for thinking like a search engine and summarizing 125 pages down to 16 key points for us!
Well, well... Dana, I have to confess to the same thing. I have yet to go through it in detail, but this write-up will surely help correlate the causation...
I'm sure many of us quietly downloaded it, knowing the link would disappear in a few days :)
I am so happy I was able to download it today, and grateful I found this page with so much information.
Absolutely agree with you on that Cyrus - and in the end, this is why trying to win by being smarter than the algorithm is a very dangerous approach.
I see people constantly trying to find ways to auto generate content that can fool Google, congratulating themselves on how clever they are when they think they have achieved their aim. In the end it may not be the algorithm that smacks you - well worth remembering.
Thanks Dr Pete for taking the time to share these insights - it's always good to get an indication of what others take from information like this so we can weigh our understanding against the pool of knowledge and experience out there.
Even though Google has asked every site that was hosting it to take it down, it's still available in Google: www.google.com/search?q=2011+google+quality+rater+guidelines+"pdf"&pws=0
Then again, this IS an SEO community, I'm sure we can all find it ourselves :)
I didn't get to it with your search. My tip is to use Twingly or any stand-alone crawler for that; if Google wants to hide something, they will. Try twingly.com if you want the 125-page guideline. Works now, but probably not for long.
Thanks bobJones
Just kidding.. That link did not work..
I found it on a link from Reddit.
You'll find the whole 125 pages on this page, second link: https://www.chaddo.com/index.php/leaked-google-reviewer-document/ Enjoy!
I got my start in SEO as a quality rater after college. This set of guidelines is only for one of the many tasks available to raters. It is far more complex and much larger than most of the comments here allude to, but every rater starts with the leaked document to sort of 'ground' them to an objective standard.
Being careful not to get myself into any trouble: there are also some comments on this post that highlight what most should take from this leaked 'realization' – your sites are getting looked at by people. I tend to think an SEO's primary job function is presenting content that is strategically crafted for the minds of those behind each keyword/keyphrase. For example, those searching for "organic search marketing" are perched at a completely different altitude than those searching for "website marketing". The content on the resultant pages should reflect that. What is often assumed to be a subtle difference isn't really subtle at all after you've seen 1000s of searches from a more objective p.o.v. I could provide a better example, but I have some "optimizing" to do for a client. :)
That's interesting - thanks for sharing. It's definitely important to note that this document is just one small piece of a much more complex process, and even that's only the human side.
Totally agree on spam usually not being subtle. We see that in Q&A all the time. It's amazing how easy it is to spot even a spammy link profile in a few minutes, with a little practice.
I should have waited for this before reading the whole mind-numbing bloody thing.
2 other takeaways I had from it though:
#17 Definition of Bestiality: Bestiality or zoophilia is defined as human-animal sexual interaction.
#18 Leapforce (Evaluators) and Lionbridge (Raters) are the companies they use
Hi Peter...
Yes, last week (and still this week) the big topic was Google's new "privacy" policy... but I am sure that 99% of us downloaded the Quality Raters Guidelines and read them with attention.
Yes, it doesn't reveal any secrets, but it confirms many things we were saying about the importance of content, relevance, et al.
The definitions Google uses for query intent are interesting, even though I would have loved for them to also use "transactional", because "do" can be a little bit too generic in a web world where online commerce is so important. But, that said, this is more a semantic quibble than a real issue.
What interested me was how Google gives some pointers about how to detect spam issues. Some are very "classic" (hidden content and so on), but others are quite interesting and should be read by all CMS developers – like long hyphenated URLs, which yes, can be a spam indicator, but are also the standard solution in WordPress, Joomla or any other CMS.
Finally – and this is my sarcastic spirit coming out – I found it funny that the guidelines came out online just a couple of weeks after the public presentation of the Quality Search Alliance project (a project led by ex-Google Senior Quality Raters)...
The aspect of the doc I found most intriguing was the validation of user intent as a way to predict future actions. For instance, a query that suggests a purchase action should lead to a series of interactions with a site. This, I believe, is why many affiliate sites were decimated recently: they target queries that require engagement but quickly shuttle visitors off-site. Other queries can be satisfied by a single page requiring no additional engagement. Thus one bounce doesn't equal another.
While I didn't necessarily glean this from the doc, I'd suggest that "go" and "vital" queries are very strong quality metrics. It's reasonable to think that large quantities of people looking specifically for your site is a good thing, a sign of site health.
Panda-hit sites are, more often than not, devoid of "go" or "vital" queries.
I found it interesting that Wikipedia was cited as a source to be used for 'official' pages: https://twitpic.com/7364yi/full
and that they endorsed political mockery of President Bush as relevant for Stephen Colbert: https://twitpic.com/736k0s/full
Also note the number of times Google Video/YouTube links appear and are mentioned versus other video websites (blip.tv, Vimeo, GameTrailers, GamersTube, etc.). That, and 'social' isn't vital for TV shows: https://twitpic.com/736ami/full But what if that TV show is on blip.tv or justin.tv or another site? That seems like it would be vital. Also, video sites are not social networking sites (my opinion).
Oh well. Google probably doesn't care and just thinks this: https://twitpic.com/75aba7
Dr. Pete, thanks for this information. For the last few months I have been putting SEO 2nd and usability 1st. I saw a video with Matt Cutts saying something like "Don't chase the algorithm. Instead do what's best for your customers because that's what Google is trying to do." The information you provided offers little to no SEO value, but it does provide what's needed to create a web strategy.
Thanks!
Thank you - that's very helpful. Your 16 points capture a good 70% of the value in the huge original. But for those with the time, the detail and the examples of the original are helpful. Also, it includes useful points beyond the 16 - such as when a redirect is to be considered spam and when not.
Very useful.
This is one I will give to select clients - as well as my staff. Great to be reminded about the philosophy around what we do. And a darn good summary.
For me, the biggest take away from the document was that it was the smoking-gun-undeniable-proof of Google's definitive capriciousness and lack of objectivity, couched in language that appears logical: The document reads like most Standards and Practices memos; which, as all business people are aware, are generally designed not to truly inform/educate the people in reception of it, but for CYA-purposes and leverage. This is Google's whole M.O., in fact: Google is what happens when engineers try to figure out people.
Google's aim is not to determine intent, per se; but to tell a user what that user's intent is: It has to be like this because (to the best of my knowledge) Google is not yet tapped into each of our individual cognitive matrices; and, is for profit. This, in and of itself, is remarkably revealing about the quantity and quality of conjecture Google employs to deliver any search result that is not based upon a semantically, syntactically, logically, psychologically, and grammatically pristine query. And even then, like others have mentioned, when I search for "apple" I am not going to get what I want – I am going to get what Google thinks I should want.
What is more, unless Google's quality raters are all clones, the whole document can be summed-up in a single sentence, really: "Take your best guess, ladies and gentlemen." The amount of subjective decision-making on behalf of raters ought not be underestimated. The appearance of a uniform standard does not a uniform standard make; and, as it has clearly and always been, 5 raters can look at the same one site and all five can – and likely often do or would – come to 4 or 5 different conclusions.
Are the things stated in the document 'reasonable?' Well, they're not un-reasonable. However, that which is reasonable is not, ipso facto, objective.
The greatest take away from Dr. Pete's article, itself, really should be to ‘learn to think like Google.’ The problem most people encounter when they actually start to do this is that they start thinking everyone else is retarded.
"Google raters are instructed to use Firefox, along with the web developer add-on."
So I'm assuming that Google raters also have some knowledge of web development and are looking for any markup that may be a bit "questionable". This changes the view I originally had of raters as just normal folks checking relevance, content presentation, etc.
Not Chrome?
They go into some detail on spam detection, but it's pretty basic stuff by SEO standards - keyword stuffing, cloaking, doorway pages, suspicious redirection, etc.
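Purely as an illustration of what "keyword stuffing" means in mechanical terms, here's a crude density check. This sketch is mine, not anything from the document, and the 5% threshold is completely arbitrary:

```python
# Crude keyword-density check - one mechanical proxy for "keyword stuffing".
# The 5% threshold is arbitrary and for illustration only.
import re
from collections import Counter

def keyword_density(text: str, keyword: str) -> float:
    """Fraction of words in `text` that match `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return Counter(words)[keyword.lower()] / len(words)

def looks_stuffed(text: str, keyword: str, threshold: float = 0.05) -> bool:
    return keyword_density(text, keyword) > threshold

if __name__ == "__main__":
    sample = "cheap couches cheap couches buy cheap couches cheap couches online"
    print(round(keyword_density(sample, "couches"), 2))  # 0.4
    print(looks_stuffed(sample, "couches"))              # True
```

Of course, a human rater spots this kind of thing instantly; the point is just that most of the spam signals listed are simple enough to describe in a few lines.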
+1 for blowing stuff up.
Thanks for this! I waded through the whole thing myself for good measure, but to be honest, you pretty much summed up everything really relevant about it. It's really fascinating to see how Google describes their evaluation priorities to novices – you get a real feel for what actually matters to them in terms of ranking at the moment.
The one thing that struck me was that there was no mention of paid links in the spam section. Since the threat of manual review is a constant bogeyman in discussions on whether or not to buy links, I think it's interesting that actual quality raters aren't even told to look out for them. In fact, they don't have any way of flagging them, even if they see a really egregious case!
Does anyone know if they have separate quality rater teams patrolling for paid links? Or is Google basically relying on the strength of their algorithm, their inhouse spam team, press reports (as with JC Penny) and good old-fashioned denunciation by competitors?
Yeah, I think it's pretty clear there are separate teams dealing with link-based spam. What's unclear is how that process fits into this process and what role the more familiar anti-spam team plays in quality control for Google results and algo updates. I suspect the process goes something like this - the spam team identifies problems and recommends algo changes to fix them. Those recommendations become live tests. Then, the quality raters that this document covers measure the impact of those tests.
I know everyone's excited about the Google leak but I completely missed the MC Hammer news!
Oh, don't get me wrong - it's awesome :) Jen (@jennita) even talked with MC Hammer on Twitter this past weekend. It could be some kind of crazy publicity stunt, but apparently he's been investing heavily in tech. start-ups the past couple of years.
I'm thinking a comeback for Hammer Time: 'you STILL can't touch this'!
Nice post; I think these points are all good with regard to the rating guidelines. Another good point to note is the change in patents – when they are accepted, more information about the mind of Google's algo can come to light, as seen with the release of a patent today that has an impact on exact-match domains: https://www.seobythesea.com/2011/10/googles-exact-match-domain-name-patent-detecting-commercial-queries/ But yes, interesting times ahead, that is for sure =)
An interesting read and, as you have said, no secrets, but still insightful. Thank you.
Seems someone has already jumped on the opportunity and set up a website (leakedqualityguidelines.com) and download link for the "leaked" guidelines. To me, this is probably more amusing than the actual document itself.
Sure is a small piece of the apple pie.. but especially since watching SEO almost neglect this for years, I'm excited to see how this small piece eventually becomes a bigger piece of SEO discourse. In fact, I'd say that the quality rating system arose as a response to SEO-like activities. I'd also venture to say that they will both scale up co-dependently. In the end, the web will be full of content that is indirectly structured by Google's preferences and, in a way, it already is. The further search engines dip into semantics, the larger that piece of pie becomes. Time to put my voo-doo doll and bag of chicken bones down for the afternoon... very interesting discussion though. Job well done on your summary.
RDK said: In the end, the web will be full of content that is indirectly structured by Google's preferences
To me this is just saying that the web is evolving into something more intuitive and predictive that the end-users will appreciate and use even more. So following Google's lead is a good thing.
Thank you for this information. The concise writeup review was a great help.
Pete, what a fantastic article. This has been very useful. The points you make are very relevant, and it was explained in a very easy-to-understand (by me) way!
Keith
This is the first clear-cut, simple and direct summary I've read about the fall-out for Panda and how it impacts those of us who have websites and want to make them productive and useful. I am not SEO smart and have been struggling for some time to grasp the whole keyword-whatever stuff. It goes against all my training and I've had to do a complete flip in my thinking. However, THIS makes absolute sense to me...so I guess there's hope.
Thank you! (And I'm so happy that even at this late date I was able to download the full document so I can go through it and run it through my internal translator.)
I was with Leapforce about a year ago before the Panda algorithm came out (even before I became an SEO).
The problem with Google's human evaluation is that while Google wants human raters to be involved in the process, they didn't provide enough incentive to spend enough time rating content, because they basically paid you BY TASK. So if you rated 60 pages in 60 minutes, you could get about $15/hour. How great is the evaluation if you take a minute or less to read a page, evaluate a page, determine if it is spam, check for keyword stuffing/hiding, see if the author is well-known, etc.?
Also the turnover rate is very high for Leapforce. I question this process because if you don't treat your employees like humans, how do you expect that human evaluation quality to show significantly through your algorithm?
Another interesting thing to know is whether they are demographically/geographically distributed. If, for example, they are all young, recent grads from in and around Mountain View, CA, that would certainly skew their ratings, especially as it seems so many things, like "Most Likely Intent", are so heavily subjective.
I love the article. I hadn't heard about the doc, but after your article I probably don't need to read it now :)
In the end it comes down to common sense, good content and a good website. What it's been about all along
Excellent summary Dr. Pete. So glad to be in a community where a leaked document from Google offers basically no new surprises when it comes to the philosophy of SEO. I freaking love being a part of the SEOmoz community.
Here are my questions - let's say you are launching a new design for your site that is a huge improvement for users. I know Cyrus said they could process 1000s of queries a week, but is there any way to expedite that process? If you feel very confident about your new design and that it is an improvement, any way to draw in the quality raters?
It can be easy to confuse the various rating/review processes at Google, but I think this process sits outside of spam review and even individual sites, at least in the sense we think of it as SEOs. These raters are helping determine if the algorithm is working as expected and if changes perform well. While spam ratings from this process feed back into the system, it's not really on a site-by-site basis.
The team that manually reviews sites to see if they're spam is separate and probably uses a different process (with some similarities, of course). For new sites and new designs, I think it's pretty rare that manual intervention comes into play at all. Google is going to let the algorithm itself do the heavy lifting.
Secrets or no secrets... what I find frustrating with Google is when I see a couple of junk sites while I do my keyword research. It goes completely against whatever the SEO industry says about Google. I can only calm myself by noting that it's just a mere algorithm, and that the quality raters are only human!
Also very interesting to see that raters are instructed to use Firefox vs. Google Chrome; really curious why that would be.
Thank you so much! Now we only need to know the other 66.1%.
My fair idea of this 66.1% is -- start acting on these clues. ...
- Ner, the iHerb Builder
Great post! I prefer your number 1 and number 2. Relevance is key to everything, and I think Google is cracking down on all those sites out there that aren't following. It'll catch up with them one way or another.
Unique, quality content accounts for a lot in high search engine rankings. It is important to put unique content on your website to improve your rankings in the SERPs.
This was a great post! Really liked reading about the "vital" pages and your illustration of couches.com not being a vital page
Thanks Dr Pete - I hadn't seen the leaked doc (have now, thanks to the link provided in the comments) but your summary was great.
I especially liked the reference to Do, Know, Go - I always try to educate my clients on keywords with intent, as I am sure we all do, but this is a simple structure to always bear in mind ... and easy to explain!
It's actually reassuring to know that not all their rankings are solely algorithm-derived.
There are so many nuances in humans and their interests that it really takes a human to understand many of our quirks.
You have a spelling mistake, on insight 7:
but it’s worth nothing
that should read
but it’s worth noting
Thanks, Paul. Even spell-check can't always save us :)
Fantastic post. Thanks Dr. Pete.
Good piece. Thank you for sharing what Google might not have otherwise shared.
Great summary of the Google document. It appears that Bing is also using human judges.
https://www.prodigalwebmaster.com/2011/10/bing-ranking-algorithm-how-would-a-human-judge-your-site/
Brilliant, Pete!
Thanks for summarizing!
Awesome, I never thought I would ever be reading this! Very cool
In the case of real estate sites, which are made to sell homes, there are a lot of issues to address in these variables. A lot of descriptions use repeated terms hundreds of times, and while some provide information about buying, selling and other interesting aspects of the trade, most pages are about properties for sale. We operate a site for real estate in the Dominican Republic and have noticed that one of the most influential factors in the SEO competition is link exchange.
Thanks, great share. I am just starting to do SEO for our PP spunbond nonwoven fabric website, www.nonwoven-fabric.net. Hope it can help us.
[link removed]
Thank you for this, Dr Pete. I've had a quick read of the document myself, but you've really hit the nail on the head with this article. Interestingly, I noticed the mention of the keyword 'bank' and decided to have a look for myself in Google UK. The result was that a national clothing store actually ranks first, rather than any of the popular financial banks in the UK. It seems the notion of intent could be influencing this result – maybe more users are searching for the clothing brand than for banks, and the quality raters have picked up on this.
The most interesting part of this document is "fighting spam". Good to know that an SQT member can check WHOIS, or check whether a domain is too long (too many keywords).
Great summary, and I'm looking forward to reading the full document over the weekend.
I wonder to what extent and in which form the raters use the Firefox extension.
I suppose the same as we do for a five-minute audit: checking the site in Googlebot view.
Yeah, Gianluca is on the right track - they suggest using it to detect cloaking. For example, to shut off JavaScript, etc.
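The rater workflow is manual – toggle JavaScript/CSS in the add-on and eyeball the difference – but a rough programmatic analogue (entirely my own sketch, not something the document describes) is to compare what a page serves to a normal browser user-agent versus a Googlebot user-agent:

```python
# Rough cloaking check: does the page serve different content to Googlebot?
# This is only a crude analogue of the raters' manual check with the add-on.
import re
import requests  # assumes the requests package is installed

BROWSER_UA = "Mozilla/5.0 (Windows NT 6.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1"
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

def visible_words(url: str, user_agent: str) -> set:
    """Fetch `url` with a given user-agent and return its word set (tags stripped)."""
    html = requests.get(url, headers={"User-Agent": user_agent}, timeout=10).text
    text = re.sub(r"<[^>]+>", " ", html)  # crude tag removal
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def maybe_cloaking(url: str, min_overlap: float = 0.8) -> bool:
    """Flag the page if the two views share less than `min_overlap` of their words."""
    browser_view = visible_words(url, BROWSER_UA)
    bot_view = visible_words(url, GOOGLEBOT_UA)
    overlap = len(browser_view & bot_view) / max(len(browser_view | bot_view), 1)
    return overlap < min_overlap

if __name__ == "__main__":
    print(maybe_cloaking("http://example.com"))  # a non-cloaking site prints False
```

A rater obviously wouldn't run anything like this; it just shows why serving different content to different user-agents is such an easy thing to catch.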
I checked some topics and I'm amused; in fact, I'm sure you summarized it quite cleverly in just 16 points. I'm looking forward to reading the whole leaked document in the coming weekends :)
I found the parts on webspam the most interesting. In particular, I was intrigued by the notion that a long, hyphenated, keyword-rich URL is a big indicator of spam. I would love to discuss this. I have a site that has some long URLs that are full of keywords. The content is good quality, but perhaps the long URLs are detrimental to me?
I am going to do an experiment where I use the rel=canonical tag to point to a much shorter URL and see what happens. But I'm going to wait for now, as my site is experiencing some "Post Panda flux", as Mr. Cutts would put it. Once my visitor count stabilizes I'll see what happens when I shorten these URLs.
https://www.youtube.com/watch?v=zGo83DGspMA
Great post and an interesting read. Very peculiar that Google says their own home page is bad, and that they tell their raters to use Firefox and not Chrome.
I think some testing is required for the Do/Know/Go part, to see how Google rates certain pages for the same query.
I'm not sure what the logic is on Firefox vs. Chrome, other than that Google is a dev-heavy culture, and many people probably still use Firefox for the add-ons. That may change over time, of course.
Perhaps they don't want to dirty the extra metrics they're getting from Chrome?
I see the rationale for using FF appears to be the web dev add-on (disabling CSS and such) – this suggests to me that the advice predates [widespread use of] Chrome and its built-in dev tools.
100% Agreed! But sooner the better...
I still have a copy of this document, but I posted a link on Twitter and within a few minutes Google had emailed me, telling me off and asking me to remove it.
I am not sure if the leak was intentional or not, but it was certainly an interesting read – even if nothing really new was found out.
Ha... I once used the line "It's Google, not God; intentions don't matter on the internet" to take a shot at a rival newspaper website that used to fill its pages with crappy content, obviously and poorly written for search engines. I guess Google is now trying to become more like God in trying to read the intentions of us searchers! I guess it's for the better, though. Nice post, Dr!
Nice one Dr Pete. I like the "think as if you were a search engine" approach to SEO, so really enjoyed your summary of the guidelines. Now, if only I could grab a copy for myself...
Set aside for a moment the fact that it's a leaked Google document and look at these 16 points... I'd call some of them the blueprint of a website that is loved by people as well as search engines.
So, like most of the people above me, I had also set this document aside to read over the weekend, but thanks to you for the summary ;)
Thanks, Pete for doing the leg work for us.
Great breakdown Dr. Pete.
Unfortunately, it's not overly valuable for those who already conduct ethical, 'white-hat' SEO, and (also unfortunately) those who don't won't pay a blind bit of attention to how Google determines quality and ratings.
*Sigh*
I'm always blown away by how much effort you put into your posts. Thank you for always keeping me informed and interested.
Thanks. The way I see it, I'm getting paid to do something that I probably would've done anyway, just because I'm a nerd :)
Thank you "private citizen and general troublemaker"! lol
I would have assumed the raters would have been instructed to use Chrome with the link checker add-on.
"Most people may want to buy an iPad when they type “apple”, but a site that specializes in online organic apple sales is still highly relevant to the ambiguous query, in my opinion."
Completely disagree with you on that one I'm afraid Dr. Pete. I think 99.999999999% of people looking for organic apples will search "organic apples" or even "apples", not "apple". If you care about your fruit being organic you're gonna specify it in your search. I do see what you're getting at but unfortunately I can't think of a better example to use than you did!
However that's a great blog post and thanks to my fellow commenters for the links to the guidelines, been looking for these for a few days now.
oh weird, I posted pretty much the exact same comment (below) at pretty much the same time as you!
That's an imperfect example, I'll grant you. I actually did a post a while back about how Apple dominated the "apple" query (after one of the brand updates), and some analysis did suggest that not everyone was looking for Apple products.
My broader concern is that the "dominant interpretation" idea automatically devalues other interpretations, and I'm not sure that's reasonable. I get the logic, to a point, and I see why a rater may need to use a dominant interpretation mindset, but I don't agree that other interpretations should automatically be less relevant. Some queries are naturally ambiguous. I also found that Google's examples almost always favored brands.
I should be clear - I don't think this is a conspiracy or that Google is 100% wrong. I just think that their approach could have negative consequences for search diversity.
I agree that it's an imperfect method, but I don't know that there's any better. Search diversity is a top priority for raters – the reason a rater is asked to choose the "dominant interpretation" is to see what the digital algorithm cannot – what is the popular opinion? Ambiguity comes with the territory, and for those queries that are more ambiguous, more emphasis is put on keeping variety in the top results. I don't think there are "negative consequences for search diversity" based on a dominant interpretation mindset. That mindset is the human variable. When it comes to ambiguous search terms, what method would you propose to better know what Google searchers are most often interested in than asking them to make that choice? Are most queries for the search term "Apple" done by people looking for an iPad or potential ingredients for an afternoon pie bake? I wouldn't dare make the assumption and, instead, would go the good old-fashioned statistical route.
I also wanted to say that you pointed out the center of debate (for a small circle of search nerds). Where is the "common" interpretation defined, when the medium from which information is gathered and disseminated is partially responsible for what is "common"? We mold technology as it, in turn, molds us – an age-old debate. Self-reflexivity on a global, social scale. Should we be concerned? I have to say, um, heck yes (in anxious nerd speak).
I think you hit it on the head there, actually, and maybe I didn't make my point very well. My broader concern is that, once the dominant interpretation drives the results, then that interpretation just becomes more dominant. It's a potentially vicious cycle that can damage the diversity of search over time.
Again, I get the core reasoning. I don't think it's a conspiracy, so much as a side-effect we have to be aware of. I'm just not sure that they're being fair to equally valid, if less common, alternatives. I don't think Apple Vacations should be docked for not being Apple Computers, but, on the other hand, one definitely has a much bigger place in the public consciousness.
I recently did an article on Honeycrisp apples and was checking out keywords and found that Apple did, indeed, come up. :) I think the illustration is a good caution. Sometimes we get caught up in what we are writing and assume way too much.
Great information!
I would never have had the time to read that 125 pages by myself :-)
Surely if someone wants to buy organic apples they will search for [apples], not [apple]?
Pedantry aside, this is a nice summary
I think that this is one of the few posts that are worth bookmarking. It's one of those posts which is well prepared, insightful and, as always, has the awesomeness of Dr. Pete written all over it. Kudos!!
Out of curiosity, I got a copy of the 125-pager myself. It's a large document, and I find that the post summarizes it all.
Absolutely agree with you. I had the opportunity to read the document; there aren't any secrets, but understanding how Google thinks is a good way to improve your rankings in the SERPs.
I'm especially interested in knowing what makes Google flag a website as spam.
If somebody wants to read it, Björn Trolin posted a link a few comments back. :)
Thanks Pete, great summary. "Content is relevant or irrelevant, but spam is about tactics and intent." well said. But still seems very subjective.
Funny, I have all of this info sitting in my inbox right now, sent to me directly from Google.
I answered a craigslist job ad for a "Search Quality Rater" a few months ago that I thought I might want to do on the side. It didn't say upfront that the job was working for Google, but it was pretty obvious once the training materials came in.
Does anyone have the PDF they can hook me up with? Please private message me. I found several dead links.
A few months ago, Google released questions to ask yourself when working on a website ('guidance on building high quality'). Is this guidebook similar?
www.leakedqualityguidelines.com
TY, the link works perfectly... I was fortunate enough to have the PDF sent to me... 125 pages!!
Great article!
Absolutely loved #10. Time and time again we are reminded that information architecture and site hierarchy play a huge role in SEO.
Curiosity got the best of me and I had to track down a copy of the PDF for myself. Although, as you said, it does not contain any blockbuster secrets, it's always interesting to see how they look at quality internally.
If you have trouble finding a copy here is a hint: Google is not the only search engine.
What?! The guidelines specifically say that Google is the only search engine. YOU ARE A BED OF LIES! ;)
Busted :(
The people look at the algorithm, not the individual site.
Thanks for the shared information regarding Google's rater guidelines, but I'm not getting any information about the whole process of Google's algorithm changes, which you should have described in addition to the raters' rating process. Just Google it and you will find info on the algo update procedure. Additionally, you could take a quick look at Google Insights to make it more informative. I know Google raters are not employees of Google, but they are people like us, sitting in remote locations and contributing to the algo update process. You also didn't mention the difference between the Panda update and Google raters' ratings in improving the search results as well as the algo updates (don't think of it as off-topic). And Google raters don't just give ratings to websites; they also rate many other changes to Google's searches, e.g. the spell-suggestion change (which happened not long ago) – just Google it! BTW, thanks for the info, because you did shed a little light on it. If you need more info and answers to my questions, just Google the term "Potpiegirl(dot)com".
I am highly unsatisfied this time... hoping that in the future I won't see just a fraction of the information! :(
If you're interested in Google Algorithm updates, we have a "living document" that tracks Google Algo changes. This includes Panda updates, and I personally keep it up-to-date.
I'm sure a comprehensive post covering all of these issues would be appropriate for YOUmoz, Ajay. Since you seem so well-informed on the topic, enlighten us.