I'll begin with a quote from Google's Guidelines on Cloaking:
Serving up different results based on user agent may cause your site to be perceived as deceptive and removed from the Google index.
There are two critical pieces in that sentence - "may" and "user agent." Now, it's true that if you cloak in the wrong ways, with the wrong intent, Google (and the other search engines) "may" remove you from their index, and if you do it egregiously, they certainly will. But, in many cases, it's the right thing to do, both from a user experience perspective and from an engine's.
To start, let me list a number of web properties that currently cloak without penalty or retribution.
- Google - Search for "google toolbar" or "google translate" or "adwords" or any number of Google properties and note how the URL you see in the search results and the one you land on almost never match. What's more, on many of these pages, whether you're logged in or not, you might see different content from what's in the cache.
- NYTimes.com - The interstitial ads, the request to login/create an account after 5 clicks, and the archive inclusion are all showing different content to engines vs. humans.
- Forbes.com - Even the home page can't be reached without first viewing a full page interstitial ad, and comparing Google's "cached text" of most pages to the components that humans see is vastly different.
- Wine.com - In addition to some redirection based on your path, there's the state overlay forcing you to select a shipping location prior to seeing any prices (or any pages). That's a form the engines don't have to fill out.
- WebmasterWorld.com - Pioneers of the now permissible and tolerated "first click free," Googlebot (and only GGbot from the right set of IP addresses) is allowed access to thousands of clicks without any registration.
- Yelp.com - Geotargeting through cookies based on location; a very, very popular form of local targeting that hundreds, if not thousands of sites use.
- Amazon.com - In addition to the cloaking issues that were brought up on the product pages at SMX Advanced, Amazon does lots of fun things with their buybox.amazon.com subdomain and with the navigation paths & suggested products if your browser accepts cookies.
- iPerceptions.com - The site itself doesn't cloak, but their pop-up overlay is only seen by cookied humans, and appears on hundreds of sites (not to mention it's a project of one of Google's staffers).
- InformationWeek.com - If you surf as Googlebot, you'll get a much more streamlined, less ad-intensive, interstitial free browsing experience.
- ComputerWorld.com - Interstitials, pop-ups, and even some strange javascript await the non-bot surfers.
- ATT.com - Everyone who hits the URL gets a unique landing page with different links and content.
- Salon.com - No need for an ad sponsored "site pass" if you're Googlebot :)
- CareerBuilder.com - The URLs you and I see are entirely different than the ones the bots get.
- CNet.com - You can't even reach the homepage as a human without seeing the latest digital camera ad overlay.
- Scribd.com - The documents we see look pretty different (in format and accessibility) than the HTML text that's there for the search engines.
- Trulia.com - As was just documented this past week, they're doing some interesting re-directs on partner pages and their own site.
- Nike.com - The 1.5 million URLs you see in Google's index don't actually exist if you've got Flash enabled.
- Wall Street Journal - Simply switching your user-agent to Googlebot gets you past all those pesky "pay to access" breaks after the first paragraph of the article.
This list could go on for hundreds more results, but the message should be clear. Cloaking isn't always evil, it won't always get you banned, and you can do some pretty smart things with it, so long as you're either:
A) A big brand that Google's not going to get angry with for more than a day or two if you step over the line, OR
B) Doing the cloaking in a completely white hat way with a positive intent for users and engines.
Here's a visual interpretation of my personal cloaking scale:
Let's run through some examples of each:
Pearly White - On SEOmoz, we have PRO content like our Q+A pages, link directory, PRO Guides, etc. These are available only to PRO members, so we show a snippet to search engines and non-PRO members, and the full version to folks who are logged into a PRO account. Technically, it's showing search engines and some users different things, but it's based on the cookie and it's done in exactly the type of way engines would want. Conceptually, we could participate in Google News's first-click free program and get all of that content into the engine, but haven't done so to date.
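For the technically inclined, here's roughly how that kind of cookie-based gating gets wired up. This is a simplified sketch, not our actual code - the Flask framework and every name in it are just for illustration:

```python
# Hypothetical sketch (not SEOmoz's real code): serve a teaser to anyone
# without a PRO session cookie -- including search engine bots -- and the
# full article only to logged-in PRO members. All names are invented.
from flask import Flask, request

app = Flask(__name__)

ARTICLES = {
    "link-directory": {
        "teaser": "The first 200 words of the PRO guide...",
        "full": "The complete PRO guide, thousands of words long...",
    }
}

def is_pro_member(req) -> bool:
    # A real system would validate a signed session cookie against the
    # membership database; here we just check that the cookie exists.
    return req.cookies.get("pro_session") is not None

@app.route("/pro/<slug>")
def pro_article(slug):
    article = ARTICLES.get(slug)
    if article is None:
        return "Not found", 404
    # Bots and non-members get exactly what a logged-out human sees,
    # so engines are never shown more than an ordinary visitor would get.
    body = article["full"] if is_pro_member(request) else article["teaser"]
    return body, 200
```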
Near White - Craigslist.org does some automatic geo-targeting to help determine where a visitor is coming from and what city's page they'd want to see. Google reps have said publicly that they're OK with this so long as Craigslist treats search engine bots the same way. But, of course, they don't. Bots get redirected to a page that I can only see in Google's cache (or if I switch my user agent). It makes sense, though - the engines shouldn't be dropped onto a geo-targeted page; they should be treated like a user coming from everywhere (or nowhere, depending on your philosophical interpretation of Zen and the art of IP geo-location). Despite going against a guideline, it's so extremely close to white hat, particularly from an intention and functionality point-of-view, that there's almost no risk of problems.
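A rough sketch of the Craigslist-style pattern (purely illustrative - I have no idea how their stack actually works, the names are invented, and real implementations typically verify bots by IP and reverse DNS rather than trusting the user agent string):

```python
# Hypothetical geo-targeting sketch: humans get redirected to their city's
# subdomain based on IP, while known crawlers stay on the generic page so
# every bot sees the same thing regardless of where it crawls from.
from flask import Flask, request, redirect

app = Flask(__name__)

KNOWN_BOT_TOKENS = ("googlebot", "bingbot", "slurp")

def looks_like_bot(user_agent: str) -> bool:
    ua = (user_agent or "").lower()
    return any(token in ua for token in KNOWN_BOT_TOKENS)

def city_for_ip(ip: str) -> str:
    # Stand-in for a real IP-geolocation lookup (e.g. a MaxMind database).
    return "sfbay"

@app.route("/")
def home():
    if looks_like_bot(request.headers.get("User-Agent", "")):
        # Crawlers are "from everywhere": no redirect, just the generic page.
        return "Pick your city: ...", 200
    city = city_for_ip(request.remote_addr)
    # Temporary redirect, since the right city varies per visitor.
    return redirect(f"https://{city}.example.org/", code=302)
```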
Light Gray - I don't particularly want to "out" anyone who's doing this now, so let me instead offer an example of when and where light gray would happen (if you're really diligent, you can see a couple of the sites above engaging in this type of behavior). Imagine you've got a site with lots of paginated articles on it. The articles are long - thousands of words - and even from a user experience point-of-view, the breakup of the pages is valuable. But each page is getting linked to separately, and there's a "view on one page" URL, a "print version" URL, and an "email a friend" URL that are all getting indexed. Often, when an article's interesting, folks will pick it up on services like Reddit and link to the print-only version, or to an interior page of the article in the paginated version. The engines are dealing with duplicate content out the wazoo, so the site detects search engines and 301s all the different versions of the article back to the original, view-on-one-page source, but drops visitors who click that search result onto the article's first page in the paginated version.
Once again, the site is technically violating guidelines (and a little more so than in the near-white example), but it's still well-intentioned, and it really, really helps engines like MSN & Ask.com, who don't do a terrific job with duplicate content detection and canonicalization (and, to be fair, even Yahoo! and Google get stuck on this quite a bit). So - good intentions + positive user experience that meets expectations + use of a proclaimed shady tactic = light gray. Most of your big brand sites can get away with this ad infinitum.
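If you're curious what that light gray setup looks like mechanically, here's a bare-bones sketch (hypothetical URLs and a naive user agent check, nothing more - not a recommendation for any specific site):

```python
# Hypothetical "light gray" sketch: print, email, and paginated versions of
# an article 301 to the single-page canonical version when a crawler requests
# them, while humans still get the paginated/print views they clicked on.
from flask import Flask, request, redirect

app = Flask(__name__)

def is_crawler(req) -> bool:
    ua = req.headers.get("User-Agent", "").lower()
    return any(bot in ua for bot in ("googlebot", "bingbot", "slurp"))

@app.route("/article/<int:article_id>/print")
@app.route("/article/<int:article_id>/page/<int:page>")
def article_variant(article_id, page=1):
    if is_crawler(request):
        # Consolidate every variant onto one indexable URL for the engines.
        return redirect(f"/article/{article_id}/all", code=301)
    return f"Human-facing view of article {article_id}, page {page}", 200

@app.route("/article/<int:article_id>/all")
def article_all(article_id):
    return f"The full single-page version of article {article_id}", 200
```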
Dark Gray - Again, I'll give a hypothetical rather than call someone out. There are many folks who participate in affiliate programs, and the vast majority of these send their links through a redirect in Javascript, both to capture the click for their tracking purposes and to stop link juice from passing. Some savvier site owners have realized how valuable that affiliate link juice can be and have set up their own affiliate systems that do pass link juice, often by collecting links to unique pages, then 301'ing those for bots, passing the benefit of the links on to pages on their domain where they need external links to rank. The more crafty ones even sell or divide a share of this link juice to their partners or the highest bidder. This doesn't necessarily affect visitors who come seeking what the affiliate's linked to, but it can create some artificial ranking boosts, as the engines don't want to count affiliate links in the first place, and certainly don't want them helping pages they never intended to receive their traffic.
Solid Black - Since I found some pure spam that does this, I thought I'd share. I recently performed a search at Google for inurl:sitemap.xml, hoping to get an estimate of how many sites use sitemaps. In the 9th position, I found an odd URL - www.acta-endo.ro/new/viagra/sitemap.xml.html - which redirects humans to a page on pharmaceuticals. Any time a search result misleadingly takes you to content that not only differs from what was shown to the engine, but isn't relevant to your search query, I consider it solid black.
Now for a bit of honesty - we've recommended pearly white, near white, and yes, even light gray to our clients in the past and we'll continue to do so in the future when and where it makes sense. Search engine reps may decry it publicly, but the engines all permit some forms of cloaking (usually at least up to light gray) and even encourage it from brands/sites where it provides a better, more accessible experience.
The lesson here is don't be scared off a tactic just because you hear it might be black hat or gray hat. Do your own research, form your own opinions, test on non-client sites, and do what makes the most sense for your business and your client. The only thing we have to fear is fear itself (and overzealous banning, but that's pretty rare). :-)
p.s. The takeaway from this post should not be "cloak your site." I'm merely suggesting that inflexible, pure black-and-white positions on cloaking deserve potential re-thinking.
I am so outraged by Matt Cutts continuously pointing out the "ONE" site they penalized and saying that cloaking is bad. It would probably take him 5 minutes to penalize the other 16 mentioned here. I have a question for Sarah Byrd, Esq. (our beloved online law expert). Let's say Matt does not penalize any of these sites in a week. And let's assume I decide to follow one of the tactics used here, but his spam team penalizes me. Can I sue Google for discrimination? I know it sounds crazy, but a line needs to be drawn somewhere. Matt is still using a hit and run tactic to avoid the subject of big business immunity.
My hat is off to Rand for fighting the issue and asking for a clear answer. As a small business consultant myself, I can see many small business owners saying put up or shut up to Matt.
Nice question Mert, I'm interested in that answer also.
Thanks for the post rand.
Mert - I think you might be unfairly miscategorizing Matt's intent and his comments. He's noting that one of the sites did the cloaking in an egregious manner, and others did not. Clearly, that speaks to the value of considering a graduated scale (like the one I've proposed). But it does not mean that there's "discrimination" of any kind. Nearly all of those sites are cloaking in very white hat, legitimate ways.
Matt's position isn't that Google is practicing hypocrisy, or even that cloaking is always bad. He just rates it as higher risk than I do - and I think he's a bit upset and concerned that this post encourages bad practices or encourages experimenting with them (which is fair criticism, but I have a lot of faith in our readers to be smart about it).
Also - Google's index is theirs to do with as they please. They could throw out all my clients' sites because they didn't like the way I brushed my teeth and there would be no recompense. They try to stay fair, stay true to their ethics and do the best they can, but it's out of self-policing, not governmental or legal intervention.
BTW - Sarah's last name is "Bird" :)
Sarah,
My truest apologies for the misspelling. You should try spelling my last name:
(S-a-h-i-n-o-g-l-u)
Rand,
Matt clearly says cloaking is risky behavior and that there are ways around it. Here is an example. In this blog post, in the comments section, Phil from Trulia writes a very, very long post claiming, "oh, we were cloaking to prevent duplicate content." As a professional SEO, does anyone here believe that is the case? What happened to using robots=noindex to take out duplicate content but still retain link juice?
In that Sphinn story you linked to about Trulia, does the Seattle Weekly actually want to give a link to Trulia's Seattle Real Estate optimized page or is Trulia doing this to game the search engines to rank for the keyword "Seattle Real Estate" (currently #3) just like they do in Chicago, New York, and many other major cities around the country? Strictly from Matt's point of view (and I love it when he says this when asked about any SERP situation), "Do they deserve to be ranking?" Does a company who simply publishes duplicate content received from other real estate website owners (creating duplicate content in the process) who cloaks to rank higher for popular keywords by "syndicating" (don't know how else to describe it) already duplicated information to other websites truly deserve to be ranking there? I think Aaron Wall made similar examples before about Google using similar tactics for its own financial gain.
Similar arguments could be made about all the other sites in your list from their competitors. So, I ask again, will Google penalize Trulia (these companies are half sisters as their parent VC company is the same (Sequoia)) or will Matt continue to say it is risky behavior and slip away?
Sorry Rand, Matt insists that the risks are much higher for cloaking. Are you telling me that Trulia took that chance with millions of dollars at stake? Is there a different amount of risk for a mom and pop store than for Trulia? If there is, isn't this economic class discrimination?
P.S. I am not trying to attack Matt individually. I just truly believe this debate's opposing side should really be executives way above his pay scale in Google that will probably never read this article.
Mert - on the Trulia example, I'm thinking that's pretty light gray - the Seattle Weekly is clearly giving an editorial endorsement of the quality and value of Trulia's content by hosting it on their site. And Trulia is right to be concerned with duplicate content on other, powerful domains.
The licensing question is a tough one, and if it were up to me, I'd probably just have links pointing to the page on Trulia from the SeattleWeekly, nofollow the links to the Trulia content on SeattleWeekly and noindex those pages - same result, slightly less tactically gray. I think we're getting pretty semantic and picky if we're going to say they deserve to be thrown out of the index.
This whole post is about how cloaking really shouldn't be about tactics, but about intent to manipulate and deceive. Granted, I only read the post (and haven't investigated everything Trulia's doing), but if that's the extent of their issue, I'd say they don't deserve the reprimand. I'd be willing to apply that across the board with similar situations - if Fodors content on the NYTimes travel section 301'd to Fodors.com when engines hit it, I think that would be fine - not my preferred method, but certainly not disingenuous and still providing the same user experience.
Rand,
My point was not whether what Trulia did is right or wrong or acceptable, etc. from an SEO perspective. My entire point is that Matt is not willing to draw a guideline on acceptability in cloaking. Did Trulia do this to play the search engines? Yes. You asked for more examples. Trulia gives away map widgets, but Trulia gets back "city real estate" anchor text links to their city pages at the bottom of the map. I am almost positive it was Matt's hand (since a very large network went down in one hour) when he took down the #1 (city real estate keyword) real estate sites network in many areas of the country because they did something similar with song widgets (getting "city real estate" anchor text links for song videos). Trulia takes listing content from real estate brokers but gives a nofollowed link to the source of the listing, the broker. I am not even debating the acceptability of these actions from an SEO's perspective or your scale in this post. I want to know if a mom and pop shop would be hurt doing any of the above cloaking. You call it gray hat, Matt calls it high risk behavior. At the end of the day, will every top 5 search result be Wikipedia, Yahoo, CNN, Trulia (or a similar big site), Yelp, or YouTube? Will user generated content sites dominate the search engines? If that is the case, what is the point of being an SEO other than chasing longtail scraps (that sounds extreme for now, but this is not a distant future)? As long as the lines are drawn so the big guys are not afraid of Google when doing high risk behavior - heck, I dare to ask Google - then what will be the point of using Google search?
No, they're not. Seattle Weekly is linking out to a subdomain on trulia.com that is branded as Seattle Weekly. Trulia then cloaks a 301 redirect to their optimized page. IMHO, this is more akin to the "dark grey" example you gave above:
I disagree - by placing Trulia's content on their site (or wanting to license it), SeattleWeekly is giving a very significant editorial endorsement of that material's value and usefulness; far more so than any link, actually. So, while the method might not be the best, Trulia is not paying SeattleWeekly for the link - it's a true, editorial endorsement. Tactics aside, I'd say intent is more critical in this case, and the intent to me is clean.
And I say this as someone who has friends and clients competing with Trulia, so I'm actually more likely to be biased against them.
It's an editorial endorsement in that they link out to the subdomain, but they haven't placed any of Trulia's content on SeattleWeekly. The "shades of gray" scale is purely subjective. I was merely pointing out that the misdirection of PR is very similar to what you've described in your "dark grey" example.
If the intent is to avoid duplicate content, there are much easier & efficient ways of doing so. So why the conditional redirect? Purely for SEO benefit. There's nothing wrong w/ using strong SEO tactics - I'm just surprised that Google is allowing it here.
Rand said,
Also - Google's index is theirs to do with as they please.
Right, it's Google's index. But can't this be covered under monopoly malpractice? Isn't it like IE and Windows Media Player being Microsoft's software - they could give them away for free, and so what if they didn't allow users (in previous versions of Windows) to remove IE and Media Player through "Add and Remove Programs"?
I am not a lawyer, but it would be really interesting to know a lawyer's take on this. Can a small company that has been purposefully excluded from Google's index as a penalty sue Google for monopoly malpractice if some big company doing similar things is still included in Google's index?
Sarah, if you are reading this, your opinion really matters on this topic. It will be great if you did a post about this.
At this point in time, courts have unequivocally ruled that you do not have a right to rankings. Thus, you can't sue Google about your rankings.
I don't foresee any legal recourse for site owners ejected from Google's index now or in the near future. Maybe in the distant future...
There are other interesting debates circulating about ways that Google may be violating anti-trust laws. For example, several academic-types are arguing for privacy to be a factor in monopoly/anti-trust analysis. You have a right to privacy, and Peter Swire argues that consumers are harmed by the conglomeration of so much private information in one place. After Danny Dover's post on the information collected by Google, Swire's argument is increasingly appealing to me.
I'm sure this wasn't the answer you were seeking, but I hope it's informative just the same.
Respectfully,
Sarah Bird
Thanks for the reply. Yes, it's really informative. And yes, before Danny's post I never thought about Google having my personal data. But after reading the post, I must say I am a bit concerned.
I'm not an expert at search marketing or SEO (I just work with a few), but my general philosophy is that you should treat search engines like over-eager dumb users without javascript who will reset their browser every page load. For instance, the NYTimes "5 click rule" isn't so much a cloaking thing as it is tracking users and their ability to delete a cookie. If I delete my NYTimes cookies every 5 clicks, I can read the NYTimes all day for free.
So they're not really serving different content to engines vs. humans - just to users who hang onto their cookies vs. those who don't.
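In other words, the "wall" is just a counter cookie - something like this toy sketch (certainly not the NYTimes' actual code; Flask and the cookie name are made up for illustration):

```python
# Hypothetical click-metering sketch: the article is always in the HTML;
# a counter cookie decides whether to show the registration prompt, so a
# cookie-less crawler (or a human who clears cookies) never hits the wall.
from flask import Flask, request, make_response

app = Flask(__name__)
FREE_CLICKS = 5

@app.route("/article/<slug>")
def article(slug):
    clicks = int(request.cookies.get("clicks", "0"))
    if clicks >= FREE_CLICKS:
        return "Please register to keep reading.", 200
    resp = make_response(f"Full text of {slug}")
    # Anyone who deletes this cookie, or never accepts it, starts over at
    # zero -- exactly the loophole described above.
    resp.set_cookie("clicks", str(clicks + 1))
    return resp
```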
Actually, Rand, this post is incorrect as well. Not only has Google taken action on several of the sites that you mention at various times, every page from the /article/ section of one of the sites you listed is currently removed from Google because it was cloaking.
So my advice remains the same: that cloaking can be very high-risk, and that there are plenty of ways to structure your site (including things like First Click Free, sIFR/SWFObject, etc.) so that you don't need to cloak.
Hmm... I don't know if that makes this post incorrect, just adds to the stories of the sites I mentioned (and fits in with what I noted about some strategies being dangerous vs. acceptable).
It is good to know that in some cases, Google really will take action, but would you say that the "cloaking" we do on SEOmoz, or what Craigslist does, is really "high risk"?
I'm trying to be more careful after last week. :)
Rand, you stated "let me list a number of web properties that currently cloak without penalty or retribution."
Given that a huge site from your list (and that was mentioned at SMX) currently has all of its story articles removed for cloaking and those articles have been removed from Google for almost a month, I think that it's fair for me to say that your statement was incorrect.
I'll reiterate my point, which is that cloaking is high-risk behavior. Google was willing to take action on one of the sites that you claimed was operating with no penalty, in the same way that we're willing to take action on other sites that cloak. There are many ways to build a great site without taking on the risk of cloaking. I recommend exploring those risk-free options before deciding to cloak.
Fair point - I mistakenly pointed to a domain and said it wasn't penalized, when in fact, parts of it are.
I don't think we disagree that cloaking in some ways is harmful - that's what half the post is about. The other half merely notes that it's not ALL bad, and some of it can be quite good if done properly, with good intentions, for the right reasons. I just don't like the absolute of "showing engines and users different things is NEVER a good idea," just as I don't like the concept of "build for users, not for engines." I think both can be misleading.
My point is that you're portraying cloaking as some sort of low-risk behavior when in fact the risks are much higher.
Really? I think I'm accurately portraying cloaking on a scale, the same way we do with keyword usage or link building or anything else in SEO. There are ways to do it that are smart, ways to do it that are dumb, and a sliding gradient of riskiness. When you usually talk about SEO subjects, I notice you often mention this scale of risk, so I'm surprised about the black & white perspective.
It's not my intention to suggest that all forms of cloaking are low-risk, and re-reading the post, I don't think anyone would walk away with that message. Don't worry, Matt, we're certainly not going down the black hat road (or even the gray hat road). I just think it's unfair to hide information or strategies that should be available.
Since we're on the topic, quick question for you Matt.
I deal with many companies in the architectural and design community. To say these folks are lovers of Flash is the understatement of the year.
In a nutshell, their modus operandi is "A picture tells a thousand words" coupled with "Less is more," which, as we know, is quite contrary to what the SEs want when it comes to being "content rich" and, therefore, ranking highly.
Often times these pictures are representative of text rich case studies that they would rather not include on their sites. This leads me to the simple question of "Is it okay to cloak in this instance?"
If your answer is "Use SWFObject," then please respond to this paraphrased quote from Dan Crow of Google in July 2007:
Well, how can I ensure that Google/bot won't consider it abusive, if it's just a bunch of pictures of buildings and a whole slew of copy describing the project and process in detail?
If this doesn't work, what do you suggest, short of - include the case studies in the form of text pages or spiderable pdf's.
Thanks.
We're setting a record for deepest replies here. Sean, Rand himself would probably tell you not to cloak in this instance. Rand & Danny found a site doing exactly this type of Flash cloaking at SMX Sydney, and Google removed that site for weeks, even though that was a very well-known, large company.
My guess is that including it as text or PDF would be best, but I think Google will also get better at indexing Flash over time, which reduces the need to consider cloaking in a case like that.
Thanks Matt. My edited version crossed paths with your expeditious response. ;)
If nothing else, it gives me ammunition when I tell my clients that they really need to bite the bullet on text - or at least pdfs!
Architects and designers can be a stubborn lot!
Thanks for asking this Sean, I've been dealing with something similar. Photographers who are willing to spend money on a decent looking site usually go with Flash to show their work...trying to figure out a white hat way to get good rankings for some of these sites has been challenging to say the least.
Matt,
Did WSJ get a notification in their Webmaster Tools account about the removal? If not, shouldn't they have? Not that I want to help them, but I do want to make sure Google communicates this sort of thing with 'offenders'. It's a policy thing versus a specific situation.
Brent
"Often times these pictures are representative of text rich case studies that they would rather not include on their sites. This leads me to the simple question of 'Is it okay to cloak in this instance?'"
Heck, I'd say it is not only OK, but federally mandated - you gotta have that info accessible to the visually impaired, after all ... (halo twinkles)
Hey Matt - is that what you'd tell Yahoo?
Nice!
Hey Rand, Matt et al -
I have to absolutely agree with Rand that, given the right circumstance, IP delivery is in fact a way of making Google look flat out brilliant and, in fact, improving the experience of the user. I take issue (surprise surprise) with painting the entire practice of displaying different content to different surfing entities as "cloaking" because that simply furthers the black/white notion of content delivery.
Rand I think you've distilled some of the usages of IP delivery into a nice spectrum - a fine map for the webmaster to gauge his risk-reward tolerance.
It has been 8 years since this was published, and I still can't reach Forbes.com home page. I always have to go through welcome page or remove ad blocker etc...
Let's say we use a cloaking strategy that falls below "Pearly White" but improves user experience, and Google bans the site. Do you or anyone else have experience filing a reinclusion request with Google under such conditions?
Another question, which should arguably be first, has anyone here ever had a site banned for user friendly cloaking?
I'd be curious to see how responsive/helpful they were in such a situation, and if they asked the webmaster to stop cloaking before reinclusion.
We haven't had a site banned, but many might remember that our Recommended List of SEOs was kicked out of the index for using CSS display:none layers on much of the content (you had to click on the name of the SEO/company and they'd open the details accordion-style). We never did get the page back in while it existed in that fashion, but it was fine once we moved to a different format.
As far as bans and reinclusion go, you certainly can get banned for even the light gray stuff listed above (a lot of that has to do with intent and the perception of whether your intent was/wasn't manipulation). Removing it, owning up to it and filing a reinclusion request through your WM Tools account is the best way to go.
I wonder if we'll see similar issues for sites with Flash-based and Ajax-based functionality, once the spiders start to recognize those technologies more effectively. As web pages get more dynamic (in a real-time sense), it's only going to be natural to show and hide content on any given page, and for the user experience to be customizable in a way that may not match what the spiders see at any given moment.
Thanks for the quick reply, Rand. I've always been a bit wary of accordion style widgets, but they are such a great UI component that it's hard to believe it's a violation. Perhaps AJAX would be treated a bit better?
I'm rubbed the wrong way by the idea of "owning up to Google" for improving user experience, all the while pretending search engines don't exist.
Serenity now.
Why would you get banned for that? You're providing all of the info to the user, but just choosing to display it in a usable format. That's squeaky clean to me and absolutely no reason to be kicked out of the index.
Hello Rand,
I really enjoyed reading this article. You also compiled a very nice list, but I think you forgot one: Youtube.
If I read your article and also the comments from Matt, I would almost want to think that Youtube is not at all doing any form of cloaking.
I read this article one time at SlightlyShadySEO, which was very interesting for me, and I had concluded back then that what Youtube was doing could be referred to as cloaking. I also commented on that article, where I thought/assumed that it was a legal form of cloaking. After reading Matt's comments, I'm a bit lost…
The article I’m referring to is:
https://www.slightlyshadyseo.com/index.php/youtube-is-cloaking-so-why-cant-i/
Please guide me through this dark path, if you have time of course, and tell me whether what G/Youtube is doing can be referred to as cloaking or as something else, and whether it is something that is assumed to be legal.
Thank you in advance.
Navin
Navin - that's what this whole post is about. There's good, honest, acceptable reasons to cloak and they won't get you tossed out of Google's index. There are also bad ways and manipulative reasons to cloak, and those can get you in trouble. This is why the post urges people to think about tactics and intent before approving or denying a cloaking tactic.
Thanks for the YouTube link, though - I hadn't seen that post before. :)
Hi Rand, hope you're good - in your opinion, where do conditional redirects sit when you're using user agent detection to canonicalise URLs? I'll give an example - session IDs - my old network of recruitment sites used that "technology," and conditional redirects were the only option.
Historically this has been referred to as "cloaking," but I think that time has passed - agree?
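To be concrete, the kind of conditional redirect I mean looks roughly like this (a heavily simplified sketch with an invented "sid" parameter, not the actual code we ran):

```python
# Hypothetical session-ID canonicalization sketch: when a known crawler
# requests a URL carrying a session parameter, 301 it to the same URL with
# the parameter stripped; human visitors keep their session intact.
from urllib.parse import urlencode
from flask import Flask, request, redirect

app = Flask(__name__)

def is_crawler(req) -> bool:
    ua = req.headers.get("User-Agent", "").lower()
    return any(bot in ua for bot in ("googlebot", "bingbot", "slurp"))

@app.route("/jobs")
def jobs():
    if is_crawler(request) and "sid" in request.args:
        # Rebuild the query string without the session ID.
        cleaned = {k: v for k, v in request.args.items() if k != "sid"}
        target = "/jobs" + ("?" + urlencode(cleaned) if cleaned else "")
        return redirect(target, code=301)
    return "Job listings page", 200
```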
Sounds ok to me...
This may seem like a silly question/observation this far down the thread but I think it needs to be asked/stated.
And I want to state that I'm not recommending cloaking of any type as a first choice.
Website redesign isn't an inexpensive process for very large companies with large websites and plenty of legacy technology and mindset and a very old site. Companies invest millions.
These companies aren’t going to spend millions to redesign a site just because the web marketing team says so.
When working in an environment like that, the SEO/web marketer (or whatever you call them) is often not the HIPO (highest paid opinion) and is often not able to prevent the use of techniques that are not friendly to search engines - or users, for that matter. There are just too many stakeholders with too much influence, and they get what they want, or a compromise that still isn't search engine / end user friendly.
In that case I’d like to know what Matt would recommend. I won’t hold my breath because of how loaded that question is. But I would say I have to agree with Rand that it comes down to intent. For the most part I think Matt and his team do a pretty good job of determining intent.
What a great synopsis, Rand. I totally agree that 'cloaking' can in many cases improve user experience; what Craigslist is doing certainly helps users. Nice to see the search engines recognizing this and adjusting their guidelines accordingly.
Side note: am I the only one who thinks that White Hat cloaking shouldn't really be called 'cloaking'? The word conjures up some evil images of shadowy Darth Vader-type figures; Craigslist just seems too friendly for that. How about something like 'caping,' which makes me think of the ever-intriguing 'man with the cape' from Seinfeld.
Nice David. Might I also propose "window dressing"? :)
Nice, I like it, Sean!
LOL I like it.
Rand, in the paragraph about "solid black":
It looks like the end of that section's been cut off... just an fyi.
Sorry about that - it was a CSS span issue in the post, but all fixed now. Thanks for pointing it out.
While we're on the subject, the first paragraph contains a sentence introducing a quotation with that quotation thrown into the middle of it:
A quote from Serving up different results based on user agent may cause your site to be perceived as deceptive and removed from the Google index. Google's Guidelines on Cloaking:
I fixed that. Thanks!
I think the problem is that a method has been labeled, and that label carries negative impressions due to some of the past and present usages.
What's more, that label is now too broad, serving as an umbrella over an entire spectrum of tactics and implementations. Like many things in SEO - and life - it is neither good nor bad; it's the intent and how it is used that may be classified somewhere on this scale.
Most readers here - and, I think (possibly naively), the bulk of those following the SEO community - are looking for techniques that can aid them, with best intentions and a positive user experience for bots and humans. Google isn't the only one who prefers to do no evil.
I understand and appreciate Matt's (Google's) position... not wanting to give those who are looking to use these techniques with bad intent any more information than they have to. The reality, though, is that those who fall within that area are probably going to do whatever they are going to do, regardless of what Google does or doesn't say.
The challenge is that the rest of us are left trying to understand and interpret broad sweeping definitions where technical functions have been declared good or bad, white or black, safe or risky, but it isn't the actual function, but the specific methods (and intent) used to employ the function that needs to be discussed and categorized.
Isn't it interesting ... it seems that the more the engines share specifically with the community of what is good and bad, and how things are perceived, the bar tends to be raised rather than lowered, that sites tend to be improved ... there are more of us trying to make things better than game the system, and the more we all know about what that means, the better we can help to do that.
You wanna talk about cloaking that hurts the user experience?
Well, come to Taiwan, go to a net cafe with a Chinese OS, and then try to use the sandbox traffic estimator... it gives you everything in Chinese.
If I log in to my Gmail account with Google international, all the keyword tools work fine - Trends, Gmail... all except the traffic estimator tool. The sandbox traffic estimator still gives me everything in Chinese.
When I use my laptop, which has an English OS, I have no such problems.
So... out of 1,000 different places where Google does language targeting well, this is one case where it's done incorrectly.
Great overview!
"[...] engines shouldn't be dropped onto a geo-targeted page; they should be treated like a user coming from everywhere (or nowhere, depending on your philosophical interpretation of Zen and the art of IP geo-location)"
I love this sentence ;D eSEOteric!
There have been instances where I've shown less content to search engines than to users - either to hide what could become duplicate content or to hide affiliate links in a sidebar that they say they don't want to see in the first place... Both cases were well-intentioned and there was no intent to be deceptive, but I still wonder if those actions could be misinterpreted...
I've also used display:none because it was the best thing to do for user experience. Luckily that hasn't resulted in any noticeable penalties...
Really good point Jay - in all honesty, the nofollow on links is showing one thing to users (a link) and another to search engines (no link). Using content in iframes on pages with the iframe robots.txt'd out or making external javascript calls for content are all things that keep content out of the engines but there for users.
I just think it's critical for website owners to be thinking about how they employ different content for users vs. engines rather than assuming all of it is either sure to get them banned or completely safe.
White hat cloaking can be useful... OK, but... if I use cloaking to show French content to a user who comes from France and English content to a user who comes from the United States, does this affect the search engine rankings of my French content? Knowing that Googlebot will be treated like an American user, so it will only see the English version?
So does this mean all affiliate programs are bad for seo? If you give your affiliates an id that's just an argument to the page, I've seen that search engines often ignore everything after the question mark.
I have a number of pages where I serve header text in an H1 tag to crawlers and an attractive image header to everybody else. Two questions:
1. Would this be considered cloaking, even though I'm not redirecting the URL?
2. What color is my hat?
I'm of the school of thought that sometimes cloaking can enhance user experience, and I'm frustrated with the idea that I have to pander to the SEs at the expense of users. Seems to go against the whole ethos of white hat.
Hello Rand,
The Search Engine Cloaking Scale image is itself enough of an answer for hundreds of questions.
Hello guys,
We have a site that renders in clean HTML, and then we load heavy javascript stuff in the background.
It seems Google is computing our "time to serve page" as 3000ms.
We're thinking about removing the javascript completely and serving the exact same thing, but without letting Google load the JS.
I wonder if this would come with penalties?
thanks
One of the best guides, and its advice still holds up. Matt Cutts described this very well in the link you mention in the first line, but I really like your image, which lays out each step of white hat and black hat cloaking. Thank you for this kind of information.
I would love to know if there's an update on this topic in your brain (I can't find anything else). I work with ecommerce sites and I'm finding many clients using user-agent redirection. Disclaimer: this is the client's reasoning, not mine. Let's say a product goes out of stock; they want to keep it a 200, but if the page ranks in the SERPs, they redirect the user to an in-stock related product (a different page). While very strange and not what I'd do, I think the intent is white or light gray - what do you think?
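For clarity, the setup amounts to something like this (a simplified sketch with invented product names - describing the client's pattern, not recommending it):

```python
# Sketch of the pattern described above (not a recommendation): crawlers get
# a 200 for the out-of-stock product so the page keeps its rankings, while
# human visitors are redirected to an in-stock alternative.
from flask import Flask, request, redirect

app = Flask(__name__)

PRODUCTS = {"blue-widget": {"in_stock": False, "alternative": "green-widget"}}

def is_crawler(req) -> bool:
    ua = req.headers.get("User-Agent", "").lower()
    return any(bot in ua for bot in ("googlebot", "bingbot", "slurp"))

@app.route("/product/<slug>")
def product(slug):
    item = PRODUCTS.get(slug)
    if item is None:
        return "Not found", 404
    if not item["in_stock"] and not is_crawler(request):
        # Humans go to a related in-stock product; bots still see a 200.
        return redirect(f"/product/{item['alternative']}", code=302)
    return f"Product page for {slug}", 200
```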
Hi Rand, this is a very interesting post thank you. Seven years have passed since you wrote it and I was wondering whether you think it all still applies today. It looks like we may be about to have an issue with Googlebots in the US crawling a version of our site that has been personalized by geolocation and then showing this version of the page in the SERPS in the UK (we are serving the localised content on the same URL). My understanding is that Google started locale-aware crawling (https://support.google.com/webmasters/answer/6144055?hl=en&ref_topic=2370587&vid=1-635792171229609065-647665285) at the beginning of this year so hopefully it won't turn out to be a problem - but I'm wondering what options we have if it is a problem. The Craiglist approach you describe as "Near white" seems like it might be the best approach for us.
As usual, I'm late to the comment party, but I have a good excuse. I'm at the beach learning how to surf (the waves kind) with my 6 year old. Man I suck.
Anyway, when I have clients ask about the cloak|no cloak issue, I always ask them two questions:
1). Is there any possible way to accomplish what you want to accomplish using a technical solution other than cloaking, and...
2). If your content was busted for cloaking, even if it wasn't with intent to fool, and you were purged from the Google index, even if it is in your eyes unfairly, can you handle the resulting fallout|$losses| ?
The issue here for me is simple. We all must deal with and handle the reality of playing on a field that we did not create, with a ball that is not our own. We do what we have to to protect our clients, or we don't.
-ew
You sure do! Rub it in, why don't ya!
Hi Rand, this post was very interesting to read...I was wondering about this comment... "comparing Google's "cached text" of most pages to the components that humans see is vastly different."
I tried poking around to see what you meant by that and I could not find an example of it, and how would you suggest fixing that particular snafu?
If you look at a lot of the article pages (and other pages) on Forbes, there are iframes and javascript elements and ads and additional navigation that show to human users, but not to bots. In terms of fixing it, you'd just need to make sure that a standard bot could see the content on those URLs in the straight HTML code. Using a tool like www.seo-browser.com could come in handy for something like that.
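If you'd rather check it yourself without a tool, a quick script that fetches the same URL once with a browser user agent and once with a bot user agent, then compares the raw HTML, will catch the simplest cases (a rough sketch - it won't catch IP-based or cookie-based differences, and the URL here is a placeholder):

```python
# Rough sketch: fetch a URL with a browser UA and a bot UA and compare the
# raw HTML. Real crawlers also differ by IP and cookie handling, so this
# only surfaces the simplest user-agent-based differences.
import urllib.request

BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; rv:109.0) Gecko/20100101 Firefox/115.0"
BOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

def fetch(url: str, user_agent: str) -> str:
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=15) as resp:
        return resp.read().decode("utf-8", errors="replace")

if __name__ == "__main__":
    url = "https://www.example.com/some-article"  # placeholder URL
    as_browser = fetch(url, BROWSER_UA)
    as_bot = fetch(url, BOT_UA)
    print(f"Browser HTML: {len(as_browser)} chars, Bot HTML: {len(as_bot)} chars")
    if as_browser != as_bot:
        print("The server returned different HTML to the two user agents.")
```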
What about Google's First Click Free initiative? It hasn't been given much exposure, and I'm not sure if it's still an active project or not - basically, if your content was behind a registration page, Google was allowing you to cloak your content (actually instructed you on how to do it) for indexing purposes, but required you to allow that user to view that initial "first" click that was performed from Google. If the user proceeded into your site, you could then require them to register.
Spammers went to town recently on queries that look for "index of" and query modifiers like "inurl:" to catch malformed queries. I have even come across people who code spoofed server responses:
Apache Server at simplesample.org Port 80
Good post Rand. I think that Matt was just worried that if he didn't come out and give a bit stronger warning, his silence could be taken as approval. I agree that most readers of this community are smart enough to know what to do and what not to do.
He has said before that webmasters can do as they please, but Google is not obligated to show you in their index.
I found this topic very intriguing at SMX Advanced. I wish we had been able to hear more from Hamlet on the subject. Not necessarily a how-to but more of a good theory discussion.
Some savvier site owners have realized how valuable that affiliate link juice can be and have set up their own affiliate systems that do pass link juice, often by collecting links to unique pages, then 301'ing those for bots, passing the benefit of the links on to pages on their domain where they need external links to rank.
Hmm, that's funny - this was just mentioned on SEOBook's blog... and it mentions that SEOmoz had mentioned this about seobook.com.
check out the post here
Thanks for the heads up, Solid. I left a comment on Aaron's blog indicating that it was completely unintentional and offering my apology. I should have used a different example in the video - it was just off the top of my head, and was completely theoretical in presentation.
I love this scale
CL's geo-targeting still works, really?
That page you referenced as cached by Google - whenever I've been hitting CL, that's what I've been getting, as of at least a month or two back... and this is all irrespective of whatever my UA settings might be at any given moment. Moreover, I've seen the same thing when visiting them on various machines via various networks, both private and public.
What's odd though is that on their deeper URLs the geo-targeting still seems to work e.g. a request for https://craigslist.org/muc will still resolve to https://sfbay.craigslist.org/muc/ etc. ...To me it seems their geo-targeting is still at least partially in place, additionally via IP delivery as opposed to something as flimsy as UA detection.
Anyway, whatever is up with CL (and/or just my experience thereof) aside... I think along this line of discussion one should be careful not to confuse terms. In other words, UA detection and IP delivery are (technical) means, whereas cloaking and geo-targeting are, comparatively, ends.
What an excellent review of cloaking. I was thinking about doing it with some URLs because of our new dynamic site.
Great Post
Thanks for the info Rand
I can't speak to the specific examples, but I have the same frustration: from a usability and search quality standpoint, I definitely see cases where some minor cloaking could be beneficial (and will admit I do a tiny bit of it), but am always wary of the smackdown. I understand that Matt has to take the hardline on this, much like my car dealer can't tell me it's ok to keep using my brake pads at under 10% wear, but it'd be nice to see that dubious word "cloaking" broken out into the much wider set of concepts that it really is.
I think, somewhat like law, it's about intent...
if you are cloaking to AID the SEs, then it's OK...
if you are cloaking to deceive the SEs, then it's a no-no.
Great article (as usual). Thanks.
p.s. Mert... be careful about shooting the messenger; it took OVER 10 years to get one...
Paisley,
My point had nothing to do with Matt. He is, at the end of the day, a salaried person who has to obey company guidelines about what he can or cannot say. Honestly, though, the official Google (not Matt) guideline spreads fear through very blurry rules. They are so blurry that Rand had to make an awesome scale on this scientific post, with proof, to clear it up for us - and Matt still says keep it blurry, you are encouraging cloaking (even though there were 16 examples of "clean" cloaking on there that the Google spam team is aware of). I am simply tired of it, man. If the message is almost always creating doubt and confusion, with all due respect, we can figure this out on our own without Google's help (again, I am differentiating Google from Matt Cutts). I feel like Matt has the loneliest job in the world thanks to Google.
Last post on this issue, I am moving on. And again, sorry for making Matt the target of this discussion. The discussion is about Google. I've always said it in my life. A man who loves cats so much can not be a bad person.
Mert,
I agree - "clean cloaking" is one area where the problem lies, given Google's unwillingness to specify. Personally, and this is just an observation, I feel that Google should not be expected to employ tens of thousands of people to put eyes on every page of every site that triggers red flags for possible spam. In fact, I'd venture that it's statistically impossible when you factor in every possible black hat or grey hat technique.
So if that's true, then what is Google left to do in order to combat spamming of the Google index? Write software algorithms that will hopefully find as many sites that fit the model and weed them out. This then is, unfortunately, an imperfect solution, yet it's probably the only valid solution available. So as a result, some good sites, with good intentions will get harmed.
But what are good intentions? To find new ways to display content in really cool ways is not necessarily a legitimate enough reason to cry foul. Just because something looks slick does not mean that it earns the right to rail against Google. Especially given the issue I just described with not having a real world ability to put enough eyes on enough sites in every instance. If a site owner would rather leave the spam and scum on Google just so that site owner's site is not possibly harmed, then that site owner is failing on the level of shared community responsibility.
And if Google gets so specific as to delineate the actual algorithm, it only leaves spammers a wide open flood-gate opportunity. So personally I have absolutely no problem with how "vague" they are.
Yet they're not vague at all when Matt Cutts continually and consistently holds the positions he does on hidden text, on cloaking... He's Google's way of saying - hey everyone - if you want to be safe, and if you want to participate in the community effort to wipe out spam in our index, then don't use hidden text, and don't use any form of cloaking.
That's about as clear as any directive I've ever seen on any subject.
I think cloaking used in most major web sites is obtrusive and invasive advertising noise. Forcing me to have to go through a pre-home page, or making me view popups of ads (thank God for Firefox's popup blocker), well that's just rude.
We had a raging debate at work yesterday over hidden text and all its "legitimate" uses.
As far as I'm concerned, if what the Googlebot sees is not what the site visitor sees without them having to do something, then the site is not being properly designed. If you need to rely on hidden text or popups or forcing someone to view content before you allow them to get to the content they want, then the site is not being properly designed.
Most people in the advertising world and apparently Rand totally disagree with me.
The fact that my thinking fits with Google's need to clean out the garbage (and as a result, sometimes toss a baby or two out with the dirty bathwater) just makes my clients that much more appreciative of me because they don't ever have to worry that one day their site won't intentionally or accidentally be penalized by Google.
THAT is best practices as far as I am concerned.
but then mine is just one more opinion.
Hi All!
Thanks to Rand, Matt Cutts and everyone else on this dialogue. We did not see this as a black hat or even a grey hat technique, but this issue seems to have stirred up some debate. The Trulia Publisher Platform was designed to provide publishers with a robust real estate search utilizing Trulia's search technology. User experience and content on our co-brand pages and Trulia pages is the same, and 301 redirects were put in place as an additional precaution to avoid duplicate content (not every search engine seems to respect a robots.txt with "User-agent: *" and "Disallow: /"). There was never an intent to deceive either users or search engines, but given recent feedback, the perception that we may be violating Google Webmaster Guidelines is something we absolutely want to avoid. We removed the 301 redirects to be as conservative as possible.
Thanks to everyone here for your insight!
Rudy
Social Media Guru at Trulia
Rudy: "....the perception that we may be violating Google Webmaster Guidelines is something that we absolutely want to avoid"
Then I guess the next step for Trulia will be to remove the nofollows and redirects since that is also perceived as violating google's webmaster guidelines?
https://www.seomoz.org/ugc/trulias-web-ranking-strategies-come-under-fire
I'm going to do some unintentional self promotion but on my blog www.angle45media.wordpress.com, I have the leaked google document that breaks down what google considers spam and what it considers "useful" content... berry berry interesting.
but great post Rand, you da man!
From now on I will be following the above method to rank my website by doing some white hat stuff!