Is Cloaking a Valid Method for Reducing Bandwidth?

When cloaking is mentioned, most people start to shiver and shudder. They point to all the nifty doorway pages and Google's TOS and say how horrible and evil it is.
But is cloaking really that evil?
Take, for instance, a highly trafficked site that gets a million hits a day. Serving no ads to the bots, along with otherwise streamlined code, could result in significant savings!
You could even use cloaking as a way to move text you want indexed higher up in the page (some believe - not me, mind you - that this helps rankings). That's not so evil, is it? The content really isn't changing - unless you consider the ads to be part of the content.
And most of the time the ads are in JavaScript, so why serve up that JavaScript just so the search engines have to spend more time downloading and processing it?
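For illustration, here's a minimal sketch of the kind of server-side check this would involve (my own example - the bot signatures and page markup are placeholders, and a production setup would usually verify crawler IPs rather than trust user-agent strings alone):

```python
# Minimal illustrative sketch (not from the original post): serve the page
# without ad markup when the User-Agent looks like a crawler.
from wsgiref.simple_server import make_server

BOT_SIGNATURES = ("googlebot", "slurp", "msnbot")  # hypothetical list

PAGE_WITH_ADS = (b"<html><body><p>Article text...</p>"
                 b"<script src='/ads.js'></script></body></html>")
PAGE_WITHOUT_ADS = b"<html><body><p>Article text...</p></body></html>"

def app(environ, start_response):
    ua = environ.get("HTTP_USER_AGENT", "").lower()
    is_bot = any(sig in ua for sig in BOT_SIGNATURES)
    body = PAGE_WITHOUT_ADS if is_bot else PAGE_WITH_ADS
    start_response("200 OK", [("Content-Type", "text/html"),
                              ("Content-Length", str(len(body)))])
    return [body]

if __name__ == "__main__":
    make_server("", 8000, app).serve_forever()
```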
G-Man
Google treats cloaking like the U.S. Government treats medicinal marijuana. No matter how useful or beneficial it is, no matter how many studies are published or how much research is produced - you will be punished for it.
This is a particularly ridiculous case. Assuming Google determines whether cloaking occurs by sending non-identifiable bots to compare against pages indexed by Googlebot, the algo would simply need to compare textual content to textual content. As long as the two are identical, the page would not be punished. The program could still look for tell-tale cheats like negative pixel positioning or uses of display:none;, but a page would not be punished if the textual content remains unchanged (for example, when large blocks of JavaScript and CSS, whitespace, etc. are excluded).
The question is, how many cancer patients suffer and how many legitimate sites get priced out of existence because of the Government's and Google's paranoia, respectively?
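A rough sketch of that text-to-text comparison (purely an illustration of the idea, not a claim about how any engine actually works):

```python
# Reduce two fetches of the same URL (one identified as a crawler, one not)
# to bare text - no scripts, styles, or whitespace - and flag only real
# differences in the visible words.
from html.parser import HTMLParser
import re

class TextOnly(HTMLParser):
    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)

def visible_text(html):
    parser = TextOnly()
    parser.feed(html)
    # Collapse whitespace so formatting-only differences are ignored.
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()

def looks_cloaked(page_for_bot, page_for_user):
    return visible_text(page_for_bot) != visible_text(page_for_user)
```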
Hahhaha - yeah, if they're removed from the index, they would certainly lower their bandwidth bills.
And, yes, gzip would certainly help in lowering the bandwidth bills.
I think, however, that you're missing the point: there are SOME valid reasons to cloak, although it appears that the SEs lump cloaking/IP delivery into one "evil" category.
rjonesx - I love that analogy - lol.
How about using cloaking so that first-time visitors see a page optimized for first-time visitors, while return visitors get a page optimized for return visitors? That can be a valuable use that doesn't hurt anybody, either.
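A hedged sketch of that idea (the cookie name and template files here are hypothetical): check for a returning-visitor cookie and pick the template accordingly, while the indexable text stays the same in both versions.

```python
# Illustrative only: choose a template based on a hypothetical "seen_before"
# cookie. The response that renders first_visit.html would also set the cookie.
from http.cookies import SimpleCookie

def pick_template(environ):
    cookies = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    if "seen_before" in cookies:
        return "return_visitor.html"  # skips the intro/orientation blocks
    return "first_visit.html"         # includes the first-time tour
```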
What about cloaking of affiliate links such as Clickbank hoplinks?
Do the engines treat this as something underhanded on the part of the linking site, or of the affiliate site?
Is this a case for using the nofollow attribute to stop the engines associating the affiliate site with the linking site, or should affiliate link cloaking not be done at all?
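One common pattern for this (my own sketch, not advice from anyone in the thread - the offer slug and hoplink URL are made up) is to mark the visible anchor rel="nofollow", route the click through a local /go/ path, disallow that path in robots.txt, and 302 to the hoplink:

```python
# Illustrative redirect handler for local /go/<offer> affiliate links.
# Offer slugs and the hoplink URL below are placeholders.
AFFILIATE_OFFERS = {
    "widget": "http://affiliate.vendor.hop.clickbank.net/",
}

def redirect_app(environ, start_response):
    offer = environ.get("PATH_INFO", "").removeprefix("/go/")
    target = AFFILIATE_OFFERS.get(offer)
    if target is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"unknown offer"]
    start_response("302 Found", [("Location", target)])
    return [b""]
```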
Certainly, a site that cloaks and then is removed from the search engines would also save a ton on its bandwidth bills, too. ;)
The title is "Is Cloaking a Valid Method for Reducing Bandwidth?" I would say that from the search engine perspective, the answer is no. If you choose to do so, you should be aware that you're stepping into a much higher risk technique. Rand, are you really encouraging SEOMoz folks or clients to cloak?
Google (and other search engines [where are the other search engine reps these days?]) want to score the same page that a user would see. Starting to change keyword density, adding/removing links and content really takes away from that.
From talking to the crawl/index folks, it appears that gzip encoding is a way for most site owners to achieve a 2.2-2.4x compression rate, which is a huge reduction in bandwidth. Personally, I would recommend playing with mod_gzip instead of cloaking.
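A quick way to sanity-check that figure against your own pages (a standalone sketch - mod_gzip/mod_deflate do the compression at the server; this just measures the ratio):

```python
# Fetch a page and report how much gzip would shrink it.
import gzip
import urllib.request

def compression_ratio(url):
    raw = urllib.request.urlopen(url).read()
    return len(raw) / len(gzip.compress(raw))

# Example: print(round(compression_ratio("http://www.example.com/"), 2))
```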
Matt - no, we are, as Newsweek said, 100% white-hat, which means we're following the guidelines you issue, by letter and intent, with our clients. They're folks who simply can't afford to have any run-ins with penalization or removal.
But, I think Geoffrey's post is still valuable - it illustrates how the common perception of "cloaking" or IP delivery doesn't tell the whole story and shows how certain sites (as Greg points out) do use the technique and derive value from it.
I think that subjects like this are why it's been so great having G-man blog here - he has experience beyond the scope of those of us who can't afford risk in our tactics.
Probably doesn't hurt to hear about it on your end either, eh? :)
p.s. Hope your trip to Boston was good; I'm bringing the whole staff down to San Jose in August and they're all excited to meet you in person. ;)
Rand, it's nice to finally see you thinking about all the possibilities... :) But there's a flaw with your last idea. If you are selling paid links, removing them would mean you wouldn't have many buyers. The smart webmaster who wants to sell links and also happens to be fluent in IP delivery techniques would present the ads with traditional third-party, crappy-looking URLs, and then provide straight links to bots.
No one looking at the site would ever think the site was selling links to manipulate search engines, and the bot links would be presented in a way that would make it difficult to spot.
Regarding the other ideas, I've pretty much done them all. Not showing ads to save bandwidth isn't all that. Bots don't actually fetch the images, so the savings in terms of reduced code is fairly small. It does add up on very large sites, but you can usually get more savings from reduced code bloat by moving all embedded scripting to external files. From a production standpoint, that's always an easier sell than trying to get the IT department to implement an IP delivery system.
The biggest benefit of IP delivery for very large sites has to do with navigation. So many SEOs spend time on URL rewriting, but don't understand that rewriting alone won't prevent duplication. Adding IP delivery allows you to make sure that search engine bots are never served any links that will create duplicate content.
It truly is a wonderful tool, and using it for those types of situations will never be considered a TOS violation. What engines are concerned about is using IP delivery to alter the actual content that gets indexed. If you stay away from that, you won't get into any trouble.
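As a sketch of that navigation use (my own illustration - the parameter names are hypothetical), the link builder would drop the session/sort parameters that spawn duplicate URLs whenever the requester is a verified crawler:

```python
# Illustrative only: emit navigation URLs without duplicate-creating
# parameters when the request comes from a verified crawler.
from urllib.parse import urlencode

DUPLICATING_PARAMS = ("sessionid", "sort", "ref")  # hypothetical examples

def nav_link(base_url, params, for_bot):
    if for_bot:
        params = {k: v for k, v in params.items() if k not in DUPLICATING_PARAMS}
    query = urlencode(params)
    return base_url + ("?" + query if query else "")

# nav_link("/widgets", {"page": "2", "sessionid": "abc123"}, for_bot=True)
# -> "/widgets?page=2"
```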
I have a client - PR8 site, Alexa rank around 3,000, big traffic - and we are using cloaking for the exact purpose you mention, G-Man. I feel like an accountant who helps a client take a creative liberty. This type of cloaking should not be against the TOS, because it serves a constructive purpose, helping the functionality of the site - honest and noble. We have been doing it for more than a year, the results have been just as expected, and we haven't gotten that letter from Google yet saying 'what are you doing, c-c-c-cloaking???'. Screw that - it's working wonders for us and our readers.
The real issue is the lack of standards from Google as to what is "bad" cloaking and what is an acceptable change in delivery depending upon user-agent. Leaving it vague and undefined causes many in the webmaster community to remain scared of the big bad G and of possibly being banned for cloaking.
If they had a document that said - bad examples of cloaking: showing terms for children's stories and then sending the user to a goat porn site;
OK examples of cloaking: removing ads, serving different CSS or JavaScript as needed for different browsers, locations, etc.
However one may frame the debate, if they at least said, "This is our stance," it would allow people to work around it and choose how to act, instead of feeling that everything they do might get their site banned, even if they are doing it for smart business reasons like saving bandwidth.
Michael @ SEOG.net
Nice article, Geoffrey.
As stated, if done right and for good reason, it can be done. If an index team saw a page in their index that contained [468x60 Ad Here], [250x250 Ad Here], [Sponsored NoFollow Text Links Here], etc., do you think they'd think twice about being deceived? They'd most likely look at it, take a look at the real page, and then flag it as OK.
(Of course, I'm making a lot of assumptions here, so don't take my word for it.)
I would not have a problem blocking certain on-page elements like ads, off-topic text, images, duplicate navigation, etc. from the search engines. However, there is no need to cloak, since all of this could be implemented with, e.g., JavaScript and robots.txt.
Cloaking would also be highly useful for "noindexing" content blocks. At the current time, there's no way to say "index this page, but not this part of the content." You have to do it on a page-by-page basis. One could use this to remove blog comments or wiki discussion or real-time editing blocks from the index.
You could even use it to remove (gasp gasp) paid link ads or other financially motivated content that you're not wanting to be found for.
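A sketch of that block-level idea (purely illustrative - the function and section names are mine): build the page from sections and drop the ones you don't want indexed when the request comes from a crawler.

```python
# Illustrative sketch only: assemble a page from sections and leave out the
# blocks (comments, wiki discussion, paid placements) you don't want indexed
# when the requester is a crawler. Names and parameters are hypothetical.
def render_page(article_html, comments_html, for_bot):
    sections = [article_html]
    if not for_bot:
        sections.append(comments_html)  # visitors still see the discussion
    return "\n".join(sections)
```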
Excellent point - in every coding language you can comment out sections you don't want the processor to interpret. Why not make this applicable to bots? If they'd adopt a standard way of commenting out blocks of code, webmasters could feel more comfortable serving different ads and/or content depending on the IP hitting the server.
Google was fast to add typo-proofing to their robots.txt interpreter this year; perhaps they're ready to explore other ways of making robots.txt more useful.
That's a good start, but that method won't allow you to save bandwidth. In fact, it'd actually increase it.