Today at SMX Sydney (which I'll have a more detailed post about this weekend), Danny Sullivan and I sat together on a site review panel in the late afternoon. One of the sites we were asked to review was DiscoverTasmania.com. During our investigations, we stumbled across some remarkable cloaking from one of Australia's top travel websites - Flightcentre.com.
We started by reviewing this page about Hobart, and grabbed a short snippet of text to analyze whether the page had any duplicate content issues. Here's the page below:
We ran a search at Google to check the phrase and came up with this result:
Clearly, you can see a page from Flightcentre.com outranking DiscoverTasmania.com's page about Hobart, yet something looked fishy. The lack of a cached version indicates that they're trying to hide something, so we decided to investigate further. Clicking on the Flightcentre.com link, you're brought to this page:
And yet, reviewing the source code shows no mention of the phrase in question... which instantly led us to the hypothesis that the domain is cloaking. Sure enough, turning off JavaScript and changing our user-agent to Googlebot, we saw a completely different version of the page:
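If you want to repeat that check yourself, here's a minimal sketch of the test we performed by hand - fetching the same URL with two different user-agents and comparing the responses. (The URL is illustrative, and the code assumes a runtime with a global fetch, e.g. Node 18+.)

```typescript
// Fetch the same page as a regular browser and as "Googlebot", then compare.
// A mismatch in the returned HTML suggests user-agent cloaking.
const url = "https://catalogues.flightcentre.com.au/hobart"; // hypothetical URL

async function fetchAs(userAgent: string): Promise<string> {
  const res = await fetch(url, { headers: { "User-Agent": userAgent } });
  return res.text();
}

(async () => {
  const browserView = await fetchAs("Mozilla/5.0 (Windows NT 10.0; Win64; x64)");
  const botView = await fetchAs(
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
  );
  console.log(
    browserView === botView
      ? "Same HTML for both user-agents - no UA cloaking detected."
      : "Different HTML served to Googlebot - likely user-agent cloaking."
  );
})();
```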
Of course, this site review was in front of a live audience, so there was no putting the cat back in the bag (and if that wasn't enough, Google's Adam Lasnik was also in attendance as a panelist). From talking to attendees after the show, it appears that Flightcentre is actually one of the largest travel companies in Australia, and folks here seemed to feel this was as significant as the BMW Germany spam case from several years back.
Lessons to learn here:
- Cloaking by user agent isn't wise - if you're going to cloak in a gray/blackhat way, use IP addresses (however, even in that scenario, we still could have used Google Translate to see the original version of the page); there's a sketch of the pattern just after this list.
- Noarchive is a clear giveaway that something shady is going on - it's like whistling and twirling your thumbs outside a bank that's just been robbed.
- No matter who you are or where you are, there's a good chance that someone with some SEO knowledge is going to stumble across your clever (or, in this case, not so clever) spam.
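To make that first lesson concrete, here's a hypothetical sketch of what user-agent cloaking boils down to on the server side - not Flight Centre's actual code, just an illustration of how fragile the check is:

```typescript
// A bare Node HTTP server that branches on the User-Agent header.
// Anyone who sets their User-Agent to "Googlebot" (as we did at SMX)
// drops straight into the bot branch - which is why UA cloaking is so fragile.
import { createServer } from "node:http";

createServer((req, res) => {
  const ua = req.headers["user-agent"] ?? "";
  res.setHeader("Content-Type", "text/html");
  if (/googlebot/i.test(ua)) {
    // Crawlers get an indexable, text-only version of the page...
    res.end("<p>Text-only brochure content for the spiders...</p>");
  } else {
    // ...while human visitors get the image/JavaScript brochure viewer.
    res.end("<img src='brochure-page.jpg' alt=''>");
  }
}).listen(8080);
```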
How egregious is this particular case? Not that bad, actually. While they're technically violating the guidelines (and Adam certainly didn't seem particularly pleased), the content in the cloaked text-only version actually does match up to the text in the image version. The content from the DiscoverTasmania.com website may have been taken with legitimate permission as well (we were unable to verify this at the conference). All in all, my guess is that Google won't ban the entire site, though they may remove some of these pages from the listings. The bigger issue is how fragile cloaking really is, and how close you are to being called out for it at the most inopportune moments. :)
p.s. Looks like Neerav from the conference has already posted on the subject (and has a great photo of Danny, Adam, and me in our SMX Sydney Site Review labcoats, to boot).
Ahaaa,
Looks like Adam/Google have taken this seriously... not only does the search query for "Tasmania's capital lies in the south-east of the state" no longer return a listing for the Flight Centre page, but a query on site:catalogues.flightcentre.com.au returns no listings at all.
So it looks like the whole catalogues.flightcentre.com.au subdomain has already been delisted...
Now the fun begins for Flight Centre, I guess... Let's see how long it takes them to sort out the issue and get re-included...
Ah, good on ya for spotting that Flight Centre's cloaked content has been taken out of the Google search results.
Well, in my humble opinion, it's about time the SEO cowboys of Australia were outed. When I arrived here from Belgium in 2006, I thought this was the "SEO Wild West"... OK, don't get me wrong, not all Aussie SEOs are dodgy... but sadly, most of them seem to be into the "hit and run", short-term optimisation approach.
Their excuses? "It's such a new industry here... We need to discover the boundaries..." "We didn't know you weren't supposed to do that..." Yada, yada.
Such nonsense. Search engine "webmaster guidelines" and top quality SEO resources have been around for years now (YES! Since the mid/late 90's). But it seems that investing time and resources in the dodgy stuff is a lot more appealing/exciting...
Of course there is also the fact that these tactics still work... Search engineers... roll up your sleeves and get rid of this crap please.
To end on a more positive note: luckily, there are some "enlightened" search marketing veterans Down Under who do know what's what. Let's hope that at the next SMX in Sydney, there are more than 8 people and a horse's head to listen to them (or maybe not ;-)
My insurance affiliate site has been cloaking Google referrals directly through to the affiliate link for ages, and nothing has happened.
Google even provided the method needed for cloaking.
If you're going to take so much time and put in so much effort to cloak a site, why not just put that effort into writing new, original content? I'm not experienced in cloaking, because it seems that if you "have quality content and a web presence through quality links," your results should take care of themselves.
But talk about getting busted... Flight Centre might want to consider hiring a quality SEO firm to help overcome their current dark-gray-hat technique. Hmmmm, maybe someone located in the Seattle area?
I think a lot of people mistakenly believe that in order for your site to be indexed better than the next guy's, or indexed effectively at all, you need to cloak. That's simply not the case.
I think you are right in the case of a large organization like FlightCentre - but cloaking is and always will be an effective SEO strategy for those who are uninterested in long-term solutions. In fact, you could easily argue that by repeating the short-term tactic of cloaking over and over and over again, you can be just as effective (if not more so) in the long run, minus the overhead of running a legitimate organization.
Cloaking does not require much effort once you have the system down. And it's still worthwhile (for some) because of the long-tail keywords it captures and the low chance of getting caught if done properly (which this wasn't). Although honestly, if I were going to create a site that looks nice and all that jazz, I would definitely not cloak. At that point the site is an investment, and must be treated more carefully.
But OK. To make my point about it not taking much effort to cloak, here is literally the procedure for the site generator I have lying around (custom, not RSSGM).
It's not about the effort it takes to cloak that makes this a waste in my eyes. It's about the effort it took to promote/create the site in the first place, only to not bother with even a competent cloaking mechanism. Oy.
Also, I'd point out that Google Translate is not the be-all and end-all of defeating cloaking. It's perhaps two lines of code to check for a referrer from Google Translate, and then act accordingly. Something along these lines (a hypothetical sketch):
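```typescript
// Roughly the two lines in question: spot visits arriving via Google Translate
// by their Referer header, and have the cloaking script act accordingly
// (e.g. serve the "human" version instead of the bot version).
function isGoogleTranslateReferral(referer: string | undefined): boolean {
  return referer !== undefined && /translate\.google\./i.test(referer);
}
```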
Excellent article though :-) Thumbs Up.
Radzster, I don't disagree with your comment that Australian SEO is like the Wild Wild West. It's true that our SEO market is still in its infancy, but that's NO EXCUSE for producing shoddy work.
We all know the guidelines, we know what's going to get a site banned, so why do it? I don't know - I can only imagine it's for a quick buck.
Personally, I think we need to educate the marketplace. We should all be telling our friends, clients, colleagues that there are things you can do to your website that will get it de-indexed.
Where's the Sydney Morning Herald or The Australian when you need it? This could be a great story that could educate the nation on the don'ts of SEO.
Right on.
This, if we all remember, IS kind of how the world turns though, ain't it? Web design, in its earlier stages, was a smash-and-grab of coders doing shit work and charging out the datehole for it.
SEO, in its earlier stages here in Oz, is no different, in my opinion.
Radzster, I'd love to believe I'M one of those SEMs with integrity and I mean to keep it that way, even though integrity doesn't pay the rent and you can't eat it.
Hi Judd,
Totally with you there...
There are still so many web pros and agencies, globally, who are charging excessively for full-Flash sites that are totally inaccessible and totally worthless for search marketing.
You need to be passionate... You need to be prepared to put in a lot of time and effort in order to stay on top of this exciting but ever-changing industry. The information on best practices is widely available, from reputable sources... But not enough web agencies seem to be willing to learn it and implement it.
Hold on to your integrity, Judd! It will pay off in the long run.
Hi BlackMax,
Well, in all honesty, my comments were a bit harsh. And I definitely agree with you that there is a strong need to educate webmasters, online marketers and other stakeholders. At Bruce Clay Australia we do our best to fill this need with an intense 3-day SEO training course every 6 months (hey... on-topic plug, no harm done, right? ;-) And I am also totally with you on the lack of local media coverage... Where are they when you need them?
Rand, I was in the audience, and can I just say that you and Danny made the perfect "odd couple" for that session... brilliant!!
Sucks for flightcentre as well. We'll have to wait and see if anything comes of this.
Like Antonio Days above, this seems to me like an honest attempt at putting up a textual version of a graphic page that would otherwise be invisible to the search engines. SE guidelines aside, this is a nice solution that provides a good user experience. As a searcher, I would be glad to see this scanned brochure page in my SERPs, and the person behind this technique got it to work.
Ouch - getting busted for cloaking at an SMX panel with Google engineers SAT IN THE AUDIENCE has to smart a little...
Nice labcoats btw - looking good ;-)
Flight Centre has developed an online brochure system that converts brochure pages into JavaScript, which allows people to quickly and easily view travel information. The online catalogue pages are images, so the content in those pages, while discernible to the human eye, cannot be indexed by search engine spiders, and hence this large volume of travel information may never be accessed by the public. Our intent is purely to improve the visual experience and not to deceive the search engines.
In the development of this solution for our customers, our technology partners conducted research to understand how content from our brochures could be searched by people researching travel. We sought a second opinion from Google about how to index the content from the brochures to allow our customers access to this information. We showed our solution to a Google CSE. He reviewed our site maps and robots.txt and replied in writing that "everything looks ok". It is clear that our intent is not to show content to spiders that differs from the content in the pages, and therefore this should not be regarded as blackhat cloaking. Further, Flight Centre did not receive any extra benefit in natural rankings from providing content in this format. The content that was visible to Google's spiders is an identical replica of what is shown in the customer-friendly brochure viewer, so no unfair advantage was gained nor sought.
We support Google's stance on blackhat cloaking. However, their universal mandate outlawing this solution for indexing images does not account for situations such as ours, where the intention is purely to provide the best possible user experience.
We are working on an alternative solution and will seek to have it vetted by the Google search and webspam teams.
If what you say is true - that you had no intention of cloaking or of using someone else's content - AND if a couple of fellas clownin' around in white coats can find such a big boo-boo that quickly (no offense Rand buddy, you're a champ), then whoever you had in charge of putting that content up needs to be sacked... and then kicked in the sack.
Remind me to slide you guys my card next time I'm walking through the mall.
Thanks Colin,
I think it's clear from the way this has been implemented that the intent was probably not to deceive... However, cloaking is cloaking... and you seem to be the victim of a poorly conceived solution to the problem... and Google has had little choice but to act.
Hi Colin,
I think it's very admirable and courageous of you to try and explain your side of the story. It's the right thing to do in a situation like this... Good on ya!
I also think it's great that you are looking to implement an alternative, because the way in which the content was originally delivered, using user-agent cloaking, was not set up to be accessible... except to targeted search bots. The thing to take away from this experience is: don't rely on a search engine representative to tell you if your solution is OK (asking them how to have them index your content?). They keep moving the goal posts all the time anyway. Go for best-practice approaches from a usability and accessibility perspective, because these methods offer a safer, long-term solution in this constantly changing industry. Make sure what you implement is best for your users, and it will most probably be good for search engines as well.
"it's like whistling and twirling your thumbs outside a bank that's just been robbed."
Loved that comparison! Detective work, such as this, is one of my favorite things about SEO. It's like having a personal key to a magician's closet.
Rand, I was in the audience too, and the banter between you and Danny was hilarious. Neither of you should present without Diet Coke ever again. And remember - Adam was "cloaking" too!
Rand, you forgot to mention the fun and games you and Danny had trying to get Firefox and the user-agent switcher installed. You guys should have your own stand-up comedy routine, seriously! The unveiling of the cloaking issue was a great punch line!
I'm a bit confused on what constitutes cloaking.
I have a site where I preload a bunch of content with AJAX so it can be accessed instantly with jQuery. Basically I have an AJAX/jQuery version of the site (the user portion) and plain hard links (the Googlebot portion). I load the content of a given page normally, then load the rest of the entire site with AJAX so as not to cannibalize the SEO on that page.
Provided that all the content DOES exist on the site and is accessible by everyone including Googlebot and users, would this be considered cloaking?
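Roughly, the pattern looks like this (a simplified sketch - the endpoint name and markup are hypothetical):

```typescript
// The current page's content is already in the static HTML, so Googlebot and
// no-JS users see it. The rest of the site is preloaded over AJAX purely to
// make navigation instant; clicking a link swaps content in without a reload.
async function preloadSite(): Promise<void> {
  const res = await fetch("/api/all-pages.json"); // hypothetical endpoint
  const pages: Record<string, string> = await res.json();

  document.addEventListener("click", (event) => {
    const link = (event.target as HTMLElement).closest("a");
    if (link && pages[link.pathname]) {
      event.preventDefault(); // swap content in place instead of navigating
      document.querySelector("#content")!.innerHTML = pages[link.pathname];
      history.pushState({}, "", link.pathname);
    }
  });
}

preloadSite();
```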
Apologies for the personal story.
That sucks for them. They can officially say goodbye to their rankings. Someone is getting fired this month.
I love this stuff. Great job Rand and panel. Love the coats.
Ooops... soz for the multi-posting on Monday... not sure how that happened... (damn technology ;-))
After a quick bit of detective work, it seems that the suspect pages, and in fact the functionality used for the Flight Centre catalogues, were "borrowed" (or probably licensed) from www.cataloguecentral.com.au, who have a whole pile of catalogues from a whole range of companies (and industries), and guess what... all of those seem to use the same cloaking technique - although from a quick look I had yesterday, Flight Centre seemed to be a bit more aggressive in adding extra text content than most.
At the time of posting, Catalogue Central seems to be experiencing a bunch of VBScript runtime errors - perhaps in a rush to fix the "problem" before they are outed as well... oops... too late.
Haha... brilliant detective work there, iReckon. They are definitely, frantically trying to fix it...
That's really funny, especially the part about the Google guy seeing you find it.
So what does this show? Spamming still works on Google. Yup, I knew that already. Another big company gets caught doing it? Yup, lots more of them are doing it too. Will they get a slap on the wrist after they fix it? Yup. Will a mom-and-pop shop get treated the same way? Nope.
Rand, that was absolutely fantastic. As the person responsible for discovertasmania.com, I was completely blown away. The way you and Danny did it was entertaining and informative. Thanks for all the advice you gave while people were telling Danny how to get Firefox going, as well as for some great sessions during SMX.
"Provided that all the content DOES exist on the site and is accessible by everyone including Googlebot and users, would this be considered cloaking?"
As long as the loaded content and the "hard" content match up, I can't imagine an organization having an issue. Cloaking, for the most part, involves showing web users one batch of content and other users - namely bots - something completely different. And to be honest, what you described is about as white-hat a creative idea as I can remember.
This has me angry.
Let's say I have a powerful site and an upstart is starting to move up the SERPs. I grab their content and stick it under a cloak. That might be able to bump them from some rankings because of the duplicate content. And, at minimum, if my site is more powerful, I'll probably outrank them for all of their terms.
pwnd!!!!
Still kicking myself a tiny bit for not being able to make it over from Perth (a 5-hour flight) for probably the only big-shot show that happens here in Oz; I'm doubly disappointed that I missed out on this.
To be honest, the industry marketplaces here in Australia can have some serious competition, and that fairly easily lends itself to some naughty, naughty behaviour.
Heck, even here in Perth, somebody's ranking high for a highly sought-after phrase and their company's Mission Statement is completely swiped. I found out because that company's CEO put it on a forum. The reason I KNOW that he's legit? I helped write that content when I worked there. Brutal.
Maybe I should be pleased that the Wild Wild West is alive and kicking here Down Under. Heh. Maybe I'll just keep my head down and keep playing by the rules while waiting for the sh*t to hit the fan.
Smooches Rand, hope to catch you at the next one.
This looks like a legitimate use of text replacement. The site is using a JavaScript-based image manipulation program to create a "magazine-like" site, which their customers may prefer to plain text. The displayed page is 100% image, but the text behind the page seems to match.
The only error is in the cloaking - Flight Centre needs to create the page with the text in the magazine view using hidden divs instead of using user-agent redirection.
I don't think Flight Centre meant to use shady practices, unlike BMW, VW and others with super-shady SEOs.
Right idea, wrong implementation.
'legitimate use of text replacement'?
Is it?
Quick update: two weeks on, and Flight Centre seems to have removed the cloaking from their site (over a week ago, I believe)... but so far they still don't appear in Google's results...
I'm not sure if this is the best thing to post, but it looks like some companies don't learn their lessons: automated results. In some categories there are over 17 different versions of the same page, each targeted at a geolocation for "rental cars".
The company's standard links are https://www.flightcentre.com.au/cars/holidayautos.jsp
The autogenerated pages are....
https://www5.flightcentre.com.au/1/3/indexc3.html - Car Rental
https://www5.flightcentre.com.au/1/4/indexc7.html - NZ holidays
Over a few days, these www5 pages went from having poor-quality results and a giant YOURAMIGO logo, to more relevant results and a small YourAmigo link, and now to what seems like dozens of pages of results and a very small piece of text at the bottom...
I'm not sure that this is really best practice, since the results are being served on a different subdomain (www5), and the results and pages aren't consistent with the main site.
The site with the www5 pages was featuring in first-page results for terms around "car rental", but now the YourAmigo pages seem to have disappeared from the results, and the main Flight Centre page is back at #88 for "car rental".
Just my thoughts on the issue... does a company ever learn?
Ahhh... They are back .... 1,900 pages listed today
Pages finally starting to appear back in the Google Index.... 3 pages currently listed
It's amazing that this all went down in front of a live audience. I imagine Flight Centre will be hiring new SEO staff... this is pretty bad!
Word on the street is that the Flight Centre catalogue pages have been using this technique for quite some time, and IMHO this was probably not an intentional attempt to deceive Google (or any other search engine spiders, which are affected the same way).
I believe this cloaking effort was simply an ill-conceived attempt to provide a text-based summary of the images shown on each of the pages.
However... the implementation is a classic example of cloaking, and given the very public way this was discovered, Google has had very little option but to act.
It seems that Flight Centre has been in discussions with Google today in an effort to rectify this as soon as they possibly can.
Doesn't seem that bad at all. While cloaking is not OK, if this were Flash, and this page were its HTML version (with some formatting that would perhaps help it rank better), this page would be OK according to the guidelines - or am I wrong here?
That's quite a public embarrassment for them! No covering that one up!
TareeInternet makes a good point - this page is not an isolated occurrence on the site. This user-agent-based cloaking has been applied to many pages (though, as noted above, only for a select number of bot signatures).
I was wondering where that term "Tasmanian Devil" came from. What an embarrassment that's going to be.
Well... I can't help it. I struggle to see this as an innocent, ill-conceived attempt to provide a text-based summary of the images, because it does not seem to be accessible to all users... only to a search bot.
If they really want to ensure that the content is accessible, then there are perfectly legitimate, best-practice, search-engine-friendly web development techniques available. And no need to compromise the "cool" design, either.
Thanks for the link to my article Rand :-)
Sent you an email with the full res photo of you and Danny in labcoats if you want to use it
I did a snap poll of industry people at the conference, and while none wanted to go on record, the consensus was that it's blackhat because they're only targeting the main search bots - it seems too deliberate to be a case of "ignorance".
PS: I even have live video from my camera of you and Danny discovering the cloaking, but I asked Barry and he thought it wouldn't be wise to publish it, because the audio has a few comments about Flightcentre which shouldn't be repeated.
Rand,
Great catch! Just playing "Devil's Advocate" here: on every other outbound link from SEOmoz you are running a nofollow, so why not run a nofollow on the link to the offending site? If they're running black hat, why give them any more juice than they deserve?
Thanks for all the great info!
ryantodd
We prohibit caching by Google and the Internet Archive. But our reasons are not SEO related, and I don't think the lack of a cache should be stigmatized.
We have a news blog in a contentious niche, and we get threats of lawsuits on a regular basis. No matter how hard you try to avoid getting into trouble, these threats come out of left field.
Upon receiving a threat, which usually happens when the "victim" is at his most irate and before he has even called his lawyer, we remove the content, edit our robots.txt, verify the robots.txt in Google Webmaster Tools, and then use GWT to remove the content (including excerpts) from Google's index, which takes about 30 minutes most of the time.
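The robots.txt edit is nothing exotic - just a Disallow rule covering the pulled URL (the path below is hypothetical), since Google's removal tool wants the URL blocked before it will process the request:

```
# Hypothetical robots.txt addition for a pulled story; Google's URL removal
# tool requires the URL to be blocked (or 404/noindexed) before removal.
User-agent: *
Disallow: /2008/04/removed-story/
```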
By the time the victim has gotten in touch with his lawyer, he realizes he hasn't saved a copy of the page, and the page is gone online. He can't see a cache, he can't go to the Internet Archive, and he can't reconstruct it from Google SERP snippets.
His lawyer starts to ask questions ("Did they say this was a fact or did they state it as an opinion?") and the guy can't really remember. Lawyer says "O.K., we can take the case up to the point where we can subpoena their files, if they've even kept them, and that will cost about $5,000 for my time. At that point we'll see if we have a case or not."
The would-be plaintiff tends to fade away into the sunset at this point.
Wow! What a catch! Many would probably be surprised at how many large companies use black hat techniques, including cloaking, hidden text, copyrighted materials, etc. I'd love to hear their response.
Google is overreacting. The site's content matched up with the cloaked material, according to the article, so what's the big deal?