At the SMX Sydney conference in Australia this past week, search engineers Priyank Garg & Greg Grothaus (of Yahoo! & Google, respectively) shared information about duplicate content filtering across domains of which I and many of the other speakers/attendees were previously unaware.
Priyank, when asked about best practices for "localizing" English-language content across domains, noted that Yahoo! does not filter duplicate content out of its results when the same content is found on multiple ccTLD domains. Greg confirmed that this is also how Google's engine behaves and that, with the exception of potentially spammy or manipulative sites, reproducing the same content on, for example, yoursite.com, yoursite.co.uk and yoursite.com.au is perfectly acceptable and shouldn't trigger removal for duplicate content (assuming those sites are properly targeting their individual geographic regions).
This (seemingly) came as a surprise to the audience, as well as to noted experts on the topic: David Temple (Head of Search Marketing for Ogilvy in Singapore), Cindy Krum (Founder of Rank Mobile) and Rob Kerry (Head of Search for Ayima). According to Priyank, this should not have come as a surprise, as he felt this best-practice recommendation had been communicated previously (thus, this may not be new information for everyone). Greg was less clear about whether this was "new" information from Google, but supported the use of content across shared-language domains in this fashion.
For those a bit confused, I've created this quick comic/illustration:
In my opinion, this shifts the balance quite a bit in favor of creating separate, country-specific top-level domains when geo-targeting in each specific region is important to the brand/organization. However, I have to balance that against my personal experience with country-targeted domains for small and medium-sized businesses. The engines place an enormous amount of value on domain authority, and it's been our experience that many smaller sites that launch overseas rank worse with their geo-targeted domain than they did with a region-agnostic .com or .org.
As an example, SEOmoz.org gets significant traffic from English-language searches outside the US, and our domain trust/authority/history/link metrics help to bolster that. If we were to launch seomoz.co.uk or seomoz.com.au, replicate our content on those domains, and then restrict geo-targeting (in Google's Webmaster Tools and through IP-based redirection, for example), I wouldn't be surprised to see the rankings of the main seomoz.org site outside the US suffer dramatically (until and unless the UK/Australian domains gained similar levels of link popularity/importance). However, if we were a much larger brand (an Expedia, AT&T, Tesco, etc.), it might make much more sense to localize and build each domain individually to get the benefits of the local preference algorithms (which favor geographic proximity) in the engines.
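For illustration, here's a very rough sketch of what that kind of IP-based redirection logic might look like in Python. To be clear, this is a hypothetical example, not code we actually run: country_for_ip() is a stand-in for whatever GeoIP lookup you'd use, and the domain mapping just mirrors the seomoz.org / seomoz.co.uk / seomoz.com.au example above.

    # Hypothetical sketch of IP-based geo-redirection (illustration only).
    COUNTRY_DOMAINS = {
        "GB": "https://www.seomoz.co.uk",   # hypothetical UK ccTLD site
        "AU": "https://www.seomoz.com.au",  # hypothetical Australian ccTLD site
    }
    DEFAULT_DOMAIN = "https://www.seomoz.org"  # the existing region-agnostic site

    def country_for_ip(ip_address: str) -> str:
        """Placeholder for a real GeoIP lookup; should return an ISO code like 'GB'."""
        return ""  # plug in your GeoIP provider of choice here

    def geo_redirect_url(ip_address: str, path: str) -> str:
        """Pick the country-specific URL to redirect a visitor to."""
        country = country_for_ip(ip_address)
        return COUNTRY_DOMAINS.get(country, DEFAULT_DOMAIN) + path

    # e.g. a UK visitor requesting /blog would (with a working lookup) be sent to
    # https://www.seomoz.co.uk/blog; everyone else stays on the .org site.

In practice you'd pair something like this with the geo-targeting settings in Webmaster Tools rather than rely on redirects alone, and ideally let users override the country choice.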
BTW - All the panelists noted that it would still be best practice from a usability and conversion standpoint to localize language usage and create a customized experience for each audience. It's just that search engines won't block those pages from inclusion in their indices simply for content overlap.
As always, looking forward to your thoughts on the issue (and apologies for errors in writing, I've just landed in Los Angeles after a 13 hour flight from Sydney, and had to grab a hotel as I foolishly booked my flight to Seattle for tomorrow morning - doh!).
p.s. Folks in the comments have pointed out that Vanessa Fox has previously discussed this issue in great detail, and that Neerav from Bruce Clay Australia has some good coverage of this particular session. I'm also a fan of Kalena Jordan's coverage of SMX, though it didn't hit this particular topic.
Heh. I was quite pleased that this info came about as a result of the question I asked Cindy, when I called out through my Darth Vader-like microphone:
"Is duplicate content something you should really be concerned about when targeting specific countries? I.e., shouldn't it be the case that if the search engines see the same content on, for example, your .com and .co.uk site, they would then simply filter and serve the most relevant results... e.g., showing your .co.uk pages in the Google.co.uk SERPs?"
(yes, it was a pretty convoluted question)
When Priyank stepped in and said they wouldn't consider the same pages across different TLDs duplicate content, and then Greg agreed (although less convincingly), I remember thinking, "Did they just say that? Wow... this changes everything".
(ps. Rand, thanks so much for initiating the chat on Friday over dinner. I'm painfully shy, and otherwise might not have said a thing ;P)
Thank you for sharing, Rand. However, I have a question: wouldn't that allow someone with a country-specific domain to duplicate my content and not get penalized?
There should be some sort of link between the websites, so the robot can decide whether or not the pages are related and whether a duplication penalty should apply.
Thanks, and waiting for your reply.
farahat - the other side of that, of course, is that this prevents a competitor or antagonistic party from getting your domain name in another ccTLD, reproducing your content and getting you filtered out of the index (assuming they could get the engines to view their version as the authoritative one).
I agree - it would be nice to have a cross-domain tag or indication of some kind (beyond simply registering multiple sites with GG WMTools) to indicate that the sites are owned by the same party and possibly receive benefits/consideration appropriately.
The point was also made in the session that acquiring ccTLDs in some countries isn't easy for spammers. For example, in Australia you must have an Australian Business Number (ABN) to register a .com.au domain.
I'd also like to know if Google Webmaster Tools recognises well-intended multi-national content repetition that's not on separate ccTLDs. Perhaps the fact that they sit under the same verification account would help?
>>> "it would be nice to have a cross-domain tag or indication of some kind (beyond simply registering multiple sites with GG WMTools) to indicate that the sites are owned by the same party and possibly receive benefits/consideration appropriately."
Perhaps <link rel="alternate" type="text/html" href="https://www.example.es" /> ?
Or maybe (and I like this idea far less) you could decide on a primary language and label that domain's homepage with rel=canonical. Can you tell I've been considering rev=canonical recently?
I agree with others here that I've seen talk of this effect before, but I've never seen anything official from the search engines, so I've always taken it with a pinch of salt.
It's interesting to see the search engines come out and publicly announce this, but I echo what you, Will and others have said above: in some ways this doesn't change much, since you still have the problem of domain authority. If they fix THAT, then it's a much more interesting announcement.
This sounds a lot like what Vanessa Fox has been saying - that when it comes to duplicate content across geo-targets, the search engines filter rather than penalise. In other words, seomoz.org.uk would rank instead of seomoz.org if you had both, but that wouldn't be a penalty on seomoz.org - it would simply be filtered out in favour of the most relevant version.
This is nice (and somewhat consistent with what we have seen) but it misses large parts of the puzzle. I think the issue you highlight about domain trust is a massive problem - the new information doesn't actually change anything when it comes to multi-lingual sites, as the search engines have never filtered (or penalised) duplicate content across languages (for the simple reason that it's not really duplicate!). Nonetheless, we have seen cases where having two languages in sub-folders on a .com is better than having two ccTLDs.
I think this information is good to know - it's nice to hear that Vanessa's remarks were officially true - but it doesn't change much in the evaluation process. To my mind, you still need to think about whether you have local resource and ways of getting links to your smaller domains.
Hey Will, I agree with you that using sub-folders or sub-domains for other languages is a strong option. Matt Cutts verified this in answer to my question (see the comments) https://www.mattcutts.com/blog/subdomains-and-subdirectories/
Now I'm curious about duplicate content for multilingual sites. Is it strictly an issue of unique strings of characters (same content but translated), or do engines have the ability to identify semantic value? I guess what I'm asking is: "are multilingual translations unique, duplicate, or 'other' content?"
No - translated content (truly translated - UK --> US or vice versa doesn't count) is *not* duplicate content.
What about if you localized a directory and set up geo-targeting in Webmaster Tools?
I know you would not typically do this for English-language pages outside the US... but I wonder how it would affect rankings.
I have some pages set up for French, German, and Spanish translations.
Would you get the domain authority/trust from using your well-linked/aged domain as well as the localization benefits?
<edit>FYI... having problems with draggable comments in IE8</edit>
Kurt - there are a bunch of things that go into geo-targeting, but the ccTLD domain name is certainly a big one. It's possible to target a subfolder to a country, but it generally doesn't carry as much benefit for the local/regionalized ranking algo as the TLD, hence the issue. This post, however, is more about the engines' duplicate content filtering on different region TLDs.
Vanessa Fox says on her blog: "Even if the content is the same across each site, you don’t need to worry about duplicate content. Remember that search engines generally don’t penalize for duplicate content, they filter. And in this case, filtering is exactly what you want. You want the search engine to show the UK page to searchers in the UK and filter out the US page. And that’s what search engines typically do." https://www.ninebyblue.com/blog/making-geotargeted-content-findable-for-the-right-searchers/
So it doesn't seem to be totally new info.
Thank you for this, Rand - it confirms something I've been noticing with a client that has a .com and a .co.uk site. Actually, what this means is that having slightly DIFFERENT content on each TLD (especially the page title) can cause the ccTLD sites to compete against each other, whereas if they're identical then Google picks the appropriate version for that country.
So does this mean we should scrap the accepted wisdom of "create unique copy when you have same language / different country domains"?
This issue had been bothering me for a long time for a high-profile client.
This post made things clear about the ccTLDs that one business can operate. Good that this was cleared up, at least now.
Thanks Rand.
Fantastic. I'm working on a .co.uk site that's very much a duplicate of the original .com.au and have been worrying about this issue for a while. Also, I would be interested in seeing a compiled list of factors that affect geo-location recognition.
This has been a problem for a very long time, and now we have some control over the basic issues, using GWT, geo-location tagging, getting local links to the local domain, etc.
The current issue that I am facing: the top-level .com domain is not ranking for fresh content/keywords; instead, Google is giving the top ranking to a ccTLD/ccDIR page, which is not as trusted as the .com site.
Do you think canonical tags can help me in this case to indicate the original content and get the .com site to rank?
Any other tweaks to play around with?
Thanks
Aj
Theory is nice, but it doesn't always happen.
We have a high-profile site, initially a .com.au version, for an Australian company. It has the most links, and most of its traffic is Australian.
Then they extended services to the US, and created the .com website.
It has its own links and some custom content, but there are numerous pages with identical content.
Am I understanding this thread correctly - that the Google and Yahoo! reps said both sites would be fully indexed, and the appropriate version shown per country?
I have many examples where, say, xxx.com.au/fred.html is indexed, but xxx.com/fred.html is not. Both are linked from sitemaps and other navigation, and the content is basically identical.
From Kalena's notes, the multiple-site solution suffers from this: potential link and duplicate content risks.
There is no way the client will merge both sites.
I was about to recommend that they change the content slightly, but now people are saying this is not necessary?
This is a good point. I've had similar experiences. Plus, see willcritchlow's post above.
Ahem: https://www.sitepronews.com/2009/04/08/smx-sydney-international-seo/
Good to know. It makes sense since the domains don't really compete with each other.
Yes, it's 100% acceptable.
Am I missing something? Please let me know if I am. But in the UK, we are often limited in our domain name choices because the .com TLD has already been taken. For the most part this is okay, as we are geo-targeting a UK-based business or campaign. However, what does this do to brand management and SERPs? Should I avoid buying the duplicate domain name and look at variations instead? What is the impact on branding? Are you still competing against a duplicate TLD overall, especially if the subject matter is similar?
Does this also mean that you should always buy a regional domain and leave .com to the Americans?
Thanks for sharing. I have a question: what will happen when someone tries to optimize a .com and a .co.uk for the same country with the same content? Will Google and Yahoo! penalize them, and if so, how?
I'm guessing Google relies more on the Specify Regional Location tool in the Webmaster console to automate its regional filtering, unlike Yahoo!'s approach. But dupe content issues would have more impact on GG and their expensive datacenters, so they couldn't afford the soft option taken by Yahoo!. Plus, the Specify Region tool gives webmasters the ability to inform Googlebot on a per-domain, per-subdomain, or per-directory level, so it's more powerful.
BTW - great to see you again Rand and thanks for offering to link to my SMX blog coverage (hint hint!)
This info is really useful to know, but I think it's still very important to try to reduce duplicate content as much as possible across geographical domains. The main reason is the different audiences, whether they be people or search engines. For people, they need to be able to relate to the information you're providing. And for search engines, what's top in the UK and US is not necessarily the most popular in other countries.
Since when is something that Matt Cutts mentioned, like, two years ago "new"? This is just a clear sign people don't pay attention and don't bother testing anything.
Oh wait, everyone who was surprised isn't quantitative, so they wouldn't have tested this. Hmm... I suppose that's why people shouldn't listen to MBAs and such when it comes to search marketing?
Have to agree with you here. I used to run an e-commerce site across a few dozen different countries, all with the same content, but of course translated into the native languages (by professional translators). We achieved top rankings for every ccTLD and its associated country very easily. The majority of the sites had the same layout and theme as well.
Before we get too self-congratulatory about how we knew this or knew it was coming, I thought I should throw in a separate client example I forgot to mention earlier. Small biz client: not that many links. Duplicate .com and .co.uk versions. They got a few powerful links to the previously weaker .co.uk domain, and suddenly neither version ranked for branded search and a number of previously-ranking terms. They came to us for help and we had them redirect and standardize on one domain (and repoint some of the powerful links). Bang. Straight back into the results as before. So it's definitely not fully working yet. (Comment written on iPhone. Sorry for dodgy typing.)
not even my $0.02 worth:
Rand, I love your "region agnostic" domains -- cracked me up!
I always wondered what the Googlebot looked like; glad you cleared that up with your picture, too.