A year and a half ago, the major search engines got together to talk about the issue of blog spam. Ask didn't attend (although they received an invite), but Google, Microsoft & Yahoo! all sat down with the creators of popular blog software packages and found a way to distinguish links from anonymous sources on the web from editorial votes for the value of a site or page.

Many would argue that the nofollow tag solution is riddled with problems, but no one disagrees that, in principle, the major search engines cooperating on issues like nofollow (and, more recently, MSN & Google adopting the noodp tag) is a great leap forward. Now another issue affecting a great number of searchers around the world needs the attention and consensus of the major search engines: international content, targeting, language & hosting.

What are the current problems:
  • No agreement on how to determine whether content is intended for a specific country, language or subgroup (e.g. French-speaking Québécois in Canada)
  • No universal guidelines for the use of top level domains (TLDs)
  • No universal guidelines for the use of hosting in specific geographies
  • No recommendations for webmasters seeking to reach a particular geographic or linguistic audience
Who does this affect:
  • Users - the best results as of now are generally found in English language searches from countries like the US, UK, Canada & Australia. Users in other countries, searching in their native language or a popular non-official language, often get a much worse experience.
  • Businesses & Publishers - the vast majority of webmasters and content creators on the web are seeking to be found through the search engines. Without language and regional guidelines, it's very hard for these folks (whether large or small) to decide where to host their site, what TLD to register with and what language or languages to write in to reach their audience.
  • Search Engines - without a consistent, coherent system to follow, the search engines themselves are dealing with a multitude of problems, from spam to database management to unhappy users and confused webmasters.
The questions that need public, universal answers:
  • If a website wants to reach multiple audiences in multiple countries with different languages for each, what is the best practice?
  • If a website is intended to be targeted to a regional language-speaking group inside a country with a different official language, what is the best practice?
  • If a website wishes to target all speakers of a language worldwide, what is the best practice?
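
To make these questions a little more concrete, here is a rough Python sketch of the three signals discussed above (TLD, hosting location and language) for a single page. It is only an illustration of why the questions are hard, not a description of how any engine actually works. The function name `targeting_signals`, the tiny `CCTLD_COUNTRY` table and the example.com URLs are all made up for this example, and the language and hosting country are simply passed in by the caller.

```python
from urllib.parse import urlsplit

# Hypothetical, deliberately tiny lookup of country-code TLDs.
# A real list would be far longer; this one exists only for the example.
CCTLD_COUNTRY = {"mx": "MX", "ca": "CA", "uk": "GB", "au": "AU", "fr": "FR", "es": "ES"}


def targeting_signals(url, declared_language, hosting_country):
    """Collect TLD, hosting and language signals for one page.

    declared_language and hosting_country are supplied by the caller
    (an assumption of this sketch); nothing is looked up on the network.
    """
    host = urlsplit(url).hostname or ""
    tld = host.rsplit(".", 1)[-1]
    tld_country = CCTLD_COUNTRY.get(tld)  # None for generic TLDs such as .com

    # Country hints that are actually present (a generic TLD gives none).
    countries = {c for c in (tld_country, hosting_country) if c}
    return {
        "tld": tld,
        "tld_country": tld_country,
        "hosting_country": hosting_country,
        "language": declared_language,
        # True when the TLD and the hosting location point at different countries.
        "countries_conflict": len(countries) > 1,
    }


# A Spanish-language book page on a .com domain hosted in the US.
generic = targeting_signals("http://www.example.com/libros/", "es", "US")

# The same page on a .com.mx domain, still hosted in the US.
mexican = targeting_signals("http://www.example.com.mx/libros/", "es", "US")
```

In the first case the only country hint is the US hosting, even though the page is in Spanish, so nothing says whether it is meant for Mexico, Spain or Spanish speakers in the US. In the second case the TLD and the hosting location disagree. Without published guidelines, a webmaster has no way of knowing which signal wins.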

Now, let's look at some specific problems currently affecting the search engines to illustrate the point.

Searching for the Spanish word for books - "Libros" - on a few different search engines:


A search on MSN in the US for libros


A search on MSN México - prodigy.msn.com - for libros


A search from the US on Yahoo! for libros


A search on Yahoo! México for libros

 
A search at Google US for libros


A search at Google México for libros

Even readers who aren't familiar with Spanish can see that many of the above results contain serious inconsistencies and raise questions about which sites are, or should be, appearing in the results. We're forced to wonder why Amazon's Spanish language book section ranks at Google in the US, while Barnes & Noble's section ranks in a search from México. Yahoo! and MSN show very odd advertisements on their US sites, and the organic results in English aren't geared towards what searchers are most likely seeking. There are language interpretation problems, relevancy issues and questions about why a certain site should reach US Spanish-language speakers vs. Mexican audiences at all three engines.

Along with this specific example, there are multiple other issues that we've encountered while surfing in different languages and from different country portals:

  • Google generally appears to weigh hosting & TLD extension more heavily than Yahoo! & MSN. It waives this requirement for certain sites (like Wikipedia), yet doesn't apply the same exemption to content on sites like the BBC, Amazon or Yahoo! (who all produce lots of international content)
  • Yahoo! has remarkable inconsistencies with ranking US-focused content that mentions or uses words in other languages, although their system appears to be somewhat more even-handed than Google's with regard to content ranking in certain domains/countries.
  • MSN is overly reliant on link data, which led to my post from a few weeks back on how to MSN-bowl someone out of the results by linking to them heavily from another country. This needs serious attention.

I'd love to see more information, suggestions and ideas for solutions in the comments - please do contribute if you, too, have experience with international targeting issues at the search engines. With some luck, they'll get together in the near future and issue some guidelines so the non-English language users of the world can receive higher quality search results and we, as webmasters, will know how to reach our audience.