In the world of big brands and sites, SEO practices become part of a laundry list of developer tasks, often far beneath the threshold of serious attention. While folks in our industry have learned the cost of ignoring the search engines' guidelines for crawling & ranking, in the world of the Fortune 1000, there are hundreds of disbelievers. Thus, it's a great time to revisit some common SEO mistakes.

  1. Un-Spiderable Navigation
    From Flash links to JavaScript calls to drop-downs and search box interfaces, there are dozens of sites that go uncrawled because their navigation is unfriendly to spiders.
  2. Disregard for Relevant Keywords
    Out of the Fortune 500, I'd estimate that only a scant few dozen are actually implementing proper keyword research and targeting - the rest leave it to a "creative ad writer" to determine page content and title tags.
  3. Flash & Image-Based Content
    In addition to navigation, the content that's most critical to search engines is frustratingly hidden in files that spiders can't see. Despite the promises from years ago that engines would eventually be able to spider Flash content (or read text in images), it seems we're still many years away.
  4. URL Canonicalization Problems
    With "print friendly" versions, different navigation paths in the URLs leading to the same page, and plain-old duplication for the heck of it, "canonical content" is going underappreciated.
  5. Content Distribution & Partnerships
    Along with canonical issues on their own sites, many large owners of web content license it out to dozens (sometimes hundreds) of sources. The only thing more damaging than having six versions of content on your site is having six versions of it on six other big, powerful sites.
  6. Cookie or Session-Variable Requirements
    Big sites that don't build content access systems for spiders are asking for trouble - if a spider has to accept a cookie just to get through, someone else will be getting your traffic.
  7. Controlled-Access to Content
    The NY Times, Economist and Salon.com don't see nearly the link popularity growth of their more generous competitors. Even when you let the spiders through, requiring membership or paid access means that far fewer visitors will bother to link.
  8. Multiple Site Creation
    Rather than launch projects behind their root domain, big companies seem to take pride in releasing 6 new websites every time their ad agency changes the campaign slogan. Somebody's never heard of the sandbox...
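
To make item 4 a bit more concrete, here's a rough sketch of how a site might collapse its duplicate URLs down to one canonical form. It's only an illustration: the parameter names in NOISE_PARAMS and the canonical_url helper are my own assumptions, not anything a particular site or engine uses, and a real site would need its own list.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameters that change the URL but not the content
# (print views, session IDs, referral tags). Assumed for illustration.
NOISE_PARAMS = {"print", "sessionid", "sid", "ref"}


def canonical_url(url: str) -> str:
    """Return one normalized form for URLs that show the same page."""
    parts = urlsplit(url)
    # Keep only the parameters that matter, sorted so order is stable.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in NOISE_PARAMS
    )
    path = parts.path or "/"
    # Lowercase scheme and host, drop the fragment entirely.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


print(canonical_url("http://Example.com/widgets?print=1&id=7&sid=abc"))
print(canonical_url("http://example.com/widgets?id=7"))
```

Both calls print the same URL, which is the point: the print-friendly, session-tagged version and the plain version should resolve to a single address that the engines (and your inbound links) can agree on.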

One of the most fascinating people I met at the recent Pubcon was a "search proxy architect" whose job is to work with big brands' unfriendly sites and create alternate pages for search engines to crawl and index. Basically, it's an advanced form of cloaking that the engines tolerate, largely because they'd rather be able to spider the content from these sites than have it removed from the index. I had no idea of the extent to which this practice is used, but apparently, this "ethical cloaking" is much, much more common than you might think. Sadly, I can't post the examples I know of, but if you've got some, feel free to share in the comments.

So, next time someone asks you whether cloaking is white-hat or black-hat, you can tell them... "it depends."