Short post tonight, as I'm just back from a short trip with Mystery Guest to celebrate our one-year anniversary (which was awesome, BTW) and need to get caught up on lots of email.

Let's start with a quick quiz - which of the following statements is true?

  • A) My pages are in my XML sitemaps file, so they must be getting crawled
  • B) My pages have been crawled, so they must be in the index
  • C) My pages are in the index, so they must be able to show for queries
  • D) None of the above

If you guessed A, B or C, congratulations, you're part of a large contingent of folks doing SEO who are (rightfully!) a little confused about how the engines handle crawling, indexing and serving pages. I've created a quick graphic to help out:

Levels of Indexation

The takeaways here aren't tremendous, but they can help explain to SEO outsiders why pages may not be drawing traffic even though metrics like appearing in your XML sitemaps, showing in Google Blogsearch queries or appearing to be crawled in Google Webmaster Tools suggest they should. If you want to determine whether a page (or set of pages) is actually included in the engines' main indices, there are only two definitive ways to know:

  1. Perform queries that show the page appearing in the results (without having to use the &filter=0 in the URL string)
  2. Check your traffic logs to see if queries are actively sending the page traffic

This is why I love the metric of # of pages that received at least one visit from search engine X each month. If that number is trending in a positive direction, you can at least rest assured the engine is indexing (and holding onto) your pages.
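If you want to track that metric yourself, here's a rough sketch of one way to compute it. It assumes you can export visits as (date, landing page, referrer URL) records from your analytics or logs. The record layout, the function names and the `ENGINE_HOST_HINTS` table are all made up for illustration, so adapt them to whatever your own data looks like.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical lookup: engine name -> substring expected in the referrer host.
# This is a crude match for illustration; tighten it for real data.
ENGINE_HOST_HINTS = {"google": "google.", "yahoo": "yahoo.", "bing": "bing."}


def engine_for_referrer(referrer):
    """Return the engine name for a referrer URL, or None if it isn't a known engine."""
    host = urlparse(referrer).netloc.lower()
    for engine, hint in ENGINE_HOST_HINTS.items():
        if hint in host:
            return engine
    return None


def pages_with_search_visits(visits):
    """Count distinct landing pages per (month, engine) with at least one search visit.

    `visits` is an iterable of (iso_date, landing_path, referrer_url) tuples
    (an assumed export format, not any particular tool's).
    """
    pages = defaultdict(set)
    for iso_date, path, referrer in visits:
        engine = engine_for_referrer(referrer)
        if engine is None:
            continue  # not a search referral, so it doesn't count toward the metric
        month = iso_date[:7]  # "YYYY-MM"
        pages[(month, engine)].add(path)
    return {key: len(paths) for key, paths in pages.items()}


# Made-up sample data, purely to show the shape of the input.
sample_visits = [
    ("2024-05-02", "/blog/a", "http://www.google.com/search?q=seo"),
    ("2024-05-09", "/blog/a", "http://www.google.com/search?q=seo+tips"),
    ("2024-05-11", "/blog/b", "http://www.google.com/search?q=indexation"),
    ("2024-05-12", "/blog/c", "http://example.com/some-link"),
    ("2024-06-01", "/blog/a", "http://search.yahoo.com/search?p=seo"),
]
counts = pages_with_search_visits(sample_visits)
```

Each page counts once per month per engine no matter how many visits it received, which is the point: you're measuring how many pages the engine is holding onto, not how much traffic they get.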

Comments are strongly encouraged on this topic (particularly since I didn't get to cover it in great detail). Thanks!