Xan's got some highly relevant distillations of presentation material from the IR conference she attended two weeks ago. From MSN:
Ruiha Song et al. from Microsoft Research presented a paper on “exploring URL hit priors for web search”. As they explained, URLs contain meaningful information for measuring the relevance of a web page to a query in web search. Some priors might include the probability of a page being good given the length and depth of a URL. They studied the use of the location of query terms and their occurrence in the URL for measuring relevance. A hit is the occurrence of a query term in a URL. A hit type would be, for example, “the priori probability of being a good answer given the type of query terms in the URL.” They found that hit priors have the advantage of acquiring stable improvement for informational and navigational queries. They observed an improvement of 33-66% in recall performance. They intend to roll out this method and test it on real web data using MSN in the future.
Better make sure you're using keywords in the URL for MSN. Apparently, it may soon have an even larger impact.
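To make the idea of a "hit" a bit more concrete, here's a rough Python sketch of how you might check where query terms show up in a URL, along with the URL's length and depth. This is only my own illustration, not the method from the paper: the function names (`tokenize`, `url_hits`, `url_shape`), the three location labels, and the example.com URL are all made up for the example, and it does no scoring or probability estimation at all.

```python
import re
from urllib.parse import urlsplit


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def url_hits(query, url):
    """Return (term, location) pairs for each query term found in the URL.

    The locations "host", "path" and "query_string" are labels chosen for
    this sketch; the paper may categorize hits differently.
    """
    parts = urlsplit(url)
    fields = {
        "host": tokenize(parts.netloc),
        "path": tokenize(parts.path),
        "query_string": tokenize(parts.query),
    }
    hits = []
    for term in tokenize(query):
        for location, tokens in fields.items():
            if term in tokens:
                hits.append((term, location))
    return hits


def url_shape(url):
    """Return the URL's character length and its path depth."""
    path = urlsplit(url).path
    depth = len([segment for segment in path.split("/") if segment])
    return {"length": len(url), "depth": depth}


# Hypothetical URL, used only for illustration.
example_url = "http://www.example.com/widgets/blue-widgets.html"
print(url_hits("blue widgets", example_url))
print(url_shape(example_url))
```

A real ranking system would presumably learn from relevance data how much each kind of hit should count; this sketch only extracts the raw signals.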
Xan mentions an IR conference in Seattle in August. Hopefully, it doesn't conflict with SES San Jose so I can finally meet her in person :)
Keywords in the URL are one of the best indicators of page content. Whether intentional or not, it's very rare to see unrelated keywords in a URL.