I was recently doing some in-depth log analysis and came across two search engine bots, mozDex and MJ12bot, that I find very useful and wish the very best.
The first bot belongs to mozDex, a search engine built on free, open source technologies like Nutch. So far nothing new, but mozDex has one special feature that makes my heart go boom-boom: every search result has an "explain" link. It displays a pretty detailed explanation of how the page scored; take a look at the picture below for the query seomoz.
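To give a rough idea of what such a score breakdown is made of, here is a minimal sketch of a per-term tf*idf calculation. It assumes the classic Lucene-style formulas (square root of the term frequency, and 1 + ln(numDocs / (docFreq + 1)) for idf); I have not checked that mozDex uses exactly these, and the function names and sample numbers are made up for illustration.

```python
import math


def tf(freq):
    # Assumption: classic Lucene-style term frequency, sqrt of raw count.
    return math.sqrt(freq)


def idf(doc_freq, num_docs):
    # Assumption: classic Lucene-style inverse document frequency.
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def explain(term, freq, doc_freq, num_docs):
    # Hypothetical helper: returns the pieces of a single-term score
    # so they can be inspected one by one, like an "explain" view.
    t = tf(freq)
    i = idf(doc_freq, num_docs)
    return {"term": term, "tf": t, "idf": i, "score": t * i}


if __name__ == "__main__":
    # Made-up numbers: the term occurs 4 times in the page and
    # appears in 9 of 1000 indexed documents.
    for key, value in explain("seomoz", 4, 9, 1000).items():
        print(key, value)
```

A real engine multiplies in more factors (field boosts, length normalization, link-based scores), so treat this only as the core of the calculation.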
The second bot belongs to Majestic-12, a distributed search engine project (similar to SETI@home and many other distributed computing projects). The first highlight is their MJ12search (currently in alpha) - on the right side there is a select list of 9 possible algorithms for comparing search results. I personally enjoyed the best relevancy algorithm most, but have very little to complain about with the default algorithm either. IMO this is a good indicator of how large a role algorithms play in search results.
Another hidden gem inside Majestic-12 is ConAn - a visual content analyzer tool. It shows some interesting results on how search engines can determine which blocks of a page are more valuable than others - a technique often discussed in search engine white papers.
As always, feel free to comment on this topic and suggest other "new generation" search engines with cool / SEO-friendly features.
Keijo - That really is impressive. Seeing the broken-out pieces of the algorithm come together to give a relevancy score is something that's not typically a part of SEO. It's cool to see the tf*idf equation right there in the explanation, too. It leaves little doubt that keyword density is a complete joke.
One question - what does the number 859,804 represent?
The Nutch documentation has a pretty good summary of what they use in scoring, and there's a direct link to the API docs that explain the functionality in detail.