I've been watching a lot more of the Dr. House show, and becoming a bit obsessed with some SEO mysteries of my own. Since our last outing into diagnosis went so well, I thought we'd try again. I haven't got either completely solved, but I feel pretty good about some hypotheses. Let's see how you do:
Mystery #1: Google SERPs for Web Crawler
Ranking in position #8 is the real estate site "www.mckinley.com" with neither the word "web" nor "crawler" on the page or in the URL. Why is it there?
Mystery #2: Google SERPs for 4:20
Don't ask me why I was searching for it, but that 7th result is peculiar. What causes it to rank at Google, but not Live or Yahoo?
Looking forward to your analyses in the comments, and yes, if SEO were a class, this would be extra credit. :-)
Rand,
The first thing that caught my eye about the post is that Google is picking up the date from the article "Apr 20, 2008" as being close to 4:20. This could be an indication of changes being made to how they view semantics.
With Google recognizing the date as being similar to 4:20, in combination with the URL containing 4 20, this could be the case.
Good post Rand.
good spot!
It's lupus.
I said that last time...
It's NEVER lupus.
And I'm more and more convinced that this House fixation really IS the secret origin of Whiteboard Fridays.
Either that, or the Moz crew just really enjoys the smell of those dry erase markers.
Well, that one time it WAS lupus. And I thought they'd make a bigger deal about it.
Rand -
For #1
Check https://web.archive.org/web/20031231161926/https://www.mckinley.com/
Probably lots of backlinks on the web still point to Mckinley.com for the keyword webcrawler.
For #2, x:y format is treated as a date format. For example 10:20 gives a Wikipedia entry for October 20th - https://en.wikipedia.org/wiki/October_20
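To make the date-format idea concrete, here is a minimal Python sketch of how a query token such as 4:20 or 04/20/2008 could be read as a month and day. This is purely illustrative: the pattern, the `month_day` helper, and the US-style month-first reading are all assumptions, and nothing here claims to reflect how Google actually parses queries.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: 1-2 digits, a separator, 1-2 digits,
# optionally followed by a separator and a 2- or 4-digit year.
_DATE_LIKE = re.compile(r"^(\d{1,2})[:/\-. ](\d{1,2})(?:[:/\-. ](\d{2}|\d{4}))?$")


def month_day(token: str) -> Optional[Tuple[int, int]]:
    """Return (month, day) if the token can be read as a month-first date, else None."""
    match = _DATE_LIKE.match(token.strip())
    if not match:
        return None
    month, day = int(match.group(1)), int(match.group(2))
    # Loose range check only; it does not validate the day against the month.
    if 1 <= month <= 12 and 1 <= day <= 31:
        return (month, day)
    return None


for token in ["4:20", "10:20", "04/20/2008", "4 20", "4-20", "13:45"]:
    print(token, "->", month_day(token))
```

Under these assumptions, 4:20, 4 20, 4-20 and 04/20/2008 all collapse to the same (4, 20) pair, which is the kind of equivalence that would let a page dated April 20 match the query.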
Wow -- good spot! If I were teaching the SEO class I'd give you extra credit for sure and probably some extra recess time!
Think you might have nailed this one Atul - good spot looking in the archives. It was definitely something to do with anchor text of backlinks... as Google mentions when you click on the cache for the 1st result: "These terms only appear in links pointing to this page: web crawler"
Hate to spoil the suspense, but I think it's worth covering why that NYTimes article ranks where it does as well (great work to all those who uncovered the Magellan/McKinley connection).
The NYTimes piece has 04/20 in the URL string, but that alone isn't what does it. Looking in Linkscape, you can see that more than 30 different domains link to the URL using 04/20/2008 (or in some cases 4 20) as the anchor text. A few articles from a single press publication - FAIR - https://www.fair.org/index.php?page=3361 (notice the anchor link near the top of the story) - seem to have been picked up and really spread that anchor text around.
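For anyone who wants to run the same kind of check on their own link export, here is a rough Python sketch that counts how many distinct linking hosts use each anchor text. The `domains_per_anchor` helper and the sample rows are hypothetical placeholders, not Linkscape output, and the sketch groups by host name (with a leading "www." stripped) rather than by true registered domain.

```python
from collections import defaultdict
from urllib.parse import urlparse


def domains_per_anchor(links):
    """Map each normalized anchor text to the set of distinct linking hosts."""
    result = defaultdict(set)
    for source_url, anchor_text in links:
        host = urlparse(source_url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        # Normalize case and collapse runs of whitespace in the anchor text.
        normalized = " ".join(anchor_text.lower().split())
        result[normalized].add(host)
    return result


# Hypothetical sample rows: (linking page URL, anchor text).
sample_links = [
    ("https://www.example.org/story", "04/20/2008"),
    ("https://example.org/other", "04/20/2008"),
    ("https://blog.example.net/post", "04/20/2008"),
    ("https://example.com/a", "4 20"),
]

counts = {anchor: len(hosts) for anchor, hosts in domains_per_anchor(sample_links).items()}
print(counts)
```

Counting distinct hosts rather than raw links matters here: two links from the same site count once, so a single publication linking many times does not inflate the number.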
Takeaways:
I see that the thumbs count above is relatively small, but I'm a huge fan of these diagnoses, so maybe I'll give it another shot in the future and see how it fares. If you don't like this kind of post, feel free to let me know or let me know what you'd like to see in them to make them more interesting/valuable.
Maybe the lack of thumbs is tied to your lack of crutch, meds, and meanness.
Hi Rand,
I found the conversation and comments around this post very helpful. I would love to see more posts like these in the future! Perhaps a "Dr. Rand - SEO Diagnosis" series could be in the works?
--Eric
I think you wrote a good post, and it is clear to see that people do search to find out the solution.
Keep posting those type of 'problems' Rand.
I was a project manager at the McKinley Group and we had a search engine called Magellan. We were purchased by Excite at the same time as Web Crawler and a couple other web directories. Coincidentally it was my project to consolidate these directories into one directory product to remarket to AOL, Microsoft, and several other "portals" (remember them?).
It must be ranked so high because of the multitude of press releases and coverage issued at the time of the acquisitions and, subsequently, the re-release of the new directory product under Excite. It was the largest online directory for a couple years with far more nodes and categorized websites than Yahoo! long after Excite@Home collapsed in 2001-2002.
Also, don't forget that Hugh Laurie hosts "Saturday Night Live" this weekend.
You secretly just want us to use Linkscape don't you?
I would say it's the same reason why disney.com ranks #1 for "leave": lots of inbound links to Disney from the porn sites asking you to enter or leave.
Hi Rand, I found a SUSPENDED website in a Google search in my local language for jual rumah (house for sale). It ranked #8 with no jual (sale) or rumah (house) on it. How could such a SUSPENDED website rank on page 1?
I ran a report in Linkscape and found this website is well optimized for jual rumah (house for sale) through its backlinks, but I really disagree with a SUSPENDED site showing on page 1 as a result. Some of its backlinks are SUSPENDED as well.
Try a Yahoo! search for
linkdomain:mckinley.com -site:mckinley.com "web crawler"
et voila, 1660 results. Couple that with their 44,500 total external links and 12-year domain age, and to Google, they must be an authoritative reference for "web crawler."
For example #2 I have seen some strange behavior on Google in the last two weeks.
Two weeks ago a client's 3rd-party blog was shut down. Within a couple days it jumped from #18 to #4 for the brand name and has since settled in at #14 (higher than its original position). But it has no related content on the page and no use of the name, just a "This account no longer exists." Even stranger is that this page is still #4 on Yahoo!
The same client had a similar issue this week. A large domain used their brand name once in a page with little content and they made the first page for a day, and then disappeared.
I suspect that the last release of the search algorithms produced overvalued URL elements in the absence of other indicators.
Magellan is an old search engine which was released by The McKinley Group.
mckinley.com was the web address for the Magellan search engine.
There's tons of links with the word "Magellan" to mckinley.com - that is why it ranks for "Magellan"
It appears that in 2002 mckinley.com got 301d to webcrawler.com
From 2002 to 2004 it got many links with the words "web crawler" - that is why it ranks for "web crawler"
In 2004 it seems the 301 was removed.
I'll go for #2. People might have already replied with the answer above, but I'm just checking if I've still got it in me.
NYTimes is an authority domain. 4:20 is also equated to "4/20", "4-20", etc., which is available in the URL, plus it appears exactly 21 times, and close to 200 times each for "24" and "4". It's got around 219 backlinks to the same page itself.
Also, since Google search converts kgs to lbs and miles to kilometers, and usually ranks those on top, I think that is the tool which kicks in. Search for "20 April" and the same link achieves a higher ranking.
So unless you were looking for cannabis, I think it's a pretty relevant search.
If this can't get you the ranking, I think you'd better call up Rand and buy his services :P
Rand,
Along your line of thinking, check out... "amsterdam"
https://www.google.com/search?sourceid=navclient&ie=UTF-8&rlz=1T4GGLL_en&q=amsterdam
how many different types of results do you see here?
As long as you give us the answer...I think these types of posts are great. I really feel involved :)
Re: #1,
The Mckinley domain used to be web-related. Not sure when it changed hands (not all that interested either), but the previous use was highly related to web crawlers. Many of their backlinks are from sites awarded as "3 Star" by Magellan (yeah I haven't heard of it either).
Example:https://www.users.fast.net/~mhmyers/3star.html
Many more instances where that came from
Likewise, mckinley.com is ranked highly for [magellan].
cf. https://www.searchengines.net/magellan_se.htm
Ha ha, I like the lupus comment. If I had to bet on one of the above answers I would probably go with Atul's suggestion.
But hey guys, sometimes it may have little to do with straight up mathematical logic and more to do with Google deciding to un-sandbox something during a randomize in order to see if real humans will vote for a particular page.
So for that reason, across the Billions of queries done weekly, there must be millions of results in the serps that don't appear to have a mathematical reason for being there.
I know this because my Google webmaster stats sometimes tell me that my site ranks for single keywords like marketing (yet I can never find my page even on the 100th or 200th position on Google).
Yet, MS Live ( bless M$ for something ) keeps sending me actual traffic based on the one general keyword. I thought that Yahoo loved blogs, but M$ appears nearly fanatic about regularly updated blogs.
Anyway, one sees this sort of strange thing in the webmaster stats all the time. Google claims that your site ranked on page one for blah blah (though no one clicked)... you run over to see if it's true and waste the next half hour patiently clicking through tens and tens of pages looking for your stuff... Ms G has wasted a few half hours of my time like this... but not anymore ha ha
So perhaps in some cases, there is no mathematical logic beyond, Ms. G just decided to randomize a section of the results?
We were not told whether multiple searches on that datacentre brought back this exact same anomalous looking type information in that spot.
These are great examples of the imperfection that is inevitable in any algorithm, and how black-hat methods can skew the results.
What I really enjoy about this industry is the challenge to understand the competitive landscape.
One of my clients is a law firm in the dreaded "mesothelioma" market. They are one of our biggest clients both because a single client means a huge payout, and also because the industry is rife with black-hat SEO.
I'll never forget the time I found that several of their competitors were coming up high in the SERPs because they had set up free page counter sites, and got tens of thousands of unsuspecting site owners around the world to embed that free counter on their sites, with hidden anchor-text embedded links to those mesothelioma sites...
As someone who works a lot with clients having a regional market, I routinely need to explain to them that just because they saw a competitor come up on the first page of Google when that competitor had a one-time Craigslist ad with two sentences, this doesn't mean they don't need real content or depth on their site...
Addressing #2... it's wishful thinking on behalf of the Pentagon and NY Times staffs.
just think it goes to show the importance of the url.
This post proves how important anchor links are.
Rand it is pretty simple!
Those are examples of document ID collisions. It happens once in millions, but it happened.
Actually, that makes a lot of sense, ZoranSa... I have just made the same search but in google.co.uk:
https://www.google.co.uk/search?hl=en&q=4:20&start=10&sa=N
then moved onto the 2nd page in the SERPs and found a lot of news and articles webpages where the pattern occurs....
It seems like the last 4:20 result was shown because it was published on April 20. Maybe Google gives special meaning to the date in news items.
Shimrit