Try a few searches with me...
Buy Viagra
Cialis Online
Cheap Payday Loans
viagra cialis credit card mortgage payday loan ringtones (yeah, yeah, I know... just stick with me)
There are some characteristics here that reveal a lot about Google's ranking algorithms. Here's what I take away:
- Trusted domains are excellent places to store nefarious content, because Google gives content on these domains the benefit of the doubt (and has for at least the past 3 years)
- Reversing the links on these pages via Site Explorer shows that many low-quality links (blog, comment and link injection spam) from relatively trusted domains combine with crappy link farms to produce these short-term high rankings.
- EDU pages and links may not be "inherently" better able to rank, but they certainly have a high propensity to earn the kind of trust necessary to perform well.
- Google's ability to detect hidden content still isn't 100%, and high rankings can be earned by spam content and links on domains like cornucopia.org (view source and search for "viagra") and feelphones.com (same thing).
Domain trust is, in my opinion, the biggest tool Google has been able to leverage to fight spam and grow relevance. It's the reason why content on Wikipedia outperforms the same content on virtually any other domain and why barely relevant pages from trusted sites will often outperform more targeted material on smaller, new domains. This isn't just something that's visible in web spam searches, but in the normal SERPs as well.
Recently, jamersan posted a YOUmoz entry about Google's .com search results. I like to use some similar searches to find out some of the domains the search giant appears to trust most highly:
- www site:.com (which illustrates which .com websites are most relevant to "www," essentially stripping semantic intent from the query)
- www site:.org
- www site:.edu
- www site:.gov
- www site:.net
- www site:.co.uk (try it on any country, and you're likely to get a list of the most "important" domains with that TLD)
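If you want to run the whole list in one go, the queries above are easy to generate. This is only a rough sketch: the `TLDS` list mirrors the bullets above, the function names are made up for illustration, and the search URL pattern is an assumption about how the query gets passed to the engine rather than anything official.

```python
from urllib.parse import urlencode

# TLDs from the list above; add any country-code TLD you want to inspect.
TLDS = [".com", ".org", ".edu", ".gov", ".net", ".co.uk"]


def trust_query(tld: str) -> str:
    """Return the 'www site:<tld>' query string for one TLD."""
    return f"www site:{tld}"


def search_url(tld: str) -> str:
    """Build a search URL for the query (the URL pattern is an assumption)."""
    return "https://www.google.com/search?" + urlencode({"q": trust_query(tld)})


if __name__ == "__main__":
    for tld in TLDS:
        print(search_url(tld))
```

Opening each printed URL in a browser gives the same result pages as typing the queries by hand.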
The results of these queries match up very well to the highest PageRank, most linked-to, most referenced domains on the web. If you're seeking to become a true authority online, I think this is some of the better competitive intelligence material available. These domains fit Google's profile for trust and authority extremely well.
While domain authority may seem like an unreachable goal for many small and mid-size site owners, there are some very big takeaway lessons from a dive into Google's algorithmic dependence on domain weighting:
- It's almost universally wiser to put all of your content onto a single domain, where every link earned can reinforce and bolster the ability of all content on the domain to rank better.
- Links from trusted domains may be important, but if you've already got a trusted domain, it doesn't take much external link juice for Google to consider your content relevant
- Age correlates fairly well with domain authority - there are very few domains younger than 5 years old on any of the top 100 .com/.net/.edu/etc lists
- Leveraging "rented" pages or placing relevant listings/content/links on high-trust, high-authority domains might be an excellent strategy to help achieve visibility in the short term (I don't mean spam injections, I mean legitimate placement/submission/ugc)
- Spending a lot of time in strange search results can bring a lot of extra SEO knowledge (I blame it for that 6th sense many SEOs develop about rankings over time)
Spam isn't always annoying and negative. Sometimes, we can even learn from it :)
p.s. I can't resist showing off one more spam discovery - one of the spamming sites, bad-credit-advisor.com, has a sitemeter link to this page - www.sitemeter.com/?a=stats&s=s21badcredit - which shows both the incredible effectiveness of their link manipulation campaign and just how much traffic #1 positions at Google generate for terms like "cheap payday loan" and "credit dispute letters." Checking their link profile at Yahoo! is especially interesting as well - lots of EDU, Yahoo! directory, and low quality link lists on there. I think I'm just fascinated by black hat SEO in general :)
Did anyone else notice that the only result that Rand seems to have actually clicked through to was for the Cheap Payday Loans? Perhaps a sign that he's in need of some quick cash ahead of the upcoming wedding?
Still, I suppose that the fact that he didn't need to go to one of the viagra sites is a good sign at least ;-)
Noticed that, too.
Don't do it, Rand! It's a scam! ;)
Why do you have to RAT sites out all the time? Bad karma. I know it's tempting to write about these results, because it is in fact how most black hatters start learning the tricks of the trade, but why not do a black hat site yourself, test it, rank it and write about it then?
You may learn something from your own experience instead of assuming and getting someone else's site banned.
Oggy - my experience from watching these SERPs is that rankings don't stick around much longer than 72 hours, so I don't feel too bad. I'm also of a mind that Google certainly doesn't need me telling them stuff like this - they hire some of the top minds on the planet who work on this problem day and night. The spam/search quality division has dozens if not hundreds of people who are very well informed about these issues long before I brought them up.
I'm not a pearly white hat who thinks it's evil or wrong to spam, but I'm also certainly of the mind that those who do so publicly or have visible success (especially for such obvious query strings) wouldn't be surprised at all to see their domains mentioned in blog posts about learning from web spam.
BTW - I know I've been a little softer on this issue in the past, but with a lot of the research projects we're working on, I expect to be showing off and exploring a lot more web spam in public ways. The "honor among thieves" clause doesn't carry considerable weight with me.
(however, I am thumbing you up for bringing up an important issue, and one I might have been wise to address in the post - thanks for that!)
While it is true that Google manually removes these results regularly, showing Google's "dirty laundry" to the public forces them to take action much quicker than they otherwise would have.
From my experience, and by having tracked similar SERPs every day for quite some time, these results will just be replaced by other spam sites. The reality is that Google cannot really keep up, and 72 hours in the #1 spot for some terms could really make thousands of dollars.
To me the point is this: Does the site actually offer what it ranked for? When you look for "buy viagra" with the intention of buying, and some BH spammer got the 4th result, and offers you a site to purchase it, are these really bad results? It does not really matter how he got the ranking. To me it is a relevant result, and the BH that worked hard (or smart) to get the ranking deserves to get a cut for his conversion.
In reality, yes, Google has a huge webspam team diligently working on these things - yet they haven't seemed to figure out that if a page has external links from the Internet, but not from the root domain, it's likely a parasite. Or at least that it shouldn't automatically benefit from the "trustrank" of the domain. Shouldn't be hard to write that case statement, should it? If they don't need our help they don't need our help... until M.C. asks for it through their "spam reports".
"yet they haven't seemed to figure out that if a page has external links from the Internet, but not from the root domain, it's likely a parasite."
How often does that happen, really? Plenty of forums don't link out to new subject threads from the home page. Are those parasite pages? NO!
Not from the root "of" the domain,
but from the root domain. Somewhere on it.
Are you saying there are forums that orphan their own pages? I doubt it.
Oggy's right, Google should be able to tell a page is orphaned from a domain, and so should not inherit trust-rank.
But Google has had its trust-rank dial cranked up way too high while failing to take this into account for ... I dunno, about three years now. I don't see that as progress on behalf of their web-spam team.
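For what it's worth, the check being argued about in this thread can be sketched in a few lines. This is purely illustrative and says nothing about how Google actually works: it assumes you already have a crawl as (source, target) link pairs, treats "same domain" as "same hostname ignoring a leading www." (a simplification that ignores subdomains), and the names `host` and `likely_parasites` plus the `.example` URLs are invented for the example.

```python
from urllib.parse import urlparse


def host(url: str) -> str:
    """Hostname of a URL, lowercased, with a leading 'www.' removed."""
    h = urlparse(url).netloc.lower()
    return h[4:] if h.startswith("www.") else h


def likely_parasites(links):
    """Return pages that are linked from other hosts but never from their own.

    links: iterable of (source_url, target_url) pairs from a crawl.
    """
    internally_linked = set()
    externally_linked = set()
    for src, dst in links:
        if src == dst:
            continue  # a page linking to itself proves nothing
        if host(src) == host(dst):
            internally_linked.add(dst)
        else:
            externally_linked.add(dst)
    # Externally linked, but orphaned by its own domain.
    return sorted(externally_linked - internally_linked)


# Hypothetical crawl data.
sample_links = [
    ("http://university.example/", "http://university.example/admissions"),
    ("http://spamblog.example/post", "http://university.example/admissions"),
    ("http://spamblog.example/post", "http://university.example/~old/pills.html"),
    ("http://linkfarm.example/a", "http://university.example/~old/pills.html"),
]
flagged = likely_parasites(sample_links)
```

Here only the `pills.html` page is flagged, because nothing on its own host links to it. As the forum objection above suggests, a real system would need more than the home page's links to decide this, which is why the sketch counts a link from anywhere on the same host.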
"Are you saying there are forums that orphan their own pages? I doubt it."
Nope, never said that. Just was trying to understand what you meant by root. When people talk of root they almost always mean the home page, not "somewhere on it".
Analysis of spam sites is valuable and fascinating. This is an analysis of who is ranking for some of the most competitive terms on the internet.
Everyone knows "buy viagra" is a huge spam target, and if Google is going to pick a term to start testing anti-spam measures, that would be the best place to start. As such, it's also the best place to learn what factors will cause their algo to rank sites it should obviously be kicking out of its index.
I've got no tears to shed for those who fill my blogs with their PPC links. I don't buy into the idea of the starving spammer just trying to feed his kids. Blackhat is by definition not playing by the rules, so why cry foul?
playing by the rules? Whose rules? Google's?
Who is Google to set the rules? Let them ban and deindex my sites as they see fit... it's their site, and it is the only thing they can do to blackhatters... I'll have something up there again in no time to replace it. There's nothing illegal with blackhat if you're using techniques that don't hack websites.
And I missed the comment Rand made: "I'm not a pearly white hat who thinks it's evil or wrong to spam"
And then says: "The "honor among thieves" clause doesn't carry considerable weight with me."
So which is it?
You all don't seem to understand blackhat. There are methods that don't require theft of content or hacking, (mad libs sites, database driven sites, the list goes on and on). Links are there for the taking on many sites, we just know how to automate this. Google just eats it up and puts it in their index.
If it "dirties" Google's results it's their problem for having a faulty algo. Google wants one thing: "to make money" and that's what I want, they just make it easy for me to get there.
If there was no content network in Adwords (aka adsense from the other end), 80% of blackhat sites would disappear. But Google makes a significant amount of their revenue (more than half), from spammy blackhat sites out there.
Why are some of my sites de-indexed from organic results, but not from my adsense accounts? I get plenty of visits from other referred domains and the ads in those "banned" sites make me AND Google money... And the "big G" just eats it up looking the other way.
Just saw the related discussion yesterday...
It's curious that we are only talking about this now. Is it a new black-hat technique? No. Blogger, for example, has always been an excellent place to spam and win...
Anyway, that's great proof that domain authority is still a major ranking factor.
That was an eye-opening discussion. I thought it was amusing Blogger was at the top of those results, now I understand why.
Look at how much of it is user generated content. Blogs, forums and social networking type websites.
Rand
I love the www site:.com technique. It is a very interesting look at domain trust. Odd to see Alta Vista and Lycos in that group, even though they are certainly old enough.
Still, I began to balk at your interpretation of the meaning of these results when I saw the lower half of the list. Annual credit report was also a bit of a surprise, but Neopets? ST Microelectronics, Quiznos? Can these really be in the 100 most trusted domains?
Do you think it might also be sites most linked-to with www in the anchor text? Something else? I definitely don't think it's ordered from "most important" to "least important," but I do think it's a relatively good illustration of importance in general.
I think we are seeing randomization by Google to obfuscate the results of the site: query.
Aha! Other possibilities...
And yes, Google's Search Quality teams do investigate spammy sites, in a lot of automatic and manual ways, to get a hold of spam networks. When the spam networks are found, they take the appropriate action against them.
interesting use of the www search there - being from Ireland, I tried it with .ie and found the results to be quite familiar, and reasonable (not too much spam in ie yet). I wonder if using htaccess to redirect www to non-www form would affect rank in such a search...
That is a damn fine question. Any opinions? Better yet, any authoritative answers?
On this in-depth analysis... I just wanted someone to throw some light on the last point, about 'unpopular search engine bringing extra SEO knowledge'. What kind of 6th sense is Rand pointing towards? Thoughts?
This is slightly OT, but another consideration is that in a category that has negative connotations to begin with, this kind of activity actually damages the reputation of the entire industry.
It makes it really hard on legitimate, reputable companies who have brand image concerns (and thus must stick to white hat tactics) to compete for terms for which they should legitimately rank well. Did anyone else notice that the manufacturers of Viagra and Cialis don't appear in those results? They own the trademark for those terms--yet that counts for nada in the SERPs.
People want to purchase pharmaceuticals, payday loans and any number of services and products that are prone to SERP spamming because people like the anonymity of shopping for that category online. And they'd probably prefer to purchase them from the most legitimate, trustworthy source possible.
So the question then becomes, "Why the huge disconnect between the sites I would trust as a consumer, and the sites Google deems 'trustworthy'?"
Hi Rand
Do you have any views or thoughts on how long it takes (if it happens at all) for domain authority to pass from one domain to a new one, assuming the individual URLs have all been 301 redirected properly etc.?
I ask because we recently migrated from a domain with very good apparent domain authority to one with apparently very little. We've no idea how long (if at all) the domain authority will transfer.
The full story of the migration and negative SEO impact so far is at https://econsultancy.com/blog/3244-econsultancy-site-migration-and-seo-impact-the-story-so-far for anyone interested.
Regards
Ashley
Ashley Friedlein
CEO
Econsultancy.com
...
A search for 'buy viagra' on Google now, 24 hours later, shows vastly different results from the screenshots posted above.
This could mean that Google is actively monitoring specific queries in their continuous attempt to algorithmically fight spam.
Translated: Matt Cutts' team knows these queries exist and allows blatant spam and hijacking of domain authorities & major university websites in order to better understand spammers.
Does Google play favorites? Of all the university .edu domains to be compromised, has stanford.edu ever ranked for 'viagra'?
Hey Rand,
You can find lots of spam sites in every drug-related search, like Viagra online, order Cialis, order Levitra, buy Acomplia, etc. In the SERP for the “buy Viagra” keyword, I found that only 14 of the websites in positions 1 to 50 are genuine; all the others are spam websites.
I also found that many .edu and .gov sites have been hacked, with the spammers placing their order processing on those sites.
Some time ago I found that only one kind of website came up in positions 1 to 10.
You can see these images:
https://jwesly.stifen.googlepages.com/spam.jpg
https://jwesly.stifen.googlepages.com/spam1.jpg
Those websites were redirecting to other websites using JavaScript. They have now been removed from the SERP.
I have also sent spam reports to Google many times, but nothing changed.
Can you suggest what I should do now?
Don't take this personally but:
How about not wasting your precious time reporting spam sites and doing Google's monkey work for free, and trying to make some money on your own instead?
;)
Excellent round up of the still current state of Google's algo about domain trust. Fully agree.
Gaining visibility by hosting content on old authority domains is what our clients do - and very successfully, actually. And as their sites age the domain trust from the presell pages transfers to their site over time...
So they get visibility AND domain strength in one go.
I think this, more than anything else, led me to inquire as to whether Google has a metric in place for spammy content within a trusted domain or trusted page.
What I'm referring to is a segmentation analysis, user input, or some other form of isolating spammy content and using its algorithmic scalpel to carefully remove the cancerous spam. Edit to clarify: without penalizing the page or domain that is being attacked. That's the key.
According to my tests and inquiries to Matt Cutts.... the answer is sadly "We're not there yet"
SpamRank TM ftw
The big problem with Google is Google itself! Or more specifically its popularity. Because of the way Google calculates rankings, with its emphasis on links, anyone with the time and in particular the money (money to hire someone else's time) can just throw money at it, and that makes it impossible to police. If Google policed spam effectively I don't think they'd have any resources left to rank sites! Links after all are very easy to manipulate.
The rod that beats Google's back is Google. If they had a more content-centered search engine like Yahoo they wouldn't have such problems. Google will never effectively deal with spam, simply because of the multitude of spamming methods and the amount of spam out there, and if they did they'd be Yahoo!
I'm finding more and more examples of spam nowadays. Not the hard-to-spot kind, but spam farms spread across several sites with very similar names, pulling in news feeds, all with the same content and the same links on every page. Having checked, the domains are all registered to the same SEO company. The issue I have is that the one thing they all seem to have in common is they run Google Adwords on the sites.
So my point is: is Google now filtering how it deals with spam according to how much money it can make out of it?
I was reading an article saying that Google is thinking of introducing a case-sensitive algo. I don't know how true this is. Does anyone have any idea?
I think it's funny that the www site:.com search on Google yields Yahoo.com as the number 1 result, but google.com is on page 3.
Indeed, domain trust is given certain weightage by Google.
But in the above case, there is a possibility that the majority of the sites which appeared in the SERPs were hacked because of weak security on the domains (that also goes for .edu sites, which sometimes suffer because of low-level security measures when it comes to subdirectories). Hence we could see some trusted domains appearing in the SERPs, as certain directories might have been hacked and loaded with the spammy content.
Google actually tackles these kinds of spammy search results quite regularly. There was a time when tons of splogs were found on blogspot, and Google took measures by updating the blogspot development team, who in return removed all the spammy splogs.
So rest assured, guys, Google does take care of spammy SERPs like these in the longer run, and if they find trusted domains filled with spam then the website owners do get a mail from Google stating that xyz spammy directory in their site was found and hence it was a violation of Google's webmaster guidelines. The webmaster in return can file a Re-inclusion Request to get his site back in the SERPs.
Nice work. I wrote an article about page strength and search engine ranking, and I completely agree with you about the page strength game in search engine ranking. But I am a bit confused: you mention these websites are spamming, but they are using absolute links instead of relative links, which is not considered spam by Google as far as I know. Maybe I am wrong.
are you saying you don't think it's spam just because they use absolute links? i'm not sure i follow. if they're linking out doesn't it have to be an absolute URL? so it's only considered spam if you're using internal relative linking? i don't think that's right.
Well, if you check the bad credit advisor website's code, the only thing they are "spamming" is that they use absolute links for internal linking instead of relative links, which will slow down their website's refresh and download time... But I am still not sure about relative and absolute links, or whether you think we should not use absolute links for internal linking. I agree you have to use absolute links when you're linking to an outside source. You cannot link out without an absolute link, can you?
that was exactly my point. you can't link out with relative. and my opinion is that it's better to link internally with absolute urls. i haven't a clue why that would be considered spamming by anyone.
Well, with base href you can (kind of) link out with a relative link...
(but it still has nothing to do with detecting (or being) spam) :S
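To make the base href point concrete, here is a small sketch of how a relative link resolves with and without a base URL. It uses Python's standard `urljoin`, which follows the usual URL resolution rules; the `.example` hosts and the variable names are hypothetical.

```python
from urllib.parse import urljoin

# URL of the page that contains the link (hypothetical).
page_url = "http://my-site.example/blog/post.html"
# Value of a <base href="..."> tag in that page's head (hypothetical).
base_href = "http://other-site.example/"

relative_link = "offers/index.html"

# Without <base>, a relative link resolves against the page's own URL.
without_base = urljoin(page_url, relative_link)
# With <base>, the same relative link resolves against the base URL instead.
with_base = urljoin(base_href, relative_link)

print(without_base)
print(with_base)
```

The second URL lands on a different host even though the link itself is written as a relative path, which is the "kind of" linking out described above.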
I've been through Matt Cutts' blog and guess what: I found exactly the same thing the bad credit website is doing. So can we say Matt Cutts is spamming? ;)
If you check the source of Matt Cutts' blog you will see. I could paste it here, but I thought it would be too much, and you know how to check, so you can check it yourself.
I spent an hour yesterday looking at the exact same thing. Searches for "buy viagra" are just littered with hacked sites. A manual edit by Google could find and fix so much spam if they looked at this once a week.
It could but they really don't want to go down the route of having to manually edit their results and are probably spending more time looking at automated options that won't screw up results for other, irrelevant searches.
Deleting a spam site isn't manually editing. They could delete the site and then find all the links to it and wipe out entire networks easily.
Having someone go out there and remove a site they have found, rather than having an algo detect it and deal with it... is manual. You even stated that in your first comment.
I think what a lot of people don't realise is that it's not simply a case of finding a site, removing it and whoever links to it.
The links could easily come from hacked websites where the owners don't know they are linking to a site or forums that have been spammed.
I don't see MIT or Stanford in the top 100 .edu results.
Could be personalized results though because I do a ton of .edu searches.
regarding the top tld searches, sometimes this research makes total sense and then there are others that leave me thinking a big "huh?"
For example: .co.uk lists www.amywinehouse.co.uk/ as one of the top 10 co.uk tld's.
Really google, wtf?
There are some of those on the .coms too. I think it partially has to do with popularity. Think about how popular she has been in the media recently.
If I had three names (.com, .co.uk and .net), which two should I redirect, and why? Does the extension matter? If yes, how much, and which one is better in which case?
That actually kind of makes sense though. She is constantly the focus of practically every celeb gossip site (and often times they will post a link back to her webpage). Talk about the power of social media! Amy Winehouse and Franz Ferdinand both rank higher than Google's own local UK search engine according to my search!
Think "cultural phenomenon". Not having results like that would mean the SERPs are out of date.
Domain trust might also give some insight into who can better get away with buying their way to the top. Numerous large corporations with aged domains that are buying lots of competitive SERPs...
Great stuff as always Rand. Thanks for including the discussion of how this applies to small/mid sized sites.
We're dealing with this issue right now. Between Clipmarks, Propellor, Blogger, etc., it is really hard to rank #1 for a product release during the first 72 hours. The second the product is released every spammer reposts the content and/or links on such sites.
Very frustrating and yet completely understandable. They want to win just like we do. It's all about competition.