I'd like to preface this entry by saying SEOmoz does not practice or endorse blackhat SEO techniques. This is not intended to be an instructional blackhat SEO article or a list of websites you should all go take advantage of. The goal of this post is, rather, to "out" a significant weakness that can be exploited by savvy users.
While reading EGOL's recent post on .gov links, I started brainstorming possible ways to creatively acquire a few .gov links of my own. Thus was born my first foray into the world of blackhat SEO. About a year ago I heard about how webmasters were all running scared because malicious users could easily place HTML code into their form input boxes and manipulate the markup on their sites (aka XSS). I was curious as to how difficult this actually was, so I decided to investigate.
After running a Yahoo site: command I was able to get a list of search forms from hundreds of .gov sites. I used the web developer toolbar to convert HTML form methods from POST to GET, making the search results link-able, inserted a few HTML tags into the search boxes, and voila: I had 20 links from .gov websites pointing to my site. Once these pages were created, in theory all I'd have to do is link to them from various other domains and they'd eventually get spidered and start passing link love.
In the list below I only linked to www.example.com (a domain reserved for documentation - RFC 2606) and used the anchor text "Look, I made a link" to make the links obvious to spot. The list below shows the compromised pages:
- Environmental Protection Agency
- United States Department of Commerce
- NASA - This one was a bit tricky, I had to throw in some extra markup to make sure the HTML that was rendered wasn't mangled. In the end I managed to get a link embedded inside a giant h1 tag.
- The Library of Congress - I even added an image of a kitty wearing a watermelon helmet
- US Securities and Exchange Commission
- Official California Legislative Information
- US Department of Labor
- Office of Defects Investigation - Their website is the only thing that's defective :)
- National Institutes of Health
- US Dept of Health & Human Services
- Missouri Secretary of State
- California Department of Health Services
- Hawaii Department of Commerce and Consumer Affairs
- IDL Astronomy Library Search
- US Department of Treasury
- California State Legislature
- Office of Extramural Research
- Health Information Resource Database (health.gov)
- United States Postal Service - I had to jump through a few advanced forms before I found one I could use.
- North Dakota Legislative Branch
Many of these URLs ended up being very long and gnarly, possibly discounting any value they might pass.
I see a few possible solutions to the problem (assuming this is a problem):
- All those sites need to be informed of the exploit and start validating form input. Unfortunately this is just a quick list I put together this evening; I'm sure there are thousands (if not millions) of sites out there that are vulnerable.
- The SEs need to de-value links that are found on a site search results page (perhaps they already do this?). These exploits aren't limited to search results, however; you could do this on any HTML form that wasn't properly validating input.
- The SEs could greatly de-value links that aren't linked to from the rest of the site. These injected pages are essentially "floaters": pages that are not linked to anywhere on the site but have incoming external links. Do the SEs already do this?
- De-value pages that contain HTML in the URL (both encoded and unencoded), particularly if it contains A tags.
- Disallow indexing of any forms via robots.txt or a meta tag. Again, this would require work on the .gov webmasters' part and changes are probably made at the speed of molasses (assuming the web departments of the government work as slowly as the rest of it).
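To make the first suggestion concrete, here is a minimal sketch of escaping a search query before echoing it back into a results page. The function name `render_search_heading` and the heading markup are assumptions for illustration only; none of the sites listed above are known to use this code.

```python
from html import escape


def render_search_heading(query: str) -> str:
    """Build a 'results for ...' heading with the user-supplied query escaped."""
    # escape() converts <, >, & and (with quote=True) both quote characters
    # into HTML entities, so injected markup is shown as text, not rendered.
    return f"<h1>Search results for: {escape(query, quote=True)}</h1>"


if __name__ == "__main__":
    injected = '<a href="http://www.example.com">Look, I made a link</a>'
    print(render_search_heading(injected))
```

With escaping in place, the anchor tag typed into the search box would show up on the page as literal text rather than as a working link.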
What do you all think? Would these injected links pass link love or is this simply something that search engines already account for and is a non-issue?
SEO aside, this could also be used for phishing scams. For example, an attacker could build a fake payment form on the nasa.gov website asking for $100.00 for whatever reason. The form would then POST to another server, the payment data would be stored, and then they'd get forwarded back to another exploited nasa.gov page with a "thanks for your payment" message. The user would never know they'd been duped - frightening to say the least.
UPDATE (from Rand): We originally pulled this post on concerns that it could spark legal issues or create more problems than it helped solve. However, after consultation with several folks, we've decided that sweeping the problem under the carpet is more detrimental than getting it out in the open.
I have a friend... yeah that's it, a friend... and he recently ran an experiment in a current SeoContest using this little technique.
It's been over a month since 'he' inserted links on various .gov and .edu sites (5 links to .gov and 2 to .edu sites) and I think it is safe to say that Google wasn't fooled.
'He' inserted images, text links and even whole articles complete with links... but apparently didn't notice so much as a hiccup from the search engines.
A year ago this may have been a different comment... now-a-days, I recommend finding better ways to get links from .gov sites...
What about Yahoo and MSN? No love from them either?
G-Man
I believe you, which is why I've been calling people out to fix their broken wikis, tikis, blogs and guestbooks.
It's out of control and being left filling up the net with so much garbage that I think the only solution is for the search engines to start dropping spammed pages from their SERPS, not the whole site mind you, just the spammed comment pages.
Maybe that will get people to pay attention when all their pages go POOF!
FYI, I'm not anti-spam specifically, I'm anti-bot. Some bots scrape, some bots spam, I want them all to drop dead.
Worse g-man...
Yahoo doesn't seem to have appreciated the subterfuge... MSN is indifferent...
but ASK... that is a different story...
The site has been in 1st position in Ask for 40+ days now :)
Of course... I get more traffic from page three in Yahoo than I do in first position in Ask...
A few weeks later the SERPs are rather interesting:
https://www.google.com/search?hs=qQh&hl=en&lr=...
a nasa.gov URL is ranking at #7 for the term (with all the HTML embedded in the URL)
example.com is ranking at #11
The rest of the .gov links don't appear to have been spidered
Those are very interesting results. The Nasa link and example.com in particular.
IMHO, it's always good to discuss weaknesses and flaws, as well as blackhat techniques. This way, you can protect yourself against them.
This way, clients are more educated and will think twice before hiring ANY SEO company.
Anyway, for the faint hearted, most of the pages in those websites (Including NASA's search results) have a disallow directive in the robots.txt. So the pages will not be indexed.
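For anyone who wants to check the robots.txt point for themselves, here is a rough sketch using Python's standard `urllib.robotparser`. The sample rules, the `ExampleBot` user agent and the example.gov URLs are made up for illustration; a real check would load the site's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks a site's search results pages.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://www.example.gov/search?q=test",
                "https://www.example.gov/about"):
        print(url, is_crawlable(SAMPLE_ROBOTS_TXT, url))
```

If the search path is disallowed, a well-behaved spider should never fetch the injected page, so it could not pass any link value.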
Looks like a few of the sites plugged the XSS vulnerability so perhaps it WAS a public service posting after all to make them get off their butts and clean it up.
That's ok. For every hole that's fixed, there's 10 more that will never be touched!
Don't believe me? How many Windows systems are affected every day by viruses that are two years old? :)
Anyhow, keep fighting the good fight Bill. I enjoy reading your rants against spammers.
G-Man
you'd be surprised as to how easy it is to sneak some javascript redirects in there too.
(always looks good when the .gov websites are making doorways to party poker :D )
Today almost all links are broken.
Please fix them
I would just like to mention that after exactly 5 years a couple of the links still work :).
Link Moses Speaks...
Would any self respecting site do this, or is it a tactic used for quick and temporary benefit by porn, gambling and other crap? I think it's funny frankly. The post itself is pure genius, and hopefully wakes up the Gomers that can fix the hole. Please keep the expose's coming. I've been at this 13 years and link building is finally just now getting fun...
Eric Ward
I just found an easy way to get a link from a .gov site. Go to business.gov and click on the "Our Community" section. It is a blog; you can register and they allow a link to your website in the signature area.
I used the SEOMoz.org toolbar to highlight nofollow links, I clicked on it and it showed that the link was a 'follow'. I also hard coded a text link in my main comment, that was a nofollow.
The more comments you make in relevant subject matter, to your business or website, the more links, hopefully on a hot topic.
Rather than linking to example.com you could maybe have tested out how search engines handle this by registering a new domain and pointing the links there.
I suspect most, but not all, of these links are discounted by Google.
If these pages are just floating out there, with no link from the mother ship, my assumption (Probably false) is that they would wind up in the supplemental results and PR would not pass on to the linked website. I read about this a while back as well. I have no proof of anything but I would assume that this is an old trick that Matt filtered out a long time ago. I’m still interested to know if it works or not. Please keep us posted.
yep dave - I agree - It was probably filtered already last year when the first posts about it were on threadwatch
I agree with you too, useless pointless backlinks won't do you any good.
Cool post, Matt!
Axandra actually published a similar piece awhile back:
https://www.free-seo-news.com/newsletter179.ht...
Here is also another one with a different twist:
https://www.free-seo-news.com/newsletter205.ht...
I would say that the search engines are aware of these types of links and are getting a handle on them, if they haven't done so already...
Another thing... if those injected links do pass link love, it wouldn't be much because the pages that they are on are not going to be linked to and co-cited enough to have much quality. Just because it's a link from a .gov site doesn't mean anything if there is no authority to support it.
Google still seems not to be able to detect spam that has been injected into .gov sites. https://www.patroc.com/zzzebmaster/art1.html gives an example which shows that awful spam and malware pages on .gov sites perform better than many quality sites with .com or international top-level domains. Isn't it time for Google to give up their preference for .gov and .edu domains and for SEOs not to focus on these TLDs in their link building strategies?
Very interesting post regarding getting links from .gov sites. I was wondering if you have seen the information about authority codes that Ryan Deiss has been talking about recently. Does using these codes work? I would like to hear your thoughts regarding them.
Thanks,
Dave
Thank you for all the information very interesting :)
Hi All,
Very worrying as a website owner that sites like these are so easy to manipulate. Considering the money at their disposal you would expect them to be more secure and untouchable !!!
As the owner of my own website - https://yournetbizmentor.com - I am now extremely worried as to how vulnerable it is. I have built it on a WordPress installation and hope that this stands up to the unscrupulous hackers out there.
Warmest Regards,
Andrew Potter.
"Think Differently - Think Success !!!"
Excellent info. SEO seems to be a bit of a minefield. Will try out these ideas on my site. Thank you.
Good post. Maybe I can try to get .gov links soon. Thanks
Great post. I guess using some kind of automation to link to these search results might give quite dramatic improvements in SERPs. However, spending time on techniques like this might be focusing on the wrong things.
HI,
This article is indeed very helpful for new bloggers, because high-authority .gov links are very important to get. The .gov links will help you earn the trust of the search spiders and hence raise your domain or page authority.
Anyway, thanks for sharing a quality post; it will surely help newbies learn.
Thanks again.
sunil k
Wow, this article is old but I heard somebody bring this trick up in a recent discussion. I am assuming this is wiped out by now but from what I am hearing it seems like some sites are still vulnerable. Doubt Google would count these links regardless.
Do these practices still work?
Getting links or presell pages from OLD domains is great...I think it demonstrates that there is a lot of moral ethics in the community..
Does .gov backlinking matter now that Google Penguin is out?
I will be happy to share this information with Russian-speaking users. Thank you!
You can get a gofer to do the links for you on compromised pages here: https://www.goferr.com/seo/300/get-20-EDU-Links Not sure how long the links will last though. I was looking for information on how to get legitimate links to my blog from .edu and .gov domains :(
I made some .gov backlinks for my website. Is it black hat? Tell me if it is considered black hat, then I will remove them.
[link removed by editor]
Hi, gibsonjack! That sort of question is best suited for our Q&A section -- we try to keep the blog comments relevant to what the author discussed. Good luck earning those links! =)
I have just read this article. I am really very impressed with the author. I have always tried hard to get .gov backlinks for my blog. This is inspirational stuff. I will try this way for .gov backlinks. Thanks
Thanks for the update Matt - was wondering why the post in my reader had disappeared from the site.
This exposes a pretty ugly side to the web in my view. I look at my own blog logs and shudder a bit when I see all the attempts to 'inject' stuff on to my site. And I'm thankful that the good folks at WordPress keep up to date with all this stuff so I don't have to.
It's a bit shocking to think that these governmental sites are so easily 'hacked'. But webmasters must keep up with these security matters and protect their sites. So I think this is probably something that needed to be done.
I can imagine this post is going to end up being seen by a lot of people. Let's hope it has a positive impact!
WebCart - If you want links it's the same rule for .gov links as it is for regular ones. Have some related content? Go right ahead and contact the admin. of the .gov site and introduce yourself and tell him why it is a good match. There is no magic rule, are you good with people? It's easy to get links, I should do it more often! ;)
Concentrate on the quality, accuracy and usefulness of your content, over time people will find you and these include .edu and some .gov sites. That is if you are not selling Viagra.
this 'exploit' kinda reminds me of that giant google exploit...
you know - the one where you get an old domain, a bunch of backlinks & good anchor text and then can rank your site above others even though it might not be better.
:)
it's all blackhat.
Heh. When I found out about this one I implemented it over a year ago in a php script. It's incredibly easy to do but the value is far less than what it used to be.
Quite often I found more of the actual search pages indexed than I did MY sites because of the XSS links to me - go figure - their pages were outranking me on the keyword I wanted LOL.
As someone above posted - on to bigger and better things.
Wow, now that's ballsy! It's one of the most interesting things I heard all week. I am not looking forward to the spammers who will probably create bots to visit every one of my sites all day to create pre-sell pages for texas hold 'em sites.
Ok so does anyone know how a traditional site can get .gov and .edu links? I have been searching the web, but I have yet to find anything that would lead me in to the right direction.
https://www.seomoz.org/blogdetail.php?ID=130
@webcart: did you also try google for an .edu presell page?
Although I think it DOES serve a public interest, I think it's a little too explicit. I'm all for exposing the weaknesses, and pointing out the easily hacked sites too, and that's just fine. Doing it live on the .gov sites is beyond hilarious, especially the kitty in the helmet.
However, the step by step instructions and links weren't necessary, to prove the point, and do cross a line in my book. Just because it's legal to share the info, doesn't make it responsible or the right thing to do.
By the way, I'm completely unconcerned with the black hat SEO aspect of this, and far more worried about the phishing issues. Links like these are probably worthless these days, but getting the public to bite on the phishhook is far too easy these days.
Scott - your phishing point is a great one... I didn't see any problem till now..
but OTOH - those big "phishers" do know these tricks as well - so hiding this "news" from the public would probably only keep some script kiddies from playing around with it...
Are you kidding - phishers are the ones pushing the envelope by exploiting web technologies for profit. This isn't anything new - but no one should think for a second people don't know about XSS. I see it every day, all day on thousands of servers. Used to deface websites, inject phishing ebay/paypal/bank pages and all kinds of greatness.
I have developed specific mod_security rulesets that stop a huge number of these injects and attacks from client servers, something SearchStudent would probably like to have and many others.
This no longer works really.
Hi folks,
funny to see such controversy here if we should blame rand for pointing out this old stuff or not.
This XSS hack was around 1 year ago? or 9 mo? or 1,5 yr? who cares?
This hack is so old that anyone who bothers to create a script for exploiting it may do - but the SEngs implemented their filters for this trick already last year...
it's one statement in their link code
if SANITIZED(url) != url then filter-spam-link-by-100%
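As a rough illustration of that one-liner, here is a sketch of what such a check could look like. This is a guess at the idea, not how any search engine actually filters links; the tag pattern and the names `sanitized` and `looks_injected` are hypothetical.

```python
import re
from urllib.parse import unquote

# Matches anything shaped like an HTML tag, e.g. <a href=...> or </h1>.
TAG_PATTERN = re.compile(r"<\s*/?\s*[a-zA-Z][^>]*>")


def sanitized(decoded_url: str) -> str:
    """Strip tag-like sequences from an already percent-decoded URL."""
    return TAG_PATTERN.sub("", decoded_url)


def looks_injected(url: str) -> bool:
    """Flag URLs that carry HTML markup, encoded or unencoded."""
    # Decode twice so double-encoded markup (%253C...) is caught as well.
    decoded = unquote(unquote(url))
    return sanitized(decoded) != decoded


if __name__ == "__main__":
    print(looks_injected("https://www.example.gov/search?q=laminate+flooring"))
    print(looks_injected("https://www.example.gov/search?q=%3Ca+href%3D%22x%22%3Ehi%3C%2Fa%3E"))
```

The comparison is made against the decoded URL rather than the raw one, otherwise every URL containing ordinary percent-encoding would be flagged.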
Don't get me wrong - I like the post as it's a typical seomoz quality post with good examples showing off still way too many old .gov websites
Heck - maybe it even works in MSN or Yahoo for 1-2 months? but as usual - as soon as too many people start talking about exploits they won't work
Oh - and BTW @aaron
The public already relates SEO to hacking, because SEO simply IS hacking
simply using the advantage of understanding how a system works and using that knowledge to your benefit can be called hacking... and learning to understand how SEngs work, filter, rank etc. is what SEO's days are filled with, eh?
cheers,christoph
I do think that this is very interesting and important information. However, I don't see the benefit in exposing specific pages on the specific websites.
As far as legality goes, those are not the websites that I would ever f**k with (USPS comes first to mind), and publicly admit to it. Yes, it's a flaw in their system, but I can guarantee that the government doesn't work that way. Since when has the government accepted accountability for anything?
As far as the article goes, this is a perfect example of perfect content. You piss off a bunch of well respected people, cause controversy, stir some things up but through a very high quality, informative post. Well done...
There are also exploits on sites like the US Senate and House of Representatives, not that I would know anything about that.
More importantly, the real problem with XSS these days is actually the phishing potential, not the backlinks potential. A decent phishing attempt can draw a user through a legitimate site w/o the user ever noticing. It's actually not that difficult at all.
Interesting discussion and experiment. And great to see such polarized reactions, on both sides as I think it demonstrates that there is a lot of moral ethics in the community... regardless of the view, opinions seemed to be based on the public good.
However, and I'm no lawyer nor do I play one on TV, I'm not sure this is exactly the experiment to test or prove a point that I would choose, at least not without some form of working agreement with websites involved... as you know, definition, especially in the government sector can probably be interpreted a million different ways and I'm sure it wouldn't be a huge jump to essentially define this as hacking a government entity, which is a pretty short hop I'm sure from a Federal offence. Maybe not... either way, glad it wasn't my experiment.
This is an interesting post... I am curious to know if anyone here has actually tested and proven that getting a .gov or .edu link has made any difference...
I was a little thrown off by Matt C's video response making it seem like these links & (DMOZ) would not be treated any different than a regular link. I have not had time to test this but am very curious if anyone here has and what the results were.
@WWIP: if you were thrown off, then it worked :-)
AGE is everything - so getting links or presell pages from OLD domains is great... and well - .edu and .gov domains ARE old... most of them were registered in the early 90s or late 80s *g*
and yes - I had the time to test :-)
Interesting.
This used to work pretty well. After using the XSS exploit you'd simply post some blog comments and the generated links for your url. The engines would then crawl and the backlinks would show.
Something has changed in recent weeks and many of those backlinks have started to disappear. If I had to guess it would be due to the fact that the generated source has no tie to the actual site from a link perspective. I'd not waste any time going this route. On to bigger and better things and never look back. :)
If someone did this for their site, would the page strength tool show an increase because of the fake backlinks from the .gov sites?
Only if those .gov pages with the links on them were eventually spidered by Yahoo (we use them to determine the # of gov links), which could happen if you pointed enough external links to them (and they weren't disallowed, etc).
How much could this, theoretically, raise the page strength? From 1.5 to 4, or something considerably less?
Something less - at most you'd probably see a jump of 1 or 1.5
Almost all links are broken. Please fix them
This post is over six years old, and websites change considerably in that time. We won't be updating the links in this case, but have left them there for historical information.
Chances are if I performed this exploit as easily as I did, blackhats and phishers have been doing it for years. Putting it out in the open seemed like the best thing to do.
Hi all,
I'm no tech wiz so I'm not sure whether this is another example or a new exploit but it looks like there is a site that is using the link popularity of the USDA Forest Service site to spam the search engines. The page https://www.fs.fed.us/cgi-bin/HyperNews_mm/get... shows up as the 19th result in Google out of 6,460,000 results for the search term "laminate flooring" (a very competitive term). This page on the gov't site then redirects to https://listnew.com/search.php?qq=laminate+flo...
I have emailed the webmaster of that site. It will be interesting to see if I get a response.
Nice find, Rob. It's definitely not natural for a forest service page to re-direct to a MFA site...
I was curious, so I did some snooping and figured out exactly the problem. I disabled javascript in my browser and checked the source on the forest service page, and found this lovely bit of code:
jomigfnmq='laminate+flooring';mipmsgyc='o';lhuismoo='e';xdfkfhozx='m';vidkctf='.';ihifkr='w';xdkdfvtxi='t';zayajdze='p';oapbwtr='t';xivzymvv='w';wjzhomxbh='/';ekaanvsm='.';cayxbou='w';xwmtofrtt='s';khmvxm='/';jmnafl='a';wogefoz='/';oimunl='/';jqdcthhow='b';yxzihwv='c';wmlvame='t';wsjrtz='s';yxlfucugf='e';pcohtx=':';htneghrs='h';ofsqsgtxv='t';xklplho='k';
mcxsqu=htneghrs+oapbwtr+wmlvame+zayajdze+pcohtx+wogefoz+khmvxm+xivzymvv+ihifkr+cayxbou+ekaanvsm+xdkdfvtxi+jmnafl+xklplho+yxlfucugf+jqdcthhow+lhuismoo+wsjrtz+ofsqsgtxv+vidkctf+yxzihwv+mipmsgyc+xdfkfhozx+wjzhomxbh+xwmtofrtt+oimunl+jomigfnmq;
eval('do'+'cume'+'nt.l'+'oc'+'ati'+'on.h'+'re'+'f=mcxsqu;');
It's nasty and obfuscated, but basically it just redirects them to the listnew page automatically. Clever.
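For anyone who wants to see where a script like this points without running it, here is a minimal sketch that resolves the string concatenation statically. It assumes the obfuscation is nothing more than single-quoted literals joined with `+`, as in the snippet above; the function name and the sample script (which targets example.com) are made up for illustration.

```python
import re

# name='literal';  pairs, e.g.  a='h';
ASSIGN = re.compile(r"(\w+)\s*=\s*'([^']*)'\s*;")
# name=a+b+c;  concatenations of previously assigned names
CONCAT = re.compile(r"(\w+)\s*=\s*(\w+(?:\s*\+\s*\w+)+)\s*;")


def resolve_concatenations(script: str) -> dict[str, str]:
    """Rebuild concatenated strings from a script without executing it."""
    literals = dict(ASSIGN.findall(script))
    resolved = {}
    for name, expr in CONCAT.findall(script):
        parts = [part.strip() for part in expr.split("+")]
        # Only resolve expressions whose every part is a known literal.
        if all(part in literals for part in parts):
            resolved[name] = "".join(literals[part] for part in parts)
    return resolved


if __name__ == "__main__":
    sample = ("q='kittens';a='h';b='t';c='p';d=':';e='/';"
              "f='example.com';g='/s/';u=a+b+b+c+d+e+e+f+g+q;")
    print(resolve_concatenations(sample))
```

Because nothing is evaluated, the redirect target can be read off safely; anything beyond plain literal concatenation would need a proper JavaScript parser.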
Because the Forest Service doesn't convert user input to HTML entities, spammers have been abusing the Recreation: Ask a Question - Make a Comment page. Scroll towards the bottom and take a look. Nobody at the forest service has noticed this?
I think it's interesting how susceptible the .gov sites are to this relatively easy XSS attack, yet there is so much weight given to them by the SEs...
Rob,
that's just one of thousands of bulletin boards/forums that allow html+javascript injection ... software from the last century... I've seen a lot of bigger spam networks these days operate by injecting redirects like that into all those old software installations...
Actually I just came across a big one last week that even attacked a whole .edu domain with a subdomain hack... frightening actually...
best,christoph
PS: let me know if the .gov webmaster responds and takes it down - by that time the automated spam injection bots probably injected another 1000 posts into that site :-/
This is quite an interesting post. Made me do a little research myself and figure out how other sites are affected.
We have reviewed those XSS exploits - normally those pages will have very little to no page authority and Google will automatically discount them of any link value. Over at https://www.seobutler.com/ we've done some tests to check the authority of the pages and see if it passes any juice or not. Even if you link to those deformed links, you will not get any link authority.
I just checked the SEOmoz website for common XSS/HTML injection vulnerabilities with the cool tool at SEO Egghead.
https://www.seoegghead.com/tools/scan-for-html...
It found two forms but no vulnerabilities.
not useful
I think these search results are not cached by Google, because after searching they don't exist.
Are these searches stored in a DB? I think not...
Please, can anybody tell me how to make an XSS link, how it works, where to find useful information about XSS links, or where I can download software for making XSS links? Can you explain in a few words how the software works, or what I have to do to understand how to make an XSS link? Thanks in advance!
Okay, so you are saying I shouldn't do this?
Come on, Rand, isn't "complete loss" a little dramatic? You must be going for the BIG linkbait to get in the news with this one, yes? Well, why not come clean with your methods. ;)
Let me get this straight - since we're trying to attract visitors to SEOmoz and get more subscribers to the blog, this article is not altruistically motivated and is thus damaging to the reputation of the search marketing industry?
I find your logic un-followable.
Also - what "methods" must we come clean with? The process is described above in sufficient detail that any programmer could work it out. We have nothing to hide here.
This post is fine. Looks like he's just poking the animal.
I am with Michael on this one. This type of stuff also gives "SEO" a bad name and when the public starts relating SEO to hacking it's game over guys. ;-(
Aaron - and the better solution in your mind is to not talk about it and let it happen? Why is that a good thing? How can any of us benefit if this exploit continues to be unfixed?
I'm at a complete loss to understand your perspective.
Aaron, the image that SEO has will always be varied and often influenced by the space that you want to play in. If you want to rank with the bad boys in a competitive field you have to be prepared to play a better game. If you're not willing to do that, stay in the shallow end of the pool. That's not intended to be mean spirited in any way but there is a reason some people stay away from certain markets and if you're not going to throw everything at it, don't even bother.
XSS is, errr in my mind, was a novel but short lived method. I don't think there will be any long term benefit so I'm not going to waste my time with it. Talking about it will at least put it front and center and hopefully raise awareness.
I think so. In my opinion Google and Yahoo have already taken action against it, because it's a well-known method....
Also this link says the same : https://www.free-seo-news.com/newsletter179.htm#facts