If there's one issue that causes more contention, heartache and consulting time than any other (at least, recently), it's duplicate content. This scourge of the modern search engine has origins in the fairly benign realm of standard licensing and the occasional act of plagiarism. Over the last five years, however, spammers in desperate need of content began the now much-reviled process of scraping content from legitimate sources, scrambling the words (through many complex processes) and re-purposing the text to appear on their own pages in the hopes of attracting long tail searches and serving contextual ads (and other, various, nefarious purposes).
Thus, today, we're faced with a world of "duplicate content issues" and "duplicate content penalties." Luckily, my trusty illustrated Googlebot and I are here to help eliminate some of the confusion. But, before we get to the pretty pictures, we need some definitions:
- Unique Content - written by humans, completely different from any other combination of letters, symbols or words on the web and clearly not manipulated through computer text-processing algorithms (like those crazy Markov-chain-employing spam tools).
- Snippets - small chunks of content like quotes that are copied and re-used; these are almost never problematic for search engines, especially when included in a larger document with plenty of unique content.
- Duplicate Content Issues - I typically use this when referring to duplicate content that is not in danger of getting a website penalized, but rather, is simply a copy of an existing page that forces the search engines to choose which version to display in the index.
- Duplicate Content Penalty - When I refer to "penalties," I'm specifically talking about things the search engines do that are worse than simply removing a page from the index.
Now, let's look at the process for Google as it finds duplicate content on the web. In the examples below, I'm making a few assumptions:
- The page with text is assumed to be a page containing duplicate content (not just a snippet, despite the illustration).
- Each page of duplicate content is presumed to be on a separate domain.
- The steps below have been simplified to keep the process as clear as possible. This is almost certainly not the exact way Google operates (but it conveys the effect quite nicely).
There are a few additional subjects about duplicate content that bear mentioning. Many of these trip up webmasters new to the dup content issue, and it's sad that the engines themselves have no formal, disclosed guidelines for folks (although I suppose it does give folks like me a day job). I've written these out, as I most often hear them on the phone and see them in the forums:
Code to Text Ratio: What if my code is huge and the unique HTML elements on the page are very few? Will Google think my pages are all duplicates of one another?
Nope. As Vanessa clearly mentioned in our video together from Chicago, Google doesn't give a hoot about your code; they're interested in the content on your page.
Navigation Elements to Unique Content Ratio: Every page on my site has a huge navbar, lots of header and footer items, but only a little bit of content; will Google think these pages are duplicates?
Nope. Google (and Yahoo! and MSN) have been around the block a few times. They're very familiar with the layout of websites and recognize that permanent structures on all (or many) of a site's pages are quite normal. Instead, they'll pay attention to the "unique" portions of each page and often, largely ignore the rest.
Licensed Content: What should I do if I want to avoid dup content problems, but have licensed content from other web sources to show my visitors?
Use meta name="robots" content="noindex, follow" - place this in the page's head section and the search engines will know that the content isn't for them. It's best to do it this way (in my opinion), because then humans can still visit the page, link to it, and the links on the page will still carry value.
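Here's a minimal sketch of what that looks like in practice - the page and title are just placeholders, but the meta tag is the important bit:

```html
<!DOCTYPE html>
<html>
  <head>
    <title>Licensed Article (placeholder title)</title>
    <!-- Tell the engines: don't index this page, but do follow its links so they still carry value -->
    <meta name="robots" content="noindex, follow" />
  </head>
  <body>
    <!-- The licensed content goes here; human visitors can still read it and link to it -->
  </body>
</html>
```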
Content Thieves: How should I deal with sites I find that are copying my content?
If the pages of these sites are in the supplemental index or rank far behind your own pages for any relevant queries, my policy is generally to ignore it. If we tried to fight all the copies of SEOmoz content on the web, we'd have at least two 40-hour-per-week jobs on our hands. Luckily, this is the only domain publishing our content that has enough link strength to rank well for it, and the search engines have placed trust in SEOmoz to publish high-quality, relevant, worthy content.
If, on the other hand, you're a relatively new site, or a site with few inbounds and the scrapers are consistently ranking ahead of you (or someone with a powerful site is stealing your work), you've got some recourse. One option is to file a DMCA infringement request with Google, with Yahoo!, and with MSN. The other is to file legal suit (or threaten such) against the website in question. If the site re-publishing your work has an owner in your country, this latter course of action is probably the wisest first step (I always try to be friendly before I send a letter from the attorneys), as the DMCA motions can take months to go into effect.
Percentage of Duplicate Content: What percent of a page has to be duplicate before I run into dup content penalties and issues?
22.45%. No, seriously, the search engines would never reveal this information because it would compromise their ability to prevent the problem. It's also a near-certainty that the percentage at each engine fluctuates regularly and that there's more than simple direct comparison that goes into dup content detection. If you really need the answer to this question, chances are you plan to do something blackhat with it.
Issues vs. Penalties: How do I know if I'm being penalized for having duplicate content, rather than simply having my pages removed from the index (or put in supplemental)?
Penalties require a good bit of abuse to go into effect, but I've seen it happen, even on domains from respectable brands. The penalties really arise when you start copying hundreds or thousands of pages from other domains and don't have a considerable amount of unique content of your own. It's particularly dangerous with new sites or those that have recently changed ownership. However, no matter whether you've got penalties or just find lots of your pages in supplemental hell, I highly recommend fixing the issue as I've described above.
What are your thoughts on dup content issues? Anything I've neglected or confused?
p.s. Googlebot got a nice upgrade courtesy of my improving illustration skills. I was feeling bad for the poor guy, despite the fact that it's 2:15am and I have conference calls starting at 9am tomorrow.
Ecommerce sites are the worst. 98% of the time when I take on a client who has a large catalog with thousands of items for sale, they all have product descriptions that are copied and pasted from the manufacturer.
Just by having all the pages rewritten to say the same thing using different words, they usually see huge increases in rankings.
Leave the copied-and-pasted junk for the exports to shopping sites, I tell them.
Very true. It's incredible how many people are too lazy to re-write simple sentences and descriptions!
Agreed. It's sad, but I still think many people don't see websites as something they need to work on. I think many still see them as something you do once as cheaply as possible and then you make money.
It's interesting, because no one running a brick-and-mortar store would think it smart to stock every room with the same products, but they seem to think it's ok to do that online.
Jane, in some cases I think it is just laziness, but I think it's more a lack of understanding of the web in general, and it goes to a deeper problem than laziness.
The same person who would hire someone to paint their store often doesn't think they should do the same and hire someone to design their website or rewrite their content.
You're spot on about the re-write, Jeremy. I got hired a while back for an ecommerce site with thousands of products. They were twice the size of their closest competitor, but the competitor's site was coming up more often for common keywords. By simply changing the product description copy from the manufacturer's description to something more unique and informative, we overtook our competition in the rankings and our ROI went through the roof over the following months.
Since I left (about two years ago now), they have not kept up and have reverted to the copy-and-paste method from the manufacturer's catalog. As you can imagine, they have subsequently lost rank to several competitors.
I want to address Stever's point about multiple language versions of a single site that wind up using the first language when visitors go deeper into the site.
We just finished a site audit of a site that has English, Korean, Japanese and Chinese "versions". The entire site is in English, while the other versions have homepages as subdirectories of the main site and then logographic navigation with the English content recycled on the second and third levels. This is a recipe for duplicate content nightmares.
The solution was to use the same approach we employ for any site with lots of boilerplate content (shipping information, warranty, return policy, etc.) embedded in the page. We spec'ed the non-English language pages to use iframe calls against the content management system, so the logographic pages pulled the English-language content on the site client-side. The other idea was to use AJAX to prevent the spiders from being able to crawl the English content within the other languages.
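To make that concrete, a stripped-down sketch of one of those logographic pages might look something like this (the CMS URL and dimensions here are made up for illustration, not the client's actual setup):

```html
<html>
  <body>
    <!-- Native-language (logographic) navigation, header and footer live in this page's own HTML -->

    <!-- The recycled English body copy is pulled client-side from the CMS via an iframe,
         so it is a separate document rather than part of this page's source -->
    <iframe src="http://www.example.com/cms/content/article-123?lang=en"
            width="100%" height="600" frameborder="0"></iframe>
  </body>
</html>
```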
Jonah - I love the i-frame approach. Never considered that before; very creative and a good trick to have in the bag. That's my "new thing I learned today."
Nice reply, Jonah.
I was debating discussing frames (although not iframes - good point!) but the implications of the internal value flow around the different domains and implementing the return link made my head hurt and I cut the post back.
(Plus "poor man's cloaking" can still be surprisingly effective in uncompetitive areas, which detailed pages in other languages often are.)
But still, there is nothing better for user value and search engine results than proper translation.
Stever:
Notice I said built into a content management system. This solution stops scaling for manual coding somewhere under 100 pages.
Your post about multiple language versions is very helpful. Thank you.
like the robot!
So if the content is on the same domain, is it not duplicate? Will it not go to supplemental?
Dupe content is dupe content, whether it's on the same domain, a different domain, or different domains owned by the same person... the engines don't want to serve up results that are all the same, as it lowers the quality, potentially sending someone to multiple sites with the exact same information.
You may not be penalized; it's just that the engine will make a choice as to which page to serve up in the main SERPs.
Rand, this sounds like a strong argument against syndication of a blog. For example, if you have a blog that is not very strong yet and make a homepage for it on one of the syndication sites, then that page might bump the homepage for your blog from the results - or outrank it on many queries. This puts your feed on a very strong site that has millions of links and more authority.
Then we go to the "snippets" that appear on the syndication sites. These could contain long tail phrases or keyword combinations that get a little search volume. Having those on a powerful and authoritative site will cut your traffic - simply because they will outrank you in the SERPs on these long tail queries.
There is no "penalty" here and maybe no "filtering" - they simply outrank you. My bet is that in most cases you get more traffic by outranking them on these long tail queries than you will get by being listed within their content.
EGOL, you caught my attention since I have this issue. My blog gets republished on the iEntry network, which includes WebProNews. WPN has more authority than me, so my articles there generally outrank the same articles on my own site.
They're nice enough to change the title, and I think the meta description ends up different as well. But generally they get a lot of long tail traffic that should really fall to me.
In this case I don't mind, since having my content there (with links back to other posts on my site) seems to help build authority in my site, and other pages have gained in long tail traffic. I also get direct traffic through WPN and do get to spread my brand, or at least my name.
But it's always puzzled me why search engines can't figure out where the content originated. It would take a human being all of about 5 seconds to figure it out, and while I know an algorithm isn't a human, I can think of many ways they too could figure it out.
I think the search engines really could do a better job of determining which is the original content. I think at the moment the heavy emphasis is on the authority of each site.
If their site can get your content on the first page but your site can't lift it higher than the third then you might get more traffic to your own site by syndicating.
This will also expand your brand if your articles are done well enough to promote it.
I was guessing that Google bought Feedburner so that they could identify original content sources through the feeds. Maybe that isn't true.
I love GoogleBot. He's the best teaching aid since The Count.
Now we just need SEOmozzilla to combat GoogleBot in the streets of Mountain View!
Would it be shameful to admit that I've spent the better part of my day obsessed with what weapons GoogleBot should have? For the first one, I was thinking "The Supplementalizer"; he fires it at you and all your content goes into the supplemental index.
Excellent. Go on...
More? Hey, I didn't say it was a productive "better part of my day" :)
Oh, oh!! And the "Sandboxer" and the "De-Indexer" - such a great idea...
Every once in a while, GoogleBot could just pull a knife and say "I'll Cutts you!"
Sorry...
BWA-HA-HA!!! That, my friend, was divinely puntastic.
"My spammy sense is tingling."
Name the weapon that totally invalidates a thing's existence by asking "Did you mean...?"
That's one for us folks over in the UK (and anywhere else that likes their vowels) - I'd call it the 'colorizer'.
If you search for 'colour' on google.co.uk, the top results are all about 'color' and while it doesn't actively say 'did you mean color?' it might as well.
[edit: just remembered the search I did the other day where this annoyed me - it was for feng shui colours - since I happen to know someone who wants to rank for that kind of search in the UK and they slap a great big 'did you mean feng shui colors?' on the top.]
That's how it starts... First Google will try to "fix" how you guys over the pond spell, then Google Maps will start giving you distances in miles instead of kilometers.
I wish they'd do it the other way around: start giving US users metric results, it's about damn time we figured it out.
I can't get behind colour or optimisation, but I am fond of spelling theater theatre.
Don't mind imperial measurements (we like our miles and our pounds of vegetables).
Don't even think about taking away our pints of beer though. The EU wanted to standardise on half litres (or standardize on half liters). That's not a good idea.
I'm a little bit confused about this in light of some related comments on this post at SEOBook. He is saying that in order to get out of supplementals it's a good idea to reduce repetitive site-wide (header/footer/sidebar) content:
But you are saying that it doesn't matter. This is a slightly different problem than the one he's talking about, so maybe it just depends on the context?
Edit: I'm enjoying the robot illustrations too! They made me smile today :)
The sentence you've quoted is the only one in that post of Aaron's that I disagree with. I've never seen any evidence that having unique content closer to the top of the code makes any difference at all. The first version of my site used absolutely positioned layers to make the content the first thing in the <body> and it didn't accomplish anything as far as I could tell.
Aaron's a very smart guy and he might have the data to back up that assertion, but in my experience, I haven't seen issues with bloated code, nor heavy header/footer content (although if it was extreme...)
Rand
I have to side with Aaron on this one. While all of the engines attempt to spider around code and Google does the best job of it, you leave a lot to chance if you assume the engine will treat your content the same when it is buried beneath hundreds of links.
I analyzed a PR 7 site that dates from 1996 with 45,000 unique SKUs, each with about 500 words of hand-written description. This site was all but knocked out of the index for duplicate content. They had 400,000 pages in the index when Big Daddy rolled out and less than 1,000 by the time we got involved. They suffered from multiple sins (duplicate title and description tags on most pages, canonical issues stemming from indexed search result pages).
Still, they would not have been hit nearly so hard if it wasn't for the fact that they have a couple of hundred lines of code, most of it global and directory-contextual navigation, between the body tag and the beginning of the unique content.
Aaron's suggestion also contains another bit of wisdom. If you mix up the order of things in the global navigation, and maybe the anchor text, Google is more likely to count those internal links more than once because the footprint has changed.
So what about this bit in Response to SEO Questions by Rand Fishkin:
Still confused :) I have been reading some other stuff questioning the source order thing web designers seem to take for granted (this time from an accessibility perspective).
Thanks for the link to the usability article. This whole issue (in terms of usability and accessibility) has been on my mind a lot lately, though I've yet to come to any conclusions.
With the SEO part of the issue I've seen arguments on both sides and have yet to convince myself of either. Generally when I code the template for a site I do leave the header, footer and sidebar in the same place in the code, mostly because it's so much easier to maintain, and that benefit is worth more than anything I might lose in having the pages be repetitive.
It would be nice, though, to have some more definitive information about the SEO effect.
But the links will only be, at best, a source of traffic, unless you're suggesting that even a non-indexed page can take the benefit of its backlinks and spread it to the internal pages to which it links. And if that's the case, then the presence of the noindex instruction doesn't really make a difference, does it?
Interesting observation and a good point.
However, couldn't we say that these are two different factors in ranking?
Unlike a nofollow added to the link, I would hope that the SEs would still give the links value towards ranking the target site, even though they are told not to index or follow on the page itself.
But this is hypothesis, not an answer on my part. I can also understand why they may not want to give value after all.
My understanding is that they do give value to those links, although it may not be as high as from indexed pages. The test I saw on this was from a couple years ago, but it still passed at least some ranking weight.
You know, I hadn't really thought of this till now, but it makes me wonder. If the search engines do pass link value through non-indexed pages, then couldn't a spammy tactic be to create 1000's of duplicate pages, add the noindex, but still try to pass even a little link juice by allowing the links to be followed?
I suppose that would be pretty easy to detect, but it still makes me wonder.
I would think though if you've added noindex to dup pages then you're not working to build links into those pages. Also if they aren't showing up in SERPs since they're not in the index they wouldn't get so many natural links either.
I too would hope there would be some value in the links out of the page, but I would expect it to be a little less than for indexed pages.
Sometimes, I feel like the GoogleBot is actually kicking me in the supplementals.
Thanks, Rand; great information. I am having issues, though, with Google penalizing for too great a navigation element to unique content ratio, but I think this is severely compounded by duplicate TITLE and/or META tags. I'm struggling with getting out of the supplemental doghouse for intrasite content.
Hi Rand
This is my first comment here.
I have one question: from Google patent we know that Google knows well the date that a document was released. You can find this on History data - Inception Date
And they say: Google may determine how old each of the pages on a given website is and then determine the average age of pages on the website as a whole.
So if Google does know the origin date of a page/article, shouldn't this be a criterion for separating duplicate content from non-duplicate content pages or sites?
I mean... ok, you know the date when a page "was born", but do you use it to find duplicate content?
It's like this: I have a blog that gets syndicated, and people steal content from me, posting it on their blogs. But if the date of the original content is well known by Google, as the patent says, how come PR, IBLs and other factors decide that the well-ranked site with good PR or links is the original source of the content?
Just asking..
Good to see ur first comment!
Pay special attention to the word "may"; it is not a fact, although it's pretty likely.
What it means is that it is possible that Google may give different weight to pages by looking at when Google first found out about a page, meaning when it was spidered. Say you write an excellent, unique blog post today, but because your site is not well linked to, Google will only find/spider that page 10 days from now. If I know your blog and visit it, I can copy your blog post, post it on my highly-linked-to blog, and your content will be spidered by Google tomorrow on my domain. So in the eyes of Google, I had the original blog post.
That is one explanation; another one is that Google doesn't care who has the original content, as long as the content itself is on a highly-linked-to website - see my example somewhere above. I'm sure there will be other explanations as well. Just don't take any words for the absolute truth; investigate on your own and you just might end up with another theory. You have to remember that nobody, and I mean nobody, knows the absolute truth. Not even about search! ;)
<edit>damn my spelling at 1am!</edit>
Excellent points, tbfpa - I'd totally agree. Google knows they can't always trust dates by themselves, so they'll naturally use other metrics, even when the date data conflicts.
Hi Randfish,
I have made a few experiments on duplicate content over the last 3-4 months. I have read every dup content article, like yours: Google penalizes, removes the duplicated one, how to determine whether an article is a dup or not, etc. And I know the rules. Actually, the SEO rules. But my experiments show me different results. For example: search for "12 Deadly Diseases Cured in the 20th Century" and you will find 3 copies of the same content. One of them is from howstuffworks.com, which is the original; one is on symbianize.com - in the first rank; and the other is my experimental one... (I have deleted it, but Google still shows it). I wonder about one thing: howstuffworks has more PR, a better Alexa rank, more links to this page, and was first published years ago, compared to symbianize.com. Then why is it in second place?! Also, why hasn't Google penalized that site in the 6 months since I discovered it?! Very interesting.
Would like to read your answer.
Sincerely,
Vusal Zeynalov
Thanks tbfpa for your fast comment.
So that means that if I only write content, unique content, not for rankings but for public information (as peculiar documentation to inform people), and I do not have rankings, IBLs or PR... that should suggest to Google that the thief is the author... and not me. Why? Because he has good rankings, IBLs or PR. But that's easy to get.
Google is forcing me to find IBLs, to get rankings or PR, to sustain my content so it is not stolen and so I am credited as the author of those articles?
So if I'm not ranking my site or getting good PR or links, I might have my content stolen. Or appear to steal content written by me from other sites that are well ranked (but they do have my content).
But why does Google have that data about my content if it cannot tell the difference between original content (by the time it was created) and stolen content (the page that is on a well-ranked site)?
That makes me a thief, a thief of my own work. I'm stealing my content? Or others do that?
They do not care about the date the content appears (the source or original content) but about the well-ranking/PR/IBL sites that stole that content (sites that worked for PR, rankings or links).
So in my mind it's something like this: if you do not rank well or you do not have IBLs or PR, you should not create content. Because if you do... you must get rankings, IBLs or PR. But I'm not writing content for SEs but for users. And Google cares more about rankings and PR instead of good content?
How about people who do create good content for users but are stolen from? That means I do not have a chance in the SEs because I do not rank well. But a thief can steal my content and in the eyes of Google he is the creator, 'cause he ranks well.
Just asking again.. :)
Hope I do make sense.
Krumel - I wish I could answer you, but sadly, I can't follow your reasoning here and your sentences are very difficult to understand as well. If you're simply pointing out that SEO and content building on the web is becoming more difficult, I'd agree with you, but I'd still say it's a far cry easier than starting a new newspaper, magazine, radio station or other mass media distribution system. Sure, there are things that are unfair or tough, but nearly everyone on this blog has dealt with those issues and managed to emerge successful in the end.
I am closing my existing domain https://webgeekblog.com completely. It's been penalized by Google for reasons I am unable to fathom.
I am planning to move some of its content to my new website. Does this affect my new domain badly in search results?
The present website will no longer be available, except for an index page saying I have moved to the new website.
I have placed a robots.txt to block search engines from crawling, and placed a removal request with Google!
Please let me know whether I can copy a few articles from my old website to the new website.
thank you!
chandra
If my own site has two URLs to the same content - in other words, if x.com/new and x.com/products/tv/123.html are exactly the same page - does this matter?
Over a year later and still relevant.
Great explanation Rand. Do you think that the Search Engines have improved their handling of this type of information since you wrote this?
In the last six months I have seen indications that either the algorithm has become more robust or that individual verticals have been adjusted for duplicate allowances.
My personal take on it at this point is that in some instances (verts) it is still a fairly strong factor - it may even be most. However, in others, it is less likely to get you slapped into supplemental hell, but it won't help you flow any juice or punch up your search position.
Could it also be that, by increasing their ability to handle a higher volume of pages in their index, the engines have just worried less about the issue?
This is, in my opinion, the power of SEOmoz. An index of searchable information from a trusted source. You really have no idea how great this is.
Great post! Clears up a lot of misunderstanding as far as how the search engines deal with duplicate content.
Elaine
DUI Advice
Hi Randfish
Do you know of any efficient tools that can identify duplicate articles within a repository? I want to run it as a background process where new incoming content will be checked for duplicates against, let's say, 100,000 articles available in my repository.
I'm looking for a well-performing and easily integratable tool.
I think the illustrations are the best part! Keep them coming. They break it down so anyone can understand. Maybe my next tattoo will be that sweet google bot drawing...ha!
Interesting post, but neither it nor the comments answered my primary concern.
If I have 3 articles, all unique and written by me, and I publish these to 50 article sites, am I generating 147 pages of duplicate content?
Would this be punished by Google? Or would the 3 articles that are seen as the originals be fine, and just the IBLs from the remaining 147 be counted? Or would they not be counted since it is dupe content?
Hopefully someone may be kind enough to answer...
Thanks in advance!
At last, my question is: is article syndication useful or not?
If useful, then how?
If not, then why?
I found your post really useful. Duplicate content is always an issue that can get you penalized by Google. But no one yet knows exactly how the algorithm works, I guess? Any more info on it?
Hi Friends,
I need help on this. My US-based client is an interior designer and runs his service in multiple places like Michigan, Ohio and Florida.
I want to know whether his 3 websites will be penalised for duplicate content by Google if he has:
1. 3 Websites like InteriorDesignersMichigan.com, InteriorDesignersOhio.com, InteriorDesignersFlorida.com
2. The look and feel, the website template and the content text of all the websites are the same.
3. But the city names are changed in the content text.
For example, all three websites have content like:
1. Design Tech is an interior design company since 1993 for office space in Michigan. We provide our services.....Blah...Blah
2. Design Tech is an interior design company since 1993 for office space in Ohio. We provide our services.....Blah...Blah
3. Design Tech is an interior design company since 1993 for office space in Florida. We provide our services.....Blah...Blah
Please suggest whether all three websites will be penalised for duplicate content.
Thx for the info and illustrations. We are writing some content that targets separate cities in the same state and have manually rewritten the content 3 times and it's around 88-91% unique. We are hoping this works fine and is not flagged.
In fewer words: to fix dupe issues I should add a "noindex, follow" tag to all the pages containing duplicate content. Also, if I replace dupe text with original text in some existing pages (and also change the topic of the page), is that sufficient for Google to recognize the new pages just by submitting them via Webmaster Tools?
[link removed]
This debate is going to rage on I am sure.
What is the risk of using duplicate content in articles to build links? We have been discussing this at our office recently, and we are finding that there doesn't seem to be much in the way of examples or case studies.
Anyone out there have any actual "proof" and not just anecdotes?
Here's a pickle, so to speak. A client has a localized website for a prominent plumbing business in Colorado (ranking well in the local market search) and is expanding down into Texas.
Can we take the website and update all the "Colorado" text to be "Texas" without a duplicate content penalty?
So, everything would be the same except for the instances of location. Time saver or SEO Russian roulette?
And Rand, you generate such great conversation on here. Thanks man.
We own an online store and recently we opened a similar store on eBay with the hope of improving sales through different sales channels - perfectly legitimate. Though we only have a small percentage of our items listed on eBay, is there any probability of being penalized for duplicate content because we use the same product name, description and pictures? We are thinking of opening a similar store elsewhere for the same reason - no different than a brick-and-mortar store opening multiple locations. I should mention that none of the stores directly or indirectly links to the others.
It's been a few years since this post, but there are still dup content questions out there.
Apple, for example, uses the same content for various versions of English sites (America, Canada, Australia). Also, it does not have country-specific domains; it just uses .com, .com/ca, .com/au. Is this dup content bad? Will Google.com rankings be affected because of .com/ca getting ranked in Google.ca?
Any thoughts?
Thanks
PS - Search "apple imac" in Google.ca. It's pretty funny, results for all three countries are returned. So much for Google choosing one and throwing out the rest.
I have researched this more on my own and have come across a great post from Duncan Morris, entitled Why Apple isn't UK enough for Google. Check it out if you're interested.
I have a question on this - yes, I am a total beginner and apologize for my lack of knowledge in advance!!!
I have a website being built. There are three h1 tags and text boxes at the bottom of the index page. I just took a look at the new pages being designed, and the graphic designer has put the same three boxes with the same titles at the bottom of every single page. It looks great and the flow of the website is nice, but will this give me too much duplicate content?
Thanks for any help/advice thrown my way.
Similar to a poster below, we have a website that sells about 200 all-natural soap products (soaphope.com) and we also sell the exact same portfolio of products on eBay. We use similar descriptions of our products in both channels. Is Google smart enough to know that this is not spam and not penalize the content as duplicative?
Also - eBay page headers are generated by them, not the seller - so if someone wanted to tell Google to ignore their eBay pages, is there even a way to do that?
Very interesting post. GoogleBot is awesome!
In my experience the single element that is of importance to get out of supplemental hell due to duplicate content is the quality of backlinks.
I have several hotel sites that use the content provided by the mother company. In fact, I use CNAME records to make it look like the content is on a subdomain, but the content is 100% the same as that found on hundreds of other websites.
At first those sites suffered a lot from supp hell, but once I received some quality backlinks (maybe just one...), most of those pages came out of supp hell and went into the main index. And I must add again that the content is 100% the same as on all the other sites.
This is evidence to me that Google doesn't determine an original at all, they just include all pages which have enough quality backlinks.
You can take a look here https://www.google.com/search?q=%22The+171-room+Banff+Rocky+Mountain+Resort+%22is+nestled+at+the+base+of+Rundle+and+Cascade+Mountains+in+Banff+National+Park,+just+four+kilometers+from&num=100&hl=en&filter=0
Grrr, because the above line is too long and therefore doesn't show all of it, do a search for "The 171-room Banff Rocky Mountain Resort is nestled at the base of Rundle"
Pay special attention to the results with subdomains; they are 100% exactly the same. This example doesn't show hundreds of exactly-the-same pages, but it does show a dozen of them, and all of them are in the main index.
tbfpa - you can use the WYSIWYG editor to make long links into anchor text :)
I've seen this as well.
One thing that has worried me in the past related to this is publishing full blog feeds vs. snippets. If you use a full-feed format, you essentially give anyone the ability to duplicate your entire site. In the event that a more authoritative site does it, you could potentially lose your blog into the supp index. Luckily, every instance I have actually seen of this so far was from worthless scraper sites, but looking at highly competitive business topics, I could see this becoming a potential problem.
As far as the robot goes, I think you should get rid of that flower thing, and incorporate it into your logo.
Yeah, the post was okay, but more importantly you've quoted the best line from the best book ever.
(Okay, so the post was really good and I've been really impressed with the post quality lately. I mean, you guys always have good posts, but this feels like an SEOmoz Renaissance!)
Did you like my Scott & Zelda references too? I know a lot of people who don't get Gatsby, but it always spoke to me.
Thanks for the reminder.... I forgot to mention, along with a great post, any post that can bring F. Scott into it is tops in my books... Matt needs to figure out a way to give posts a "two thumbs up."
Gatsby is great, but This Side of Paradise was his absolute greatest work and, personally, my all-time favorite. They even did a movie version of Gatsby, but I've always thought that Paradise, if done right, could make an incredible movie as well.
I read way too fast and completely missed the obvious reference. It's been a while since I've read Gatsby though, even if you did point it out with fscott.com. Here's a reference about F. Scott, and you can have fun trying to discover where it's from.
"You've been through all of F. Scott Fitzgerald's books. You're very well read, it's well known."
For a hint check my profile. The author of the reference is there. If you're familiar with the author you'll recognize where the reference is from easily.
By the way Rand Googlebot is looking much better this time around.
I'm lovin the use of images lately Rand, really spicin it up. Is someone getting content ready for his book??
22.45%? Everyone knows it's 19.67% ;)
Nice Googlebot makeover, he's totally styling. You'll of course have to dress him up for St. Pat's now!
This was a great recap of something that is definitely still a challenge to explain. I think it is important to address the issue of dupe content on your own site as well as across multiple sites, both because of the misconception that two sites must be better than one and because of the dupe content issues across sites.
Hi Rand - another fantastic post! The plagiarism issue is certainly a frustrating one. I was amazed to find one of my recent blog posts appearing on around 15 other sites within 24 hours of posting. Everything had been scraped and chopped up into a load of gobbledegook mush. Of course, the pages had Adsense all over them. I don't have to worry about the pages ranking higher than mine however it's bloody annoying. I think a polite email may be in order...
Polite as in dropping a friendly DMCA request in their inbox?
You knows it! ;)
Hi Rand,
Good post :) but I always thought Google also took domain age into account when determining dominance vs. dup content
Great post.
GoogleBot rocks!
Duplicate content problems are hard to explain to clients. But it's even harder to make them understand that having multiple domains that are exactly the same website is a bad thing.
What do you recommend for multiple language sites? These sometimes display the same content but start off with different homepages (e.g. .com English, .pt Portuguese...). Should there be a one-language-one-domain principle, or 301s?
Hi Carfeu,
I was having the same question. I've got some sites in Dutch on a .be and a .nl domain. People from the Netherlands dislike Belgian sites and vice versa, so duplicate content is an easy solution for me. We don't have the people or the time to rewrite content for 2 sites.
With you on the GoogleBot, seomoz should go official with the GoogleBot character as our mascot.
Carfeu, I'm not rand, but I deal with this quite frequently.
Firstly, what you describe is pretty usual, but the sites do not often have duplicate content on the same domains. Often there will be a "starter" set of pages on the country domain in the native language (which do not then run into the duplicate problem or the "search pages from..." problem). Then the sites will link into the main (usually English-language pages) section for more detailed information where the site owner has not wanted to translate.
(Note that here the different languages are eligible for different directory listings and links from other sites, so also are feeding external whateveryouwanttocallitthesedays back into the main domain. This is also a minor but important point against Rand's recent theory of "keep it all on one site".)
Secondly, DE raises an issue against what I said above. This is an interesting conundrum where a particular country is important and it is worth hitting duplicate content issues. Here the duplicate is worth it because it will attract all the people who only search with the country restriction on and who otherwise would not see the pages on the "foreign" domain. So here it may well be worth having the duplicate content on both a .de and an .at. I have some figures somewhere - from an old WMW thread, I think - where proportions came out at ca. 15-20% in Europe who do that.
The language needs to be clarified somehow to fit what are two very different penalties (in my opinion)...
- Cross-Site Duplicate Content: i.e., I am syndicating an article I wrote to 100 sites, all of which are older, better, more visited and more popular than my site.
- Same-Site Duplicate Content: i.e., I forgot to redirect my non-www to www (a quick sketch of that fix follows below).
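For the same-site flavor, the usual fix is a blanket 301. A minimal sketch, assuming Apache with mod_rewrite and "example.com" standing in for the real domain:

```apache
RewriteEngine On
# Send every non-www request to the www version with a permanent (301) redirect
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```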
Russ
Thanks for the informative post Rand. I write content for several plant blogs every day, and well...run into several of these problems every day! It's hard enough to come up with original content after your posts get into a higher range, let alone keep track of who's stealing it (and trust me I know people are stealing it). But I certainly can't say that I'm not a little guilty of re-arranging a few words in a post from an abandoned blog to generate traffic to mine (just a little guilty though) ;)
Hi Rand
Really good post, and I love your illustration skills!
Question, that may be silly.
If the website with the dup content gets rid of it, and Google comes back, crawls, and finds no dup content, how does the website try to get its "trust" back from the SEs - is it possible?
I'd say it's been a well-spent-staying-up-to-2:15-a.m. time. Thank you for the article.
How about duplicate content residing on the same domain but with multiple URLs? Let's say an article in English is being linked from the same website using 6 different URLs. This is not ideal, of course, but how bad is this?
Also, if Google is smart enough to determine page headers, footers, and other redundant parts, is it smart enough to determine if a site has 2 versions simply because one caters to a different country/territory, but has almost 90% of the same content/pages?
Thanks again.
Great Content Rand! I know I am late getting the post.
My only question is: can you receive a penalty for having duplicate content within your own site? Or are these pages simply added to the supplementals?
My example would be: https://www.usalarm.com/home-security/
Every City & State page has duplicate content.
Thanks!
I absolutely love Googlebot as you interpret him. Makes him look so cute! I also love the article. So easy to duplicate without realizing it.
I am currently trying to see if I can take advantage of blogs scraping my content. What I do is simple: when I post something, I make sure that I have several internal links to pages of my site that are hard to get external links for. When the content gets syndicated by splogs, the post gets changed slightly, which avoids any big dup content issues, and at the same time I build deep links for my site. Obviously the links I get aren't of the highest quality, but they are within content and from a relevant-theme page.
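A trimmed-down example of what I mean (the URLs are made up; the point is just that the in-post links travel with the content when it gets scraped):

```html
<p>
  As I mentioned in my
  <a href="http://www.example.com/guides/deep-page-that-needs-links/">earlier guide</a>,
  ... the rest of the post text continues here ...
</p>
<!-- When a splog scrapes this post wholesale, the copy keeps this link,
     giving the deep page another in-content, on-theme link -->
```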
At PubCon Vegas I had a fairly long chat with Adam Lasnik on the topic of dup content.
He explained to me that Google has a few levels of duplicate content.
He was very adamant that it is indeed better to be in the supp index than out of the index entirely (which I definitely agree with). His claim was that being in the supplemental index just means that you need to add a little bit of value to your page in order to get out - be that in trusted backlinks, a customer review of a product, a more unique description, etc. Come up with a way to add value to the end user experience and Google will give you love.
I agree with everyone... another great post... awesome content here recently.
Rand, I like the charts and illustrations you have been adding to your posts. They help reinforce the main topic of the article.
Google bot is cool... will Yahoo Slurp make an appearance? Maybe we can get a bot battle going between the two. :)
Suddenly picturing a Jabba the Hutt type creature.
Let the battles begin!
Maybe instead of one big bot Slurp should be a bunch of little ones?
Like an ant colony all working towards the same goal...
I'm not sure they are that organized.
He explains it so even I can understand it.
Thanks Rand.
I benefit from all my duplicate content
How many blogs recently discussed Feedburner now providing Google Reader stats?
Today I have also seen proof that links are becoming less and less relevant.