Duplicate content has been an SEO issue for quite some time, and even though Google says it keeps getting smarter at figuring out the best page to display in the SERPs from a list of duplicate content pages, and claims it is something to worry about less today than before, the issue still exists. Google gives advice on how to fix it in various places: support threads, employee blogs, webmaster help videos, and many others. Some say simply block your duplicate content pages, some say redirect them. Maybe there is no single rule that best fits all situations, so I decided to enumerate the various ways to fix duplicate content issues and their differences, so you can weigh the advantages and disadvantages yourself and judge which method is best for your specific situation. So let's go ahead and review each one.
Blocking in Robots.txt
This is probably one of the most common suggestions used by many people, including several people from Google. It is also one of the oldest recommendations in the book, and it is probably outdated since there are many other things you can do today.
This does work in eliminating duplicate content. Search engine bots read the robots.txt file, and when it tells them to exclude a URL on the domain, that URL is no longer crawled and indexed. Having said that, the problem with using robots.txt to eliminate duplicate content is that some people may be linking to the excluded page, and those links will no longer contribute to your website's search engine rankings.
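For example, a minimal robots.txt sketch could look like this, where /duplicate-page/ is just a hypothetical placeholder for whatever duplicate URL you want blocked:
User-agent: *
Disallow: /duplicate-page/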
Using the Meta Robots: NoIndex/Follow tag
Another way to eliminate duplicate content is to use the Meta Robots noindex/follow tag:
<meta name="robots" content="noindex,follow" />
The rationale behind using this tag: the noindex value tells search engines not to index the page, eliminating the duplicate content, while the follow value tells search engines to still follow the links found on the page, so link juice is still passed around. The problem is there are still some people who believe this does not work: once a page is noindex, it is most probably automatically nofollow as well. But then again, why were separate follow and nofollow values invented for the robots meta tag if you are not given the power to separate this from index and noindex? Crawled or not, this has to be tested. I believe Rand has taken Google's word for it that this tag works. Searching around for people who tested this with anchor text using unique words, I found Scott McLay from the UK doing some tests. For some reason, I can never be fully satisfied by results and posts from other people, including Matt Cutts' statements sometimes. The only reason I haven't tested this myself for so long is that there are so many other alternatives for fixing duplicate content that I never felt the need to know exactly how search engines treat the noindex/follow tag. But if any readers have done a good test on this, maybe you can publish your results here and explain how you ran your test.
The 301 Redirect
A lot of people in the industry love the 301 redirect for fixing duplicate content, because so many people have tried it out and know it works. It has also been abused in many shady ways, but that's not my topic. So what really happens with a 301 redirect when it comes to duplicate content?
The nice thing about this compared to the two methods above is that we are really sure, based on statements from the respective search engines as well as testing by numerous people (which probably includes you, the reader of this blog), that a link going to a page that 301 redirects will be counted as a link to the destination page of the redirect. This seems like the ultimate fix to all duplicate content issues, but there are also good reasons to use the next methods I will mention.
This blog post is not about how to do 301 redirects, but just in case that is what you were searching for: 301 redirects can be done in the webserver software (Apache, IIS, etc.) or through server-side programming (PHP, ASP/.NET, ColdFusion, JSP, Perl, etc.). A good starter guide for different 301 redirect implementations is the guide by WebConfs.
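Just as a quick illustration and not a definitive implementation, a server-side 301 in plain PHP, placed at the very top of the old page's script and using placeholder URLs, could be as simple as:
// Send the permanent redirect status, point to the new location, and stop the script.
header('HTTP/1.1 301 Moved Permanently');
header('Location: https://www.example.com/new-folder/new-file/');
exit;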
The Canonical Link Tag
The nice thing about the canonical link tag is that search engines behave much the same way they would with a 301 redirect: the duplicate content page is not indexed, only the destination page appears in the search engine index, and all links going to the duplicate content pages are counted as links to the main content page.
<link rel="canonical" href="https://(main content page)" />
If Google treats the canonical link tag in a very similar way to a 301 redirect, the main difference is the user experience. A 301 redirect, well... redirects, while the canonical link tag does not. So you can imagine when this might be better than a 301 redirect: when users may not want to be redirected.
Let's say you are browsing a department store website. A business traveler looking at different traveling bags, who also needs a laptop bag, arrives at a URL like this:
https://www.example.com/travel/luggage/laptop-bags/targus/
Meanwhile, a computer geek who wants a new laptop and a bag to go along with it ends up at a URL like this:
https://www.example.com/electronics/computers/laptops/accessories/laptop-bags/targus/
Let's say these two pages are duplicate content pages on the same department store website. Doing a 301 redirect to fix the problem messes up the user experience: if the traveler's train of thought was to buy different bags and they get 301 redirected to the computers section, they get lost and need extra effort to go back to the luggage. And the geek laptop buyer looking for different accessories would not want to be redirected to the luggage, since he may be looking for more laptop accessories.
Although a canonical link tag does not redirect, you still have to choose which page search engines should display in the search results.
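For example, if the luggage path were picked as the main page, the electronics URL would simply carry this tag (an illustration only; either page could be chosen):
<link rel="canonical" href="https://www.example.com/travel/luggage/laptop-bags/targus/" />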
The Alternate Link Tag
The alternate link tag is very similar to the canonical link tag, although it is used mainly for international or multilingual SEO purposes.
<link rel="alternate" hreflang="en" href="https://www.example.com/path" />
<link rel="alternate" hreflang="en" href="https://www.example.co.uk/path" />
<link rel="alternate" hreflang="en" href="https://www.example.com.au/path" />
The canonical link tag will remove all the other duplicate pages from the index, but with the alternate link tag all pages will still be indexed; it simply helps guide Google in choosing the best result for each individual country version of Google, and eliminates the problems Google may run into when treating the pages as duplicate content.
To sum things up, here is a simple guide on when to use which method for the different cases of duplicate content:
- Alternate Link Tag
  - International pages, multilingual pages, intended for different countries.
- Canonical Link Tag
  - Multiple categories and subcategories with different category paths, but the same content.
    Example:
    https://www.example.com/products/laptops/sony/
    https://www.example.com/products/sony/laptops/
  - Tracking codes and session IDs, mainly because redirection sometimes interferes with the functionality of the tracking codes and sessions.
    Example:
    https://www.example.com/path/file.php?SID=BG47JF448JD6I7TGF439LVFD476
    https://www.example.com/path/file.php?utm_whatever=5uck3rs
    https://www.example.com/path/file.php
  - Different variable orders due to how some CMS platforms are built.
    Example:
    https://www.example.com/path/file.php?var1=x&var2=y
    https://www.example.com/path/file.php?var2=y&var1=x
- 301 Redirect
  - Cases where a redirection does not bother the user experience, such as www and non-www, index files, trailing slashes, and the hosting IP address.
    Example:
    https://www.example.com/
    https://example.com/
    https://www.example.com/index.html
    https://www.example.com
    https://123.123.123.123/
  - Domain changes, and URL changes of pages that no longer exist.
    Example:
    https://www.example.com/old_folder/old_file 301 redirects to https://www.example.com/new-folder/new-file/
    https://www.example.net/ 301 redirects to https://www.example.com/
- Meta Robots NoIndex/Follow
  - Probably the best place to use this is in a list of archived posts, such as a blog, where the content of an individual blog post's permalink is also posted as a duplicate somewhere in the archive view by date, the category view, the author view, tag topic views, or in the pagination of older blog posts from the blog homepage. You cannot really do a 301 redirect or a canonical link tag here, since these pages may have more than one blog post listed and you would have to decide where the 301 redirect should go or where the canonical link tag should point. So I would take my chances with the Meta Robots NoIndex,Follow tag and hope all the links still help.
- Robots.txt
  - I no longer see a need to use robots.txt for duplicate content issues. The natural linking is something too precious to lose. Just use robots.txt to block off content that does not need to be indexed at all, duplicate content or not.
Assuming the canonical tag is handled the same way as a 301 is a bit of a dice roll. In theory yes, but what if the non-canonical URL gets more links than the canonical one? SEs treat it as a recommendation, not the same as a 301. IMHO SEs are like 6-year-old kids: the less you leave open for interpretation, the less likely they are to get it wrong.
Makes sense, and given your time in the industry you are certainly more experienced than I am. So far I have been basing the behavior of canonical tags on actual results. Mainly: 1. When I apply the canonical tag on pages to point to another, the pages disappear from the SERPs. And 2. After replicating this experiment with the same results: https://www.seomoz.org/blog/using-canonical-tag-to-get-more-than-one-anchor-text-value-11283 it seems to me it is behaving very similarly to a 301.
But now that you have posted this comment... it makes me think... if it is treated as a recommendation, then... I could be wrong in my statement, and it just so happened that in my testing and actual results Google followed my recommendations every time. I guess the best answer would be further testing, and if the results are repeatable, then that is the time we can say it is a good theory.
Thanks for the comment; it puts more work on our testing board to really revisit and see how canonical tags have been working.
Would be looking forward to the results of the follow-up test, Benj. Would be interesting to know if Graywolf's recommendation theory is spot on, which I assume it is...
Nice post by the way.
Cheers!
I use the NoIndex/Follow tags on product pages that have duplicate content. I have a large ecommerce store with 8,000 products. All products are very similar, but not all products have active searches. I noindex the product pages, then change the meta robots to index/follow as I write new content. That has been the best way for me to handle 700 products with the same content.
In the meantime, I use category and sub-category pages to rank and drive traffic, as well as content pages and the product pages I have previously changed to unique content.
Nice recap, and I'll be curious to see more testing around the rel-alternate tag. One issue I've found with Robots.txt (besides not passing link-juice) is that it tends to be unreliable for pages that are already indexed. In other words, if you put it in place before a page exists or when you launch a site, Robots.txt will keep that page out of the index pretty well. Once the page is in the index, though, Robots.txt won't always kick it out.
I get it but trying to explain this stuff to my fiance or my mom, that's just plain impossible.
A very noob question:
How are you determining duplicate content?
Thanks.
- Internally, within a website, we normally all know the common things to look at, like www and non-www, index files, variable orders, etc. Basically everything I have listed in the summary above; I test for all of them. And if you are familiar with the site, sometimes you already know the answer to these questions.
- Sometimes at the end of the SERP pages, when you see omitted results, check them out; sometimes the duplicates get filtered out and are all in there. Sometimes even similar content ends up in the omitted results.
- With other websites and other domains, maybe even properties you also own, people copying or scraping your content, or resyndicating your RSS (which you are really sharing with the world, so it is not necessarily a bad thing), you can check for duplicate content by Googling long exact phrases or using tools like https://www.Copyscape.com
Great post, benjarriola! I have bookmarked the page as I am sure I will have to read it some time again in future.
But I have a question: do you really think duplicate content is such a big issue? I assume you probably will not rank as well as you could if you have exactly the same text/content on pages like /url, /url1 and /url2 which were created by mistake, but I doubt Google or any other search engine will penalize my blog if I open categories and archives to Googlebot.
Google now claims to figure out whether feedback was positive or negative (do you remember the 'Christian Audigier glasses story'?). If Google is so clever, why should it penalize the link architecture blogs have by nature? Personally I have categories and archives open. Do you think the pages of my blog will rank better if I close them?
P.S. Sorry if I seem to be rude or something, it's just that your post touched the strings of my heart. :)
P.P.S.: It is nice to be one of the first to comment here, on SEOmoz.org :)
Good question... there are a lot of pages out there where duplicate content exists, and some still seem to rank well even without fixing it.
All I can say about that is... Google already seems smart at determining which is the original among the duplicates and rewards that page accordingly. Although there are also testimonials from other people claiming a page outranked them, or that they disappeared, after a more authoritative or popular blog posted what they originally had on their blog. So Google is smart, but it may still sometimes make mistakes in choosing the original page among the duplicates.
Aside from that, some keywords are more competitive than others. In a less competitive market, and with fewer duplicates, it is less of a problem. But as a preventative, precautionary measure, I'd just fix every case of duplicate content I can fix.
And lastly... on the dark side... I believe it was an experiment by Dan Theis some time ago where multiple free web-based proxies were used to create multiple duplicate content pages to kick something out of the SERPs.
Comprehensive Information Benj!
I've not used alternate in the link tag before, so did some reading.
Here are some clarifying articles on how to correctly implement it for HTML4 and HTML5. Note that both of these articles mention that this attribute may be dependent on sibling attributes.
https://www.w3.org/TR/html5/links.html#rel-alternate
https://blog.whatwg.org/the-road-to-html-5-link-relations#rel-alternate
Thanks repriseaus, and in addition to the resources on the proper standards, here are a few resources where Google has announced how it will treat the tag when used on multilingual/international pages:
Google Webmaster Central Blog
https://googlewebmastercentral.blogspot.com/2010/09/unifying-content-under-multilingual.html
Google Webmaster Tools Help Pages
https://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=189077
Google Webmaster Tools Support Forum
https://www.google.com/support/forum/p/Webmasters/thread?tid=64d086930a0bd2d0&hl=en
I have seen a few SEO bloggers out there saying this can be used for duplicate content across PDFs and HTML pages, and it is actually the proper way to do it, since it is alternate content and that is how the standards were set. But I am not sure if Google is already set to view it in the same way; I have seen no official statement from Google saying the alternate tag works for HTML and PDFs with the same content.
Good post, I liked reading this. But I have one question. How about this:
I have no similar (or identical) posts. All of my posts are completely different. For example:
mydomain.com/my1category/post-title-about-book/
mydomain.com/my1category/another-book-post-title/
mydomain.com/my2category/good-pencil-title/
mydomain.com/my2category/best-post-title-about-pencil/
All four posts above have completely different content, not similar even in one paragraph.
Note:
a) I let my tags and my categories be indexed by search engines.
b) Each post has only one category (no double categories). But one post can have more than one tag.
My questions are:
1) Although the post URLs seem a little similar, will Google treat them as duplicate/similar posts?
2) Do I still need to use canonical?
Hope you could explain and share with me. Thanks for all nice people here.
I spent the last few hours trying to find out how to do a 301 redirect that adds a trailing slash for the home page ONLY. I am trying to redirect https://www.berricle.com to https://www.berricle.com/
I need a solution with minimal resources on the server. Just need it for the home page ONLY.
I tried this in the .htaccess file, but it didn't work.
Redirect 301 https://www.berricle.com https://www.berricle.com/
Please, someone help me with a solution. Thank you sooooooo much.
When I compare 2 sites we own... one is built completely wrong for SEO and the other is better optimized for SEO. The one that is built wrong completely dominates Google (over 1,000 keywords on page 1 of Google) and the other site just doesn't do as well... but it is 3 times the size and both have been online for 5 years. Both sites are article/content sites.
It is just funny how everything is hit or miss with this stuff.
Thanks for nicely explaining how to handle duplicate content and which option suits implementation in different SEO situations, like LSEO, ISEO and country-specific targeting.
I would like to have your advice on something. I was told by an SEO expert hired by one of my clients that a recent drop in rankings for their site was mainly due to a duplicate content problem between pages.
My client sells event and concert tickets in Montreal and other cities across North America. I've recently built city hub pages to promote some specific cities. These pages contain a good-sized paragraph about the city, some featured event links and links to all the events in that city. The SEO expert tells me this conflicts with the event search results page when a client does a search for all events in Montreal, which of course gives the same event listing as the city hub page, but with a different small upper text, different descriptions, and different title and keywords tags. I find it odd that Google would treat those as duplicate content pages and penalise our rankings (losing almost 50% of visits to those pages).
What would be your opinion on this? (Before I go in and do all the changes this SEO expert asks me to.)
Thanks
There is duplicate content and there is also similar content. Sometimes, if a page is very similar to another, even if differences are present, it can be treated as duplicate content too. I guess it boils down to how much is similar and how much is different. One signal to look at, to see if pages are considered duplicate content on your end, is the site: command in Google, which shows all indexed pages. The pages that are very similar end up at the very end of the last page, where it says omitted results. Check your omitted results if you have any; if many pages end up there, they are being treated somewhat as duplicate content.
Hi,
I have a few questions..
1) What if the content is spun? I have seen this happen, and the spun content got better rankings because they were promoting it more aggressively, and because of this my site was ranking well behind theirs.
2) Is it OK to have some part of the content be the same on 2 pages of the same site? E.g. the first paragraph of the Custom Coding page repeated on the Home page.
What about missing-entry redirection? Should we use the canonical link tag?
Let's say a page uses 'id' as a URL parameter to know which product to display, and a redirection to the category page is done if the product cannot be found (deleted or misleading link). Google treats this as duplicate content, but would the canonical link tag fix this?
Hi,
I am asking for your opinion which one of these methods to use(or another one) in the following case:
Our client has a blog section on his site. Currently there are fewer than 20 posts, so frequently, while searching for a specific tag, you get the same content, because 1 or 2 posts often share the same tags, and the SEOmoz Crawl Diagnostics shows these pages as "Duplicate Page Content". What is your advice, what do you recommend?
Thanks in advance, Ilian Iliev
I used rel=canonical for a relatively high-traffic website and I saw a drop in traffic, which I explain on my SEO blog.
I don't think rel=canonical works like a 301. It might tell Google where the parent page is, but I don't think it properly transfers the ranking of the duplicated page to its parent like a 301 does.
Hi, I have a B2C website which sells apparel. My website is designed with sub-directories for multiple languages. I notice that many pages are detected as having duplicate content and titles. For example, in one category there are 10 pages with different products, and Google seems to see duplicate content and only indexes it once. The website was translated into multiple languages, and Google seems to see duplicates there as well.
Can I use the 'Alternate Link Tag', 'Canonical Link Tag' and '301 Redirect' together? And how do I implement them on a B2C website that is targeting multiple countries?
Thanks
Multi-Language: Use Alternate Link Tag.
If one of the languages has duplicate content internally also, then you can use the Canonical Link tag and 301 redirect also.
So on a single page, you can have both Canonical link tag and Alternate link tag but they do not necessarily have to have the same href value.
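As a rough sketch reusing the example URLs from the post above (not from a real site), the UK page could carry both, with the canonical pointing at its own clean URL and the alternates pointing at the country versions:
<link rel="canonical" href="https://www.example.co.uk/path" />
<link rel="alternate" hreflang="en-gb" href="https://www.example.co.uk/path" />
<link rel="alternate" hreflang="en-us" href="https://www.example.com/path" />
<link rel="alternate" hreflang="en-au" href="https://www.example.com.au/path" />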
As for doing these with a 301... it would be a good precautionary measure to have everything in place, but generally I would give more priority to a 301 than to a canonical link tag.
Thanks, can you give an example of a website that also uses these methods?
Hi Ben, and thanks for sharing this very useful note on the canonical tag. We had discussions on FB regarding my online shopping website TheMiniMall.com, its duplicate content issues and the Not Selected Pages issue in Webmaster Tools. I studied your article, but I am not sure whether I should use the canonical tag or Disallow: *Duplicate* in robots.txt for my site. Please assist by providing some more details on this topic.
Thanks
Great post. Kindly suggest: if my Free Trial, Get A Quote, Contact Us and Sign Up pages have the same content, is it necessary to block every single page from Google, or can we use the noindex,follow tag on each page instead? Kindly suggest. If we block these pages from Google, is there any chance of leads decreasing? Waiting for your suggestions.
Don't 301 redirects also cause the page to load slower? If you have too many redirects, like with a legacy site that created 16 duplicate pages, wouldn't 16 301 redirects essentially slow down the page loading time immensely, and drive the organic ranking down in that way?
Great technical SEO report on duplicate content Benj.
Analysing technical points is a little challenging, but it's important for SEO.
I am creating a help desk for both agents and retail customers. My managers are suggesting robots.txt for the agent site to eliminate the duplicate content issue. I want to use canonicals. Since this article is dated, I'm wondering what are your recommendations? Thank you.
What about URL Parameters handled within Webmaster Tools? Would you say parameter handling is as effective or less effective as the measures you've listed above? Would you recommend using both?
Are www.example.com/ and www.example.com/?ref=blog also duplicate pages? How does Googlebot treat bookmarks, e.g. www.example.com/#video? Do they get counted as duplicate content?
And how beneficial are tag pages? My site is new, and example.com/tag1 and example.com/tag2 generate the same content. Is that harmful in terms of duplicate content?
Very useful post, but there's something I've been wondering about.
I've recently chosen my preferred domain to be www. With some of the pages (more so posts) on my website, viewers sometimes get a message similar to the one below:
This has caused a substantial decrease in my site traffic. Is there anything I can do to fix this issue? The pages still exist and the slugs have not changed. All I've done is change the preferred domain.
I doubt this is being caused by the preferred domain setting. And thanks for sharing your site, but on which page do you get this message?
When I view traffic statistics from the backend, it shows me links to the pages people are viewing, and where the visits come from. A lot of the traffic comes from either the Google search engine or its images directory. When I click on some of the Google links (specifically from their images directory) that direct viewers to pages on my site, I get a redirect notice like the one shown above.
I wonder if people see the same redirect notice that I encounter... I noticed a significant decrease in my traffic as well.
What are your thoughts on cross-domain canonicals? I have a client that has three websites in the real estate business. The original website was specific to one area, with a domain name that includes keywords for the area; then they added a new site that covers a large area (statewide) with a domain specific for that region, and finally a nationwide site.
Each higher-level site includes all the content of the lower-level sites (i.e. the content on the city site is included in both the state site and the country site), and the design of the pages is the same, so the content is almost 95% duplicated on those pages. They don't want to merge the sites, since the lowest-level site is generating most of the income and is very well ranked, so they can't risk it.
So the question is: does it make sense to use the canonical tag to indicate that the lower-level site is the original, or just leave all three sites with no canonical tags and let Google sort out the duplicates? Of course, the higher-level sites have additional content not included on the lower-level site.
I hope that made sense.
The cross-domain canonical link tag doesn't seem to work well for me. I would normally just do a 301 redirect instead for cross-domain cases.
The problem with the 301 is the user point of view. In our experience, people don't like being sent to a different website during their session. Within the same website it's fine, they don't even realize they were 301'd, but when you send them to a different site it really creates some confusion for them.
Hello Ben
Great post about redirect, block and canonical tips. But I want to know: in how many ways can we block pages so that search engine bots do not index them?
What I'd like to know is, does the tag pass link juice?
say if i had:
www.example.com/laptops
www.example.co.uk/laptops
www.example.pl/laptops
and then I had the other 2 corresponding alternate tags on each page (which will create an infinite loop of alternating),
and then built links to one of the pages, will this power then be distributed between them all?
thanks for the great summary.
but i'm not altogether clear on how you would implement rel=canonical as recommended in these two instances from your post:
The two cases we bump into are:
1. with a particular cms that generates SID/var1/var2 parameters. The CMS allows us to create the page, and specify the rel statement on that one source page. But the additional versions of the page with the extra parameters are outside our control - we can't specify rel=canonical on the var1 version, but not on the original version, as we only have one version we have access to;
2. utm parameters. Again, these parameters are added after the page was created, and are used for tracking in analytics. the original page that we create, should not have rel=canonical. But the additional URLs with tracking parameters should have rel=canonical.
In practice, how have you carried this out?
cheers
Chris
As long as you know how to code it, there should be no problem at all.
1. with a particular cms that generates SID/var1/var2 parameters. The CMS allows us to create the page, and specify the rel statement on that one source page. But the additional versions of the page with the extra parameters are outside our control - we can't specify rel=canonical on the var1 version, but not on the original version, as we only have one version we have access to;
You will need server-side programming here (PHP, ASP, ASPX, ColdFusion, Perl, JSP, whatever your cart was made in) and some conditional statements. In PHP it would be something like:
if (isset($var1) && isset($var2)) {
    // Both variables exist: output the canonical URL with the variables in the correct preferred order (no SID).
    echo '<link rel="canonical" href="https://www.example.com/' . $var1 . '/' . $var2 . '/" />';
}
What is happening here: if both variables exist, then display the canonical URL with the variables in the correct preferred order. It does not hurt to have the canonical link tag on the page whose URL is already the same as the URL in the canonical link tag, and it will help on the URL that has a different variable order. In the code, the SID is also excluded already.
This is a simplified example; of course it may change depending on the actual code. If you are using POST or GET variables, or if it is so complicated that you do not know where the URL folders are coming from, then use $_SERVER['REQUEST_URI'] to get the URL string and do some tricks with string handling, using functions like strpos, str_replace, preg_replace, strstr and more.
2. utm parameters. Again, these parameters are added after the page was created, and are used for tracking in analytics. the original page that we create, should not have rel=canonical. But the additional URLs with tracking parameters should have rel=canonical.
I'll probably do something like:
$RequestURI = $_SERVER['REQUEST_URI'];
// Strip the ?utm tracking parameters (everything from '?utm' onward), if present.
$RequestURI = str_replace(strstr($RequestURI, '?utm'), '', $RequestURI);
echo '<link rel="canonical" href="https://' . $_SERVER['HTTP_HOST'] . $RequestURI . '" />';
What is happening here: I get the URL string, which is the REQUEST_URI, I look for the ?utm and everything after it using strstr, then I take it all out using str_replace, and then I put the result back into the URL for the canonical tag.
Extra non-related story...
This answers the canonical question specifically. Although I had a nice chat with Jaimie Sirovich aka SEO_Egghead, where he approaches this problem in a different way, probably an even simpler implementation, but it really uses exclusion via robots.txt and adding URL parameters to indicate which page is the duplicate. That is another story and not really an answer to your question, but it is interesting too. And I am sure other people, other readers, may have other solutions I have not heard of yet. Maybe the more aggressive people might even try to 301 everything instead of using the canonical, but... cloak the 301 and show it only to Googlebot.
Benj,
Do you ever find having your articles "flipped" to be an issue? In the past I've had outsourcers flip articles, but not alter them enough... would this create a "duplicate" problem? Just curious. Thanks.
I am not a fan of article flipping or article spinning, although I have many friends who testify that it works. I guess it all depends on the degree or extent of the flipping/spinning. Sometimes it does look truly original, and duplicate content is no longer a problem. Although if you want to maintain readership, you will want to double-check that the articles still read nicely for users. But that is another topic entirely.
Great post, thank you.
I have embraced the rel=canonical tag because it's often the only option I have. My clients are generally small businesses with cheap hosting plans. You can't do a redirect through cPanel and the hosting company won't do it for you.
They will give you "code" to insert by yourself at your own peril. I tried to edit an .htaccess file to redirect the non-www version to the www version and crashed the website.
I'm using rel=canonical to redirect index.html files to the root directory, but I don't see how it could help with redirecting the non-www to the www.
thanks benj for this.
something about 301s creating an infinite loop for my pages, so I ended up using
I didn't know of the other one so I just learned :0
Nice post - just to join in the debate between 301 vs. canonical: I think it totally depends on what you're using it for. For example, it pays to do a 301 on a page that has duplicate content or has 'another version', but you can't do this for a product page, for example. If you have 1 product with 5 different colour variants, then I'd use the canonical tag, pick one of the colour options as the one I want to be seen as the authoritative page, and place the canonical tag on the other colour options.
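As a quick sketch of that setup (with hypothetical URLs), each of the non-chosen colour pages, say /widget-blue/ and /widget-green/, would carry a tag pointing at the chosen colour page:
<link rel="canonical" href="https://www.example.com/widget-red/" />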
None of these methods solves my duplicate content issues.
We sell batteries for cars. One type of battery fits in many cars.
We have these URLs
1: domain.com/car-make1/model1/batteryA -> BatteryA fits in this car, page is optimized for this car make and model and this battery
2: domain.com/car-make2/model2/batteryA -> BatteryA also fits in this car, page is optimized for this car make and model and this battery
3: domain.com/batteryA -> this is the mainpage of this battery
The content of these pages is almost duplicate
So one could say that we need to have a canonical link in 1 & 2 pointing to 3. But then my pages won't show up in the SERPs if someone searches for a battery for a certain car make and model
Well, in my opinion the solution is to optimize the content.
Maybe you could create a paragraph like "battery A is good because... and it is useful for..."
Create some specifications for the products; even if they are similar, explore the different details to create unique content and expand the amount of text.
This will help users see the differences between the batteries and gives them a more detailed product page.
Amazon usually does that: product pages rich with product details.
If you are targeting each of these pages with different target keywords, you might as well optimize them separately. I agree with luizamcalmeida: work on the content, and it is no longer duplicate content.
There's no easy solution here, but there are a few things I'd suggest thinking about:
1. In a proper faceted design you can address this by showing different permutations of products to the bots, targeting from the facet level, not the product level. Each product might have a slightly different set of applicable products, and you can tweak each facet page (not all platforms allow for this, but Endeca and FAST have a solution for this, as do we in our applications and retrofits). Done wrong, faceted design is a spider trap to begin with (see #4), however, so this is not an option for most.
2. What Ben is saying is ultimately correct. There's no way to fix this without authoring at least something. Do #1 or write product content. I think #1 is a little better.
3. In theory rel canonical _could_ sum the content and show the correct page per query. In other words they'd group or collapse the results and show the most relevant one. I doubt they do it, but someday they could — in this way canonical is more powerful than eliminating the result. Of course this only works when the duplication is in small numbers.
4. Canonicalization doesn't work at all when you have many orders of magnitude of useless or duplicate combinations of settings. You'll still need robots.txt for that.
Hope this helps someone. Feel free to drop me a line.
Jaimie Sirovich
SEO Egghead, Inc.
Professional Search Engine Optimization with PHP & ASP.NET (Wrox Press)
Nice article Benj.
I wonder if content generated by auto-translate modules/plugins will be considered duplicate content.
For example
"This is such a wonderful place" is translated to Filipino "Ito ay talagang napakagandang lugar" or in French "C'est un endroit merveilleux"
Will search engines treat the translated versions as duplicate content, or unique content entirely?
Thanks!
Alfred
Duplicate or not, I think the alternate link tag is still good to use since you get to target the respective countries better.
I agree,
Because you can have many pages in many languages that are still the same content. So, in this case, it is better for you to know in which language the page is viewed the most.
thanks Benj for the post. I would vote 301 each and everytime to eliminate thinking on anyone else part (read SEs)
One common type of mirror page is a printable version of a page. The canonical tag seems to me to be the appropriate method for handling this type of mirror page.
That is totally right; I forgot to mention that. Although if the printable version is a PDF... that is another story.
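As a sketch with hypothetical URLs, the printable page, say https://www.example.com/article/print/, would just point back to the main article:
<link rel="canonical" href="https://www.example.com/article/" />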
don't you lose a certain percentage of link juice with 301 redirect?
I find that the canonical tag is really a hit or a miss, especially in cross domain situations.
Nice to note, and this is where I believe a 301 is better: when it is a cross-domain situation.
One of the best posts I've read these days!
I like the canonical and alternate comparison a lot.
I have two clients, and the canonical is very useful for one of them: it's an e-commerce site and we had some problems before with duplicate categories in different places, but with the canonical we solved this.
The alternate link tag I have never used, but reading the post gave me a nice idea of where to apply it; sites with multilingual pages are the perfect example.
Very nice, and keep writing more and more for us!
Just to follow up on the debate: it works. Recently I've done a triple-domain swap with the canonical tag and I've managed to gain rankings for a similar (although not the same) set of keywords.
A 301 would just be killing one website or page, and in terms of user experience it would be bad IMHO.
Regards,
Really nice round up benjarriola. But robots.txt does not keep Google from indexing a page, it only keeps Google from crawling the page when the crawl initiates with your site. So if, for example, there is an outside link to a page or URL that is not crawled per robots.txt commands, the page/URL/content will still be indexed by Google. All it takes is one link and - bam - so much for robots.txt.
True. I totally agree but failed to mention it. Thanks for bringing it up. One added observation: if you block a page in robots.txt but it still gets indexed because of a link, it will not have a good title and there will be no description in the SERPs, but it is still listed.
Much of the post was about duplicate content within a site, but how does each of these methods work when it's content from a different site?
One of my clients has been copying news articles about his company on the company site. I was considering just getting rid of that content (since it's all copied from other sites) and just posting links, but if there's a way to keep the content without getting penalized, I'd go with that. What's the best way?
It cannot be avoided, but there are a few things that can be done:
AFAIK robots.txt doesn't exclude from indexing, only from crawling, making it a bad choice if you want to exclude pages because of duplicate content.
The meta robots noindex is the way to go over robots.txt if you want to exclude.
More info in the SEOmoz post on the Robots Exclusion Protocol.
Great post. I usually use a 301 for most of my websites and the canonical in rare cases. But this is the first time I have come across the alternate link tag. As I am dealing with multilingual SEO sites, I hope this will help a lot.
Thanks for this.
We have AJAX-driven URLs once a user is on our site, and static URLs that we deliver to an "SEO" directory. I want to implement rel=canonical on my site. Do you think I should use the static URL and not the AJAX URL? That's the direction I am heading in. Do you agree?
I would say static URLs. When you say AJAX URLs, I am assuming these are the URLs with the hash/pound/sharp/number sign. It is common SEO knowledge that everything after the hash is not read by search engines; they view the page as if the hash part was not there, since this is really a client-side technology that the browser reads, and server-side scripting cannot read it either.
Although many have reported that Google does read the hash tag, and Google has been playing with its headless browser for some time. https://googlewebmastercentral.blogspot.com/2009/10/proposal-for-making-ajax-crawlable.html
And many have seen Google differentiate pages with hash tags in the first link priority issue. https://www.seomoz.org/blog/the-first-link-counts-rule-and-the-hash-sign
[Off-topic story: I remember playing around with this experiment before it was announced by SEOmoz. I am not saying I was first. I was actually walking in the parking lot of OMS, saw Rand Fishkin also lost, and had a brief conversation about 'first link'. He told me about the hash tag URL effects on first link, and after a few months... the blog post came out from someone on YouMoz with the test that was done. So I'm not first, my source was still Rand. :) ]
What I do is, even if I have AJAX content and AJAX links, the default static content and links that load are still... well, static. Better to see an example than to explain it. Check https://www.ajaxoptimize.com/
Bookmarked this one... Very clear! Thank you so much!
Nice information about Duplicate content benjarriola, Thanks for your post!
I tend to recommend 301 where possible and canonical elsewhere. I've recently moved whole blog domains using canonical though and it worked well and fast too.
Great post! Thanks for laying out all of the options.
I have used <meta name="robots" content="noindex,follow" /> and it has definitely worked for me. I have staging URLs that aren't linked to anywhere (or anywhere that I can find!) yet they were still indexed - blocking via robots.txt definitely did NOT work! Thanks for the other suggestions, too.
If you were testing whether they are indexed or not, the tag will definitely work. Now the question is: are links going to this page still counted, and are the links on the page itself still passing link juice to where they are going?
Another advantage of using the 301 redirect to clean up duplicate content is that you control where your links point by redirecting to either the www or non-www version (whichever is preferred) of each of your site's pages. There is no point having links pointed at the wrong version of a page, diluting link juice by dividing up where links point. And REALLY no point in having 2 versions of each of your site's pages, creating duplicate content. It's surprising to me how few sites use this easy fix.
Sometimes it is not that easy if you have a large enterprise site of 10,000 pages or more.
Aside from that, there are also links coming from other people, on other websites you do not own and thus cannot control. So this is where all the solutions in the blog post come in. :D
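For what it's worth, a minimal server-side sketch of the www/non-www 301 in PHP, assuming example.com as a placeholder host, could look like this:
if ($_SERVER['HTTP_HOST'] === 'example.com') {
    // Permanently redirect the bare domain to the www version, keeping the requested path.
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: https://www.example.com' . $_SERVER['REQUEST_URI']);
    exit;
}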
Impressive post mate
It serves as a good resource for anyone tossing up between the options.