We know that Google tends to penalize duplicate content, especially when it's something that's found in exactly the same form on thousands of URLs across the web. So how, then, do we deal with things like product descriptions, when the manufacturers require us to display things in exactly the same way as other companies?
In today's Whiteboard Friday, Rand offers three ways for marketers to include that content while minimizing the risk of a penalty.
For reference, here's a still of this week's whiteboard!
Video Transcription
Howdy Moz fans, and welcome to another edition of Whiteboard Friday. Today I'm going to be chatting a little bit about a very specific problem that a lot of e-commerce shops, travel websites, and places that host user-generated and user-review content experience with regard to duplicate content.
So what happens, basically, is you get a page like this. I'm at BMO's Travel Gadgets. It's a great website where I can pick up all sorts of travel supplies and gear. The BMO camera 9000 is an interesting one because the camera's manufacturer requires that all websites which display the camera contain a lot of the same information. They want the manufacturer's description. They have specific photographs that they'd like you to use of the product. They might even have user reviews that come with those.
Because of this, a lot of the folks, a lot of the e-commerce sites that post this content, find that they're getting trapped in duplicate content filters. Google is not identifying their content as being particularly unique. So they're sort of getting relegated to the back of the index, not ranking particularly well. They may even experience problems like Google Panda, which identifies a lot of this content and says, "Gosh, we've seen this all over the web, and thousands of their pages, because they have thousands of products, are all exactly the same as thousands of pages on other websites."
So the challenge becomes: How do they stay unique? How do they stand out from this crowd, and how can they deal with these duplicate content issues?
Of course, this doesn't just apply to a travel gadget shop. It applies broadly to the e-commerce category, but also to categories where content licensing happens a lot. So you could imagine that user reviews of, for example, things like rental properties or hotels or car rentals or flights or all sorts of things related to many, many different kinds of verticals could have this same type of issue.
But there are some ways around it. It's not a huge list of options, but there are some. Number one, you can essentially say, "Hey, I'm going to create so much unique content, all of this stuff that I've marked here in green. I'm going to do some test results with the camera, different photographs. I'm going to do a comparison between this one and other ones. I'm going to do some specs that maybe aren't included by the manufacturer. I'll have my own BMO's editorial review and maybe some reviews that come from BMO customers in particular." That could work great in order to differentiate that page.
Some of the time you don't need that much unique content in order to be considered valuable and unique enough to get out of a Panda problem or a duplicate content issue. However, do be careful not to go way overboard with this. I've seen a lot of SEOs do this where they essentially say, "Okay, you know what? We're just going to hire some relatively low-quality, cheap writers." Maybe English, or whatever language the country you're targeting speaks, isn't even their first language, and they write a lot of content that just all sits below the fold here. It's really junky. It's not useful to anyone. The only reason they're doing it is to try and get around a duplicate content filter. I definitely don't recommend this. Panda is built even more to handle that type of problem than this one, from Google's perspective anyway.
Number two, if you have some unique content, but you also have a significant amount of content that you know is duplicate yet still feel is useful to the user and want on that page, you can use iframes to keep it out of the engines' index, or at least not associated with this particular URL. If I've got this page here and I say, "Gosh, you know, I do want to put these user reviews here, but they're the same as a bunch of other places on the web, or maybe they're duplicates of stuff that appears on other pages of my site," I'm going to take this content, build a little iframe around it, and embed the iframe on the page. That way this content isn't perceived to be a part of this URL. It's coming from its own separate URL, maybe over here, and that can also work.
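In markup terms, that's just an iframe on the product page pointing at a separate fragment URL. Here's a minimal sketch; all of the URLs and file names are hypothetical:

```html
<!-- On the product page, e.g. /products/bmo-camera-9000: -->
<iframe src="/fragments/bmo-camera-9000-reviews.html"
        width="600" height="400" title="User reviews"></iframe>

<!-- The framed document itself (/fragments/bmo-camera-9000-reviews.html).
     A robots noindex meta tag keeps the duplicate reviews out of the
     index entirely, rather than just dissociated from the product URL: -->
<!DOCTYPE html>
<html>
<head>
  <meta name="robots" content="noindex">
</head>
<body>
  <!-- manufacturer-supplied reviews / boilerplate go here -->
</body>
</html>
```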
Number three, you can take content which is largely duplicative and apply aggregation, visualization, or modifications to that duplicate content in order to build something unique and valuable and new that can rank well. My favorite example of this is what a lot of movie review sites, or review sites of all kinds, like Metacritic and Rotten Tomatoes do, where they're essentially aggregating up review data, and all of the snippets, all of the quotes are coming from all of these different places on the web. So it's essentially a bunch of different duplicates, but because they're the aggregator of all of these unique, useful pieces of content and because they provide their own things like a metascore or a Rotten Tomatoes rating, or an editorial review of their own, it becomes something more. The combination of these duplicative pieces of content becomes more than the sum of its parts, and Google recognizes that and wants to keep it in their index.
These are all options. Then the last recommendation that I have is, when you're going through this process, especially if you have a large amount of content that you're launching with, start with the pages that matter most. So you could go down a list of the most popular items in your database, the things that you know people are searching for the most, the things that you know you have sold the most of, or the pages that internal searches have led to the most; great, start with those pages. Try and take care of them from a uniqueness and value standpoint, and you can even, if you want, especially if you're launching with a large amount of new content all at once, take these duplicative pages and keep them out of the index until you've gone through that modification process. Then you sort of go, "All right, this week we got these 10 pages done. Boom, let's make them indexable. Then next week we're going to do 20, and then the week after that we'll get faster. We'll do 50, 100, and soon we'll have our entire 10,000 product page catalog finished and complete, all with unique, useful, valuable information that will get us into Google's index and stop us from being considered duplicate content."
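A quick sketch of what keeping pages out of the index until they're ready can look like, assuming your templates let you toggle a robots meta tag per page:

```html
<!-- While a product page still carries only the boilerplate description: -->
<meta name="robots" content="noindex, follow">

<!-- Once the page has its unique content, drop the tag or flip it: -->
<meta name="robots" content="index, follow">
```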
All right everyone, hope you've enjoyed this edition of Whiteboard Friday. We'll see you again next week. Take care.
Like others, the iframe part threw me for a loop. Didn't Google say they "try" to associate the content in an iframe with the actual page it is on? - https://support.google.com/webmasters/answer/34445?hl=en
If they are successful in associating the content in the iframe with the page the frame is on, I'm not sure if that solves the problem (and only Google would really know).
When there is a lot of it, I have always just created an image of the duplicate content and placed that on the page along with my own unique content.
I like your idea: an image of duplicate content surrounded by unique content. And the alt attribute is completely under your control. A good idea, for now.
But Google is working on algorithms to read the text in images, and in the future that may become a problem.
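For clarity, the technique under discussion is just this; the file name and alt text here are hypothetical:

```html
<!-- Required boilerplate rendered as an image; the alt text stays short
     and unique instead of transcribing the duplicate copy: -->
<img src="/images/manufacturer-disclaimer.png"
     alt="Manufacturer warranty disclaimer for the BMO Camera 9000">
```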
Put noindex in the HTML doc in the iframe and you're all set ;-)
An excellent complement to the iframe idea.
Thomas' idea is a solid one, but I'd also point out that "associating the iframe content with the page" isn't the same as considering it part of the page for filtering based on duplicate content. If you're using the iframe system to keep out substantively duplicate stuff, you should be fine even without the noindex. If anyone's seen examples to the contrary, please let me know! It's been a few years since I actually tested this (though the sites I've seen do it still seem to be fine).
While the idea of blocking content via iframe is great, wouldn't it be counted as a form of cloaking? We were debating this tactic internally, and my personal perspective is that hiding something (intentionally) from search engines isn't a good practice. Even if Google is lenient about it today, it isn't a future-proof approach.
Cloaking to earn more traffic could be an issue, but Google's totally fine with you keeping anything & everything you'd like out of their index. Noindexing an iFrame isn't something I could imagine them ever being upset about (unless you're doing it on top of other very shady stuff).
Great idea!
I've also found it useful to take a screenshot of the content and give it an alt tag, especially when working with law firms, where there are a lot of legal disclaimers that must follow the same verbiage in all advertising.
That's a pretty good idea. It works for shorter things, but what about when you have massive product specs? I always look towards Amazon.
DC is a big problem for e-shops that download descriptions, photos, etc. My client has an online bookshop with 200k+ books. All the descriptions are the same as those seen on many similar websites. I know Google penalizes that, but there's no possibility of writing a unique (good, not short) description for every book. SERPs for "title + author", "author + title", and "title + price" weren't good until the changes I proposed were made on his website. How to handle DC? By adding comments, reviews, or "expectations" (all checked with Copyscape and smallseotools' plagiarism checker!), a small discussion section on the book page, and a helpful/not helpful button. For example: without those, DC is around 50% of all text on a subpage. With 3 simulated reviews and 10 comments, it dropped to 10% or lower. Adding new reviews and comments decreases the % of DC.
How to persuade customers to write reviews? Two ways I think are good (and I've seen them work in several places, like his online bookstore):
- a small % discount, but the reviews must be written by hand, not copied, after the book was bought; or written as "expectations" before buying, with a proper review to follow afterwards
- an increasing number of reviews (with a specified helpfulness rate) increases the % discount for reviewers
And don't forget rel="author":)
Maybe you have another idea? If you have more questions, PM me :)
Interesting thoughts, Krzysztof. 200k+ is a lot of books!
I would strongly caution against "simulated reviews," as these are often easily identified and can potentially hurt your/your client's brand.
Real reviews on one's site can be good and help with DC, and they are most valuable when real and not paid for (i.e., offering discounts; how could someone legitimately review a product before they buy?).
The value of soliciting reviews for one's own site to address DC issues has to be weighed against the value of having those same users leave reviews on trusted 3rd party sites (Google+, YellowPages, InsiderPages, etc.) which helps drive traffic through higher ranking listings using user reviews that are more likely trustworthy.
The simulation was just for testing: to see how the DC percentage would drop if I added those reviews and comments. That was an internal test, not published on the book pages at the front end.
...and day by day the % of DC will drop as new reviews and comments are added (after a Copyscape/plagiarism check).
Reviewing before buying is a good point; I plan to remove that. I plan to add a badge like "buyer"/"he/she bought this" and to block reviews until 7 days after buying (they must read it :D).
Or "expectations" will be still there but I'll told my client to pass only "not-review" posts.
As I said in the video, starting with the most important pages first and going down the list, doing 10-20 a week can still be hugely valuable, even if you never get to all of them.
Agreed, but you know the bigger impact is from book pages, not category pages. Fortunately the number of reviews is increasing :)
Very late, but perhaps someone still cares. It would probably be more efficient to maintain a good blog, publish your own reviews, etc. That way the site attracts people interested in books; if they like your content, why not buy? Leave the 200k books as they are. Help Google with markup to understand that you sell all these books. And for the bestsellers, as others recommend, add good content of your own to the book pages.
What are your views on the googleoff/googleon tags, which remove specific parts of a page from the index, as an alternative to the iframe option? See https://developers.google.com/search-appliance/documentation/610/admin_crawl/Preparing under "Excluding Unwanted Text from the Index".
Jacob, those tags are not for googlebot but for Google's Search Appliance. They have nothing to do with web search ;-)
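For reference, this is roughly what those tags look like in a page. As Thomas says, only the Google Search Appliance crawler honors them; googlebot ignores them:

```html
<p>Unique, indexable copy here.</p>
<!--googleoff: index-->
<p>Licensed boilerplate the Search Appliance should skip.</p>
<!--googleon: index-->
```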
+1 Thomas. Yahoo! used to have an operator that let you exclude individual blocks of content, but no longer. I believe iFrame or Javascript calls to noindex'd/robots.txt-blocked content might be the only ways these days.
Thanks for clearing that up, and sorry for the late reply, Thomas. The notification e-mail went into my spam folder :(
Nice WBF, Rand. I particularly like the advice on how to approach a content creation project, because I think a lot of people just freeze out of fear when someone says "you need to get unique content onto all of your product pages". For someone with a few thousand products that could feel like a Herculean (and in the face of various Google algo' updates, Sisyphean) task, but starting with the most important products - by which I take it you mean those which are already earning the most revenue - and doing a few at a time makes it feel much more accomplishable.
Great point! Getting unique content for your product pages is definitely not an easy task, but having a starting point makes it achievable.
Yeah - exactly. Using traffic + links + social signals on pages is a great way to sort them for prioritization.
Even though some companies require the same pictures and descriptions of their products, it could work to ask those companies if they can make an exception. Just prepare a few samples for them where you make some minor changes to the text, together with some high-quality pictures, and they might agree. In Germany we have a saying: "Fragen tut nicht weh" (English: "Asking doesn't hurt").
I like that approach!
This applies to sites with duplicate content from other sites... but what about a site with duplicate content within its own pages? For example, a real estate site like https://www.remax.com or https://www.apartments.com, where amenities would be duplicated. Would that be so bad?
Another awesome WBF and some Adventure Time thrown in for good measure, well done!
Another great installment of Whiteboard Friday. Duplicate content is definitely at the forefront of my mind when analyzing sites. The iframe idea is a new one to me. I was of the understanding that iframes were bad for SEO. I was wrong, and now I know why. Thanks for the great information.
Rand,
Thanks for the post. How can one overcome a (seemingly inherent) bias in search results toward particular types of websites? For example, for product-specific queries (especially brand- and model-specific ones), we are seeing that either e-commerce sites appear very high or the source (read: manufacturer) websites appear high. What can be done to help a product page on an information aggregator rank higher than the product page on the manufacturer's site? Differentiating content alone doesn't seem to work. I look forward to hearing from you (and from others in the community as well).
Thanks,
Manoj
Would a fourth option potentially be to put the product descriptions in block quotes?
Also, if using an iframe for the product descriptions, would you recommend setting up a separate site and putting it all in there, rather than using the manufacturer's site for the iframe (and going through all that fuss)?
Hi Rand! Nice information. Most SEO people have the same problem, which is duplicate content, especially on e-commerce websites. If I present some of the content as an attractive visualization, will that work instead of using an iframe?
Isn't it possible that when Google sees highly similar content across many websites, they simply ignore the content as opposed to penalizing it, since they are well aware that such data is part of this kind of business model? As a result, it may not be necessary to worry about penalization in many cases, but rather only about how to make those near-duplicate pages more unique. So noindexing, iframes, etc. may not be needed in most cases. Any thoughts on this?
The idea of using iframes to keep content from being indexed by Google, as a way to manage the duplicate content issue, was really cool. Thanks for such a great WBF, Rand.
Fantastic Whiteboard Friday. We've noticed similar (duplicate content) issues on job boards where recruiters will copy/paste one job description across multiple job boards.
Hi Rand,
Awesome presentation. I have a question.
I am managing a camera comparison website, and as you know, we have the same specifications everywhere.
So I am adding the top three features of each camera to the page to make it unique.
What other things can be done? Ideas?
So, 10,000 pages at 100 a week is still 100 weeks (or two years!). And that's without adding any new products. I'd suspect that you're *not* going to get continued value out of doing this. Using the example of cameras, you're not going to get the same ROI on say, optimizing a page with a small packet of camera screws as you would from a camera (I'm guessing? I'm not in Big Photo, so maybe the other way round?). When taking this approach, segmentation is *everything*.
The iframe is a great idea! I will test it, thanks!
What about if you spend a lot of time creating unique content, and 100 other pages copy this material without even linking to your site, and don't respond to your emails? How do you deal with this problem?
Rand, my question here is: I have 3 e-commerce websites, say A, B, and C. Two of them (B and C) are subsidiaries of a single business (A) and direct the buyer to the main site (A) for the buying process only ("Interested in buying? Click here" on B and C takes you to A), and they have all the specifications, dimensions, and content exactly the same as the main website (A). That means triple occurrences of the same content. What would be the best way to handle this situation and avoid duplicate content and Panda penalties?
Hi Rand,
Thanks again for a great video. I am interested in the 3rd point where you talk about aggregate sites such as Metacritic. You say that google recognizes that the aggregation of content brings added value and that it will index it.
We have a review software website based on aggregation. Those pages do not rank at all in the SERPs. Now you may say our low DA reflects how Google perceives us: not good, I'd say, for now. But it is the same chicken-or-egg problem. We built content we thought would bring added value, and thus have Google rank us higher in its search results, but that does not seem to be the case.
Some say we should put the following tags around it, <!--googleoff: index--> and <!--googleon: index-->, to keep Google from crawling that content...
My question is simple: What would you personally do if you had to launch a website based on aggregation :)?
Best
Our site isn't an e-commerce site, but we do have about 6 pages featuring very similar product features and duplicate verbiage. However, we don't want to rewrite it, because we want to make it easy for users to compare apples to apples. We also have to run disclaimers at the bottom of each page.
Would iframing the product descriptions and disclaimers be beneficial in this scenario, with the addition of good content?
Another great vid, thanks! With regard to using content on one's own website or properties, across several "intention pages": is it possible to use content this way without a penalty?
For example, one of our sites has three different intention pages on a WordPress platform. These are Pages, whereas the content for them is created as Posts and inserted into a Page based on its relevance to that intention page's theme. So posts 1, 3, and 5 are relevant for one page, whereas posts 1, 3, 6, and 8 are relevant for another intention page.
Because we are essentially aggregating the content around the theme of the page, differently for one page versus the other, would this cause problems?
In a previous WBF, Rand discusses this as a way to reduce the amount of content; just needing some clarification.
Hi Rand,
This is a great article. I have a question about duplicate content. What's the best way to avoid duplicate content penalization for a network of blog sites that syndicate the exact same articles across that entire network? I'm dealing with this currently and haven't figured out the best course of action.
Thanks in advance for the help!
-Anthony
Hello Rand,
I just enjoy your videos so much.
I have one question: how about using the googleon and googleoff tags instead of an iframe/AJAX?
Thanks,
Sandip
I believe those are only for Google's Search Appliance, not for general web indexing.
Nice video. This is really helpful.
Question, Rand: can purchased photography from stock websites (used by many) ever become duplicate content? That would obviously mean being hit by the Google shark engine.
I'd like to know what people would suggest for classifieds and job board style sites. In some verticals nearly all the content can be duplicate, due to the use of data feeds to share/syndicate adverts from one portal to another, or from one advertiser to many sites, as car dealers do. This also creates huge problems of scale on sites with many hundreds of thousands of pages. Is it a case of leaving this issue to one side and working on other factors like page speed, usability, authority, and areas where content can be added, such as blogs?
Anyone else have these issues? How do you cope?
Interesting idea to use AJAX! I'm less familiar with advanced coding practices, but would putting the duplicated content in Javascript also work? I once audited a large, high-tech website whose content could not be indexed at all because it was entirely in Javascript. (Though I later found on Moz that the Javascript itself could be done in a way that allows indexing.)
Nice WBF, Rand. The most typical instances of duplicate content in an e-commerce or online marketing environment are product descriptions. Often, manufacturers provide descriptions for resellers of their products to use. The only trouble is that when all the web shop operators use the same product description, this again creates duplicate content issues.
If they only do that (copy), then yes. If they add content, the % of DC decreases, just like I wrote above.
Some nice options, Rand; there is not much discussion of this topic. The only issue is that this is all time-consuming, which is fine as long as we are making sites search-engine friendly.
OK, here is my issue. I have original content on our company's website in the form of press releases. These then get picked up by many, many different sources. In lieu of tweaking the content on our site to be "just different enough," any suggestions? The rel=canonical tag doesn't seem appropriate for this application, and others have suggested a subdomain just for these press releases (i.e., press.company.com). Frankly, I don't see how this (a subdomain) would help. Any thoughts or suggestions?
The canonical tag might actually be the best way to deal with it if you want to be as friendly to Google as possible. Otherwise I strongly advise nofollow-ing all the links in the press releases and adding an also-nofollowed link at the bottom stating the original source.
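A sketch of what the canonical option would mean in practice, with hypothetical URLs: each site republishing the release would include a cross-domain canonical in its <head>, pointing back at the original on your site (this assumes the republishers will actually add the tag):

```html
<!-- On the syndicated copy, e.g. https://news-partner.example.com/company-press-release -->
<link rel="canonical" href="https://www.company.com/press/original-release">
```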
rel="author" is there? If yes - great:) rel="canonical" is good on Your own website.
Thanks
www.something.com is a subdomain
something.com is a domain
other.something.com is a subdomain too
"Other" and "www" aren't very closely related to each other, and Google sees them as two "websites" rather than one "website". That's why they suggested that idea to you.
Tell me if I'm wrong (not 100% sure if this is still/was true)
Those are some good ways out of the DC trap. I used to create a lot of unique content, not only at the bottom of the sites, btw, but I really never thought about putting an iframe around some of the duplicate parts... that's cool.
We have a shop system, mostly used by game stores, computer games for the most part. And of course there are reviews and descriptions for each game, so much duplicate content; the idea is great. Maybe a checkbox to put this into an iframe... that would really help a lot. Most customers aren't good writers, so it is like you said: they hire some people from anywhere, people who have never heard of that game, and these people write about it...
Thx for that WBF - and enjoy the weekend!
Great post Rand!
My concern is about iframes!
Are they useful? People can use iframes in negative ways, so Google may introduce some criteria about iframe content on web pages in the future.
I also agree with Jacob Worsøe that we can use the googleon/off tags to prevent indexing in Google.
Overall, the concepts you explained are good.
I want to add some thoughts on where we can modify our content to make it unique and valuable.
See highlighted portions in this screenshot of Flipkart. -
https://www.diigo.com/item/image/2u3oe/0xu9
What do you say?
The iFrame suggestion is actually quite good -- that's exactly what they're for :) The negative uses you mention are probably related to replacing a whole URL's content instead of complementing it. I'm sure Google can tell the difference.
Nice details in the post; however, I am not clear about point TWO, where you say to add content that is important from the user's point of view to an iframe, which keeps it somewhat away from the search engine's index. Everyone would like the content on their webpage to be indexed by the search engine, for the simple purpose of improving the page's relevancy to the search query. And that data is very important and would attract more searches, specifically because of its popularity. Having that data in an iframe or an image can solve the user side of the business, but not deliver the search ranking benefit, is what I think.
Nice post, thanks. To add some practicality to it all, though: this wouldn't help with descriptions duplicated across websites, but if you have duplication across your own internal website, you can look into things like rel=canonical and also pagination.
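On the pagination point, a quick sketch of the rel=next/prev link elements Google supported for paginated series; the URLs are hypothetical:

```html
<!-- In the <head> of /category/cameras?page=2: -->
<link rel="prev" href="https://www.example.com/category/cameras?page=1">
<link rel="next" href="https://www.example.com/category/cameras?page=3">
```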
Matt Cutts also answered a similar question not so long ago -
https://www.youtube.com/watch?v=Vi-wkEeOKxM
This was again about just one site, but he says it's not a big deal (in fact he says it twice).
Some of the above are great if you're having problems on one page, but you bring up some excellent points for when you have duplicate content that's across domains and is unavoidable. The one thing I always like about SEO is that there is always more than one way to solve a problem.
Thanks again for a nice WBF
Well, Matt Cutts talks there about a legal disclaimer on just one site. That is quite a different issue.
It is and it isn't. If you have a product description that has to be there (similar to the terms for that product), duplicate content issues rarely result in a penalty. It is more about Google knowing which page they should rank above the others.
I would also like to point out the following helpful post -
search engine land
This comes back to Rand's point about making each of your pages unique so it stands out from the 100 pages that are the same.
-
I said "similar question," implying it was relevant to the subject matter but not necessarily on the subject.
"If you had it across internal website"
Google understands that boilerplate content like disclaimers and terms & conditions is required by law and shouldn't penalize on those grounds.
Another Great WBF! Content duplication causes a lot of issues for online stores but also provides a great opportunity for smaller e-commerce stores to rank competitively for a product if they are willing to write unique descriptions.
Thank you, Rand, for sharing such helpful tips for dealing with duplicate content.
Love the iframe idea, Rand! If the manufacturer is requiring certain pieces of copy and images to be included with the product listing, an iframe would be a great way to keep it away from search engines (if the duplicate content is indeed on thousands of other websites).
Ultimately, this is just a "common sense" approach. The website should be thinking to themselves: "How can I differentiate this camera on MY website so people want to see mine versus any other site?" Which is of course where the unique content comes in.
Other reviews, testing the camera versus others, blogging about the camera (maybe a few different topics) and then including snippets of the posts on the product page, unique images / videos of the camera and its options, etc. Unfortunately - which is probably why you made this video - not many SEOs think with this "common sense" marketing approach…
One can dream of a day where ALL SEOs do, however.
Many product descriptions for popular items are the same across a ton of sites! Trying to differentiate your product page is good for SEO, and users who are shopping will notice you. I completely agree with Rand that starting with your 'best' or most profitable items gives you a chance to plug away at updating your product descriptions.
Great post, Rand.
Great post, Rand, and it's good that you've covered one specific area in your post. I just want to add a few points here.
The best approach is to convert duplicate content, like manufacturer-provided product descriptions and reviews, into images. Then, to add content to the product pages, you can include your company's qualities and guarantees, like "24/7 customer support, on-time delivery," etc.
Another idea is to add a few points under headings like "Why buy from us?" or "Why choose us?". One more idea is comparison: compare the product with other products, and likewise with services and brands.
Include a Q&A section for the public. This will help you get original, user-generated content, plus you'll find out the genuine problems of your potential audience.
You can also add images and videos to make your product page comprehensive, and you can add a few lines of description under those visual assets. And if you are willing to use the supplier-provided content in text form, simply modify it by converting it into bullets, excluding a few points, or merging in a few general lines of your own.
Remember to make full use of the review section by adding unique and genuine reviews. You can add reviews in other languages too, with their country names, if you are dealing worldwide. I am sure Panda will have no issue with it, because it is just like people from around the world (like me) commenting here, and not all of them possess exceptional English writing skills.
Anyhow, great post, Rand, and I hope these points can be useful too if considered well :)
Good WBF, Rand, but it can be a hectic task to implement an iframe on every product page, because I have more than 40K products on my website.
Hey Rand,
Superb post. I am always waiting for WBF because every post published there is wonderful. Today's post gave me a solid understanding of duplicate content, which is very helpful to every reader.
Thanks for this post!!!
Rand, thanks for this WBF.
How about loading with uncrawlable AJAX instead of an iframe? Iframes have a reputation for hosting spam and being a way to inject crap. Also, just because iframe content doesn't show in Google's cache doesn't mean Google doesn't use it, both for URL discovery (as a matter of fact, Google finds and follows links in iframes - https://www.seroundtable.com/google-iframe-link-14558.html, https://www.rimmkaufman.com/blog/do-search-engines-follow-links-in-iframes/31012012/) and for content quality. If you think about it, Google's public cache is most likely not the same as what they use internally. A spam page will not show in Google's cache, but it will be used to identify spam networks and patterns.
Yeah - Google is likely connecting iframe content and the links inside and associating with the page, but I haven't seen them actually filter pages out based on duplicate content in iframes, so you're likely safe there. That said, the uncrawlable AJAX (remote javascript calls to places blocked by robots.txt for example) can also work.
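For anyone curious, a rough sketch of that uncrawlable-AJAX setup, with hypothetical paths: the duplicate fragment lives under a directory that robots.txt disallows, and JavaScript pulls it in after the page loads:

```html
<!-- robots.txt at the site root would include:
       User-agent: *
       Disallow: /fragments/
-->
<div id="licensed-copy">Loading reviews...</div>
<script>
  // Fetch the boilerplate from the blocked path client-side, so the
  // duplicate text is never crawled as part of this URL.
  var xhr = new XMLHttpRequest();
  xhr.open('GET', '/fragments/bmo-camera-9000-reviews.html');
  xhr.onload = function () {
    document.getElementById('licensed-copy').innerHTML = xhr.responseText;
  };
  xhr.send();
</script>
```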
Definitely enjoyed reading this post and absorbing the ideas. I like the idea of aggregation. But the first commenter, Ben Morel, is right: if you have thousands of products, this would really suck. I just hope Google can come up with a better algorithm. Duplicate/repetitive doesn't necessarily mean inferior. Sometimes there's just a best way to present something, and editing it just to prevent duplication is counterproductive, I think.
Interesting points, Rand. With regard to your comment about using iframes so that a segment of content is not indexed as part of the page: I am sure I read somewhere that there was a tag that could be inserted around a block of HTML to tell search engines not to crawl it as part of the page. I can't remember if this was HTML5 or a microformat, but I'm sure I've read about it; I can't find it now when searching, though. Can anyone confirm this? If not, I must have dreamt it. If so, it would be a good idea, a lot less clunky than iframes.
I believe you're referring to the on/off tag. https://developers.google.com/search-appliance/documentation/610/admin_crawl/Preparing#pagepart
Interesting topic. This is the main problem that I usually face with e-commerce websites. I really like the 3 points that you discussed to solve these kinds of duplicate content issues. But the web is a place where no website can keep its content unique. Suppose I update my website with unique content, and after a couple of weeks I see that 5 other websites have copied the same content from my website. What will happen then? Obviously, there is no way to stop someone from copying my content.
What do you recommend for keeping 300K product pages out of the index while optimizing the PDP pages?
I have a question... On my sites I sell and give away digital images. I might have twenty pages of images that are basically variations of the same image. For example, I recently created twenty-four pages of Biblical images of "Jacob and Esau." So I wrote 24 short descriptions, one for each image, but essentially I'm using synonyms for the same terms. I don't have a lot of content on each page, basically a short title and then the image dimensions, etc. I'm wondering how to avoid a DC penalty in this situation. Maybe I should focus the original content on the category page that holds all the images in that set...? I don't really have reviews... Anyone have a suggestion for my situation?
~Karen
As far as the duplicate issue is concerned, you need to create different tags and a different description for each image, even when they are variations of the same image.