Since Google released the Canonical Tag in early 2009, we've heard the same SEO horror story play out over and over. It boils down to this: "I accidentally canonicalized my entire site to one page, and my site was completely dropped from the index." Although the anecdotal evidence of rel-canonical going very wrong was overwhelming, I decided it was time to get some firsthand data in an effort to help people both avoid this problem and potentially fix it.
Experiment Overview
First things first – throughout this post, I'll refer to the "Canonical Tag," by which I mean the HTML directive <link rel="canonical"... /> and not canonicalization in general. On August 23, 2010, I added the Canonical Tag sitewide to my usability blog. Each tag was identical, canonicalizing every page to my home-page:
<link rel="canonical" href="https://www.usereffect.com" />
As much as possible, I made no other content changes during the experiment. Every day, I measured ranking for a couple of critical terms along with Google's indexed page count (using the "site:" operator).
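I collected those numbers by hand, but if you'd rather automate the logging, something like the sketch below would do it. It assumes access to Google's Custom Search JSON API (the key and engine ID are placeholders), and remember that "site:" totals are rough estimates at the best of times:

# Rough sketch: log the daily "site:" indexed-page estimate to a CSV.
# API_KEY and ENGINE_ID are placeholders, and the count is an estimate.
import csv, datetime, json
from urllib.parse import urlencode
from urllib.request import urlopen

API_KEY = "YOUR_API_KEY"      # placeholder credentials
ENGINE_ID = "YOUR_ENGINE_ID"

def indexed_count(domain):
    query = urlencode({"key": API_KEY, "cx": ENGINE_ID, "q": "site:" + domain})
    data = json.load(urlopen("https://www.googleapis.com/customsearch/v1?" + query))
    return int(data["searchInformation"]["totalResults"])

with open("index_log.csv", "a", newline="") as log:
    csv.writer(log).writerow([datetime.date.today().isoformat(),
                              indexed_count("www.usereffect.com")])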
Stage I – The Decline
The graph below shows indexed pages from the day I put the Canonical Tag in place until the day I removed it, just under 3 weeks later:
Despite a short-term bump in indexed pages, the overall impact was huge, even over a relatively short period. Total indexed pages dropped from 237 to 103 (a 57% drop). The lower, light-red line shows the non-supplemental page count (the pages prior to hitting omitted results). I thought this might be worth tracking, but the pattern was very similar. Although canonicalization can be used to remove duplicate content, Google does NOT consider a wrongly canonicalized page to be a duplicate – the page is simply removed from the index.
I'm going to briefly discuss some major milestones along the decline. Each milestone is marked with the date and the number of days that passed after putting the tag in place (e.g. +1 = 1 day after).
Day +1 (Aug 24) – SEOmoz Canonical Warning
Just over a day after turning the Canonical Tags "on," I noticed a handful of Rel-Canonical warnings in the SEOmoz campaign manager under the "On-page" tab. If you have no Canonical Tag or a self-referencing tag, you should see this:
Keep in mind that an unchecked box may be fine – obviously, some Canonical Tags will point to different URLs. If you start seeing this in huge volumes, though, you may have a problem. Unfortunately, Google Webmaster Tools shows no errors for bad canonicalization.
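If you want to catch this yourself, one quick-and-dirty audit is to pull the canonical target from every URL in your sitemap and count how many pages share each target. A minimal sketch of the idea (the regex is naive – it assumes rel appears before href – so treat it as a starting point, not a real crawler):

# Minimal canonical audit: flag many pages sharing one canonical target.
# Naive regex (assumes rel appears before href); url_list is whatever
# you pull from your own XML sitemap.
import re
from collections import Counter
from urllib.parse import urljoin
from urllib.request import urlopen

CANONICAL = re.compile(
    r'<link[^>]+rel=["\']canonical["\'][^>]+href=["\']([^"\']+)', re.I)

def canonical_of(url):
    html = urlopen(url).read().decode("utf-8", "ignore")
    match = CANONICAL.search(html)
    return urljoin(url, match.group(1)) if match else None

def audit(url_list):
    targets = Counter(canonical_of(u) for u in url_list)
    for target, count in targets.most_common():
        if target and count > 1:   # many pages -> one target = red flag
            print("%4d pages canonicalize to %s" % (count, target))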
Day +3 (Aug 26) – Top Page #1 De-indexed
Although indexation actually showed a bump around this time, my most trafficked page, with the #1 spot on Google for a solid 2-word phrase, was de-indexed. My home-page took its place in the rankings for that phrase. This demonstrates a critical point. With many SEO problems, strong pages are buffered a bit due to their "authority", link profile, etc. In this case, since high authority means more frequent crawling, the top pages on my site were the first to be affected. By the time you notice the damage of a bad sitewide canonicalization, your top pages may have been de-indexed for weeks.
Day +12 (Sep 4) – Top Page #2 De-indexed
Just over a week later, I noticed that my 2nd top page had disappeared from the index, also for a pretty competitive keyphrase. My home-page took its place, but the ranking dropped from #1 to #9. Unfortunately, I wasn't monitoring this page from the start, so it was probably de-indexed earlier.
Day +19 (Sep 11) – Major Traffic Loss
The de-indexation by itself was starting to worry me at this point, especially for the top pages, but by the 2nd week I was starting to also see significant loss of search traffic:
The graph covers 4 weeks, including the week before the canonicalization. It was about this time that I lost my nerve and decided I'd had enough. So, I set about reversing the process.
Stage II – The "Recovery"
On September 11th, I removed the sitewide Canonical Tag. I continued collecting data until October 14th. Here's the graph of Google's indexed pages during the recovery:
There was a fairly quick bump in indexed pages, followed by a couple of leveling-off periods. The total count (149 on the last day) never regained the original indexation count of 237, even after a full month, but some of that content may have been duplicated.
Unfortunately, while indexation seemed to jump in the first few days, regaining status for my top pages took a while longer. Below are a few milestones, measured from the day I removed the sitewide Canonical Tag.
Day +18 (Sep 29) – Resubmitted XML Sitemap
For the purposes of the experiment, I tried to let recovery proceed on its own, but after a couple of weeks of not regaining my top pages, I started to get itchy. My first step was an easy one, resubmitting my XML sitemap via Google Webmaster Tools.
Day +21 (Oct 2) – Resubmitted Partial XML
Knowing that a basic resubmission probably wouldn't accomplish much, I created a 2nd XML sitemap with just my Top 3 pages and submitted that separately. I didn't have high hopes, but I figured I'd try to kick-start the crawlers.
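For anyone who hasn't built one by hand, there's nothing fancy about a partial sitemap – it's the standard sitemap protocol with a handful of <url> entries. A quick sketch (the paths are stand-ins, not my actual top pages):

# Sketch of the small "kick-start" sitemap; the paths are hypothetical.
top_pages = [
    "https://www.usereffect.com/blog/top-page-1",
    "https://www.usereffect.com/blog/top-page-2",
    "https://www.usereffect.com/blog/top-page-3",
]
lines = ['<?xml version="1.0" encoding="UTF-8"?>',
         '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
lines += ["  <url><loc>%s</loc></url>" % page for page in top_pages]
lines.append("</urlset>")
with open("sitemap-top.xml", "w") as f:
    f.write("\n".join(lines))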
Day +24 (Oct 5) – Added Unique Canonical Tags
Since the top affected pages were all blog posts, I decided to add back in Canonical Tags, but this time proper tags pointing to the correct, individual pages. My hope was that a good Canonical Tag might offset a bad one, or at least get the crawlers' attention.
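Since my blog is home-brewed code, the fix itself was a one-line change to the header template – build the href from the current page's path instead of hard-coding it. Roughly, as a Python-flavored sketch (not lifted from my actual code):

# The corrected header logic: each page canonicalizes to itself.
# request_path is whatever your framework reports for the current page;
# the domain is hard-coded so stray staging URLs can't become canonical.
def canonical_tag(request_path):
    href = "https://www.usereffect.com" + request_path
    return '<link rel="canonical" href="%s" />' % href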
Day +26 (Oct 7) – Submitted Reconsideration Request
Finally, almost 4 weeks after removing the Canonical Tag, I got a bit desperate. I submitted my first Google reconsideration request in quite a while. I'll talk about that a bit more later.
Day +27 (Oct 8) – Top Page #1 Re-indexed
Just a day after filing for reconsideration, my Top page regained its #1 spot and kicked out the home-page. Given the timing, I doubt this had anything to do with the request, but the re-implemented Canonical Tags may have helped.
Day +28 (Oct 9) – Top Page #2 Re-indexed
The next day, my #2 page regained its status. This was more important in a way – while the #1 page was just replaced by the home-page in the rankings, the #2 page had fallen off the rankings entirely. Not only was the page re-indexed, but it immediately regained its ranking position. After 4 full weeks, I finally saw some light at the end of the tunnel.
Stage III – The Pleading
Consider this a bit of an epilogue (as if this post wasn't already long enough). I thought our readers might enjoy seeing my reconsideration request. If nothing else, it's honest:
I did something bad. Let's get that out in the open. In late August, I rel-canonicaled my entire site (www.usereffect.com) to the home-page. Here's the thing - I did it on purpose. "Why would you do something that stupid on purpose?" you might ask. Fair enough.
Full disclosure - I write for a well-known SEO blog (SEOmoz.org). For months, we've been hearing horror stories from people who accidentally rel-canonicaled their site to one page. The problem is, they usually didn't know when it started (since it was accidental) and they didn't have much data. So, I decided to collect some. I wasn't trying to mess with Google - I just wanted to get some good data for business owners to help them avoid a costly mistake.
The good news is that my experiment was wildly successful. Within 3 weeks my Google index was chopped in half and my most prominent pages were replaced in the SERPs with the home-page. I decided I made my point and reversed the tags on September 11th (probably not the best choice of dates, in retrospect).
Almost a month later, and some of my key pages are still gone from the index. These are strong pages with good, natural link profiles. I've resubmitted my XML sitemap, submitted a focused sitemap with just those pages and have added new rel-canonicals self-referencing those pages. So far, nothing.
So, embarrassing as it is, I have no option left but to beg the forgiveness of you, the Google Gods. You who are mighty atop your Mountain View, each one better looking and more brilliant than the last, I beseech thee - please look with pity on this mere mortal and grant your bounty upon the following pages that have provoked your disfavor:
[short list of URLs]
Yours in humility,
Dr. Peter J. Meyers ("Dr. Pete")
Lessons Learned
I think the lesson here is pretty straightforward – don't do this. Of course, you'd never canonicalize your entire site to one page on purpose, but with today's sitewide headers and CMS systems, it's shockingly easy to write a header tag that affects your entire site, even across 1000s of pages. I'm not bashing the Canonical Tag as a tool – I think it has some very strategic uses. The problem is that it is one of those rare cases where you can effectively destroy your SEO efforts by changing just one line of code.
With just one 57-character tag, I lost ranking on my most competitive terms and cut my indexed pages and search traffic by more than half. The Canonical Tag is a powerful tool, but use it wisely and plan carefully.
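One cheap way to plan carefully is to bake a sanity check into your deployment process. A minimal sketch (canonical_tag stands in for whatever function or template generates your tag):

# Cheap pre-deploy guard: distinct paths must never share one canonical.
def check_canonicals(canonical_tag):
    sample_paths = ["/", "/blog/post-one", "/blog/post-two"]  # stand-ins
    tags = {canonical_tag(path) for path in sample_paths}
    assert len(tags) == len(sample_paths), "sitewide canonical detected!"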
Pete,
Amazing post! You have some real balls to do this as most people wouldn't want to test something like this on their own site let alone their business site. I hope everyone who reads this learns a lesson and makes sure they implement their canonical tags correctly!
Your reconsideration request made me laugh out loud in my office when I first read it. I hope Google got a good laugh from it too and will restore your website to its former glory and traffic level!
Truthfully, I was a bit over-confident about how long the recovery would take, and I didn't expect the damage to last quite that long. Probably not the greatest plan, but I felt like this was an experiment that had to be done on a live site with at least some authority, and not just a dummy site.
Extra points for bravery!
I second that! 'Nuff Respect, Dr. Pete!
Great write up Dr.P...I appreciate your humility and honesty (your reconsideration request was hilarious - reminded me that we are all human!)
Did you ever consider pinging your affected pages to try and stimulate crawling? I've seen pages on my blog get indexed literally minutes after it was published and pinged.
That's a good thought. Unfortunately, my blog is home-brewed code, and it's missing some functionality, like a ping-back mechanism.
Even though you are blatantly insane (what did you think would happen, man!), I really like that you did this with an established site that you knew inside and out. I expect there is a flaw in the testing / experiment process when people create throw-away domains in order to try things out (I have done this, but after a while I took to doing experiments on a couple of my dad's sites. Don't tell him please). For this, the site had real rankings, real content, and you were aware of how it had performed in search engines in the past. The results here are probably far more realistic in terms of what would happen to someone's business if they made the same mistake.
I can see this becoming more of a problem as the tag gets older and people lazily add it via content management systems, just like back in the day when they noindexed their sites, blocked bots via robots.txt and were allowed unsupervised access to Webmaster Tools' geolocation feature :)
EDIT: Someone else may have asked, but did you look at Bing traffic / rankings / indexation too?
Nice to see you in the light of day (well, "day" on this side of the pond) :)
I didn't track Bing. In retrospect there always seem to be 5 things I wish I'd done. Sometimes, I skip a couple of planning steps in my enthusiasm.
I do think this is one case where it had to be a "real" site. Dummy sites make sense for certain, highly-controlled experiments, but the cases we'd seen in Q&A made me realize that the impact on an established site with authority would be completely different. So, I bit the bullet.
Haha, you make it sound like I only ever come out at night. And that is when we meet. This is getting questionable :p
Mel Carson will not thank me for saying so (sorry Mel), but I wonder if Bing... um... noticed?
If you ever do anything this silly again, track Bing!
Great read, Dr. P! Ya got some big brass ones, for trying something like that. Good info out of the exercise, though. I sure hope you are quickly back to where you started!
Sub headline: A cautionary tale of a Mad Dr, hell-bent on destruction, and his struggle to return from the depths of the abyss.
Great post, really enjoyed reading the self-deprecating resubmission request.
I work for one such company that accidentally placed an incorrect canonical tag on a number of pages, pointing them to our homepage. Initially everyone was flapping around wondering why we had lost pages from the index.
Quick check of the canonical tag explained why....
Once we removed the canonical we saw re-index of some pages within 2-3 days and almost entire re-indexing within about a week.
Oh sure Ben, just rub Dr. Pete's nose in the fact that you got it all your pages reindexed in short order. Meanwhile, every night he's leaving his back door unlocked and his porch light on just hoping his indexed pages will all come back home.
Much bigger site than Dr Pete's blog, and we only mucked up around 10% of our pages, so the damage and subsequent re-indexing weren't quite so significant, I guess.
And you call ME mad? LOL.
Absolutely brilliant and brave move to test first-hand the issues with the canonical tag. Peeps need to be careful not to over-engineer these because, as you demonstrated, a simple mistake can have insane consequences.
One of the sites I work on is using the canonical tag to point at the page on which it is placed i.e. if the url is www.thisurl.com/canonical then the canonical tag is pointing to.... www.thisurl.com/canonical
So, the tag is not really achieving much, but that's not my concern. The pages on which these tags have been placed generally have a very short life-cycle of maybe a week at a time. So my question to you mere mortals of the Google Gods at whose feet we bow and serve... ahem... is this...
"Would it be better to point the canonical tag at the parent page of these temporary pages or should we leave the canonical out all together?"
My logic for pointing the pages at the parent (permanent) page is that we are effectively signalling to the search engines that they all belong to one page, even though the content is unique on each page. The intention of this approach is to say to the search engines that "We actually have loads of fresh and unique content for this category that is updated frequently, but we don't want to risk clogging up your beautiful, high quality index with out of date pages".
The pages in question have a short lifespan because they carry deals that expire, but that are so hot, they are likely to get shared and linked to i.e. it would be a shame to waste any of that equity.
Opinions on the following approaches or any suggestions are most welcome:
1. Submit a daily (pages are added daily) XML sitemap with the new pages, and then 301 redirect them at the end of their life.
2. Continue with the current approach of pointing the canonical at the parent (permanent version)
From what I know and understand, the canonical tag should be used only in cases of 100% duplicate content on different URLs, to say to the bots "hey boys, this is a duplicate of this (the canonical) page... so index the right URL, please".
Thanks, so would you go with option 1? In other words, submitting a daily sitemap and then 301 redirecting pages once they expire?
Why not try something different and use Twitter to call the attention of the bots?
How big is your site? I'm wondering whether there should even be a concern about clogging the index with your pages? I don't have a lot of experience with "big site" issues.
Also, what links are on your sub pages? I believe if you 301/tag the sub pages, then all of the link juice they get goes to the parent page. Would it be better to let the sub pages retain their link juice and distribute it through all of the links on the sub pages (this may send juice to more internal pages and provide a more even deep link profile)?
Finally, the two options have very different impacts for the end user. 301 the pages, and users following old links will never see the old content. The canonical tag will leave the original user experience intact. I know you said the content loses value, but there is usually some value to letting users that click on a link see the content that they expect.
Does anyone see a downside to removing the canonical tags and not 301ing these pages assuming that the sub pages already link to the parent page?
***Sorry, I just saw that your pages are deal pages. I believe the above points are still relevant, but the business case probably dictates that you don't want people to see old deals and you do want them to see the new deals. So the 301 probably makes the most sense from both an SEO and a business standpoint.
We've got about 4,000 pages indexed and I believe Google actively encourages us to use tools like the canonical tag to help them keep a clean index with pages that are useful to the user.
I agree there is perhaps value to be had with keeping the pages live and in the index from what a user expects to see perspective, but I think this is outweighed by the negative impact of a user getting to a page not knowing what to expect and finding a deal that has expired.
Apart from the global navigation, these pages link out to other sites, but then I see no reason why we couldn't cross link these pages to related deals as well.
I'm leaning toward option one because we have a hot deal that can be indexed, we are tweeting and facebooking the deals (thanks @gfiorelli1) as they go live, and then when they're done, we send the user and Google back to the parent page.
Just the user experience of not finding what they expected... how to deal with that.... hmmmm....
I'd tend to go with option (1), although I have heard of people getting pretty creative with the Canonical Tag. Typically, though, I think the 301 will carry more of any inbound link juice that happens to be hitting those deals page, and it's a bit more in line with Google's expectations.
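A minimal sketch of what option (1) might look like, using Flask purely for illustration – LIVE_DEALS and the URLs are stand-ins for whatever your platform actually provides:

# Sketch of option (1): serve live deals, 301 expired ones to the parent.
# Flask is used only for illustration; LIVE_DEALS stands in for a real
# database lookup.
from flask import Flask, redirect

app = Flask(__name__)
LIVE_DEALS = {"half-price-widgets"}   # hypothetical live-deal registry

@app.route("/deals/<slug>")
def deal(slug):
    if slug not in LIVE_DEALS:
        # expired: permanently redirect, passing the deal's link equity
        # (and its visitors) up to the permanent parent page
        return redirect("/deals/", code=301)
    return "Deal page for %s" % slug   # stand-in for the real template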
Thanks. I'm thinking of the following process for these pages:
I'm curious... as the site: operator's results are notoriously "bouncy" - read, incorrect at the best of times - why didn't you use the "URLs in web index" under Sitemaps in Google Webmaster Tools for the page counts? I'm curious how the graphs would have looked with that line also.
Also, to find the non-dupe pages, did you use ** --view?
In this particular case, I found the index had been pretty stable, so I stuck to the site: operator. It also seems to update a bit more frequently - I'm always suspicious of the timeline of the Webmaster Tools data. I feel like site:, despite its flaws, reflects the current index.
I pulled the non-dupes just from clicking through until I hit "Omitted Results" - the index was small enough to do that. For the past year or so, I haven't found the older tricks for seeing non-duped page count to work very well. Practically, though, the two graphs ended up being similar (one number was lower, of course, but the trend was the same), so it turned out not to matter much.
I submitted a reinclusion request myself a couple of months ago. I'm not sure you should put so little store in how important a move that was, or in how good Google is at responding to them. Mine appeared to be read and actioned within 48 hours.
My own experience of the canonical tag? Be very, very careful, or don't use it at all.
I completely failed to mention in the post that I did get a response about 6 days later, which isn't bad by Google standards. I'm doubtful about next-day service, but you're right, there's no good way to tell what finally made the difference. Obviously, had this been a non-experimental situation, I would've done the first 3 recovery steps all at once and then reconsideration a bit later.
I love that you had the, let's be blunt here, balls to test the improper use of the rel=can tag. Fair play to you.
The key learning for me here is the fact that traffic did recover, at least. In fact, there were almost instantaneous results. I recently had a client with one of the most complex could-be duplicate issues. As a result, I have had to recommend the partial use of the rel=can tag. Results are still sketchy, but things do look to be improving.
Haha. I loved your reconsideration request to Google. With the "big G" we often feel we need to be formal and "to the letter" - but at the end of the day, it's a good old human that has to read your 200-odd words.
Great job. I'm sure your request was (at least) discussed by a few Googlers at the water cooler :P
There's more method to my madness than I sometimes like to let on. You have to tread carefully, of course, but reconsideration is like any process with far more "entrants" than "winners" - sometimes, it doesn't hurt to find a way to stand out.
I'm surprised how many reconsideration requests that I hear about second-hand seem to be angry or even downright hostile to Google. I understand why people feel that way sometimes, but stop, take a breath, and look at it rationally. Who's going to help you when you just got done telling them how much you hate them?
I had some major issues with this very point. Wordpress automatically adds a canonical tag, but some "pages" in my setup weren't blog posts but custom PHP within the Wordpress "page" framework. I hadn't worked out that the canonicals were making ALL my pages look like one and the same. It was only after submitting a sitemap that I realised Google just wasn't indexing those pages at all.
My post https://www.caperet.com/2010/09/a-wordpress-canonical-problem/ discusses this in some detail. I'll have to be patient though, because a couple of weeks on, still none of my pages are listed, even though I have fixed a number of things, including canonical metas, extra meta guff, and parameter handling in Webmaster Tools.
Great post Pete.
I have experienced lots of sites using it the wrong way as well, for instance pointing rel=canonical at a page that has a 302 back to the former...
Thanks for all the tips!
Thanks for the post. I had no idea that this might happen, but it makes sense. Good to know to be careful about sitewide changes.
I wonder, did you ever get the whole site back into the index?
As a side effect you get a new reader of your blog :)
Wow, I think this is my favorite SEO post of the year, partly because it addresses the conundrum of the rel="canonical" tag, and partly because the post reads like suspenseful fiction... "Day +29: after the zombie bots ate the last of my rankings."
It doesn't surprise me that less than 6% of pages use the canonical tag, as Rand reported to us in the latest LinkScape update. When used properly for what it was designed for, great. But I hear far more stories like this of people screwing it up (myself included) than actually seeing a corresponding rise in rankings when using the tag correctly.
It's a choice between the lesser of two evils. Or a double-edged sword. Or two zombie bots...
"Day +15: Last night, there was a knock at the door. We expected the patrol back any minute, so Frank went to check. There was no patrol, just dozens of bots, their glowing eyes fixed on Frank. They all spoke in unison: "Identity Confirmed: Homer Paige". Frank tried to protest, but the bots can't be reasoned with. They dragged him away to their mountain fortress. I'll never forget his impotent screams: "I'm not Homer, I'm Frank!!"
ROTFL!
Peter, that is very brave of you. You put your own site in jeopardy for the sake of the SEO community. I salute your heroic efforts. The next post I would love to see from you is the effect of the cross-domain canonical tag, but on throwaway domains and not on your own site. Cheers :)
Dang Dr Pete, maaaajor guts to try it on your own site, but it's pretty telling that rel canonical tag can create the same sort of headaches as mistakes in your robots.txt file (another one of those "only a few characters can change everything" bits of code)
And I did find your reconsideration request hilarious, but it also came across to me as a perfect model: be upfront, honest, show humility, hide nothing, and don't be pissed off... I've never had to submit one, but it seems like yours is the model I would follow, so a nice extra for this post!
Yes, the resubmission request is a riot. Somewhere in Mountain View is a low-level Google employee who reads a thousand resubmission requests a day and has your request pinned to his bulletin board to commemorate the highlight of his 2010.
I have to ask, do you consider the test a success and would you ever do something similar again? I guess the point was to demonstrate the pitfalls of the canonicalization tag (accomplished), but this test is a bit like dropping an anvil on your foot to test the hypothesis that it will hurt!
Thanks for one of the more interesting blog posts in a long time.
What I really wanted to get a sense of was the timeline and the extent of the damage, as well as how quickly certain high-authority pages would be affected. So, in that sense, I think it was successful. I'm not sure that particular definition of "success" is one I plan to repeat :)
Good point. I think the scariest takeaway from your study is that a canonical tag error like the one you tested will ravage your strongest pages first. By the time you start to notice that something’s wrong, most of the damage will have already been done. It serves as a warning to handle the canonical tag like unstable explosives in your hand and to double and triple check your work.
I completely agree. In so many SEO scenarios, a mistake will hit your weakest pages first, but with a bad Canonical Tag or Robots.txt, the pages that get crawled the most frequently will take damage the quickest. In many ways, as soon as I lost these 2 major pages, my site was already bleeding, but if I hadn't been watching closely, I might not have noticed until weeks later.
A good lesson in how not to use the canonical tag.
The most annoying thing about Google is the uncertainty factor. I am just amazed that after 3 weeks you got most of your ranking pages back. Although they react far more quickly than they did some years ago, it is still a kind of gamble as to when, and if, the changes (in whatever area) become visible.
Great post! This article will be a lesson to many people who have never implemented the rel=canonical tag before. You might send some deep links to the major pages that lost their rankings and indexation; that could send a signal to Google to re-index them. Good luck!
Hi, interesting test.
A question from me: do you have any experience with self-referencing Canonical Tags and existing duplicate content on the same domain?
The background is that we specified the Canonical Tag only for certain types of pages on our e-commerce website, and on the other pages we included self-referencing Canonical Tags. So far, the use of Canonical Tags hasn't been successful. My guess is that Google devalues or ignores the Canonical Tag across our whole website because there are duplicate-content pages alongside the self-referencing pages. Thank you for your help.
Google generally has no issue with a self-referencing tag. Although they originally said to put the tags on the non-canonical versions, clarifications and reports from SEOs suggest that a self-referencing Canonical Tag is perfectly fine. Given certain kinds of site coding and CMS systems, it's often necessary.
It's more likely that either: (1) Google just hasn't processed the tags yet, or (2) You've got another cue in place that's disrupting the Canonical Tag. For example, sometimes people canonicalize to Version A but then link to Version B internally throughout their site. I don't have hard evidence, but I've observed these crossed signals can delay or disrupt how Google reads your canonicalization instructions.
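If you want to hunt for that kind of crossed signal, one rough approach is to compare a page's canonical target against the internal links on the page and flag links that are near-duplicate variants of the target. A naive, regex-based sketch (a starting point only):

# Naive crossed-signal check: find internal links that are a *variant*
# of the page's canonical target (trailing slash, www, index files).
import re
from urllib.parse import urljoin
from urllib.request import urlopen

def normalize(u):
    u = u.lower().rstrip("/").replace("://www.", "://")
    return u.replace("/index.php", "").replace("/index.html", "")

def crossed_signals(page_url):
    html = urlopen(page_url).read().decode("utf-8", "ignore")
    m = re.search(r'rel=["\']canonical["\'][^>]*href=["\']([^"\']+)', html)
    if not m:
        return []
    target = urljoin(page_url, m.group(1))
    links = {urljoin(page_url, h)
             for h in re.findall(r'<a[^>]+href=["\']([^"\']+)', html, re.I)}
    # links that collapse to the same page but use a different URL
    return [link for link in links
            if link != target and normalize(link) == normalize(target)]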
What strategy should webmasters use when a site has pagination? Suppose we have articles which are broken into pages: mypage.html?page=1, mypage.html?page=2, etc. And we can sort these articles by title or by date, like mypage.html?page=1&sort=abc, mypage.html?page=2&sort=abc, etc. What canonical tags should be on the pages that have sort=abc? Should mypage.html?page=12&sort=abc point to mypage.html?page=12 or to mypage.html?page=1?
That's a much more complex situation. Canonicalizing paginated results can cause some strange behavior, and they aren't technically duplicates, so it's sometimes best to META NOINDEX,FOLLOW pages 2+. Rand has a good discussion of the problem here:
https://www.seomoz.org/blog/pagination-best-practices-for-seo-user-experience
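To make the noindex option concrete, here's one common arrangement for URLs like yours, sketched as a head-tag helper. It assumes page and sort are your query parameters, and the sorted-to-unsorted canonical mapping is a suggestion, not gospel:

# Sketch: sorted views canonicalize to the same page number unsorted;
# pages 2+ stay crawlable but out of the index via noindex,follow.
def head_tags(base_url, page, sort=None):
    tags = []
    if sort:   # mypage.html?page=12&sort=abc -> mypage.html?page=12
        tags.append('<link rel="canonical" href="%s?page=%d" />'
                    % (base_url, page))
    if page > 1:
        tags.append('<meta name="robots" content="noindex,follow" />')
    return "\n".join(tags)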
Ahahah, your reconsideration letter is great, as is the whole story. Now I know to be very careful with my canon tags, and what to look for when pages magically decline and consolidate.
This post saved my life.
I was just wondering why one of my websites was not showing up in the Google index, when this post made me realise that I had copied and pasted some of the header text from one of my other sites and left in the canonical reference by mistake.
Never again.
Thanks for the heads up.
Hilarious, but certainly valuable. I know which doctor to call if I need some URL euthanasia!
So are you using rel-canonical at all (in the proper way, and NOT all pages to the home-page)? Just curious what your experience is if you are using it.
Note: I'm not sure if the questions was already asked/answered. But gave up looking after about half the posts. :-)
Dr Pete, 4 words for you: Giant Sack o Nuts. 3 cheers for the man with the iron unit!
Dr. P after that hilarious of a reconsideration request, I wouldn't be surprised if the "Google gods" bestowed a little extra favor on your rankings in the SERPs :) Thanks for sacrificing your blog for the benefit of the community.
An interesting experiment – glad someone was brave enough to try it.
I have not really looked into the Canonical Tag, but I am now interested in finding out more about it for SEO purposes, and in exploring its potential in other areas.
At least if it goes wrong, you managed to find a way to remedy the situation.
Ashley Mitchell
Web Developer
www.themarketingpeople.com
Wow, another real life SEO drama. I am amazed that you would do something like that. I hope everything goes well for you.
It's a pity I found this post so late :( Something similar happened with my customer's site; their webmaster did it unintentionally. Hope we'll regain positions soon. Thanks for the post!
I am an utter greenhorn (no IT training, but learning a lot in the school of hard knocks) using Blogger for about 2.5 months.
I started with a blogger.com domain but NZ has an arrangement with Google so we are switched to a .co.nz version of the domain. This was a 302 redirect which I was told was not good.
So, I got a custom domain (.info) and filled in the setting box to redirect the blogger.com to .info (which is working) to try and get rid of the 302 redirect.
BUT I am getting the error recorded in Webmaster that there are (3) duplicate descriptions. Here's what I've done (without successfully solving this):
1. I deleted the description in the layout (no change); and
2. then deleted the description in settings (no change) and
3. then deleted the description in the Header HTML. But it still said there are 3 descriptions, and
4. then used that canonical tag thing which I probably did wrongly
5. I just followed a recommendation that said: Delete this:
<b:if cond='data:blog.metaDescription != ""'>
<meta expr:content='data:blog.metaDescription' name='description'/>
</b:if>
And replace with the following in the Header (I put it just above
<b:if cond='data:blog.url != data:blog.homepageUrl'>
<b:if cond='data:blog.pageType != "item"'>
<b:if cond='data:blog.metaDescription != ""'>
<meta expr:content='data:blog.metaDescription' name='description'/>
</b:if>
</b:if>
</b:if>
That didn't work either.
Do you have a suggestion?
Thank you.
D
www.truthliesdeceptioncoverups.info
You are a brave soul, Dr. Pete. Thanks for taking one for the team on this. :)
Excellent article.
Just a quick heads-up for anyone using Joomla 3 as their CMS. There is a bug in the default SEF plugin that generates incorrect rel=canonical tags. I couldn't work out why our indexation was decreasing (from 2,500 to 600 pages in 1 month), then I noticed some completely wrong canonical links in the page code.
A quick fix is to comment out lines 49-52 in sef.php, or wait until the next patch gets rolled out.
I know this article is old, but I'm catching up on some posts relevant to my current position with a client or two. It is great to see that such problems can be fixed.
Out of curiosity, how long did you have to wait for a response from Google, or did you not get one, to your reconsideration request?
Brave man for embarking on this, knowing it would be catastrophic!
This was back before the new reconsideration responses, so I think I just got the typical "Your request has been processed" email. It was pretty quick - maybe a few days?
Okidays - I finally got mine back today (interesting timing), after 5 weeks. It turned out to be unhelpful though, because there's no manual penalty applied, yet the site ranks in the top 20 in Bing and Yahoo and outside the top 100 in Google... weird.
Never mind, thanks anyway :)
Interesting stuff. Risky and bold move but love the invaluable data you've gained and shared with us.
At least your top performing pages are back.
So how would one go about safely using the rel=canonical tag for e-commerce sites that use ?query parameters with dynamic page titles but duplicate meta descriptions? It's a bit confusing, but there are quite a few systems that do this.
Thanks for the experiment, Dr. Pete!
Have you ever tried to completely remove all pages from G's index (using robots.txt and GWT) - and then remove the robots.txt and re-submit? That might clear up all the canonical issues....
I just went from "gonna have to implement this" to "I'd rather lose a few links here and there and save my rankings". Dr. P, wonderful article, and I truly appreciate your dedication in teaching us what not to do. A 301 is still the way I trust most... thank you
I actually had this problem with a plugin I built for Wordpress. The plugin itself creates new Wordpress pages, but when Wordpress made the rel=canonical tag standard in the core with the release of version 2.9, all of the plugin-created pages were given the homepage as their canonical URL. As a result, several important pages on many of my websites were dropped from the index.
Since many of the dropped pages had their own unique backlinks, the homepages of those sites actually gained some ground in the SERPs. So, I was getting slightly more traffic for my target keywords, but lost virtually all of my long tail traffic.
I've since fixed the problem and the de-indexed pages are slowly coming back.
I had a similar problem (commented it further up) and wondering what kind of timescale you had before re-indexing happened...
I just double-checked on Webmaster tools and I see my pages are finally listed, 3 weeks after my modifications.
Wow Pete. I remember when you had tweeted awhile back hinting at some catastrophic damage control you were doing. You are one brave dude to so thoroughly monkey with your own site. You were being blinded by science...SCIENCE! (apologies to Thomas Dolby)
Your reconsideration letter to Google was totally hilarious. It belongs in the web hall of fame.
Dr. Pete, your resubmission request was hilarious...next experiment I'd like to see would be a side-by-side comparison of reinclusion requests, one just straightforward "I broke my site, now it's fixed, please include it" and one with humor, honest details and some begging and see if THAT makes a difference :-)
In 100% seriousness, what I see way too often is people who have 15 glaring problems and then do this:
(1) Fix Problem 1 (usually, the easiest)
(2) File for reconsideration
(3) Did it work? No?
(4) Fix Problem 2
(5) File again...
If there's a way to tell Google that you don't care at all about the problem or fixing it, that's the way to do it. Google also isn't going to do your problem-solving for you, right or wrong. Reconsideration isn't a consulting service.
Great case study, and as others have said, you've got balls of steel to try this on an authoritative site, but I understand your thinking behind your choice in this regard. The reinclusion request was great, and I am interested to know if/when your site will get back to its original indexed page count and rankings for those pages. Please keep us updated!
This is assuming, of course, that at the time of posting this it is still not back to where it was. I saw you mention in the comments that they responded within 6 days of the request, but next it will be interesting to see how long it takes to get back to where it was before your test.
Lmao!
My God... that's proof that you are insane, Peter: to experiment directly on your own website. Anyway, we love you also because of this crazy side of you.
The canonical tag is surely a great and useful invention, but very tricky. So tricky that if I can use other, more reliable methods to obtain the same result (such as a 301, or noindex,follow), I go for the other methods.
Luckily, for the most common open-source CMSs it is becoming usual to see great, mature SEO components/plugins taking care of the canonical issue (for instance Magento in its latest version, or All in One SEO for Wordpress, or Joomsef for Joomla).
The biggest problem comes with home-built CMSs, where maybe the best solution - please note that I'm just thinking out loud and am not sure if I'm correct - could be to assign the primary URL generated as the canonical, and then all the ancillary pages (tags, categories, archives...) should use that primary URL as their canonical.
For instance: https://www.seomoz.org/blog/catastrophic-canonicalization should be the canonical URL for this page (and present on this page) and used in all the ancillary pages connected to this page. >> note: why doesn't SEOmoz use the canonical tag?
P.S.: your letter to Google is wonderful.
Now, I'm sure that the canonical tag works ^^
Of course, use it with caution ;)
That's interesting. In the very early days of rel=canonical, I accidentally pointed to the same URL on all three pages of a small site. Rather than getting a penalty, I received a PR3 as a reward - on an 8-week-old domain with 2 not-so-great backlinks. I was actually loath to sort the problem out until Google dropped it back to its rightful PR1 at the next toolbar export.
It can be really tough to tell with Toolbar PR, since the delay could be 3+ months - you might be getting credit (or discredit) for something that happened a long time ago. Unfortunately, even search traffic is a bit hard to gauge, since an existing site is always getting traffic fluctuations. I felt indexed pages was probably the best measure, but it definitely took the whole picture to see the damage.
I guess I don't see the point of this. You effectively redirected all your internal pages to your home page. Well, no wonder you had issues.
My concern is that people aren't going to read exactly what happened here and takeaway that the canonical tag must be bad. There's nothing bad about it -- it was just used entirely incorrectly.
Yes, people shouldn't redirect their entire site to one single page. And yes, that's a horror story I've heard as well, one that's easy to do.
But similarly, people can redirect their entire site to one single page using standard redirection. You don't hear that horror story much, despite it also being easy to do, because such redirection is easily spotted.
I guess I kind of feel sorry for the canonical tag in all this. Lots of SEOs pushed to have it possible. I'd hate to see anyone take away from this that the canonical tag should now be avoided. It should just be used correctly.
I definitely don't want people to walk away with the feeling that the Canonical Tag is an inherently bad thing, and I don't generally spend my time bashing it. I've found it very useful for some forms of duplicate content, in a way that benefits SEO and search users. Unfortunately, I'm not sure how to adequately warn people of the dangers while also hedging everything I say. Used poorly, it is dangerous, and I want people to know that.
The core point, though, wasn't to just demonstrate that it's dangerous. That was pretty clear from plenty of anecdotes and some client experiences. As you said, if you use something wrong, bad things generally happen. You can beat yourself to death with a hammer if you're so inclined. What I wanted to do was provide a real case study of how it impacts sites, what the time course is, and how long it might take to correct if something accidental happened. We've had a fair amount of Q&A from people who ran into this situation and needed help fixing it, so I wanted more data to assist those people.
Aside from the ease of adding a sitewide tag, I also think there's a unique danger with the Canonical Tag. A "good" tag and a "bad" tag look virtually the same. This tag would be completely appropriate and even beneficial on my home-page (especially if I had accidentally indexed non-www, https:, index.php, etc.), but in another setting it was catastrophic. That's the take-home message to me. This is a tool that can go very wrong, and people need to be careful and plan well before implementing any sitewide strategy.
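To make that concrete (the post URL is made up):

# The same one-line tag, in two different contexts:
tag = '<link rel="canonical" href="https://www.usereffect.com" />'
# on https://www.usereffect.com/index.php -> good: folds a duplicate
#    home-page variant into the canonical URL
# on https://www.usereffect.com/some-post -> bad: asks Google to drop
#    the post from the index in favor of the home-page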
I guess there's as much a danger with the robots.txt file. A "good" file and a "bad" file can look virtually the same, but the bad one could wipe out your entire site :)
I agree, people need to think carefully about using any tool that has a site-wide impact.
Danny - what a peculiar takeaway?... My thoughts were - this is great, now many marketers and site builders who've been so skeptical of whether Google actually respects rel=canonical, particularly in cases where the content between the two pages may not match precisely, can feel more confident about using it as a way to show the engines which page is the right source.
I'd struggle to imagine that professional marketers and site builders would get the impression that rel=canonical is bad simply because Pete's shown that you can make mistakes that have serious repercussions. That's true of everything from meta robots to robots.txt to titles to URL strings to HTTP headers. I'd think if anything, this test shows rel=canonical to be more of a useful tool (though it illustrates a potential hole/weakness Google might want to shore up for those sites who do make mistakes like this).
I guess I've heard more people sounding dubious about the canonical tag and if they should use it than maybe you've heard. Dunno.
Testing what happened I suppose is interesting, sure. I'm not trying to attack Pete over it, or anything. I guess I was just scratching my head because it felt kind of obvious.
I mean, if I redirected my entire site using server-side redirection to the home page, I'd expect "catastrophic" results. But you never see anyone test this.
If I blocked my entire site using robots.txt, I'd similarly see catastrophic results. Again, no one really goes out to test this (we've had accidental tests, of course).
What I'd rather see with canonical is more testing of the way it is supposed to be used...
1) What happens when you add it to duplicate pages?
2) Does it recognize and effectively redirect to those pages as advertised?
3) How does cross-domain canonicalization work? As advertised?
That's what I'm itching to see tested.
And if no one comes away thinking that the canonical tag is something to avoid, well, that's cool.
It's probably just a shift in perspective, but lately, I've been spending more time away from SEO blogs and a bit more time on client work and here in SEOmoz Q&A, so I'm rediscovering where the things I thought were "obvious" are still giving people a lot of grief. I hope my experiment/post was useful to that crowd, but I'm as interested in many of those follow-up questions as you are.
Danny, the biggest take away is how devastating and long lasting the effects of a mistake are.
An accidental redirect (which I have done) healed relatively quickly without needing direct requests for mercy.
I have had the same experience as Pete, including asking for reconsideration, and I still don't see formerly strong pages being re-indexed 3 months after the fact. That is something every site admin should know. This isn't another soft indicator like the sitemap change-frequency or priority tags – this is serious stuff.
Yes, you are absolutely right. Google punishes the wrong use of the rel=canonical tag, so be careful when using it. It may be better to go with a 301 redirect: Google de-indexes the old page, and link juice should pass to your new page.
Just checked all canonical tags across all of my sites...thanks for the scare
Nicely Done!
A great read.
I think this is a good tool (this post) to show someone who doesn't buy in to doing some in-house SEO. Just another reason why people need to pay attention to this industry.
Canonical tags can be deadly but they can also be really useful. I implemented them onto a site with canonical issues and quickly saw an increase in rankings/traffic after the change.
One of the scariest things you can do is contradict your sitemap, 301 redirects, and canonical tags. I contacted Google about what preference they might have in a scenario like that, and they said if it happens, lol, it's user error. Basically, there's no fix in place for such a situation, so make sure it all matches up and there are no canonical issues.
Interesting test. I commend you for playing with the Canonical Tag – it is not something to mess with.
Thanks buddy, it's a really nice introduction you describe here.
We made exactly the same canonical mistake on our site for about a week. However, not on purpose.
Google dropped thousands of our pages from the index. We spotted the wrong canonical tag and deleted it. Google is crawling our site every day, but it still hasn't put the affected pages back in the index and search results.
What would you recommend?
Thank you for the post. It helped us a lot with our current problem. It is really depressing to lose 90% of your traffic from Google and not be able to repair it.
Unfortunately, the self-referencing Canonical Tags can be a bit tricky to implement on some sites - I was lucky in that most of my pages were blog pages. If you're talking about a large, e-commerce site, just plan accordingly and be careful. Sometimes, the rush to fix one problem when you're feeling desperate can create bigger problems.
Despite my attempt at humor, I do think this is a perfectly appropriate situation to use reinclusion requests. You know what you did, it was an accident, you've undone it, and now you need pages restored to the index. Lay that story out for Google. There are no guarantees, but this is a situation where it isn't going to hurt. You aren't trying to hide anything and you didn't do anything black-hat.
Thank you.
You are probably right. We were just worried that the reconsideration request might take several weeks to be processed at Google. Actually, your 6-day response from Google was pretty fast.
It may take some time until we see some change, but I will surely post here how our site is doing.
I was hoping I'd find that a relatively simple tactic reversed the situation quickly, since almost everyone who does this is innocent, at least in the sense of not meaning to manipulate search results. Unfortunately, all of the solutions seem to take some time to settle out. Post-Caffeine, it seems like certain problems have a faster impact than ever, but reversing them can still feel painfully slow.
I was in the exact same boat as Yajda. I submitted a reconsideration request with details of the mistake (but lacking the humour and deep grovelling). It took longer – two weeks – then I saw an immediate boost in traffic and number of pages indexed.
However, it didn't completely fix the situation. I still have around 1,000 pages not indexed, despite new backlinks and sitemaps. Even some formerly strong pages that would get hundreds of Google referrals a day are still missing 3 months later. The offending canonical link was up for under two weeks.
Bing and Yahoo only took it as a suggestion and their traffic didn't alter that much. It seems overly harsh and unfair of Google to take the tag so seriously for such a long time.
That reconsideration request was awesome! I'd love to sift through all the requests they get. I am sure it is very entertaining and/or annoying.
Either being entertained or annoyed is an appropriate reaction to 99% of everything I do.
Fun read, Pete! It's crazy to see how tools aren't monitoring for mistakes in using this tag yet.
Really cool! Thanks! :)