Since the Panda update, more and more people are trying to control their Google index and prune out low-quality pages. I’m a firm believer in aggressively managing your own index, but it’s not always easy, and I’m seeing a couple of common mistakes pop up. One mistake is thinking that to de-index a page, you should block the crawl paths. Makes sense, right? If you don’t want a page indexed, why would you want it crawled? Unfortunately, while it sounds logical, it’s also completely wrong. Let’s look at an example…
Scenario: Product Reviews
Let’s pretend we have a decent-sized e-commerce site with 1,000 unique product pages. Those pages look something like this:
Each product page has its own URL, of course, and those URLs are structured as follows:
- https://www.example.com/product/1
- https://www.example.com/product/2
- https://www.example.com/product/3
- https://www.example.com/product/1000
Now let’s say that each of these product pages links to a review page for that product:
These review pages also have their own, unique URLs (tied to the product ID), like so:
- https://www.example.com/review/1
- https://www.example.com/review/2
- https://www.example.com/review/3
- https://www.example.com/review/1000
Unfortunately, we’ve just spun out 1,000 duplicate pages, as every review page is really only a form and has no unique content. Those review pages have no search value and are just diluting our index. So, we decide it’s time to take action…
The “Fix”, Part 1
We want these pages gone, so we decide to use the META NOINDEX (Meta Robots) tag. Since we really, really want the pages out completely, we also decide to nofollow the review links. Our first attempt at a fix ends up looking something like this:
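In raw markup, that first attempt boils down to something like this (a rough sketch using the example URLs above – the anchor text is just for illustration):

<!-- On https://www.example.com/product/1 – the review link gets nofollow'ed -->
<a href="https://www.example.com/review/1" rel="nofollow">Write a review</a>

<!-- In the <head> of https://www.example.com/review/1 – the de-indexation signal -->
<meta name="robots" content="noindex">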
On the surface, it makes sense. Here’s the problem, though – those red arrows are now cut paths, potentially blocking the spiders. If the spiders never go back to the review pages, they’ll never read the NOINDEX and they won’t de-index the pages. Best case, it’ll take a lot longer (and de-indexation already takes too long on large sites).
The Fix, Part 2
Instead, let’s leave the path open (let the link be followed). That way, crawlers will continue to visit the pages, and the duplicate review URLs should gradually disappear:
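In markup terms, the only change from Part 1 is that the link goes back to being a plain, followed link, while the NOINDEX stays on the review page – roughly:

<!-- On https://www.example.com/product/1 – the link stays followed -->
<a href="https://www.example.com/review/1">Write a review</a>

<!-- In the <head> of https://www.example.com/review/1 -->
<meta name="robots" content="noindex">

(If you want the review pages to keep passing link equity while they drop out, "noindex, follow" is the usual variation.)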
Keep in mind, this process can still take a while (weeks, in most cases). Monitor your index (with the “site:” operator) daily – you’re looking for a gradual decrease over time. If that’s happening, you’re in good shape. Pro tip: Don’t take any single day’s “site:” count too seriously – it can be unreliable from time to time. Look at the trend over time.
New vs. Existing Sites
I think it’s important to note that this problem only applies to existing sites, where the duplicate URLs have already been indexed. If you’re launching a new site, then putting nofollows on the review links is perfectly reasonable. You may also want to put the nofollows in place down the road, after the bad URLs have been de-indexed. The key is not to do it right away – give the crawlers time to do their job.
301, Rel-canonical, etc.
Although my example used nofollow and META NOINDEX, it applies to any method of blocking an internal link (including outright removal) and any page-based or header-based indexation cue. That includes 301-redirects and canonical tags (rel-canonical). To process those signals, Google has to crawl the pages – if you cut the path before Google can re-crawl, then those signals are never going to do their job.
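For reference, those alternative signals look roughly like this – illustrative only, and not a recommendation for the review-form scenario above. A canonical tag goes in the <head> of the duplicate page:

<link rel="canonical" href="https://www.example.com/product/1">

A 301 lives in the server config – for example, in Apache:

Redirect 301 /review/1 https://www.example.com/product/1

Either way, Google still has to re-crawl /review/1 to see the signal, which is exactly why cutting the crawl path first backfires.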
Don’t Get Ahead of Yourself
It’s natural to want to solve problems quickly (especially when you’re facing lost traffic and lost revenue), and indexation issues can be very frustrating, but plan well and give the process time. When you block crawl paths before de-indexation signals are processed or try to throw everything but the kitchen sink at a problem (NOINDEX + 301 + canonical + ?), you often create more problems than you solve. Pick the best tool for the job, and give it time to work.
Update: A couple of commenters pointed out that you can use XML sitemaps to encourage Google to recrawl pages with no internal links. That's a good point and one I honestly forgot to mention. While internal links are still more powerful, an XML sitemap with the nofollow'ed (or removed) URLs can help speed the process. This is especially effective when it's not possible to put the URLs back in place (a total redesign, for example).
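If you haven't built one before, a minimal sitemap for those orphaned URLs follows the standard sitemap protocol – something like this, trimmed to two entries:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/review/1</loc></url>
  <url><loc>https://www.example.com/review/2</loc></url>
  <!-- one <url> entry per URL you want re-crawled -->
</urlset>

Submit it in Webmaster Tools and leave it in place until the URLs have dropped out of the index.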
Nice post, Dr. Pete. I agree this is an area where people make a few small changes that can have a big impact on the website overall and hurt the linking structure.
In my eyes, straight 301s are the best fix if the content is really poor – or else redo the content on the page to make it more unique. I've never really been a fan of noindex tags or robots.txt blocks unless it's a private section of the site that absolutely must be blocked.
But overall, love the posts you've been making – the images are the best =)
This blog post and your comment made my day. Right now, I have a similar problem with my eCommerce website. I want to give a live example to explain it better. As per your suggestion, a 301 redirect is the best solution for a poor page. But I can't remove my review pages, because I have to keep them live.
This is my product page:
https://www.vistastores.com/patio-umbrellas-california-umbrella-alus756-sp57-yellow.html
And, This is my review page:
https://www.vistastores.com/review/product/list/id/1453/
I have set Meta Robots NOINDEX,FOLLOW on the product page and restricted the review folder via robots.txt.
I was struggling with a big question about my product page performance: product page indexing is quite poor due to a certain issue. After reading your comment and this blog post, I assumed it's happening due to this implementation on the review pages. What do you think about it? Can you give me some input here? I didn't want to add my question as a comment, because the discussion board is the right place for it, but I imagine this is the right way and platform to add my issue on a blog post about the same subject.
BTW, Dr. Pete: as I said, you made me happy with this blog post. Thanks for sharing.
"I have set Meta Robots NOINDEX, FOLLOW to product page"
If I've read your comment correctly it sounds like you're planning to apply (or have already applied) NOINDEX to your product pages?
If correct, that is surely a very bad idea...you'd be telling the search engines not to add your *product* pages to their index? It's late here in the UK, so apologies if I've misread your statement.
+1 from my side for catching my comment mistake – that was meant for my review pages. Today, I have good news about this error: I have removed the internal review system from the website due to this issue and integrated PowerReviews. I have restricted all review pages via robots.txt to fix this issue.
I should point out that I only used NOINDEX as an example - my hastily-added point at the end was meant to reflect that the same goes for 301s and rel-canonical. This problem applies to a bunch of tactical combinations.
I will say, though, that in this particular example, a 301-redirect wouldn't be feasible. The review pages are actually distinct pages, and users need to see a page that's tied to the proper product. You could capture that in a cookie or session variable and then 301-redirect, but you wouldn't want to send all visitors to one generic review form.
Thanks for your reply. I got your point. I just want to capture the PageRank that may be available after getting a few unique reviews. I'm going to implement the NOINDEX,FOLLOW tag, which will help me avoid poor indexing as well as pass PageRank from those pages to my valuable product pages.
Hi there.
Got a minor issue.
You've said that Google has crawled the Review pages.
You then state that applying NoFollow to the links to those Review pages cuts those links, and means Google cannot crawl the Review pages to see the NoIndex.
That's incorrect.
Once Google has crawled a URL - it knows it.
It doesn't need links to it any more.
It will still attempt to visit that URL.
You could remove ALL links to a previously crawled URL - and it will still get visits from GoogleBot.
(Yes, no links/fewer links means lower frequency/slower - but still gets hit!)
Further - NoFollow does not mean Google will not follow that link.
G has admitted that it is only a "suggestion" and may decide to crawl it anyway.
You make a good point, and it's why I said "potentially blocking the spiders". To be honest, I didn't want to get deep into the nuances and confuse the issue. Google will recrawl from its "memory" of your index, to a point, and does cache your indexed URLs. What I've found in practice, though, is that this usually only covers a small percentage of pages, especially when those pages are deep and/or duplicates.
In this case, if you added nofollow and NOINDEX on the same day to the 1,000 review pages/links, you'd probably see some initial drop, but I suspect it would die out around the 20-30% mark. At that point, Google would stop revisiting the pages that no longer have internal links. They might gradually prune off more, but it ends up being a very slow process. By keeping the links open, things go much more smoothly, in my experience.
I'm not disputing the fact that G requires the links to be open. It's covered in enough places (such as Google's forums).
I was merely pointing out the line about nofollow resulting in G not crawling the destination URLs.
I think there's anecdotal evidence that G doesn't crawl the destination URLs as vigorously or necessarily index them and honor the signals on those pages. You're right, though - it's not black-and-white by a long shot. I didn't want to hedge that point too much and make a confusing issue even more confusing for folks.
Fewer internal links = less PR flow; lower importance = slower crawl rate. Nothing really confusing about it (at least, there shouldn't be).
If you really want stuff gone;
That lot should do the job if you are in a hurry.
Not overly scalable - but should work well enough so long as you are willing to invest a little time/patience.
If I had a nickel for every time I've had to explain this, I'd have at least five dollars.
Now I can just send 'em to this post.
Thanks for saving me time. Enjoy your nickels.
Nice post Dr Pete.
So we have done exactly this on our large e-commerce sites – a mix of nofollow and noindexing. We are continuing a round of noindexing on various sections, and gradually we are starting to see positive results reported back in Webmaster Tools.
The purpose of the de-indexing process is to improve our crawl equity and get Google to the fresh content as it updates daily, rather than having the bots waste their bandwidth on deeper and less relevant pages – like deep pagination and faceted navigation.
We don't believe in 301'ing any of our content – only expired listings and classifieds. However, in the case of faceted navigation we are using rel=canonical, and we are starting to see great results.
Hi DR Pete,
Your process is right – adding nofollows is not the right way to get the page seen by the spiders, especially when we want the page de-indexed. Another common mistake is putting a disallow in robots.txt to prevent pages or directories from being indexed.
If we write
User-agent: *
Disallow: /page1/
we prevent page1 from being crawled, NOT from being indexed, so the on-page robots tag is the best choice.
Good advice on planning out your strategy.
I was under the impression that the poorly named rel=nofollow does not stop Google's robot from crawling to the destination page, but just stops the PageRank algo from passing any rank juice.
So it will not stop the destination from being re-crawled and indexed.
As already pointed out, once Google has actually indexed a page, it seems their ability to re-find it during a crawl is not relevant. It's in, and will stay in until there is a direct signal to remove it.
The meta robots tag with noindex seems to be the most direct, but slow, way to get a page removed from search results. However, I think it is still indexed, as you can specify follow, which means Google has to parse the page – it just does not get included in search results. Like nofollow, noindex is also poorly named.
What I see as a potentially real issue is blocking via robots.txt. This seems to stop Google from re-crawling a page and therefore finding out that its meta robots tag requests it not be indexed. So Google Webmaster Tools gets full of "blocked by robots.txt" errors (does it still report them?). I guess it's the same though – the page is in the index but not shown in search results.
In reality, I don't think there is any way to truly get a page removed from the index, you can just block it from being in search results.
See my reply to @RupertDeBere - I didn't want to confuse the issue too much, but you're right - Google will still cache and recrawl some nofollow'ed URLs. What I've found, in practice, is that deep/thin/duplicate URLs tend to only get recrawled partially and unreliably. So, in a situation like this, the nofollow would definitely hamper and slow your de-indexation efforts.
I've actually found NOINDEX to be fairly fast, in most cases. Rel-canonical can be faster, but it has to be used appropriately. I don't like to use it for generic examples, because people tend to get into trouble with that tag.
What frustrates me the most, especially on very large sites, are the GWT options. Removing a page or even a folder can be very fast. Parameter handling, though, which is in theory very useful, is slow or sometimes doesn't work at all. I really wish they'd make these tools, and the way Google applies them, more consistent and reliable.
Today, I was searching for how much time Google will take to de-index 301-redirected pages. I found this blog post, where I have already submitted my two comments with an associated example. But I have a similar (though different) question about de-indexing 301-redirected pages.
I changed the URL structure for 11,000 product pages and set up 301 redirects from the OLD URLs to the NEW URLs on 3rd July, 2012.
As you mentioned, de-indexing is quite a complex process for Google and may take a long time. Google Webmaster Tools has a section to remove URLs from web search. So, can I use it for my OLD URLs? Will it de-index my actual product pages or not?
I'm going to raise a similar question on the discussion board, but I think this is the right platform to drop my question. Is there any specific method which would enable me to increase the speed of de-indexing?
Right now, I'm measuring my indexing and de-indexing ratio via the NEW & OLD XML sitemaps.
It can take a frustratingly long time. The problem is that many deep pages aren't crawled very often, and even when Google re-crawls, they don't always process or honor a 301 the first time they see it, in my experience. One key is to leave crawl paths to the old URLs open - they have to be re-crawled for the 301 to kick in. Also leave XML sitemaps with the old URLs up (you can add the new ones, but keep the old ones active for a while), for the same reason.
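(To illustrate – I don't know your actual URL structure, so these paths are hypothetical – a one-to-one redirect in Apache would look something like:

Redirect 301 /old-category/product-123.html https://www.example.com/new-category/product-123

with one rule per old URL, or a single pattern-based RewriteRule if the old and new structures map predictably.)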
How did you manage to solve this? Or was it just a long wait? I have an explosion of pages due to some mistake along the line – for 15,000 products we're hitting 1,040,000 links in Google.
I have been removing URLs and blocking content in robots.txt, but without any effect in the last 2 weeks.
One of the best blogs of the summer, Dr. Pete. De-indexing pages from Google is one of those jobs that looks easy but is tricky. Thanks for your invaluable tips.
When we are done putting the META NOINDEX on the pages we do not want in the index, we can use the Google URL removal tool :) to quickly get them de-indexed without having to go through changing each link to nofollow.
Unfortunately, while the GWT tool works very well for a handful of URLs, or at the folder level, it doesn't scale well. Requesting 100s or 1000s of removals is very tedious, and generally not recommended by Google (they said something about it, but I'm having trouble finding the link). It is fast, though - if you've just got a few problem URLs, like a duplicate home-page, it's a good tool.
Tip: In GWT, you can use the folder-removal option to remove pages by file name – for instance, all URLs beginning with /apps/removeme.php.
Yes, Google says removing pages for the "wrong" reasons "may cause problems for your site". What problems, you may wonder. Any idea?
I would add a noindex, follow meta tag to the review pages and leave it at that. This way you will remove them from the index, but the link juice going to these pages will be returned through the outbound links.
Hi all.
Indexing problems are common for all websites with more than 50K URLs. Check your XML and HTML sitemaps – there are very minute errors which will keep crawlers from indexing. Check the priority and frequency values, and avoid old, unused URLs.
Hope this helps.
Hi Pete
Great post – it's really helpful. Can you help me better understand my scenario? Right now there is a problem with Google indexing our website. We had around 50K+ pages, all of which were indexed, but when we changed some parameters to pull the data dynamically, the index suddenly dropped to a mere 900. Now the indexed pages are increasing slowly, but there was an interesting observation: when we checked the total indexed pages on Google it was 2,400, but when we checked indexed pages at the next level (sub-folder/sub-level), Google showed all those pages as indexed as well, and we have around 2,000 pages in each sub-level. What can this be? Can you please help?
Any given site: results can be a bit hard to trust. A couple of suggestions:
(1) Track it daily - you may see some very low days, but then a steadier number that's more realistic.
(2) Dig into all sub-folders, in a logical manner. If those sub-folder counts don't add up to something close to the overall count, either the overall count is suspicious, or Google is re-evaluating your indexed pages for some reason.
Since you did make a major change, I would re-check that history. When you changed parameters, did site-wide URLs change? Did you set up 301-redirects from the old URLs? Did these parameterized URLs create potential duplicate content? There are a lot of things that could've happened when you made the switch.
Can I use robots.txt to de-index pages? Or will Google maintain the cached copy?
Robots.txt ("Disallow") can work well to prevent pages from being indexed, but seems to do a lousy job of knocking pages out of the index once they've been crawled. Personally, I've had better luck with other methods, such as META NOINDEX, 301s, or canonicals (depending on the situation).
Thanks for the easy-to-understand examples you have given. They help me understand the best practice more easily – thank you!
Hi Pete, this was an informative post and I enjoyed reading it. However, I was thinking: couldn't we show 5 to 10 random reviews of the same product below the form? That would actually provide more user-generated content and make the page unique as well, instead of adding nofollow and noindex.
I think I may have made the example seem a bit too literal - I was trying to illustrate the kind of situation where this problem could occur. I'm definitely not suggesting this is an ideal site structure.
In this situation, I was assuming that the actual product reviews lived on the product page, and the review page was nothing but a form that just happened to spin out a unique URL for every product. In that case, the review pages would have no search value.
Yes you are right. Thanks for the informative posts.
Nice article, bro – I think you're right.
During the period where the pages are noindexed, but the links are still being followed, are you passing wasted page rank to these duplicate pages? And if so, should you eventually change the followed links to nofollow links when most of them have been deindexed? Or is Google smart enough to not bother passing page rank to pages that are noindexed?
I can't even explain how helpful this post is. My numbers took a big hit, though, when I noticed my Twitter and Facebook Connect buttons were getting crawled... Then I recalled that I had just adjusted my robots.txt to "close the flow" – I didn't even think about how Google would read that. Thanks a ton!
Hello Dr. Pete! I think people forget that just because their site is recrawled every day, it does not actually mean that every page is recrawled. It also does not mean that Google recognizes all of the new signals. It can actually take weeks or even months.
I definitely agree that de-indexation requires plenty of time! To my mind, this article is valuable for people who have their own website and want the search engines to pick up the updated info from their site. See more and go deeper into the topic on deindex.pro
Perfect post, Dr. Peter, as always – thanks, today you saved me time in tracking down some common de-indexing issues.
I don't want to damage the incoming traffic that I do have, but I had to fix issues – the leads-vs-traffic conversion was horrible, there were broken and non-existent links, etc. The question is what I should do with my old pages so as not to negatively impact search engine placement.
Wouldn't it be way better to include the review pages in the product pages themselves? Much more unique content and less headache with the noindex.
Thank you for the info. I'm experiencing just such a problem and can't get the desired effect. What can you say about removing old, no-longer-existing subdomain content from Google?
Thanks for the post – it leads me to a quick question about our site:
We have an automated rel=canonical tagging system that uses the current URL structure of the page to automatically add and update the canonical tag in the header. We de-indexed/followed hundreds of user pages; however, they still have the rel=canonical tag attached to them. Should this be removed as well?
I'm not a big fan of mixed signals when it comes to Google, but if you have a self-referencing canonical and a META NOINDEX,FOLLOW, I think the NOINDEX is going to win out the vast majority of the time. The proof is in the pudding - if the pages drop out of the index, you've got nothing to worry about.
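(For clarity, the combination being described would look something like this in the <head> of one of those user pages – hypothetical URL, illustrative markup only:

<link rel="canonical" href="https://www.example.com/user/12345">
<meta name="robots" content="noindex, follow">

If it's easy to do, dropping the self-referencing canonical from pages you're deliberately removing keeps the signals cleaner, but the NOINDEX should still win either way.)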
Certainly a very interesting, quality article. My compliments to the author.
Hola Doctor,
I would rather pay attention to your finale, talking about sitemaps and ways to discover isolated pages. I did some unsuccessful testing trying to get isolated pages indexed only by having them prefetched with link rel=next.
Have you - or any of my post neighbours - ever tried this? If so, any positive result?
Thanks for taking the time to write this post. It may not be much for many people out there (because we expect you to grow us into very clever SEO people), but A LOT of other people may have found this very instructive.
Happy weekend everyone
I think this can also be done in a different way, through an XML sitemap: even with a nofollow tag Google can crawl the linked pages, it just won't pass any link juice from the host page. In my past experience, in spite of having no internal linking, I have managed to get a few pages crawled through the XML sitemap. And once the spider bots have that access, placing the meta noindex would do the same job.
Anyway, good work and nice info for webmasters.
Great post. My first action would have been to nofollow the review links too, but following them to allow the noindex tag to be seen is (now) insanely obvious.
P.S. The diagrams rock.
Nice post Pete.
I would like to add a couple of things.
In the part 1 fix, don't you think that, as long as the review pages appear in the XML/HTML sitemaps, search engines would still be able to reach them regardless of the nofollow?
The second one is about the very common misconception that disallowing certain directories/files in robots.txt will make them drop out of the index. Unfortunately, this is not necessarily the case, and Google suggests adding a robots noindex as the best way to remove pages from their index.
Using WMT, again, doesn't always work, and if it does, that only works for Google.
I'm embarrassed to say that I meant to address the XML fix, and I completely forgot. I don't think it's quite as effective as keeping links active, but it certainly can help. It's also good for situations where, for whatever reason, you just can't put the links back. I'll add a note in the post - thanks.
I have a WordPress-based site (www.mefindcoupon.com) and am using a sitemap plugin to control crawling. I'm curious whether not indexing category and tag pages will benefit our site. What do you think? Also, I have over 33,000 indexed pages – would it be wise to eliminate as many as possible? THANKS
Good post, Dr. Pete! Non-technical SEOs often make these kinds of logical mistakes. As a non-technical SEO, I want to ask one question and share one technique – I don't know whether they work or not!
Question: What happens if we block pages in robots.txt – are they removed from Google's index with the passage of time?
Suggestion: I often use another technique – blocking Google from reading my dynamic pages with the help of parameter handling in Google Webmaster Tools – and I think it is a good way to clean up our index (I am seeing some results, but it's still in the testing phase).
Dr. Pete you may want to tag this post "Magento"
"Pro tip: Don’t take any single day’s “site:” count too seriously – it can be unreliable from time to time. Look at the trend over time."
Definitely. It can be all over the map. You can query pages in a subfolder too, not just the whole site, and sometimes the counts in a subfolder will be more than the counts in the folder above it. In general, Bing's numbers seem to be a little more unstable (and generally smaller) than Google's, at least in my experience, but they're getting better too. I've been tracking some "site:" counts weekly in an Excel spreadsheet for over a year, along with the Webmaster Tools sitemap counts, and monitoring those trends together helps me know I'm moving in the right direction.
I like to break sites into subfolders with "site:" and then see if those sections add up, as a gut-check. It not only helps validate the overall indexed page count, but it helps me spot problems I might otherwise miss. Unfortunately, that's a tedious, manual process. I do it on site audits, but it takes a couple of hours to do well (plus all the time to track it going forward).
The same can be done with the robots.txt file, no?
I really like the diagrams in this post Dr Pete - very clear & a pleasure to read :)
Meta NOINDEX tags are also a good fix for certain scenarios like getting rid of .HTML files on old IIS servers where 301s aren't easy to manage ;)
Good tip on controlling your duplicate content, Dr. Pete.
But once those review pages start being populated with reviews, they're no longer duplicates, and do become valuable for anyone searching "product1 reviews". I think you would want to monitor your review pages closely, or build in automation that would add/remove the noindex tag based on the presence of a review.
This was just an example, but I intended it to be just a review form tied to the product - the reviews (hypothetically) would be on the product page, but each link to a review form would have a unique URL. I saw a similar case recently. Of course, I don't intend that as an ideal structure - I just meant it to illustrate a scenario that could cause this problem.
The canonical tag has worked really well on every one of the (feels like hundreds of) e-commerce sites I've optimized, but it can take a little while for Google to "get it" – i.e., whenever they run their canonical algo and then compile.
But since e-commerce sites are dynamic, always be on the lookout for a page template that didn't do what you expected. The bigger the site, the greater the complexity, and the larger the room for error. Scan your templates in QA. I've caught sites improperly funneling spiders months after implementing the canonical, because the developers thought they knew what they were doing, or because users were using templates in a different way than intended.
I'm actually dealing with a similar situation at the moment: an e-commerce site with about 800 products. My initial problem was figuring out how to make sure the category pages weren't going to cause any duplicate content problems. At the moment I just use the canonical attribute for the product pages, and I'm not using noindex or nofollow anywhere internally. As far as I'm aware, this should be sufficient for Google to figure out what's what.
I was experiencing duplicate pages on my site a few weeks ago – site: on Google showed 7,220 pages! I had to create a rule in my htaccess file and change some code configuration, and that fixed the problem.
Maybe it's my ego, but I can't help thinking I was an inspiration for this post in some small way! Dr. Pete gave me the same advice a couple of weeks ago when looking at an issue I had with faceted navigation on an e-commerce site.
One thing I would add is that de-indexing can be SLOW. We had 400k pages indexed on a site that realistically should have been more around the 5k-page mark. Still not sure whether that resulted in a Panda-esque penalty or just massive cannibalisation, but most of the most valuable pages absolutely tanked in the results.
We have gone with viciously strict canonical tags. Our plan is to canonical back to just the bare minimum, then look at re-introducing sections one at a time if we think they are significant enough.
Changes were made on-site just over 1 month ago now. We're currently at around 220k indexed pages, although Google does periodically tease us by showing 4k. A few results are coming back, although other things are happening so it is hard to say if that is solely down to the clean-up.
What is worth noting though is that the speed of de-indexing is dropping. Unless the results that we are getting teased with go live, we're not expecting this to be "fixed" for quite a few more weeks.
Sounds like the makings of a good blog post.
Am considering it... I might wait to see whether it has a happy ending first though!
Are you sure you want to be the inspiration for this post? ;)
It really is a painfully slow process. People forget that, just because their site is recrawled every day, it doesn't mean every page is recrawled or that Google honors all of the new signals. It really can take weeks or months, and you often have to adjust as you go.
Be a little careful with canonical - used too broadly, it can give you some trouble. It can also be tough to reverse, if you want to re-open content later. I actually like NOINDEX a bit better for that. It's a little easier to reverse if you just want to add content gradually. Of course, it's very situational, which is what makes giving advice so hard.
We will probably be changing the URL structure of any "re-opened" sections anyway. The main motivation for this is a technical consideration, but will hopefully save us from any stubborn canonical instructions that we can't undo.
Compared with the 400,000 URLs we've been staring at lately, it's a pretty small consideration anyway.
Thanks, Dr. Pete, for the article. Can I clarify the suggestion for new sites to use internal rel=nofollow? I was under the impression that any internal nofollow is not best practice, as it's seen as PR sculpting.
You certainly want to be more cautious, but I think internal nofollows still have a place when you really want to discourage crawlers from going deeper down a path. I sometimes put it at dead-ends or at layers where anything beyond that layer is content you don't want indexed (shopping carts, for example). You can NOINDEX, etc., of course, but I find that the nofollow helps the crawlers sort out what's important and keeps you from wasting their bandwidth. I don't, admittedly, have strong proof of this.
Dr Pete,
Really good post. I also face this kind of problem when I handle e-commerce sites. Thanks a lot for your valuable suggestions.
By the way, your picture representations are really intelligent.
Hi Dr. Pete. Again, I am ready to get tons of dislikes for this comment, but not every comment will be as sweet as sugar. The post has no real meaning – you could easily have written it in 2-3 lines instead of uselessly stretching such a minor thing. When I saw the title of the post, I thought, oh, I am going to get something really good and technical from Dr. Pete, but I was disappointed.
The title is largely irrelevant, in the sense that the title is too heavy while the content has nothing worth noticing. All you had to say is to add a noindex to pages that may seem similar, while the title is saying something else.
And to be honest, the post just ended before it even started – that's not the way posts from Dr. Pete usually are. Next time you have to write a post, try to do some homework: write down all the points you have to cover and then write about them. Otherwise, writing such thin posts and uselessly stretching a minor topic won't yield anything. So now I will get tons of thumbs down from lots of blind followers, who follow the trend of liking stuff from a famous face (even if it's time-wasting stuff) and disliking comments that hold a mirror up to the big faces. Best of luck with your next article. This one flopped :)
I enjoy succinct posts. Too many are stretched out way beyond requirements. In my ideal world, every post would be bullet-pointed, with the important content highlighted at the start and then a deeper explanation below if required.
This post raises some interesting points to keep in mind when carrying out something which is fairly important in SEO. And remember that not everyone who reads SEOmoz is an SEO pro. Though I might agree, the title is a little ambiguous.
Without my critical comment, I am damn sure you would never have said anything about the title of the post! But look, in this community we have to be true and transparent rather than sugar-coated fake members with diabetes-causing sweet comments. TAGFEE is what we need to follow here, and Mozzers may agree with this TAGFEE thing (though my comment may have sparked their anger nerve, I can still hope for something). But I am glad I got one more vote about the ambiguous title... and I hope that the censoring authorities at SEOmoz will censor and reject any future post with an ambiguous title like this one.
Hey Asad,
I think your assumption is that everyone understands the concepts illustrated in the post and just needs the 'bullets'. Whilst for me personally a shorter version would've been just fine, I'm sure there are enough people out there who are glad to see a graphical illustration of how this works with individual steps.
At the end of the day, as you already knew what was going on, it shouldn't have taken you more than 30 seconds to skim the post, so there was no real 'harm' done in terms of your time being wasted. However, for anyone new to this, there was a lot of 'good' done.
It's not like it's a 7 minute video where you have to sit through the entire thing just to find out it was nothing new for YOU.
just my $0.02
Veit
Respected eleuth, it's not about my time being wasted; if I am here on SEOmoz, I am ready for both good and bad stuff. I have the option to just leave the community and not come here, but that's not the case – all I want is to improve the stuff. I don't know if you have any idea or not, but this Dr. Pete is a super genius when it comes to SEO, and I expect much, much better stuff from him than such a hollow post. The article has a good and valid point, but it's a minor issue for which there is no need to take up a whole post. And Dr. Pete, if you are listening: Eleuth said in his comment above, "personally a shorter version would've been just fine".
Now let me answer the other point. If you are reading a book by Einstein, you would most probably expect something from advanced physics rather than the basics. Same is the case here: for newbies there are a lot of places to learn, but from Dr. Pete, at least, I expect something a bit more advanced and posts with thicker content.
I get it. We all do. You disapprove of the packaging. You like the message (sorta) and dislike the delivery (alota). Truly and sincerely: I get it. Message received. But here's the thing about your delivery, and your analysis of the author's, that I disapprove of: the title of this blog is "The Daily SEO Blog." That is not a remotely ambiguous title. Not even possibly easily misunderstood. Using your Einstein analogy, which was hasty and agenda-driven incidentally, the title of this blog would need to be something more like "The Daily Atom-Splitting SEO Blog." Allow me to expand. How obvious is it, for instance, that we shouldn't be killing one another all over the world? Does it still happen? Does it still consume headlines? Are there panels and panels of "experts" poring over the topic? Yes. Are they wasting their time patronizing the super genius in the room? Perhaps, at times, yes, but the problem still remains. So let me pay homage to your degree of candor by responding in kind: your delivery - your packaging - your approach to contention (in at least this instance) comes off equal parts brilliant and informed, yet unfortunately for this reader, more parts pretentious, narrow and immature.
So here we are. SEOmoz espouses the genius and often the largely obvious, but perhaps overlooked as well. They're the SEO for everyone. The hostess with the mostest, I guess one could say. This greenhorn has never undertaken a read on this website that doesn't speak to me. Never above me, never beneath me. Kudos, I say! So given the choice between your notion of substance-meets-bluntness and the good Dr.'s bedside manner, somebody get me a doctor! Good day.
Well, you have complicated the stuff too much, but the issue is simple – let me try to explain it in an easier way.
Suppose you buy a box with "iPad 3" written on it, so what you expect is an iPad 3 inside. And when you open it, you get a pair of shoes instead – what are you going to do?
That's what happened here! The box is from a very authentic company, Apple Inc. (Dr. Pete in our example). The title is "iPad 3" (the logic-meets-Google-crawling-to-de-index title in our example – a very fascinating title indeed), and the inner content is not what we expected of it.
Hope you get it now. Let us be honest in our views rather than just blindly praising!
I don't have much to say about your comment, but people do have different opinions and tastes... It's completely fine that you didn't like the post, and it's even great that you shared that openly with the other community members, but the thing I don't like is the way you reacted to it.
I mean, telling someone to do some homework when writing their next post is way too humiliating (especially when you are talking to an industry leader)...
I do respect Dr. Pete and I have been following him for quite a long time. The reason I think this post is important (IMHO, and you can disagree here) is that in the race for new and complex SEO ideas, some people forget the basic rules of thumb and make errors... Remember, he said "...couple of common mistakes..."
P.S. You see! You didn't get many thumbs down here... I think you should reconsider the way you think about other community members... #justathought
Eight thumbs down is not a small number, I guess, with a few thumbs down on sub-comments as well.
And the homework statement was not meant in a literal sense, and there was nothing humiliating about it as such; it's just an honest view of the article, where I feel that Dr. Pete can provide us with much, much better content than that. He is one of the industry leaders, so it's obvious he does not need any homework as such. There is nothing humiliating or personal – it's simply about the given article we are discussing here.
Consider this example: "I don't expect Intel to make a 486 or Pentium 1 anymore; I would expect them to make a Core i5, a Core i7, or even higher."
Hope you get the point now!
I'm always open to critical feedback, but I feel like you may have missed part of the point of the post - it's not a post about just using NOINDEX. It's a post about using NOINDEX (or any page-based de-indexation cue) correctly. The devil is in the details, as they say.
I wrote this post for the same reason I write many of my technical posts - because I've seen a handful of problems in Q&A and even with my own clients. Many people seem to be misunderstanding the nuances of de-indexation - it sounds simple in theory, but it's incredibly difficult in practice, especially on large sites. I've seen this particular mistake cost people weeks or months (and that means $, in most cases).
I'd also point out that, with 100K subscribers, we can't make 100% of the people happy 100% of the time. Different authors here have different approaches to that problem, and mine is diversity. I try to mix it up - some in-depth posts, some comprehensive, some more basic. Sometimes, I'll even write about entrepreneurship or blogging. Sometimes, I get the mix wrong, and people call me on it - I appreciate that. On the other hand, I'll never make everyone happy with every post.
Sometimes the simple topics are the most important. I've seen noindex attempts blocked by robots.txt and nofollow declarations many times. You've touched on an important skill and outlined how to actually get it right. It's simple and effective! Good job :)
Guess what? Your reply is the most sensible and most relevant, and it addresses my concerns. (One of the four thumbs up for your reply is mine :P)
I don't know why many other respectable members are taking it sort of personally, misunderstanding parts of my comment and considering it somewhat humiliating or something. It's just a view about an article – nothing more, nothing less.
*White Flag* :)