Some of you know that I spend a lot of time behind the scenes here on Pro Q&A. One of the challenges of Q&A is that we often have to tackle complex problems in a very short amount of time – we might have 10-15 minutes to solve an issue like "Why isn't my page showing up on Google?" with no access to internal data, server-side code, etc.
Of course, I'd never suggest you try to solve your own SEO problems in just 10 minutes, but it's amazing what you can do when you're forced to really make your time count. I'd like to share my 10-minute (give or take) process for solving one common SEO problem – finding a "missing" page. You can actually apply it to a number of problems, including:
- Finding out why a page isn't getting indexed
- Discovering why a page isn't ranking
- Determining if a page has been penalized
- Spotting duplicate content problems
I'll break the 10 minutes down, minute by minute (give or take). The mini-clock on each item shows you the elapsed time, for real-time drama.
0:00-0:30 – Confirm the site is indexed
Always start at the beginning – is your page really missing? Although it sometimes gets a bad rap for accuracy (mainly on the total page counts), Google's site: command is still the best tool for the job. It's great for deep dives, since you can combine it with keyword searches, "keyword" searches (exact match), and other operators (intitle:, inurl:, etc.). Of course, the most basic format is just:
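site:example.com

(Here, as throughout these examples, example.com is just a stand-in for your own domain.)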
For this particular job, always use the root domain. You never know when Google is indexing multiple sub-domains (or the wrong sub-domain), and that information could come in handy later. Of course, for now you just want to see that Google knows you exist.
0:30-1:00 – Confirm the page is not indexed
Assuming Google knows your site exists, it's time to check the specific page in question. You can enter a full path behind the site: command or use a combination of site: and inurl:
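site:example.com/folder/missing-page
site:example.com inurl:missing-page

(The folder and page names are hypothetical; substitute your own URL path.)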
If the page doesn't seem to be on Google's radar, narrow down the problem by testing just "/folder" to see if anything on the same level is being indexed. If the page isn't being indexed at all, you can skip the next step.
1:00-1:30 – Confirm the page is not ranking
If the page is being indexed but you can't seem to find it in the SERPs, pull out a snippet of the TITLE tag and do an exact-match search (in quotes) on Google. If you still can't find it, combine a site:example.com with your page TITLE or a portion of it. If the page is indexed but not ranking, you can probably skip the next couple of steps (jump to the 4:00 mark).
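For example, the two searches might look something like this (the title text is hypothetical):

"10 Minutes to Find a Missing Page"
site:example.com "10 Minutes to Find a Missing Page"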
1:30-2:00 – Check for bad Robots.txt
For now, let's assume your site is being partially indexed, but the page in question is missing from the index. Although bad Robots.txt files are, thankfully, getting rarer, it's still worth taking a quick peek to make sure you're not accidentally blocking search bots. Luckily, the file is almost always at:
https://www.example.com/robots.txt
What you're looking for is source code that looks something like this:
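User-agent: *
Disallow: /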
It could either be a directive blocking all user agents, or just one, like Googlebot. Likewise, check for any directives that disallow the specific folder or page in question.
2:00-2:30 – Check for META Noindex
Another accidental blocking problem can occur with a bad META Noindex directive. In the header of the HTML source code (between <head> and </head>), you're looking for something like this:
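<meta name="robots" content="noindex">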
Although it might seem odd for someone to block a page they clearly want indexed, bad META tags and Rel=Canonical (see below) can easily be created by a bad CMS set-up.
2:30-3:00 – Check for bad Rel=Canonical
This one's a bit trickier. The Rel=Canonical tag is, by itself, often a good thing, helping to effectively canonicalize pages and remove duplicate content. The tag itself looks like this:
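<link rel="canonical" href="https://www.example.com/" />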
The problem comes when you canonicalize too narrowly. Let's say, for example, that every page on your site had a canonical tag with the URL "www.example.com" – Google would take that as an instruction to collapse your entire search index down to just ONE page.
Why would you do this? You probably wouldn't, on purpose, but it's easy for a bad CMS or plug-in to go wrong. Even if it's not sitewide, it's easy to canonicalize too narrowly and knock out important pages. This is a problem that seems to be on the rise.
3:00-4:00 – Check for bad header/redirects
In some cases, a page may be returning a bad header, error code (404, for example) or poorly structured redirect (301/302) that's preventing proper indexation. You'll need a header checker for this – there are plenty of free ones online (try HTTP Web-Sniffer). You're looking for a "200 OK" status code. If you receive a string of redirects, a 404, or any error code (4xx or 5xx series), you could have a problem. If you get a redirect (301 or 302), you're sending the "missing" page to another page. Turns out, it's not really missing at all.
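If you're comfortable with the command line, curl can do the same job as an online header checker (a minimal sketch; the URL is a placeholder):

curl -I https://www.example.com/missing-page

The first line of the response shows the status code - "HTTP/1.1 200 OK" means you're fine, while a 3xx, 4xx or 5xx code means it's time to dig deeper.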
4:00-5:00 – Check for cross-site duplication
There are basically two potential buckets of duplicate content – duplicate pages within your site and duplicates between sites. The latter may happen due to sharing content with your own properties, legally repurposing content (like an affiliate marketer might do), or flat-out scraping. The problem is that, once Google detects these duplicates, it's probably going to pick one and ignore the rest.
If you suspect that content from your "missing" page has been either taken from another site or taken by another site, grab a unique-sounding sentence, and Google it with quotes (to do an exact match). If another site pops up, your page may have been flagged as a duplicate.
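As a quick sketch (the quoted sentence is obviously a placeholder), you can even exclude your own domain so that only the copies show up:

"some unique-sounding sentence from your page" -site:example.com

If that query returns results, somebody else is publishing your content.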
5:00-7:00 – Check for internal duplication
Internal duplication usually happens when Google crawls multiple URL variations for the same page, such as CGI parameters in the URL. If Google reaches the same page by two URL paths, it sees two separate pages, and one of them is probably going to get ignored. Sometimes, that's fine, but other times, Google ignores the wrong one.
For internal duplication, use a focused site: query with some unique title keywords from the page (again, in quotes), either stand-alone or using intitle:. URL-driven duplicates naturally have duplicate titles and META data, so the page title is one of the easiest places to find it. If you see either the same page pop up multiple times with different URLs, or one or two pages followed by this:
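In order to show you the most relevant results, we have omitted some entries very similar to the ones already displayed.
If you like, you can repeat the search with the omitted results included.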
...then (that's Google's standard omitted-results notice) it's entirely possible that your missing page was filtered out due to internal duplication.
7:00-8:00 – Review anchor text quality
These last two are a bit tougher and more subjective, but I want to give a few quick tips on where to start if you suspect a page-specific penalty or devaluation. One pretty easy-to-spot problem is a pattern of suspicious anchor text – usually, an uncommon keyword combination that dominates your inbound links. This could come from a very aggressive (and often low-quality) link-building campaign or from something like a widget that's dominating your link profile.
Open Site Explorer allows you to pretty easily look at your anchor text in broad strokes. Just enter your URL, click on Anchor Text Distributions (the 4th tab), and select Phrases.
What you're looking for is a pattern of unnatural repetition. Some repetition is fine – you're naturally going to have anchor text to your domain name keywords and your exact brand name, for example. Let's say, though, that 70% of our links pointing back to SEOmoz had the anchor text "Danny Dover Is Awesome." That would be unnatural. If Google thinks this is a sign of manipulative link building, you may see that target page penalized.
8:00-10:00 – Review link profile quality
Link profile quality can be very subjective, and it's not a task that you can do justice to in two minutes, but if you do have a penalty in play, it's sometimes easy to spot some shady links quickly. Again, I'm going to use Open Site Explorer, and I'm going to select the following options: Followed + 301, External Pages Only, All Pages on The Root Domain.
You can export the links to Excel if you want to (great for deep analysis), but for now, just spot-check. If there's something fishy on the first couple of pages, odds are pretty good that the weaker links are a mess. Click through to a few pages, looking out for issues such as:
- Suspicious anchor text (irrelevant, spammy, etc.)
- Sites with wildly irrelevant topics
- Links embedded in an obviously paid or exchanged block
- Links that are part of a multi-link page footer
- Advertising links that are followed (and shouldn't be)
Also, look for any over-reliance on one kind of low-quality link (blog comments, article marketing, etc.). Although a full link-profile analysis can take hours, it's often surprisingly easy to spot spammy link-building in just a few minutes. If you can spot it that fast, chances are pretty good that Google can, too.
(10:00) – Time's Up
Ten minutes may not seem like much (it may have taken you that long just to read this post), but once you put a process in place, you can learn a lot about a site in just a few minutes. Of course, finding a problem and solving it are two entirely different things, but I hope this at least gives you the beginning of a process to try out yourself and refine for your own SEO issues.
Pete,
This is an amazing little guide for someone to start finding some problems they may be having on their site. I follow a similar flowchart when looking for problems on my clients' websites. Anyone wondering what the issue with their site might be should start here!
Thumbs up!
Agreed. I haven't had an issue (yet) on my website but this is a great guide on where to start if it ever happens. I never thought about accidentally collapsing my entire site to one page due to a canonical issue. Good heads up Pete.
Thanks for the great article.
There is only one thing I would add: check all URLs with search bots as clients (simply use the FF add-on User Agent Switcher). It always surprises me how often (hopefully) unintended forms of cloaking can be seen out there.
That's an excellent addition. I usually save the one until after I suspect a problem, but you're absolutely right - there's still a lot of accidental cloaking out there.
I'm running into a different kind of problem. For the last 4-5 months my site ranked 8-10 on the first page, but last week I suddenly lost rank. For the first two days I was at 23, and now I'm at 30. I don't understand what the problem is, and I'd be happy to get a suggestion.
Hey Phranz,
Could you share your process for doing that please?
Hi Donnie,
glad to do that.
Simply grab the User Agent Switcher for Firefox (https://addons.mozilla.org/en-US/firefox/addon/59/). After installation, head to Tools > Default User Agent. In this section you can pick any of the major search bots as clients. As of now you can pick Googlebot 2.1, MSN Bot 1.1 (until it's replaced by Bing Bot in October) and Yahoo! Slurp. After you have chosen one of these, simply refresh the URLs you want to check to see if everything is delivered as intended.
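(A quick command-line alternative, if you'd rather not install an add-on: curl can send the request with a spoofed user agent. The UA string below is Googlebot's published one; the URL is a placeholder.)

curl -A "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" https://www.example.com/

If the HTML that comes back differs from what a normal browser sees, some form of cloaking is in play.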
Hope that helps.
Took >10 min to read Pete's excellent Quickie How To in 10 due to all the comments.
10 minutes WELL spent!
Awesome post! Should be mandatory for everybody who wants to improve their own SEO.
Can't imagine my morning without going through Google Webmaster Tools, looking at everything, every tiny thing, to make sure that it's correct and, more importantly, that nothing's wrong.
Small things such as keywords that don't match the content or missing alt tags (the list can be endless) can have a negative effect.
Thanks again
Very useful post, Peter. You can add one more weapon to your arsenal to quickly check for keyword stuffing, hidden text, etc.:
https://tool.motoricerca.info/spam-detector/
and those who can spend more than 10 minutes:The web developer Toolbar
https://addons.mozilla.org/en-US/firefox/addon/60/
Here is a post on seomoz by Rob on how to use this toolbar effectively:
https://www.seomoz.org/blog/web-developer-toolbar-for-seo
First time I've seen that spam detection tool. Interesting concept - thanks.
There is, in fact, a sort of spam detection tool in the SEOmoz web app (if I'm not mistaken). It's in the On-Page Report Card, where the application checks for keyword stuffing in the document and page title. It comes under the 'high importance factors' category. However, 'hidden text' and 'doorway pages' are not listed anywhere. I think these could be a valuable addition to the application and important factors for grading the on-page optimization of a web page.
Good point - I've passed that feedback on to the team. We've left out some of the old-school stuff like hidden text.
This is mostly review but still a great resource for quick-fixes to pages. Great for beginners and veterans. Thanks!!
A good quick check list indeed.
I am sure all SEOs do it all the time but you have worded the whole process very aptly.
And yes that canonical tag is a real trickster indeed!
Thanks.
I remember when I first discovered the canonical tag, then wondered why all of my pages were removed from the index apart from my home page. There was a massive slap on my forehead when I figured out the problem.
Fortunately, the problem was resolved really easily and my pages creeped back into the index and started ranking again.
This was a long loooooong time ago by the way!! Back in the dark years when if someone asked me what SEO was I would've said "essy-what?".
When rel=canonical first came out, I would've never thought it could be a recipe for disaster, but I've seen situations like that at least a handful of times now, and some didn't recover quite so quickly. It's an amazingly useful tool, but anything that you can apply sitewide can be dangerous. It's amazing how a single line of HTML can destroy a site.
I think, "It's amazing how a single line of HTML can destroy a site." should be painted in large letters on the wall behind my computer screen to remind me to be a bit more careful when tinkering ...
Great post, not only helpful in terms of this problem, but structuring a 10 minute response to many problem issues.
I would not have thought it either, but I was amateur enough to think I was doing it right with very little research. A lesson well-learned!!
One of my design assistants once added an incorrect canonical tag to an important page on one site. Initially it was OK: Google passed the ranking on to the linked page, which I thought was cool (and possibly could be used to manipulate results). But a week or two later it bombed - it took months to get things right again!
One of the sites that I was working on (a Joomla site) met the same fate, in spite of my instructions to the developer that the tag should be on the home page only.
When the indexed count started showing a reduced number in WMT, corrective action was taken.
Being a Joomla site, it took its own sweet time, but nevertheless the indexing and rankings came back.
Next I hope to see an article about how to deal with content duplicated across the Web, especially commonly duplicated content such as product descriptions.
One more reason could be an X-Robots-Tag noindex directive returned in the HTTP headers for the page.
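That directive travels in the response headers rather than the HTML, so it's easy to miss. It looks like this:

X-Robots-Tag: noindex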
The SEO Doctor add-on for Firefox detects all of them automatically in 5 seconds :)
https://www.prelovac.com/vladimir/browser-addons/seo-doctor
Thumbs up, Pete. I find that I thrive on having "to do" lists like this. Even if I eventually do it all from memory, having a list that spells it out step by step (like yours) always keeps me focused.
Hey GNC, you seem like Firegolem with these changes of avatar ;)
I just couldn't bring myself to get rid of my faithful old buddy. He's been gone for almost 4 years now and I still miss him.
PS - Nice to see you today at the webinar.
I too find these types of processes especially helpful. Efficiency is one of the most valuable assets.
-seomozer in training
Bookmarked checklist, check!
I love this guide, Pete! It's the short and sweet guides like this that really help me out. I sometimes get very lost in the large posts, but ones like this are much easier to digest.
One thing I always do is run the URL through Integrity, that will show me any errors it may have too. It's not necessarily 100% complete but it helps me out and it's automated so there's no problem with a bit of extra information. Just a note: Integrity is an application for OS X, not sure what alternatives there are for Windows, sorry!! If someone could reply to this with an alternative, that would be great.
I'm 100% going to action this. I'm going to keep it in my bookmarks and head back to it, maybe this evening or tomorrow evening, when I'm doing a bit of research for my personal site. Thanks again! :-)
Xenu's Link Sleuth is what I use on my Windows machine. I use Integrity on my Mac.
Same as me then. I wasn't sure if there were any alternatives to Xenu's Link Sleuth, but I think it's probably up there as the benchmark free application for 'doze.
Great job, Pete, this rocks!
I like HttpFox better than Web-Sniffer for checking response headers, but perhaps that's because I've never tried Web-Sniffer :-)
Quick question in regards to the last step you mention: finding the spammy links. When examining my own site, I discovered that I have about 20 unsolicited links that are from an unrelated industry blog. Looks pretty spammy to me, and I'm wondering what your recommendation is? Should I ask the site owner to remove them? To me, they look like they could have easily been a part of a paid link campaign, and although I know that's not the case, Google won't. Thoughts?
You know, in general, I wouldn't worry too much. On balance, links are good the vast majority of the time. It's really wide patterns you need to look out for, especially if you had a hand in creating them.
A lot of it boils down to your overall link profile, IMO. If you have 100+ backlinks and in that mix are some strong, relevant links, I wouldn't worry about 20 links from one site (that's another critical factor - it's 20 links, but, I assume, just 1 domain). If those are the only links you have, then I'd recommend doing some of your own link-building soon to counteract the questionable profile.
Nice article, Mr. Pete. With this very easy step-by-step guide, I can even squeeze 3 seconds into each step for sipping my coffee. ;)
Assuming there is a case of a page-level penalty due to an anchor text issue, what can be done? Can we just build more "un-optimized" anchor text links to that page and get the penalty removed?
Often, there's a temporal pattern at play, such as building up dozens of links in a very short time period. So, the good news is that, if you lay off that tactic, the problem sometimes goes away. Generally, though, I think the best approach is to diversify - take a break on low-quality links and build some high-quality ones with unique anchor text. The plus side to that is it's always a win long-term.
Had to register and post a comment...
first off, massive thanks for all you do for the SEO community.
I've been an SEOmoz lurker for years. When I switched jobs last year, my current employer had a Pro account and I got to benefit from all that goodness.
I've been banging my head against a wall for the last month figuring out why the majority of our pages wouldn't get indexed. We recently relaunched our flagship website and had to do some hacks to get it to work exactly how we wanted it. One of those hacks was to our CMS SEO plugin, which created some really funky canonical rules. After reading this post I checked all our canonicalization, and no wonder none of our pages were being indexed - it was a terrible, terrible mess.
THANK YOU for this post. I've lost sleep over this issue and would never have even thought about checking the canonicals, since they're handled by our CMS and we use the same (non-hacked) CMS for all our other websites without issue. Time to go renew the Pro account and give back!
Always glad to be of help. CMS systems are a huge time-saver and a boon to web development, in general, but once you automate something, there's always the chance for trouble. Today, it's a bad canonical tag - tomorrow, killer robots bent on revenge.
Good to know these quick tips for checking if something's wrong with a site. Thanks Pete!
Good advice. Once you get the hang of things, it really does only take about 10 minutes, as you mentioned. I like the tip about the OSE anchor text distribution, which can be useful for seeing when a site has been 'over-optimised' for a certain anchor text - when a less-than-honest SEO company has paid for a batch of backlinks.
Our stuff is definitely being scraped ... front-page copy, blog posts, images ... everything. And the content is showing up both on aggregator pages on our industry topic and on pages that are in no way relevant to our industry.
Even worse ... the plagiarizing pages are ranking higher in Google than the pages on our site that are being scraped!
Sorry to be vague, but we're all of a sudden getting killed in Google rankings, dropping off the front page on many keywords.
So ... do we completely reword our pages to keep them fresh and not be penalized for duplicate content? Suggestions?
Unfortunately, these situations can get pretty complex. It is possible to have higher-authority sites copy your content and for Google to credit it to them. In most cases, though, you may have internal issues that are weakening your own ranking, such as duplicate content. Make sure your own house is in order as much as possible. I'd also suggest submitting XML sitemaps and using other cues that will help Google see your content first. If the other site is getting indexed faster, Google might see the content from them first.
Excellent resource. Been looking for an answer to this for a client. Thanks!
This is one of the most difficult parts of the job as an SEO. It's much more technical, analytical and hypothetical, because you're chasing a unicorn. For the most part, it could be out of your control; it could be the search engines' algos.
You kind of know what's wrong, and you have a good assumption about it. But it could be many, many factors.
You just have to keep your head down and keep on working through it.
Hello,
Very nice post, I like it a lot. I'd also suggest that others do this exercise regularly. Thanks for such nice information.
Regards
Good post. What else can be added to the SEO routine?
Great post! Thanks a lot.
Good post!
Quick and easy tips to follow! I had an experience just recently where one of my sites was indexed and ranked on all 3 major search engines, but then one day on Bing my pages just dropped off. They were fine on Google/Yahoo, but on Bing they were no longer indexed. In a few days they all came back, but I'm curious as to why that happened.
"or poorly structured redirect (301/302)"
What about a 300 redirect? (Multiple Choices)
I've honestly never encountered a 300 code in my 13 years of web work, and even finding references was a bit tricky, but apparently Google doesn't handle them particularly well:
https://www.seroundtable.com/archives/020555.html
When I read "4:00-5:00 – Check for cross-site duplication",
I searched Google for the following sentence from my site www.seo-cook.com. To my surprise, there were no results for my page on Google's first SERP. Oh my god, I haven't focused on my site recently, and my content has been plagiarized by somebody who gives no link back to my site. It's really a good way to check whether your page content has been plagiarized by others!
"Consider how profitable you could be if your Chinese prospective customers found your website near the top of Baidu search results for the keywords used in your industry."
It can be a bit tricky, because sometimes people will use a snippet but then link back to you. I wouldn't worry too much about that. It's the wholesale content theft that's an issue. Unfortunately, you're in a market that I barely understand, either from a legal or an SEO standpoint. I'm also not sure how Baidu handles dupes.
Thanks for this news.
https://giaiphapseo.net
It was an awesome post for speeding up your analytical skills. Sometimes we waste a lot of time analyzing these errors - I spent all of yesterday analyzing an AdWords and Analytics account integration error and was unable to resolve the problem.
So it was good one for me :)
Great post, Pete!
Question; let's say you find some low quality links. Then what? Kind of hard to edit someone else's site(s).
Thanks!
Well if 94501 can get away with throwing a question in here, I'm going to give it a shot (while publicly displaying my ignorance to boot).
Page penalty - that really happens? (I always thought a penalty would apply to the entire site)
It does happen and my gut feeling is that it's getting more common. We've seen individual pages blocked.
I would think that if it's your intellectual property, you could file a DMCA notice. I don't have much experience with this kind of thing, though.
https://en.wikipedia.org/wiki/Digital_Millennium_Copyright_Act
Yeah, it's tough, and Google isn't very consistent. On the one hand, they know that other people could try to doctor your links to make them look bad, so those penalties aren't common, but there are patterns to avoid.
Generally, some possibilities - (1) If they're really bad, ask to have them removed (especially if they're obviously spammy, like hidden text). (2) Build high-quality links and lay low on low-quality links - it's all about your overall profile. (3) If the spammy links are all of one type (one set of anchor text, all blog comments, etc.), lay off that tactic for a while - DIVERSIFY.
Firstly: great post - a nice, clear way to solve several (potentially large) headaches. :)
Pete, just wondering if you have any ideas/experience on timescales for getting a page re-indexed/un-penalized? I'm dealing with a site where I believe certain pages have been slapped for too many low-quality, identical anchor text links. We're now trying to clean up by gradually building a wide range of higher-quality, less anchor-specific links, but I'm unsure when we may see positive results (currently the page is not indexed). A guess would be 4-6 weeks, but I have no evidence/direct experience. What do you think?
Best - Tom
Unfortunately, it really depends a lot on the situation. A technical issue (like a bad Robots.txt entry, etc.) could clear up in just a couple of weeks, but an actual penalty (especially a site-wide one) might take months or even require a reconsideration request. Luckily, most problems are somewhere in-between, so while nothing is "average", 4-6 weeks isn't a bad rough average.
Very Good Guide indeed! It is short and simple!
Great Job Pete!
Very well written and explained, Dr. Pete. It was clean and clear. This guide will help all kinds of SEOs in their work. It was a very good summary, put together very well.
Thanks and thumbs up for that really good post.
Very good guide and article. Have to test this out on my pages.
Thanks for posting - a very good step by step guide
It's good to see a process for making a diagnosis quickly and efficiently - finding the errors and implementing the appropriate solutions.
Thanks !
Really nice and quick guide.
Cheerz,
Wil.
Useful post. I instantly tried the 10 minutes and found the link errors in my blog. Great post. Thanks!
Nice post, always useful to get a quick indication of what causes a page not to rank.
This is really helpful, I am pretty new to this game and this is gonna come in real handy.
Thumbs up
Fantastic post, thanks! Essential and spot-on. I'm going to try these 10-minute tips right now!
Great article! Clean and right to the point. I tested all this out on my company's page :)
Looks like 8:00 AM for me. This is my daily task!
I love the timer graphic on each step, great for us visual learners.
Like my Grandma used to say - "You can never have too much pie [chart]".
Ah yes, my Grandma always uses that one.
The following takes place between 00'.00'' and 00'.30''
Useful post, Peter, as always, even though I wish you could have found a way to insert another retouched Matt Cutts photo.
Jokes aside, I find your guide very useful for a second purpose beyond its original pitch: as a way to organize the examination we all do of a potential client's website.
In fact (at least in my case), when I'm contacted by clients it's usually with that kind of Q&A question, so having a methodology to follow is a sure time-saver (especially when you have many quotes to answer in your inbox).
P.S.: your Q&A answers here are great practical mini-guides, useful not only to those who asked the questions. I wanted to show my appreciation for that not-so-visible part of your job, and this post is a good occasion to do it.
I appreciate the kind words. I assume my Q&A answers are rambling proof of my insanity, so I try not to go back and read them.
Well, in insanity there's always truth, or so they were saying in ancient times.
Under "8:00-10:00 – Review link profile quality" you list "Links that are part of a multi-link page footer." I'm not sure what you're driving at with this. What is the issue?
In other words, if there's a site linking to the site in question and that link is part of a footer crammed full of external links, it tends to be a bad quality signal. Not only does Google tend to devalue massive footers, but a massive footer full of external links is often a sign of a link farm. It can also suggest that someone has dozens or hundreds of similar links, and the best case is usually that those links have been devalued.
Thanks for the clarification. By massive links in a footer, are we talking about footers like the one on this page? This page's footer seems like good practice to me. Are links in footers like this being devalued?
It's a bit nuanced, I admit. I should clarify, though, that I'm talking more about footers that link out excessively. Internally-linked footers are a bit different. Google may not give a footer like ours a ton of value, but it wouldn't get you into trouble, generally speaking. Plus, there's always visitor value.
The extreme cases I see are when 20 sites all cross-link and they have no topical relevance to each other. Often, those are footer links, and it can be a very bad quality signal.
Added to favourites. I love checklists to save me from going around in circles.
This is way more than 6 people to read/comment on your post, Pete. Turns out you and @jennita underestimate your writing prowess.
Like goodnewscowboy, I find it really great to have these typical tasks documented in a very user-friendly format.
I need my brain cells for creating new things, not for trying to remember all established tactics.
Really awesome guide to time-efficient problem-solving.
Thumbs up Dr. Pete :)
Nice and helpful post. It covers all the basics, step by step.
Thanks!
Great article, going to use it as my guide! Thank you!
Nice clear and concise breakdown that will help with some executive discussions. Great work, Dr Peter.
Thanks Pete :-) A nice step-by-step investigative process - and it works!