Technical problems, errors and surprise releases are all regular features of the day-to-day management of a website when you’re an SEO. There’s no doubt that maintaining a quick, error-free and well-optimised site can lead to long-term traffic success. Here are some of the regular checks I recommend to stay on top of your website and maximise your search engine performance.
General Error Checking
General errors can crop up continually on any website and, left unchecked, their volume can spiral out of control. Working through and resolving large numbers of 404 and timeout errors helps search engines minimise the bandwidth they need to crawl your site completely. It’s arguable that minimising crawl errors and general accessibility issues helps new and updated content get into search engine indexes more quickly and more often, which is a good thing for SEO!
If you want to get smart with error handling and other crawl issues, start by getting a Google Webmaster Tools account. Take a look at “Crawl errors”, found via the “Diagnostics” panel once you’ve verified your site:
Paying particular attention to the “Not found” and “Timed out” reports, it’s wise to test each error with an online HTTP header checker or a Firefox plug-in such as Live HTTP Headers or HttpFox. I find that if you drill down into the first 100 or so errors, a common pattern usually emerges, so only a handful of fixes are actually required. I like to focus first on 404 error pages that have external links, to get maximum SEO value from legacy links.
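If you’d rather script the check than paste URLs into a plug-in one at a time, a few lines of Python will do the job. This is only a minimal sketch, assuming you’ve exported your error URLs to a plain text file (the file name below is made up):

```python
# Bulk HTTP status check for URLs exported from the "Not found" / "Timed out"
# reports. "crawl_errors.txt" is a hypothetical export, one URL per line.
import requests

with open("crawl_errors.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

for url in urls:
    try:
        # HEAD keeps bandwidth down; allow_redirects=False shows the raw status
        response = requests.head(url, allow_redirects=False, timeout=10)
        print(response.status_code, url)
    except requests.RequestException as exc:
        print("ERROR", url, exc)
```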
It’s important to note that sometimes there’s more to an error report than just the URL listed in the console. I’ve found issues such as multiple redirects ending in a 404 error, which is important information to brief your developers with, potentially saving them a lot of diagnostic time.
As a side note, be careful how you interpret the “Restricted by robots.txt” reports. Sometimes those URLs aren’t directly blocked by robots.txt at all! If you’ve been scratching your head over the URLs in that report, run an HTTP header check. Often, a URL listed there is part of a chain of redirects that contains, or ends with, a URL that is blocked by robots.txt.
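To see what’s really going on with one of those URLs, I find it helps to trace the redirect chain hop by hop and check each hop against robots.txt. Here’s a rough sketch of that idea in Python (the start URL is just an example):

```python
# Follow a redirect chain manually, noting any hop blocked by robots.txt.
from urllib.parse import urljoin, urlparse
from urllib.robotparser import RobotFileParser
import requests

def trace_chain(url, max_hops=10):
    for _ in range(max_hops):
        parsed = urlparse(url)
        robots = RobotFileParser(f"{parsed.scheme}://{parsed.netloc}/robots.txt")
        robots.read()
        blocked = not robots.can_fetch("Googlebot", url)
        response = requests.get(url, allow_redirects=False, timeout=10)
        print(response.status_code, "blocked by robots.txt" if blocked else "allowed", url)
        if response.status_code not in (301, 302, 303, 307, 308):
            break
        # Location may be relative, so resolve it against the current URL
        url = urljoin(url, response.headers.get("Location", ""))

trace_chain("http://www.example.com/old-page")  # example URL
```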
For extra insight, try the IIS SEO Toolkit or run a classic Xenu’s Link Sleuth crawl, both of which can reveal a number of additional problems. Tom wrote a nice article on Xenu and, amongst his tips, setting the options to “Treat redirections as errors” is one of my favourites. As well as checking for internal crawl errors, a site of any size should avoid linking internally to URLs that redirect. From time to time, using Fetch as Googlebot inside Webmaster Tools, or browsing your site with JavaScript and CSS disabled (via Web Developer Toolbar) and your user agent set to Googlebot, can also reveal hidden problems.
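On the user agent point: nothing replaces Fetch as Googlebot itself, but a quick scripted comparison of what your server returns to a Googlebot user agent versus a normal browser can flag obvious misconfiguration. A hedged sketch, where the URL and user agent strings are examples:

```python
# Compare the response a page serves to a Googlebot UA vs a browser UA.
import requests

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (Windows; U; en-GB)"

def fetch_as(url, user_agent):
    # Return the status code and body length seen with this user agent
    response = requests.get(url, headers={"User-Agent": user_agent}, timeout=10)
    return response.status_code, len(response.text)

url = "http://www.example.com/"  # example URL
print("Googlebot UA:", fetch_as(url, GOOGLEBOT_UA))
print("Browser UA:  ", fetch_as(url, BROWSER_UA))
```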
Linking Out to 404 Errors?
Linking out to expired external URLs isn’t great for user experience, and it implies that your site is getting out of date as a resource. Consider checking your outbound external links for errors by using the “Check external links” setting in Xenu.
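If you’d like to do the same check without Xenu, here’s a small sketch that pulls the external links from a page and reports anything that doesn’t come back with a 200. It assumes the third-party requests and beautifulsoup4 packages, and the page URL is only an example:

```python
# Spot-check outbound external links on a single page.
from urllib.parse import urljoin, urlparse
import requests
from bs4 import BeautifulSoup

page = "http://www.example.com/resources/"  # example page
html = requests.get(page, timeout=10).text
soup = BeautifulSoup(html, "html.parser")
our_host = urlparse(page).netloc

for anchor in soup.find_all("a", href=True):
    link = urljoin(page, anchor["href"])
    host = urlparse(link).netloc
    if host and host != our_host:  # only check external links
        try:
            status = requests.head(link, allow_redirects=True, timeout=10).status_code
        except requests.RequestException:
            status = "unreachable"
        if status != 200:
            print(status, link)
```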
Canonicalisation
You spent time and effort specifying rules for canonicalised URLs across your site, but when was the last time you checked that the rules you painstakingly devised are still in place? Thanks to the ever-evolving nature of our websites, things change. Redirect rules can be left out of an updated site release and your canonicalisation is back to square one. You should always be working towards reducing internal duplicate content as a matter of best practice, and without relying solely on the rel=”canonical” attribute.
Checking the following can quickly reveal whether you have a problem (there’s a scripted version of these checks sketched just after the list):
- www or non-www redirects (choose either, but always redirect with a 301)
- trailing slash (choose to leave it out, like SEOmoz, or keep it in, like SEOgadget, but don’t allow both)
- Case redirects – a 301 redirect to all-lowercase URLs can solve a lot of headaches, or title-case redirects if you want to capitalise place names, as some travel sites do
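Here’s a hedged, scripted version of those spot checks. The hostnames and expected targets are examples only, so swap in your own canonical rules:

```python
# Confirm each URL variant 301s to the canonical form you expect.
import requests

# (variant URL, expected 301 target) - all hypothetical examples
checks = [
    ("http://example.com/", "http://www.example.com/"),               # non-www -> www
    ("http://www.example.com/blog", "http://www.example.com/blog/"),  # add trailing slash
    ("http://www.example.com/Blog/", "http://www.example.com/blog/"), # force lower case
]

for variant, expected in checks:
    response = requests.get(variant, allow_redirects=False, timeout=10)
    location = response.headers.get("Location", "")
    ok = response.status_code == 301 and location == expected
    print("PASS" if ok else "CHECK", variant, "->", response.status_code, location)
```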
“Spot checks” of Front End Code, Missing Page Titles and Duplicate Meta
Just every now and again, it’s nice to take another look at your own code. Even if you don’t find a problem that needs fixing, you might find inspiration to make an enhancement, test a new approach or bring your site up to date with SEO best practice.
One quick check I find useful is under “Diagnostics” > “HTML suggestions” in Webmaster tools:
Duplicated title tags, meta descriptions or both can reveal problems with your dynamic page templates, missed opportunities or canonicalisation issues.
Site Indexation
Site indexation, or the number of pages that receive one or more visits from a search engine in a given period of time, is a powerful metric for quickly assessing how many pages on your site are generating traffic.
Aside from the obvious merit in tracking site indexation over time as an SEO KPI, the metric can also reveal unintended indexing issues, such as leaked tracking or exit URLs on affiliate sites, or huge amounts of indexed duplicate content. If the number of pages Google claims to have indexed on your site is vastly different from the site indexation numbers you’re seeing through analytics, you may have found a new problem to solve.
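One way to approximate the metric is to count the unique landing pages that received at least one search engine visit in your chosen period, using a CSV export from your analytics package. This is only a sketch, and the file and column names below are assumptions, so adjust them to match your own export:

```python
# Count unique landing pages with at least one search engine visit.
import csv

landing_pages = set()
with open("organic_landing_pages.csv", newline="") as f:  # hypothetical export
    for row in csv.DictReader(f):
        if int(row["visits"]) >= 1:
            landing_pages.add(row["landing_page"])

print(len(landing_pages), "pages received at least one search engine visit")
# Compare that figure with the number of pages Google claims to have indexed.
```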
Indexed Development / Staging Servers
Is your staging or development server accessible from outside your office IP range? It might be worth checking that none of your development pages are cached by the major search engines. There’s nothing worse than discovering a ranking development server URL (it does happen!) with dummy products and prices in the database. You just know a customer is going to have a bad time on a development server! If you discover an issue, talk to your development team about restricting access to the staging site by IP, or consider redirecting search engine bots to the correct version of your site.
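A quick, hedged way to sanity check a staging hostname is to confirm that robots.txt blocks it, or that its responses carry a noindex signal. The staging URL here is hypothetical:

```python
# Check whether a staging hostname is blocked or noindexed.
from urllib.robotparser import RobotFileParser
import requests

staging = "http://staging.example.com/"  # hypothetical staging hostname

robots = RobotFileParser(staging + "robots.txt")
robots.read()
print("robots.txt blocks Googlebot:", not robots.can_fetch("Googlebot", staging))

response = requests.get(staging, timeout=10)
print("X-Robots-Tag header:", response.headers.get("X-Robots-Tag", "none"))
print("'noindex' in page source:", "noindex" in response.text.lower())
```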
Significant / Recent Changes to Server Performance
Google have put a lot of effort into helping webmasters identify site speed issues, and it makes a lot of sense to keep a regular check on your performance if you’re not doing so already. There are a few useful tools out there to help you speed up your site, starting with Google’s “Site performance” report, located under “Labs” in Webmaster Tools:
It’s also worth checking the “Time spent downloading a page (in milliseconds)” report found under “Diagnostics > Crawl stats” in Webmaster Tools:
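You can also run a rough, home-grown version of that report by timing a few requests to a page yourself. This sketch simply averages the response time over a handful of fetches (the URL and sample size are examples):

```python
# Average the download time for a single page over a few requests.
import requests

url = "http://www.example.com/"  # example URL
samples = 5

timings = []
for _ in range(samples):
    response = requests.get(url, timeout=30)
    # requests records the elapsed time between sending the request
    # and receiving the response
    timings.append(response.elapsed.total_seconds() * 1000)

average = sum(timings) / len(timings)
print(f"average download time: {average:.0f} ms over {samples} requests")
```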
Tackling search engine accessibility issues like errors and canonicalisation problems is a really important part of your SEO routine. It’s also a favourite subject of mine! What checks do you carry out regularly to manage the performance of your website? Do you have your own routine? If you manage a large site, or many large sites, which "industrial strength" tools or automated processes do you gain the most insight from?
This is a post by Richard Baxter, Founder and SEO Consultant at SEOgadget.co.uk - a performance SEO Agency specialising in helping people and organisations succeed in search. Follow him on Twitter and Google Buzz.
Good solid advice Richard (as we have come to expect from you).
Nice reminder of the "Indexed Development/Staging Servers." I have been guilty of developing and promoting at the same time (over eager) and ended up frustrating visitors. Nothing like stumbling into a store that says "Open" only to find out there's nothing on the shelves yet.
Thanks
I know! Indexed dev servers can be easily overlooked. Just every now and again I've seen them outrank new websites, too. Typically, the "staging" URL gives them away, so something like inurl:staging -intitle:staging is a good starting point for finding sites on the web with indexed dev servers. Happy hunting!
Soooo true.
Thanks for this (quick) advice on how to remove these kinds of servers from the SERPs when (and because) it happens!
I remember having a hard time removing one of our clients' dummy websites from the SERPs, even with META robots and robots.txt!
Thanks for such great list! I'm adding it to my daily todo list, and right away...
Thanks Richard for the list and a great post.
I must admit, I use Google Webmaster Tools almost daily... :) as well as Google Analytics.
Nice advice. These little crawl problems used to crop up often for my clients. Thanks :)
If your site is not that large, you can also run the Xenu tool. It will not only find broken links for you, but also report the number of "inlinks" and "outlinks" for each page. This is useful information when you are working on your internal link plan. In addition, the report includes the "title", "level" and "size" of each file. All very handy for spotting any obvious errors you might otherwise overlook.
Health checks are extremely crucial for the website's improvement. Nice article! :)
Stop giving my secrets away Richard!!! :P
Dude, very nice. But you spelled canonicalization wrong ... tsk tsk. *ducks*
Okay, seriously though, the only thing I might add is to look at what pages are getting links ... missed 301 opportunities might be there. Also, checking who is linking to an old page would aid in identifying who to contact to get the page changed.
He spelled a lot of things wrong! My spell checker was going crazy. :P And YES that was a joke about the English versions of words. It's a great article!
Yep - I happen to be a bit of a Top Pages on Domain fan for that. Always handle errors in order of the most linked to: "focus on 404 error pages that have external links first to get maximum SEO value from legacy links."
As for the spelling - I had a really hard time, what with being from the UK and all! Maybe next time I'll do a US English version :-)
Dictionaries are great but I don't take them too seriously, just have a look at this guy (cross your legs first, gents).
Thanks for the loss of a half hour jdeb! I followed your link to Wikipedia and then another link, and before I knew it...POOF!
Very nice list. I visit Google Webmaster Tools every day in fact :)
What I would add to this list is checking your main Google Analytics KPIs, such as the number of search engine visits to your most important templates or the number of keywords bringing traffic. Such stats can also tell us about technical problems or some surprises.
Excellent post, Richard, as usual.
Plenty of Google Webmaster Tools goodness. Maybe not a daily task, but the Fetch as Googlebot feature is handy for troubleshooting. As for indexing, you can compare URLs indexed vs URLs submitted, under Sitemaps.
Hi Richard,
Really good detailed post
On your 404s, @jaamit on Twitter recommended this to me:
https://www.linkpatch.com
It automatically sends you details of 404s when people land on them. This is now one of my housekeeping tasks, and it's a free tool for one website.
Interesting stuff Shane, thanks for the link. I'll take a look at that a little later.
Or, you could use GA to do the same thing for free on all your websites:
https://www.google.com/support/googleanalytics/bin/answer.py?hl=en&answer=86927
Very nice article on the importance of Google Webmaster Tools. For some strange reason, a lot of SEO people just don't know how useful this tool can be in making your job a whole lot easier. It's one of the things I personally always recommend to people looking to improve their website's visibility. Thumbs up.
Nice post and good advice. Spending some time now and then in Webmaster Tools is definitely worth it.
Thanks a lot Richard! First off, welcome aboard and nice to see you posting over here.
Second, great post! Far too often it seems like people consider SEO a one-off event. While a lot can be accomplished (onsite) by putting your site in order and following best practice, SEO is not the sort of thing where you can "set it and forget it" (to quote one of my favorite infomercials). This is a nice reminder to folks that while it is crucial to take care of a lot of things up front, maintenance is just as important-- and a lot easier to handle than waiting for everything to go wrong.
Great post!
A quick checkup found some health issues with my site!
This is an interesting article alright. But I believe it would be even better if the different points were laid out as a checklist, so SEOs could work through the usual errors.
A well-presented guide to using Google Webmaster Tools' error checking and general 'housekeeping'. The speed and performance of websites seem to be a key element for Google at the moment, even down to putting any JavaScript at the bottom of the page for load speed. Any thoughts or opinions on this subject?
Thanks Richard for a great blog
Chris
Hi Chris, the advice I try to give to clients with JavaScript-heavy pages is to host your JavaScript in as few external files as possible (preferably one) rather than having the code inline in the source of the web page. Google Page Speed (and the Webmaster Tools Site Performance report under Labs) will make those kinds of recommendations too, if there are too many HTTP requests on a page load.
The same advice applies to clients who use inline CSS styles on their web pages - always move it into an external CSS file!
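If you want to audit a page for this quickly, a short script can count inline script and style blocks versus external files - roughly what the page speed tools flag. This is only a sketch, it needs the beautifulsoup4 package, and the URL is an example:

```python
# Count inline vs external JavaScript and CSS on a page.
import requests
from bs4 import BeautifulSoup

html = requests.get("http://www.example.com/", timeout=10).text  # example URL
soup = BeautifulSoup(html, "html.parser")

external_js = soup.find_all("script", src=True)
inline_js = [s for s in soup.find_all("script") if not s.get("src")]
external_css = soup.find_all("link", rel="stylesheet")
inline_css = soup.find_all("style")

print(len(external_js), "external JS files,", len(inline_js), "inline script blocks")
print(len(external_css), "external stylesheets,", len(inline_css), "inline <style> blocks")
```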
Hopefully this question isn't too far off the point, Richard. How do CMS systems (Joomla, WordPress, etc.) fare with JavaScript loading times? And is there any way to tweak them for performance?
I'm not that knowledgeable about CMSs, but I've been getting more involved in Joomla with a few clients of late.
Nice post - it's nice to see a breakdown of how a site should be cared for after the initial SEO. I keep meaning to fully explore Google Webmaster Tools, and this will definitely give me some areas to use first!
Thanks! I'm a believer in managing your site for better search engine accessibility - best of luck with Webmaster Tools :-)
Great post... I do use this weekly just to check. I do have a question about the results in the NOT FOUND tab. Do you know how delayed these results are? Are they updated from the last crawl of the site? What would you suggest as a real-time tool to get the same answers?
Tony
I have a PHP-based educational site, avatto.com, with more than 6k pages. I have duplicate title tags, meta tags etc., as it's not possible to write them separately for every page. Will it affect SEO adversely?
Great post, these points often get overlooked or forgotten, and they are becoming increasingly important from an SEO and usability perspective.
You say not to rely solely on the canonical link tag... is there any evidence of it not being treated the same as a 301 in terms of link juice?
I think the issue is that it hasn't been proven that it is being treated the same as a 301.
Absolutely - the canonical tag hasn't been around all that long and, deployed incorrectly, it can introduce strange ranking behaviour. It's an important part of best practice, but I believe not having to rely on it is a more important part of best practice.
Hope that made sense!
Hi friends, I am here for help. One site I am working on is in trouble now: its ranking is continuously fluctuating. Sometimes it comes into the top 10 of Google and sometimes it drops down as far as 500. What should I do so it comes back to the top 10 and becomes stable? Should I stop directory submission for some time? Please reply...
Hi ankitva:
This isn't a forum per se. The best way to get an answer to a general, non-post-related SEO question is to join as a Pro member. You get to ask 2 questions per month, and I've found the answers to be excellent and of high quality.
PS - Your avatar looks very scary (I'm just sayin')
PPS - LOL! Says the man with his silly dog mug.