I love technical SEO (most of the time). However, it can be frustrating to come across the same site problems over and over again. In the years I've been doing SEO, I'm still surprised to see so many different websites suffering from the same issues.
This post outlines some of the most common problems I've encountered when doing site audits, along with some not-so-common ones at the end. Hopefully the solutions will help you when you come across these issues, because chances are that you will at some point!
1. Uppercase vs Lowercase URLs
In my experience, this problem is most common on websites that use .NET. The problem stems from the fact that the server is configured to respond to URLs with uppercase letters and not to redirect or rewrite to the lowercase version.
I will admit that this problem hasn't been as common recently as it once was because, generally, the search engines have gotten much better at choosing the canonical version and ignoring the duplicates. However, I've seen too many instances of search engines getting this wrong, which means you should make the canonical version explicit rather than relying on the search engines to figure it out for themselves.
How to solve:
There is a URL Rewrite module which can help solve this problem on IIS 7 servers. The module has a nice option within the interface that allows you to enforce lowercase URLs. If you use it, a rule will be added to the web.config file which will solve the problem.
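For reference, the rule the module adds to web.config looks roughly like the sketch below (the exact rule name and attributes generated on your server may differ):

    <rewrite>
      <rules>
        <!-- Redirect any URL containing an uppercase letter to its lowercase equivalent -->
        <rule name="Enforce lowercase URLs" stopProcessing="true">
          <match url="[A-Z]" ignoreCase="false" />
          <action type="Redirect" url="{ToLower:{URL}}" redirectType="Permanent" />
        </rule>
      </rules>
    </rewrite>

This sits inside the <system.webServer> section and issues a 301 to the lowercase version of any URL containing an uppercase letter.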
2. Multiple versions of the homepage
Again, this is a problem I've encountered more with .NET websites, but it can happen quite easily on other platforms. If I start a site audit on a site which I know is .NET, I will almost immediately go and check if this page exists:
www.example.com/default.aspx
The verdict? It usually does! This is a duplicate of the homepage that the search engines can usually find via navigation or XML sitemaps.
Other platforms can also generate URLs like this:
www.example.com/index.html
www.example.com/home
I won't get into the minor details of how these pages are generated because the solution is quite simple. Again, modern search engines can deal with this problem, but it is still best practice to fix the issue at the source and make the preferred version explicit.
How to solve:
Finding these pages can be a bit tricky as different platforms can generate different URL structures, so the solution can be a bit of a guessing game. Instead, do a crawl of your site, export the crawl into a CSV, filter by the META title column, and search for the homepage title. You'll easily be able to find duplicates of your homepage.
I always prefer to solve this problem by adding a 301 redirect from the duplicate version of the page to the correct version. You can also solve the issue by using the rel=canonical tag, but I stand by a 301 redirect in most cases.
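To illustrate, if the duplicate lives at /index.html on an Apache server, the redirect in .htaccess might look something like this (a sketch only; a .NET site on IIS would need the equivalent rule in web.config):

    RewriteEngine On
    # 301 redirect direct requests for /index.html to the root homepage
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html [NC]
    RewriteRule ^index\.html$ / [R=301,L]

The condition on THE_REQUEST means only explicit requests for /index.html get redirected, so the server can still serve the file internally as the directory index without causing a redirect loop.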
Another solution is to conduct a site crawl using a tool like Screaming Frog to find internal links pointing to the duplicate page. You can then edit the pages containing those links so they point directly to the correct URL, rather than having internal links going via a 301 and losing a bit of link equity.
Additional tip - you can usually decide if this is actually a problem by looking at the Google cache of each URL. If Google hasn't figured out the duplicate URLs are the same, you will often see different PageRank levels as well as different cache dates.
3. Query parameters added to the end of URLs
This problem tends to come up most often on eCommerce websites that are database driven. It can occur on any site, but the problem tends to be bigger on eCommerce websites as there are often loads of product attributes and filtering options such as colour, size, etc. Go Outdoors (not a client) is a good example.
In this case, the URLs users click on are relatively friendly in terms of SEO, but quite often you can end up with URLs such as this:
www.example.com/product-category?colour=12
This example would filter the product category by a certain colour. Filtering in this capacity is good for users but may not be great for search, especially if customers do not search for the specific type of product using colour. If this is the case, this URL is not a great landing page to target with certain keywords.
Another possible issue, and one that has a tendency to use up TONS of crawl budget, is when these parameters are combined. To make things worse, sometimes the parameters can be combined in different orders but will return the same content. For example:
www.example.com/product-category?colour=12&size=5
www.example.com/product-category?size=5&colour=12
Both of these URLs would return the same content, but because the URLs are different, the pages could be interpreted as duplicate content.
I worked on a client website a couple of years back that had this issue. We worked out that, with all the filtering options they had, there were over a BILLION URLs that could be crawled by Google. This number was off the charts when you consider that there were only about 20,000 products offered.
Remember, Google does allocate crawl budget based on your PageRank. You need to ensure that this budget is being used in the most efficient way possible.
How to solve:
Before going further, I want to address another common, related problem: the URLs may not be SEO friendly because they are database driven. This isn't the issue I'm concerned about in this particular scenario, as I'm more concerned about wasted crawl budget and having pages indexed which do not need to be, but it is still relevant.
The first place to start is addressing which pages you want to allow Google to crawl and index. This decision should be driven by your keyword research, and you need to cross reference all database attributes with your core target keywords. Let's continue with the theme from Go Outdoors for our example:
Here are our core keywords:
- Waterproof jackets
- Hiking boots
- Women's walking trousers
On an eCommerce website, each of these products will have attributes associated with them which will be part of the database. Some common examples include:
- Size (e.g. Large)
- Colour (e.g. Black)
- Price (e.g. £49.99)
- Brand (e.g. North Face)
Your job is to find out which of these attributes are part of the keywords used to find the products. You also need to determine which combinations (if any) of these attributes are used by your audience.
In doing so, you may find that there is a high search volume for keywords that include "North Face" + "waterproof jackets." This means that you will want a landing page for "North Face waterproof jackets" to be crawlable and indexable. You may also want to make sure that the database attribute has an SEO friendly URL, so rather than "waterproof-jackets/?brand=5" you will choose "waterproof-jackets/north-face/." You also want to make sure that these URLs are part of the navigation structure of your website to ensure a good flow of PageRank so that users can find these pages easily.
On the other hand, you may find that there is not much search volume for keywords that combine "North Face" with "Black" (for example, "black North Face jackets"). This means that you probably do not want the page with these two attributes to be crawlable and indexable.
Once you have a clear picture of which attributes you want indexed and which you don't, it is time for the next step, which is dependent on whether the URLs are already indexed or not.
If the URLs are not already indexed, the simplest step to take is to add the URL structure to your robots.txt file. You may need to play around with some wildcard patterns to achieve this. Make sure you test your rules properly so you don't block anything by accident; the Fetch as Google feature in Webmaster Tools is handy for this. It's important to note that if the URLs are already indexed, adding them to your robots.txt file will NOT get them out of the index.
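As a rough, hypothetical sketch (the parameter names are made up, and Google's robots.txt matching supports the * and $ wildcards rather than full regex), blocking the filtered URLs might look like this:

    User-agent: *
    # Block crawling of filtered category URLs (hypothetical parameter names)
    Disallow: /*colour=
    Disallow: /*size=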
If the URLs are indexed, I'm afraid you need to use a plaster to fix the problem: the rel=canonical tag. In many cases, you are not fortunate enough to work on a website when it is being developed. The result is that you may inherit a situation like the one above and not be able to fix the core problem. In cases such as this, the rel=canonical tag serves as a plaster put over the issue with the hope that you can fix it properly later. You'll want to add the rel=canonical tag to the URLs you do not want indexed and point to the most relevant URL which you do want indexed.
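As an example (a sketch using the URLs from earlier), each filtered URL would carry a canonical tag in its <head> pointing at the version you do want indexed:

    <!-- On www.example.com/product-category?colour=12&size=5 -->
    <link rel="canonical" href="http://www.example.com/product-category" />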
4. Soft 404 errors
This happens more often than you'd expect. A user will not notice anything different, but search engine crawlers sure do.
A soft 404 is a page that looks like a 404 but returns an HTTP status code of 200. In this instance, the user sees some text along the lines of "Sorry, the page you requested cannot be found," but behind the scenes, a code 200 is telling search engines that the page is working correctly. This disconnect can cause problems with pages being crawled and indexed when you do not want them to be.
A soft 404 also means you cannot spot real broken pages and identify areas of your website where users are receiving a bad experience. From a link building perspective (I had to mention it somewhere!), a soft 404 is not a good option either. You may have incoming links to broken URLs, but the links will be hard to track down and redirect to the correct page.
How to solve:
Fortunately, this is a relatively simple fix for a developer, who can set the page to return a 404 status code instead of a 200. Whilst you're there, you can have some fun and make a cool 404 page for your users' enjoyment. Here are some examples of awesome 404 pages, and I have to point to Distilled's own page here :)
To find soft 404s, you can use the feature in Google Webmaster Tools which will tell you about the ones Google has detected.
You can also perform a manual check by going to a broken URL on your site (such as www.example.com/5435fdfdfd) and seeing what status code you get. A tool I really like for checking the status code is Web Sniffer, or you can use the Ayima tool if you use Google Chrome.
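If you're comfortable with the command line, curl (assuming it's installed) can print the status code directly; the URL below is just the made-up example from above:

    # Print only the HTTP status code for the broken URL
    curl -s -o /dev/null -w "%{http_code}\n" http://www.example.com/5435fdfdfd

A properly configured site should print 404 here; if it prints 200, you have a soft 404.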
5. 302 redirects instead of 301 redirects
Again, this is an easy one for developers to get wrong because, from a user's perspective, they can't tell the difference between the two. However, the search engines treat these redirects very differently. Just to recap: a 301 redirect is permanent, and the search engines will treat it as such; they'll pass link equity across to the new page. A 302 redirect is a temporary redirect, and the search engines will not pass link equity because they expect the original page to come back at some point.
How to solve:
To find 302 redirected URLs, I recommend using a deep crawler such as Screaming Frog or the IIS SEO Toolkit. You can then filter by 302s and check to see if they should really be 302s, or if they should be 301s instead.
To fix the problem, you will need to ask your developers to change the rule so that a 301 redirect is used rather than a 302 redirect.
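To illustrate the difference, here is what the two rules might look like in an Apache .htaccess file (a sketch; the paths are made up, and on IIS the equivalent is the redirectType attribute on a URL Rewrite rule):

    # A temporary redirect - search engines will not pass link equity
    Redirect 302 /old-page http://www.example.com/new-page

    # What you usually want instead - a permanent redirect
    Redirect 301 /old-page http://www.example.com/new-page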
6. Broken/Outdated sitemaps
Whilst not essential, XML sitemaps are very useful to the search engines to make sure they can find all the URLs that you care about. They can give the search engines a nudge in the right direction. Unfortunately, some XML sitemaps are generated only once and quickly become outdated, so they end up containing broken links and missing new URLs.
Ideally, your XML sitemaps should be updated regularly so that broken URLs are removed and new URLs are added. This is even more important if you're a large website that adds new pages all the time. Bing has also said that they have a threshold for "dirt" in a sitemap, and if that threshold is hit, they will not trust the sitemap as much.
How to solve:
First, you should do an audit of your current sitemap to find broken links. This great tool from Mike King can do the job.
Second, you should speak to your developers about making your XML sitemap dynamic so that it updates regularly. Depending on your resources, this could be once a day, once a week, or once a month. There will be some development time required here, but it will save you (and them) plenty of time in the long run.
An extra tip here: you can experiment and create sitemaps which only contain new products, and have these particular sitemaps update more regularly than your standard sitemaps. If you have the dev resources, you could also do a bit of extra lifting and create a sitemap which only contains URLs that are not yet indexed.
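For reference, a minimal sitemap entry looks like the sketch below; the <lastmod> value is what should change whenever a page is added or updated (the URL and date here are made up):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.example.com/waterproof-jackets/north-face/</loc>
        <lastmod>2012-10-15</lastmod>
      </url>
    </urlset>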
A few uncommon technical problems
I want to include a few problems that are not common and can actually be tricky to spot. The issues I'll share have all been seen recently on my client projects.
7. Ordering your robots.txt file wrong
I came across an example of this very recently, which led to a number of pages being crawled and indexed which were blocked in robots.txt.
The URLs in this case were crawled because the commands within the robots.txt file were ordered incorrectly. Individually the commands were correct, but they didn't work together correctly.
Google explicitly say this in their guidelines, but I have to be honest, I hadn't really come across this problem before, so it was a bit of a surprise.
How to solve:
Use your robots.txt commands carefully. If you have a separate set of commands for Googlebot, make sure you also tell Googlebot every other rule it should follow - even if those rules have already been mentioned in the catch-all command - because Googlebot will only obey the group that matches it most specifically. Make use of the testing feature in Google Webmaster Tools that allows you to test how Google will react to your robots.txt file.
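For example, in the file below Googlebot would happily crawl /private/, because it only follows the most specific group that matches it and ignores the catch-all group entirely; the fix is to repeat the rule inside the Googlebot group (the directory names are just examples):

    User-agent: *
    Disallow: /private/

    User-agent: Googlebot
    Disallow: /not-for-google/
    # /private/ must be repeated here too, otherwise Googlebot will crawl it
    Disallow: /private/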
8. Invisible character in robots.txt
I recently did a technical audit for one of my clients and noticed a warning in Google Webmaster Tools stating that "Syntax was not understood" on one of the lines. When I viewed the file and tested it, everything looked fine. I showed the issue to Tom Anthony, who fetched the file via the command line and diagnosed the problem: an invisible character had somehow found its way into the file.
I managed to look rather silly at this point by re-opening the file and looking for it!
How to solve:
The fix is quite simple: rewrite the robots.txt file and run it through the command line again to re-check. If you're unfamiliar with the command line, check out this post by Craig Bradford over at Distilled.
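One way to spot this kind of character (assuming curl and hexdump are available) is to look at the raw bytes of the file - a UTF-8 byte order mark, for example, shows up as "ef bb bf" at the very start:

    # Fetch the live robots.txt and inspect its raw bytes
    curl -s http://www.example.com/robots.txt | hexdump -C | head -n 5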
9. Google crawling base64 URLs
This problem was a very interesting one we recently came across, and another one that Tom spotted. One of our clients saw a massive increase in the number of 404 errors being reported in Webmaster Tools. We went in to take a look and found that nearly all of the errors were being generated by URLs in this format:
/aWYgeW91IGhhdmUgZGVjb2RlZA0KdGhpcyB5b3Ugc2hvdWxkIGRlZmluaXRlbHkNCmdldCBhIGxpZmU=/
Webmaster Tools will tell you where these 404s are linked from, so we went to the page to find out how this URL was being generated. As hard as we tried, we couldn't find it. After lots of digging, we were able to see that these were authentication tokens generated by Ruby on Rails to try and prevent cross-site request forgery. There were a few in the code of the page, and Google were trying to crawl them!
In addition to the main problem, the authentication tokens are all generated on the fly and are unique, which is why we couldn't find the ones that Google were telling us about.
How to solve:
In this case, we were quite lucky because we were able to add some Regex to the robots.txt file which told Google to stop crawling these URLs. It took a bit of time for Webmaster Tools to settle down, but eventually everything was calm.
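I won't pretend the pattern below is the exact one we used (and note that Google's robots.txt matching only supports the * and $ wildcards rather than full regex), but something along these lines - keying off the "==" padding seen in these base64 tokens - gives the general idea of the kind of rule involved:

    User-agent: Googlebot
    # Hypothetical rule - stop crawling of URLs containing base64-style "==" padding
    Disallow: /*==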
10. Misconfigured servers
This section was actually written by Tom, who worked on this particular client project. We encountered a problem with a website's main landing/login page not ranking. The page had been ranking at one point and had then dropped out, and the client was at a loss. The pages all looked fine, loaded fine, and didn't seem to be doing any cloaking as far as we could see.
After lots of investigation and digging, it turned out that there was a subtle problem caused by a misconfiguration of the server software and the HTTP headers it was returning.
Normally, an 'Accept' header is sent by the client (your browser) to state which file types it understands, and only very rarely does this modify what the server does. When the server sends a file, it always sends a "Content-Type" header to specify whether the file is HTML, a PDF, a JPEG, or something else.
Their server (running Nginx) was returning a "Content-Type" that mirrored the first file type found in the client's "Accept" header. If you sent an Accept header that started with "text/html," then that is what the server would send back as the Content-Type header. This is peculiar behaviour, but it wasn't being noticed because browsers almost always send "text/html" at the start of their Accept header.
However, Googlebot sends "Accept: */*" when it is crawling (meaning it accepts anything).
(See: https://webcache.googleusercontent.com/search?sourceid=chrome&ie=UTF-8&q=cache:https://www.ericgiguere.com/tools/http-header-viewer.html)
I found that if I sent an Accept header of */*, the server fell over: */* is not a valid Content-Type, so the server would crumble and send an error response.
Changing your browser's user agent to Googlebot does not influence the HTTP headers, and tools such as Web Sniffer don't send the same HTTP headers as Googlebot either, so you would never notice this issue with them!
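One way to reproduce what Googlebot sees (assuming curl is available; the URL is made up) is to send the same Accept header yourself and inspect the response headers:

    # Send the Accept header Googlebot uses and look at the response headers
    curl -I -H "Accept: */*" http://www.example.com/landing-page

On the misconfigured server described above, this produced an error response rather than a normal "Content-Type: text/html" header; a correctly configured server should be unaffected.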
Within a few days of fixing the issue, the pages were re-indexed and the client saw a spike in revenue.
Cheers Paddy! On-site SEO doesn't get enough attention (probably because it's not as glorious as scoring a great link), but it's my favorite. Number 8 and 9 on this list are new to me and obviously took a keen eye to find. Thanks for sharing them with the community. One Guinness coming up!
One Guinness coming up! | FTFY
Also thanks Paddy for the good post and saying 'Additional tip' and not 'Pro tip'...very refreshing
I don't usually gravitate toward technical reads, but when I do, I look for Paddy Moogan's byline. Cheers.
There's only one Paddy Moogan.
+1 Indeed sir!
Great post! # 3 is always a challenge, especially when the client is using a hosted eCommerce solution that does not allow much if any back end intervention.
Hi,
I was just looking for this type of post on technical issues. You have covered most of the issues we face during SEO and on-site optimisation. People often don't pay enough attention to these problems, but they are big ones, so we have to look at all of them to make our sites Google friendly. Many people are still confused between 301 and 302 redirects, but you have described the difference really nicely.
Most of the technical aspects of SEO that we face during projects are well explained in this one article - enough for one to solve them. I like it and will keep it for reference instead of going to Google for clarification.
Great post Paddy, some really solid tips. The duplicate home page is also something I immediately look for and there is a great little tool over at https://www.ragepank.com/redirect-check/ which does it for you. It even generates the redirect codes to insert into the .htaccess file if the site is running on apache. (I'm nearly reluctant to give this little nugget away to be honest but share and share alike eh!!).
Great tip
Nice one Sean, thanks for sharing that. Outputting the rules of htaccess is a very cool feature.
Fantastic tool - thanks for sharing Sean!
You're welcome. It's a great little time-saver. I even use it from time to time for building backlinks...I search for one of the footprints e.g. "index.html" + keyword, check whether they're running on apache then contact them (stock email!) outlining how they can improve their home page canonicalisation issues and mention that if they're feeling particularly grateful they might drop you a backlink!
I've seen a lot of false positives using this tool actually. I've contacted the owner of ragepank and he admits this as well: "The redirect tool does sometimes have trouble with certain server configurations - something about the PHP script it uses for querying." I would always double check via another tool.
FYI
Love this post, Paddy - thanks for making things as easy as possible. There's also an Apache URL rewrite module available in case some folks aren't using IIS.
Thanks for adding that in Jon, very useful tool!
Dupe home-page URLs are so common that I always do a quick-and-dirty check with:
site:example.com intitle:"Your Page Title"
It'll at least give you a quick view of what's indexed.
I look forward to your next post "Why .Net Is A Pain in The Ass" - which is just this post again, but with a new title :)
Thanks Paddy, this is a really interesting post. I will certainly be using some of the tips.
Thanks Paddy for the very thorough and practical run down on these common errors. It will certainly serve our team well. As always, great job!
Excellent post and very actionable.
Very helpful. Thanks Paddy
Great post Paddy.
Thank You!
Hi, thanks for the information in this article, very useful!
One thing, the sitemap validator which you have linked to does not seem to work (tried both a URL and uploading a file)?
No biggie, just wondered if it was just me?! :)
Nice SEO post. I liked the original 404 pages the most; the dead (Zelda) link is awesome :)
thanks, nice job
Both Google and Bing allow you to exclude url query parameters within their webmaster tools. As long as your filter parameters are different from the parameters you use in the actual product detail url, it shouldn't be that big of a deal to filter these out from search.
Very good read, Sir Moogan :) It's the little details like the misconfigured server that can have one wondering for days or weeks.
Towards the end you misspelled file.
that was a mirror of the first fiel type found in
Great post Paddy, thank you for a quick overview of the most common technical SEO problems. I may send your post to some of my clients so they get what I am talking about when pointing out these...Like yourself I come across the same problems over and over again..I am surprised how little web-developers sometimes do and how messed up some websites are...with duplicated meta tags, duplicated content, canonicalization issues and problems with URL redirects...recently I came across an interesting problem with a website using secure URLs across some part of the page and non secure URLs throughout the rest of it...the redirects are all over the place...so it is quite captivating issue which I have never seen before...
Just for interest, how did you resolve that issue?
Great post. I have already shared this with our development team. Can you do a similar post on schemas and microformats?
Useful post. I can see this one being linked from the Q&A day on day for a long time!
Great Post, technical problems ALWAYS come up in my work, and it's good to have solid references to get these solved.
Thanks
Zach
Well Paddy,
It's good, but one more thing: we also need to resolve whether the URL opens with WWW or without WWW. That also plays an important role in SEO.
I think this can also be resolved through rewriting URLs if you have an IIS 7 server available...
First, to confirm what you are describing: that is URL canonicalisation. It can also be solved with .htaccess.
Thanks Paddy for this awesome detailed post that is surely going to help me, as these days I am actively dealing with .NET platform websites...
I agree with almost all the points and want to add a note that there are websites that really do not run IIS 7, and this is where the problems start: without it, URL rewriting cannot be done this way, and moving to IIS 7 usually costs them a lot at step one.
Excellent post that clearly guides us readers through these pointers, Paddy! Items 4 and 10 are what interest me most, though. I'll be looking to check and update those on my websites very, very soon. Thanks very much for sharing these. Cheers!
Thanks for the article - I found it useful; it's easy to get hung up on some of these issues.
Great work Paddy, these common technical SEO problems and solutions are really important to me, as I want to learn advanced webmaster skills. I have learnt a lot from this post about maintaining a website properly.
Great stuff. Can't argue with simple and straightforward.
Thanks Paddy
For someone starting to get more hand on with the on-site technicals this is a great great post. Cheers Paddy
Hi Paddy,
Great tips! Once I get into site tweaking mode, it can be a long drawn exercise to fix errors and improve little features to prop up your website. I rely on google webmaster to fix 404 and redirects, while using webpagetest and pingdom to rectify speed issues. Thanks for the article!
cheers, Priyank
Thanks Paddy for another useful post.
This is great; I have applied most of these points in my projects. Point 3 about query parameters is interesting stuff - that's a really good and practical point for most sites.
Terrific post and extremely useful to the growing breed of non-tech SEOs like me,
Great Post. I especially liked the soft 404 error and URL parameter tips. I never paid attention to 404 soft errors; will start looking into these.
Thanks.
Paddy -
The soft 404s and Robots tips are amazing. Thanks for sharing those publicly!
Very useful tip here. I found it very interesting for me the solve for base64 situation for Google...
Hey Paddy, love this post.
This post answers a large number of Q&A Forum technical questions. For anyone who frequents the Q&A Forums or has ever looked up this information more than once, you need to bookmark this post.
Solid tips.
man, how can you write this big content. You are Fab..
the 301 redirect is the best..
Fixing technical issues is the #1 task I perform on any campaign. Making sure that the website is crawled properly as well as the way you want will boost UX, especially when giving more attention to point 3 and 4!
Very nice post Paddy, clearly laid out and fairly easy to understand for a technical SEO post =) been doing a few site audits recently so this is something I can relate to =)
Great detailed post Paddy. A couple of points I would like to add.
Not sure I entirely agree with the 301 redirect solution for the homepage. If there are multiple versions of the homepage, chances are that there will be multiple versions of other pages as well, and in that case a canonical tag will be the preferred option. This helps avoid too many 301s.
With regards to the same pages with different query parameters, both Google and Bing have parameter blocking functionality which makes it easy for you to tell the search engines to ignore those URLs.
All in all, some decent action points people can arm themselves with.
Thanks for the comment!
I see what you mean about the 301 redirect for the homepage, both solutions will work. It can just come down to preference and what your developers can do.
Re the query parameters, I agree it is a good option, but I have had mixed results when using this in practice. For example, on one client website, they pretty much ignored all my requests to block URLs. Have you found they work every time for your own websites?
These are both okay solutions (IMO) but the first thing to ensure is that all links point consistently to the canonical page.
301 redirects, whilst useful are too often used as a cure-all by SEOs. Rather than fixing a link, they choose only to redirect the URL. It's always best to do both as a direct link is quicker for the user, probably won't lose PageRank and will reduce any problems if the 301 is lost in future.
Misconfigured Servers is a great issue, well done for diagnosing. I recently found a related problem where the server was returning a Content-Type header for an important category page as text/JavaScript instead of text/html and therefore wasn't being indexed by search engines.
It is a good post Paddy.
I would discuss the multiple versions of the homepage a bit further, since on IIS the 301 redirect from the default page to the canonical version doesn't always work... but it is an old story ;-)
Hadn't seen the base64 issue before, learn something new everyday!
Thanks for this post. Technical SEO can be the hardest part when you run into problems that are not as easy to spot. Having a checklist to start from is very helpful, and I have shared this material with a couple of clients already. Keep up the great work!
Hi Paddy. You have given a nice post on the major technical SEO problems. But I have some problems with the link building process. I heard that if we do link building regularly then Google will punish the site. Is it true? Could you please help me with this problem?
Paddy,
Even though this is an old post, it's a good one nonetheless. This provides more depth on how to address common URL problems using Microsoft's IIS URL Rewrite extension - https://weblogs.asp.net/scottgu/archive/2010/04/20/tip-trick-fix-common-seo-problems-using-the-url-rewrite-extension.aspx
Cheers,
Vahe
SEO Company New Delhi - thanks Paddy. I have a website, https://www.asdtpl.com/, that is not showing a cache date, and I can't create a sitemap for it. Please help - what should I do?
I also ran into the "weird character in robots.txt" issue. I am reasonably sure at this point that it was actually caused by the Byte Order Mark that my editor put into the file when I saved it as unicode. So be sure to save your robots locally as a plain 'ol text file, not unicode.
Considering the filter-url-variations: How about masking the links using a PRG pattern? Any opinions?
This article is very interesting; it has helped me a lot and has given me new knowledge on the subject.
Right. I just applied for an SEO sales position and this was my first read on the subject. I'm sold. Thanks Moz - I was able to use techipedia to look up definitions and now have a growing handle on the subject. Cheers!
As real estate agents we continually have hundreds of new pages added to our website daily, and along the same lines hundreds going sold... the IDX removes all the info on the property and sends a 404 Not Found to Google and a 404 page to the client saying "Sorry, this one is not available." Some folks are switching the 404 on the fly to a 200 so that Google does not remove the page. This can cause the client website to build up a ton of soft 404s, as Google is being tricked into thinking the page still has useful info. So the client may have 20,000 pages indexed, but over, say, 10 years when the solds are not removed: 30 sold per day x 365 days = 10,950 per year, x 10 years = 109,500 indexed pages that are sold. Now the agent only has a 10,000-page content web and over 100,000 worthless pages. What are your thoughts on this?
It does give the client the false impression that they are doing better than their friend, when in fact Google thinks you have nothing but garbage, with 100,000 pages saying "Sorry, this one is sold."
Recently another article came out on Moz that basically says if it's a 404 and it walks, talks, or crawls, 301 it or 200 it... exactly the opposite of what Google says: when a home sells, 404 it, period. They are very specific.
Thoughts....
Thanks for the great post on SEO and technical errors.
Hi ,
I am Techiwebi, and I am facing a problem in Webmaster Tools: when I add my site, I can't find all my indexed pages, and there is no data available under Internal Links in the Traffic section. Please help me solve these issues as soon as possible - there are some canonical issues as well. I also read somewhere that the .htaccess file is for Apache servers, and we are working in PHP, so what is right for me?
my site is "www.techiwebi.com"
Please visit my page and help me to find a solution Please Visit https://eeeprojectz.blogspot.com
My blog site has 32 URLs blocked by robots.txt.
How do I solve it?
Being in SEO, I can attest that I have encountered these problems and am currently encountering some of them, which is why I'm thankful you have covered this topic - specifically the 404 errors and the 302 redirects, which I have been running into over and over again and searching for a cure for. Now that I have landed on your page, my issues are resolved. Thanks for your post.
Thanks for sharing such a valuable post. Yes, definitely - apart from link building, on-page factors need to be taken into consideration in SEO. All of these small technical problems lead to bigger issues. Every SEO professional should take care of these problems and improve rankings on the search engines by practising proper SEO techniques.
Wow. Just Wow. Great job on some common yet often overlooked errors. It makes my job so much easier.
Great post. It's awesome that these last few posts are about technical SEO. I think we need some more of that stuff on SEOmoz. Most people know that they need to fix things, but when it comes down to doing it, there are only a few places to look on the web geared towards SEO.
Nice to hear some technical solutions!
Re: Query parameters added to the end of URLs
Glad someone has pointed out this issue with regard to eCommerce sites. I've seen major sites with this issue that probably don't even realize it! Magento is my hands-down favorite choice for eCommerce, but even with all its magic, it's chock-full of such URL issues - they have plugins to fix the problemo.
Multiple versions of the homepage
I've always just done permanent rewrites to a single version for that issue, but it hasn't always worked (non-SEO wiki sites), so I took notes on your points there. Well written, thanks!
Great post, Paddy.
Would you also consider having URLs accessible with and without the WWW prefix to be a common technical SEO problem? It's just like the multiple versions of the homepage, except on a much wider scale. Even though a lot of hosting providers have the option to automatically redirect non-WWW versions, I still see the issue in about 1/4 of the sites I audit.
This is a clearly written and easy to understand article that can immediately be put to use. Nice job Paddy!
Hi Paddy,
thanks for the great article. Instantly shared it with my team - some great points here.
One question: In #3 you state: "If the URLs are not already indexed, the simplest step to take is to add the URL structure to your robots.txt file"
If you have Links going to these sites, wouldn't that mean a big loss of link juice. Don't links to pages blocked by robots.txt leak link juice?
Hi Raoul,
Thanks for the comment.
Yes that is kind of true. URLs that are blocked in robots.txt can accumulate PageRank but technically can't pass it to any other pages if Google can't crawl the page. I'm not 100% sure if those links would still help the domain though, so I'd agree it is possible that some link equity would be lost.
These types of pages are not that link worthy though and given the choice, I'd probably prefer not to have site architecture issues. But if you're worried about it, the rel=canonical tag is probably a better solution.
Cheers.
Paddy
Hi Paddy,
thanks for the response, that's what I was thinking - i guess the canonical might be more appropriate but with crawling budget in mind the best might be to hide these links from crawlers overall.
Still incredible resource - As I said I passed it to my team for learning.
Keep up the great work.
Cheers,
Raoul
Solid points Paddy! Tip #8 is new to me but one we'll be adding to the arsenal for sure.
This is a very serious problem. I also want to know about the uppercase problem - please send me the proper answer.
Informative content here with Paddy.
Whenever I'm working with a new client these are some of the first things I look for, especially the duplicate homepage. Most site owners don't even realize it's an issue because it's just the way the site was built. It's always worth getting the little things on your site in order first!
I do add some of the above listed things while doing the on-Page SEO but certainly all the things you described are very important. I am sure I will be taking care of things like this in future project :)
One problem I see all the time on sites using .NET is that the URLs don't accept common tagging parameters (/?asdf). I not only have trouble with the tags themselves when sending paid media to the site - we'll get some errant version of the intended page - but I also have a difficult time explaining the RIGHT fix. I've seen attempts with URL redirects and meta refreshes. Any suggestions would be AMAZING.
These may be common but they are not obvious to everyone. Fantastic list of issues and solutions, thanks!
This is great. Now I have an idea of how to solve SEO problems. Thanks!
Great and informative post Paddy! The on-page elements you described are necessary and very useful in our SEO strategy, and I think a 301 is the right call when there is more than one version of the homepage.
Thank you for good Web problems review and recommendations. I copied them on my PC for future reference.
That is a rock-solid article; it helped me a lot. I especially liked the soft 404 errors section. Thanks.
Thanks for this informative tips. I'm doing SEO services and I found this very interesting in achieving great results.
Very helpful and informative post! Thanks for the valuable information. I have myself encountered the multiple versions of the homepage problem a number of times and had a hard time solving it.
Thanks Paddy.
About to embark on a new website for my employer and some of these tips are very useful.
Thanks Paddy! It's so great to see solutions to all the main technical SEO problems in one place. Would love to see this kind of article written on nearly every other SEO topic as well.
AWESOME onsite SEO article! Thank you!
Thanks for the article mate! I always appreciate some easy-to-follow technical SEO words.
Some good advice there, I think I need to give my site a once over and make sure all the SEO is in place.
Great info, cheers
A really useful post thanks Paddy. I particularly liked your ideas for keyword research relating to the query parameters in #3.