I've been working over Labor Day weekend a bit, reviewing sites for clients (and friends) and continually find that site architecture best practices are being ignored. Since Monday isn't a holiday globally, I thought it would be worthwhile to provide a quick peek into what I'd recommend on the structural side of SEO:
- Dynamic URLs - If it's even remotely possible, avoid them completely. From what we've seen, a couple of the search engines actually have different trust metrics or ranking criteria they apply to dynamic URLs. It's not as big a deal for Google as for others, but why take a chance? ISAPI and mod_rewrite are simple to implement (see the sketch after this list) and certainly worth the time.
- 3 Clicks to Any Page - Normally, webdev industry insiders consider this a rule for usability, but it's also critical to successful SEO. If you want spiders to quickly find your content and engines to rank it well, make your sitemap page accessible from every page on the site, and if you've got a monstrously huge site, use sub-sitemaps for unique sections to ensure that thousands of pages can be accessed in 2-3 clicks/links.
- Avoid Unnecessary Subdomains - It's open to speculation whether each of the engines applies the entirety of a domain's trust and linkjuice weight to subdomains. Some think it's on a case-by-case basis, which I find reasonable, and others think they are generally devalued as compared to the primary domain. In either case, unless you're looking to dominate the SERPs via a subdomain takeover (like this guy), subdomain content can easily go in a subfolder.
- Internal Anchor Text Bombing - The funny part about this tactic is - it used to work. You could change the link to your site's home page to read "denver mortgage refinance" and actually rank for it. Luckily, Google & Yahoo! got smart right around the same time and actually started penalizing sites that used this tactic. Your best bet now is to write internal anchor text for visitors, not engines. If you run a Denver real estate site and link to your refinance page, it might be fine to use that anchor text, but for primary site navigation, this technique is more likely to hurt than help.
- PageRank Flow - Two words will suffice - ignore it. PageRank flow through a site used to be a valid tactic, but these days, you're wasting valuable time determining the number of links, where they point, and attempting to modify your site based on the 7-year-old formula. Similarly, keeping a careful eye on your outbound links doesn't pay - just think of the human user and deliver what they'd want (it's remarkable how far this will take you).
- One Piece of Content, One URL - This probably trips up more big, commercial sites than any other. The issue is that the same content is accessible in multiple ways and on multiple URLs, forcing the search engines (and visitors) to choose which is the canonical version, which to link to and which to disregard. No one wins when sites fight themselves - make peace and if you have to deliver the content in different ways, rely on cookies or session IDs so you don't confuse the spiders.
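To make the first and last points concrete, here's a minimal Apache mod_rewrite sketch. The script name, parameter, and URL pattern (product.php?id=123 becoming /products/123/) are hypothetical stand-ins, not a drop-in config:

```apache
# .htaccess sketch: clean URLs plus one canonical URL per page
RewriteEngine On

# 301 the old dynamic form to the clean form so only one URL survives
# (the trailing "?" drops the old query string from the redirect target)
RewriteCond %{THE_REQUEST} \?id=([0-9]+)
RewriteRule ^product\.php$ /products/%1/? [R=301,L]

# Quietly map the clean form back onto the existing script
RewriteRule ^products/([0-9]+)/?$ product.php?id=$1 [L]
```

The RewriteCond on THE_REQUEST keys off the original request line, so the internal rewrite doesn't trigger the redirect again and loop.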
Hopefully, these suggestions are already issues you're familiar with and apply without a second thought. If you've got others to suggest, I'm all ears (even though I'm technically taking the day off).
About PageRank Flow... I agree with Rand that watching "single page" PageRanks is vain, but I disagree with Rand about PR flow being deceased. Though it is nowhere near as effective an SEO technique as it used to be in the old days, in some cases it can provide the necessary edge (the same goes for internal anchor text).
PageRank flow is also a very accurate way to measure site architecture: how functional the architecture is, how many clicks a page needs before it's found, which portions of the site's internal links *push out* the most, etc. - highly detailed data that is very useful for e-commerce sites.
Regarding dynamic URLs, what issues can a mod_rewrite create if you transition months after launching the site? Google has already indexed my pages (including all my dynamic URLs) so is it really worth it to rewrite all the URLs? Won't this be a step backwards with Google?
If it's an on-site change, just 301 the old URLs to the new ones, make sure you're submitted to Google Sitemaps (seriously, it works wonders for big changes), and you should be great. The concern about 301ing arises primarily if you're moving domains; moving pages inside a domain isn't really an issue.
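For reference, a single moved page is a one-line 301 in Apache - a sketch with made-up paths:

```apache
# .htaccess sketch: permanently redirect a moved page to its new URL
Redirect 301 /old-page.html /new-page.html

# or catch a whole renamed section in one rule
RedirectMatch 301 ^/old-section/(.*)$ /new-section/$1
```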
Can you clarify or give some support for the subdomain argument?
The subdomain trick still works wonders and unless you think Google will overcorrect this on their next algo switch, I don't see the harm.
I'm not necessarily denying it, but I personally haven't come across too much information that supports this particular point of yours (unlike the others, which I pretty much completely agree with).
What I mean is - unless you have solid rankings for #1 and are looking to use subdomains to have multiple listings on page 1 for the search phrases that all come to your site, I'd elect to keep all content behind the top level domain. Search engines will count all those links as bolstering the whole of the site, and to me, getting that solidified link love is better than spreading it out among many subdomains, betting on the subdomain trick to always bring you extra traffic from multiple listings.
Rand, about the subdomain issue - do you believe it is a bad idea to split your main agency site www.example.com and your blog site blog.example.com? Do you have any evidence or experience to support this? Or are you strictly speaking about subdomain spamming?
This has really made me unsure about my next blog installation...
sasa - I don't see a big problem with using blog.example.com, but I do know that search engines have in the past looked at subdomains as potentially separate from the main domain. Thus, if you host it at example.com/blog/, you can be sure that links to the blog will pass the trust, reputation, link love, etc. to the domain as a whole, but if you host it at blog.example.com, you can't be 100% assured of that. Thus, in the world of best practices, I'd dodge the subdomain for blog use unless you've got a good business reason for it.
Yeah, but I wonder how much of an issue this is really going to be? Do I lose like 0.1% or 20% of link love? What kinda hits me as wrong is the assumption that subdomains are a suboptimal thing. In my eyes it would be just wrong if a SE devalued this case. It would be cool to get an answer about this from Matt Cutts. You know of a good way to contact him about this? Maybe a blog post? The reason for me wanting to use a subdomain is purely technical. The blog software I am trying to use just doesn't work reliably out of a directory (yet). It would take me days to fix this, which is not possible at the moment. Putting it on a subdomain works flawlessly. I could 301 to the directory whenever the support for directory installations matures. This should consolidate the link power.
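For what it's worth, the eventual subdomain-to-directory consolidation described above could be a few lines of Apache config once the blog moves - a sketch assuming placeholder hostnames:

```apache
# .htaccess on the old blog.example.com host - sketch only
RewriteEngine On
RewriteCond %{HTTP_HOST} ^blog\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/blog/$1 [R=301,L]
```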
Could someone clear something up for me...
Does linking to /folder/page.html count as an absolute or a relative link?
Since you can link to /folder/page.html from anywhere on the website it could be argued it is absolute.
However since it doesn't include the domain name it could be said to be relative...
Am I reading too much into this?
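For reference, /folder/page.html is technically a relative URL - specifically a root-relative one, resolved against whatever domain the page sits on - which is why it works from anywhere on the site. The three forms side by side (example.com and the paths are placeholders):

```html
<!-- absolute: includes the scheme and domain -->
<a href="http://www.example.com/folder/page.html">absolute</a>

<!-- root-relative: resolved against the current domain's root -->
<a href="/folder/page.html">root-relative</a>

<!-- document-relative: resolved against the current page's folder -->
<a href="page.html">relative</a>
```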
This article, albeit two years old, bothers me. Rand, I just finished reading one of the SEOmoz PRO articles called "The Professional's Guide to PageRank Optimization". In the article, the author stresses the importance of PageRank sculpting and flow within your website - yet you say this is a waste of time?
In 2009, today, what's the general consensus with regard to PR optimization? A waste of time? A good practice?
I am in the same boat. I just read the "pro guide to PR optimization". Is PR sculpting worthwhile? With a universal nav, every page, more or less, is linked to every other page an equal number of times. In this case there may be a navigational hierarchy, but isn't internal PR basically flat? I am interested to hear comments on this.
Well, truly I am a bit surprised to learn from you that Google penalizes sites that use keywords as the anchor text of internal links... I always used this. What I did was use "home" with "nofollow" and put keyword-rich anchor text elsewhere on the page. Does Google really penalize these sites? Well, I learned SEO by looking at the sites that placed higher in the SERPs. I searched for a highly competitive keyword, "website templates", and the first result was www.templatemonster.com, which is still using this technique. Have you ever come across an example where Google penalized a site for using keywords in anchor text? I would certainly like to know more about this.
Hi, I'm Tammy from Bali. I just wanted to say that x8drums.com actually uses this anchor text thing and is still on top. Could you analyze their website and tell me what you think? They are in the top 10 for the keyword "djembe, hand drums" on Google.com.
Thanks,
Tammy
A brief analysis concludes that it isn't a very difficult keyword to rank for. Additionally, that home link isn't the only place where the keyword is found. I'm sure Google's algorithm is complex enough to analyze where else the keyword appears and determine whether the site owner is anchor text bombing or the site is actually about that topic.
Also, from my experience, one can still anchor-text bomb for more obscure keywords, with limited success, simply because the phrase doesn't often show up on other websites.
Since I'm about to call SEO to an end in 9 days - I thought I would share a different view:
Dynamic URLs - Disagree (wholeheartedly). Mod_rewrite fixes this, and zero search engines will have a problem. Even without it, none of the big 3 have any problems so long as you limit the params.
This - 3 Clicks to Any Page - and that - Avoid Unnecessary Subdomains - are like oxymorons: clearly opposing forces on a larger website. The former is impractical, as another rule is to limit the number of links on any given page. Google suggests 100, but even that is pretty steep for visitor usability - thus the latter is an aid here... in either case, advance planning makes it work.
Internal Anchor Text Bombing - still quite effective - amazingly, "home" is the best example here. While one could argue that (top 10) nytimes.com, latimes.com, bbc.co.uk, suntimes.com, setiathome.berkeley.edu, usatoday.com, ebay.com, guardian.co.uk, boston.com, and nasa.gov all offer something for "home", they are not overly "home" related. It is better to view this from the vantage point of breadcrumbs - they add massive improvements and are also a great asset to usability.
The latter two - PageRank Flow & One Piece of Content, One URL - are great advice.
Dynamic URLs- From my experience this is correct. The only time I've had issues with them (with any search engine) is if your site forces a session ID to show in the URL, and that is easy enough to fix. HOWEVER, I do think that using mod_rewrite or ISAPI is a very good idea, perhaps not for search engines but for bookmarking and URL memorability.
3 Clicks to Any Page- This IS a good idea, and it's not too difficult to do, even for large sites (just categorize things). As far as how many links Google likes to see on a page, I think it can be very relative. If they find a lot of VALUABLE content that is not duplicated and is linked-to and considered to be a resource by other people/sites in the industry, they tend to allow a few more links on a page. As long as everything is organized and easy to navigate, I think the standard of 2-3 clicks to what you need is a good one to keep to.
Good discussion on a great post! Thanks Rand.
One Piece of Content, One URL
This is also my favorite one, as I see that it is often not respected which can have dire consequences.
There are two issues with it, actually:
- content accessible on several different URLs
- one URL for several alternative contents
The second case is rarer, and I have seen it mostly occur with language detection based on cookies or the browser preferences. This is a true showstopper, as only the default language version can be indexed.
I find the website architecture for SEO theme of particular interest, as it is linked to information architecture and usability matters. Working on SEO, information architecture, or usability sometimes improves the whole website for both users and search engines.
One Piece of Content, One URL
Funny, I just wrote a little about that, after reading Do not Crawl in the DUST: Different URLs with Similar Text, which is the extended abstract of a poster presented this May at WWW2006.
Even if a search engine gets smarter in understanding which pages at different URLs are very similar, or the same, you are still at their mercy when it comes time to decide which URL is the one they will fetch, index, or serve. That's a situation I hate being in.
Yep, I spent almost half an hour today trying to explain to my client why his dozens of duplicate pages could pose a serious problem. Duplicate content seems to be one of the worst issues that SEOs are facing these days :(
Agree with you there, Nadir. It is a prominent issue, and this discussion can be very helpful in addressing it. Fortunately, it's a solvable problem, with enough effort.
While I agree with Rand that it isn't normally worth the effort to become fixated on PageRank flow, especially at the level of granularity that he describes, that flow isn't being helped any when you have Google thinking that a 3,000-page site actually has 90,000 pages, most of them substantially similar or the same.
Good list Rand, I would like to add this one:
-Use a "smart" robots.txt file: for big e-commerce sites, a robots.txt file that forbids bots from crawling "useless" pages can allow them to crawl your website more efficiently.
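A minimal sketch of what that might look like - the disallowed paths here are hypothetical; substitute whatever "useless" pages your platform generates:

```
# robots.txt sketch for a big e-commerce site
User-agent: *
# keep crawlers out of internal search results and throwaway views
Disallow: /search
Disallow: /cart/
Disallow: /print/
```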
That's another good point. I'll make sure to add that to some of my sites (some I don't really care about), but I'll definitely add it to my top-priority sites to make the crawling more effective. Thanks!
@ dynamic URLs. Since I do my work in ASP and I see this question raised quite often, I would just add that you can also use a custom 404 if you don't have mod_rewrite available at your server.
I've developed my own method, since "all" the tutorials I've seen are not very flexible or use all kinds of strange methods (for example, linking to somepage-1.html and then using the custom 404 to do a 301-redirect to page.asp?id=1).
Perhaps I'll write a tutorial some day if somebody would like to know how I solve this.
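For readers unfamiliar with the general approach being referenced, here's a rough classic-ASP sketch of the custom-404-to-301 technique described above (not the commenter's own method - the file names and URL pattern are hypothetical, and it assumes IIS passes the original URL to the 404 page as "404;http://host/path" in the query string):

```asp
<%
' 404.asp - sketch of the custom-404-to-301 technique
Dim qs, url, pos, id
qs = Request.ServerVariables("QUERY_STRING")  ' e.g. 404;http://example.com/somepage-1.html
url = Mid(qs, InStr(qs, ";") + 1)             ' strip the leading "404;"

' pull the numeric ID out of URLs shaped like /somepage-<id>.html
pos = InStrRev(url, "-")
If pos > 0 Then
    id = Replace(Mid(url, pos + 1), ".html", "")
    If IsNumeric(id) Then
        Response.Status = "301 Moved Permanently"
        Response.AddHeader "Location", "/page.asp?id=" & id
        Response.End
    End If
End If

' anything we can't map stays a real 404
Response.Status = "404 Not Found"
%>
```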
Your first note says not to use dynamic URLs, so would you consider your blog's links static or dynamic? Is using ?ID=#### for blog entries non-dynamic? My understanding of a dynamic link is something such as mysite.php?thispage=home, using the question mark (?). Is that no longer true for search engines?
LOL, good call!
Jonathan - This is a case of "do as I say, not as I do." Those dynamic URLs DO hurt SEOmoz's search traffic. I plan to move them over with a mod_rewrite in the near future - I'll make sure the results get posted publicly.
Ahhh gotcha okay. I'm already doing it for all my sites, but I was just wondering. Thanks!
This is interesting. I was just reading on some SEO site that using navigation with "HOME" in the anchor text can hurt your rankings.
Can't use Keywords.... can't use Home... what's a webmaster to do?
Kurt - not sure who told you that, but I'd disbelieve it. Use Home - that's the convention, and a search engine would be crazy to penalize the 95% of sites out there that do use it.
I will throw another one into the mix...
Using images when HTML will suffice. I have seen site after site wonder why they do not rank the way they want; then you look at the site, and it's all images!
Spend the time and use HTML (stylized) wherever possible.
I completely agree with you Allen. When you have a menu, for example, try to use CSS and text, avoiding images at all costs -- and at a bare minimum, have a sitemap or a footer with those links if you *have* to use images.
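As a quick illustration - a plain-text menu styled with CSS instead of image buttons (the class names and URLs are made up):

```html
<!-- spiders can read and follow this; image buttons would hide both text and links -->
<ul class="nav">
  <li><a href="/">Home</a></li>
  <li><a href="/services/">Services</a></li>
  <li><a href="/contact/">Contact</a></li>
</ul>

<style>
  .nav li { display: inline; margin-right: 1em; }
  .nav a  { font-weight: bold; text-decoration: none; }
</style>
```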
For the 3 clicks away from the homepage bit - that's pretty hard for forums. I run vbulletin and I'm really trying to get as many of my threads indexed as possible. But the problem I'm facing is that in the categories you can have hundreds of pages of thread listings (obviously some are FAR more than three clicks away from the homepage).
If your site has good content that is well organized (in a sitemap, for example) and the links to threads don't look like spam, they will get indexed. I was speaking about the 3 clicks from the homepage as a standard to be kept for your site's visitors. People don't generally stay at a site for too long if they can't get to where they need to be in a couple of clicks.