I recently read jennita's excellent post, "URL Rewrites and 301 Redirects - How does it all work?", and thought a mod_rewrite example might be helpful to some. So here is some example code showing how I have used mod_rewrite to replace dynamic URLs with SEO-friendly URLs.
Please note that these examples are for *nix-based web servers running Apache. If you have a Windows-based web host running IIS, then this code won't help you.
EXAMPLE 1 - E-COMMERCE SITE
Sad Original URL = site.com/page.php?category=2&product=54
Happy URL :) = site.com/sandwiches/rueben-sandwich/
step 1:
Make sure that all category names and product names are unique in your database.
step 2:
Replace all references to Original URL with the New URL throughout your website.
step 3:
Use mod_rewrite in your .htaccess file to parse out the elements of the URL. Like this:
RewriteEngine On
RewriteRule /(.*)/(.*)/$ page.php?category=$1&product=$2
Each (.*) group pulls one element out of the URL and puts it in a variable: $1 for the first group and $2 for the second.
step 4:
Update your code on the page.php file to get the data from the database via the category and product name instead of by ID (this is why the names must be unique). For example:
Before: "select * from database_table where category_id='$category' and product_id='$product'"
After: "select * from database_table where category_name='$category' and product_name='$product'"
NOTE: This is just a quick and simple example that doesn't consider sanitizing input for security (which you must do) or any table joins you might need to do on your site.
EXAMPLE 2 - CUSTOM CMS DRIVEN SITE
Here's another example. This is the way I manage the URLs on my custom CMS as described in my previous YOUmoz post.
step 1:
URLs can be absolutely anything you can dream up. Everything after site.com becomes that page's unique page_id.
site.com/section1/subsection/page-title/
site.com/blog/blog-section/blog-post/
site.com/about-us/
site.com/anything-you-want/does-not/matter-how/many/slashes-or-dashes/
step 2:
In your .htaccess file everything after the site.com domain name is going to be the unique page ID. So, I use this RewriteRule to pass it to the load_page.php file.
RewriteEngine On
RewriteRule (.*)/$ load_page.php?page_id=$1
step 3:
In load_page.php, I get all the content for the page from the database like this:
"select * from pages where page_url='".mysql_real_escape_string($page_id)."'";
Hope this is helpful. Please note that this post has been written for a technical audience. I'll try my best to answer any questions posted in the comments.
Thanks for adding these examples! Anyone interested in IIS examples?
IIS rewrites .. first take a year old goat without blemish, a strong double-edged blade of steel with an ivory handle, ...
Very much so, as I'll be doing this for 6,000 existing URLs so would welcome any advice
Until Jennita or someone else posts something useful, try combining the tips from this article with a careful read of the Readme.txt file from the Ionics Isapi Rewrite Filter (a free ISAPI rewrite filter for IIS) and, of course, use your imagination.
I have done everything I needed with the Ionics filter using only a few rules, although I don't know anything about regular expressions (the Readme.txt was pretty understandable to me).
Alternatively, you can use the commercial ISAPI_Rewrite filter.
I've used ISAPI_Rewrite to rewrite thousands of dynamic URLs, and I found it to work like a charm. I've also had to use a couple of rewrite modules for DNN (DotNetNuke), which actually worked pretty well. I'll work on a post that talks about specifics!
Just don't forget to recommend some non-commercial options. Readers would love it.
Great follow-up post, Whitespark! I rewrite all my dynamic URLs to friendly ones on every site I create. I agree that this is just a little touch of what you can do with the .htaccess file.
*** I rewrite all my dynamic URLs to friendly ones on every site I create. ***
Can I pick up on the common terminology that most people use to describe this stuff, and in so doing unintentionally confuse the newbies?
A rewrite does *not* 'make' anything.
A rewrite certainly does *not* change one URL into another.
If you want to use 'friendly' URLs on your site, the first thing you have to do is use those 'friendly' URLs in the links in the internal navigation on the pages of your site. It is links that 'define' URLs.
Once someone clicks one of those links and the URL request arrives at your server, then the rewrite accepts that URL request and uses it to find the path and file inside the server that will be used to fulfil that request.
A rewrite might accept a *URL* request for www.example.com/interesting-article-about-some-subject and use the rewrite rules to get that content from the *file* /article-1372.html - that is, a rewrite converts an external URL request into an internal filepath and filename mapping.
The part of the system that tells users asking for the old URL, that the resource is at a new URL, is the matching redirect(s) that I alluded to, above.
One part of the confusion is that Mod_Rewrite is used both for the redirect code and for the rewrite code. The redirect code must contain the target domain name and path and the [R=301,L] flags, and the rewrite must contain just the internal filepath and [L] flag.
Warning: If you include either a domain name or [R] in your rule then you get a redirect, not a rewrite.
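To make the distinction concrete, here is a minimal sketch of the two kinds of rule side by side, as they might appear in a root .htaccess file. The host www.example.com and the old path old-article.html are placeholders invented for illustration; the friendly URL and article-1372.html are taken from the example above.

```apache
RewriteEngine On

# External redirect: the target carries a scheme and host plus
# [R=301,L], so the browser is told to request the new URL.
RewriteRule ^old-article\.html$ http://www.example.com/interesting-article-about-some-subject [R=301,L]

# Internal rewrite: the target is only a server-side path with [L],
# so the URL shown in the address bar does not change.
RewriteRule ^interesting-article-about-some-subject$ /article-1372.html [L]
```

A request for /old-article.html receives a 301 response pointing at the friendly URL; the follow-up request for the friendly URL is then served from /article-1372.html without the visitor ever seeing that filename.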
The biggest problems encountered with botched redirects and rewrites are:
- failing to redirect to 'friendly' URLs if parameter-based URLs are requested, leading to Duplicate Content indexing,
- failing to fix the domain name in the very same redirect as fixing the other issues, leading to a redirection chain for some non-canonical requests,
- having a rewrite which will fetch the content for multiple (similar) versions of URL: such as with and without trailing slash, or for www and non-www, or with multiple casing issues, or having part of the URL wild-carded and value unchecked and unverified, and many other things which often get overlooked.
- mixing Redirect or RedirectMatch (from Mod_Alias) and RewriteRule (from Mod_Rewrite) in the same .htaccess file. That can cause problems with things being processed in the wrong order. If you use RewriteRule for some of your rules, you need to use it for *all* of your rules.
- listing rewrites before redirects and thereby exposing internally rewritten filepaths back out on to the web. List redirects first.
- listing redirects in the wrong order. List the most specific redirects first, and the most general last. Getting the order wrong can lead to an unwanted redirection chain for certain requests.
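The ordering advice in the list above can be sketched as a skeleton .htaccess file. Every path here and the host www.example.com are hypothetical, and load_page.php is borrowed from Example 2 purely as a stand-in for whatever script serves the content.

```apache
RewriteEngine On

# 1. Most specific redirect first.
RewriteRule ^shop/old-page\.html$ http://www.example.com/new-page [R=301,L]

# 2. More general redirects after it.
RewriteRule ^shop/(.+)\.html$ http://www.example.com/$1 [R=301,L]

# 3. The most general redirect last: fix a non-canonical host name.
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# 4. Internal rewrites only after every redirect, so a rewritten
#    filepath is never sent back out to the browser.
RewriteRule ^([^/.]+)$ /load_page.php?page_id=$1 [L]
```

Because each redirect target names the full canonical host, a request that is wrong in both path and host is corrected in a single hop rather than a chain.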
In regard to "failing to redirect to 'friendly' URLs if parameter-based URLs are requested, leading to Duplicate Content indexing," and using the same example as above, you are saying that:
Original URL = site.com/page.php?category=2&product=54 should redirect to:
New URL= site.com/sandwiches/rueben-sandwich/
And
New URL=site.com/sandwiches/rueben-sandwich/ should have a rewrite to:
Original URL = site.com/page.php?category=2&product=54
Is that accurate?
And when you do this, customers, Google Search and Google Analytics will just see:
site.com/sandwiches/rueben-sandwich/
Correct?
i want to ReWrite This link
https://softwaredepo.net/download?id=1150
to
https://softwaredepo.net/download/1150.html
is it possible ?? if possible how ??
please help me .. please help me .. please help me .. please help me ..
Whitespark,
It's good one, again!
I have similar approach to this problem, with one little exception.
This happy URL, site.com/sandwiches/rueben-sandwich/, technically points to the index (default) page on the 3rd level of the hierarchy. By contrast, the URL site.com/sandwiches/rueben-sandwich (without the slash) points to the "rueben-sandwich" page, which is located on the 2nd level of the hierarchy.
I really can't say whether search engines like the 2nd level more than the 3rd, but I am pretty sure that they don't like the 3rd level more than the 2nd.
Oh, I also always add this code to my .htaccess files which forces the slash at the end of all URLs.
RewriteRule ^/*(.+/)?([^.]*[^/])$ https://%{HTTP_HOST}/$1$2/ [L,R=301]
I'd force the URL without a slash, just like SEOmoz does.
https://www.seomoz.org/ugc/using-mod-rewrite-to-convert-dynamic-urls-to-seo-friendly-urls
But I agree with you that the last check you have just added (to avoid duplicate content) is very important.
What would you folks do?
I would also force URLs not to have a trailing slash when defining URLs to be used with rewrites. A URL with a trailing slash denotes a physical folder.
An extensionless URL will end with neither an extension nor a slash.
This also makes your rewrite rules somewhat simpler, as you use a RewriteCond to test for a URL request without extension and without trailing slash, instead of using a resource intensive -d and -f test.
You also always need to be mindful that URL requests for /robots.txt, image files, CSS and JS files, SE account verification files, and so on, also do not need to be rewritten.
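A sketch of that idea, with the "no extension, no trailing slash" test folded into the RewriteRule pattern itself rather than a separate RewriteCond. load_page.php and page_id come from Example 2 of the post; using them with slash-free URLs is an assumption for illustration.

```apache
RewriteEngine On

# Matches one or more path segments that contain no dot and do not
# end in a slash, for example section1/subsection/page-title.
# Requests such as /robots.txt, /images/logo.png or /css/site.css
# contain a dot, so they never match and are served as normal files.
# No -f or -d filesystem test is needed.
RewriteRule ^(([^/.]+/)*[^/.]+)$ /load_page.php?page_id=$1 [L]
```

The target /load_page.php contains a dot, so the rewritten request cannot match the rule a second time.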
Agreed. The trailing slash only makes sense if you are going to be passing parameters via querystring, at which point, whyTF are you mod-rewriting anyway. g1smd is right.
See, this is what makes no sense to me. In terms of your internal folders, it would appear to be 2 levels deep, but if you link to it from the home page it is one level deep, isn't it? Also, he forces a slash on to the URL, so the URL is (technically) just 'designed': he could make it without the / or add it.
So what I'm saying is there is a clear difference between URL, folder and click path.
With all the mod_rewrite articles out there that show you how to redirect a request, none of them show you how to keep the URL in the address bar the same. For example, you say you can redirect /somepage-1234.html to the page /?somepage=1234 with [R=301]. But when you do that, the user is taken to that page, and it is shown in the address bar as the dynamic page. So how do you keep the original URL in the address bar?
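Going by the redirect-versus-rewrite explanation earlier in the thread, the address bar stays the same when the rule has no [R] flag and no domain name in its target. A minimal sketch for a root .htaccess file follows; it assumes index.php is the script behind /?somepage=1234, which the question does not actually state.

```apache
RewriteEngine On

# No [R] flag and no host name in the target, so this is an internal
# rewrite: the browser keeps showing /somepage-1234.html while the
# content is produced by the script.
# Assumption: index.php is what answers /?somepage=1234 on this site.
RewriteRule ^somepage-([0-9]+)\.html$ /index.php?somepage=$1 [L]
```

The links on the site's pages must already point at /somepage-1234.html; the rule only tells the server where to find the content for that URL.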
Hi,
I am new to PHP and also to .htaccess. My problem is how to manage the .htaccess file when a URL has two parameters and should sometimes call file1.php and sometimes call file2.php.
Ex.
https://localhost/CTCExpress/sarees -- RewriteRule ^([a-zA-Z]+)$ product.php?cID=$1 [NC,L]
https://localhost/CTCExpress/sarees/Designer ---- RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)$ product.php?cID=$1&typeID=$2 [NC,L]
Now I want to call:
https://localhost/CTCExpress/saress/White+South+Sil... -- RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)$ product-detail.php?pID=$1
https://localhost/CTCExpress/saress/Designer/White+South+Sil... -- RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)$ product-detail.php?pID=$1
Please suggest how I can manage this. This is an eCommerce website.
This is my testing URL.
https://zeeia.com/CTCExpress/product.php?cID=1
Another fine post Whitespark. Thanks for giving concrete code examples.
RewriteRule /(.*)/(.*)/$ page.php?category=$1&product=$2
Always try to avoid using .* in pattern matching. The .* pattern is greedy, promiscuous and ambiguous.
Be especially aware that using multiple .* patterns in a rule often requires the parser to try hundreds, maybe even thousands, of 'back-off and retry' operations before a match is found. This is very inefficient.
Always try to craft a pattern that can be parsed from left to right in one operation.
I would suggest using this alternative rule (here coded for use in .htaccess):
RewriteRule ^([^/]+)/([^/]+)$ /page.php?category=$1&product=$2 [L]
Always add [L] to the end of each rule unless you know exactly why it should be omitted.
Be aware that when a RewriteRule is used in .htaccess, the path information is 'localised'. In this case, when used in the root, it means the leading slash is not seen.
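The 'localised' path point can be illustrated by writing the same rule for the two contexts. This is only a sketch: use one form or the other depending on where the rule lives, never both together.

```apache
# Form 1: root .htaccess (per-directory context).
# The leading slash is stripped before matching, so the pattern
# starts directly at the first path segment.
RewriteRule ^([^/]+)/([^/]+)$ /page.php?category=$1&product=$2 [L]

# Form 2: httpd.conf or a <VirtualHost> block (server context).
# The pattern is matched against the full URL-path, so the leading
# slash has to appear in the pattern.
RewriteRule ^/([^/]+)/([^/]+)$ /page.php?category=$1&product=$2 [L]
```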
Please can anyone tell me how the URL rewrite works? I can't understand it.
Regards
Khan
Fine post. Good work, Whitespark.
There's a final step (maybe two) needed here to keep things tidy, and prevent Duplicate Content issues:
1. Redirect requests for the old format 'dynamic' URLs over to the new format 'friendly' URLs.
2. Protect the site from being accessed using parameter-based URLs.
In some cases those steps are one and the same, and at other times they are two different steps.
If both the old and new URLs use simple names and/or numbers then this can be done in .htaccess with a simple RewriteRule. That rule should force a 301 redirect, and the correct domain name at the same time for all requests for old format paths.
If the new URLs use words and numbers that don't appear in the old URLs then you need to rewrite those requests for the old URL to then use a script that looks the old URLs up in the database and sends out the redirect HTTP header from inside the script.
Example:
Your site used to use URLs like: example.com/?item=23456&size=16
You now use URLs like: www.example.com/blue-shirt/16
You need two redirects:
If: /?product=blue-shirt&size=16 or /index.php?product=blue-shirt&size=16 is requested, redirect to: www.example.com/blue-shirt/16
If: /?item=23456&size=16 or /index.php?item=23456&size=16 is requested, redirect to: www.example.com/blue-shirt/16
Once the correct URL is being requested, your internal rewrite is then used to service that request and deliver the correct content.
Don't forget that your redirects also need to cater for:
1. requests with additional, redundant, parameters and redirect those to the canonical URL,
2. requests with the parameters in a different order, and redirect those to the canonical form.
That is, there is no point in protecting your content from being indexed at: /?product=blue-shirt&size=16 if you are not also going to protect it from being indexed at: /?size=16&product=blue-shirt and at: /?product=blue-shirt&size=16&someparam=somevalue (and other parameter order combinations) too.
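For the name-based case (/?product=blue-shirt&size=16), a sketch of the redirects plus the matching internal rewrite might look like the following in a root .htaccess file. It assumes index.php is the script that serves the content and that product names use only lower-case letters, digits and hyphens; both are assumptions for illustration. The ID-based case (/?item=23456) is not covered here, because it needs the database lookup in a script described above.

```apache
RewriteEngine On

# Redirect 1: product appears before size, with any other parameters
# before, between or after them.
# THE_REQUEST holds the request line sent by the client, so the
# internal rewrite at the bottom cannot re-trigger this rule.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(index\.php)?\?
RewriteCond %{QUERY_STRING} (^|&)product=([a-z0-9-]+)&([^&]+&)*size=([0-9]+)(&|$)
RewriteRule ^(index\.php)?$ http://www.example.com/%2/%4? [R=301,L]

# Redirect 2: same parameters in the opposite order.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(index\.php)?\?
RewriteCond %{QUERY_STRING} (^|&)size=([0-9]+)&([^&]+&)*product=([a-z0-9-]+)(&|$)
RewriteRule ^(index\.php)?$ http://www.example.com/%4/%2? [R=301,L]

# Internal rewrite: serve the friendly URL from the script.
RewriteRule ^([a-z0-9-]+)/([0-9]+)$ /index.php?product=$1&size=$2 [L]
```

The trailing ? on each redirect target drops the old query string, so redundant parameters such as someparam=somevalue do not survive into the canonical URL. Because the target names the full host, a non-canonical host is corrected in the same single redirect.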
g1smd brings up a great point. Going the extra mile to tidy up any loose ends with mod_rewrite will pay dividends. I worked with a large affiliate program to help improve their SEO, and identifying the canonical URL to tell the search engines was the last piece in the puzzle for us to hit it out of the park. This will clear up any confusion that may arise from your new redirects.
Whoever thumbed this down - carry on living in ignorance mate.
Why not bring some discussion to the thread rather than thumbing down stuff you obviously don't understand?
I think that problem with your thumbs down is not whether those people agree with you or not. I believe they don't like your general writing style rather than what you have to say about particular topic.
Generally, quality of comment should be dominant here, not style, but we're all human beings (read: not always rational).
Great post - we've also just started to Mod Rewrite images that are uploaded dynamically and assigned a number.
We are hoping that this will help in Google Image search and general search term focus on a particular product page :)
Can anyone say how to do mod_rewrite in DotNetNuke?
As mentioned in the comments, https://iirf.codeplex.com/ is quite helpful.
Using this I'm encountering a HUGE problem. When I try to use the <img> tag in HTML, it won't let the images render.
Is there a way to fix this? Would LOVE to use this example.
Helpful, but can you tell me how to put a unique title, description and keywords on a .NET site?
ah if only web developers would take some more time planning the site so this would not be required....
Whitespark, great follow up with the examples! I'm going to try out that rewrite rule that appends lines with a trailing slash.
Caution, if you're using content that's not from the DB (and hence have .htaccess rules to check for files or folders first before running an expensive DB query) then you need to be aware that a little missing slash can cause an internal redirect and extra server load, ditto a stray extra slash.
But y'all knew that right.
Malware warning on your link, pbhj. (MW;DR)
I forgot something very important
Knowing that, unlike a domain name, a URL is case sensitive, it's a must to have this check rule.
If the regular URL is site.com/page, but the requested URL is site.com/PAGE or site.com/paGe, there must be a 301 redirect to site.com/page
Even SEOmoz failed here so Rebecca .... (I know that mentioning your name will trigger your filters)
Actually a separate 301 is NOT necessary in regards to case.
Simply add the NC flag at the end of your rule (NC = NO CASE)
Check out the flags at the end of the RewriteRule section in the mod_rewrite manual: https://httpd.apache.org/docs/1.3/mod/mod_rewrite.html#RewriteRule
Note this only applies to Apache. The rule is different in IIS, Lighttpd, and Nginx.
You can use the [NC] flag with the old URLs to redirect to the new URLs, but do be aware that if your rewrite accepts aNy CasE of URL then you have just created a Duplicate Content problem.
If you really want to avoid the horrible complication of using a RewriteMap and having to parse every URL request at least 26 times to fix it, then simply use all lower-case for all URLs and make all included upper-case elements in requests force a 404 fail.
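One possible way to get the "upper case means 404" behaviour is sketched below. It assumes an Apache 2.x version whose R flag accepts status codes outside the 3xx range, and a site on which no real file or folder name contains a capital letter (otherwise those files would become unreachable too).

```apache
RewriteEngine On

# Place this before all other rules.
# Without the NC flag the pattern is case-sensitive, so any request
# path containing an upper-case letter is answered with 404.
# The "-" target means no substitution is performed.
RewriteRule [A-Z] - [R=404,L]
```

The pattern is tested against the path only, so upper-case characters in a query string are not affected by this rule.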
IIS takes this to a new level of #FAIL with its own designed-in, built-in stupidity of not caring what case the request is in, nor what case the real filename on the server uses.
g1smd, are you a redirect god?
I think he might be. He should have written this post. :)
Wait, what? Who thumbed me down for that?
I wasn't being sarcastic. I really think g1smd could have done a better job on this post than me. He clearly has a lot of knowledge in this area.
Looking at his comments, it seems like someone is going rampant with the thumbs down button on everything he writes. g1smd has provided a lot of great additional information to this post. If you disagree, then we would love to hear your opinion, which would be more productive than just childishly thumbing down everything he says.
Here's a great article on .htaccess I just came across:
16 great .htaccess tricks and hacks
Thanks Whitespark, can you recommend a good PHP developer who is used to large mod_rewrite projects?
Ok, how about this one:
I'd like to redirect
https://mysite.com/index.php?page=oldpage.html
to
https://mysite.com/oldpage.html
Any help is much appreciated!
Billie
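One possible answer, following the redirect pattern discussed earlier in the thread, is sketched below for a root .htaccess file. It only covers the redirect; it assumes that /oldpage.html is then either a real file or is handled by a separate internal rewrite, which the question does not say.

```apache
RewriteEngine On

# Redirect only when the client itself asked for index.php?page=...
# THE_REQUEST holds the original request line, so a later internal
# rewrite back to index.php cannot cause a redirect loop.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /index\.php\?page=([^&\ ]+\.html)\ HTTP/
# %1 is the page name captured by the condition; the trailing ?
# removes the old query string from the redirect target.
RewriteRule ^index\.php$ https://mysite.com/%1? [R=301,L]
```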