Search Engine Friendly URL



Most dynamic pages nowadays are accessed via URLs like these:

www.site.com/article.php?id=123 and so on. The part after the question mark (id=123) is the parameter that gets passed to the article.php script.

URLs like these can be simplified if you can recognise a pattern in them. For example, you can make the URL shorter and nicer:

www.site.com/articles/123.html or any other syntax that you prefer.

Apache's mod_rewrite feature can help in this situation by converting the "made up" URL (the one with 123.html) into the real URL (the one with ?id=123) to be fed to your dynamic script. So when someone types

www.yoursite.com/articles/33.html

mod_rewrite will translate it so that Apache treats the request as www.yoursite.com/article.php?id=33 and invokes your article.php script with the right parameter.

As an added advantage, the URL will be shorter to type and easier to remember.

To have mod_rewrite translate the URLs as above, create a text file named .htaccess (yes, this is the full name of the file, and it starts with a dot) if you don't already have one. Then add the following lines to your .htaccess file:

RewriteEngine on
RewriteBase /
RewriteRule ^articles/(.*)\.html article.php?id=$1 [L]

Note that the .htaccess file needs to be uploaded to your web site in the same directory as your index page.

The first part of RewriteRule tells mod_rewrite what to look for in the URL, and the second part is the replacement URL.

The first part is a string called a "regular expression", a syntax that allows for powerful pattern matching. In this example, the regular expression matches "articles/anything.html". The "anything" part is represented by .* which means exactly that: a dot matches any single character, and the asterisk repeats it zero or more times.
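
To see what the pattern does outside Apache, here is a minimal Python sketch. It assumes that Python's re module treats a pattern this simple the same way Apache's regular expression engine does, and the function name translate is made up for illustration. It is not how mod_rewrite works internally. The paths have no leading slash because that is how a rule in .htaccess sees them.

```python
import re

# The rule's pattern and replacement, written as Python strings.
# Python spells the first captured group \1 where mod_rewrite uses $1.
PATTERN = re.compile(r"^articles/(.*)\.html")
REPLACEMENT = r"article.php?id=\1"

def translate(path):
    """Return the rewritten path, or None if the pattern does not match."""
    match = PATTERN.search(path)
    if match is None:
        return None
    # The whole path is replaced by the substitution string.
    return match.expand(REPLACEMENT)

print(translate("articles/123.html"))            # article.php?id=123
print(translate("articles/my-first-post.html"))  # article.php?id=my-first-post
print(translate("about.html"))                   # None
```

Because .* accepts anything, the second call shows that non-numeric names are passed through to the script as well.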

In the second part, notice that the article ID is referred to as $1. This $1 stands for the first memorized (captured) part of the regular expression in the first part. To define which part gets memorized, enclose it in parentheses (). If there are multiple pairs of parentheses, they are assigned to $1, $2, $3 and so on, in order of appearance.
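
The numbering of memorized parts can be tried out with a short Python sketch. The two-part URL layout (articles/33/2.html) and the page parameter are hypothetical and only serve to show a second pair of parentheses. Python's match.group(1) and match.group(2) play the role of $1 and $2.

```python
import re

# Hypothetical two-part URL: articles/<id>/<page>.html
# Each pair of parentheses captures one piece, numbered left to right.
pattern = re.compile(r"^articles/(.*)/(.*)\.html")

match = pattern.search("articles/33/2.html")
article_id = match.group(1)   # what mod_rewrite would call $1
page = match.group(2)         # what mod_rewrite would call $2

rewritten = "article.php?id={}&page={}".format(article_id, page)
print(rewritten)  # article.php?id=33&page=2
```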

Also notice that the "real" dot before html is escaped (preceded) with the backslash character "\". In a regular expression the dot is a special character that matches any character, so to match an actual dot you must escape it by writing \. instead.

To fully understand regular expressions, you will need to read up on them. For simple tasks, however, the above example plus some tweaking and trial and error will get you there too.
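
The difference the backslash makes can be checked with a small Python sketch, again assuming Python's re module agrees with Apache for a pattern this simple. The test path articles/123xhtml is made up. It has no dot before "html", so only the unescaped pattern accepts it.

```python
import re

unescaped = re.compile(r"^articles/(.*).html")   # "." matches any character
escaped = re.compile(r"^articles/(.*)\.html")    # "\." matches only a real dot

path = "articles/123xhtml"   # no real dot before "html"

print(bool(unescaped.search(path)))                # True
print(bool(escaped.search(path)))                  # False
print(bool(escaped.search("articles/123.html")))   # True
```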

 

Testing

Once URL rewriting has been set up as above, you can try it by typing www.yoursite.com/articles/123.html into your browser. If it works, you will see the same page as www.yoursite.com/article.php?id=123.

 

The Next Step

The next step is to change the links on your pages and scripts so that they point to /articles/123.html instead of /article.php?id=123. This may require some editing of the script or template.
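
If the old links are spread over many static pages or templates, a one-off search and replace can save some typing. The following Python sketch is only an illustration. The function name update_links and the sample HTML are made up, and it assumes that article IDs are purely numeric. Links that your script builds at run time still need to be changed in the script itself.

```python
import re

# Matches old-style links such as article.php?id=123 and captures the ID.
# Assumes numeric IDs; adjust the pattern if yours are not.
OLD_LINK = re.compile(r"\barticle\.php\?id=([0-9]+)")

def update_links(html):
    """Replace every old-style article link with its friendly form."""
    return OLD_LINK.sub(r"articles/\1.html", html)

sample = '<a href="/article.php?id=123">Read more</a>'
print(update_links(sample))  # <a href="/articles/123.html">Read more</a>
```

Keep a backup of the original files before running something like this over them, and check a few pages by hand afterwards.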

 

References:
Apache URL Rewriting Guide
Apache mod_rewrite Reference
Regular Expression Tutorial



Article Details
Article ID: 61
Created On: 09 Nov 2003 06:00 AM