The SEO Files

Search Engine Optimisation

Robots.txt Tutorial

What is a Robots.txt file?

OK, so what is this robots.txt I am hearing all about? Good question, glad you asked. The robots.txt file is a simple text file (named robots.txt, would you believe) that you place in the root directory of your site, so that it can be reached at www.yourdomain.com/robots.txt.
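Because the file always lives in the root, its address can be worked out from any page on the site. Here is a minimal Python sketch of that idea; the helper name robots_url and the example page path are made up for illustration.

```python
from urllib.parse import urljoin


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    # The leading slash makes urljoin drop the page's own path and query,
    # keeping only the scheme and host in front of /robots.txt.
    return urljoin(page_url, "/robots.txt")


# Hypothetical page deep inside the site.
print(robots_url("http://www.yourdomain.com/members/profile.html?id=7"))
```

Whichever page you start from, the result points at the same single file in the root.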

Why you should use a robots.txt file

The robots.txt file follows a protocol developed so that search engines would know whether or not any particular web site should be indexed, or whether any parts of it (a members area, for example) should be ignored. The major search engines adhere to the robots.txt protocol and will always look for a robots.txt file first. If your site does not have one, you may have noticed 404 errors where search engines have been looking for this file.

How to use your robots.txt file

Firstly, just create a file with any text editor and call it robots.txt. Leave it blank for now; a blank robots.txt file is OK to use. Upload it to your root directory. For example, look at the robots.txt file for this site. A blank file like this is perfectly adequate. Search engines will come and look at the file, see that there is no blocking rule for them, and go on to spider the rest of your site.

Example Uses

User-agent: *
Disallow: /directory/
Disallow: /foo.html

This example will disallow all user agents (the * matches any of them) from a directory called directory and anything in that directory, and from a file called foo.html. Anywhere else on the site is allowed to be indexed.
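If you want to check rules like these before uploading them, Python's standard urllib.robotparser module can read them and answer "may this bot fetch this URL?". The sketch below is only one way of testing, not how any particular search engine works; the bot name ExampleBot and the page paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above, one directive per line.
RULES = """\
User-agent: *
Disallow: /directory/
Disallow: /foo.html
"""


def allowed(rules: str, agent: str, url: str) -> bool:
    """Return True if the given rules let this user agent fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


SITE = "http://www.yourdomain.com"
# ExampleBot is a hypothetical crawler name; the * rule applies to it.
for path in ("/directory/page.html", "/foo.html", "/index.html"):
    print(path, allowed(RULES, "ExampleBot", SITE + path))
```

The first two paths should come back as blocked and the last one as allowed, which matches the description above.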

User-agent: *
Disallow:

The example above shows that all user agents have no files or directories disallowed, therefore any user agent can visit any part of the site.
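An empty Disallow line should therefore behave the same as the blank file described earlier. A small Python sketch using the standard urllib.robotparser module can confirm that, at least for this parser; the bot name ExampleBot and the members page are hypothetical.

```python
from urllib.robotparser import RobotFileParser

ALLOW_ALL = """\
User-agent: *
Disallow:
"""
BLANK = ""  # an empty robots.txt file


def can_visit(rules: str, url: str) -> bool:
    """Return True if a hypothetical ExampleBot may fetch url under rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch("ExampleBot", url)


PAGE = "http://www.yourdomain.com/members/index.html"
print("empty Disallow:", can_visit(ALLOW_ALL, PAGE))
print("blank file:", can_visit(BLANK, PAGE))
```

Both checks should report that the page may be visited.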

User-agent: Microsoft URL Control
Disallow: /

The above example shows a user agent called Microsoft URL Control being excluded from viewing the complete site. Other user agents are not affected by this rule.
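To see that a named rule like this only affects the agent it names, here is a short Python sketch with the standard urllib.robotparser module. It assumes the visitor identifies itself with exactly the name in the rule; ExampleBot is a made-up name standing in for any other crawler.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: Microsoft URL Control
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

HOME = "http://www.yourdomain.com/index.html"
# The named agent is shut out of the whole site...
blocked = parser.can_fetch("Microsoft URL Control", HOME)
# ...while a hypothetical other crawler has no rule against it.
other = parser.can_fetch("ExampleBot", HOME)

print("Microsoft URL Control allowed:", blocked)
print("ExampleBot allowed:", other)
```

Keep in mind that this only works for agents that choose to follow the robots.txt protocol.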

For more information on robots.txt go to the robots.txt site.