What is Robots.txt and why should I use it?
July 31st, 2007 by Dustin Brewer
I am here to answer just that question. From talking with fellow web developers, I have gathered that there is a lot of mystery about the benefits of using a robots.txt file. Most are under the false impression that its only purpose is to tell search robots not to index a certain section of a site. This isn’t the case at all! There are many more useful reasons to use a robots.txt file, closely tied to Search Engine Optimization (SEO), that most people don’t realize.
What is this thing!?
What is robots.txt? Quite simply, it is a text file you place in the root directory of your site, so that it is reachable at /robots.txt, and that search robots look for when crawling your site. When the Google or Yahoo robot visits your domain, the first thing it does is look for your robots.txt file to see which parts of the site it is excluded from or allowed into. You can even specify which robots you want to search your site and which you don’t. This can be handy for turning away known spambots that scrape content or email addresses from your site. Not all of them listen, since obeying the file is voluntary, but those that do can be kicked to the curb.
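Because robots look for the file at one fixed location on each host, the address they request can be worked out from any page URL. Here is a minimal sketch of that step in Python; the function name robots_url and the example.com address are made up for illustration and do not come from any particular crawler.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page address, used only for illustration.
print(robots_url("https://www.example.com/blog/2007/07/post.html?x=1"))
# prints https://www.example.com/robots.txt
```

A robots.txt file sitting in a subdirectory would not be found this way, which is why it belongs at the root.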
The Robot shuffle
This will disallow these robots from indexing your site
User-agent: MeanBot
User-agent: BadBot
Disallow: /
This will ban all robots from indexing your site
User-agent: *
Disallow: /
This will ban these directories from being indexed by all bots
User-agent: *
Disallow: /private/
Disallow: /usr/
Disallow: /tmp/
This will ban BadBot from the tmp directory
User-agent: BadBot
Disallow: /tmp/
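If you want to see how a robot might read rules like these, you can try them out with the robots.txt parser in Python's standard library (urllib.robotparser). This is only a sketch: the rules combine two of the examples above into one file, GoodBot and the page paths are made-up names, and Python's parser is just one implementation, so real robots may behave differently.

```python
from urllib.robotparser import RobotFileParser

# One file combining two of the examples above.
RULES = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
Disallow: /tmp/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(RULES)

# GoodBot and the paths below are hypothetical, for illustration only.
checks = [
    ("BadBot", "/index.html"),
    ("GoodBot", "/index.html"),
    ("GoodBot", "/private/notes.html"),
    ("GoodBot", "/tmp/cache.html"),
]
for agent, path in checks:
    verdict = "allowed" if parser.can_fetch(agent, path) else "blocked"
    print(f"{agent:8} {path:22} {verdict}")
```

With this parser, BadBot is blocked everywhere, while GoodBot is blocked only from /private/ and /tmp/.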
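I’m sure you get the basic idea. It is a good habit to list the rules for specific bots first and the catch-all * group last. Robots that follow the standard obey the group that names them and fall back to the * group only when no group does, so for them the order does not matter. A simplistic parser that reads from top to bottom and stops at the first match, however, could apply the catch-all rules to a bot you want to index your site. Keep in mind that a standard-following bot that finds a group of its own ignores the * group entirely, so repeat any shared rules there. You can have as many of these groups and rules in your robots.txt file as you wish.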
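One way to check how ordering plays out is to feed the same rules to a parser in both orders. The sketch below does that with Python's urllib.robotparser, which keeps the * group as a fallback wherever it appears in the file. That is the behavior of this one parser, not a guarantee about every robot, and GoodBot, OtherBot and the page path are made-up names.

```python
from urllib.robotparser import RobotFileParser

# Same two groups, written in opposite orders.
SPECIFIC_FIRST = """\
User-agent: GoodBot
Disallow:

User-agent: *
Disallow: /
"""

WILDCARD_FIRST = """\
User-agent: *
Disallow: /

User-agent: GoodBot
Disallow:
"""


def allowed(rules: str, agent: str, path: str) -> bool:
    """Return True if this parser lets `agent` fetch `path` under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


for label, rules in [("specific first", SPECIFIC_FIRST),
                     ("wildcard first", WILDCARD_FIRST)]:
    # GoodBot and OtherBot are hypothetical robot names.
    print(label,
          "GoodBot:", allowed(rules, "GoodBot", "/page.html"),
          "OtherBot:", allowed(rules, "OtherBot", "/page.html"))
```

Both orders give the same answer here: GoodBot is let in and every other bot is kept out. A parser that stopped at the first matching group would not necessarily agree, which is the reason to keep the catch-all group last anyway.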
So that is pretty much it for robots.txt. I think I have covered just about everything you could want to know about this handy little text file. If you are worried about possible issues with your robots.txt file, there is a validator here that can help you ensure that the bots will read it properly. For those curious, here is a list of bots and their descriptions.