Your robots.txt isn’t safe from the SERPs
October 15th, 2007 by Dustin Brewer
It appears that your robots.txt file, which tells search engines what to crawl and what not to crawl, is itself being indexed by Google. This means the paths of all those pages and files you don’t want crawled are visible to the general public, who only need to do a Google search for the robots.txt file. Oddly enough, these files are even getting PageRank. The top robots.txt files are those of whitehouse.gov, Google and Wikipedia. So why does this matter? If you have pages that you don’t want indexed by Google, they could be sensitive pages that you don’t want anyone outside your company to see. The robots.txt file gives out that information freely, not only in the SERPs but also to anyone with malicious intent who checks your robots.txt file directly for pages that may be of interest.
You can check this yourself by doing a search like “robots.txt filetype:txt” and browsing the results, which are the robots.txt files of many different sites.
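To make the exposure concrete, here is a minimal Python sketch of how little effort it takes to pull the "hidden" paths out of a robots.txt file once someone has its text. The function name `disallowed_paths` and the sample file content are made up for illustration; they don't come from any real site.

```python
from typing import List


def disallowed_paths(robots_txt: str) -> List[str]:
    """Return every non-empty Disallow path found in a robots.txt body."""
    paths = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        # An empty Disallow value blocks nothing, so skip it.
        if field.strip().lower() == "disallow" and value:
            paths.append(value)
    return paths


# Hypothetical robots.txt content, invented for this example.
SAMPLE = """\
User-agent: *
Disallow: /admin/
Disallow: /internal/reports/   # staff only
Disallow:
Allow: /public/
"""

if __name__ == "__main__":
    for path in disallowed_paths(SAMPLE):
        print(path)
```

Every path it prints is one the site owner asked crawlers to stay away from, which is exactly the list a curious visitor would want.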
The best way to avoid this kind of trouble is to set up your site with an .htaccess file that restricts these pages from being viewed without a password. On a typical Apache setup the .htaccess file itself is not served to visitors, so it can’t be indexed by search engines.
This just serves as a reminder for those who don’t realize how much the contents of a robots.txt file can expose. Also, the robots.txt file isn’t always in charge. Bad bots don’t bother looking at your robots.txt file to see if they are “allowed” to go to certain pages; they just go through your site anyway. I wouldn’t be surprised if some of them even go straight for the disallowed pages, with their creators hoping to find what they are looking for more quickly.
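The difference between a polite crawler and a bad bot can be sketched in a few lines of Python using the standard library's `urllib.robotparser`. The rules, the `ExampleBot` user agent, the example.com URLs and both function names are assumptions made up for this illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, invented for this example.
RULES = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def polite_may_fetch(url: str, user_agent: str = "ExampleBot") -> bool:
    """A well-behaved crawler asks the parsed rules before fetching."""
    return parser.can_fetch(user_agent, url)


def rude_may_fetch(url: str) -> bool:
    """A misbehaving crawler never consults the rules at all."""
    return True


if __name__ == "__main__":
    url = "https://example.com/private/report.html"
    print("polite bot fetches:", polite_may_fetch(url))
    print("rude bot fetches:", rude_may_fetch(url))
```

Nothing in the file stops the second function from existing. The rules only take effect when the crawler chooses to check them.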

October 15th, 2007
Yeah I have seen some people do some interesting things in a robots.txt file.
October 15th, 2007
Wow, you finally figured this out, you idiot. Robots.txt is a guide, not a rule. Oh, another stupid thing most people do is add
Disallow: companysecrets.doc
October 15th, 2007
I never thought of it like that