robots.txt : Java Glossary

go to home page R words local find full screen, hide local find menu Google search web for more information on this topic jump to foot of page translate this page with Babelfish by Roedy Green ©1996-2008 Canadian Mind Products
index page for letter ⇒ punctuation 0-9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z (all)
robots.txt
robots.txt is a file you can place in the root directory of your website to tell web crawlers (search engines) which pages to index and which to ignore. A typical robots.txt file might look like this:
# parts of the mindprod.com website not indexed
user-agent: *
disallow: /template.html
disallow: /include/
disallow: /jgloss/include/
It means, for all browsers, don’t look at the file template.html or anything in the two directories mentioned. There is no way to tell it to avoid certain file extensions.

CMP homejump to top
CMP logo
feedback Please email your feedback for publication, errors, omissions, broken/redirected link reports
and suggestions to improve this page to Roedy Green : feedback email
made with CSS
HTML Checked!
ICRA ratings logo
mindprod.com IP:[65.110.21.43]
Your face IP:[38.103.63.61] The information on this page is for non-military use only.
You are visitor number 6,941. Military use includes use by defence contractors.
You can get a fresh copy of this page from: or possibly from your local J: drive (Java virtual drive/mindprod.com website mirror)
http://mindprod.com/jgloss/robotstxt.html J:\mindprod\jgloss\robotstxt.html