Notes on robots.txt.
A Web Robot (also known as a crawler, spider or search engine bot) is a program that traverses the Web automatically.
One common use is by search engines, which use robots to download and index web content.
A robots.txt file communicates with these robots using the Robots Exclusion Protocol.
It can be used to tell a robot which areas of the site should and should not be crawled.
The robots.txt file needs to be located in the site's root directory.
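Because the file lives in the root directory, its address can be worked out from any page URL on the same site. A minimal sketch in Python, assuming a hypothetical helper name (robots_txt_url) and the placeholder domain example.com, neither of which comes from these notes:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url.

    Hypothetical helper for illustration: it keeps the scheme and host
    and replaces the path, query and fragment with /robots.txt.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # example.com is a placeholder domain, not a site from the notes.
    print(robots_txt_url("https://example.com/blog/post?id=1"))
```

A robots.txt placed in a subdirectory (for example /blog/robots.txt) would not be found this way, which matches the rule that the file belongs at the root.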
The robots.txt file below communicates that:
- all bots are allowed to crawl the entire site, and
- the site's sitemap file is located at
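As a rough illustration of how a crawler might read two rules of this kind, the sketch below uses Python's standard urllib.robotparser module. The rules text and the example.com URLs are hypothetical placeholders chosen for the example; they are not the file or the sitemap address these notes refer to.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for illustration only).
# An empty Disallow value means nothing is disallowed for the matched bots.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


if __name__ == "__main__":
    rules = load_rules(HYPOTHETICAL_ROBOTS_TXT)
    # May any bot ("*") fetch an arbitrary page on the site?
    print(rules.can_fetch("*", "https://example.com/any/page.html"))
    # Sitemap URLs listed in the file (site_maps() needs Python 3.8+).
    print(rules.site_maps())
```

To check a live site instead, RobotFileParser also offers set_url() followed by read(), which downloads the file before can_fetch() is called.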