
robots.txt
Planted:
Status: decay
A Web Robot (also known as a crawler, spider or search engine bot) is a program that traverses the Web automatically.
Search engines, for example, use them to download and index web content.
A website's robots.txt file communicates with these robots using the Robots Exclusion Protocol.
It can be used to inform the robot which areas of the site should and shouldn't be scanned.
A robots.txt file needs to be located in the site's root directory.
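For instance, a site that wanted to keep all bots out of a private area could use rules like the sketch below (the /private/ path is hypothetical, not part of this site):

```
User-agent: *
Disallow: /private/
```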
Example
The robots.txt file below communicates:
- all bots are allowed to crawl the entire site and
- the site's sitemap file is located at https://garden.bradwoods.io/sitemap.xml.
/robots.txt
User-agent: *
Allow: /
Sitemap: https://garden.bradwoods.io/sitemap.xml
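A compliant crawler reads this file before fetching pages. As a rough sketch (the "ExampleBot" user-agent name is made up for illustration), Python's standard-library urllib.robotparser can check whether a URL may be crawled and read the declared sitemap:

```python
from urllib import robotparser

# Fetch and parse the site's robots.txt (Robots Exclusion Protocol rules).
parser = robotparser.RobotFileParser()
parser.set_url("https://garden.bradwoods.io/robots.txt")
parser.read()

# "ExampleBot" is a placeholder user-agent name.
# Returns True because the file above allows all bots to crawl everything.
print(parser.can_fetch("ExampleBot", "https://garden.bradwoods.io/"))

# List any Sitemap entries declared in the file (Python 3.8+).
print(parser.site_maps())
```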