

robots.txt
Status: decay
A Web Robot (also known as a crawler, spider or search engine bot) is a program that traverses the Web automatically. One common use is by search engines, which deploy robots to download and index web content.
A website's robots.txt file communicates with these robots using the Robots Exclusion Protocol. It can be used to inform a robot which areas of the site should and shouldn't be scanned.
A robots.txt file must be located in the site's root directory.
Example
The robots.txt file below communicates:
- OpenAI, Google Bard and Common Crawl bots are blocked from crawling,
- all other bots are allowed to crawl the entire site, and
- the site's sitemap file is located at https://garden.bradwoods.io/sitemap.xml.
/robots.txt
```
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /

Sitemap: https://garden.bradwoods.io/sitemap.xml
```
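A well-behaved robot reads these rules before fetching a page. As a minimal sketch of how the rules above are interpreted, the snippet below uses Python's standard-library `urllib.robotparser` to check whether a given user agent may fetch a URL (the rules are passed in directly here, so no network access is needed; the URL is just the example site from above):

```python
from urllib.robotparser import RobotFileParser

# A subset of the rules from the example above.
rules = [
    "User-agent: GPTBot",
    "Disallow: /",
    "",
    "User-agent: *",
    "Allow: /",
]

parser = RobotFileParser()
parser.parse(rules)

# GPTBot matches its own group, which disallows everything.
print(parser.can_fetch("GPTBot", "https://garden.bradwoods.io/notes"))  # False

# Any other bot falls through to the wildcard group, which allows everything.
print(parser.can_fetch("SomeOtherBot", "https://garden.bradwoods.io/notes"))  # True
```

In a real crawler you would call `parser.set_url("https://garden.bradwoods.io/robots.txt")` followed by `parser.read()` to fetch the live file instead of supplying the lines yourself.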