robots.txt Generator
Generate robots.txt files with advanced controls for AI crawlers, custom user-agents, and sitemap inclusion.
Output File
```
# Global Rules for all standard crawlers
User-agent: *
Disallow: /admin/
Disallow: /private/
Disallow: /tmp/

# Block AI Crawlers and Scrapers
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /
```
robots.txt Generator Overview
The `robots.txt` file is your website's first line of communication with web crawlers. A single typo in this file can accidentally block search engines from crawling your entire site. Our free generator provides a fail-safe visual interface for constructing these files correctly.
With the rise of Large Language Models, web scraping has become far more aggressive. We include a dedicated "AI Crawler" block module that lets you instantly cut off bots from OpenAI, Google Gemini, Anthropic, and Perplexity, protecting your proprietary content.
Frequently Asked Questions
1. What is a robots.txt file?
The `robots.txt` file is a standard text file placed in the root directory of your website (e.g., `example.com/robots.txt`). It tells search engine crawlers which URLs they are allowed to access and index, and which ones they should ignore.
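A minimal example (the domain and sitemap URL here are placeholders): each `User-agent` line names a crawler, and the `Disallow` lines beneath it list path prefixes that crawler should not fetch.

```
# Applies to all crawlers
User-agent: *
Disallow: /admin/

# Optional: tell crawlers where to find your sitemap
Sitemap: https://example.com/sitemap.xml
```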
2. Why should I block AI bots?
Many organizations block AI bots (like OpenAI's GPTBot or Google-Extended) to prevent their website's proprietary content, articles, or user data from being scraped and used to train Large Language Models (LLMs) without compensation or attribution.
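To sanity-check how crawlers will interpret rules like these, you can feed the generated file to Python's standard-library `urllib.robotparser`. This sketch uses a hypothetical rule set in which `GPTBot` is fully blocked while other crawlers are only kept out of `/admin/`:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, similar to what the generator produces
rules = """\
User-agent: *
Disallow: /admin/

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# GPTBot is blocked everywhere; Googlebot falls back to the * group
print(parser.can_fetch("GPTBot", "https://example.com/article"))     # False
print(parser.can_fetch("Googlebot", "https://example.com/article"))  # True
print(parser.can_fetch("Googlebot", "https://example.com/admin/"))   # False
```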
3. Does Disallow: / mean my site won't show in Google?
Mostly, yes. If you set the Global Rules to 'Disallow All', the generator outputs `Disallow: /` under `User-agent: *`, which asks all well-behaved crawlers (like Google and Bing) not to crawl any page on your site. Note that `robots.txt` controls crawling, not indexing: a blocked URL can still appear in search results (without a snippet) if other sites link to it. To guarantee pages stay out of search, use a `noindex` directive or password protection instead. Only use 'Disallow All' for private staging or development environments.
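For reference, the entire 'Disallow All' output is just two lines:

```
# Staging/dev only: ask every crawler to stay out
User-agent: *
Disallow: /
```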
4. Where do I put this file?
Save the downloaded file exactly as `robots.txt` and upload it to the root public directory of your web server so it is accessible at `https://yourdomain.com/robots.txt`.