robots.txt Generator
Generate robots.txt files with advanced controls for AI crawlers, custom user-agents, and sitemap inclusion.
Output File
```
# Global Rules for all standard crawlers
User-agent: *
Disallow: /admin/
Disallow: /private/
Disallow: /tmp/

# Block AI Crawlers and Scrapers
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /
```
robots.txt Generator Overview
The `robots.txt` file is your website's first line of communication with web crawlers. A single typo in this file can accidentally block search engines from crawling your entire site. Our free generator provides a fail-safe visual interface for constructing these files correctly.
With the rise of Large Language Models, web scraping has become far more aggressive. We include a dedicated "AI Crawler" block module that lets you instantly cut off bots from OpenAI, Google Gemini, Anthropic, and Perplexity, protecting your proprietary content.
Frequently Asked Questions
1. What is a robots.txt file?
The `robots.txt` file is a standard text file placed in the root directory of your website (e.g., `example.com/robots.txt`). It tells search engine crawlers which URLs they are allowed to access and index, and which ones they should ignore.
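A minimal example (the domain and sitemap URL here are placeholders): each `User-agent` line names a crawler, and the `Disallow` lines beneath it list path prefixes that crawler should not fetch.

```
# Applies to all crawlers
User-agent: *
Disallow: /admin/

# Optional: tell crawlers where to find your sitemap
Sitemap: https://example.com/sitemap.xml
```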
2. Why should I block AI bots?
Many organizations block AI bots (like OpenAI's GPTBot or Google-Extended) to prevent their website's proprietary content, articles, or user data from being scraped and used to train Large Language Models (LLMs) without compensation or attribution.
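To sanity-check how crawlers will interpret rules like these, you can feed the generated file to Python's standard-library `urllib.robotparser`. This sketch uses a hypothetical rule set in which `GPTBot` is fully blocked while other crawlers are only kept out of `/admin/`:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, similar to what the generator produces
rules = """\
User-agent: *
Disallow: /admin/

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# GPTBot is blocked everywhere; Googlebot falls back to the * group
print(parser.can_fetch("GPTBot", "https://example.com/article"))     # False
print(parser.can_fetch("Googlebot", "https://example.com/article"))  # True
print(parser.can_fetch("Googlebot", "https://example.com/admin/"))   # False
```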
3. Does Disallow: / mean my site won't show in Google?
Mostly, yes. If you set the Global Rules to 'Disallow All', the generator outputs `Disallow: /` under `User-agent: *`, which asks all well-behaved crawlers (like Google and Bing) not to crawl any page on your site. Note that `robots.txt` controls crawling, not indexing: a blocked URL can still appear in search results (without a snippet) if other sites link to it. To guarantee pages stay out of search, use a `noindex` directive or password protection instead. Only use 'Disallow All' for private staging or development environments.
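For reference, the entire 'Disallow All' output is just two lines:

```
# Staging/dev only: ask every crawler to stay out
User-agent: *
Disallow: /
```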
4. Where do I put this file?
Save the downloaded file exactly as `robots.txt` and upload it to the root public directory of your web server so it is accessible at `https://yourdomain.com/robots.txt`.