robots.txt Generator

100% Free · Client-Side

Generate robots.txt files with advanced controls for AI crawlers, custom user-agents, and sitemap inclusion.

Block AI Scrapers

Prevent the crawlers that collect training data for Large Language Models (LLMs) from scraping your content.

Output File

# Global Rules for all standard crawlers
User-agent: *
Disallow: /admin/
Disallow: /private/
Disallow: /tmp/

# Block AI Crawlers and Scrapers
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: OmgiliBot
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

robots.txt Generator Overview

The `robots.txt` file is your website's first line of communication with web crawlers. A single typo in this file can accidentally block crawlers from your entire site and cause it to drop out of Google's results. Our free generator provides a fail-safe visual interface for constructing these files correctly.

With the rise of Large Language Models, web scraping has become far more aggressive. We included a dedicated "AI Crawler" block module, letting you instantly ask the bots run by OpenAI, Google Gemini, Anthropic, and Perplexity to stop crawling your site. Keep in mind that `robots.txt` is advisory: well-behaved bots honor it, but it cannot physically block access.

Frequently Asked Questions

1. What is a robots.txt file?

The `robots.txt` file is a standard text file placed in the root directory of your website (e.g., `example.com/robots.txt`). It tells crawlers which URLs they may request and which they should stay away from. Note that it governs crawling, not indexing.
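The rules are simple `User-agent` / `Disallow` groups, and you can check how a compliant crawler interprets them with Python's standard-library `urllib.robotparser`. A minimal sketch (the domain and paths are placeholders):

```python
from urllib import robotparser

# A small rule set matching the generator's global section
rules = """\
User-agent: *
Disallow: /admin/
Disallow: /private/
""".splitlines()

parser = robotparser.RobotFileParser()
parser.parse(rules)

# Any URL under /admin/ is off-limits to every agent
print(parser.can_fetch("*", "https://example.com/admin/login"))  # False
# Unlisted paths are crawlable by default
print(parser.can_fetch("*", "https://example.com/blog/post-1"))  # True
```

An empty match falls through to "allowed", which is why a `robots.txt` only needs to list the paths you want excluded.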

2. Why should I block AI bots?

Many organizations block AI bots (like OpenAI's GPTBot or Google-Extended) to prevent their website's proprietary content, articles, or user data from being scraped and used to train Large Language Models (LLMs) without compensation or attribution.
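Because matching is per user-agent, you can block the AI crawlers while leaving search bots untouched. A sketch using Python's `urllib.robotparser` to confirm this behavior (the URL is a placeholder):

```python
from urllib import robotparser

# One group blocks GPTBot everywhere; the catch-all group allows everything
rules = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow:
""".splitlines()

parser = robotparser.RobotFileParser()
parser.parse(rules)

# GPTBot matches its own group and is blocked site-wide
print(parser.can_fetch("GPTBot", "https://example.com/article"))     # False
# Googlebot falls through to the catch-all group, which allows all
print(parser.can_fetch("Googlebot", "https://example.com/article"))  # True
```

This is exactly the structure of the generated file above: specific AI-bot groups with `Disallow: /`, and a separate global group for everyone else.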

3. Does Disallow: / mean my site won't show in Google?

Mostly. If you set the Global Rules to 'Disallow All', you will output `Disallow: /` under `User-agent: *`, which asks all compliant crawlers (like Google and Bing) to stop crawling your entire website. Strictly speaking, `robots.txt` blocks crawling, not indexing: a blocked URL can still appear in search results (without a description) if other sites link to it. For a hard guarantee, use a `noindex` directive or password protection. Only use 'Disallow All' for private staging or development environments.
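For reference, these are the two extremes the Global Rules setting can produce, shown as two alternative files (never combine two `User-agent: *` groups in one file):

```
# Staging/dev only: ask all crawlers to stay out entirely
User-agent: *
Disallow: /

# Production default: an empty Disallow permits everything
User-agent: *
Disallow:
```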

4. Where do I put this file?

Save the downloaded file exactly as `robots.txt` and upload it to the root public directory of your web server so it is accessible at `https://yourdomain.com/robots.txt`.