Robots.txt Generator

Control how search engines crawl your website with a properly formatted robots.txt file.

100% Private & Secure

Your robots.txt is generated locally in your browser. No data is sent to our servers.


⚠️ Important Notes

  • Place this file at the root: yoursite.com/robots.txt
  • Always test using Google Search Console before deploying.
  • Blocking all bots can severely impact your SEO visibility.
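
For reference, the "block all bots" configuration mentioned in the last note is just two lines; avoid deploying it unless you genuinely want to hide the entire site from compliant crawlers:

    # Applies to every crawler
    User-agent: *
    # "/" blocks the whole site; an empty value after "Disallow:" would allow everything
    Disallow: /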

Common User-Agents

  • * (All Bots)
  • Googlebot (Google)
  • Bingbot (Bing)
  • GPTBot (OpenAI)
  • CCBot (Common Crawl)
  • Slurp (Yahoo)
  • DuckDuckBot (DuckDuckGo)
  • Yandex (Yandex)

Robots.txt: The Complete Guide

What is robots.txt?

A robots.txt file is a text file that tells web robots (most often search engine crawlers) which pages on your site to crawl and which not to crawl. It is part of the Robots Exclusion Protocol (REP).

Why do you need it?

While not strictly mandatory, a robots.txt file is crucial for managing your "crawl budget" (the number of pages a search engine bot crawls on your site). It prevents bots from wasting time on duplicate or unimportant pages (like admin panels or search results) and ensures they focus on your valuable content.
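
As a sketch, a minimal file for this purpose (the /admin/ and /search paths are placeholders; use the paths that actually exist on your site) could be:

    # Rules for every crawler
    User-agent: *
    # Keep bots out of the admin panel
    Disallow: /admin/
    # Keep bots out of internal search result pages
    Disallow: /search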

Key Directives

  • User-agent: Identifies which crawler the rule applies to (e.g., * for all, Googlebot for Google).
  • Disallow: Tells the crawler NOT to access a specific path.
  • Allow: Overrides a Disallow directive for a specific sub-path (supported by major crawlers such as Googlebot and Bingbot).
  • Sitemap: Points crawlers to your XML sitemap location.
  • Crawl-delay: Tells bots to wait between requests (ignored by Google).
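
Putting these directives together, a typical file might look like the sketch below (the paths and sitemap URL are illustrative):

    # Rules for all crawlers
    User-agent: *
    # Block the /private/ directory...
    Disallow: /private/
    # ...but allow one page inside it
    Allow: /private/annual-report.html
    # Ask crawlers that honor this directive (not Google) to pause between requests
    Crawl-delay: 10
    # Absolute URL of the XML sitemap; can appear anywhere in the file
    Sitemap: https://www.example.com/sitemap.xml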

Blocking AI Bots

You can use robots.txt to prevent AI crawlers like GPTBot (OpenAI) or CCBot (Common Crawl) from using your content to train their models. Our tool includes a preset for this.
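
A hand-written equivalent of such a preset, targeting GPTBot and CCBot, looks like this (other AI user-agents can be added as extra groups):

    # Block OpenAI's training crawler from the whole site
    User-agent: GPTBot
    Disallow: /

    # Block Common Crawl's crawler from the whole site
    User-agent: CCBot
    Disallow: /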

Robots.txt Glossary

Crawler / Bot / Spider
Automated software used by search engines to discover and index web pages. Examples: Googlebot, Bingbot.
Crawl Budget
The number of pages a search engine bot will crawl on your site within a given timeframe. Optimizing this is key for large sites.
Wildcard (*)
A symbol used to represent "any" sequence of characters. Used in User-agent: * to apply rules to all bots (see the example after this glossary).
Root Directory
The top-level folder of your website. The robots.txt file MUST be placed here (e.g., example.com/robots.txt) to be valid.
Directives
Instructions given to bots in the robots.txt file, such as Allow, Disallow, and Sitemap.
Indexing vs. Crawling
Crawling is looking at the page. Indexing is storing it to show in search results. Robots.txt controls crawling, NOT indexing. To prevent indexing, use a 'noindex' meta tag.
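
As noted in the Wildcard entry above, major crawlers such as Googlebot and Bingbot also accept * inside paths, and $ to anchor a pattern to the end of a URL. The patterns below are illustrative:

    User-agent: *
    # Block any URL containing a session ID query parameter
    Disallow: /*?sessionid=
    # Block all PDF files ($ means "end of URL")
    Disallow: /*.pdf$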

Frequently Asked Questions about Robots.txt Generator

What is a robots.txt file?
robots.txt is a text file that tells web robots (most often search engines) which pages on your site to crawl and which not to crawl.
