What is robots.txt?
robots.txt is a plain-text file at the root of a website that tells search engine crawlers and AI bots which pages they may or may not access. Served at yoursite.com/robots.txt, it groups rules by crawler with User-agent lines, then uses Allow and Disallow directives to open or close specific paths to that crawler.
The file is a set of instructions, not an enforcement mechanism: well-behaved crawlers like Googlebot respect it, but it cannot technically stop a crawler from requesting a page. It is commonly used to keep crawlers out of admin pages, duplicate content, or sections that should not appear in search.
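As a rough illustration of how a well-behaved crawler consults these rules before fetching a page, the sketch below uses Python's standard urllib.robotparser module. The robots.txt content, the yoursite.com URLs, and the helper name is_allowed are assumptions made up for this example, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every crawler is asked to stay out of /admin/.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://yoursite.com/blog/launch",
        "https://yoursite.com/admin/settings",
    ):
        print(url, is_allowed(EXAMPLE_ROBOTS_TXT, "Googlebot", url))
```

The check is voluntary. A crawler that skips it can still request /admin/settings, which is why robots.txt should not be relied on to protect private pages.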
A misconfigured robots.txt is a common and damaging SEO mistake. A single overly broad rule, such as "Disallow: /" applied to all crawlers, asks search engines to stay away from the entire site, and rankings can disappear once pages stop being crawled.
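To make the difference concrete, the following sketch compares a narrow rule with an overly broad one using Python's standard urllib.robotparser module. Both rule sets, the sample paths, and the function name blocked_paths are hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Intended rule: keep crawlers out of the admin area only.
NARROW_RULES = """\
User-agent: *
Disallow: /admin/
"""

# Overly broad rule: "/" is a prefix of every path, so everything is disallowed.
BROAD_RULES = """\
User-agent: *
Disallow: /
"""

# Hypothetical paths standing in for a small site.
SAMPLE_PATHS = ["/", "/pricing", "/blog/launch", "/admin/settings"]


def blocked_paths(
    robots_txt: str, paths: list[str], user_agent: str = "Googlebot"
) -> list[str]:
    """Return the subset of paths that the rules disallow for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(user_agent, path)]


if __name__ == "__main__":
    print("narrow:", blocked_paths(NARROW_RULES, SAMPLE_PATHS))
    print("broad: ", blocked_paths(BROAD_RULES, SAMPLE_PATHS))
```

With the narrow rules only the admin path is blocked; with the broad rules every sample path is. A check like this could run before deploying a new robots.txt, though the paths worth testing depend on the site.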
Why it matters
One wrong line in robots.txt can block your whole site from being indexed, undoing months of SEO work. Founders who do not check it can unknowingly make themselves invisible to Google.
With AI search rising, robots.txt also signals whether AI crawlers may fetch your content. Blocking them makes it unlikely that AI assistants will cite your pages; allowing the right ones is now part of being discoverable.
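Per-crawler rules are what make this selective. The sketch below, again using Python's standard urllib.robotparser module, shows one possible setup in which a single bot is shut out while other crawlers keep access. GPTBot is used as an example of a published AI crawler token; ExampleScraperBot, the rule set, the URL, and the function name crawler_access are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: block one unwanted bot entirely and let every other
# crawler (search engines and AI bots alike) read everything except /admin/.
RULES = """\
User-agent: ExampleScraperBot
Disallow: /

User-agent: *
Disallow: /admin/
"""


def crawler_access(
    robots_txt: str, user_agents: list[str], url: str
) -> dict[str, bool]:
    """Map each user agent to whether the rules allow it to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in user_agents}


if __name__ == "__main__":
    agents = ["Googlebot", "GPTBot", "ExampleScraperBot"]
    print(crawler_access(RULES, agents, "https://yoursite.com/blog/launch"))
```

A crawler with its own User-agent group follows only that group, so any exclusions from the wildcard group that should still apply to it need to be repeated there.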
How Distro helps
Distro's technical checks review your robots.txt for blocks that hurt search and AI visibility, flagging fixes as missions. Run your free growth report to catch crawler issues before they cost you traffic.
Related terms
llms.txt
llms.txt is a text file placed at the root of a website that describes the site's content in a format optimized for large language models to read and cite.
XML Sitemap
An XML sitemap is a file that lists the pages on a website you want indexed, helping search engines discover and crawl content more efficiently.