
Diffbot Crawler Simulator
Test how Diffbot's web-scraping crawler sees your website, and understand how it extracts structured data.
What is Diffbot?
Diffbot is a web scraping and data extraction company that uses AI to understand and structure web content. Its crawler automatically extracts articles, products, events, and other structured data from websites. Many businesses use Diffbot for competitive intelligence, content aggregation, and data services.
Why Allow Diffbot?
Diffbot robots.txt Configuration
Control how Diffbot accesses your website using robots.txt directives. Add these rules to your robots.txt file at the root of your domain.
Allow Diffbot
# Allow Diffbot
User-agent: Diffbot
Allow: /
Block Diffbot
# Block Diffbot
User-agent: Diffbot
Disallow: /
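To check locally how rules like these are interpreted, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It shows how that parser reads the two rule sets, which may differ in edge cases from how Diffbot's own crawler applies them. The helper name `diffbot_allowed` and the `example.com` URL are hypothetical, chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser


def diffbot_allowed(robots_txt: str, url: str, agent: str = "Diffbot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text.

    Hypothetical helper: it reflects Python's parser, not Diffbot's crawler.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The two rule sets from this page, one directive per line.
ALLOW_RULES = """\
# Allow Diffbot
User-agent: Diffbot
Allow: /
"""

BLOCK_RULES = """\
# Block Diffbot
User-agent: Diffbot
Disallow: /
"""

if __name__ == "__main__":
    page = "https://example.com/products/1"  # placeholder URL
    print("allow rules:", diffbot_allowed(ALLOW_RULES, page))
    print("block rules:", diffbot_allowed(BLOCK_RULES, page))
```

Because both rule sets name only the `Diffbot` user agent, other crawlers are unaffected by them; in this sketch, a different agent name is still allowed under the block rules.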
User-Agent String: Diffbot/1.0 (+http://www.diffbot.com)
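If you want to spot Diffbot requests in your own server logs, one simple approach is to match on the user-agent string above. The sketch below is a heuristic only: the function name `is_diffbot` is hypothetical, and because any client can send an arbitrary User-Agent header, a match does not prove the request came from Diffbot.

```python
def is_diffbot(user_agent: str) -> bool:
    """Heuristic: True if the User-Agent header mentions Diffbot.

    Case-insensitive substring match; the header can be spoofed,
    so treat the result as a hint rather than verification.
    """
    return "diffbot" in user_agent.lower()


if __name__ == "__main__":
    samples = [
        "Diffbot/1.0 (+http://www.diffbot.com)",  # string listed on this page
        "Mozilla/5.0 (X11; Linux x86_64)",        # ordinary browser-style UA
    ]
    for ua in samples:
        print(is_diffbot(ua), ua)
```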
Diffbot FAQ
Make Your Site Crawlable
JavaScript websites often have indexing issues. LovableHTML pre-renders your SPA into crawler-friendly HTML so Diffbot and other bots can read your content.
