AI that reads websites and structures them into facts
Diffbot builds AI models that read websites and structure them into facts.
Unlike flat data dumps from traditional web scraping, facts structured by Diffbot are linked to each other like nodes in a graph. For example, the "Diffbot" mentioned in an MIT Technology Review article is resolved to the same "Diffbot" entity that appears on LinkedIn or diffbot.com.
This graph structure is key to enabling intelligent AI applications like the Knowledge Graph and GraphRAG, but most people simply know us for Extract API, the automatic web scraper that doesn't use any rules.
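To make the difference between a flat dump and linked facts concrete, here is a minimal Python sketch of the idea. It is not Diffbot's implementation: the `Entity` and `FactGraph` names are hypothetical, and matching on a lowercased name is a deliberately naive stand-in for real entity resolution.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One node in the graph (hypothetical structure, for illustration only)."""
    entity_id: str
    name: str
    mentions: list = field(default_factory=list)  # sources that mention this entity


class FactGraph:
    """Toy graph in which mentions from different sources share a single node."""

    def __init__(self):
        self.entities = {}     # entity_id -> Entity
        self.alias_index = {}  # normalized name -> entity_id

    def resolve(self, name, source):
        # Naive assumption: two mentions refer to the same entity when their
        # lowercased names match. Real resolution uses far more context.
        key = name.strip().lower()
        entity_id = self.alias_index.get(key)
        if entity_id is None:
            entity_id = f"E{len(self.entities) + 1}"
            self.entities[entity_id] = Entity(entity_id, name)
            self.alias_index[key] = entity_id
        entity = self.entities[entity_id]
        entity.mentions.append(source)
        return entity


graph = FactGraph()
from_article = graph.resolve("Diffbot", "news article")
from_linkedin = graph.resolve("Diffbot", "LinkedIn")
from_homepage = graph.resolve("diffbot", "diffbot.com")
```

All three calls return the same node, so a fact attached through one source is reachable from the others. A flat scrape would instead yield three unrelated rows.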

A bio of Mike Tung structured by Natural Language API into linked facts
Diffbot is used by companies all over the world to tap into unstructured data from the public web without the hassle of building messy web ETL pipelines.
- DuckDuckGo uses Extract to structure product data for shopping search
- ProQuo AI uses org data in the Knowledge Graph to drive predictive business development
- Contingent uses news data in the Knowledge Graph to reveal supply chain insights on target companies
Diffbot APIs make it possible to:
- Classify and extract meaningful entities from existing web pages
- Find meaningful entities from within entire websites
- Search the public web as a humongous graph database
- Enhance your existing data with public web data
- Identify entities and classify the context of raw text
Diffbot APIs can also be combined to power highly intelligent automated systems, such as an automated sanctions tracker.
We'd love to hear about the intelligent apps you're building with Diffbot. Find us on LinkedIn, Bluesky, and Mastodon.
Amazing! Where Do I Start?
For a more general overview of everything Diffbot has to offer, start here.
If you're ready to start making API calls, head on over to Authentication to set up your token and select one of the APIs featured above.
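As a rough idea of what a first call might look like, the Python sketch below builds an Extract request with the token passed as a query parameter. The endpoint path and the `token` and `url` parameter names are assumptions, not taken from this page, so confirm them against the Authentication and Extract API references. `build_extract_url` and `extract` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; check the Extract API reference for the current path.
EXTRACT_ENDPOINT = "https://api.diffbot.com/v3/analyze"


def build_extract_url(token, page_url, endpoint=EXTRACT_ENDPOINT):
    """Return a request URL carrying the token and the URL-encoded target page."""
    query = urllib.parse.urlencode({"token": token, "url": page_url})
    return f"{endpoint}?{query}"


def extract(token, page_url, timeout=30.0):
    """Send the request and return the decoded JSON response body."""
    request_url = build_extract_url(token, page_url)
    with urllib.request.urlopen(request_url, timeout=timeout) as response:
        return json.load(response)
```

Keeping URL construction separate from the network call makes it easy to check the encoding of the target page without spending API credits. Calling `extract("YOUR_TOKEN", "https://example.com/some-page")` would then perform the actual request.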