
Diffbot builds AI models that read websites and structure them into facts.

Unlike flat data dumps from traditional web scraping, facts structured by Diffbot are linked to each other like nodes in a graph. A "Diffbot" mentioned in an MIT Technology Review article is resolved to the same "Diffbot" entity found on LinkedIn or diffbot.com.
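To make the difference from a flat data dump concrete, here is a minimal sketch of mentions from several sources resolving to one shared node. The `Entity` and `FactGraph` classes and the `org:diffbot` identifier are hypothetical names invented for this illustration; they are not Diffbot's data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A node in the graph. `id` is a hypothetical stable identifier."""
    id: str
    name: str
    facts: dict = field(default_factory=dict)


class FactGraph:
    """Toy graph: many (source, mention) pairs can point at one entity."""

    def __init__(self):
        self.entities = {}  # entity id -> Entity
        self.mentions = {}  # (source, mention text) -> entity id

    def add_entity(self, entity):
        self.entities[entity.id] = entity

    def link_mention(self, source, mention, entity_id):
        # Record that this mention in this source refers to the given entity.
        self.mentions[(source, mention)] = entity_id

    def resolve(self, source, mention):
        # Follow the mention to the single shared entity node.
        return self.entities[self.mentions[(source, mention)]]


graph = FactGraph()
graph.add_entity(Entity(id="org:diffbot", name="Diffbot"))

# The same name seen in three places is linked to one node, not three rows.
for source in ("MIT Technology Review", "LinkedIn", "diffbot.com"):
    graph.link_mention(source, "Diffbot", "org:diffbot")

from_article = graph.resolve("MIT Technology Review", "Diffbot")
from_linkedin = graph.resolve("LinkedIn", "Diffbot")
```

Because both lookups return the same object, a fact attached through one source is visible when the entity is reached through any other, which is the property a flat scrape lacks.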

This graph structure is key to enabling intelligent AI applications like the Knowledge Graph and GraphRAG, but most people simply know us for Extract API, the automatic web scraper that doesn't use any rules.

A bio of Mike Tung structured by Natural Language API into linked facts

Diffbot is used by companies all over the world to tap into unstructured data from the public web without the hassle of building messy web ETL pipelines.

DuckDuckGo uses Extract to structure product data for shopping search
ProQuo AI uses org data in the Knowledge Graph to drive predictive business development
Contingent uses news data in the Knowledge Graph to reveal supply chain insights on target companies

Diffbot APIs make it possible to:

Classify and extract meaningful entities from existing web pages
Find meaningful entities from within entire websites
Search the public web as a humongous graph database
Enhance your existing data with public web data
Identify entities and classify the context of raw text

Diffbot APIs can also be combined to power highly intelligent automated systems, such as an automated sanctions tracker.

We'd love to hear about the intelligent apps you're building with Diffbot. Find us on LinkedIn, Bluesky, and Mastodon.

Amazing! Where Do I Start?

For a more general overview of everything Diffbot has to offer, start here.

If you're ready to start making API calls, head on over to Authentication to set up your token and select one of the APIs featured above.
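As a rough sketch of what a first call might look like once you have a token, the snippet below only builds a request URL and does not send anything. The endpoint `https://api.diffbot.com/v3/analyze` and the `token` and `url` query parameter names are assumptions here, and `build_request_url` is a hypothetical helper; confirm the exact endpoint and parameters in the Authentication and API reference pages before relying on them.

```python
from urllib.parse import urlencode

# Assumed endpoint; check the API reference for the one you intend to call.
ASSUMED_ENDPOINT = "https://api.diffbot.com/v3/analyze"


def build_request_url(endpoint, token, page_url):
    """Return the endpoint with the token and target page as query parameters.

    The parameter names `token` and `url` are assumptions for illustration.
    """
    query = urlencode({"token": token, "url": page_url})
    return f"{endpoint}?{query}"


request_url = build_request_url(
    ASSUMED_ENDPOINT,
    token="YOUR_TOKEN",  # placeholder, replace with your own token
    page_url="https://example.com/article",
)
```

Percent-encoding the target page with `urlencode` keeps its own `?` and `&` characters from being misread as parameters of the API request.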