PullMdLoader

Loader for converting URLs into Markdown using the pull.md service.

This package implements a document loader for web content. Unlike traditional web scrapers, PullMdLoader can handle web pages built with dynamic JavaScript frameworks like React, Angular, or Vue.js, converting them into Markdown without local rendering.

Overview

Integration details

Class | Package | Local | Serializable | JS Support
PullMdLoader | langchain-pull-md | | |

Setup

Installation

pip install langchain-pull-md

Initialization

from langchain_pull_md.markdown_loader import PullMdLoader

# Instantiate the loader with a URL
loader = PullMdLoader(url="https://example.com")

Load

documents = loader.load()
documents[0].metadata
{'source': 'https://example.com',
'page_content': '# Example Domain\nThis domain is used for illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.'}

Lazy Load

No lazy loading is implemented. PullMdLoader performs a real-time conversion of the provided URL into Markdown format whenever the load method is called.
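
Because load converts a URL as soon as it is called, one way to defer that work when handling several URLs is to wrap the calls in a generator. The following is only a minimal sketch, not part of the package: `iter_documents` and `make_loader` are hypothetical names introduced here, and the only loader behavior assumed is the `PullMdLoader(url=...)` constructor and `load()` method shown above.

```python
from typing import Any, Callable, Iterable, Iterator


def iter_documents(
    urls: Iterable[str],
    make_loader: Callable[[str], Any],
) -> Iterator[Any]:
    """Yield documents one URL at a time (hypothetical helper).

    `make_loader` is any callable that takes a URL and returns an object
    with a `load()` method returning a list of documents.
    """
    for url in urls:
        # The loader is created and run only when the consumer asks for the
        # next document, so later URLs are not converted until needed.
        loader = make_loader(url)
        yield from loader.load()


# Intended usage (requires langchain-pull-md and network access):
#
#   from langchain_pull_md.markdown_loader import PullMdLoader
#
#   urls = ["https://example.com"]
#   for doc in iter_documents(urls, lambda u: PullMdLoader(url=u)):
#       print(doc.metadata)
```

Passing the loader in as a factory keeps the helper independent of the package, so it can be exercised with a stand-in loader in tests. Each individual URL is still converted in full when its turn comes; only the work across URLs is deferred.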

API reference:

