Overall score: 3.9/5
- SME Fit: 4/5 (flat pricing + free tier · technical setup)
- JTBD: 5/5 (clearly named, measurable job)
- Integration: 4/5 (API + 6 integrations)
- Trust: 2/5 (growing, founded 2024)
- Quality: 5/5 (4.8 on GitHub, 40,000 reviews)
- Compliance: 2/5 (customer-choice residency)
About
Crawl4AI is an Apache 2.0 open-source Python crawler and scraper built specifically for LLM and RAG pipelines. It outputs clean Markdown, supports CSS, XPath, or LLM-based extraction, and ships with browser hooks, stealth mode, and parallel crawling.
Best for: Developer-led SMEs and data teams building RAG or agent pipelines who want a free, self-hosted scraper they can fully control.
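As a rough illustration of the "clean Markdown" workflow described above, here is a minimal sketch. It assumes the `crawl4ai` package is installed and that its `AsyncWebCrawler.arun` call and the `success`, `markdown`, and `error_message` result fields behave as in recent releases; details may differ between versions. The helper names `preview` and `crawl_to_markdown` are hypothetical and exist only for this example.

```python
def preview(markdown: str, max_lines: int = 5) -> str:
    """Return the first non-empty lines of a Markdown document."""
    lines = [line for line in markdown.splitlines() if line.strip()]
    return "\n".join(lines[:max_lines])


async def crawl_to_markdown(url: str) -> str:
    """Fetch one page and return its Markdown rendering (hypothetical helper)."""
    # Imported lazily so preview() stays usable without crawl4ai installed.
    from crawl4ai import AsyncWebCrawler

    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url=url)
        if not result.success:
            raise RuntimeError(f"crawl failed: {result.error_message}")
        # str() is used defensively: depending on the release (an assumption),
        # result.markdown may be a plain string or a string-like object.
        return str(result.markdown)
```

To try it, something like `import asyncio; print(preview(asyncio.run(crawl_to_markdown("https://example.com"))))` should print the first few lines of the page as Markdown, ready to pass to a chunker or an LLM.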
Pricing
| Tier | Monthly | Annual /mo | Billing | Notes |
|---|---|---|---|---|
| Open Source | — | Free | flat | Full Python library; Docker image; all features; community support · Apache 2.0 license; self-hosted |
Key features
- Clean Markdown output for LLMs
- CSS, XPath, and LLM-based extraction
- Parallel crawling with chunking
- Stealth mode and proxy support
- Docker self-hosting
- Hooks and session re-use
Integrations
- OpenAI
- Anthropic
- Ollama
- LangChain
- LlamaIndex
- Docker
Trust & compliance
- Founded: 2024
- Status: active
- SOC 2: unknown
- GDPR: unknown
- Data residency: customer choice
- External rating: 4.8 on GitHub (40,000 reviews)
- Last verified: May 2026