Crawl4AI

Open-source LLM-friendly web crawler

Consider with caveats · API · Free tier

Overall score

3.9 / 5
SME Fit: 4/5 (flat pricing + free tier · technical setup)
JTBD: 5/5 (clearly named, measurable job)
Integration: 4/5 (API + 6 integrations)
Trust: 2/5 (growing, founded 2024)
Quality: 5/5 (4.8 on GitHub, 40,000 reviews)
Compliance: 2/5 (customer-choice residency)

About

Crawl4AI is an Apache 2.0 open-source Python crawler and scraper built specifically for LLM and RAG pipelines. It outputs clean Markdown, supports CSS, XPath, or LLM-based extraction, and ships with browser hooks, stealth mode, and parallel crawling.

Best for: Developer-led SMEs and data teams building RAG or agent pipelines who want a free, self-hosted scraper they can fully control.

Pricing

Tier | Monthly | Annual /mo | Billing | Notes
Open Source | Free | | flat | Full Python library; Docker image; all features; community support · Apache 2.0 license; self-hosted

Key features

  • Clean Markdown output for LLMs
  • CSS, XPath, and LLM-based extraction
  • Parallel crawling with chunking
  • Stealth mode and proxy support
  • Docker self-hosting
  • Hooks and session re-use
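
To make the "clean Markdown output" feature concrete, here is a minimal sketch of fetching one page as Markdown. It assumes the `crawl4ai` package is installed with a working browser setup, and it relies on `AsyncWebCrawler`, `arun`, and the `success`, `error_message`, and `markdown` result fields as they appear in the project's quickstart; verify those names against the version you install. The helper name `fetch_markdown` is hypothetical.

```python
async def fetch_markdown(url: str) -> str:
    """Crawl a single page and return its Markdown rendering."""
    # Imported here (an illustrative choice) so this file still loads on
    # machines where crawl4ai is not installed.
    from crawl4ai import AsyncWebCrawler

    # The context manager starts and stops the underlying browser.
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url=url)
        if not result.success:
            raise RuntimeError(f"crawl failed: {result.error_message}")
        # Convert to a plain string before handing it to an LLM or RAG pipeline.
        return str(result.markdown)
```

To try it, run `asyncio.run(fetch_markdown("https://example.com"))` after `import asyncio`; the URL is a placeholder.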

Integrations

OpenAI · Anthropic · Ollama · LangChain · LlamaIndex · Docker

Trust & compliance

Founded: 2024
Status: active
SOC 2: unknown
GDPR: unknown
Data residency: customer choice
External rating: 4.8 on GitHub (40,000 reviews)
Last verified: May 2026