ScrapeIQ is a production-deployed AI document assistant that scrapes and indexes legal sources, runs hybrid (vector + keyword) search across them, and generates draft documents from your templates. Cyrillic and Latin scripts are supported out of the box.
Hybrid RAG search
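Vector similarity combined with keyword retrieval: semantic matching adds recall for paraphrased passages, keyword matching keeps exact-term precision, and together they miss fewer relevant passages in dense legal text than either alone.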
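One common way to combine a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below is illustrative only: this page does not state which fusion method ScrapeIQ uses, and the function name, the document IDs, and the constant k=60 are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two retrievers.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents found by both retrievers (doc_a, doc_c) outrank those found by only one, which is the behaviour a hybrid search is after.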
Document generation
Draft contracts, opinions, and reports from indexed sources via the /api/generator endpoint.
Compliance audit trail
The /api/audit endpoint logs every document access and operation, with records structured for SOC 2 and ISO 27001 audits.
Scheduled crawling
Async job queue with cron-driven refresh, diff detection, and alerting.
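Diff detection between two crawls of the same page can be sketched with the Python standard library alone. This is a minimal illustration, not ScrapeIQ's implementation: the function names and the sample text are hypothetical, and hashing whitespace-normalised text is an assumed strategy for skipping unchanged pages cheaply.

```python
import difflib
import hashlib


def content_fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace, so pure reflows do not count as changes."""
    normalised = " ".join(text.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def detect_change(previous: str, current: str) -> list[str]:
    """Return unified-diff lines, or an empty list when the content is unchanged."""
    if content_fingerprint(previous) == content_fingerprint(current):
        return []
    return list(
        difflib.unified_diff(
            previous.splitlines(),
            current.splitlines(),
            fromfile="previous",
            tofile="current",
            lineterm="",
        )
    )


# Hypothetical snapshots from two scheduled crawls.
old_snapshot = "Article 1\nFee: 100 EUR"
new_snapshot = "Article 1\nFee: 120 EUR"

diff_lines = detect_change(old_snapshot, new_snapshot)
print("\n".join(diff_lines))
```

A non-empty result is the natural trigger for an alert; an empty one means the refresh found nothing new.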
GPU acceleration
Optional CUDA inference path; tested in production on RTX 6000 Ada.
Cyrillic + Latin
Multi-script normalisation indexes Cyrillic-script documents (such as Serbian and Russian) and Latin-script documents as a single corpus.
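One way to make Cyrillic and Latin spellings of the same Serbian text land on the same index key is to transliterate to Latin before indexing. The sketch below assumes that approach; the page does not describe how ScrapeIQ normalises scripts, and the function name is hypothetical. The mapping covers the 30 letters of the Serbian Cyrillic alphabet, so letters outside it (for example Russian-only letters) pass through unchanged.

```python
import unicodedata

# Serbian Cyrillic to Latin, lower case (inputs are case-folded first).
SERBIAN_CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}


def normalise_for_index(text: str) -> str:
    """Return a script-independent key: NFC-normalised, case-folded, Latin."""
    folded = unicodedata.normalize("NFC", text).casefold()
    # Characters without a mapping (Latin letters, digits, spaces) are kept as-is.
    return "".join(SERBIAN_CYR_TO_LAT.get(ch, ch) for ch in folded)


cyrillic_key = normalise_for_index("Закон о раду")
latin_key = normalise_for_index("Zakon o radu")
print(cyrillic_key, latin_key)  # zakon o radu zakon o radu
```

Because both spellings produce the same key, a query typed in either script can match a document stored in the other.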
Tech stack
- Python
- FastAPI
- ChromaDB
- Ollama
- LangChain
- Playwright
- PostgreSQL
- Docker

Who it's for
Built for document-heavy teams.
For teams where finding the right passage, and proving who accessed it, actually matters.
Legal & compliance teams
Search statutes, filings, and precedent across scripts, and draft from your own sources.
Regulatory affairs
Track regulatory change with scheduled crawling, diff detection, and alerts.
Knowledge & operations
Turn a growing document backlog into answers your team can cite.
FAQ
Frequently asked questions
Keyword search already works for us — why change?
Hybrid retrieval adds semantic recall on top of exact-term precision, so you miss fewer relevant passages in dense documents while still matching the exact terms keyword search catches.
Our documents are in Cyrillic — can it handle that?
Yes. Multi-script normalisation indexes Cyrillic and Latin as a single corpus, so nothing fragments into separate silos.
Can we trust the generated drafts?
Drafts are built from your own indexed sources and templates rather than from a model's open-ended guessing. Every operation is logged for review, and the log is structured for SOC 2 / ISO 27001 evidence.
Have a project in mind?
Tell us what you want to build. We respond within one business day.