Retrieval & Ingestion

Build RAG that holds up in production.

Composable ingestion, multimodal retrieval, and agentic search — engineered for the questions basic RAG gets wrong.

What basic RAG can’t do

State-of-the-art RAG still gets 37% of questions wrong.

On Meta’s CRAG benchmark, frontier LLMs without retrieval score below 34%. Straightforward RAG gets you to 44%. Industry-best RAG lands at 63% — with a 17% hallucination rate on the remainder. The 37-point gap between that and fully correct is the entire engineering challenge of production retrieval.

34%
No RAG
44%
Basic RAG
63%
Industry RAG

4,409 questions · five domains · CRAG Benchmark, NeurIPS 2024

What we built

Retrieval as infrastructure.

Not a feature you bolt on — a platform designed for the hard questions under production load.

Composable primitives

Parsers, chunkers, enrichers, embedders, extractors, and rerankers live in a registry. Each dataset gets a strategy composed from the catalogue — not a fixed pipeline wearing a RAG costume.
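As a rough sketch of the registry idea (names and the strategy shape here are illustrative, not our actual API): primitives register under a kind, and a per-dataset strategy is just a named selection resolved from the catalogue.

```python
# Illustrative registry: each primitive kind maps names to callables.
REGISTRY = {"parser": {}, "chunker": {}, "embedder": {}, "reranker": {}}

def register(kind, name):
    """Decorator that files a primitive into the catalogue under its kind."""
    def wrap(fn):
        REGISTRY[kind][name] = fn
        return fn
    return wrap

@register("chunker", "fixed_512")
def fixed_512(text):
    # Simplest possible chunker: fixed 512-character windows.
    return [text[i:i + 512] for i in range(0, len(text), 512)]

def compose(strategy):
    """Resolve a strategy like {'chunker': 'fixed_512'} into callables."""
    return {kind: REGISTRY[kind][name] for kind, name in strategy.items()}
```

The point is the shape: a dataset's ingestion recipe is data, not code, so swapping a chunker or reranker is a one-key change.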

AI strategist per dataset

An agent analyses a representative sample, reviews every primitive with benchmark scores, and proposes a full ingestion recipe with written rationale. Runs once per dataset at $0.50–$2.00 — not per query.

Eval-gated blue/green deploys
</gr replace>

Every strategy change is a hypothesis. New indexes build alongside the live one, run against a golden query set, and only promote on pass. Rollback is a one-line database update.
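The gate logic is simple enough to sketch (the `evaluate` callable and index handles below are hypothetical stand-ins, not our implementation): the candidate index only becomes the active one if it beats the live index on the golden set.

```python
def maybe_promote(candidate_index, live_index, golden_set, evaluate, margin=0.0):
    """Eval gate for a blue/green index deploy (sketch).

    `evaluate(index, golden_set)` is assumed to return a scalar quality
    score over the golden query set. The candidate is promoted only if it
    matches or beats the live index; otherwise the live index keeps serving.
    """
    cand_score = evaluate(candidate_index, golden_set)
    live_score = evaluate(live_index, golden_set)
    if cand_score >= live_score + margin:
        return candidate_index  # flipping the "active index" pointer
    return live_index
```

Because the only state change is which index is marked active, rollback really is a single pointer flip back to the previous index.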

Multi-modal: text, graph, visual

Text embeddings for prose, knowledge-graph extraction for relational questions, ColPali-family visual retrieval for figure-heavy corpora. All fused via reciprocal-rank at query time.
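Reciprocal-rank fusion itself is a few lines; this sketch (simplified from the standard formulation, with the usual k=60 damping constant) shows how independently ranked lists from the text, graph, and visual retrievers merge into one ranking:

```python
def rrf_fuse(ranked_lists, k=60):
    """Reciprocal-rank fusion: each retriever contributes 1/(k + rank)
    per document, and documents are re-sorted by the summed score."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Fusing on ranks rather than raw scores is what lets heterogeneous retrievers (cosine similarity, graph traversal, MaxSim) combine without score calibration.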

The engine

Eight layers. Every call. Every layer, callable.

Intent, routing, memory, retrieval, tool selection, reasoning, policy, learning. Every call runs the stack; every layer answers to agents.

01 Intent
02 Prompt router
03 Memory
04 Super RAG
05 Tool router
06 Reasoning
07 Policy & guardrails
08 Continuous learning

Where basic RAG fails

Three places this matters.

01

Messy PDFs, parsed correctly.

Dense tables, merged cells, scanned pages, multi-column layouts. Confidence-gated parsing detects when the primary parser failed and escalates to a vision-language model — so retrieval isn’t bottlenecked on mangled input.
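In outline, confidence gating looks like this (a minimal sketch; `fast_parse` and `vlm_parse` are hypothetical callables standing in for the primary parser and the vision-language fallback):

```python
def parse_with_escalation(page_bytes, fast_parse, vlm_parse, min_conf=0.8):
    """Confidence-gated parsing (sketch).

    The cheap parser runs first and reports a confidence alongside its
    output. Only pages whose confidence suggests mangled output (merged
    cells, scan noise, multi-column bleed) escalate to the expensive
    vision-language model.
    """
    text, confidence = fast_parse(page_bytes)
    if confidence >= min_conf:
        return text
    return vlm_parse(page_bytes)
```

Escalating per page, not per document, keeps VLM cost proportional to how broken the input actually is.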

02

Figure-heavy corpora, seen.

Charts, diagrams, S-N curves carry information OCR can’t extract. ColPali-family multi-vector embeddings index full page rasters, fused with text retrieval at query time. The Strategy Agent enables this path only when figure density warrants it.
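The scoring behind ColPali-family retrieval is late interaction: every query-token vector matches its best page-patch vector, and the per-token maxima are summed. A toy version with plain lists (real systems use batched tensor ops over learned embeddings):

```python
def maxsim(query_vecs, page_vecs):
    """Late-interaction (MaxSim) scoring sketch: sum, over query token
    vectors, of each token's best dot-product match among page patches."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)
```

Because each token keeps its own best match, a chart axis label and a caption phrase can each anchor to different regions of the same page raster.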

03

Multi-hop questions, answered.

Comparisons and cross-document synthesis break single-shot RAG. Knowledge-graph extraction builds typed relations at ingestion time, so traversal joins evidence across docs with provenance back to the source chunk.
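A toy illustration of why typed relations make multi-hop tractable (the store and chunk-id format below are invented for the example): each edge carries the source chunk, so a two-hop traversal returns both the answer and its provenance.

```python
from collections import defaultdict

class TinyGraph:
    """Toy typed-relation store. Each edge keeps the chunk id that
    produced it, so traversal answers carry provenance."""

    def __init__(self):
        self.edges = defaultdict(list)  # head -> [(relation, tail, chunk_id)]

    def add(self, head, relation, tail, chunk_id):
        self.edges[head].append((relation, tail, chunk_id))

    def two_hop(self, start):
        """Follow two edges from `start`, joining evidence across
        documents; each path returns the supporting chunk ids."""
        paths = []
        for r1, mid, c1 in self.edges[start]:
            for r2, end, c2 in self.edges[mid]:
                paths.append((start, r1, mid, r2, end, [c1, c2]))
        return paths
```

A question like "where is the company AcmeCo acquired headquartered?" becomes a two-edge walk whose evidence chunks may come from entirely different documents.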

Compounds

These aren’t marginal improvements.

Each layer stacks on the last. Run all of them, and the math starts to favour getting the hard questions right.

49%
fewer retrieval failures
from contextual enrichment alone
Anthropic
67%
with contextual + BM25 + reranking
compounding ingestion-side improvements
Anthropic
20–30%
boost from hybrid search
over vector-only via reciprocal-rank fusion
Weaviate
3.4×
comprehensiveness on entity-rich corpora
graph-based over vanilla RAG
Microsoft Research

Next step

Bring us your hardest retrieval problem.

We’ll map your corpus, score where your current stack breaks, and walk through what a production-grade pipeline looks like against your actual data — whether we work together or not.

30 min · free · no deck, no pitch

Retrieval & Ingestion — The Build Bot