Retrieval & Ingestion

Build RAG that holds up in production.

Composable ingestion, multimodal retrieval, and agentic search — engineered for the questions basic RAG gets wrong.

What basic RAG can’t do

State-of-the-art RAG still gets 37% of questions wrong.

On Meta’s CRAG benchmark, frontier LLMs without retrieval score below 34%. Straightforward RAG gets you to 44%. Industry-best RAG lands at 63% — with a 17% hallucination rate on the remainder. The 37-point gap between that and fully correct is the entire engineering challenge of production retrieval.

34%
No RAG
44%
Basic RAG
63%
Industry RAG

4,409 questions · five domains · CRAG Benchmark, NeurIPS 2024

What we built

Retrieval as infrastructure.

Not a feature you bolt on — a platform designed for the hard questions under production load.

Composable primitives

Parsers, chunkers, enrichers, embedders, extractors, and rerankers live in a registry. Each dataset gets a strategy composed from the catalogue — not a fixed pipeline wearing a RAG costume.
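As a rough sketch of the registry idea (names and the strategy shape here are illustrative, not our actual API): primitives register under a kind, and a per-dataset strategy is just a named selection resolved from the catalogue.

```python
# Illustrative registry: each primitive kind maps names to callables.
REGISTRY = {"parser": {}, "chunker": {}, "embedder": {}, "reranker": {}}

def register(kind, name):
    """Decorator that files a primitive into the catalogue under its kind."""
    def wrap(fn):
        REGISTRY[kind][name] = fn
        return fn
    return wrap

@register("chunker", "fixed_512")
def fixed_512(text):
    # Simplest possible chunker: fixed 512-character windows.
    return [text[i:i + 512] for i in range(0, len(text), 512)]

def compose(strategy):
    """Resolve a strategy like {'chunker': 'fixed_512'} into callables."""
    return {kind: REGISTRY[kind][name] for kind, name in strategy.items()}
```

The point is the shape: a dataset's ingestion recipe is data, not code, so swapping a chunker or reranker is a one-key change.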

AI strategist per dataset

An agent analyses a representative sample, reviews every primitive with benchmark scores, and proposes a full ingestion recipe with written rationale. Runs once per dataset at $0.50–$2.00 — not per query.

Eval-gated blue/green deploys
</gr replace>

Every strategy change is a hypothesis. New indexes build alongside the live one, run against a golden query set, and only promote on pass. Rollback is a one-line database update.
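The gate logic is simple enough to sketch (the `evaluate` callable and index handles below are hypothetical stand-ins, not our implementation): the candidate index only becomes the active one if it beats the live index on the golden set.

```python
def maybe_promote(candidate_index, live_index, golden_set, evaluate, margin=0.0):
    """Eval gate for a blue/green index deploy (sketch).

    `evaluate(index, golden_set)` is assumed to return a scalar quality
    score over the golden query set. The candidate is promoted only if it
    matches or beats the live index; otherwise the live index keeps serving.
    """
    cand_score = evaluate(candidate_index, golden_set)
    live_score = evaluate(live_index, golden_set)
    if cand_score >= live_score + margin:
        return candidate_index  # flipping the "active index" pointer
    return live_index
```

Because the only state change is which index is marked active, rollback really is a single pointer flip back to the previous index.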

Multi-modal: text, graph, visual

Text embeddings for prose, knowledge-graph extraction for relational questions, ColPali-family visual retrieval for figure-heavy corpora. All fused via reciprocal-rank at query time.
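Reciprocal-rank fusion itself is a few lines; this sketch (simplified from the standard formulation, with the usual k=60 damping constant) shows how independently ranked lists from the text, graph, and visual retrievers merge into one ranking:

```python
def rrf_fuse(ranked_lists, k=60):
    """Reciprocal-rank fusion: each retriever contributes 1/(k + rank)
    per document, and documents are re-sorted by the summed score."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Fusing on ranks rather than raw scores is what lets heterogeneous retrievers (cosine similarity, graph traversal, MaxSim) combine without score calibration.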

The engine

Eight layers. Every call. Every layer, callable.

Intent, routing, memory, retrieval, tool selection, reasoning, policy, learning. Every call runs the stack; every layer answers to agents.

01 Intent
02 Prompt router
03 Memory
04 Super RAG
05 Tool router
06 Reasoning
07 Policy & guardrails
08 Continuous learning

Where basic RAG fails

Three places this matters.

01

Messy PDFs, parsed correctly.

Dense tables, merged cells, scanned pages, multi-column layouts. Confidence-gated parsing detects when the primary parser failed and escalates to a vision-language model — so retrieval isn’t bottlenecked on mangled input.
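In outline, confidence gating looks like this (a minimal sketch; `fast_parse` and `vlm_parse` are hypothetical callables standing in for the primary parser and the vision-language fallback):

```python
def parse_with_escalation(page_bytes, fast_parse, vlm_parse, min_conf=0.8):
    """Confidence-gated parsing (sketch).

    The cheap parser runs first and reports a confidence alongside its
    output. Only pages whose confidence suggests mangled output (merged
    cells, scan noise, multi-column bleed) escalate to the expensive
    vision-language model.
    """
    text, confidence = fast_parse(page_bytes)
    if confidence >= min_conf:
        return text
    return vlm_parse(page_bytes)
```

Escalating per page, not per document, keeps VLM cost proportional to how broken the input actually is.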

02

Figure-heavy corpora, seen.

Charts, diagrams, S-N curves carry information OCR can’t extract. ColPali-family multi-vector embeddings index full page rasters, fused with text retrieval at query time. The Strategy Agent enables this path only when figure density warrants it.
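The scoring behind ColPali-family retrieval is late interaction: every query-token vector matches its best page-patch vector, and the per-token maxima are summed. A toy version with plain lists (real systems use batched tensor ops over learned embeddings):

```python
def maxsim(query_vecs, page_vecs):
    """Late-interaction (MaxSim) scoring sketch: sum, over query token
    vectors, of each token's best dot-product match among page patches."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)
```

Because each token keeps its own best match, a chart axis label and a caption phrase can each anchor to different regions of the same page raster.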

03

Multi-hop questions, answered.

Comparisons and cross-document synthesis break single-shot RAG. Knowledge-graph extraction builds typed relations at ingestion time, so traversal joins evidence across docs with provenance back to the source chunk.
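A toy illustration of why typed relations make multi-hop tractable (the store and chunk-id format below are invented for the example): each edge carries the source chunk, so a two-hop traversal returns both the answer and its provenance.

```python
from collections import defaultdict

class TinyGraph:
    """Toy typed-relation store. Each edge keeps the chunk id that
    produced it, so traversal answers carry provenance."""

    def __init__(self):
        self.edges = defaultdict(list)  # head -> [(relation, tail, chunk_id)]

    def add(self, head, relation, tail, chunk_id):
        self.edges[head].append((relation, tail, chunk_id))

    def two_hop(self, start):
        """Follow two edges from `start`, joining evidence across
        documents; each path returns the supporting chunk ids."""
        paths = []
        for r1, mid, c1 in self.edges[start]:
            for r2, end, c2 in self.edges[mid]:
                paths.append((start, r1, mid, r2, end, [c1, c2]))
        return paths
```

A question like "where is the company AcmeCo acquired headquartered?" becomes a two-edge walk whose evidence chunks may come from entirely different documents.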

Compounds

These aren’t marginal improvements.

Each layer stacks on the last. Run all of them, and the math starts to favour getting the hard questions right.

49%
fewer retrieval failures
from contextual enrichment alone
Anthropic
67%
with contextual + BM25 + reranking
compounding ingestion-side improvements
Anthropic
20–30%
boost from hybrid search
over vector-only via reciprocal-rank fusion
Weaviate
3.4×
comprehensiveness on entity-rich corpora
graph-based over vanilla RAG
Microsoft Research

Next step

Bring us your hardest retrieval problem.

We’ll map your corpus, score where your current stack breaks, and walk through what a production-grade pipeline looks like against your actual data — whether we work together or not.

30 min · free · no deck, no pitch

Retrieval & Ingestion — The Build Bot