RAG Beyond Chatbots
Most RAG failures are ingestion failures. Chunking, metadata, and document quality decide retrieval quality far more than the prompt does.
Retrieval-augmented generation gets talked about as a prompt and a vector database. In practice, the prompt is the last thing that goes wrong.
Lesson. Most RAG failures are ingestion failures. Chunking, metadata, document quality, and retrieval strategy matter more than prompts.
Where RAG actually breaks
When a RAG system gives bad answers, the cause is usually upstream of the model:
- Chunking. Split documents badly and the relevant fact is severed from its context, or buried with noise. Retrieval can only return what ingestion created.
- Metadata. Without good metadata (source, date, type, entity), you can't filter, and the retriever drowns in near-duplicates.
- Document quality. Garbage in, confident garbage out. Tables, scans, and inconsistent formats wreck naive ingestion.
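- Retrieval strategy. Pure vector similarity isn't always the right tool: it can miss exact matches on identifiers, codes, and rare names. Hybrid search (keyword + vector), metadata filters, and re-ranking often matter more than the choice of embedding model.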
Fix ingestion and most "the LLM is hallucinating" problems quietly disappear.
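To make the chunking and metadata points concrete, here is a minimal sketch in Python. It assumes documents arrive as plain text with blank lines between paragraphs; the `Chunk` type, the `chunk_document` function, the 500-character budget, and the metadata values are all hypothetical, chosen for illustration. The idea is to split only on paragraph boundaries, so a fact stays next to its surrounding sentences, and to copy the document's metadata onto every chunk so it can be filtered later.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    metadata: dict


def chunk_document(text, metadata, max_chars=500):
    """Greedily pack whole paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole as its own
    chunk rather than being cut mid-sentence.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    groups, current, current_len = [], [], 0
    for para in paragraphs:
        # +2 accounts for the blank-line separator between paragraphs.
        added = len(para) + (2 if current else 0)
        if current and current_len + added > max_chars:
            groups.append(current)
            current, current_len = [], 0
            added = len(para)
        current.append(para)
        current_len += added
    if current:
        groups.append(current)
    # Every chunk inherits the document-level metadata plus its position.
    return [
        Chunk(text="\n\n".join(group), metadata={**metadata, "chunk_index": i})
        for i, group in enumerate(groups)
    ]


# Hypothetical document and metadata values, for illustration only.
doc = "Refund policy.\n\nRefunds are issued within 14 days.\n\n" + "x" * 480
chunks = chunk_document(doc, {"source": "policies/refunds.md", "type": "policy"})
```

Here the heading and the sentence it governs land in the same chunk, and the long third paragraph becomes a second chunk. Real pipelines usually need format-specific handling for tables and scans, which this sketch does not attempt.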
The contrarian take
Opinion. Most companies should fix search before adding agents.
A lot of "we need an AI agent" is really "our information is impossible to find." Reliable retrieval over clean, well-structured data solves more real problems than an autonomous agent layered on top of a broken corpus. Agents amplify whatever is underneath, including the mess.
RAG beyond Q&A
The interesting uses aren't chatbots. They use retrieval as grounding for structured work:
- giving document extraction the context it needs to interpret a field correctly
- financial research and lookups against a curated corpus
- enrichment steps in a workflow, where retrieved context improves an automated decision that a human still reviews
Rule of thumb
Spend your effort on the ingestion pipeline and the retrieval strategy. The model and the prompt are the easy, swappable parts.
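For the retrieval-strategy half of that advice, here is a minimal sketch of combining keyword and vector results and then filtering on metadata. It uses reciprocal rank fusion, which is one common way to merge ranked lists and not something this article prescribes. The function names, the document ids, the metadata, and the constant `k=60` are assumptions for illustration. The two input rankings stand in for whatever keyword and vector search engines you already run.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears
    in, with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def filter_by_metadata(doc_ids, metadata, **required):
    """Keep only ids whose metadata matches every required key/value pair."""
    return [
        d for d in doc_ids
        if all(metadata[d].get(key) == value for key, value in required.items())
    ]


# Hypothetical results from a keyword index and a vector index.
keyword_hits = ["policy_2023", "faq", "policy_2021"]
vector_hits = ["blog", "policy_2023", "faq"]

# Hypothetical metadata captured at ingestion time.
metadata = {
    "policy_2023": {"type": "policy"},
    "policy_2021": {"type": "policy"},
    "faq": {"type": "faq"},
    "blog": {"type": "blog"},
}

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
# fused == ["policy_2023", "faq", "blog", "policy_2021"]
policies = filter_by_metadata(fused, metadata, type="policy")
# policies == ["policy_2023", "policy_2021"]
```

Documents that both retrievers agree on rise to the top, and the filter only works because ingestion recorded a `type` for every document. In practice the filter is often applied inside the search engine before ranking. It is applied afterwards here only to keep the sketch short.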