Retrieval-augmented generation (RAG)

A technique in which a system retrieves relevant documents from a knowledge store (often a vector database) and passes them to an LLM to ground its answers.

Updated 2026-06-17

Retrieval-augmented generation (RAG) grounds a language model’s answers in your own data. Instead of relying only on what the model memorised during training, the system retrieves relevant chunks (for example, from transcripts stored in a vector database) and feeds them to the model as context.

For audio, RAG is how you “ask questions” of your recordings: transcripts are chunked and embedded ahead of time, and the most relevant chunks are retrieved at query time so the answer can cite your actual content.
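The chunk, embed and retrieve steps can be illustrated with a minimal Python sketch. It is not a production pipeline: the `embed` function here is a toy word-count vector standing in for a learned embedding model, the in-memory list stands in for a vector database, and all function names (`chunk`, `embed`, `cosine`, `retrieve`, `build_prompt`) are hypothetical. The final prompt would be sent to whichever LLM you use; that call is left out.

```python
import math
import re
from collections import Counter


def chunk(transcript: str, max_words: int = 11) -> list[str]:
    """Split a transcript into consecutive chunks of at most max_words words."""
    words = transcript.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase word counts instead of a learned model."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, index: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    """Return the k chunks whose vectors are most similar to the query vector."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, chunks: list[str]) -> str:
    """Place the retrieved chunks in the prompt so the model can ground and cite its answer."""
    context = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer using only the context below and cite chunk numbers.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Index time: chunk the transcript and store each chunk with its vector.
transcript = (
    "Welcome to the quarterly review. Revenue grew twelve percent this quarter. "
    "Next, the hiring plan. We will hire four engineers in the spring."
)
chunks = chunk(transcript)
index = [(c, embed(c)) for c in chunks]

# Query time: retrieve the best-matching chunk and build the grounded prompt.
question = "How many engineers will we hire?"
top = retrieve(question, index, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Splitting on a fixed word count keeps the sketch short but can cut sentences in half; real systems often chunk on sentence or speaker boundaries and overlap adjacent chunks.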