Introduction to RAG (Retrieval-Augmented Generation)

Jul 15, 2025

Learn how RAG improves LLM responses by retrieving relevant context from your data and generating answers that are safer, more reliable, and more accurate.

Definitions

LLM - Large Language Model

Examples: OpenAI’s ChatGPT, Google’s Gemini, Anthropic’s Claude, etc.

RAG - Retrieval-Augmented Generation

Background

RAG is a popular Gen AI concept that was developed in 2020. You may have already used it without knowing it.

High Level Description

RAG is the process of retrieving relevant context from a data store, combining it with your input, and feeding it to an LLM so that a more accurate and relevant response is generated.
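As a rough illustration of the "combining it with your input" step, here is a minimal sketch of how retrieved context and a question might be merged into a single prompt. The function name `build_prompt` and the prompt wording are assumptions made for this example, not a description of how Qikr AI or any particular product formats its prompts.

```python
def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine retrieved context with the user's input into one prompt."""
    # Number each chunk so the answer can refer back to a specific one.
    context = "\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Example data invented for illustration.
prompt = build_prompt(
    "What is the refund window?",
    ["Refunds are accepted within 30 days of purchase."],
)
print(prompt)
```

The resulting string is what would be sent to the LLM in place of the bare question.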

Question

What happens when you upload a file to Qikr AI and ask it a question?

Answer

Qikr AI leverages RAG to generate the responses. Here’s how:

  1. The file is uploaded to Qikr AI.

  2. The file is split into chunks.

  3. The chunks are inserted into a data store.

Then, when you send a question to Qikr AI, it executes the following steps:

  4. Retrieves the relevant chunks from the data store.

  5. Combines them with your question.

  6. Feeds the combined input to the LLM.

  7. And then, voila, the LLM generates a response.

That’s an example of RAG in the wild!

Benefits

  1. Generate responses that are more accurate and relevant than calling an LLM directly.

  2. Customize how an LLM forms its responses (users can explicitly define what material is used to generate a response).

  3. Receive responses with their sources cited (easy to fact check and get the exact section of a document used to generate the response).

  4. And many more!
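To make the source-citation benefit concrete, here is a minimal sketch in which each stored chunk carries metadata about where it came from, so the chunks used for an answer can be listed as sources. The `Chunk` fields, the `format_citations` helper, and the sample file name are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source: str   # hypothetical: the file the chunk came from
    section: str  # hypothetical: where in that file it was found


def format_citations(used_chunks: list[Chunk]) -> str:
    """List the origin of every chunk that was passed to the LLM."""
    lines = [
        f"[{n}] {chunk.source}, {chunk.section}"
        for n, chunk in enumerate(used_chunks, start=1)
    ]
    return "Sources:\n" + "\n".join(lines)


# Example data invented for illustration.
used = [
    Chunk("Refunds are accepted within 30 days.", "policy.pdf", "section 4.2"),
    Chunk("Refunds go to the original payment method.", "policy.pdf", "section 4.3"),
]
citations = format_citations(used)
print(citations)
```

Because the citation list is built from the same chunks that were retrieved, a reader can open the named section and check the answer against it.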

RAG Call Flow Diagrams

Diagram 1: User Uploads a File

Step 1: User uploads a document to Qikr AI.

Step 2: Qikr AI breaks the document down into small chunks (e.g., 100-word blocks) and inserts them into a data store.
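The upload flow can be sketched in a few lines. This example assumes fixed-size chunks of 100 words and uses a plain Python list as a stand-in for the data store. The names `chunk_words`, `ingest`, and `data_store` are hypothetical, and a production system may well split on sentences or tokens and use a real database instead.

```python
def chunk_words(text: str, words_per_chunk: int = 100) -> list[str]:
    """Split text into consecutive blocks of at most `words_per_chunk` words."""
    words = text.split()
    return [
        " ".join(words[start:start + words_per_chunk])
        for start in range(0, len(words), words_per_chunk)
    ]


# Hypothetical in-memory "data store": a list of chunk records.
data_store: list[dict] = []


def ingest(filename: str, text: str) -> int:
    """Chunk a document and insert every chunk into the data store."""
    chunks = chunk_words(text)
    for position, chunk in enumerate(chunks):
        data_store.append({"source": filename, "position": position, "text": chunk})
    return len(chunks)


# A made-up 250-word document yields chunks of 100, 100, and 50 words.
document = " ".join(f"word{i}" for i in range(250))
count = ingest("example.txt", document)
print(count)  # 3
```

Recording the file name and position alongside each chunk is a design choice that makes it possible to trace a chunk back to its place in the original document later.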

Diagram 2: User Sends a Question

Step 1: User asks Qikr AI a question.

Step 2: Qikr AI retrieves the relevant chunks from the data store based on the question.

Step 3: Qikr AI provides both the relevant chunks and the question to the LLM.

Step 4: The LLM generates a response using all of that information.

Step 5: Qikr AI returns the LLM's response to the user.
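The question flow above can be sketched as follows. Relevance is scored here by simple word overlap between the question and each chunk, which is a deliberately simplified stand-in: the article does not say how Qikr AI ranks chunks, and many real systems compare vector embeddings instead. The LLM is passed in as a plain function so the sketch runs without any external service. All names (`retrieve`, `answer`, `fake_llm`) are hypothetical.

```python
import re
from typing import Callable


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Step 2: return up to top_k chunks sharing the most words with the question."""
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(chunk)), chunk) for chunk in chunks]
    # Drop chunks with no shared words, then sort by overlap, highest first.
    ranked = sorted(
        (pair for pair in scored if pair[0] > 0),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [chunk for _, chunk in ranked[:top_k]]


def answer(question: str, chunks: list[str], llm: Callable[[str], str]) -> str:
    """Steps 3 to 5: combine chunks with the question and ask the LLM."""
    relevant = retrieve(question, chunks)
    prompt = "Context:\n" + "\n".join(relevant) + f"\n\nQuestion: {question}"
    return llm(prompt)


def fake_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; it only reports the prompt size."""
    return f"(model response based on {len(prompt)} characters of prompt)"


# Example chunks invented for illustration.
stored_chunks = [
    "Refunds are accepted within 30 days of purchase.",
    "Our office is closed on public holidays.",
    "Shipping takes five business days.",
]
top = retrieve("How many days do I have to request refunds?", stored_chunks, top_k=1)
print(top)
print(answer("How many days do I have to request refunds?", stored_chunks, fake_llm))
```

Swapping `fake_llm` for a function that calls an actual model, and `retrieve` for an embedding-based search, would leave the overall shape of the flow unchanged.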

Resources

https://aws.amazon.com/what-is/retrieval-augmented-generation/

https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/

https://en.wikipedia.org/wiki/Retrieval-augmented_generation

The industry-leading Gen AI platform, fully customizable and tailored to your needs.

info@qikrai.com

Stay up to date

Get the latest updates and exclusive tips for smarter Generative AI adoption.

© 2025. All rights reserved. Qikr AI
