How RAG Assembles the Final Prompt
Understand the prompt template before writing the generation code
The prompt is the bridge
Retrieval gives you relevant text. Generation turns that text into an answer. The prompt is what connects them.
A RAG prompt has three parts:
- System instruction — tells the model its role and constraints
- Context — the retrieved chunks, pasted verbatim
- Question — what the user asked
A concrete example
You are a helpful assistant. Answer the question using only the context below.
If the answer is not in the context, say "I don't know."
Context:
The total amount due is $450.00, payable by March 31, 2025.
Payment can be made by bank transfer or credit card.
Question:
What is the total amount due?

The model reads the context and answers: "The total amount due is $450.00."
It does not guess. It does not hallucinate. It reads.
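The three-part assembly shown above can be sketched as a small helper. This is a minimal sketch of `build_prompt` (one of the functions listed below); the exact separator strings and layout are one reasonable choice, not a required format.

```python
def build_prompt(question: str, context_chunks: list[str]) -> str:
    # System instruction: the model's role and constraints.
    system = (
        "You are a helpful assistant. Answer the question using only the context below.\n"
        'If the answer is not in the context, say "I don\'t know."'
    )
    # Context: the retrieved chunks, pasted verbatim and separated by blank lines.
    context = "\n\n".join(context_chunks)
    # Question: what the user asked.
    return f"{system}\n\nContext:\n{context}\n\nQuestion:\n{question}"

prompt = build_prompt(
    "What is the total amount due?",
    ["The total amount due is $450.00, payable by March 31, 2025."],
)
```

Passing the chunks as a list keeps the function independent of how retrieval ranked them; the caller decides order and count.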
What you will write
- build_prompt(question, context_chunks) — assembles the string above
- generate_answer(prompt) — sends it to Gemini and returns the response text
- print_result(answer, source_chunks) — displays the answer and its sources
- save_embeddings(chunks, embeddings, cache_path) — caches vectors to disk
- load_embeddings(cache_path) — loads cached vectors
- main() — wires the full pipeline from CLI arguments
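The two caching helpers in the list above could be backed by a NumPy `.npz` archive, which stores the chunk texts and their vectors side by side in one file. The function names match the list, but the file format here is an assumption for illustration, not the course's required one.

```python
import numpy as np

def save_embeddings(chunks, embeddings, cache_path):
    # Store chunk texts and their vectors together in a single .npz archive.
    # Strings go in an object array so variable-length text round-trips cleanly.
    np.savez(
        cache_path,
        chunks=np.array(chunks, dtype=object),
        embeddings=np.asarray(embeddings),
    )

def load_embeddings(cache_path):
    # Returns (chunks, embeddings). allow_pickle=True is required to read
    # back the object array holding the chunk texts.
    data = np.load(cache_path, allow_pickle=True)
    return data["chunks"].tolist(), data["embeddings"]
```

Caching matters because embedding the same PDF on every run wastes API calls; with this sketch, `main()` can call `load_embeddings` when the cache file exists and fall back to embedding plus `save_embeddings` when it does not.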