Print the Answer and Sources
Display the generated answer, with optional source passages
Why show sources?
The previous chapter generated an answer from retrieved chunks. But how does the user know the answer is trustworthy? By showing the source passages alongside the answer.
Displaying sources serves three purposes:
- Trust — the user can read the original text and verify the answer themselves.
- Debugging — if the answer is wrong, you can check whether the retrieval step returned the right chunks. A bad answer from good chunks means the model misread them. A bad answer from irrelevant chunks means the search function needs tuning.
- Transparency — the user sees exactly what the model was given. No hidden context, no mystery.
This is the same idea behind footnotes in academic papers — every claim points back to its source.
The show_sources parameter
Sources are helpful during development and for power users, but not always wanted in production. The show_sources parameter (default True) lets callers choose:
| Call | Behaviour |
|---|---|
| `print_result(answer, chunks)` | Prints answer and sources |
| `print_result(answer, chunks, show_sources=False)` | Prints answer only |
Python pattern: enumerate with a start index
The loop uses enumerate(source_chunks, 1). The second argument tells Python to start counting from 1 instead of the default 0. This produces human-friendly labels: "Source 1", "Source 2", rather than "Source 0", "Source 1".
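A quick way to see the difference, using a throwaway list:

```python
chunks = ["First passage...", "Second passage..."]

# Default: counting starts at 0.
assert list(enumerate(chunks)) == [(0, "First passage..."), (1, "Second passage...")]

# With a start of 1: the human-friendly numbering used for source labels.
assert list(enumerate(chunks, 1)) == [(1, "First passage..."), (2, "Second passage...")]
```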
def print_result(answer, source_chunks, show_sources=True):
    print("Answer:")
    print(answer)
    if show_sources:
        print("\nSources:")
        for i, chunk in enumerate(source_chunks, 1):
            print(f"Source {i}:\n{chunk}\n")
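For instance, calling the function with two placeholder chunks (the strings here are made up for illustration; the function is repeated so the snippet runs on its own):

```python
def print_result(answer, source_chunks, show_sources=True):
    print("Answer:")
    print(answer)
    if show_sources:
        print("\nSources:")
        for i, chunk in enumerate(source_chunks, 1):
            print(f"Source {i}:\n{chunk}\n")

answer = "Revenue grew 12% in Q3."                         # placeholder answer
chunks = ["Q3 revenue rose 12%...", "Costs were flat..."]  # placeholder chunks

print_result(answer, chunks)                      # answer followed by Source 1 and Source 2
print_result(answer, chunks, show_sources=False)  # answer only, sources suppressed
```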
Instructions
Complete the print_result function. The starter code provides the signature.
- Print
"Answer:". - Print
answer. - Add an
if show_sources:block. - Inside the block, print
"\nSources:". - Inside the block, create a
forloop with variablesiandchunkoverenumerate(source_chunks, 1). Inside the loop, printf"Source {i}:\n{chunk}\n".
import os
import time
import numpy as np
import pypdf
from dotenv import load_dotenv
from google import genai
from google.genai import types
def extract_text(pdf_path):
    reader = pypdf.PdfReader(pdf_path)
    pages = [page.extract_text() for page in reader.pages]
    return "\n".join(pages)

def chunk_text(text, chunk_size=500, overlap=100):
    chunks = []
    for i in range(0, len(text), chunk_size - overlap):
        chunks.append(text[i : i + chunk_size])
    return chunks

def preview_chunks(chunks):
    print(f"Total chunks: {len(chunks)}")
    print(f"First chunk:\n{chunks[0]}")

def create_client():
    load_dotenv()
    api_key = os.getenv("GEMINI_API_KEY")
    client = genai.Client(api_key=api_key)
    return client

def embed_text(client, text):
    result = client.models.embed_content(
        model="gemini-embedding-001",
        contents=text,
        config=types.EmbedContentConfig(task_type="RETRIEVAL_DOCUMENT"),
    )
    return result.embeddings[0].values

def embed_all_chunks(client, chunks):
    BATCH_SIZE = 90
    embeddings = []
    for i in range(0, len(chunks), BATCH_SIZE):
        batch = chunks[i : i + BATCH_SIZE]
        for chunk in batch:
            embeddings.append(embed_text(client, chunk))
        if i + BATCH_SIZE < len(chunks):
            print("Rate limit pause — waiting 60 seconds...")
            time.sleep(60)
    return embeddings

def cosine_similarity(vec_a, vec_b):
    dot = np.dot(vec_a, vec_b)
    norm = np.linalg.norm(vec_a) * np.linalg.norm(vec_b)
    return dot / norm

def search(client, query, chunks, embeddings, top_k=3):
    result = client.models.embed_content(
        model="gemini-embedding-001",
        contents=query,
        config=types.EmbedContentConfig(task_type="RETRIEVAL_QUERY"),
    )
    query_vector = result.embeddings[0].values
    scores = [(cosine_similarity(query_vector, emb), chunk) for emb, chunk in zip(embeddings, chunks)]
    scores.sort(key=lambda x: x[0], reverse=True)
    return [chunk for _, chunk in scores[:top_k]]

def test_search(client, pdf_path, question):
    text = extract_text(pdf_path)
    chunks = chunk_text(text)
    embeddings = embed_all_chunks(client, chunks)
    results = search(client, question, chunks, embeddings)
    for i, chunk in enumerate(results, 1):
        print(f"Result {i}:\n{chunk}\n")

def build_prompt(question, context_chunks):
    context = "\n\n".join(context_chunks)
    prompt = f"You are a helpful assistant. Answer the question using only the context below.\nIf the answer is not in the context, say \"I don't know.\"\n\nContext:\n{context}\n\nQuestion:\n{question}"
    return prompt

def generate_answer(client, prompt):
    response = client.models.generate_content(model="gemini-2.5-flash", contents=prompt)
    return response.text
def print_result(answer, source_chunks, show_sources=True):
    # Print the answer first
    print("Answer:")
    print(answer)
    # Only print sources when the caller asks for them
    if show_sources:
        print("\nSources:")
        # Start counting at 1 for human-friendly labels
        for i, chunk in enumerate(source_chunks, 1):
            print(f"Source {i}:\n{chunk}\n")