Write the Search Function
Score every chunk against a query and return the top matches
Putting the pieces together
You can now extract text, split it into chunks, embed those chunks, and score two vectors with cosine similarity. The search function ties all of this together: given a user's question, find the chunks that are most likely to contain the answer.
Why the query needs its own embedding
When you embedded chunks in the previous lesson, you used task_type="RETRIEVAL_DOCUMENT". For the user's question you use a different task type: "RETRIEVAL_QUERY".
Why two types? The embedding model is trained to place a short question near the documents that answer it — even though the question and the answer use different words. Setting the task type tells the model which role the text plays, so it can optimise the vector accordingly.
The search algorithm
The function takes five arguments:
| Argument | Purpose |
|---|---|
| `client` | The Gemini API client |
| `query` | The user's question as a string |
| `chunks` | The list of text chunks from the PDF |
| `embeddings` | The list of embedding vectors (one per chunk, same order) |
| `top_k` | How many results to return (default 3) |
The steps inside the function:
- Embed the query to get a query vector.
- Score every chunk by computing cosine similarity between the query vector and the chunk's embedding.
- Sort by score, highest first.
- Return the top-k chunks.
Python patterns in this function
The solution uses two patterns you will see often in Python:
- `zip(embeddings, chunks)` walks two lists in lockstep, pairing the first embedding with the first chunk, the second with the second, and so on.
- List comprehension builds a new list in a single expression. `[(score, chunk) for emb, chunk in zip(...)]` creates a list of `(score, chunk)` pairs.
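To see the two patterns working together, here is a tiny standalone sketch with made-up scores and chunk names (no API calls involved):

```python
# Toy data standing in for real similarity scores and text chunks
scores_demo = [0.91, 0.12, 0.77]
chunks_demo = ["chunk A", "chunk B", "chunk C"]

# zip pairs each score with its chunk; the comprehension
# builds a new list of (score, chunk) tuples in one expression
pairs = [(score, chunk) for score, chunk in zip(scores_demo, chunks_demo)]
print(pairs)  # [(0.91, 'chunk A'), (0.12, 'chunk B'), (0.77, 'chunk C')]
```

In the real function, the score comes from `cosine_similarity` instead of a hard-coded list, but the shape of the expression is the same.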
```python
def search(client, query, chunks, embeddings, top_k=3):
    # Embed the question as a query, not a document
    result = client.models.embed_content(
        model="gemini-embedding-001",
        contents=query,
        config=types.EmbedContentConfig(task_type="RETRIEVAL_QUERY"),
    )
    query_vector = result.embeddings[0].values
    # Score every chunk against the query vector
    scores = [(cosine_similarity(query_vector, emb), chunk)
              for emb, chunk in zip(embeddings, chunks)]
    scores.sort(key=lambda x: x[0], reverse=True)
    return [chunk for _, chunk in scores[:top_k]]
```
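You can exercise the scoring-and-sorting half of `search` in isolation. This sketch substitutes tiny made-up 2-D vectors for real high-dimensional embeddings (the chunk texts and vectors are invented for illustration), so it runs without any API call:

```python
import numpy as np

def cosine_similarity(vec_a, vec_b):
    dot = np.dot(vec_a, vec_b)
    norm = np.linalg.norm(vec_a) * np.linalg.norm(vec_b)
    return dot / norm

# Toy 2-D "embeddings" standing in for real embedding vectors
query_vector = [1.0, 0.0]
embeddings = [[0.9, 0.1], [0.0, 1.0], [0.7, 0.7]]
chunks = ["about dogs", "about taxes", "dogs and taxes"]

# Score, sort descending, keep the top 2 — same steps as in search()
scores = [(cosine_similarity(query_vector, emb), chunk)
          for emb, chunk in zip(embeddings, chunks)]
scores.sort(key=lambda x: x[0], reverse=True)
top_2 = [chunk for _, chunk in scores[:2]]
print(top_2)  # ['about dogs', 'dogs and taxes']
```

The chunk whose vector points most nearly in the same direction as the query vector wins, which is exactly what happens with real embeddings, just in far more dimensions.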
Instructions
Complete the search function. The starter code provides the signature.
- Create a variable named `result`. Assign it `client.models.embed_content(model="gemini-embedding-001", contents=query, config=types.EmbedContentConfig(task_type="RETRIEVAL_QUERY"))`.
- Create a variable named `query_vector`. Assign it `result.embeddings[0].values`.
- Create a variable named `scores`. Assign it a list comprehension that produces `(cosine_similarity(query_vector, emb), chunk)` for each `emb, chunk` in `zip(embeddings, chunks)`.
- Sort `scores` by calling `scores.sort(key=lambda x: x[0], reverse=True)`.
- Return a list comprehension that extracts `chunk` from each `_, chunk` in `scores[:top_k]`.
```python
import os
import time

import numpy as np
import pypdf
from dotenv import load_dotenv
from google import genai
from google.genai import types


def extract_text(pdf_path):
    reader = pypdf.PdfReader(pdf_path)
    pages = [page.extract_text() for page in reader.pages]
    return "\n".join(pages)


def chunk_text(text, chunk_size=500, overlap=100):
    chunks = []
    for i in range(0, len(text), chunk_size - overlap):
        chunks.append(text[i : i + chunk_size])
    return chunks


def preview_chunks(chunks):
    print(f"Total chunks: {len(chunks)}")
    print(f"First chunk:\n{chunks[0]}")


def create_client():
    load_dotenv()
    api_key = os.getenv("GEMINI_API_KEY")
    client = genai.Client(api_key=api_key)
    return client


def embed_text(client, text):
    result = client.models.embed_content(
        model="gemini-embedding-001",
        contents=text,
        config=types.EmbedContentConfig(task_type="RETRIEVAL_DOCUMENT"),
    )
    return result.embeddings[0].values


def embed_all_chunks(client, chunks):
    BATCH_SIZE = 90
    embeddings = []
    for i in range(0, len(chunks), BATCH_SIZE):
        batch = chunks[i : i + BATCH_SIZE]
        for chunk in batch:
            embeddings.append(embed_text(client, chunk))
        if i + BATCH_SIZE < len(chunks):
            print("Rate limit pause — waiting 60 seconds...")
            time.sleep(60)
    return embeddings


def cosine_similarity(vec_a, vec_b):
    dot = np.dot(vec_a, vec_b)
    norm = np.linalg.norm(vec_a) * np.linalg.norm(vec_b)
    return dot / norm


def search(client, query, chunks, embeddings, top_k=3):
    # Step 1: Embed the query with task_type="RETRIEVAL_QUERY"
    # Step 2: Extract query_vector from result
    # Step 3: Score every chunk with cosine_similarity
    # Step 4: Sort scores descending
    # Step 5: Return top_k chunks
    pass
```