Embed All Chunks

Loop over every chunk in batches to stay within the free-tier rate limit

Embedding a list of chunks

The Gemini free tier allows 100 embedding requests per minute. A large PDF can produce hundreds of chunks, so you must pace your requests.

One way to do that is to process chunks in batches of 90, which leaves a margin below the 100-request limit, and pause for 60 seconds after every batch except the last.

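import time

# Signature assumed from the starter code; embed_text is assumed to be defined already.
def embed_all_chunks(client, chunks):
    BATCH_SIZE = 90  # stays under the 100-requests-per-minute limit
    embeddings = []
    for i in range(0, len(chunks), BATCH_SIZE):
        batch = chunks[i : i + BATCH_SIZE]
        for chunk in batch:
            embeddings.append(embed_text(client, chunk))
        # Pause only if another batch remains.
        if i + BATCH_SIZE < len(chunks):
            print("Rate limit pause — waiting 60 seconds...")
            time.sleep(60)
    return embeddings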
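
The loop pauses once after every batch except the last, so the total waiting time can be estimated before you run it. The sketch below is only an illustration: estimate_pause_time is a hypothetical helper, not part of the exercise, and it counts the pauses alone, not the time the requests themselves take.

```python
import math

BATCH_SIZE = 90      # batch size used in the lesson
PAUSE_SECONDS = 60   # pause between batches used in the lesson


def estimate_pause_time(num_chunks: int) -> int:
    """Return the total seconds spent pausing (hypothetical helper)."""
    num_batches = math.ceil(num_chunks / BATCH_SIZE)
    pauses = max(num_batches - 1, 0)  # no pause after the final batch
    return pauses * PAUSE_SECONDS


print(estimate_pause_time(250))  # 120: three batches, two pauses
```

A document that fits in one batch (90 chunks or fewer) never pauses at all.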

Because each vector is appended in the order its chunk is processed, the index of each vector matches the index of its chunk: embeddings[3] is the vector for chunks[3].
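
This index alignment is what lets you pair each chunk with its vector later. A small sketch with placeholder text and made-up two-number vectors (real embeddings are much longer), using Python's built-in zip:

```python
chunks = ["first chunk", "second chunk", "third chunk"]  # placeholder text
embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]        # made-up vectors

# zip pairs items by position, so each chunk travels with its own vector.
records = [
    {"text": chunk, "vector": vector}
    for chunk, vector in zip(chunks, embeddings)
]

print(records[2]["text"])  # third chunk
```

The pairing only holds if no chunk is skipped or reordered while embedding, which is why the loop appends exactly one vector per chunk.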

Instructions

Complete the embed_all_chunks function. The starter code provides the signature.

  1. Create a variable named BATCH_SIZE. Assign it 90.
  2. Create an empty list named embeddings.
  3. Create a for loop with variable i over range(0, len(chunks), BATCH_SIZE).
  4. Inside the loop, create a variable named batch. Assign it chunks[i : i + BATCH_SIZE].
  5. Create an inner for loop with variable chunk over batch. Inside it, append embed_text(client, chunk) to embeddings.
  6. After the inner loop, add an if statement: if i + BATCH_SIZE < len(chunks):. Inside it, print "Rate limit pause — waiting 60 seconds..." and call time.sleep(60).
  7. After the outer loop, return embeddings.
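
If you want to check the batching logic without spending API quota or waiting a full minute, a dry run with stand-ins can help. The sketch below re-creates the same loop with the embedder and the sleep function passed in as parameters. The names embed_in_batches, embed_fn, sleep_fn, and fake_embed are hypothetical and are not part of the starter code.

```python
import time
from typing import Callable, List, Sequence

BATCH_SIZE = 90  # same batch size as the lesson


def embed_in_batches(
    chunks: Sequence[str],
    embed_fn: Callable[[str], List[float]],
    sleep_fn: Callable[[float], None] = time.sleep,
    pause_seconds: float = 60,
) -> List[List[float]]:
    """Same loop as the exercise, with the embedder and sleeper injected."""
    embeddings = []
    for i in range(0, len(chunks), BATCH_SIZE):
        for chunk in chunks[i : i + BATCH_SIZE]:
            embeddings.append(embed_fn(chunk))
        # Pause only when another batch remains.
        if i + BATCH_SIZE < len(chunks):
            sleep_fn(pause_seconds)
    return embeddings


def fake_embed(text: str) -> List[float]:
    """Stand-in for embed_text: a one-number 'vector', not a real embedding."""
    return [float(len(text))]


pauses = []  # records each requested pause instead of actually waiting
sample_chunks = [f"chunk {n}" for n in range(200)]
vectors = embed_in_batches(sample_chunks, fake_embed, sleep_fn=pauses.append)

print(len(vectors))  # 200
print(len(pauses))   # 2
```

With 200 chunks the loop runs three batches (90, 90, and 20), so it should request exactly two pauses and return one vector per chunk.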