Cache Embeddings to Disk

Save embeddings to a JSON file so the app skips re-embedding on subsequent runs


The problem with re-embedding every run

Embedding 1,731 chunks takes several minutes and costs API quota. If you ask a second question about the same PDF, you should not pay that cost again.

The solution: save the chunks and their vectors to a JSON file after the first run. On the next run, load from that file instead of calling the API.

Run 1: embed → save to cache.json → answer question
Run 2: load from cache.json → answer question (instant)

The cache format

A single JSON object with two keys:

{
  "chunks": ["chunk one...", "chunk two...", ...],
  "embeddings": [[0.12, -0.04, ...], [0.33, 0.91, ...], ...]
}

The json module is part of Python's standard library — no installation required.
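To see the format in action, here is a minimal round-trip with toy data (the chunk strings and two-dimensional vectors below are placeholders, not real embeddings):

```python
import json

# Toy data in the cache format shown above.
data = {
    "chunks": ["chunk one...", "chunk two..."],
    "embeddings": [[0.12, -0.04], [0.33, 0.91]],
}

# Write the cache to disk...
with open("cache.json", "w") as f:
    json.dump(data, f)

# ...and read it back.
with open("cache.json") as f:
    loaded = json.load(f)

print(loaded["chunks"][0])  # chunk one...
```

Because JSON maps directly onto Python lists and dicts, the loaded object is structurally identical to what was saved, so no extra parsing step is needed.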

Instructions

Write two functions. The starter code provides both signatures.

  1. In save_embeddings, create a variable named data. Assign it a dict with two keys: "chunks" set to chunks, and "embeddings" set to embeddings.
  2. Open cache_path for writing using with open(cache_path, "w") as f:.
  3. Inside the with block, call json.dump(data, f).
  4. In load_embeddings, add an if statement: if not os.path.exists(cache_path):. Inside it, return None.
  5. Open cache_path for reading using with open(cache_path) as f:.
  6. Inside the with block, create a variable named data. Assign it json.load(f).
  7. Return data["chunks"], data["embeddings"].
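Putting the steps together, the two functions look roughly like this. The parameter names and order here are assumptions; follow the signatures in your starter code, which may differ:

```python
import json
import os

def save_embeddings(cache_path, chunks, embeddings):
    # Steps 1-3: bundle the chunks and vectors, then write them as JSON.
    data = {"chunks": chunks, "embeddings": embeddings}
    with open(cache_path, "w") as f:
        json.dump(data, f)

def load_embeddings(cache_path):
    # Step 4: no cache file means this is the first run.
    if not os.path.exists(cache_path):
        return None
    # Steps 5-6: read the file and parse it back into a dict.
    with open(cache_path) as f:
        data = json.load(f)
    # Step 7: hand back both lists.
    return data["chunks"], data["embeddings"]
```

A caller can then try the cache first: if `load_embeddings` returns `None`, embed the chunks and call `save_embeddings`; otherwise unpack the returned chunks and vectors and skip the API entirely.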