One-Shot vs. Interactive - Add an Interactive Loop | Build an AI-Powered CLI Assistant

Understand why a persistent loop is better than restarting the program for each question

The cost of restarting

Right now the assistant runs once: it indexes your folder (or loads the index from cache), answers a single question, and exits. To ask a second question, you restart the program. Even with the cache, every restart means loading the JSON file and re-creating the client before the query can be embedded and answered.

An interactive assistant does that setup once and then waits for questions. The startup cost is paid exactly once per session.
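To make the difference concrete, here is a minimal sketch that counts how often the setup work runs in each style. The names `expensive_setup`, `answer`, `one_shot`, and `interactive` are hypothetical stand-ins for this illustration, not functions from the project.

```python
setup_calls = 0  # counts how many times the setup work runs


def expensive_setup():
    """Hypothetical stand-in for loading the cached index and creating the client."""
    global setup_calls
    setup_calls += 1
    return {"index": ["chunk-1", "chunk-2"], "client": object()}


def answer(state, question):
    """Hypothetical stand-in for search + prompt + generate."""
    return f"answer to {question!r} using {len(state['index'])} chunks"


def one_shot(questions):
    """Simulates restarting the program for every question."""
    return [answer(expensive_setup(), q) for q in questions]


def interactive(questions):
    """Simulates one session: set up once, then answer many questions."""
    state = expensive_setup()
    return [answer(state, q) for q in questions]


questions = ["first", "second", "third"]

setup_calls = 0
one_shot(questions)
one_shot_setups = setup_calls  # one setup per question

setup_calls = 0
interactive(questions)
interactive_setups = setup_calls  # one setup for the whole session

print(one_shot_setups, interactive_setups)
```

Both styles produce the same answers; only the number of setup calls differs, and the gap grows with every extra question in a session.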

The REPL pattern

A REPL (Read-Eval-Print Loop) reads input, processes it, prints output, and repeats. The Python pattern is:

while True:
    question = input("You: ").strip()
    if not question:
        continue
    # search, prompt, generate, print

while True — the loop runs until the program is explicitly stopped (Ctrl+C or a quit command you add later).
input("You: ") — displays You: as a prompt and blocks until the user presses Enter.
.strip() — removes leading and trailing whitespace so accidental spaces don't cause unexpected behavior.
if not question: continue — skips to the next iteration when the user presses Enter without typing anything, preventing a search with an empty query.
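The loop above can be extended so that it also exits cleanly. The sketch below is one possible shape, not the version built later in this course: the function name `run_loop`, the `quit`/`exit` commands, and the `read`/`write` parameters are all assumptions made for illustration. Passing `read` and `write` in as arguments lets the loop be exercised with scripted input instead of a real keyboard.

```python
def run_loop(answer, read=input, write=print):
    """Read questions until the user quits; return how many were answered."""
    answered = 0
    while True:
        try:
            question = read("You: ").strip()
        except (KeyboardInterrupt, EOFError):
            # Ctrl+C raises KeyboardInterrupt; end-of-input (Ctrl+D) raises EOFError.
            write("\nGoodbye.")
            break
        if not question:
            continue  # empty line: ask again without searching
        if question.lower() in {"quit", "exit"}:  # assumed quit commands
            write("Goodbye.")
            break
        # In the real assistant this is where search, prompt, and generate run.
        write(f"Assistant: {answer(question)}")
        answered += 1
    return answered


# Scripted demo: an empty line, a padded question, then a quit command.
scripted = iter(["", "  what is a REPL?  ", "quit"])
outputs = []
count = run_loop(
    lambda q: q.upper(),                 # placeholder for the real pipeline
    read=lambda prompt: next(scripted),  # replaces input()
    write=outputs.append,                # replaces print()
)
```

In normal use you would call `run_loop(some_answer_function)` and let the defaults `input` and `print` handle the terminal.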

What the next chapters do

Before wiring the loop, you will update build_prompt to handle the dict chunks that index_folder returns. Then you will add the loop function, connect the pipeline inside it, and update main to call it.
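As a rough preview of the first step, a build_prompt that accepts dict chunks might look like the sketch below. The keys `"source"` and `"text"`, the parameter order, and the prompt wording are assumptions for illustration; the actual shape of the dicts that index_folder returns is defined in the next chapter.

```python
def build_prompt(question, chunks):
    """Assemble a prompt from dict chunks.

    Assumes each chunk looks like {"source": "...", "text": "..."};
    the real keys may differ.
    """
    context = "\n\n".join(
        f"[{chunk['source']}]\n{chunk['text']}" for chunk in chunks
    )
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


example_chunks = [
    {"source": "notes.md", "text": "A REPL reads, evaluates, prints, and loops."},
    {"source": "todo.md", "text": "Add a quit command."},
]
prompt = build_prompt("What is a REPL?", example_chunks)
```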