LLMs Are Stateless

Understand why follow-up questions fail and how conversation history fixes them

The problem with follow-ups

Ask your assistant: "What does the chunk_text function do?" Then ask: "What are its default parameter values?"

The second question fails — the model returns a vague answer or says "I don't know." Each API call is independent. The model has no memory of the previous exchange. When you ask "what are its default parameter values?", there is no "it" in the current prompt.

This is what it means for the model to be stateless: it processes each prompt in isolation, with no memory of prior prompts.
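You can see the shape of the problem with a toy stand-in for the API. Here `ask_model` is a hypothetical function, not a real client library, but it captures the key property: the reply depends only on the prompt string it receives in that one call.

```python
def ask_model(prompt: str) -> str:
    # Stand-in for an API call: the reply is a pure function of this prompt.
    # Nothing from earlier calls is available here.
    if "chunk_text" in prompt:
        return "chunk_text splits a text string into overlapping fixed-size chunks."
    return "I don't know what 'it' refers to."

print(ask_model("What does the chunk_text function do?"))   # answers correctly
print(ask_model("What are its default parameter values?"))  # fails: no "it" in this prompt
```

The second call fails not because the model is weak, but because the referent for "its" simply is not present in the prompt it was given.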

The solution: include history in every prompt

The fix is to include the full conversation history in each new prompt. Before sending a question, you prepend the previous turns:

Conversation so far:
You: What does the chunk_text function do?
Assistant: chunk_text splits a text string into overlapping fixed-size chunks.

Question:
What are its default parameter values?

Now the model has context and can answer correctly.
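A minimal sketch of the assembly step, assuming history is kept as a list of (user, assistant) pairs; the helper name `build_prompt` is made up for illustration. It produces the same "Conversation so far / Question" layout shown above:

```python
def build_prompt(history: list[tuple[str, str]], question: str) -> str:
    """Prepend prior turns to the new question in the You:/Assistant: format."""
    lines = ["Conversation so far:"]
    for user_turn, assistant_turn in history:
        lines.append(f"You: {user_turn}")
        lines.append(f"Assistant: {assistant_turn}")
    lines.append("")  # blank line between history and the new question
    lines.append("Question:")
    lines.append(question)
    return "\n".join(lines)

history = [
    ("What does the chunk_text function do?",
     "chunk_text splits a text string into overlapping fixed-size chunks."),
]
prompt = build_prompt(history, "What are its default parameter values?")
print(prompt)
```

After each exchange, you append the new (question, answer) pair to `history`, so every subsequent prompt carries the full conversation.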

A note on context limits

History grows with every turn. Long conversations and large indexed files push the prompt toward the model's context limit — the maximum text it can process in one call. A later course covers strategies for managing this.
