What Embeddings Are

Understand vectors as meaning in number form — no linear algebra required

Numbers that capture meaning

An embedding is a list of numbers — typically 768 or more — that represents the meaning of a piece of text.

The key property: text with similar meaning has similar numbers.

Consider these two sentences:

  • "The invoice total is $450."
  • "The amount owed is $450."

An embedding model assigns them vectors that sit very close together in vector space, even though the words are different.

A semantically unrelated sentence — "The weather is nice today." — gets a vector that points in a completely different direction.
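"Close" and "different direction" are usually measured with cosine similarity: the cosine of the angle between two vectors, near 1.0 for vectors pointing the same way and near 0 for unrelated ones. A minimal sketch with made-up 3-dimensional vectors standing in for real embeddings (which have hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors, NOT real model output -- chosen only to illustrate the geometry
invoice = [0.9, 0.1, 0.0]   # "The invoice total is $450."
amount  = [0.8, 0.2, 0.1]   # "The amount owed is $450."
weather = [0.0, 0.1, 0.9]   # "The weather is nice today."

print(cosine_similarity(invoice, amount))   # high -- similar meaning
print(cosine_similarity(invoice, weather))  # low -- unrelated
```

The two invoice sentences score close to 1.0; the weather sentence scores close to 0.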

Why this matters for search

When you embed your question and embed your chunks, you can measure how close each chunk is to the question in vector space. The closest chunks are the most relevant.

This is semantic search — it finds meaning, not just matching words.
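The retrieval step above can be sketched as: embed every chunk once, embed the query, then rank chunks by similarity. Toy vectors are used here in place of real API output:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Pretend these vectors came from an embedding model (toy values for illustration)
chunks = {
    "The invoice total is $450.": [0.9, 0.1, 0.0],
    "Payment is due in 30 days.": [0.6, 0.5, 0.1],
    "The weather is nice today.": [0.0, 0.1, 0.9],
}
query_vector = [0.85, 0.2, 0.05]  # e.g. the embedded question "How much do I owe?"

# Rank chunks from most to least similar to the query
ranked = sorted(chunks, key=lambda text: cosine_similarity(chunks[text], query_vector),
                reverse=True)
print(ranked[0])  # the most semantically relevant chunk
```

Note that the top result matches on meaning: the query shares almost no words with the winning chunk.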

What the Gemini API returns

When you call genai.embed_content(), the API returns a dictionary. The key "embedding" contains a Python list of floats.

import google.generativeai as genai

result = genai.embed_content(
    model="models/text-embedding-004",
    content="The invoice total is $450.",
    task_type="retrieval_document"
)
vector = result["embedding"]  # list of 768 floats

You will call this function once per chunk, then once per query.
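In practice that becomes a loop: embed all chunks up front (with task_type="retrieval_document"), then embed each incoming query (the Gemini API uses a separate task_type, "retrieval_query", for that side). The shape of the code, with a hypothetical embed() helper stubbed in so it runs without an API key -- in real code it would wrap the genai.embed_content() call above:

```python
# Hypothetical helper: a real version would call genai.embed_content(...)
# and return result["embedding"]. Stubbed with toy vectors for illustration.
def embed(text, task_type):
    toy = {
        "chunk A": [1.0, 0.0],
        "chunk B": [0.0, 1.0],
        "my question": [0.9, 0.1],
    }
    return toy[text]

# Once per chunk, up front
chunk_texts = ["chunk A", "chunk B"]
index = [(text, embed(text, "retrieval_document")) for text in chunk_texts]

# Once per query, at search time
query_vector = embed("my question", "retrieval_query")
```

Storing the (text, vector) pairs means each chunk is embedded only once, no matter how many queries arrive later.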

Next Chapter →