Lesson Complete!
Extract and Chunk Text
What you did in this lesson
- Learned why token limits make chunking necessary
- Wrote extract_text(), which pulls all page text into one string
- Wrote chunk_text(), which splits that string into overlapping pieces
- Wrote preview_chunks(), which inspects the output before moving on
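
As a quick refresher, here is a minimal sketch of what overlapping chunking can look like. It is not necessarily the exact code from the lesson: the character-based splitting, the parameter names (chunk_size, overlap, limit, width), and their default values are all assumptions chosen for illustration.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into pieces of at most chunk_size characters.

    Each piece starts `overlap` characters before the previous one ended,
    so neighbouring chunks share some context. (Sizes are in characters
    here as an assumption; the lesson may count tokens or words instead.)
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # otherwise a trailing chunk made only of overlap would be added.
        if start + chunk_size >= len(text):
            break
    return chunks


def preview_chunks(chunks, limit=3, width=60):
    """Return one summary line per chunk for the first `limit` chunks."""
    return [
        f"[{i}] {len(chunk)} chars: {chunk[:width]!r}"
        for i, chunk in enumerate(chunks[:limit])
    ]


if __name__ == "__main__":
    sample = "abcdefghij"
    pieces = chunk_text(sample, chunk_size=4, overlap=1)
    for line in preview_chunks(pieces):
        print(line)
```

With chunk_size=4 and overlap=1, the last character of each chunk is repeated as the first character of the next, which is the overlap doing its job on a small scale.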
What comes next
You now have a list of text chunks. In Lesson 3, you will convert each chunk into a vector: a list of numbers that captures its meaning. Comparing these vectors is what makes semantic search possible.