Mini Project: Text Statistics
Build a text analyzer that counts words, sentences, and top words
What you will build
A text analyzer that reads an article file and reports word count, sentence count, and the three most common words.
Example output
Words: 26
Sentences: 5
Top 3 words: [('data', 5), ('python', 5), ('is', 2)]
Design
| Function | Purpose |
|---|---|
| `count_words(text)` | Return the total word count |
| `count_sentences(text)` | Return the number of sentences (the count of `.` characters) |
| `top_words(text, n)` | Return the `n` most frequent words as a sorted list of tuples |
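As a rough sketch of the first two rows of the table, the functions below follow the design exactly. The `sample` string is made up for illustration. Note that counting periods is a simplification: a decimal point or an abbreviation also counts as a sentence end.

```python
def count_words(text):
    # split() with no arguments splits on any run of whitespace,
    # so double spaces and newlines do not create empty "words".
    return len(text.split())


def count_sentences(text):
    # Simplification from the design: every "." is treated as a sentence end.
    return text.count(".")


# Illustrative sample only (note the double space after the first period).
sample = "Data science uses Python.  Python is the top language for data."
print(count_words(sample))      # 11
print(count_sentences(sample))  # 2

# Known limitation: the decimal point is counted too.
print(count_sentences("Version 3.12 is out."))  # 2
```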
Note on top_words
Lowercase the text and strip punctuation from each word before counting, so that "Python.", "python", and "Python" all count as the same word. Sort the results by count in descending order, then alphabetically to break ties.
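One way to sketch this note in code is shown here. The `sample` sentence is invented for illustration. The sort key `(-count, word)` puts higher counts first and falls back to alphabetical order when counts are equal.

```python
from collections import Counter


def top_words(text, n):
    # Lowercase first so "Data" and "data" are the same word.
    words = text.lower().split()
    # strip() removes the listed characters from both ends of each word only.
    clean = [word.strip(".,!?;:") for word in words]
    counter = Counter(clean)
    # Negative count sorts descending; the word itself breaks ties alphabetically.
    sorted_items = sorted(counter.items(), key=lambda item: (-item[1], item[0]))
    return sorted_items[:n]


# Illustrative sample: "python", "is", and "data" each appear twice.
sample = "Python is fun. Data is useful, and data loves Python!"
print(top_words(sample, 3))  # [('data', 2), ('is', 2), ('python', 2)]
```

Because all three words are tied at two occurrences, the output order is decided entirely by the alphabetical tie-break.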
Instructions
Build the text statistics analyzer.
- Add `from collections import Counter` at the top.
- Define a function named `count_words` that takes `text`. Inside, return `len(text.split())`.
- Define a function named `count_sentences` that takes `text`. Inside, return `text.count(".")`.
- Define a function named `top_words` that takes `text` and `n`. Inside, create `words` by calling `text.lower().split()`. Create `clean` as a list comprehension that strips `".,!?;:"` from each word in `words`. Create `counter = Counter(clean)`. Create `sorted_items` by calling `sorted(counter.items(), key=lambda item: (-item[1], item[0]))`. Return `sorted_items[:n]`.
- Open `article.txt` in read mode and assign `file.read()` to `text`.
- Call `print(f"Words: {count_words(text)}")`.
- Call `print(f"Sentences: {count_sentences(text)}")`.
- Call `print(f"Top 3 words: {top_words(text, 3)}")`.
# article.txt contains:
# Data science uses Python. Python is the top language for data.
# Data analysis with Python is rewarding. Python excels at data processing.
# Data engineers prefer Python.

# Step 1: Import Counter from collections
# Step 2: Define count_words(text)
# Step 3: Define count_sentences(text)
# Step 4: Define top_words(text, n) - strip punctuation, count, sort, return top n
# Step 5: Open article.txt and read into text
# Step 6: Print word count
# Step 7: Print sentence count
# Step 8: Print top 3 words
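The file-reading step can be tried outside the lesson editor with a sketch like the one below. It assumes you do not already have an `article.txt`, so it first writes the sample article to a temporary folder. The `read_text` helper, the `ARTICLE` constant, and the line breaks inside the article are assumptions made for illustration, not part of the exercise.

```python
import tempfile
from pathlib import Path

# The sample article from the exercise; the line breaks are an assumption.
ARTICLE = (
    "Data science uses Python. Python is the top language for data.\n"
    "Data analysis with Python is rewarding. Python excels at data processing.\n"
    "Data engineers prefer Python.\n"
)


def read_text(path):
    """Hypothetical helper: return the full contents of a text file."""
    # The with-statement closes the file even if reading raises an error.
    with open(path, "r", encoding="utf-8") as file:
        return file.read()


with tempfile.TemporaryDirectory() as folder:
    article_path = Path(folder) / "article.txt"
    article_path.write_text(ARTICLE, encoding="utf-8")
    text = read_text(article_path)

# Same expressions the exercise uses inside count_words and count_sentences.
print(f"Words: {len(text.split())}")        # Words: 26
print(f"Sentences: {text.count('.')}")      # Sentences: 5
```

In the exercise itself the file already exists, so only the `with open("article.txt", "r") as file:` pattern is needed.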