collections

Use Counter, defaultdict, and namedtuple for specialized data handling

Beyond basic lists and dicts

Python's built-in list and dict cover most needs. But some tasks require counting items, grouping data, or creating lightweight data objects. Writing this logic by hand is repetitive and error-prone.

The collections module in the standard library provides specialized containers that handle these common patterns in a line or two.

Counter: count things instantly

In the Data Structures course, you wrote loops to count word frequencies. Counter does the same thing without the loop:

from collections import Counter

words = ["apple", "banana", "apple", "cherry", "banana", "apple"]
counts = Counter(words)
print(counts)  # Counter({'apple': 3, 'banana': 2, 'cherry': 1})

Counter is a dictionary subclass. You access counts like a regular dict: counts["apple"] returns 3. Looking up a key that doesn't exist returns 0 instead of raising a KeyError.
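The missing-key behavior can be checked with a short sketch. The fruit names here are made up for illustration:

```python
from collections import Counter

counts = Counter(["apple", "banana", "apple"])

# A missing key reads as 0 and raises no KeyError.
print(counts["mango"])    # 0

# The lookup does not add the key to the counter.
print("mango" in counts)  # False

# Counts can be incremented like ordinary dict values.
counts["mango"] += 1
print(counts["mango"])    # 1
```

Because a read never inserts the key, checking a count does not change what the Counter contains.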

The most_common(n) method returns the n most frequent items as a list of (item, count) pairs, highest count first:

counts.most_common(2)  # [('apple', 3), ('banana', 2)]
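As a further sketch, most_common() can also be called without an argument, in which case it returns every item. The string "mississippi" is just sample input; Counter accepts any iterable, so passing a string counts its characters:

```python
from collections import Counter

letters = Counter("mississippi")

# No argument: all items, most frequent first.
# Items with equal counts keep the order in which they were first seen.
print(letters.most_common())   # [('i', 4), ('s', 4), ('p', 2), ('m', 1)]

# With an argument: only the top n items.
print(letters.most_common(1))  # [('i', 4)]
```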

defaultdict: dictionaries with default values

A regular dict raises KeyError when you look up a missing key with square brackets. defaultdict instead creates the key automatically and gives it a default value:

from collections import defaultdict

groups = defaultdict(list)
groups["fruit"].append("apple")
groups["fruit"].append("banana")
groups["veggie"].append("carrot")
print(dict(groups))  # {'fruit': ['apple', 'banana'], 'veggie': ['carrot']}

The argument to defaultdict is a function, called with no arguments, that creates the default value for each new key. list creates an empty list, int creates 0, and set creates an empty set.
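Here is a minimal sketch of the int and set factories in use. The variable names tally and tags, and the sample data, are made up for illustration:

```python
from collections import defaultdict

# int() returns 0, so missing keys start at zero: handy for tallies.
tally = defaultdict(int)
for letter in "banana":
    tally[letter] += 1
print(dict(tally))  # {'b': 1, 'a': 3, 'n': 2}

# set() returns an empty set, so repeated values are stored only once.
tags = defaultdict(set)
tags["post1"].add("python")
tags["post1"].add("python")
tags["post1"].add("tips")
print(len(tags["post1"]))  # 2
```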

Without defaultdict, you would need an if key not in d check (or a call to d.setdefault()) before every access. defaultdict eliminates that boilerplate.
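To make the comparison concrete, this sketch groups the same data both ways. The pairs list is hypothetical sample data:

```python
from collections import defaultdict

pairs = [("fruit", "apple"), ("veggie", "carrot"), ("fruit", "banana")]

# Plain dict: check for the key before every append.
manual = {}
for category, item in pairs:
    if category not in manual:
        manual[category] = []
    manual[category].append(item)

# defaultdict: the empty list is created on first access.
grouped = defaultdict(list)
for category, item in pairs:
    grouped[category].append(item)

print(manual == dict(grouped))  # True
```

Both loops produce the same result; the defaultdict version just drops the membership check.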

namedtuple: lightweight data objects

A regular tuple stores values by position. You access them with indices like point[0] and point[1]. This works, but the code is hard to read — what does index 0 mean?

namedtuple creates tuple subclasses with named fields:

from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)
print(p.x)  # 3
print(p.y)  # 4

Named tuples are immutable, like regular tuples, but their fields are accessed by name, which keeps the code readable. Use them for small, fixed data structures such as coordinates, RGB colors, or database records.
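The sketch below illustrates immutability using the Point type from the example above. It also uses _replace() and _asdict(), two standard namedtuple methods that this lesson does not otherwise cover:

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)

# Fields cannot be reassigned; trying raises AttributeError.
try:
    p.x = 10
except AttributeError:
    print("cannot modify a namedtuple field")

# _replace() returns a new tuple with the given fields changed.
moved = p._replace(x=10)
print(moved)  # Point(x=10, y=4)
print(p)      # Point(x=3, y=4)

# It still behaves like a tuple: indexing and unpacking work.
x, y = p
print(p[0], x, y)  # 3 3 4

# _asdict() converts the fields to a dictionary.
print(p._asdict() == {"x": 3, "y": 4})  # True
```

Since the original tuple is never changed, p can be shared safely between parts of a program.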

Instructions

Use Counter, defaultdict, and namedtuple to process data.

  1. Import Counter, defaultdict, and namedtuple from collections.
  2. Create a variable named words and assign it the list ["python", "java", "python", "rust", "java", "python", "go", "rust"].
  3. Create a variable named word_counts and assign it Counter(words).
  4. Call print() with word_counts.most_common(3).
  5. Create a variable named scores and assign it defaultdict(list). Call scores["math"].append(95). Call scores["math"].append(87). Call scores["science"].append(92).
  6. Call print() with dict(scores).
  7. Create a namedtuple type named Student with fields "name", "grade", "gpa". Assign it to a variable named Student.
  8. Create a variable named s and assign it Student("Alice", 10, 3.8).
  9. Call print() with f"{s.name}: Grade {s.grade}, GPA {s.gpa}".