Lecture 04
Python dicts are heterogeneous, ordered (by insertion, guaranteed since Python 3.7), mutable containers of key-value pairs. Each entry consists of a key (immutable) and a value (anything); dicts are designed for efficient lookup of values by key.
Dicts are typically constructed using {} with : separating each key from its value,
or via dict() using a list of key value tuples,
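As a minimal sketch of the two construction styles just described (the keys and values here are purely illustrative):

```python
# Literal syntax: each key is separated from its value by ':'
d1 = {"abc": 123, "def": 456}

# dict() applied to a list of (key, value) tuples
d2 = dict([("abc", 123), ("def", 456)])

# Both approaches build equal dictionaries
print(d1 == d2)  # True
```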
The keys for a dict must be immutable (more precisely, hashable) objects, e.g. an int, float, string, or a tuple of immutable values. Values may be of any type (mutable or immutable).
The [] operator exists for dicts but is used for key-based value lookups,
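A short sketch of the key restriction, using made-up keys: immutable objects work, while a mutable list is rejected with a TypeError (the exact message varies between Python versions, so it is not shown).

```python
# Immutable (hashable) objects are all acceptable keys
valid = {1: "int", 2.5: "float", "a": "str", (1, 2): "tuple"}

# A list is mutable and therefore unhashable, so it cannot be a key
try:
    invalid = {[1, 2]: "list"}
    raised = False
except TypeError:
    raised = True

print(raised)  # True
```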
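A minimal sketch of key-based lookup with an illustrative dict. The .get() method is not mentioned above; it is included here as a standard alternative that avoids the KeyError raised for a missing key.

```python
d = {"abc": 123, "def": 456}

# Lookup is by key, not by position
print(d["abc"])  # 123

# A missing key raises KeyError
try:
    d["xyz"]
    missing = False
except KeyError:
    missing = True

# .get() returns None (or a supplied default) instead of raising
print(d.get("xyz"))     # None
print(d.get("xyz", 0))  # 0
```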
Since dictionaries are mutable, it is possible to insert new key value pairs as well as replace the value associated with an existing key.
New key/value pairs can be inserted,
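Both operations use the same assignment syntax; a small sketch with illustrative keys:

```python
d = {"abc": 123, "def": 456}

# Assigning to an existing key replaces its value
d["def"] = -1

# Assigning to a new key inserts a new pair (at the end)
d["ghi"] = 789

print(d)  # {'abc': 123, 'def': -1, 'ghi': 789}
```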
The del statement can be used to remove a key and its value,
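A brief sketch of removal with an illustrative dict; note that deleting a key that is not present raises a KeyError.

```python
d = {"abc": 123, "def": 456, "ghi": 789}

# Remove the key "abc" together with its value
del d["abc"]
print(d)  # {'def': 456, 'ghi': 789}

# Deleting a key that does not exist raises KeyError
try:
    del d["xyz"]
    failed = False
except KeyError:
    failed = True
```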
Dictionaries can be used with for loops (and comprehensions). They will both iterate over the keys only. To iterate over the keys and values use items().
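The difference between iterating over the dict itself and over items() can be sketched as follows (illustrative data):

```python
d = {"abc": 123, "def": 456}

# Iterating over a dict yields its keys only
keys = [k for k in d]
print(keys)  # ['abc', 'def']

# items() yields (key, value) tuples, which can be unpacked in the loop
for k, v in d.items():
    print(k, v)

pairs = [(k, v) for k, v in d.items()]
print(pairs)  # [('abc', 123), ('def', 456)]
```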
Write a function that takes two dictionaries as arguments and merges them into a single dictionary. If there are any duplicate keys, the value from the second dictionary should be used.
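One possible approach to this exercise, offered as a sketch rather than the intended solution; the function name merge is hypothetical. It copies the first dict so that neither argument is modified.

```python
def merge(d1, d2):
    """Return a new dict with the pairs of d1 and d2; d2 wins on duplicate keys."""
    result = dict(d1)   # shallow copy, so d1 is left untouched
    result.update(d2)   # values from d2 overwrite those for duplicate keys
    return result

a = {"x": 1, "y": 2}
b = {"y": 3, "z": 4}
print(merge(a, b))  # {'x': 1, 'y': 3, 'z': 4}
```

In Python 3.9 and later the expression a | b gives the same result.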
In Python a set is a heterogeneous, unordered, mutable container of unique immutable elements.
A set is constructed using {} (without using :) or via set(),
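A minimal sketch of both construction styles with illustrative values. One detail worth keeping in mind: an empty {} creates a dict, so an empty set has to be created with set().

```python
# Literal syntax: duplicates are silently dropped
s1 = {1, 2, 3, 3, "a"}
print(len(s1))  # 4

# set() applied to any iterable
s2 = set([1, 2, 2, 3])
print(s2 == {1, 2, 3})  # True

# {} on its own is an empty dict, not an empty set
empty_set = set()
print(type({}).__name__, type(empty_set).__name__)  # dict set
```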
Sets do not use the [] operator for element checking or removal,
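A short sketch with an illustrative set: membership is tested with the in operator, and attempting to index with [] raises a TypeError.

```python
s = {1, 2, 3}

# Membership is checked with `in`
print(2 in s)   # True
print(10 in s)  # False

# Sets have no positions or keys, so [] is not supported
try:
    s[0]
    indexable = True
except TypeError:
    indexable = False

print(indexable)  # False
```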
Sets have their own special methods for adding and removing elements,
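A sketch of the most common of these methods, using an illustrative set: add() inserts an element, remove() deletes one (raising KeyError if it is absent), and discard() deletes one without complaining if it is absent.

```python
s = {1, 2, 3}

s.add(4)       # insert a new element
s.add(1)       # adding an existing element has no effect
s.remove(2)    # remove an element; raises KeyError if it is missing
s.discard(10)  # like remove(), but silent if the element is missing

print(s == {1, 3, 4})  # True
```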
It is possible to use comprehensions with both sets and dicts,
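For example, a dict comprehension uses {} with a key: value expression; this sketch maps a few integers to their squares.

```python
# Dict comprehension: key and value expressions separated by ':'
squares = {x: x**2 for x in range(5)}
print(squares)  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
```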
Set comprehension,
{'h', 'u', 'k', 'm', 'd', 'y', 'z', ' ', 'l', 'b', 'r', 'w', 'e', 'c', 'a', 'g', 'x', 'i', 't', 'j', 'f', 'o', 'p', 'q', 'n'}
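A set like the one above can be produced by a set comprehension over the characters of a string. The input string in this sketch is an assumption chosen for illustration, not the original example. Because sets are unordered, the elements are sorted before printing to get a reproducible display.

```python
# Set comprehension: {} with a single expression (no ':')
text = "hello world"  # illustrative input
letters = {c for c in text}

# Each character appears once, regardless of how often it occurs in text
print(sorted(letters))  # [' ', 'd', 'e', 'h', 'l', 'o', 'r', 'w']
```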
Note that tuple comprehensions do not exist; wrapping a comprehension in parentheses, e.g. (x**2 for x in range(5)), constructs a generator instead (more on these later).
If necessary you can use a list comprehension and then cast to a tuple,
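A small sketch contrasting the two: the parenthesized form yields a generator object, while converting a list comprehension gives an actual tuple.

```python
# Parentheses around a comprehension create a generator, not a tuple
gen = (x**2 for x in range(5))
print(type(gen).__name__)  # generator

# A list comprehension converted to a tuple
t = tuple([x**2 for x in range(5)])
print(t)  # (0, 1, 4, 9, 16)
```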
Deques (collections.deque) are heterogeneous, ordered, mutable collections of elements and behave in much the same way as lists. They are designed to be efficient for adding and removing elements from the beginning and end of the collection.
Values can be added to the beginning and end via .appendleft() and .append() respectively,
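A minimal sketch using collections.deque with illustrative values; the matching removal methods popleft() and pop() are shown as well.

```python
from collections import deque

d = deque([1, 2, 3])

d.appendleft(0)  # add to the beginning
d.append(4)      # add to the end
print(d)  # deque([0, 1, 2, 3, 4])

first = d.popleft()  # remove from the beginning
last = d.pop()       # remove from the end
print(first, last, d)  # 0 4 deque([1, 2, 3])
```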
Deques can be constructed with an optional maxlen argument which sets their maximum size - if this is exceeded values from the opposite end will be dropped.
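The dropping behaviour can be sketched as follows (illustrative values): appending to a full deque discards from the left, and appending on the left discards from the right.

```python
from collections import deque

d = deque([1, 2, 3], maxlen=3)

d.append(4)      # deque is full, so 1 is dropped from the left
print(d)  # deque([2, 3, 4], maxlen=3)

d.appendleft(0)  # now 4 is dropped from the right
print(d)  # deque([0, 2, 3], maxlen=3)
```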
Big-O notation is a tool used to describe the complexity (usually in time but also in memory usage) of an algorithm. The goal is to broadly group algorithms based on how their complexity grows as the size of an input grows.
Consider a mathematical function that exactly captures this relationship (e.g. the number of steps in a given algorithm given an input of size \(n\)). The Big-O value for that algorithm will then be the fastest-growing term involving \(n\) in that function, with any constant coefficient dropped.
Generally the performance of an algorithm will vary depending on the exact nature of the data, so we often talk about Big-O in terms of expected complexity and worst case complexity. We also often consider amortization for these worst cases, i.e. averaging the cost of an occasional expensive operation over many cheap ones.
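As a worked illustration, the hypothetical function below counts its own steps. For an input of size \(n\) it performs exactly \(n^2 + n\) steps, so its Big-O value is O(\(n^2\)): as \(n\) grows, the \(n\) term becomes negligible next to \(n^2\).

```python
def count_steps(n):
    """Count the steps taken by a simple nested loop over an input of size n."""
    steps = 0
    for i in range(n):
        steps += 1          # outer loop body: n steps in total
        for j in range(n):
            steps += 1      # inner loop body: n * n steps in total
    return steps            # exactly n**2 + n

for n in (10, 100, 1000):
    # The ratio to n**2 approaches 1 as n grows
    print(n, count_steps(n), count_steps(n) / n**2)
```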
| Complexity | Big-O |
|---|---|
| Constant | O(\(1\)) |
| Logarithmic | O(\(\log n\)) |
| Linear | O(\(n\)) |
| Quasilinear | O(\(n \log n\)) |
| Quadratic | O(\(n^2\)) |
| Cubic | O(\(n^3\)) |
| Exponential | O(\(C^n\)) |
Note that these terms ignore all smaller order terms as well as the constant in front of the highest order term (this can be significant in practice).
| Operation | list (array) | dict (& set) | deque |
|---|---|---|---|
| Copy | O(n) | O(n) | O(n) |
| Append | O(1) | — | O(1) |
| Insert | O(n) | O(1) | O(n) |
| Get item | O(1) | O(1) | O(n) |
| Set item | O(1) | O(1) | O(n) |
| Delete item | O(n) | O(1) | O(n) |
| x in s | O(n) | O(1) | O(n) |
| pop() | O(1) | — | O(1) |
| pop(0) / popleft() | O(n) | — | O(1) |
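The last row of the table can be checked empirically. The sketch below is a rough timing comparison, not a rigorous benchmark; the helper names and the size n are arbitrary choices, and the actual timings will depend on the machine. It repeatedly removes the first element from a list (via pop(0)) and from a deque (via popleft()).

```python
from collections import deque
from timeit import timeit

def drain_list(n):
    """Empty a list from the front; each pop(0) shifts the remaining items."""
    items = list(range(n))
    while items:
        items.pop(0)

def drain_deque(n):
    """Empty a deque from the front; each popleft() is constant time."""
    items = deque(range(n))
    while items:
        items.popleft()

n = 20_000
t_list = timeit(lambda: drain_list(n), number=1)
t_deque = timeit(lambda: drain_deque(n), number=1)
print(f"list.pop(0): {t_list:.4f}s  deque.popleft(): {t_deque:.4f}s")
```

Increasing n should make the gap grow, since draining the list is quadratic overall while draining the deque is linear.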
For each of the following scenarios, which is the most appropriate data structure and why?
A fixed collection of 100 integers.
A queue (first in first out) of customer records.
A stack (first in last out) of customer records.
A count of word occurrences within a document.
The heights of the bars in a histogram with even binwidths.
To tie things back to Sta 523 - these R objects are implemented using the following data structures:
Atomic vectors - Array of the given type (int, double, etc.)
Generic vectors (lists) - Array of SEXPs (R object pointers)
Environments - Hash map with string-based keys
Pairlists - Linked list
Sta 663 - Spring 2026