Collections
Categories: magic methods
A collection is a general name for types of object that contain a group of other items. Examples are lists, tuples, dictionaries and sets.
You can create your own collections. Our example Matrix class is a collection. In a later article we will look at a priority queue class, another collection.
Collection behaviour
There are various common types of behaviour that lists, tuples and other collections share. We will probably want to emulate some of those behaviours in our own collections.
- You can find the length (number of elements) of a collection using the built-in len function.
- You can access (read, write or delete) elements using [] notation. Some collections, such as lists, use integers or slices to select elements. Some, such as dictionaries, can use string or even tuple values as indices.
- You can loop over elements in a collection using a for loop.
- You can check if a collection contains an element using the in operator.
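As a quick illustration of these four behaviours, here is a minimal sketch using a built-in list and dictionary. The variable names and values are made up for this example:

```python
# Illustrative values only; any list or dictionary behaves the same way.
colours = ['red', 'green', 'blue']
ages = {'alice': 30, ('bob', 'smith'): 25}

# Length
count = len(colours)

# Access elements with [] notation (read, write, delete)
first = colours[0]
colours[1] = 'yellow'
del colours[2]                      # colours is now ['red', 'yellow']
bob_age = ages[('bob', 'smith')]    # a tuple used as a dictionary key

# Loop over the elements
seen = []
for c in colours:
    seen.append(c)

# Membership tests
has_red = 'red' in colours
has_blue = 'blue' in colours

print(count, first, bob_age, seen, has_red, has_blue)
```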
We can do all these things with our own custom collections, by defining the relevant magic methods as described below. Of course, not every collection has to implement every behaviour.
We will extend our Matrix class as an example.
Supporting len
We can support the built-in len function by adding a __len__ method to our Matrix class:
class Matrix:

    def __init__(self, a, b, c, d):
        self.data = [a, b, c, d]

    def __str__(self):
        return '[{}, {}][{}, {}]'.format(self.data[0],
                                         self.data[1],
                                         self.data[2],
                                         self.data[3])

    def __len__(self):
        return 4

a = Matrix(1, 2, 3, 4)
print(len(a))
Since our Matrix class always has exactly 4 elements, our __len__ method always returns 4. In most cases, you will need to determine the current size of your collection and return that.
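For a collection whose size can change, __len__ would normally report the current size. The sketch below uses a hypothetical Bag class (not part of the Matrix example) that stores its items in a list:

```python
class Bag:
    """Hypothetical variable-size collection, used only for illustration."""

    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

    def __len__(self):
        # Report the current number of stored items
        return len(self.items)

bag = Bag()
empty_size = len(bag)
bag.add('apple')
bag.add('pear')
print(empty_size, len(bag))   # 0 2
```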
Getting and setting elements
We can get items in our collection using the list-style [] notation. We just need to define __getitem__ like this:
class Matrix:

    def __init__(self, a, b, c, d):
        self.data = [a, b, c, d]

    def __str__(self):
        return '[{}, {}][{}, {}]'.format(self.data[0],
                                         self.data[1],
                                         self.data[2],
                                         self.data[3])

    def __getitem__(self, i):
        return self.data[i]

a = Matrix(1, 2, 3, 4)
print(a[1])
When we print a[1], Python calls the __getitem__ method with i set to 1, which returns the value of element 1 of the matrix data.
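Because __getitem__ passes i straight through to the underlying list, negative indices and slices also work, and an out-of-range index raises IndexError. A minimal sketch, using a cut-down Matrix without __str__ to keep it short:

```python
class Matrix:

    def __init__(self, a, b, c, d):
        self.data = [a, b, c, d]

    def __getitem__(self, i):
        # i may be an int or a slice; the list handles both
        return self.data[i]

a = Matrix(1, 2, 3, 4)
last = a[-1]       # a negative index counts from the end
middle = a[1:3]    # a slice returns a plain list, not a Matrix

try:
    a[4]
    out_of_range = False
except IndexError:
    out_of_range = True

print(last, middle, out_of_range)   # 4 [2, 3] True
```

Note that the slice comes back as an ordinary list. If a Matrix were wanted instead, __getitem__ would need to check for slice objects itself.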
We can set an item by defining the __setitem__ method and adding it to the Matrix class:
    def __setitem__(self, i, value):
        self.data[i] = value
Here is the code to test this:
a[2] = 10
print(a)
This sets element 2 to the value 10 (it does this by calling __setitem__ with i set to 2 and value set to 10). The code prints the new value of a, which is:
[1, 2][10, 4]
Deleting items
Many collections allow you to delete an item using the following syntax:
del a[1]  # Remove element at position 1
Our Matrix class, by definition, always has exactly 4 elements, so it doesn't support delete. If it did we could implement it like this:
    def __delitem__(self, i):
        del self.data[i]
Now if we use del a[1] on a Matrix object a, it will remove element 1 from the self.data list, leaving just 3 elements. However, other methods in the Matrix class rely on there being 4 elements, which is a reasonable assumption because the matrix is supposed to have 4 elements. For example, __str__ accesses element index 3, which will no longer exist after running del. This means that print won't work for a Matrix after running del.
The code above is for illustration only. It shows how you might implement del for other collections, but the Matrix class, by its nature, doesn't support del.
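For a collection that is allowed to shrink, __delitem__ works well. Here is a minimal sketch using a hypothetical Playlist class, invented for this example:

```python
class Playlist:
    """Hypothetical variable-size collection, for illustration only."""

    def __init__(self, *tracks):
        self.tracks = list(tracks)

    def __len__(self):
        return len(self.tracks)

    def __getitem__(self, i):
        return self.tracks[i]

    def __delitem__(self, i):
        # Called by: del playlist[i]
        del self.tracks[i]

p = Playlist('intro', 'verse', 'chorus')
del p[1]
remaining = list(p)
print(len(p), remaining)   # 2 ['intro', 'chorus']
```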
For loops
Our Matrix class automatically supports for loops:
a = Matrix(1, 2, 3, 4)
for x in a:
    print(x)
This works because the Matrix object supports __getitem__. When a class has no __iter__ method, Python implements the for loop by calling __getitem__ with values 0, then 1, then 2 and so on, and keeps looping for as long as a valid result is obtained. As soon as the call raises an IndexError, the loop ends.
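To see this fallback in action, the sketch below uses a hypothetical Countdown class that records every index passed to __getitem__ during iteration:

```python
class Countdown:
    """Hypothetical sequence that records how iteration calls __getitem__."""

    def __init__(self):
        self.data = [3, 2, 1]
        self.calls = []

    def __getitem__(self, i):
        self.calls.append(i)
        return self.data[i]   # raises IndexError when i reaches 3

c = Countdown()
values = [x for x in c]
print(values)    # [3, 2, 1]
print(c.calls)   # [0, 1, 2, 3]
```

The final call, with index 3, raises IndexError. Python treats that as the end of the sequence rather than as an error.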
This works well for simple cases like Matrix. The alternative is to define the __iter__ method, which must return an iterator that accesses the elements. In our case, since our data is held in a list, we can use the built-in iter function to obtain an iterator for the data, and return that. Here is our __iter__ method:
    def __iter__(self):
        return iter(self.data)
Using __iter__ might be slightly more efficient, as it uses the list's built-in iterator. You can also write your own __iter__ if you need special behaviour.
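As one example of special behaviour, __iter__ can be written as a generator. The sketch below uses a hypothetical RowMatrix variant (not the article's Matrix) that yields one row at a time instead of one element at a time:

```python
class RowMatrix:
    """Hypothetical variant of Matrix that iterates row by row."""

    def __init__(self, a, b, c, d):
        self.data = [a, b, c, d]

    def __iter__(self):
        # A generator function returns an iterator automatically
        yield self.data[0:2]
        yield self.data[2:4]

m = RowMatrix(1, 2, 3, 4)
rows = list(m)
print(rows)   # [[1, 2], [3, 4]]
```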
The in operator
Matrix supports the in operator (and the not in operator):
a = Matrix(1, 2, 3, 4)
print(3 in a, 5 in a)
The default implementation works by iterating over the object, effectively using a for loop. It will use __iter__ if it exists, otherwise it will use __getitem__.
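The sketch below shows the default behaviour using a hypothetical Logged class whose __iter__ records each element it hands out. The in operator stops iterating as soon as it finds a match:

```python
class Logged:
    """Hypothetical collection that records which elements 'in' examines."""

    def __init__(self, *items):
        self.data = list(items)
        self.examined = []

    def __iter__(self):
        for item in self.data:
            self.examined.append(item)
            yield item

c = Logged(1, 2, 3, 4)
found = 3 in c
print(found, c.examined)   # True [1, 2, 3]
```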
You can provide your own implementation by defining a __contains__ method that accepts a value and returns True if the value is in the collection, or False otherwise. In our case, we can use the in operator on self.data:
    def __contains__(self, value):
        return value in self.data
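A custom __contains__ is most useful when membership can be decided without looping over every element. As a sketch, the hypothetical EvenNumbers class below represents every even integer, so it has nothing to iterate over and computes the answer directly:

```python
class EvenNumbers:
    """Hypothetical collection representing every even integer."""

    def __contains__(self, value):
        # Membership is computed directly; there is nothing to iterate over
        return isinstance(value, int) and value % 2 == 0

evens = EvenNumbers()
results = (4 in evens, 7 in evens, 7 not in evens)
print(results)   # (True, False, True)
```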
The full code
Here is the complete code for our Matrix collection implementation:
class Matrix:

    def __init__(self, a, b, c, d):
        self.data = [a, b, c, d]

    def __str__(self):
        return '[{}, {}][{}, {}]'.format(self.data[0],
                                         self.data[1],
                                         self.data[2],
                                         self.data[3])

    def __len__(self):
        return 4

    def __getitem__(self, i):
        return self.data[i]

    def __setitem__(self, i, value):
        self.data[i] = value

    def __iter__(self):
        return iter(self.data)

    def __contains__(self, value):
        return value in self.data