Named tuples


Martin McBride, 2020-03-08
Tags tuple named tuple dictionary
Categories python language intermediate python

Tuples are a great way to create ad hoc data structures. Named tuples extend this idea by allowing the values within a tuple to be referred to by name. We will start with a quick recap on tuples before looking at named tuples.

Tuples recap

Tuples are often used to group several related data values. For example, a colour is often represented by 3 values: red, green and blue. We can store this as a tuple like this:

background = (1, 0.5, 0) # (r, g, b)

We can use tuple packing to return a colour value from a function:

def get_background_color():
  # Do something to get r, g, b values (fixed values are used here as a placeholder)
  r, g, b = 1, 0.5, 0

  return r, g, b

color = get_background_color()

There are two ways to access the values in a tuple - unpacking and indexing:

r, g, b = color

green = color[1]

The first line unpacks the three elements of color into the variables r, g, and b. The second line gets the second element of color and stores it in the variable green.

Both of these methods rely on you remembering the order of the elements in the tuple. That is fine for RGB colours because they have a natural order. It becomes more difficult if you have a record holding, for example, employee information. It could include first name, surname, employee number, job title, etc., but there is no obvious way to be sure what order they are stored in, and it would be very easy to make a mistake.

Wouldn't it be great if you could access the fields by name?

Defining a named tuple

A namedtuple gives you the ability to name the individual elements of a tuple. You can use the names when you create the tuple, and also to access the tuple elements.

Here is how we define a namedtuple. Note that we need to import it from collections.

from collections import namedtuple

Color = namedtuple('Color', ['red', 'green', 'blue'])

This creates a new class that implements a specific type of namedtuple. The first parameter of the namedtuple call is the name of the new class - we are going to call it 'Color'. The second parameter is a list of the field names. We are defining 3 fields, called 'red', 'green', and 'blue'.

The namedtuple function doesn't return a named tuple. It actually returns the new class, a subclass of tuple, which we then call to create named tuples of class Color. We assign this class to a variable Color. It is common to use the same name for the class and the variable, but you don't have to.
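As a minimal sketch of this point, the variable name and the class name are independent of each other. The variable name Colour below is chosen purely for illustration:

```python
from collections import namedtuple

# The variable name (Colour) differs from the class name ('Color')
Colour = namedtuple('Color', ['red', 'green', 'blue'])

print(Colour.__name__)            # Color
print(issubclass(Colour, tuple))  # True

orange = Colour(1, 0.5, 0)
print(orange)                     # Color(red=1, green=0.5, blue=0)
```

Using the same name for both is usually less confusing, because the name shown when a value is printed then matches the name used in the code.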

You can define the fields of a named tuple using a string instead of a list, for example:

Color = namedtuple('Color', 'red, green, blue')

This has exactly the same effect as the list in the previous example. The identifiers in the string can be separated by whitespace, or commas, or both.
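The following sketch compares the different ways of writing the field names. The names ColorA to ColorD are only there to keep the four definitions apart, and the _fields attribute used for the comparison is described later in this article:

```python
from collections import namedtuple

ColorA = namedtuple('Color', ['red', 'green', 'blue'])  # List of names
ColorB = namedtuple('Color', 'red, green, blue')        # Commas and spaces
ColorC = namedtuple('Color', 'red green blue')          # Spaces only
ColorD = namedtuple('Color', 'red,green,blue')          # Commas only

# All four definitions have the same fields, in the same order
print(ColorA._fields == ColorB._fields == ColorC._fields == ColorD._fields)  # True
```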

Creating and using named tuples

Having defined our named tuple, we can create instances of it by calling Color, like this:

color1 = Color(red=1, green=0.5, blue=0)
print(color1)
color2 = Color(blue=1, red=1, green=0)
print(color2)

Which prints:

Color(red=1, green=0.5, blue=0)
Color(red=1, green=0, blue=1)

Unlike a normal tuple, this constructor can use named arguments. We can supply the 'red', 'green' and 'blue' values in any order, so we don't need to worry about what order they are stored in within the tuple itself.

We can also access the elements by name, like this:

print(color1.red) # Prints 1

Again, we can access the elements of the named tuple without needing to remember the order they are stored in.
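To tie this back to the employee record mentioned earlier, here is a small sketch. The Employee type, its field names and the sample values are all made up for illustration:

```python
from collections import namedtuple

# Hypothetical employee record, field names chosen for illustration
Employee = namedtuple('Employee', ['first_name', 'surname', 'number', 'job_title'])

emp = Employee(first_name='Ada', surname='Lovelace', number=1234, job_title='Engineer')

# No need to remember which position holds which value
print(emp.surname)    # Lovelace
print(emp.job_title)  # Engineer
```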

Field names

There are some restrictions on the names of fields. The first restriction is that all field names must be valid Python identifiers - that is to say, they must be names that would be valid as variable names, i.e.:

  • They must be a combination of letters a-z, letters A-Z, digits 0-9 and underscore characters.
  • They must not start with a digit.
  • They must not be Python keywords (if, for, def etc).

The second restriction is that the names must not start with an underscore. This is because names that start with underscores are reserved for special named tuple utility functions, see below.
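A quick way to see these rules in action is to try some invalid names. In this sketch the type name Example and the list of bad names are invented for illustration. Each invalid name causes namedtuple to raise a ValueError:

```python
from collections import namedtuple

# Each of these field names breaks one of the rules above
bad_names = ['class', '2nd', 'my-field', '_hidden']

results = {}
for name in bad_names:
    try:
        namedtuple('Example', ['ok', name])
        results[name] = 'accepted'
    except ValueError as err:
        results[name] = 'rejected'
        print(f'{name!r} rejected: {err}')
```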

Named tuples have all the features of regular tuples

Named tuples are a special type of tuple, with extra features. But you can still use them like normal tuples as well if you prefer. You can use positional or named parameters to create the tuple. The positional order is red, green, blue as per the original definition of Color.

color2 = Color(blue=1, red=1, green=0)
print(color2)
color3 = Color(0, 0, 1)                 # Positional, must be in order r, g, b
print(color3)
color4 = Color(0.5, blue=1, green=0.5)  # Positional red, then named blue, green
print(color4)

Which prints:

Color(red=1, green=0, blue=1)
Color(red=0, green=0, blue=1)
Color(red=0.5, green=0.5, blue=1)

You can access elements using names or indices:

print(color2.red)  # Named red field
print(color2[0])   # Indexed field 0, which is red because order is r, g, b

You can also use unpacking. Again, the order is r, g, b due to the definition of Color:

r, g, b = color4
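The sketch below checks a few more tuple behaviours on a named tuple. It is only an illustration, but it shows that a named tuple really is a tuple, including being immutable:

```python
from collections import namedtuple

Color = namedtuple('Color', ['red', 'green', 'blue'])
color = Color(0.5, 0.5, 1)

print(isinstance(color, tuple))  # True
print(len(color))                # 3
print(color == (0.5, 0.5, 1))    # True, compares equal to a plain tuple
print(max(color))                # 1

# Like any tuple, a named tuple is immutable
try:
    color.red = 0
except AttributeError:
    print('Cannot assign to a field')
```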

Utility methods

Named tuples come with a few additional methods that can be quite useful. The method names all start with underscores - that is why field names aren't allowed to start with underscores. Doing this avoids any of the built-in methods clashing with field names you might want to use.

_make(iter) will create a new named tuple instance from a sequence or any other iterable. For example:

k = [1, 0, 0.5]
color5 = Color._make(k)
print(color5)           # Prints Color(red=1, green=0, blue=0.5)

Of course you could alternatively use:

k = [1, 0, 0.5]
color5 = Color(*k)
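One place where _make can be convenient is when converting a whole collection of rows. In this sketch the rows list is made-up data, standing in for values that might come from a file or a database:

```python
from collections import namedtuple

Color = namedtuple('Color', ['red', 'green', 'blue'])

# Hypothetical rows, as might be read from a file or database
rows = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]

palette = [Color._make(row) for row in rows]

# map works too, because Color._make can be passed around like any function
palette2 = list(map(Color._make, rows))

print(palette[1])           # Color(red=0, green=1, blue=0)
print(palette == palette2)  # True
```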

_asdict() creates a dict object with the fields and values:

color = Color(1, 0, 0.5)
d = color._asdict()
print(d)

giving (on Python 3.8 and later):

{'red': 1, 'green': 0, 'blue': 0.5}

This is useful if you want to iterate over the keys.

Note that Python versions 3.1 to 3.7 return an OrderedDict, a special version of dict that preserves the order of the entries (so the entries will always be in the order defined for the named tuple: red, green, blue in this case). Ordinary dict objects have been guaranteed to preserve insertion order since Python 3.7, so as of Python 3.8 _asdict() returns an ordinary dict.
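Here is a short sketch of iterating over the result of _asdict(). Converting to JSON is included as one possible use, not something the named tuple requires:

```python
import json
from collections import namedtuple

Color = namedtuple('Color', ['red', 'green', 'blue'])
color = Color(1, 0, 0.5)

# Loop over the field names and values together
for name, value in color._asdict().items():
    print(name, value)

# A dict is also easy to convert to JSON
text = json.dumps(color._asdict())
print(text)  # {"red": 1, "green": 0, "blue": 0.5}
```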

_fields is a tuple containing the fields of the named tuple (but not the values):

color = Color(1, 0, 0.5)
t = color._fields
print(t)

Notice that _fields is not a function, it is just a data member of the named tuple. The code above gives:

('red', 'green', 'blue')
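One use for _fields is building a new named tuple type from an existing one. In this sketch, ColorAlpha is a hypothetical type that adds an alpha field to Color:

```python
from collections import namedtuple

Color = namedtuple('Color', ['red', 'green', 'blue'])

# ColorAlpha is a hypothetical type that adds a fourth field
ColorAlpha = namedtuple('ColorAlpha', Color._fields + ('alpha',))

c = ColorAlpha(1, 0, 0.5, alpha=0.8)
print(ColorAlpha._fields)  # ('red', 'green', 'blue', 'alpha')
print(c)                   # ColorAlpha(red=1, green=0, blue=0.5, alpha=0.8)
```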

_replace() creates a new named tuple, of the same type as the original, but with one or more of its values changed. For example:

color = Color(1, 0, 0.5)
color2 = color._replace(red=0.3, blue=0.6)
print(color2)

This creates a new named tuple, replacing the red and blue fields with new values but leaving the green unchanged. So color2 becomes:

Color(red=0.3, green=0, blue=0.6)
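It is worth stressing that _replace never changes the original, because tuples are immutable. A small sketch, where the variable name darker is just for illustration:

```python
from collections import namedtuple

Color = namedtuple('Color', ['red', 'green', 'blue'])

color = Color(1, 0, 0.5)
darker = color._replace(red=0.3)

print(color)            # Color(red=1, green=0, blue=0.5), unchanged
print(darker)           # Color(red=0.3, green=0, blue=0.5)
print(color is darker)  # False, darker is a separate object
```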

Under the hood

When you define a named tuple, Python actually creates a new class. The new class is created dynamically from the supplied field names. So our Color named tuple class has data members called red, green and blue. If we were to create a named tuple to hold an (x, y) screen coordinate, it would have data members named x and y.
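The screen coordinate case might look like the sketch below. The name Point and the sample values are assumptions made for this example:

```python
from collections import namedtuple

# Hypothetical screen coordinate type
Point = namedtuple('Point', ['x', 'y'])

p = Point(x=100, y=50)
print(type(p).__name__)          # Point
print(issubclass(Point, tuple))  # True
print(p.x, p.y)                  # 100 50
print(hasattr(Point, 'x'))       # True, the field is defined on the class
```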

This makes named tuples very efficient, compared to, say, a dictionary. We could use dictionaries to represent a colour:

color = dict(red=1, green=0.5, blue=0)
print(color['blue'])

The main problem here is that every colour dictionary we define has to keep its own copies of the strings 'red', 'green' and 'blue'. Python might optimise this to some extent by reusing constant strings, but a dictionary object is still larger than a tuple.

A named tuple instance occupies more or less the same amount of memory as a regular tuple.
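You can get a rough feel for the difference using sys.getsizeof. This is only a sketch: it assumes CPython, it measures just the container and not the values it refers to, and the exact numbers vary between Python versions and platforms, so no figures are quoted here:

```python
import sys
from collections import namedtuple

Color = namedtuple('Color', ['red', 'green', 'blue'])

as_tuple = (1, 0.5, 0)
as_named = Color(1, 0.5, 0)
as_dict = dict(red=1, green=0.5, blue=0)

# Sizes are in bytes and cover the container only
print('tuple:      ', sys.getsizeof(as_tuple))
print('named tuple:', sys.getsizeof(as_named))
print('dict:       ', sys.getsizeof(as_dict))
```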

Summary

Named tuples provide the convenience of using tuples as an ad hoc way of grouping related data items to create a record, but they have the added advantage of allowing you to create and read the individual fields by name as well as index.



Copyright (c) Axlesoft Ltd 2020