Named tuples

By Martin McBride, 2020-03-08
Tags: tuple named tuple dictionary
Categories: python language intermediate python


Tuples are a great way to create ad hoc data structures. Named tuples extend this idea by allowing the values within a tuple to be referred to by name. We will start with a quick recap on tuples before looking at named tuples.

Tuples recap

Tuples are often used to group several related data values. For example, a colour is often represented by 3 values: red, green and blue. We can store this as a tuple like this:

background = (1, 0.5, 0) # (r, g, b)

We can use tuple packing to return a colour value from a function:

def get_background_color():
    # Do something to get r, g, b values.
    # Fixed placeholder values are used here so the example runs.
    r, g, b = 1, 0.5, 0
    return r, g, b

color = get_background_color()

There are two ways to access the values in a tuple - unpacking and indexing:

r, g, b = color

green = color[1]

The first line unpacks the three elements of color into the variables r, g, and b. The second line gets the second element of color and stores it in the variable green.

Both these methods rely on you remembering the order of the elements in the tuple. That is fine for RGB colours because they have a natural order. It becomes more difficult if you have a record holding, for example, employee information. It could include first name, surname, employee number, job title, etc, but there is no obvious way to be sure what order they are stored in, and it would be very easy to make a mistake.
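As a small sketch of how easily this goes wrong, here is a hypothetical employee record stored as a plain tuple. The values and the field order are assumptions made up for this illustration:

```python
# Hypothetical employee record stored as a plain tuple.
# Assumed order: first name, surname, employee number, job title.
employee = ('Jane', 'Smith', 1234, 'Engineer')

# Indexing only works if you remember the order exactly.
surname = employee[1]
print(surname)  # Prints Smith

# Unpacking in the wrong order raises no error, the values
# simply end up in the wrong variables.
surname_wrong, first_name_wrong, number, title = employee
print(surname_wrong)  # Prints Jane, which is really the first name
```

Python has no way to know that the first two names were swapped, so the mistake only shows up later when the data is used.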

Wouldn't it be great if you could access the fields by name?

Defining a named tuple

A namedtuple gives you the ability to name the individual elements of a tuple. You can use the names when you define the tuple, and also to access the tuple elements.

Here is how we define a namedtuple. Note that we need to import it from collections.

from collections import namedtuple

Color = namedtuple('Color', ['red', 'green', 'blue'])

This creates a new class that implements a specific type of namedtuple. The first parameter of the namedtuple call is the name of the new class - we are going to call it 'Color'. The second parameter is a list of the field names. We are defining 3 fields, called 'red', 'green', and 'blue'.

The namedtuple function doesn't return a named tuple instance. It returns the new class itself, and we call that class to create new named tuples of class Color. We assign the class to a variable Color. It is common to use the same name for the class and the variable, but you don't have to.
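A quick check, as a sketch, of what the namedtuple call hands back. The object assigned to Color is a class derived from tuple, and its name comes from the first argument rather than from the variable. RGB is a hypothetical variable name used only here:

```python
from collections import namedtuple

Color = namedtuple('Color', ['red', 'green', 'blue'])

print(Color.__name__)            # Prints Color
print(issubclass(Color, tuple))  # Prints True

# The variable name does not have to match the class name.
RGB = namedtuple('Color', ['red', 'green', 'blue'])
print(RGB.__name__)              # Prints Color
```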

You can define the fields of a named tuple using a string instead of a list, for example:

Color = namedtuple('Color', 'red, green, blue')

This has exactly the same effect as the list in the previous example. The identifiers in the string can be separated by whitespace, or commas, or both.
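To confirm that the different spellings are equivalent, this short sketch compares the field names each one produces. It uses the _fields attribute, which is described later in this article. The variable names are just for illustration:

```python
from collections import namedtuple

from_list = namedtuple('Color', ['red', 'green', 'blue'])
from_commas = namedtuple('Color', 'red, green, blue')
from_spaces = namedtuple('Color', 'red green blue')

# All three definitions produce the same fields in the same order.
print(from_list._fields == from_commas._fields == from_spaces._fields)  # Prints True
```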

Creating and using named tuples

Having defined our named tuple, we can create instances of it by calling Color, like this:

color1 = Color(red=1, green=0.5, blue=0)
print(color1)
color2 = Color(blue=1, red=1, green=0)
print(color2)

Which prints:

Color(red=1, green=0.5, blue=0)
Color(red=1, green=0, blue=1)

Unlike a normal tuple, this constructor accepts named arguments. We can define the 'red', 'green' and 'blue' values in any order, so we don't need to worry about what order they are stored in within the tuple itself.

We can also access the elements by name, like this:

print(color1.red) # Prints 1

Again, we can access the elements of the named tuple without needing to remember the order they are stored in.
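The same idea applies to the employee record mentioned earlier. This is only a sketch: the Employee type, its field names and the values are assumptions chosen for illustration:

```python
from collections import namedtuple

# Hypothetical Employee type, the field names are assumptions.
Employee = namedtuple('Employee',
                      ['first_name', 'surname', 'number', 'job_title'])

# Named arguments can be given in any order.
emp = Employee(surname='Smith', first_name='Jane',
               job_title='Engineer', number=1234)

print(emp.surname)    # Prints Smith
print(emp.job_title)  # Prints Engineer
```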

Field names

There are some restrictions on the names of fields. The first restriction is that all field names must be valid Python identifiers - that is to say, they must be names that would be valid as variable names, i.e.:

  • They must be a combination of letters a-z, letters A-Z, digits 0-9 and underscore characters.
  • They must not start with a digit.
  • They must not be Python keywords (if, for, def etc).

The second restriction is that the names must not start with an underscore. This is because names that start with underscores are reserved for special named tuple utility functions, see below.
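If you break one of these rules, namedtuple raises a ValueError when the type is defined. The following sketch tries one bad name for each rule. The type name 'Example' and the field names are made up for illustration:

```python
from collections import namedtuple

# Each of these names breaks one rule: starts with a digit,
# is a keyword, or starts with an underscore.
bad_names = ['2nd', 'for', '_hidden']

rejected = []
for name in bad_names:
    try:
        namedtuple('Example', ['ok', name])
    except ValueError as err:
        rejected.append(name)
        print(name, '->', err)
```

All three names are rejected, so the error appears as soon as the named tuple is defined rather than later when it is used.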

Named tuples have all the features of regular tuples

Named tuples are a special type of tuple, with extra features. But you can still use them like normal tuples as well if you prefer. You can use positional or named parameters to create the tuple. The positional order is red, green, blue as per the original definition of Color.

color2 = Color(blue=1, red=1, green=0)
print(color2)
color3 = Color(0, 0, 1)                 # Positional, must be in order r, g, b
print(color3)
color4 = Color(0.5, blue=1, green=0.5)  # Positional red, then named blue, green
print(color4)

Which prints:

Color(red=1, green=0, blue=1)
Color(red=0, green=0, blue=1)
Color(red=0.5, green=0.5, blue=1)

You can access elements using names or indices:

print(color2.red)  # Named red field
print(color2[0])   # Indexed field 0, which is red because order is r, g, b

You can also use unpacking. Again, the order is r, g, b due to the definition of Color:

r, g, b = color4
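Other tuple operations work too. As a brief sketch, a named tuple passes an isinstance check against tuple, has a length, compares equal to a plain tuple with the same values, and can be iterated over:

```python
from collections import namedtuple

Color = namedtuple('Color', ['red', 'green', 'blue'])
color = Color(0.5, 0.5, 1)

print(isinstance(color, tuple))  # Prints True
print(len(color))                # Prints 3
print(color == (0.5, 0.5, 1))    # Prints True

# Iteration visits the values in field order: red, green, blue
values = [value for value in color]
print(values)                    # Prints [0.5, 0.5, 1]
```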

Utility methods

Named tuples come with a few additional methods that can be quite useful. The method names all start with underscores - that is why field names aren't allowed to start with underscores. Doing this avoids any of the built-in methods clashing with field names you might want to use.

_make(iter) will create a new named tuple instance from a sequence or any other iterable. For example:

k = [1, 0, 0.5]
color5 = Color._make(k)
print(color5)           # Color(red=1, green=0, blue=0.5)

Of course you could alternatively use:

k = [1, 0, 0.5]
color5 = Color(*k)
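One place where _make can be convenient is when converting a whole collection of rows, for example data read from a file. This is a sketch only, and the rows below are made-up values:

```python
from collections import namedtuple

Color = namedtuple('Color', ['red', 'green', 'blue'])

# Hypothetical rows of r, g, b values, for example read from a file
rows = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

# Convert every row to a Color
palette = [Color._make(row) for row in rows]
print(palette[1])        # Prints Color(red=0, green=1, blue=0)
print(palette[2].blue)   # Prints 1
```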

_asdict() creates a dict object with the fields and values:

color = Color(1, 0, 0.5)
d = color._asdict()
print(d)

giving:

{'red': 1, 'green': 0, 'blue': 0.5}

This is useful if you want to iterate over the keys.

Note that Python versions 3.1 to 3.7 return an OrderedDict, a special version of dict that preserves the order of the entries (so the entries will always be in the order defined for the named tuple: red, green, blue in this case). Ordinary dict objects have preserved insertion order since Python 3.7, and as of Python 3.8 _asdict() returns an ordinary dict.
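As a small sketch of iterating over the result of _asdict(), the loop below visits each field name and value in the order the fields were defined:

```python
from collections import namedtuple

Color = namedtuple('Color', ['red', 'green', 'blue'])
color = Color(1, 0, 0.5)

# Loop over the field names and values together
for name, value in color._asdict().items():
    print(name, value)

names = list(color._asdict())
print(names)   # Prints ['red', 'green', 'blue']
```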

_fields is a tuple containing the fields of the named tuple (but not the values):

color = Color(1, 0, 0.5)
t = color._fields
print(t)

Notice that _fields is not a function, it is just a data member of the named tuple. The code above gives:

('red', 'green', 'blue')
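Because _fields is an ordinary tuple of strings, one possible use is to build a new named tuple type from an existing one. In this sketch, ColorAlpha and its extra alpha field are hypothetical names chosen for illustration:

```python
from collections import namedtuple

Color = namedtuple('Color', ['red', 'green', 'blue'])

# Reuse the existing field names and add one more
ColorAlpha = namedtuple('ColorAlpha', Color._fields + ('alpha',))
print(ColorAlpha._fields)  # Prints ('red', 'green', 'blue', 'alpha')

opaque = ColorAlpha(1, 0, 0.5, alpha=1)
print(opaque)  # Prints ColorAlpha(red=1, green=0, blue=0.5, alpha=1)
```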

_replace() creates a new named tuple, of the same type as the original, but with one or more of its values changed. For example:

color = Color(1, 0, 0.5)
color2 = color._replace(red=0.3, blue=0.6)
print(color2)

This creates a new named tuple, replacing the red and blue fields with new values but leaving the green unchanged. So color2 becomes:

Color(red=0.3, green=0, blue=0.6)
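Since _replace() returns a new named tuple, the original is left untouched. A short sketch to confirm this:

```python
from collections import namedtuple

Color = namedtuple('Color', ['red', 'green', 'blue'])

color = Color(1, 0, 0.5)
color2 = color._replace(red=0.3, blue=0.6)

print(color)             # Prints Color(red=1, green=0, blue=0.5)
print(color2)            # Prints Color(red=0.3, green=0, blue=0.6)
print(color2 is color)   # Prints False
```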

Under the hood

When you define a named tuple, Python actually creates a new class. The new class is created dynamically from the supplied field names. So our Color named tuple class has data members called red, green and blue. If we were to create a named tuple to hold an (x, y) screen coordinate, it would have data members named x and y.

This makes named tuples very efficient, compared to, say, a dictionary. We could use dictionaries to represent a colour:

color = dict(red=1, green=0.5, blue=0)
print(color['blue'])

The main problem here is that every colour dictionary we define has to keep its own copies of the strings 'red', 'green' and 'blue'. Python might optimise this to some extent by reusing constant strings, but a dictionary object is still larger than a tuple.

A named tuple instance occupies more or less the same amount of memory as a regular tuple.
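You can get a rough idea of the difference using sys.getsizeof. This is only a sketch: getsizeof reports the size of the container object itself, not the values it refers to, and the exact numbers vary between Python versions and platforms:

```python
import sys
from collections import namedtuple

Color = namedtuple('Color', ['red', 'green', 'blue'])

plain = (1, 0.5, 0)
named = Color(1, 0.5, 0)
as_dict = dict(red=1, green=0.5, blue=0)

# Sizes are in bytes and depend on the Python version and platform
print('tuple:      ', sys.getsizeof(plain))
print('named tuple:', sys.getsizeof(named))
print('dict:       ', sys.getsizeof(as_dict))
```

On a typical CPython installation the dictionary should come out noticeably larger than either tuple.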

Summary

Named tuples provide the convenience of using tuples as an ad hoc way of grouping related data items to create a record, but they have the added advantage of allowing you to create and read the individual fields by name as well as index.
