# Using zip in a for loop

By Martin McBride, 2022-09-18
Tags: zip zip_longest for loop
Categories: python language intermediate python

We sometimes need to loop over two different sequences at the same time. For example, if we had two lists:

colors = ["red", "green", "blue"]
shapes = ["circle", "square", "triangle"]


and we wanted to print these lists side by side:

red circle
green square
blue triangle


We could do this using a loop counter, but a better way is to use the zip function, as we will see.

## Using a loop counter

We can just loop three times and use the counter to index the lists:

for i in range(len(colors)):
    print(colors[i], shapes[i])


This works, and produces the required result, but is not generally considered to be Pythonic code. It is usually best to avoid loop counters wherever possible.

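The solution above has several problems. The code is more complex than it needs to be. It only works with sequence types (such as lists, tuples or strings) that support len and indexing, so it doesn't work with lazy iterators. Finally, it misbehaves if the sequences have different lengths: if shapes is shorter than colors the loop raises an IndexError, and if shapes is longer its extra elements are silently ignored.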
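
The last two problems are easy to reproduce. The sketch below is only an illustration: the generator `shape_gen` and the shortened list `short_shapes` are names invented here and are not part of the original example.

```python
colors = ["red", "green", "blue"]

# A lazy iterator (here a generator expression) has no length and
# cannot be indexed, so the loop-counter approach cannot use it.
shape_gen = (s for s in ["circle", "square", "triangle"])
try:
    len(shape_gen)
except TypeError as e:
    lazy_error = type(e).__name__
print(lazy_error)  # TypeError

# If the second list is shorter than the first, indexing runs off the end.
short_shapes = ["circle", "square"]
try:
    for i in range(len(colors)):
        print(colors[i], short_shapes[i])
except IndexError as e:
    length_error = type(e).__name__
print(length_error)  # IndexError
```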

Fortunately, there is a simpler alternative: the zip function. This allows us to access 2 or more sequences within a Pythonic for loop.

## A better solution - using zip

The zip function can be used to loop over 2 (or more) sequences at the same time. It is used like this:

for c, s in zip(colors, shapes):
    print(c, s)


In this code, on each pass through the loop the c variable steps through the colors one by one, and the s variable steps through the shapes. So it prints the same output as before:

red circle
green square
blue triangle


But the code is simpler and more declarative. It says exactly what it is doing, with no extra loop counters as distractions.
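
Because zip accepts any iterable, it also copes with the lazy iterators that the loop counter version could not handle. Here is a small sketch, using a hypothetical generator function `shape_source` that was invented for this illustration:

```python
def shape_source():
    # Hypothetical generator that yields the shapes one at a time.
    yield "circle"
    yield "square"
    yield "triangle"

colors = ["red", "green", "blue"]

pairs = []
for c, s in zip(colors, shape_source()):
    print(c, s)
    pairs.append((c, s))
```

This should print the same three lines as the list-based version, even though the generator has no length and cannot be indexed.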

## How zip works

We can see how zip works with the following code:

for t in zip(colors, shapes):
    print(t)


This creates the following output:

('red', 'circle')
('green', 'square')
('blue', 'triangle')


The zip function is named after the behaviour of a clothes zipper. In code, zip means joining two sequences, element by element. So the two lists:

["red",
"green",
"blue"]

["circle",
"square",
"triangle"]


become a sequence of tuples as shown above.
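
One detail worth knowing is that, in Python 3, zip returns a lazy iterator rather than a list. The tuples are only created as the loop asks for them. As a quick sketch, wrapping the call in list() lets us see all of the tuples at once:

```python
colors = ["red", "green", "blue"]
shapes = ["circle", "square", "triangle"]

# zip returns a lazy iterator, so wrap it in list() to see every tuple.
pairs = list(zip(colors, shapes))
print(pairs)
# [('red', 'circle'), ('green', 'square'), ('blue', 'triangle')]
```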

Instead of using a tuple t, we can unpack the tuple into separate variables, c and s:

for t in zip(colors, shapes):
    c, s = t
    print(c, s)


Finally, we can move the unpacking step to be part of the for statement, which gives us the final code:

for c, s in zip(colors, shapes):
    print(c, s)


This diagram illustrates the process:

## Zipping more than 2 sequences

We can zip 3 (or more) sequences. In this example we have introduced a sequence count:

count = [2, 10, 5]
for c, s, n in zip(colors, shapes, count):
    print(c, s, n)


We have added an extra list, count, that contains some numbers. We would now like to print a list of all three sequences, side by side.

Fortunately, the zip function can accept any number of arguments, so we can simply add count as an extra argument. This means that zip will now produce tuples that each contain 3 values - a color, a shape, and a count.

When we unpack these tuples, we must provide 3 variables to receive the values. The number of variables must always equal the number of arguments passed into the zip function.

The output of this code will be:

red circle 2
green square 10
blue triangle 5
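
If the number of loop variables does not match the number of sequences, Python raises a ValueError as soon as it tries to unpack the first tuple. The following sketch deliberately makes that mistake. The exact wording of the error message can vary between Python versions, so it is printed rather than assumed:

```python
colors = ["red", "green", "blue"]
shapes = ["circle", "square", "triangle"]
count = [2, 10, 5]

try:
    # Three sequences are zipped, but only two variables are supplied.
    for c, s in zip(colors, shapes, count):
        print(c, s)
except ValueError as e:
    message = str(e)
    print(message)
```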


## Zipping sequences with unequal lengths

What would happen if we supplied the zip function with a set of sequences that didn't have equal lengths? Let's see:

count = [2, 10, 5, 7]
for c, n in zip(colors, count):
    print(c, n)


In this case, colors has 3 values, as usual, but we have supplied a count list with 4 values. Here is the output:

red 2
green 10
blue 5


zip only outputs 3 values. The output of zip is controlled by the length of the shortest input sequence, which in this case is colors with a length of 3. Only 3 values are created, and the extra element of count is ignored.

## Using zip_longest

If we wanted to include all the elements of count, we could use the zip_longest function:

from itertools import zip_longest

count = [2, 10, 5, 7]
for c, n in zip_longest(colors, count):
    print(c, n)


Notice that zip_longest isn't a built-in function; it is part of the itertools module, so we need to import it before we can use it.

The output of zip_longest is controlled by the length of the longest input sequence, which in this case is count with a length of 4. So 4 values are created. Since colors only contains 3 values, a value of None is used in place of the missing value.

red 2
green 10
blue 5
None 7
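
If None is not a suitable placeholder, zip_longest also accepts a fillvalue keyword argument. In this sketch the string "unknown" is an arbitrary choice made for illustration:

```python
from itertools import zip_longest

colors = ["red", "green", "blue"]
count = [2, 10, 5, 7]

# fillvalue replaces the default None for sequences that run out early.
rows = list(zip_longest(colors, count, fillvalue="unknown"))
for c, n in rows:
    print(c, n)
```

The final line of output should now read unknown 7 rather than None 7.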


## Unzipping

While we are looking at zip, it is worth mentioning that we can also unzip a set of values. Here is how we do it:

zipped = [("red", "circle"), ("green", "square"), ("blue", "triangle")]
colors2, shapes2 = zip(*zipped)
print(colors2, shapes2)


zipped contains our data. To unzip the data, we use the zip function, but we pass *zipped into it. The output is:

('red', 'green', 'blue') ('circle', 'square', 'triangle')


How does this work? Well, the asterisk operator * unpacks a sequence and passes each value into the function as a separate argument. This means that:

zip(*zipped)


is equivalent to passing three separate arguments into zip (see calling functions):

zip(("red", "circle"), ("green", "square"), ("blue", "triangle"))


If we zip these values together, it creates two tuples. The first contains the first value of each of the 3 input arguments (i.e. the colours). The second contains the second value of each of the 3 input arguments (the shapes). This effectively unzips the data.
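
As a quick check, we can zip two lists and then unzip the result to confirm that we get the original data back. Unzipping always produces tuples, so this sketch converts them with list() before comparing:

```python
colors = ["red", "green", "blue"]
shapes = ["circle", "square", "triangle"]

zipped = list(zip(colors, shapes))

# Unzipping gives tuples; convert with list() if lists are needed.
colors2, shapes2 = zip(*zipped)
print(list(colors2) == colors)   # True
print(list(shapes2) == shapes)   # True
```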

## Other useful techniques

List comprehensions are another useful way to process iterables, especially if you need the result in the form of a list.
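
For instance, zip can be used inside a list comprehension. This minimal sketch builds a list of labels from the two lists used earlier. The name `labels` is just chosen for this example:

```python
colors = ["red", "green", "blue"]
shapes = ["circle", "square", "triangle"]

# Combine each color with its matching shape in a single expression.
labels = [f"{c} {s}" for c, s in zip(colors, shapes)]
print(labels)  # ['red circle', 'green square', 'blue triangle']
```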

The itertools module contains several useful variants of zip and similar functions.