Iterators

By Martin McBride, 2018-03-17
Tags: iterator lazy evaluation
Categories: functional programming


Programmers often need to deal with lists of values, and most languages provide data structures for this purpose (in Python the type list is usually used).

However, in functional programming we don't usually deal with lists directly. Instead we use an iterator that fetches values from a list or other sequence. This has two advantages:

  • An iterator can read from a list but it can't modify the list. So if you pass an iterator into a function, you can be sure that the function will not change the list. This is important because in functional programming we prefer to use functions that don't have any side effects (and modifying a list passed into the function is a side effect).
  • Iterators support lazy evaluation, which we will explore later in this article.

Iterables

A Python lists, tuples and strings are all types of iterable object. This means that you can iterate over the object, reading the values in order.

Iterators and iterables are slightly different things, as you will see in the next article in this series.

The most common way to iterate over an iterable is in a for loop:

k = [10, 20, 30, 40]

for x in k:
    print(x)

The for loop reads the values from the iterable, one at a time, and executes the loop for each value.

Lazy evaluation

Now we will use a slightly more complex example to illustrate what is meant by lazy evaluation. We will use the built-in map function that accepts a function and an iterable. It return an iterator that applies the function to each item in the input iterable.

def print_all(it):
    for x in it:
        print(x)

def square(x):
    return x*x

range_it = range(10)
map_it = map(square, range_it)
print_all(map_it)

Here we define function square that calculates the square of x.

We call the range function to create a sequence of values 0 ... 9. range returns an iterable that will produce those values when they are requested.

The map function applies square to the values in range_it. The result of this is an iterator map_it. The iterator will produce the sequence 1, 4, 9, 16 ... 81 (the squares of values in the original list).

We then use the previously defined function, print_all to print the values. It will print 1, 4, 9, 16 ... 81.

Although it is simple enough to understand what the code will do, the order of execution is quite interesting:

1. When we call the range function, it simply returns an iterable. It doesn't create the values yet.

2. When we call the map function, it simply returns an iterator. It doesn't do any calculations. The square function doesn't get called at this stage.

3. The print_all function accepts the map_it iterator and loops over its values.

Each time through the for loop in print_all, the following happens:

  • The for loop requests the next value from the map_it iterator.
  • The map_it iterator requests the next value from the range_it iterator.
  • The map_it iterator applies the square function to the value, and returns the result.
  • The print_all function prints the result.
  • Loop back until finished...

What the code has done is to set up a chain of linked iterators from the print_all for loop right back to the range function. This means that each number in the range isn't created until it is needed. The complete list of values is never stored in memory, and the first value is printed almost immediately.

Lazy evaluation and large sequences

To test this, let's make the range much bigger:

range_it = range(10**1000)

10 to the power of 1000 is a truly huge number - a one followed by 1000 zeros. Your computer can't store a list that big (it is larger than the number of atoms in the universe), and it would never be able to fill the list with numbers (it would almost certainlytake longer than the predicted lifetime of the universe, however fast computers might get).

So if Python actually tried to create this range as a list, the program would never get past this step.

But in fact, all the range function does is create an iterable that promises to create all those numbers in the future. And that takes almost no time at all.

Try running the program with this new range. It will start printing values immediately. Just don't expect it to finish any time soon!

See also

If you found this article useful, you might be interested in the book NumPy Recipes or other books by the same author.

Join the PythonInformer Newsletter

Sign up using this form to receive an email when new content is added:

Popular tags

2d arrays abstract data type alignment and angle animation arc array arrays bar chart bar style behavioural pattern bezier curve built-in function callable object chain circle classes clipping close closure cmyk colour combinations comparison operator comprehension context context manager conversion count creational pattern data science data types decorator design pattern device space dictionary drawing duck typing efficiency ellipse else encryption enumerate fill filter font font style for loop formula function function composition function plot functools game development generativepy tutorial generator geometry gif global variable gradient greyscale higher order function hsl html image image processing imagesurface immutable object in operator index inner function input installing iter iterable iterator itertools join l system lambda function latex len lerp line line plot line style linear gradient linspace list list comprehension logical operator lru_cache magic method mandelbrot mandelbrot set map marker style matplotlib monad mutability named parameter numeric python numpy object open operator optimisation optional parameter or pandas partial application path pattern permutations pie chart pil pillow polygon pong positional parameter print product programming paradigms programming techniques pure function python standard library radial gradient range recipes rectangle recursion reduce regular polygon repeat rgb rotation roundrect scaling scatter plot scipy sector segment sequence setup shape singleton slice slicing sound spirograph sprite square str stream string stroke structural pattern subpath symmetric encryption template tex text text metrics tinkerbell fractal transform translation transparency triangle truthy value tuple turtle unpacking user space vectorisation webserver website while loop zip zip_longest