# Advanced vectorisation in numpy

Martin McBride, 2021-02-27

Tags arrays where fancy indexing vectorisation

Categories numpy

Here are some additional vectorisation techniques.

## The where parameter

All universal functions have a `where`

parameter.

The `where`

parameter takes an array-like set of booleans. It applies the function only where the array is true. For example:

a = np.array([-1, 2, -3, 4]) t = a>0 b = np.sqrt(a, where=t)

Here we first create an array `t`

that is true wherever the value in `a`

is greater than 0. That is:

t = [False True False True]

When we calculate the square root, the calculation is only performed where `t`

is true, so we don't attempt to calculate negative square roots. This means that in the output array `b`

, only elements 1 and 3 are assigned values.

b = [?? 1.41421356 ?? 2.]

You can visualise it like this:

The `np.sqrt`

function takes the input array `a`

, and the where array `t`

. It uses `t`

as a filter, so that it only calculates the square root and writes teh result into `b`

for those indices where `t`

is true.

Here is the equivalent code using a loop:

a = np.array([-1, 2, -3, 4]) t = a>0 b = np.empty_like(a) for i in range(a.size): if i[i]: b[i] = sqrt(a[i])

Using the `where`

parameter allows us to perform the same functionality as the loop, but with the speed of a vectorised function.

If you run this example yourself, elements 0 and 2 (shown as ??) are uninitialised. They could be anything, maybe a crazy large floating point number, maybe something that looks sensible like 1. But even if it looks sensible, it is just whatever happened to be in memory, so it is actually meaningless.

We can avoid this randomness by creating an output array that is initialised to a particular value (for example using `full`

), and then use `out`

to update that array.

a = np.array([-1, 2, -3, 4]) b = np.full(4, -1.0) np.sqrt(a, where=a>0, out=b)

This initialised `b`

to -1, so that any elements that are left unchanged by `where`

will remain set to -1:

b = [-1. 1.41421356 -1. 2. ]

## Fancy indexing

Fancy indexing allows you to use one array as an *index* into another array. It can be used to pick values from an array, and also to implement table lookup algorithms.

Here is a simple example:

a = np.array([1, 3, 5, 7, 9]) idx = np.array([1, 1, 2, 3, 3, 4]) b = a[idx]

Here the array `a`

is a set of values for testing.

The array `idx`

contains values that act as indexes into the array `a`

. So value 1 indexes element 1 in `a`

(which has the value 3), etc.

Now look at the final line:

b = a[idx]

Normally `a[n]`

fetches the nth element from `a`

. But because we are passing in an array, `idx`

, the statement fetches an array of elements.

`idx`

has values 1, 1, 2, 3, 3, 4. If we look at `b`

it contains:

b = [3 3 5 7 7 9]

The array has been filled with element 1 from `a`

(which is 3), then element 1 again, then element 2 from `a`

(which is 5), then element 3 from `a`

(which is 7), and so on.

This is called fancy indexing. It selects values from `a`

based on the index values in `idx`

.

Notice that the length of `b`

is the same as the length of `idx`

, rather than `a`

. In fact, `b`

will always be the same shape as `idx`

. For example:

a = np.array([1, 3, 5, 7, 9]) idx = np.array([0, 4, 2, 3]) b = a[idx]

Here, `b`

will be an array of length 6 (the same as `idx`

). The first element of `b`

will be set to element 0 from `a`

. The second element of `b`

will be set to element 4 from `a`

, and so on. The result is:

b = [1 9, 5 7]

Notice that the shape of `b`

is controlled by the shape of `idx`

, as this diagram shows:

What is happening, in effect, is that the array `idx`

is being passed through a lookup table (array `a`

) and then written out to array `b`

. So `b`

will always be the same shape as `idx`

.

Here is the equivalent code using a loop:

a = np.array([1, 3, 5, 7, 9]) idx = np.array([0, 4, 2, 3]) b = np.empty_like(idx) for i in range(idx.size): b[i] = a[idx[i]]

The fancy indexing version is generally far more efficient than the loop.

Visit the PythonInformer Discussion Forum for numeric Python.