Creating random data in numpy

By Martin McBride, 2019-09-16
Tags: arrays random
Categories: numpy


In this section we will look at how to create numpy arrays initialised with random data.

There are various ways to create an array of random numbers in numpy.

If you read the numpy documentation, you will find that most of the random functions have several variants that do more or less the same thing. They differ in minor ways, such as parameter order or whether the value range is inclusive or exclusive. The basic set described below should be enough for most needs, but if you prefer to use the other variants they will give equivalent results.

random.random

This creates an array of random numbers in the range 0.0 up to but not including 1.0. The values can be anything from 0.0 up to the largest float that is less than 1 (e.g. something like 0.99999999...), but never 1.0 itself. In maths we sometimes write this range as [0.0, 1.0). The values are distributed uniformly, so every value is equally likely to occur.

import numpy as np

r = np.random.random((3, 2))
print(r)

This creates a 3 by 2 array of random numbers, like this (of course you will get different numbers):

[[0.40704545 0.47734427]
 [0.76764629 0.37887717]
 [0.82443478 0.36409071]]

If you want to create random numbers over a different range, for example [a, b), you can do it using vectorised operators like this:

r = (b - a)*np.random.random((3, 2)) + a
print(r)
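
As a concrete sketch of the scaling above, the following uses a = 5.0 and b = 10.0 (illustrative values chosen for this example, not taken from the text) and checks that the results stay within the requested bounds:

```python
import numpy as np

# Illustrative bounds, chosen only for this example
a, b = 5.0, 10.0

# Stretch [0.0, 1.0) to a width of (b - a), then shift it up by a
r = (b - a) * np.random.random((3, 2)) + a
print(r)

# Check that no value falls below a or above b
within_bounds = bool(np.all((r >= a) & (r <= b)))
print(within_bounds)
```

In very rare cases floating-point rounding can produce a value exactly equal to b, so it is safest to treat the upper bound as approximate.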

random.randint

The randint function creates an array of integers. In its simplest form it creates values in the range [0, high), that is, integers from 0 up to but not including high:

r = np.random.randint(4, size=(3, 4))
print(r)

Notice that the size is passed in as a named parameter. Unfortunately it isn't the first parameter, as it is in most numpy functions.

This code, with a value of 4, will create values in the range 0 to 3:

[[3 3 0 3]
 [3 1 3 0]
 [2 3 3 1]]

You can also pass in two values, low and high, resulting in numbers in the range [low, high). For example, to simulate a dice roll (output values 1 to 6 inclusive), you would use values 1 and 7:

r = np.random.randint(1, 7, size=10)
print(r)

giving:

[1 3 3 5 4 1 2 1 6 4]
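
One quick way to see that high is excluded is to roll many dice and look at the smallest and largest values. This is only a sketch; the sample size of 1000 and the variable names are arbitrary choices for the example:

```python
import numpy as np

# Roll a six-sided dice 1000 times; 7 is the exclusive upper bound
rolls = np.random.randint(1, 7, size=1000)

# The smallest and largest values seen should lie between 1 and 6
print(rolls.min(), rolls.max())

# Count how often each face appeared
faces, counts = np.unique(rolls, return_counts=True)
print(dict(zip(faces.tolist(), counts.tolist())))
```

With a uniform distribution, each face should appear roughly the same number of times, though the exact counts vary from run to run.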

random.choice

choice picks values at random from a list (in this case the list is all prime numbers less than 20):

r = np.random.choice([2, 3, 5, 7, 11, 13, 17, 19], size=10)
print(r)

giving:

[17 19  7  5 11 11  2  7 11  3]

There are other options (for example you can set different probabilities for each item in the list) but we won't cover that here.
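
For reference, the probabilities option mentioned above is the p parameter of np.random.choice. A minimal sketch, using a shortened list and made-up weights that are assumptions for this example only:

```python
import numpy as np

values = [2, 3, 5, 7]

# One probability per item in values; they must add up to 1
weights = [0.7, 0.1, 0.1, 0.1]

# 2 should turn up far more often than the other values
r = np.random.choice(values, size=10, p=weights)
print(r)
```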

random.standard_normal

This function creates values drawn from the standard Normal distribution. The Normal distribution is the classic bell-shaped curve; the standard version is centred on zero (mean 0) with a standard deviation of 1.

r = np.random.standard_normal((3, 3))
print(r)

Giving:

[[-0.20059509 -1.70950313  0.1355992 ]
 [-0.84462048  1.27934375  1.30837433]
 [-1.34519813 -1.18474318 -0.83397725]]
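
To check that the values really are centred on zero, one option is to draw a much larger sample and look at its mean and standard deviation. This is a sketch, and the sample size of 100000 is an arbitrary choice:

```python
import numpy as np

# A large sample, so the statistics settle near their expected values
r = np.random.standard_normal(100000)

print(r.mean())  # should be close to 0
print(r.std())   # should be close to 1
```

The exact figures change on every run, but they should stay close to 0 and 1 respectively.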
