Optimisation good practice

By Martin McBride, 2022-03-05


When optimising Python code, it is generally best to concentrate your efforts on identifying the true bottlenecks in your code and speeding those areas up.

But there is one exception. Certain coding techniques fall under the category of good practice and might also speed up your code. You should do these things anyway, and if they speed up your code that is a bonus.

Here is a list of some of these techniques, in no particular order. It is as comprehensive as I can make it, but if I have missed anything please comment and I will endeavour to update the article.

1 Use built-in functions/methods where they exist

For example, if you want to find the sum of all the elements of a list of integers, use the built-in sum function. Don't write your own function to do the same thing or use a loop in your code. That just adds code bloat, and anyone reading your code will be left wondering if there was some reason you decided to roll your own version. Python has many built-in modules and functions, and it is worth knowing at least the most commonly used ones.

Plus, as a bonus, the built-in function is very likely to be faster.
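As a minimal sketch of this point, the two approaches below give the same result. The list name values and its contents are made up for illustration.

```python
# Hypothetical example data, not from the article
values = [3, 1, 4, 1, 5, 9]

# Hand-rolled loop: works, but is more code to read and maintain
total_loop = 0
for v in values:
    total_loop += v

# Built-in: one line, and the intent is obvious
total_builtin = sum(values)

print(total_loop, total_builtin)
```

The same reasoning applies to other built-ins such as min, max, sorted, any and all.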

2 Use tuple packing notation for swapping two variables

If you want to swap the values of two variables a and b, in most languages you will have to store one of the values in a temporary variable, like this:

temp = b
b = a
a = temp

That works in Python, but this code does exactly the same thing:

a, b = b, a

Much clearer, and usually a little faster.

3 Prefer tuples for immutable data

On a related point, if you are using a sequence of values that you never intend to change, you should generally declare it as a tuple rather than a list. This is mainly because it makes your intent clear, and also avoids the data being accidentally modified later on. But tuples are usually slightly faster too.
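A short sketch of how a tuple protects data from accidental changes. The name WEEKEND is a hypothetical constant chosen for the example.

```python
# Hypothetical constant data: these values never change, so use a tuple
WEEKEND = ("Saturday", "Sunday")

# Reading works exactly as it does for a list
first_day = WEEKEND[0]

# Accidental modification is caught immediately
try:
    WEEKEND[0] = "Friday"
    modified = True
except TypeError:
    # Tuples do not support item assignment
    modified = False

print(first_day, modified)
```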

4 Use list comprehensions rather than loops to construct lists

If you need to create a list from an existing sequence, code like this is best avoided:

a = [1, 2, 3]
b = []
for x in a:
    b.append(x*2)

It works, but a list comprehension is simpler and faster:

a = [1, 2, 3]
b = [x*2 for x in a]

5 Avoid using + for string concatenation

If you wish to concatenate several strings, it is generally better to avoid this sort of code:

x = "abc"
y = "def"
z = "ghi"
s = x + ", " + y + ", " + z

In this situation, Python will add the strings, one by one, creating a new intermediate string for every new string added. This is obviously quite inefficient, although in most situations it won't cause any actual performance problems. But it also looks pretty ugly, so it should be avoided for that reason alone.

A better way is to use the join method:

s = ", ".join((x, y, z))

Remember that you can also build strings using f-strings (or the older format method if you are using a version of Python that predates f-strings).
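Here is a small sketch of both approaches. The names parts, name and count are assumptions made for the example. join is particularly handy when the number of strings is not known in advance.

```python
# Hypothetical list of strings; its length need not be known in advance
parts = ["abc", "def", "ghi"]

# join builds the result in one step, placing the separator between items
joined = ", ".join(parts)

# An f-string is often clearer when mixing fixed text with a few values
name = "widgets"
count = 3
message = f"Found {count} {name}: {joined}"

print(joined)
print(message)
```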

6 Use truthy values

Remember that in Python, certain objects count as true or false when used in an if statement or while loop. These are:

  • Numerical values: zero counts as false, non-zero counts as true.
  • A list, tuple, string, dict, or set: empty (i.e. zero-length) counts as false, non-empty counts as true.
  • None counts as false.

Values that count as true are called truthy, and values that count as false are called falsy. This can be used to shorten a logical test, for example:

# Number: don't do this
if n != 0:
    ...

# Do this instead
if n:
    ...

# List: don't do this
if len(k) != 0:
    ...

# Do this instead
if k:
    ...

This might seem odd if you haven't met it before, but it is what most experienced Python programmers would use and expect. It is also a little faster because no comparison is executed.

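Note that the two cases don't always behave in the same way. For example, if n happened to have the value None, then n != 0 would be true (None is not equal to zero), so the body of if n != 0 would run. But if n treats None as false, so the body would be skipped. Similarly, if k were None, len(k) would raise a TypeError, whereas if k would simply be false.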
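The truthiness rules, and the None case in particular, can be checked with a short sketch. The variable names are illustrative only.

```python
# Values that count as false in an if statement or while loop
falsy_values = [0, 0.0, "", [], (), {}, set(), None]
all_falsy = not any(falsy_values)

# Values that count as true (note that a list containing 0 is non-empty)
truthy_values = [1, -0.5, "a", [0], (None,), {"k": 1}]
all_truthy = all(truthy_values)

# None is not equal to zero, but it is still falsy,
# so the explicit test and the truthiness test disagree
n = None
explicit_test = (n != 0)
truthiness_test = bool(n)

print(all_falsy, all_truthy, explicit_test, truthiness_test)
```

If None is a possible value and needs separate handling, testing for it explicitly with if n is None is the safest option.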

7 Use chained comparisons

If you need to compare a value against an upper and lower bound, you can (and should) use operator chaining:

# Don't do this:
if a > 10 and a < 100:
    ...

# Do this:
if 10 < a < 100:
    ...

This is more readable and usually a bit faster.

In particular, if the comparison involves a function call, it can be significantly faster:

# f(x) is called twice (unless the first test fails):
if f(x) > 10 and f(x) < 100:
    ...

# f(x) is only called once:
if 10 < f(x) < 100:
    ...
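To see the difference in the number of calls, here is a minimal sketch. The function f and the calls list are hypothetical, introduced only to count how often f runs.

```python
# Hypothetical function that records each call in a list
calls = []

def f(x):
    calls.append(x)
    return x * 2

x = 20  # f(x) is 40, which lies between 10 and 100

# Separate comparisons: f runs once for each test
calls.clear()
result_separate = f(x) > 10 and f(x) < 100
calls_separate = len(calls)

# Chained comparison: the middle expression is evaluated only once
calls.clear()
result_chained = 10 < f(x) < 100
calls_chained = len(calls)

print(result_separate, calls_separate)
print(result_chained, calls_chained)
```

Both tests give the same answer here, but the chained form calls f once rather than twice.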

8 Use the in operator to test membership

If you want to check if a particular value is present in a list, tuple, or set, you should use the in operator:

k = [1, 2, 3]
if 2 in k:
    ...

There are other ways to do it. For example, the count method is useful if you want to know how many times a value appears in the sequence (it will be zero if the value isn't there at all). The index method will tell you the position of the first occurrence of the value in the sequence (it raises a ValueError if it isn't there). But using the built-in in operator is more direct and makes the intent of the code clearer. And it is likely to be a bit faster.
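The three approaches answer slightly different questions, as this sketch shows. The data is made up for illustration.

```python
# Hypothetical data for illustration
k = [1, 2, 3, 2]

# Membership test: direct and readable
has_two = 2 in k
has_nine = 9 in k

# count answers a different question: how many times does the value occur?
twos = k.count(2)

# index gives the first position, and raises ValueError if the value is absent
first_two = k.index(2)
try:
    k.index(9)
    nine_found = True
except ValueError:
    nine_found = False

print(has_two, has_nine, twos, first_two, nine_found)
```

If you need to test membership many times against a large collection, it is usually worth storing the values in a set, since set lookups do not need to scan every element.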

9 Avoid global variables

A global variable is a variable that is declared at the top level so that it can be accessed by any part of the program.

While it can be very convenient to be able to access shared data from anywhere, it usually causes more problems than it solves, mainly because it allows any part of the code to introduce unexpected side effects. So globals are generally to be avoided. But if you need an extra reason to not use them, they are also slower to access from inside a function than local variables are.
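One common way to avoid a global is to pass the value in as a parameter. This is only a sketch: the names scale, scaled_with_global and scaled_with_parameter are hypothetical.

```python
# Version using a global: any code can change scale, so the result of
# scaled_with_global depends on hidden state
scale = 3

def scaled_with_global(value):
    return value * scale

# Version using a parameter: everything the function depends on is
# visible in its signature
def scaled_with_parameter(value, scale):
    return value * scale

result_a = scaled_with_global(10)
result_b = scaled_with_parameter(10, 3)

print(result_a, result_b)
```

The second version is easier to test and reason about, because its result depends only on its arguments.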

10 Use the latest release of Python

New versions of Python are released quite frequently (at the time of writing Python 3.9 has been updated 8 times in the last year). It is worth keeping up to date as new versions often have bug fixes and security fixes, but they sometimes have performance improvements too.

11 Use enumerate if you need a loop index

If for some reason you really need a loop index, you should use the enumerate function:

# Don't do this
k = [10, 11, 12]
for i in range(len(k)):
    print(i, k[i])

# Do this
for i, x in enumerate(k):
    print(i, x)

This is clearer, and usually a little faster because it avoids indexing the list on every iteration.

Summary

There are few absolute rules in programming, but these tips will usually make your code more readable and easier to maintain. In some cases they might speed your code up too.
