Accelerating Python code with Numba

1 minute read

Published: February 12, 2023

Numba is an open-source numerical Python compiler that translates a subset of Python and NumPy code into machine code, executed on the CPU. It can be used to significantly speed up the execution of numerical computations, especially those that are performed in a loop, by compiling the code into machine code rather than interpreting it dynamically.

Numba provides a just-in-time (JIT) compiler, which means that it compiles the code on-the-fly, just before it is executed. This allows for the optimizations to be applied exactly where they are needed, rather than in advance. Numba supports a wide range of Python data types and functions, and it provides a simple, easy-to-use API for adding JIT compilation to existing code.

Numba allows you to write high-performance Python code by leveraging the power of JIT compilation. It can be used for a wide range of applications, from scientific computing and data analysis to machine learning.

Example

Here is a simple example of how Numba can be used to speed up a calculation:

Consider the following function:

	def go_fast_nonumba(a):
	trace = 0.0
	for i in range(a.shape[0]):
	trace += np.tanh(a[i, i])
	return a + trace

view raw numba_nonumba.py hosted with ❤ by GitHub

This function can be slow when working with large inputs. To speed up the calculation, we can use Numba to compile the function:

	@jit(nopython=True)
	def go_fast(a): # Function is compiled and runs in machine code
	trace = 0.0
	for i in range(a.shape[0]):
	trace += np.tanh(a[i, i])
	return a + trace

view raw numba_def.py hosted with ❤ by GitHub

By adding the @jit decorator, Numba compiles the function into machine code, which can be executed much faster than the interpreted code. The compiled function can then be used just like any other Python function:

	from numba import jit
	import numpy as np
	import time

	x = np.arange(10000).reshape(100, 100)

	@jit(nopython=True)
	def go_fast(a): # Function is compiled and runs in machine code
	trace = 0.0
	for i in range(a.shape[0]):
	trace += np.tanh(a[i, i])
	return a + trace

	def go_fast_nonumba(a):
	trace = 0.0
	for i in range(a.shape[0]):
	trace += np.tanh(a[i, i])
	return a + trace

	# DO NOT REPORT THIS... COMPILATION TIME IS INCLUDED IN THE EXECUTION TIME!
	start = time.time()
	go_fast(x)
	end = time.time()
	print("Elapsed (with compilation) = %s" % (end - start))

	n_rep = 100;
	# NOW THE FUNCTION IS COMPILED, RE-TIME IT EXECUTING FROM CACHE
	start = time.time()
	[go_fast(x) for _ in range(n_rep)]
	elp1 = time.time() - start
	print("Elapsed (after compilation) = %s" % (elp1 / n_rep))


	start = time.time()
	[go_fast_nonumba(x) for _ in range(n_rep)]
	elp2 = time.time() - start
	print("Elapsed (No NUMBA) = %s" % (elp2 / n_rep))
	print('Ratio: %s' % (elp1/elp2))

view raw numba_total.py hosted with ❤ by GitHub

The output of the above script is:

	Elapsed (with compilation) = 0.1845691204071045
	Elapsed (after compilation) = 1.4524459838867188e-05
	Elapsed (No NUMBA) = 0.0002772068977355957
	Ratio: 0.05239573747086498

view raw numba_out hosted with ❤ by GitHub

In this example, we see a significant speedup in the execution time of the function. The same technique can be used to speed up other types of calculations, such as matrix operations, Monte Carlo simulations, and more.

Reference

[1] A ~5 minute guide to Numba

Share on

Twitter Facebook LinkedIn

Ramin Toosi

Accelerating Python code with Numba

Example

Reference

Share on

You May Also Enjoy

Fine-Tuning LLaMA 3.2-11B-Vision for Product Descriptions [Read More]

Diffusion From Scratch To Generate Cute Anime Faces! [Read More]

Running Llama 3.2 in Rust [Read More]

Boosting Model Efficiency with Quantization and Pruning in PyTorch [Read More]