CoCalc Shared FilesNumpy Broadcasting Tutorial.sagewsOpen in CoCalc with one click!
Authors: Trevor Clark, Harald Schilly, William A. Stein
Views : 5
%auto import numpy as np # Import the numpy library

Broadcasting with NumPy

Often large amounts of data needs to be manipulated by addition, subtraction, division, and multiplication. In order to make programing simpler, and computation more efficient, problems are often vectorized. In other words, problems are formulated in terms of vectors, matrices, tensors, and scalers (0th order tensors), generally referred to as arrays. Operations commonly need to be applied element wise to matrix and vector quantities. With a lower level language, like C, these operations would be performed using loops, such as a for loop. However higher level languages, like NumPy in Python, allows programmers to generalize operations to apply to arrays, without need for explicit loops. In addition to allowing the programmer to avoid having to write loops, NumPy utilizes libraries that use specialized CPU routines which are more efficient than simple loops. Broadcasting refers how computations with arrays of different sizes, or shapes, are performed. Before defining specifically how broadcasting in NumPy works consider a few examples.

Here are two arrays with shape or dimension of (3,).

a = np.array([1.0, 2.0, 3.0]) # shape (3,) b = np.array([2.0, 2.0, 2.0]) # shape (3,) a,b
(array([ 1., 2., 3.]), array([ 2., 2., 2.]))

a*b in effect multiplies corresponding elements in each array together and results in another array of shape (3,)

a*b
array([ 2., 4., 6.])

This result is fairly straight forward. Notice that the array b is the same number for all array positions. Here we can use NumPy's broadcasting feature to simplify the operation. Again array a has shape (3,), but b is a scaler or a 0th-order tensor.

a = np.array([1.0, 2.0, 3.0]) # shape (3,) b = 2.0 # scalar a,b
(array([ 1., 2., 3.]), 2.00000000000000)
a*b
array([ 2., 4., 6.])

This is the same result as the first case, but b did not need to be defined as an array with the same shape as array a. b can be thought of as being "stretched" across the length of array a. For this example it seems trivial, but when moving to larger and higher order arrays this is a powerful tool.

The next example shows how to multiply each column of an array of shape (2,3) by a different value using broadcasting. Each element in the first column of a is multiplied by the first element of b, each element in the second column of a is multiplied by the second element of b, and each element in the third column of a is multiplied by the third element of b. This results in having the first column of a multiplied by 1, the second column multiplied by 2, and the third column multiplied by 3.

a = np.array([[1.0, 2.0, 3.0],[2.0, 4.0, 6.0]]) # is an array of shape (2,3) b = np.array([1.0,2.0,3.0]) # b is an array of shape (3,) a*b
array([[ 1., 4., 9.], [ 2., 8., 18.]])

Similarly, broadcasting can be used to add 1 to the first column of a, 2 to the second column of a, and 3 to the third column of a.

a+b
array([[ 2., 4., 6.], [ 3., 6., 9.]])

In the following case broadcasting does not work.

a = np.array([1,2,3]) # shape (3,) b = np.array([1,2,3,4]) # shape (4,) a*b
Error in lines 3-3 Traceback (most recent call last): File "/projects/sage/sage-6.10/local/lib/python2.7/site-packages/smc_sagews/sage_server.py", line 904, in execute exec compile(block+'\n', '', 'single') in namespace, locals File "", line 1, in <module> ValueError: operands could not be broadcast together with shapes (3,) (4,)

To understand when broadcasting does and doesn't work a formal understanding is required.

Definition

The following is based on the official NumPy documentation

Broadcasting describes how NumPy treats arrays of different sizes (matrices, vectors, and tensors) during an operation.

When operating on arrays NumPy compares their shapes starting with the last dimension. Where the shape of an n-dimensional array is of the form (d1 , d2 , d3, ... , dn). In order for dimensions of two different arrays to be compatible for broadcasting one of the following two conditions must be met.

  1. They must be equal
  2. One of them must be 1

The resulting array has a size that is the maximum along each of the input arrays.

Note: The shape of any given array, a, can be found by a.shape.

Examples

In the previous example array a had shape (3,) and b was a scalar, or in otherwords of shape (1,).


A    (2d array):   1x3

B    (1d array):     1

a*b  (2d array):   1x3

Numpy compares the last dimensions 3 and 1, the second condition for broadcasting is met. b is "stretched" to match a and the operation proceeds.

Here are a couple more generic examples.


A        (3d array):   256 x 256 x 3

B        (1d array):               3

Result   (3d array):   256 x 256 x 3

In this case both A and B arrays have axes with length one that are expanded when broadcasting.


A        (4d array):   8 x 1 x 6 x 1

B        (3d array):       7 x 1 x 5

Result   (4d array):   8 x 7 x 6 x 5

Now a couple examples that do not work


A   (1d array):   3

B   (1d array):   4

A * B will result in an error because their shapes are not compatible according to the two conditions stated earlier.


A   (1d array):       2 x 1

B   (1d array):   8 x 4 x 3

Here we see that the last dimension is compatible and the second to last is not.

Outer Operations and np.newaxis

When using broadcasting the command np.newaxis can be helpful. The newaxis object creates an axis of length one. For example, given a 1-dimensional array with shape (x,) using new axis the shape can be changed to (1,x) or (x,1) depending on what is desired. In effect making a row or column vector.

The following example has an array, a, of shape (4,) and a second array, b, of shape (3,). According to the two conditions stated earlier broadcasting for these two matrices is not valid

a = np.array([0.0,10.0,20.0,30.0]) # a is of shape (4,) b = np.array([0,1.0,2.0]) # b is of shape (3,) a + b
Error in lines 3-3 Traceback (most recent call last): File "/projects/sage/sage-6.10/local/lib/python2.7/site-packages/smc_sagews/sage_server.py", line 904, in execute exec compile(block+'\n', '', 'single') in namespace, locals File "", line 1, in <module> ValueError: operands could not be broadcast together with shapes (4,) (3,)

An axis of length 1 can be created for array a changing its shape to (4,1) using np.newaxis to make broadcasting valid.

a[:,np.newaxis]+b # a is now of shape (4,1) and the operation can be completed
array([[ 0., 1., 2.], [ 10., 11., 12.], [ 20., 21., 22.], [ 30., 31., 32.]])

The result can be visualized with the following figure.

“EricsBroadcastingDoc - SciPy wiki dump,” Scipy: EricsBroadcastingDoc. [Online]. Available: http://scipy.github.io/old-wiki/pages/EricsBroadcastingDoc. [Accessed: 02-Apr-2016].

This corresponds to an "outer" operation. A common outer operation is the outer product of two vectors. Using numpy broadcasting the outer product of a and b can easily be realized.

a[:,np.newaxis]*b
array([[ 0., 0., 0.], [ 0., 10., 20.], [ 0., 20., 40.], [ 0., 30., 60.]])

Practical Applications

Lets explore two practical examples that showcase how useful broadcasting can be.

Example 1

This is an example taken with permission from Eli Bendersky's website from a blog post explaining broadcasting in NumPy.

E. Bendersky, “Broadcasting arrays in Numpy - Eli Bendersky’s website.” [Online]. Available: http://eli.thegreenplace.net/2015/broadcasting-arrays-in-numpy/. [Accessed: 02-Apr-2016].

Say we have a large data set; each datum is a list of parameters. In NumPy terms, we have a 2-D array, where each row is a datum and the number of rows is the size of the data set. Suppose we want to apply some sort of scaling to all this data - every parameter gets its own scaling factor; in other words, every parameter is multiplied by some factor.

Just to have something tangible to think about, let's count calories in foods using a macro-nutrient breakdown. Roughly put, the caloric parts of food are made of fats (9 calories per gram), protein (4 calories per gram) and carbs (4 calories per gram). So if we list some foods (our data), and for each food list its macro-nutrient breakdown (parameters), we can then multiply each nutrient by its caloric value (apply scaling) to compute the caloric breakdown of each food item:

Here is the first table shown above that has quantities in terms of grams (g).

macros = np.array([ [0.3, 2.5, 3.5], [2.9, 27.5, 0], [0.4, 1.3, 23.9], [14.4, 6, 2.3]]) # shape (4,3)

Next a key to convert grams of fats, proteins, and carbs to calories is created. 9 cal per gram of fat, 4 cals per gram of protein, and 4 cals per gram of carbs.

cal_per_macro = np.array([9,4,4]) # shape (3,)

Now, even if the table has an arbitrarily large number of entries converting grams of nutrient to calories is easy. As shown below the second table is generated.

Food_cal = macros*cal_per_macro Food_cal
array([[ 2.7, 10. , 14. ], [ 26.1, 110. , 0. ], [ 3.6, 5.2, 95.6], [ 129.6, 24. , 9.2]])

Example 2

Consider an image represented as an array of RGB values of shape (d1,d2,3), where d1 x d2 is the size of the image in pixels, and for each pixel there are three numeric values Red, Green, and Blue (RGB).

Here the SciPy library is used to import a generic image and matplotlib is used to display the image.

%auto import scipy.misc import matplotlib.pyplot as plt
image = scipy.misc.face() # import a generic image from the scipy library print('image.shape = ' + str(image.shape)) f = plt.imshow(image) plt.show(f)
image.shape = (768, 1024, 3)

The shape is (768,1024,3) so the size of the image is 768 x 1024. We can use broadcasting to apply a filter to the RGB values at each pixel. As a simple example, to leave the R values and get rid of the G and B values multiply the image array by an array of shape (3,): [1,0,0].

red_filter = np.array([1,0,0],dtype='uint8') # shape (3,) red_image = image * red_filter f = plt.imshow(red_image) plt.show(f)

Note for Matlab/Octave users

Various other high level programming languages offer similar functionality. For Matlab/Octave users, the case of an array times a scalar works the same, but when two arrays are multiplied together the behavior is very different. Similar functionality to NumPy's broadcasting can be achieved in Matlab/Octave using the bsxfun() or the repmat() function, but it is not as straightforward or readable as the built in broadcasting for numpy. The Matlab/Octave * operator is similar to the NumPy .dot() or * using the NumPy matrix class.

For example, vectors are always matrices, and may need transposition for scalar multiplication.

%octave a = [1, 2; 3, 4]; b = [5, 10]; a*b'
ans = 25 55

Note for R users

Numpy does not have R's way of repeating vectors that are too short.

a = np.array([1,2,3,4]) b = np.array([5,10]) a+b
Error in lines 3-3 Traceback (most recent call last): File "/projects/sage/sage-6.10/local/lib/python2.7/site-packages/smc_sagews/sage_server.py", line 904, in execute exec compile(block+'\n', '', 'single') in namespace, locals File "", line 1, in <module> ValueError: operands could not be broadcast together with shapes (4,) (2,)

while this works in R

%r a <- c(1,2,3,4) b <- c(5, 10) a + b
[1] 6 12 8 14

Explore

You can explore broadcasting by generating arrays with different sizes performing, an operation with them, and observing the result using the code below.

np.random.randint(N,size=(d1,d2,...,dn)) generates an array with shape (d1,d2,...,dn) filled with random integers between 0 and N.

a = np.random.randint(10,size=(4,1)); b = np.random.randint(10,size=(2,4,3)); a*b
array([[[42, 0, 36], [ 8, 40, 24], [ 0, 0, 0], [ 0, 63, 54]], [[ 6, 54, 0], [32, 16, 64], [ 0, 0, 0], [54, 0, 9]]])

Exercises

  1. Using broadcasting add 100 to each entry in array a.
a = np.array([[1,2,3,4],[5,6,7,8]])
# Answer b = 100; a+b
array([[101, 102, 103, 104], [105, 106, 107, 108]])
  1. Create an array, b1, that when multiplied with array a will cause the first column of a to be multiplied by 10, the second column by 5, the third column by 2, and the fourth column by 1.
a = np.array([[ 1, 2, 5, 10], [ 1, 2, 5, 10], [ 1, 2, 5, 10], [ 1, 2, 5, 10]])
# Answer b1 = np.array([10,5,2,1]) a*b1
array([[10, 10, 10, 10], [10, 10, 10, 10], [10, 10, 10, 10], [10, 10, 10, 10]])
  1. Create and array, b2, that when multiplied with array a using np.newaxis will cause the first row of a to be multiplied by 1, the second row by 2, the third row by 3, and the fourth row by 4.
a = np.array([[ 1, 2, 5, 10], [ 1, 2, 5, 10], [ 1, 2, 5, 10], [ 1, 2, 5, 10]])
# Answer b2 = np.array([1,2,3,4]) a*b2[:,np.newaxis]
array([[ 1, 2, 5, 10], [ 2, 4, 10, 20], [ 3, 6, 15, 30], [ 4, 8, 20, 40]])
  1. Using np.newaxis compute the outer product of the vectors [1,2,3,4,5] and [2,4,6,8,10].
# Answer a = np.array([1,2,3,4,5]) b = np.array([2,4,6,8,10]) a[:, np.newaxis] *b
array([[ 2, 4, 6, 8, 10], [ 4, 8, 12, 16, 20], [ 6, 12, 18, 24, 30], [ 8, 16, 24, 32, 40], [10, 20, 30, 40, 50]])
2*a*a[:, np.newaxis]
array([[ 2, 4, 6, 8, 10], [ 4, 8, 12, 16, 20], [ 6, 12, 18, 24, 30], [ 8, 16, 24, 32, 40], [10, 20, 30, 40, 50]])