Lecture 21

Numpy

MCS 275 Spring 2024
Emily Dumas

Lecture 21: Numpy

Reminders and announcements:

  • Homework 7 due tomorrow (I fixed a typo where it showed an impossible due date)
  • I'm grading Project 2
  • A bunch of CSV/JSON/Pillow stuff was added to the samplecode/fileformats folder

Working with data

New unit: working with data (mostly numeric data) in Python.

Covers Python packages numpy and matplotlib.

These are widely used in many fields, and particularly in the physical sciences and data science.

A good book

For this unit I strongly recommend reading:

It is available for free online. Chapter 2 is about numpy.

Installing numpy

In most cases, pip is all you need:

python3 -m pip install numpy

Other methods are described in the Numpy docs.

Test:


        >>> import numpy
        >>> numpy.__version__
        '1.21.5'
    

Import as

You can give a module a new name at import time, e.g.


        import math as sun
        sun.tan(0.5)
    

Since numpy has a lot of global names, most people import it under a shorter name to save typing:


        import numpy as np
    

numpy purpose

  • Fast, type-homogeneous, multidimensional arrays
    • e.g. vector, matrix, tensor, ...
  • Large library of mathematical functions and algorithms (especially linear algebra)

Numpy is one of the most-used Python packages in scientific computing (computational math, data science, machine learning, ...).
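A small sketch of what "type-homogeneous" and "multidimensional" mean in practice. The variable names `mixed` and `M` are made up for this illustration:

```python
import numpy as np

# Mixing ints and floats: numpy picks one dtype for the whole array
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)   # float64

# A 2D array (matrix) built from a list of lists
M = np.array([[1, 2, 3], [4, 5, 6]])
print(M.ndim)        # 2
print(M.shape)       # (2, 3)
```

Every element of `mixed` is stored as a float, even the ones written as integers.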

arrays

Implemented in the `np.ndarray` class; instances are usually created with the `np.array` function.

Without numpy:


        v = [2,3] 
        w = [3,-2]
        v + w    # [2,3,3,-2]
        3*v      # [2,3,2,3,2,3]
        v.dot(w) # fail!
        A = [ [2,1], [1,1] ]
        type(A)  # list
        A*v      # fail!
    

With numpy:


        v = np.array([2,3])
        w = np.array([3,-2])
        v + w    # [5,1]
        3*v      # [6,9]
        v.dot(w) # 0
        A = np.array([ [2,1], [1,1] ])
        A*v      # [[4,3],[2,3]]: elementwise with broadcasting (possibly confusing)
        A.dot(v) # [7,5] (matrix-vector mult)
    

Notebook time

I'll build a Python notebook demonstrating some basic features of numpy.

After lecture it will be in the course sample code repo.

Indexing and slicing

Numpy has powerful syntax for retrieving individual elements or collections of elements of arrays.

Most basic version: A[i,j] gives the element at row i, column j for a 2D array. Similar in higher dimensions, e.g. A[i,j,k,l].
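For instance, here is a minimal sketch of element access in 2D and 3D. The arrays `A` and `T` are built with `np.arange` and `reshape` purely so the entries are easy to predict:

```python
import numpy as np

A = np.arange(12).reshape(3, 4)   # rows are [0..3], [4..7], [8..11]
print(A[1, 2])     # 6: row 1, column 2
print(A[-1, 0])    # 8: last row, first column

T = np.arange(24).reshape(2, 3, 4)  # a 3D array
print(T[1, 2, 3])  # 23: the last element
```

As with lists, indices start at 0 and negative indices count from the end.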

Slices return views of part of the array, not copies, so modifying a slice also modifies the original array.
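The view behavior can be checked directly. This is a sketch with made-up names (`col`, `B`); `np.shares_memory` is used only to confirm which arrays share storage:

```python
import numpy as np

A = np.zeros((3, 3))
col = A[:, 1]        # slice: a view of the middle column
col[:] = 7           # writes through to A
print(A[0, 1])       # 7.0

B = A[:, 1].copy()   # explicit copy: independent of A
B[:] = -1
print(A[0, 1])       # still 7.0

print(np.shares_memory(A, col))  # True
print(np.shares_memory(A, B))    # False
```

If an independent array is needed, calling `.copy()` on the slice is one way to get it.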

Ufuncs

Numpy's "ufuncs" or universal functions are functions that can be applied directly to arrays, automatically acting on each element.

Numpy provides a lot of these.

Usually, ufuncs allow you to avoid explicit iteration over array elements (which is much slower).
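A rough comparison of a ufunc call with the equivalent explicit loop, using `np.sqrt` as the example ufunc (the names `roots` and `roots_loop` are just for this sketch):

```python
import math
import numpy as np

x = np.array([0.0, 1.0, 4.0, 9.0])

# ufunc: one call acts on every element
roots = np.sqrt(x)            # array([0., 1., 2., 3.])

# Equivalent explicit loop (typically much slower on large arrays)
roots_loop = np.array([math.sqrt(t) for t in x])

print(np.array_equal(roots, roots_loop))  # True
```

Arithmetic operators on arrays, such as `x + 1` or `2 * x`, are applied elementwise in the same way.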

Bool gotcha

np.array([5,0,1])==np.array([0,0,0])

evaluates to

np.array([False,True,False])

and a numpy array with more than one element cannot be converted to a bool (trying raises ValueError), so this expression cannot be used directly as the condition of an if.

To test if two arrays are equal, use one of:

np.all(A==B)
np.array_equal(A,B)
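The following sketch shows the error and both tests in action. The array `C` is an extra example (not from the slides) added to show that the two tests can disagree when the shapes differ, because `==` broadcasts:

```python
import numpy as np

A = np.array([5, 0, 1])
B = np.array([0, 0, 0])

try:
    if A == B:           # truth value of a 3-element array is ambiguous
        print("equal")
except ValueError as err:
    print("ValueError:", err)

print(np.all(A == B))         # False
print(np.array_equal(A, B))   # False

# Different shapes: == broadcasts C against B, array_equal does not
C = np.array([0])
print(np.all(B == C))         # True
print(np.array_equal(B, C))   # False, shapes (3,) and (1,) differ
```

So `np.array_equal` is the stricter choice when arrays of different shapes should never count as equal.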

Aggregations

Numpy has aggregation operations such as sum, prod, max, min, all, and any, which reduce the dimension of an array (by default, all the way down to a single value).
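A short sketch of a few aggregations on a 2x3 array, including the `axis` argument, which reduces along one dimension instead of over the whole array:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])

print(A.sum())          # 21: all elements, result is a single number
print(A.sum(axis=0))    # [5 7 9]: sum down each column
print(A.max(axis=1))    # [3 6]: max of each row
print(A.prod())         # 720
print((A > 2).any())    # True
print((A > 0).all())    # True
```

With `axis=0` or `axis=1` the 2D array becomes 1D; with no axis given it becomes a scalar.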

Revision history

  • 2023-03-05 Finalization of the 2023 lecture this was based on
  • 2024-02-26 Initial publication