MCS 275 Spring 2021
Emily Dumas
Course bulletins:
In Python, a sequence is an object containing elements that can be accessed by a nonnegative integer index, e.g. list, tuple, str.
An iterable is a more general concept for an object that can provide items one by one when used in a for loop.
Sequences can do this, but there are other examples:
iterable | value |
---|---|
file | line of text |
sqlite3.Cursor | row |
dict, dict.keys() | key |
range | integer |
Unlike a sequence, an iterable may not store (or know) the next item until it is requested.
This is called laziness and can provide significant advantages.
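As a small illustration of items being provided one by one, the sketch below pulls lines from a file-like object with the built-in iter() and next(), and iterates over a dict to get its keys. The io.StringIO object stands in for an open text file, and all names and contents here are made up for the example.

```python
import io

# io.StringIO stands in for an open text file (an assumption for this demo).
fake_file = io.StringIO("first line\nsecond line\nthird line\n")

it = iter(fake_file)   # ask the iterable for an iterator
first = next(it)       # the first line is produced only when requested
second = next(it)      # same for the second line

remaining = list(it)   # everything that has not been requested yet

# Iterating over a dict gives its keys, in insertion order.
counts = {"apples": 3, "pears": 5}
keys = [k for k in counts]

print(repr(first), repr(second), remaining, keys)
```

A for loop does the same thing implicitly: it calls next() repeatedly until the iterable has nothing more to give.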
Generators are do-it-yourself lazy iterables.
In a function, return x will end the function call and make x its value for the purposes of evaluation.
When a function call is used as an iterable, the statement yield x will pause the function and make x the next value given by the iterable.
The next time a value is needed, execution of the function will continue from where it left off.
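To contrast the two statements, here is a minimal sketch with two hypothetical functions that scan the same list. The names first_even_return and evens_yield are invented for this example.

```python
def first_even_return(values):
    """Return the first even value; the call ends at `return`."""
    for v in values:
        if v % 2 == 0:
            return v  # the function stops here for good


def evens_yield(values):
    """Yield every even value; the function pauses at each `yield`."""
    for v in values:
        if v % 2 == 0:
            yield v  # paused here until another value is requested


data = [3, 4, 7, 10, 12]
single = first_even_return(data)        # one value, then the call is over
many = [v for v in evens_yield(data)]   # resumes after each yield

print(single, many)
```

The loop bodies are identical; only the keyword differs, and that decides whether the scan stops at the first match or continues on demand.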
Imagine you can write a function which will print a bunch of values (perhaps doing calculations along the way).
If you change print(x) to yield x, then you get a function that can be used as an iterable, lazily producing the same values.
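The print-to-yield conversion can be sketched with a pair of illustrative functions that compute running totals. Both function names are assumptions made up for this example.

```python
def print_running_totals(values):
    """Print the running total after each value."""
    total = 0
    for v in values:
        total += v
        print(total)


def running_totals(values):
    """Same calculation, but each total is yielded instead of printed."""
    total = 0
    for v in values:
        total += v
        yield total


print_running_totals([1, 2, 3, 4])

# The generator version lets the caller decide what to do with each value.
totals = list(running_totals([1, 2, 3, 4]))
print(totals)
```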
Behind the scenes, a function containing yield will return a generator object (just once), which is an iterable.
It contains the local state of the function, and to provide a value it runs the function until the next yield.
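One way to watch this happen is to drive a generator object by hand with next(). In the sketch below, countdown and the trace list are hypothetical names used only to record when the function body actually runs.

```python
trace = []  # records when the function body actually runs


def countdown(n):
    trace.append("body started")
    while n > 0:
        yield n
        n -= 1
    trace.append("body finished")


g = countdown(2)          # creates the generator object; no body code runs yet
ran_on_call = list(trace)

a = next(g)               # runs the body up to the first yield
b = next(g)               # resumes after the yield, with n remembered

try:
    next(g)               # the body runs to the end; no more values
    exhausted = False
except StopIteration:
    exhausted = True

print(type(g).__name__, ran_on_call, a, b, exhausted, trace)
```

A for loop makes these next() calls for you and treats StopIteration as the signal to end the loop.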
The list and tuple constructors accept an iterable.
So if g is a generator object, list(g) will pull all of its items and put them in a list.
Generator objects are "one-shot" iterables, i.e. you can only iterate over them once.
Since generator objects are usually return values of functions, it is typical to have the function call in the loop that performs iteration.
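A short sketch of the one-shot behavior, using a hypothetical generator function named squares:

```python
def squares(n):
    """Yield the squares 1, 4, 9, ..., n**2."""
    for k in range(1, n + 1):
        yield k * k


g = squares(4)
first_pass = list(g)    # pulls every item out of g
second_pass = list(g)   # g is already exhausted, so nothing is left

# Calling the function inside the loop header gives a fresh generator each time.
total = 0
for s in squares(4):
    total += s

print(first_pass, second_pass, total)
```

Iterating over an exhausted generator is not an error; it simply produces no items, which can hide bugs if the same object is reused by accident.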
A generator can delegate to another generator, i.e. say "take values from this other generator until it is exhausted".
The syntax is
yield from GENERATOR
which is approximately equivalent to:
for x in GENERATOR:
    yield x
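As a minimal sketch of delegation, the hypothetical generator evens_then_odds below hands off to two other generators in turn. All three function names are invented for this example.

```python
def evens(n):
    """Yield the even integers in range(n)."""
    for k in range(0, n, 2):
        yield k


def odds(n):
    """Yield the odd integers in range(n)."""
    for k in range(1, n, 2):
        yield k


def evens_then_odds(n):
    """Delegate to evens(n) until it is exhausted, then to odds(n)."""
    yield from evens(n)
    yield from odds(n)


combined = list(evens_then_odds(6))
print(combined)
```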
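You can often replace the square brackets of a list comprehension with parentheses to get a generator comprehension (formally called a generator expression); it behaves similarly but evaluates lazily. When it is the only argument of a function call, the extra parentheses can be omitted.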
# Create a list, then sum it
# Uses memory proportional to N
sum([ x**2 for x in range(1,N+1) ])
# Create a generator, then sum values
# it yields. Memory usage independent
# of N.
sum( x**2 for x in range(1,N+1) )
This won't work in a context that needs a sequence (e.g. in len(), random.choice(), ...).
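The sketch below shows what this failure looks like for len(), and one possible workaround: convert to a list first, at the cost of storing every item. The variable names are illustrative.

```python
import random

gen = (x**2 for x in range(1, 6))

try:
    len(gen)              # a generator has no length
    len_worked = True
except TypeError:
    len_worked = False

# Workaround: build a list, which is a sequence.
values = list(gen)
count = len(values)
pick = random.choice(values)

print(len_worked, values, count, pick)
```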
To finish off MCS 275, four pieces of advice: