MCS 275 Spring 2022
Emily Dumas
Course bulletins:
I've converted the example program `urlreadtext.py` to a nicer version, `fetch.py`, that uses `argparse`.
In Python, a sequence is an object containing elements that can be accessed by a nonnegative integer index, e.g. `list`, `tuple`, `str`.
An iterable is a more general concept for an object that can provide items one by one when used in a `for` loop.
Sequences can do this, but there are other examples:
iterable | value |
---|---|
file | line of text |
sqlite3.Cursor* | row |
dict | key |
range | integer |
* That's the return type of `.execute(...)` in `sqlite3`.
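A minimal sketch of a few rows of the table, using only built-in types. The dictionary contents are made up for illustration.

```python
# Iterating over a dict yields its keys (not its values).
ages = {"ada": 36, "grace": 45}  # hypothetical data
dict_items = [k for k in ages]

# Iterating over a range yields integers.
range_items = [n for n in range(3)]

# A str is a sequence, so it is also an iterable: it yields characters.
str_items = [c for c in "abc"]

print(dict_items)   # ['ada', 'grace']
print(range_items)  # [0, 1, 2]
print(str_items)    # ['a', 'b', 'c']
```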
Unlike a sequence, an iterable may not store (or know) the next item until it is requested.
This is called laziness and can provide significant advantages.
Generators are do-it-yourself lazy iterables.
In a function, `return x` will:

- Exit the function
- Replace the function call with `x` for the purposes of evaluation

When a function call is used as an iterable, the statement `yield x` will:

- Pause the function
- Make `x` the next value given by the iterable

The next time a value is needed, execution of the function will continue from where it left off.
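A small sketch contrasting the two statements. Both function names are hypothetical, chosen only for this example.

```python
def first_square_return(n):
    """`return` ends the function on the first pass through the loop."""
    for k in range(1, n + 1):
        return k * k  # later values of k are never reached


def squares_yield(n):
    """`yield` hands back a value, then resumes here on the next request."""
    for k in range(1, n + 1):
        yield k * k


returned = first_square_return(3)
yielded = list(squares_yield(3))

print(returned)  # 1
print(yielded)   # [1, 4, 9]
```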
Imagine you have a function that prints a series of values (perhaps doing calculations along the way).
If you change `print(x)` to `yield x`, you get a function whose call can be used as an iterable, lazily producing the same values.
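For instance, here is a printing function and its generator counterpart. The names `print_powers_of_two` and `powers_of_two` are invented for this sketch.

```python
def print_powers_of_two(limit):
    """Print 1, 2, 4, ... up to and including limit."""
    x = 1
    while x <= limit:
        print(x)
        x *= 2


def powers_of_two(limit):
    """Same logic, with print(x) changed to yield x."""
    x = 1
    while x <= limit:
        yield x
        x *= 2


# The generator version can drive a for loop; each value is
# computed only when the loop asks for it.
collected = []
for p in powers_of_two(20):
    collected.append(p)

print(collected)  # [1, 2, 4, 8, 16]
```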
Behind the scenes, a function containing `yield` will return a generator object (just once), which is an iterable.
It contains the local state of the function, and to provide a value it runs the function until the next `yield`.
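One way to observe this is to record when the function body actually starts running. This is only a sketch: the `log` list and the `countdown` function are hypothetical helpers.

```python
import types

log = []  # records when the function body actually runs


def countdown(n):
    log.append("started")
    while n > 0:
        yield n
        n -= 1


g = countdown(3)  # returns a generator object; the body has not run yet
is_generator = isinstance(g, types.GeneratorType)
log_before = list(log)

values = []
for v in g:  # each iteration runs the body until the next yield
    values.append(v)

print(is_generator)  # True
print(log_before)    # []
print(log)           # ['started']
print(values)        # [3, 2, 1]
```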
The `list` and `tuple` constructors accept an iterable.
So if `g` is a generator object, `list(g)` will pull all of its items and put them in a list.
Generator objects are "one-shot" iterables, i.e. you can only iterate over them once.
Since generator objects are usually return values of functions, it is typical to have the function call in the loop that performs iteration.
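A short sketch of the one-shot behavior, using a hypothetical generator function `letters`.

```python
def letters():
    yield "a"
    yield "b"


g = letters()
first_pass = list(g)   # pulls every item out of g
second_pass = list(g)  # g is already exhausted

# Calling the function inside the loop gives a fresh generator object each time.
fresh = [x.upper() for x in letters()]

print(first_pass)   # ['a', 'b']
print(second_pass)  # []
print(fresh)        # ['A', 'B']
```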
The built-in function `next` will get the next value from an iterator (e.g. a generator object).
It raises `StopIteration` if no more items are available.
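A minimal sketch of `next` in action; `two_values` is a made-up generator function. Note that `next` cannot be applied directly to a list, but the built-in `iter` converts one into an object that `next` accepts.

```python
def two_values():
    yield "first"
    yield "second"


g = two_values()
a = next(g)
b = next(g)

try:
    next(g)  # nothing left to yield
    exhausted = False
except StopIteration:
    exhausted = True

# next() does not accept a list directly; wrap it with iter() first.
it = iter([10, 20])
c = next(it)

print(a, b, exhausted, c)  # first second True 10
```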
A generator can temporarily delegate to another generator, i.e. say "take values from this other generator until it is exhausted".
The syntax is
yield from GENERATOR
which is approximately equivalent to:
for x in GENERATOR:
    yield x
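As an illustration of delegation, the sketch below chains two generators. The names `evens`, `odds`, and `evens_then_odds` are hypothetical.

```python
def evens(n):
    for k in range(0, n, 2):
        yield k


def odds(n):
    for k in range(1, n, 2):
        yield k


def evens_then_odds(n):
    # Take everything from evens(n), then everything from odds(n).
    yield from evens(n)
    yield from odds(n)


combined = list(evens_then_odds(6))
print(combined)  # [0, 2, 4, 1, 3, 5]
```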
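You can often replace the square brackets of a list comprehension with parentheses to get a generator comprehension (officially called a generator expression); it behaves similarly but evaluates lazily. When it is the only argument of a function call, the extra parentheses can be omitted.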
# Create a list, then sum it
# Uses memory proportional to N
sum([ x**2 for x in range(1,N+1) ])
# Create a generator, then sum values
# it yields. Memory usage independent
# of N.
sum( x**2 for x in range(1,N+1) )
A generator comprehension won't work in a context that needs a sequence (e.g. `len()`, `random.choice()`, ...).
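A quick sketch of the failure and one possible workaround, assuming it is acceptable to store all the items: convert to a list first. The value of `N` is arbitrary.

```python
N = 5  # arbitrary value for illustration
squares = (x**2 for x in range(1, N + 1))

try:
    len(squares)  # a generator object has no length
    has_len = True
except TypeError:
    has_len = False

# Converting to a list gives up laziness but provides a real sequence.
squares_list = list(squares)
count = len(squares_list)

print(has_len)       # False
print(squares_list)  # [1, 4, 9, 16, 25]
print(count)         # 5
```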