MCS 260 Fall 2021
Emily Dumas
CSV is a format for storing tabular data in a text file.
Basic format: Each line contains some values, separated by commas. There is no universally accepted way to indicate types (e.g. string vs int).
fullname,midterm,final,hwk_avg
Maureen Singh,82.0,91.5,94.0
Yousuf Shaw,78.0,89.0,96.5
Often, the first line contains column headers.
csv.reader(f)
returns an iterable that gives the rows one by one, as lists of values.
Use the return value in a for
loop to process the file row by row.
Hit it with list()
if you need the whole list a once (rare).
Important: When opening to read/write CSV you need to give open()
an extra argument newline=""
.
If the CSV file has has headers, a better option is csv.DictReader(f)
which yields rows as dictionaries, using column headers as keys.
csv.DictReader
does not return the header row.
csv.writer(f)
takes a file object and returns a writer object, which has a useful method:
.writerow(L)
— Write the items in iterable L
to a row in the file.Note: When opening to read/write CSV you need to give open()
an extra argument newline=""
.
csv.DictWriter(f,fieldnames=L)
specifies an iterable L
of field names, and returns a writer object that expects rows as dictionaries. Useful methods:
.writeheader()
— Write the field names to a header row..writerow(d)
— Write the values from dictionary d
to a line of the output file (but only the ones corresponding to keys that are field names).The CSV reader and writer functions can use a separator other than a comma, e.g. specify delimiter="\t"
to read or write tab separated values (TSV).
Some CSV files put values in quotes so that the separator character can appear in the value, e.g.
fullname,occupation
Octavia Spencer,"actor,author"
"Bond, James Bond","spy"
The csv module supports this convention.
csv
module documentation is quite good, especially the Examples.