A document from MCS 260 Fall 2021, instructor Emily Dumas. You can also get the notebook file.

MCS 260 Fall 2021 Worksheet 7

  • Course instructor: Emily Dumas

Topics

This worksheet focuses on JSON and CSV files, as well as the data structures stack and queue.

Problem 1 is different this time

As I announced in lecture, this and future worksheets will have a first problem that is treated differently:

  • Tuesday lab students: Problem 1 will be presented and solved as a group, in a discussion led by your TA.
  • Thursday lab students: Please attempt problem 1 before coming to lab. Bring a solution, or bring questions. The problem will be discussed as a group.

Resources

The main course materials to refer to for this worksheet are:

(Lecture videos are not linked on worksheets, but are also useful to review while working on worksheets. Video links can be found in the course course Blackboard site.)

1. Parentheses

In Lecture 17, we wrote a program parentheses.py to examine mathematical expressions like ((1+2)+3) and check that it is possible to match the parentheses up in pairs. It uses a stack to keep track of all the unmatched left parentheses while scanning through the string.

Modify this program (saving the new one as parentheses2.py) so that while reading an expression, it prints a message every time it encounters a ")" that shows which "(" matches it. The output should look like this:

Expression: (((1+2)/7) + (9/(2+3)))
Detected matching pair of parentheses enclosing: (1+2)
Detected matching pair of parentheses enclosing: ((1+2)/7)
Detected matching pair of parentheses enclosing: (2+3)
Detected matching pair of parentheses enclosing: (9/(2+3))
Detected matching pair of parentheses enclosing: (((1+2)/7) + (9/(2+3)))
Parentheses matched successfully.

2. TV Episodes: From JSON to CSV

From 2010 to 2012, the National Geographic Channel produced a weekly hour-long TV series titled "Python Hunters". While you might imagine this was a show about computer programming, it was actually about snakes.

A JSON file containing a list of episodes can be found from the link below:

(This is not an official reference, but a user-generated catalog from a web site called TVMaze. They make such datasets available for anyone to use, for free, as long as they give proper attribution.)

You're going to need this data file in this problem, so you should visit the link and save it somewhere (e.g. as pythonhunter.json).

The JSON object in this file is a list, and each item in the list is a dictionary describing an episode . All of the dictionaries have the same keys, which are characteristics of an episode.

The same information could be presented in table form and stored in a CSV file. For example, if the only characteristics of an episode we care about are:

  • season
  • number
  • name
  • airdate

then the episodes could be listed in a CSV file that would look like this:

season,number,name,airdate
1,1,The Perfect Storm,2010-07-12
1,2,The Big Freeze,2010-07-19

Write a Python program episodes_to_csv.py that reads the JSON file containing the episode list and creates a CSV file in the format described above.

Use Python's json module to read the JSON file, and use the csv module to write the CSV file. CSV is simple enough that you could probably write the data in this format manually (i.e. with file.write()), but the point of this problem is to get some practice using the csv module.

3. CSV totaler

Write a program csvtotaler.py that expects two command line arguments:

  • sys.argv[1] is the input filename, an existing CSV file
  • sys.argv[2] is the output filename, which will be created by this program (also CSV)

The CSV file specified as the first argument is expected to contain a header row, followed by columns of integer values. There can be any number of columns, and any number of rows, but all the values are expected to be integers. Here's a sample of what it might look like:

Received,Shipped,Damaged,Returned
291,155,3,8
408,120,5,0
355,109,0,3

The program should read this and write a new CSV file that has all the same columns, as well as a new column called "Rowtype" that appears before the others. Each row of data from the input file should be copied here, with the value "Data" in the "Rowtype" column. Then, the sum of each column should be computed and these columnwise totals written to a row at the bottom whose "Rowtype" is "Total". Thus, for the CSV shown above, the output file would contain:

Rowtype,Received,Shipped,Damaged,Returned
Data,291,155,3,8
Data,408,120,5,0
Data,355,109,0,3
Total,1054,384,8,11

4. GovTrack US house of representatives JSON

Download and save this JSON file containing data about current members of the US House of Representatives.

https://www.govtrack.us/api/v2/role?current=true&role_type=representative&limit=438

Now, write a program that reads this and prints the twitter ids of all the representatives from Pennsylvania.

(Do that by exploring the structure of the JSON file yourself, either by loading it into a variable in the REPL or opening the file in an editor and looking around. The point is to practice understanding the layout of information in a JSON file and turning it into code that can access and filter it in Python.)

Bonus round

  • Add a feature so that parentheses2.py highlights the parenthesized part of the whole expression, rather than just printing the part inside the parentheses, e.g.
Expression: (((1+2)/7) + (9/(2+3)))
Detected matching pair of parentheses:
(((1+2)/7) + (9/(2+3)))
  ^---^
Detected matching pair of parentheses enclosing:
(((1+2)/7) + (9/(2+3)))
 ^-------^
Detected matching pair of parentheses enclosing:
(((1+2)/7) + (9/(2+3)))
                ^---^
Detected matching pair of parentheses enclosing:
(((1+2)/7) + (9/(2+3)))
             ^-------^
Detected matching pair of parentheses enclosing:
(((1+2)/7) + (9/(2+3)))
^---------------------^
Parentheses matched successfully.
  • Make it so that parentheses2.py shows all parentheses that haven't been closed, rather than just the rightmost one, in case there "(" with no matching ")".

  • Make parentheses3.py that shows a text-based bar graph of how many parentheses are open at every point in the expression, like this:

(((1+2)/7) + (9/(2+3)))
@@@@@@@@@@@@@@@@@@@@@@@
 @@@@@@@@@   @@@@@@@@@
  @@@@@         @@@@@

The number of "@" characters appearing directly below any place in the expression tells you how many parentheses are open at that point.

Revision history

  • 2021-10-05 Initial release