This homework assignment must be submitted in Gradescope by Noon central time on Tuesday April 11, 2023.
Collaboration is prohibited, and you may only access resources (books, online, etc.) listed below.
This homework is about SQLite.
Most relevant:
Less likely to be relevant, but also allowed:
This homework assignment has 2 problems, numbered 2 and 3. The grading breakdown is:
Points | Item |
---|---|
3 | Autograder |
6 | Problem 2 |
6 | Problem 3 |
15 | Total |
The part marked "autograder" reflects points assigned to your submission based on some simple automated checks for Python syntax, etc. The result of these checks is shown immediately after you submit.
Ask your instructor or TA a question by email, in office hours, or on discord.
The work for this problem should go in a file called hwk12prob2.py
.
If you don't already have it, download the powerplants SQLite database from Worksheet 12 and save it in a place you can find:
As a reminder, the only table is called powerplants
and the columns present in that table are described at
Below you'll find a basic Python script that opens that database and prints the total number of columns. I'm providing it so you focus more on queries than on Python script details. The script requires powerplants.sqlite
to be in the same directory.
"Count rows in the powerplants table"
import sqlite3
# Open connection
con = sqlite3.connect("powerplants.sqlite")
# Run query
res = con.execute("SELECT COUNT(*) FROM powerplants;")
# Get one result row; there is also a .fetchall() method to get
# a list of rows, or you can iterate over result rows in a for loop.
first_result_row = res.fetchone()
# Now first_result_row is a tuple like (30000,)
# We want its first entry
numrows = first_result_row[0]
# Close connection
con.close()
print("The number of rows in table `powerplants` is {}.".format(numrows))
Compose appropriate SQL queries and modify this Python script so that it prints answers to all of the following questions:
Determine which of these companies own power plants in at least two different countries:
Biffa
EDF
Eskom
IPP
PacifiCorp
For any that own plants in two or more countries, print the company name.
Note: I don't mean you should run the queries yourself and then copy the answers into print()
calls. Instead, the Python script you submit should execute the queries each time it is run.
import sqlite3
# Open connection
con = sqlite3.connect("powerplants.sqlite")
# 1
query1 = con.execute("SELECT COUNT(*) FROM powerplants WHERE country = 'Uganda';")
num_uganda, = query1.fetchone() # Use comma to unpack tuple of length 1
print("Uganda has {} power plants listed in the database".format(num_uganda))
# 2
query2 = con.execute("SELECT DISTINCT primary_fuel FROM powerplants WHERE country = 'Jamaica';")
fuels = [row[0] for row in query2]
fuels = ", ".join(fuels)
print("Jamaica uses the following fuels: {}".format(fuels))
# 3
query3_numerator = con.execute("""SELECT COUNT(*) FROM powerplants
WHERE (output_gwh_2017 IS NOT NULL AND output_gwh_2018 IS NOT NULL
AND output_gwh_2017 > output_gwh_2018);""")
query3_denominator = con.execute("""SELECT COUNT(*) FROM powerplants
WHERE (output_gwh_2017 IS NOT NULL AND output_gwh_2018 IS NOT NULL);""")
numerator = query3_numerator.fetchone()[0]
denominator = query3_denominator.fetchone()[0]
print("""Among power plants that have 2017 and 2018 data available, {:.2f}% had greater output in 2017 than 2018""".format(100*numerator/denominator))
# 4
query4 = con.execute("SELECT name FROM powerplants WHERE country='India' ORDER BY output_gwh_2016 DESC LIMIT 5;")
print("The top 5 power plants in India by output in 2016 are:")
for i, row in enumerate(query4):
print("#{}: {}".format(i+1, row[0]))
# 5
# Use a GROUP BY .. HAVING statement to only select owners present in at least 2 countries
query5 = con.execute("""SELECT DISTINCT owner FROM powerplants
WHERE owner IN ('Biffa', 'EDF', 'Eskom', 'IPP', 'PacifiCorp')
GROUP BY owner HAVING COUNT(DISTINCT country) >= 2;""")
companies = [row[0] for row in query5]
companies = ", ".join(companies)
print("The following companies have powerplants in at least two different countries: {}".format(companies))
# Close connection
con.close()
todo.py
for demos¶The work for this problem should go in a file called hwk12prob3.py
.
This problem requires you to be familiar with the "todo list" sample application from the course sample code repository, which uses a SQLite database. It can be found at:
Download it, save it somewhere, try it out in the terminal, and take a look at the code. That script will make its own database the first time you run it. Try commands like:
python3 todo.py add Submit homework 12
python3 todo.py add Finish project 4
python3 todo.py list
python3 todo.py done 1
python3 todo.py list
python3 todo.py listall
As you see, this script stores a task list in a database table, and lets you add, remove, or modify items in that list by running the script with appropriate command line arguments.
For reference, here's the table schema it uses in todo.sqlite
:
CREATE TABLE tasks (
id INTEGER PRIMARY KEY,
description TEXT,
completed INTEGER DEFAULT 0
);
Imagine you're planning to use todo.py
as part of a lecture demonstration. It might not be ideal to start with an empty database, and it might also be helpful to have a way to mark every task as not completed (undone) as a sort of "reset" to allow you to once again show how to mark a few of them completed (done).
Write a script hwk12prob3.py
that expects one command line argument. The behavior should be:
reset
, sampledata
, or cleardone
, print an error message and exit.todo.sqlite
.tasks
doesn't already exist, create a table with the columns id
, description
, completed
as described above.reset
, delete all rows from the table tasks
sampledata
, add 10 rows into the database with distinct descriptions. Have all of them start out with completed
set to 0
. Don't set the id
column values explicitly, so if you run this twice, 10 more rows will be added.cleardone
, set completed
to 0
in every rowThis could therefore be used as a companion to todo.py
for demonstrations, e.g.
python3 hwk12prob3.py reset
and then python3 hwk12prob3.py sampledata
so you have a known set of sample tasks (none of them done) in the database.python3 hwk12prob3.py cleardone
Note: The script hwk12prob3.py
does not need to duplicate any features of todo.py
and should not be written by modifying todo.py
. Instead, the two scripts are meant to be used together, accessing the same database but doing different things.
import sys
import sqlite3
import datetime
if sys.argv[1] not in ["reset", "sampledata", "cleardone"]:
print("Error: Only the following commands are accepted: reset, sampledata, cleardone")
exit()
con = sqlite3.connect("todo.sqlite")
# Initial table setup
con.execute("""CREATE TABLE IF NOT EXIST tasks (id INTEGER PRIMARY KEY,
description TEXT,
completed INTEGER DEFAULT 0);""")
# Delete all entries
if sys.argv[1] == "reset":
con.execute("DELETE * FROM tasks;")
# Insert 10 new rows
elif sys.argv[1] == "sampledata":
now = datetime.datetime.now() # This line is not required - but it makes our example descriptions unique
for i in range(10):
# In your homework submissions, your task descriptions may vary - you can put anything you like!
description = "Finish example task {} generated on {}".format(i+1, now)
# In the line below, `description` is in a tuple with only 1 entry because we need to
# use a list/tuple that contains all values that we want to insert into `?` placeholders
con.execute("INSERT INTO tasks (description) VALUES (?);", (description,))
# Set all tasks as completed
elif sys.argv[1] == "cleardone":
con.execute("UPDATE tasks SET completed = 1;")
con.commit()
con.close()