A document from MCS 275 Spring 2021, instructor Emily Dumas.

urllib examples

MCS 275 Spring 2021 - Instructor Emily Dumas

Lecture 39

Some APIs to try:

http://roll.diceapi.com/json/d6
http://roll.diceapi.com/json/3d12
https://cat-fact.herokuapp.com/facts/random?amount=2
https://api.fda.gov/drug/label.json?search=_exists_:boxed_warning+AND+effective_time:20210101+TO+20210419
https://api.exchangerate-api.com/v4/latest/USD
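
Several of the URLs above carry query parameters after a `?`. Rather than typing these by hand, they can be assembled with `urllib.parse.urlencode`, which also takes care of escaping. The following is a minimal sketch; the helper name `build_url` is hypothetical and only there for illustration.

```python
from urllib.parse import urlencode


def build_url(base, params):
    """Append the dict `params` to `base` as a query string.

    Hypothetical helper for illustration; assumes `base` has no query yet.
    """
    return base + "?" + urlencode(params)


# Rebuild one of the URLs listed above from its parts
url = build_url("https://cat-fact.herokuapp.com/facts/random", {"amount": 2})
print(url)
```

The resulting string can be passed to `urlopen` like any other URL.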

Some URLs to scrape (always taking care to limit request volume and follow the site's terms of service!):

https://catalog.uic.edu/ucat/academic-calendar/
https://www.dumas.io/teaching/2020/fall/mcs260/
https://mathoverflow.net/
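
One simple way to limit request volume is to pause between requests. The sketch below assumes a fixed delay is acceptable; the function name `fetch_all` and the 2-second default are illustrative choices, not a rule taken from any site's terms of service.

```python
import time
from urllib.request import urlopen


def fetch_all(urls, delay=2.0, timeout=10):
    """Fetch each URL in turn, sleeping `delay` seconds between requests.

    Hypothetical helper; returns a dict mapping each URL to its body (bytes).
    """
    pages = {}
    for i, url in enumerate(urls):
        if i > 0:
            time.sleep(delay)  # wait before every request except the first
        with urlopen(url, timeout=timeout) as response:
            pages[url] = response.read()
    return pages
```

Calling `fetch_all` on a list of page URLs from one site would then make at most one request every `delay` seconds.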

API request example

In [3]:
import json
from urllib.request import urlopen

with urlopen("https://www.boredapi.com/api/activity") as response:
    data_bytes = response.read()  # the response body, as a bytes object
    data = json.loads(data_bytes)  # json.loads accepts bytes as well as str
print("Maybe you could... ",data["activity"])
Maybe you could...  Have a paper airplane contest with some friends
In [4]:
data
Out[4]:
{'activity': 'Have a paper airplane contest with some friends',
 'type': 'social',
 'participants': 4,
 'price': 0.02,
 'link': '',
 'key': '8557562',
 'accessibility': 0.05}
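
The request above assumes that the server is reachable and returns valid JSON. A slightly more defensive version might look like the sketch below. The name `fetch_json`, the 10-second timeout, and the choice to return `None` on failure are all assumptions made for illustration.

```python
import json
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def fetch_json(url, timeout=10):
    """Return the decoded JSON body of `url`, or None if the request fails.

    Hypothetical helper for illustration; not part of urllib.
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            return json.loads(response.read())
    except HTTPError as err:
        # The server answered, but with an error status (e.g. 404 or 500).
        # HTTPError is a subclass of URLError, so it must be caught first.
        print("HTTP error", err.code, "for", url)
    except URLError as err:
        # No usable response at all (DNS failure, refused connection, ...)
        print("Could not reach", url, "-", err.reason)
    except json.JSONDecodeError:
        print("Response from", url, "was not valid JSON")
    return None
```

For instance, `fetch_json("https://www.boredapi.com/api/activity")` would return a dictionary like the one shown above, provided the service is reachable.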

Web page example

In [10]:
from urllib.request import urlopen

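with urlopen("https://example.com/") as response:
    html = response.read()  # raw bytes of the page
    # The Content-Type header may not declare a charset; fall back to UTF-8
    charset = response.headers.get_content_charset(failobj="utf-8")
    htmlstr = html.decode(charset)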
In [12]:
print(htmlstr)
<!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <style type="text/css">
    body {
        background-color: #f0f0f2;
        margin: 0;
        padding: 0;
        font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
        
    }
    div {
        width: 600px;
        margin: 5em auto;
        padding: 2em;
        background-color: #fdfdff;
        border-radius: 0.5em;
        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);
    }
    a:link, a:visited {
        color: #38488f;
        text-decoration: none;
    }
    @media (max-width: 700px) {
        div {
            margin: 0 auto;
            width: auto;
        }
    }
    </style>    
</head>

<body>
<div>
    <h1>Example Domain</h1>
    <p>This domain is for use in illustrative examples in documents. You may use this
    domain in literature without prior coordination or asking for permission.</p>
    <p><a href="https://www.iana.org/domains/example">More information...</a></p>
</div>
</body>
</html>

Note: HTML documents often reference other resources needed to display them properly (e.g. CSS stylesheets, images, ...). A single urlopen request retrieves only the HTML itself; each of those other resources would need its own request.
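
To see which other resources a page refers to, the HTML can be scanned with the standard library's `html.parser`. The sketch below is one possible approach. The class `ResourceFinder` and function `find_resources` are hypothetical names, and it only looks at `link`, `img`, and `script` tags, which is an assumption about what counts as a needed resource.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceFinder(HTMLParser):
    """Collect absolute URLs of stylesheets, images, and scripts."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            # Resolve relative paths against the URL of the page itself
            self.resources.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            self.resources.append(urljoin(self.base_url, attrs["href"]))


def find_resources(htmlstr, base_url):
    """Return the resource URLs referenced by the HTML text `htmlstr`."""
    finder = ResourceFinder(base_url)
    finder.feed(htmlstr)
    return finder.resources
```

Applied to the example.com page shown above, `find_resources(htmlstr, "https://example.com/")` would return an empty list, because that page keeps its CSS inline and has no images or scripts.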