Mapping & Geocoding in Python with Folium

Here we turn addresses into places and putting them on the map.

Geocoding and Mapping in Python

For this tutorial there is a complete jupyter notebook you can save to your own machine, and run by first starting Jupyter Notebooks; you’ll want this list of Ottawa places. Right-click and save-as both of those links. Use Anaconda Navigator to fire up a jupyter notebook. You can then open that file and follow along; alternatively you can create an empty new jupyter notebook and copy the code below into new cells as you go.

This notebook file is a largely unaltered copy of Melanie Walsh 2021 ‘Introduction to Cultural Analytics: Mapping’ https://melaniewalsh.github.io/Intro-Cultural-Analytics/Mapping/Mapping.html, with only some minor modifications. You are encouraged to follow the remainder of her tutorial to learn how to add a custom map background, and how to publish your resulting map to the web.

Mapping with Python

In this lesson, we’re going to learn how to analyze and visualize geographic data.

Geocoding with GeoPy

First, we’re going to geocode data — aka get coordinates from addresses or place names — with the Python package GeoPy. GeoPy makes it easier to use a range of third-party geocoding API services, such as Google, Bing, ArcGIS, and OpenStreetMap. These are essentially gazetteers of place names and their geographic locations.

Though most of these services require an API key, Nominatim, which uses OpenStreetMap data, does not, which is why we’re going to use it here.

Install GeoPy

!pip install geopy

The ! is used in a jupyter notebook to tell the notebook to run the code that follows as if it were at the command line, by the way.

Import Nominatim

From GeoPy’s list of possible geocoding services, we’re going to import Nominatim:

from geopy.geocoders import Nominatim

Nominatim & OpenStreetMap

Nominatim (which means “name” in Latin) uses OpenStreetMap data to match addresses with geopgraphic coordinates. Though we don’t need an API key to use Nominatim, we do need to create a unique application name.

Here we’re initializing Nominatim as a variable called geolocator. Change the application name below to your own application name:

geolocator = Nominatim(user_agent="GIVE-A-NAME-HERE-app", timeout=2)

To geocode an address or location, we simply use the .geocode() function:

location = geolocator.geocode("Wellington Street Ottawa")

And then to see the result we run:

location

An Alternative: Google Geocoding API

The Google Geocoding API is superior to Nominatim, but it requires an API key and more set up. To enable the Google Geocoding API and get an API key, see Get Started with Google Maps Platform and Get Started with Geocoding API.

If you want to just continue without mucking about with Google, skip down to the heading ‘Continue’

# once you get an API Key from Google, you would uncomment the three lines below,
# inserting the API key into the appropriate spot in the second line

#from geopy.geocoders import GoogleV3
#google_geolocator = GoogleV3(api_key="YOUR-API-KEY HERE")
#google_geolocator.geocode("Wellington Street")

Get Address

print(location.address)

Get Latitude and Longitude

print(location.latitude, location.longitude)

Get “Importance” Score

print(f"Importance: {location.raw['importance']}")

Get Class and Type

print(f"Class: {location.raw['class']} \nType: {location.raw['type']}")

Continue…

Get Multiple Possible Matches

possible_locations = geolocator.geocode("Wellington Street", exactly_one=False)

for location in possible_locations:
    print(location.address)
    print(location.latitude, location.longitude)
    print(f"Importance: {location.raw['importance']}")
location = geolocator.geocode("Wellington Street, Ottawa Ontario")

print(location.address)
print(location.latitude, location.longitude)
print(f"Importance: {location.raw['importance']}")

Geocode with Pandas

To geocode every location in a CSV file, we can use Pandas, make a Python function, and .apply() it to every row in the CSV file.

# you might need to uncomment the pip install pandas line and run it to install pandas
# pandas is a package that lets you work with tabular data etc

#!pip install pandas

import pandas as pd
pd.set_option("max_rows", 400)
pd.set_option("max_colwidth", 400)

Here we make a function with geolocator.geocode() and ask it to return the address, lat/lon, and importance score:

def find_location(row):

    place = row['place']

    location = geolocator.geocode(place)

    if location != None:
        return location.address, location.latitude, location.longitude, location.raw['importance']
    else:
        return "Not Found", "Not Found", "Not Found", "Not Found"

To start exploring, let’s read in a CSV file with a list of places in and around Ithaca.

ottawa_df = pd.read_csv("ottawa-places.csv")
ottawa_df

Now let’s .apply() our function to this Pandas dataframe and see what results Nominatim’s geocoding service spits out.

ottawa_df[['address', 'lat', 'lon', 'importance']] = ottawa_df.apply(find_location, axis="columns", result_type="expand")
ottawa_df

Making Interactive Maps

To map our geocoded coordinates, we’re going to use the Python library Folium. Folium is built on top of the popular JavaScript library Leaflet.

To install and import Folium, run the cells below:

!pip install folium
import folium

Base Map

First, we need to establish a base map. This is where we’ll map our geocoded Ithaca locations. To do so, we’re going to call folium.Map()and enter the general latitude/longitude coordinates of the Ithaca area at a particular zoom.

(To find latitude/longitude coordintes for a particular location, you can use Google Maps, as described here.)

ottawa_map = folium.Map(location=[45.42, -75.69], zoom_start=12)
ottawa_map

Add a Marker

Adding a marker to a map is easy with Folium! We’ll simply call folium.Marker() at a particular lat/lon, enter some text to display when the marker is clicked on, and then add it to our base map.

folium.Marker(location=[45.385858, -75.695004], popup="Crafting Digital History!").add_to(ottawa_map)
ottawa_map

Add Markers From Pandas Data

To add markers for every location in our Pandas dataframe, we can make a Python function and .apply() it to every row in the dataframe.

def create_map_markers(row, map_name):
    folium.Marker(location=[row['lat'], row['lon']], popup=row['place']).add_to(map_name)

Before we apply this function to our dataframe, we’re going to drop any locations that were “Not Found” (which would cause folium.Marker() to return an error).

found_ottawa_locations = ottawa_df[ottawa_df['address'] != "Not Found"]
found_ottawa_locations.apply(create_map_markers, map_name=ottawa_map, axis='columns')
ottawa_map

Save Map

ottawa_map.save("Ottawa-map.html")

Drop that html file on a webserver (for instance, a github repository with github pages enabled) to see it in action! Alternatively, at the command prompt in the directory where you saved that html file, you can use Python’s built-in webserver by running the command:

$ python -m http.server 8000

You would then open your browser at https://localhost:8000/Ottawa-map.html.

Torn Apart / Separados

The data in this section was drawn from Torn Apart / Separados Project. It maps the locations of Immigration and Customs Enforcement (ICE) detention facilities, as featured in Volume 1.

Go to https://github.com/melaniewalsh/Intro-Cultural-Analytics/tree/master/book/data to get the data files for this next section OR insert the following string pattern into the file names as appropriate to load directly from the web:

eg, where the code says ICE_df = pd.read_csv("../data/ICE-facilities.csv")

you’d change that to

ICE_df = pd.read_csv("https://raw.githubusercontent.com/melaniewalsh/Intro-Cultural-Analytics/master/book/data/ICE-facilities.csv")

Add a Circle Marker

There are a few different kinds of markers that we can add to a Folium map, including circles. To make a circle, we can call folium.CircleMarker() with a particular radius and the option to fill in the circle. You can explore more customization options in the Folium documentation. We’re also going to add a hover tooltip in addition to a popup.

def create_ICE_map_markers(row, map_name):

    folium.CircleMarker(location=[row['lat'], row['lon']], raidus=100, fill=True,
                popup=folium.Popup(f"{row['Name'].title()} <br> {row['City'].title()}, {row['State']}", max_width=200),
                  tooltip=f"{row['Name'].title()} <br> {row['City'].title()}, {row['State']}"
                 ).add_to(map_name)
ICE_df = pd.read_csv("https://raw.githubusercontent.com/melaniewalsh/Intro-Cultural-Analytics/master/book/data/ICE-facilities.csv")
ICE_df
US_map = folium.Map(location=[42, -102], zoom_start=4)
US_map
ICE_df = ICE_df.dropna(subset=['lat', 'lon'])
ICE_df.apply(create_ICE_map_markers, map_name=US_map, axis="columns")
US_map

Choropleth Maps

Choropleth map = a map where areas are shaded according to a value

The data in this section was drawn from Torn Apart / Separados Project. This data maps the “cumulative ICE awards since 2014 to contractors by congressional district,” as featured in Volume 2.

To create a chropleth map with Folium, we need to pair a “geo.json” file (which indicates which parts of the map to shade) with a CSV file (which includes the variable that we want to shade by).

The following data was drawn from the Torn Apart / Separados project

US_districts_geo_json = "../data/ICE_money_districts.geo.json"
US_districts_csv = pd.read_csv("../data/ICE_money_districts.csv")
US_districts_csv = US_districts_csv .dropna(subset=['districtName', 'representative'])
US_districts_csv
US_map = folium.Map(location=[42, -102], zoom_start=4)

folium.Choropleth(
    geo_data = US_districts_geo_json,
    name = 'choropleth',
    data = US_districts_csv,
    columns = ['districtName', 'total_awards'],
    key_on = 'feature.properties.districtName',
    fill_color = 'GnBu',
    line_opacity = 0.2,
    legend_name= 'Total ICE Money Received'
).add_to(US_map)

US_map

Add a Tooltip to Choropleth

tooltip = folium.features.GeoJson(
    US_districts_geo_json,
    tooltip=folium.features.GeoJsonTooltip(fields=['representative', 'state', 'party', 'total_value'], localize=True)
                                )
US_map.add_child(tooltip)
US_map