A Software Framework for Working with OpenStreetMap Data

Jennings Anderson, Robert Soden, Kenneth M. Anderson, Marina Kogan, & Leysia Palen



Editing OpenStreetMap


Research Motivation

OpenStreetMap (OSM) is a site of novel online collaboration and innovation to create a free and open map of the world.

From a Human Centered Computing perspective, we want to understand how collaboration in OSM works.

  • Identify potential improvements to the software interface
  • Reduce potential errors and make map creation more efficient and productive

Crisis-Mapping: The collaboration of many people around the world to improve the map in response to a mass emergency or disaster.

Challenges in Studying Collaboration in OSM

  1. Map datasets tend to be extremely large, often reaching terabytes or petabytes of data.
  2. Maps show an aggregate product and are therefore not good at conveying how they were created.
  3. OSM's tagging schema is in constant flux.
  4. OSM's objects may be mapped differently by different people.

OSM Research as a Big Data Problem

  1. Full history files available nearly every week: 48GB (compressed PBF) or >1TB (uncompressed XML)

    Constantly growing & changing

  2. No central schema; attributes of map objects are flexible
  3. Queries are an evolving target

Requires a nimble & extensible framework to handle these changes.

Crisis Mapping & Crisis Informatics

The 2010 Haiti Earthquake

OSM Data Model

OpenStreetMap Data: Overview

OpenStreetMap Data: Tags

  • Descriptive, non-spatial characteristic of a map object
  • Unrestricted key-value pairs added to any map object.

Example tags for OSM objects in New York City

Objects w/ tag   Key        Most-common values
66%              building   yes, garage, house, school
64%              height     8.2, 8.0
13%              highway    residential
11%              name       (various)
2%               amenity    parking, bicycle parking
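
As a rough illustration of how a table like this could be derived, the sketch below computes the share of objects carrying each tag key. The object layout (a dict with a "tags" mapping) and the sample values are assumptions made for the example, not the framework's internal representation.

```python
from collections import Counter

def tag_key_shares(objects):
    """Return {tag key: fraction of objects carrying that key}."""
    counts = Counter()
    for obj in objects:
        # Count each key once per object, whatever its value.
        counts.update(obj.get("tags", {}).keys())
    total = len(objects)
    return {key: n / total for key, n in counts.items()} if total else {}

# Hypothetical objects; real OSM objects carry many more attributes.
sample = [
    {"id": 1, "tags": {"building": "yes", "height": "8.2"}},
    {"id": 2, "tags": {"building": "garage"}},
    {"id": 3, "tags": {"highway": "residential", "name": "Broadway"}},
    {"id": 4, "tags": {}},
]
shares = tag_key_shares(sample)
for key, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{share:.0%}  {key}")
```

Because tags are unrestricted, the set of keys is discovered from the data rather than declared up front.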

Analysis with Tags

No Tags

322 users added 956,725 nodes to the map in the month after the 2010 Haiti earthquake.

Using Tags

308 users added 40,067 roads to the map and 162 users added 20,696 buildings to the map. 148 of these users were the same, adding buildings and roads.
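
The overlap between road mappers and building mappers is a set intersection over user names. The sketch below shows the idea on made-up data; it assumes that an object with version 1 counts as "added" and that roads and buildings are identified by the presence of the highway and building keys.

```python
def users_adding(objects, tag_key):
    """Users who created (version 1) at least one object with the given tag key."""
    return {
        obj["user"]
        for obj in objects
        if obj["version"] == 1 and tag_key in obj.get("tags", {})
    }

# Hypothetical edits, not data from the Haiti analysis.
edits = [
    {"user": "ana", "version": 1, "tags": {"highway": "residential"}},
    {"user": "ana", "version": 1, "tags": {"building": "yes"}},
    {"user": "ben", "version": 1, "tags": {"highway": "track"}},
    {"user": "cho", "version": 1, "tags": {"building": "yes"}},
    {"user": "ben", "version": 2, "tags": {"building": "yes"}},  # an edit, not an addition
]

road_users = users_adding(edits, "highway")
building_users = users_adding(edits, "building")
both = road_users & building_users  # users who added roads and buildings
```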

The Epic-OSM Software Framework

Conceptual Framework

Analysis Window

An executable specification which defines spatio-temporal bounds for a set of queries.

title: 'Kauai 2015 Edits'
email: 'jennings.anderson@colorado.edu'

start_date: '2015-01-01 00:00:00Z'
end_date:   '2016-01-01 00:00:00Z'
bbox:       '-159.787827, 21.868277, -159.292313, 22.234997'

#Database & IO Configuration
pbf_file: '/data/osm_files/splitter-exports/north-america/hawaii.osh.pbf'
database: 'kauai'
write_directory: '/data/www/kauai'
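
To make the idea of spatio-temporal bounds concrete, here is a minimal sketch of how such a window could be applied to a single edit. It assumes the bbox is ordered west, south, east, north (consistent with the Kauai values above) and that the end date is exclusive; the function names are hypothetical and not part of the framework.

```python
from datetime import datetime, timezone

def parse_window(config):
    """Turn the string fields of an analysis window into comparable values."""
    west, south, east, north = (float(v) for v in config["bbox"].split(","))

    def parse_time(text):
        # Rewrite the trailing 'Z' as an explicit UTC offset before parsing.
        return datetime.strptime(text.replace("Z", "+0000"), "%Y-%m-%d %H:%M:%S%z")

    return {
        "west": west, "south": south, "east": east, "north": north,
        "start": parse_time(config["start_date"]),
        "end": parse_time(config["end_date"]),
    }

def in_window(window, lon, lat, when):
    """True if a point edit falls inside the window (end date assumed exclusive)."""
    return (
        window["west"] <= lon <= window["east"]
        and window["south"] <= lat <= window["north"]
        and window["start"] <= when < window["end"]
    )

window = parse_window({
    "start_date": "2015-01-01 00:00:00Z",
    "end_date": "2016-01-01 00:00:00Z",
    "bbox": "-159.787827, 21.868277, -159.292313, 22.234997",
})
june = datetime(2015, 6, 1, tzinfo=timezone.utc)
print(in_window(window, -159.5, 22.0, june))
```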

Flexible Query Language

Metaprogramming makes the query language extensible right in the analysis window.

User Questions:
 - 'new_user_count'
 - 'experienced_user_count'
 - 'total_user_count'

Node Questions:
 - 'nodes_x_day'
 - 'nodes_x_hour'
 - 'nodes_x_year'
 - 'nodes_x_day(step: 7, constraints: {version: {"$gt":1}})'

Way Questions:
 - 'ways_x_day(constraints: {"tags.building" : "yes", version: 1})'

Changeset Questions:
 - 'changesets_x_hour'
 - 'changesets_x_day(step: 2, user: "Jennings Anderson")'
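
One way to picture the metaprogramming idea is a class that resolves question names such as nodes_x_day at call time instead of defining one method per combination. The Python sketch below is only an illustration of that technique, not the framework's implementation: the class, the matches helper, and the sample data are hypothetical, only equality and "$gt" constraints are handled, and the step option is left out.

```python
import re
from collections import Counter
from datetime import datetime

def matches(obj, constraints):
    """Check equality constraints and the '$gt' operator (others are not handled)."""
    for key, expected in constraints.items():
        actual = obj.get(key)
        if isinstance(expected, dict):
            if "$gt" in expected and not (actual is not None and actual > expected["$gt"]):
                return False
        elif actual != expected:
            return False
    return True

class Questions:
    """Answers '<collection>_x_<unit>' questions without one method per combination."""
    FORMATS = {"hour": "%Y-%m-%d %H:00", "day": "%Y-%m-%d", "year": "%Y"}

    def __init__(self, collections):
        self.collections = collections  # e.g. {"nodes": [...], "ways": [...]}

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        match = re.fullmatch(r"([a-z]+)_x_(hour|day|year)", name)
        if not match or match.group(1) not in self.collections:
            raise AttributeError(name)
        collection, unit = match.groups()

        def question(constraints=None):
            counts = Counter(
                obj["created_at"].strftime(self.FORMATS[unit])
                for obj in self.collections[collection]
                if matches(obj, constraints or {})
            )
            return dict(counts)

        return question

nodes = [
    {"version": 1, "created_at": datetime(2015, 3, 27, 20, 8)},
    {"version": 2, "created_at": datetime(2015, 3, 27, 21, 0)},
    {"version": 3, "created_at": datetime(2015, 3, 28, 10, 33)},
]
q = Questions({"nodes": nodes})
per_day = q.nodes_x_day()
edited_per_day = q.nodes_x_day(constraints={"version": {"$gt": 1}})
```

Adding a new collection to the dictionary makes its hourly, daily, and yearly questions available with no further code, which is the kind of extensibility the slide refers to.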

Simple Output: JSON Files

File: new_user_count.json
{"New User Count":14}

File: experienced_user_count.json
{"Experienced User Count":49}

File: total_user_count.json
{"Total User Count":63}

File: total_nodes_edited.json
{"Total Nodes Edited":57119}

Output: JSON Temporal Buckets

Other results are returned in temporal buckets

File: changesets_x_day.json

[ {"start_date":"2015-03-27 00:00:00 +0000",
   "end_date":"2015-03-28 00:00:00 +0000",
   "objects":[
     {"closed_at":"2015-03-27 20:08:59 UTC","open":"false","min_lat":39.9064499,"max_lat":39.9064499,"min_lon":-105.0846401,"max_lon":-105.0846401,"id":"29786881","uid":"571489","user":"Stevestr","created_at":"2015-03-27 20:08:53 UTC","tags":[{"created_by":"iD 1.7.0"},{"imagery_used":"Bing"}],"geometry":{"type":"Point","coordinates":[-105.0846401,39.9064499]}},{"comment":"","closed_at":"2015-03-27 20:13:40 UTC","open":"false","min_lat":39.7316547,"max_lat":40.0355334,"min_lon":-105.2707028,"max_lon":-104.9806091,"id":"29786960","uid":"571489","user":"Stevestr","created_at":"2015-03-27 20:13:33 UTC","tags":[{"created_by":"iD 1.7.0"},{"imagery_used":"Bing"}],"geometry":{"type":"Polygon","coordinates":[[[-105.2707028,39.7316547],[-105.2707028,40.0355334],[-104.9806091,40.0355334],[-104.9806091,39.7316547],[-105.2707028,39.7316547]]]}},
     {"closed_at":"2015-03-27 20:15:24 UTC","open":"false","min_lat":39.9059943,"max_lat":39.906068,"min_lon":-105.0851904,"max_lon":-105.085152,"id":"29787012","uid":"571489","user":"Stevestr","created_at":"2015-03-27 20:15:22 UTC","tags":[{"created_by":"iD 1.7.0"},{"imagery_used":"Bing"}],"geometry":{"type":"Polygon","coordinates":[[[-105.0851904,39.9059943],[-105.0851904,39.906068],[-105.085152,39.906068],[-105.085152,39.9059943],[-105.0851904,39.9059943]]]}}]},

  {"start_date":"2015-03-28 00:00:00 +0000",
   "end_date":"2015-03-29 00:00:00 +0000",
   "objects":[
     {"comment":"tags corrected","closed_at":"2015-03-28 10:33:53 UTC","open":"false","min_lat":40.0203642,"max_lat":40.0203642,"min_lon":-105.256535,"max_lon":-105.256535,"id":"29802368","uid":"78656","user":"Walter Schlögl","created_at":"2015-03-28 10:33:52 UTC","tags":[{"created_by":"JOSM/1.5 (8109 de)"}],"geometry":{"type":"Point","coordinates":[-105.256535,40.0203642]}}]},

  {"start_date":"2015-03-29 00:00:00 +0000",
    "end_date":"2015-03-30 00:00:00 +0000",

  {"start_date":"2015-03-30 00:00:00 +0000",
   "end_date":"2015-03-31 00:00:00 +0000",
   "objects":[]} ]
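
The bucket layout above (start_date, end_date, objects) can be reproduced with a short loop. The sketch below is an assumption about how such buckets might be built, using the changeset ids and creation times from the sample output; the function name and the half-open interval are choices made for the example.

```python
from datetime import datetime, timedelta, timezone

def bucket_by_day(objects, start, end, step_days=1):
    """Split [start, end) into fixed-width buckets and file each object by created_at."""
    buckets = []
    cursor = start
    while cursor < end:
        upper = min(cursor + timedelta(days=step_days), end)
        buckets.append({
            "start_date": cursor,
            "end_date": upper,
            # Empty buckets are kept so the time series has no gaps.
            "objects": [o for o in objects if cursor <= o["created_at"] < upper],
        })
        cursor = upper
    return buckets

utc = timezone.utc
changesets = [
    {"id": "29786881", "created_at": datetime(2015, 3, 27, 20, 8, 53, tzinfo=utc)},
    {"id": "29786960", "created_at": datetime(2015, 3, 27, 20, 13, 33, tzinfo=utc)},
    {"id": "29787012", "created_at": datetime(2015, 3, 27, 20, 15, 22, tzinfo=utc)},
    {"id": "29802368", "created_at": datetime(2015, 3, 28, 10, 33, 52, tzinfo=utc)},
]
buckets = bucket_by_day(
    changesets,
    start=datetime(2015, 3, 27, tzinfo=utc),
    end=datetime(2015, 3, 31, tzinfo=utc),
)
counts = [len(b["objects"]) for b in buckets]
```

Passing step_days=2 would mirror a question such as changesets_x_day(step: 2).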


Current Framework Deployment

2010 Haiti Earthquake

In support of a paper for CHI 2016:

Haiti: Analysis Window

title: 'Haiti Earthquake 2010 - Editing Networks'

#Analysis Window Information
start_date: '2010-01-12 00:00:00 -0700'
end_date:   '2010-01-26 00:00:00 -0700'

#Bounding Box:
bbox: '-74.5532226563,17.8794313865,-71.7297363281,19.9888363024' 

Changeset Questions:
 - changesets_x_hour

Network Questions:
 - overlapping_changesets:
    step: 4
    unit: hour
    changeset_area: 44470390.451710135 #75th percentile
    files: '/data/www/haiti-networks-4/overlapping_changesets_by_4_hour'

 - intersecting_roads:
    step: 4
    unit: hour
    files: '/data/www/haiti-networks-4/intersecting_roads_4_hour'
    constraints: {'version': 1, 'tags.highway' : {'$ne' : null}}

 - co_editing_objects:
    step: 4
    unit: hour
    files: '/data/www/haiti-networks-4/co_editing_objects_4_hour'
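
A network question such as overlapping_changesets can be thought of as linking two users whenever their changesets cover overlapping areas within the same time step. The sketch below illustrates that idea for a single time step using bounding-box intersection; it is an assumption about the general approach, the changesets are invented, and the changeset_area threshold from the window above is not modeled.

```python
from itertools import combinations

def boxes_overlap(a, b):
    """True if two changeset bounding boxes intersect (touching edges count)."""
    return (
        a["min_lon"] <= b["max_lon"] and b["min_lon"] <= a["max_lon"]
        and a["min_lat"] <= b["max_lat"] and b["min_lat"] <= a["max_lat"]
    )

def overlap_edges(changesets):
    """Undirected user-to-user edges for one time step of changesets."""
    edges = set()
    for a, b in combinations(changesets, 2):
        if a["user"] != b["user"] and boxes_overlap(a, b):
            edges.add(tuple(sorted((a["user"], b["user"]))))
    return edges

# Hypothetical changesets assumed to fall in the same 4-hour step.
step = [
    {"user": "alice", "min_lon": -72.40, "max_lon": -72.30, "min_lat": 18.50, "max_lat": 18.60},
    {"user": "bob",   "min_lon": -72.35, "max_lon": -72.25, "min_lat": 18.55, "max_lat": 18.65},
    {"user": "carol", "min_lon": -73.00, "max_lon": -72.90, "min_lat": 19.00, "max_lat": 19.10},
    {"user": "alice", "min_lon": -72.34, "max_lon": -72.33, "min_lat": 18.56, "max_lat": 18.57},
]
edges = overlap_edges(step)
```

Repeating this per time step yields a sequence of networks whose growth and density can be compared over the course of the response.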

2015 Nepal Earthquake


Moving Forward

Currently working to implement streaming analysis support in addition to batch processing.

Interface with new OSM software in development


Kauai: Analysis Window

#Analysis Window Information
start_date: '2015-01-01 00:00:00Z'
end_date:   '2016-01-01 00:00:00Z'

bbox: '-159.787827,21.868277,-159.292313,22.234997'

#Database & IO Configuration
pbf_file: '/data/osm_files/splitter-exports/north-america/hawaii.osh.pbf'
database: 'kauai'
write_directory: '/data/www/kauai'

#User Information
title: 'Kauai 2015 Edits'
email: 'jennings.anderson@colorado.edu'

User Questions:
 - new_user_count
 - experienced_user_count
 - total_user_count

Node Questions:
 - total_nodes_edited
 - node_added_count
 - nodes_edited_by_new_mappers
 - nodes_edited_by_experienced_mappers
 - top_new_node_tags

Changeset Questions:
 - changesets_x_hour

Way Questions:
 - new_ways_per_day
 - top_new_way_tags

Kauai: Running Epic-OSM

Thank You



With Special Thanks To:

Mikel Maron, Mapbox, Development Seed, & The OpenStreetMap Community