Intro
Our goal was to let users compare bicycle and pedestrian safety in Seattle using data on reported collisions, and to learn which areas of the city may be more dangerous. Our interactive visualization supports information-seeking tasks that begin with a general overview and direct comparisons, followed by more exploratory analysis through location zooming, conditional filtering, and information highlighting.
Using collision data from the Seattle Department of Transportation, we analyzed the city areas, dates and times, and conditions in which bicycle or pedestrian accidents occurred between 2004 and 2016.*
Process
Design Sketches and Paper Prototypes
Our paper prototype concept was developed through sketching and group discussion. It featured three visualization types: a line chart of annual totals for the two types of collisions, a heat map of saturated color blocks for the variables of month and year, and a street map of geolocated points where collisions occurred.
We decided to create a line chart to give the user a quick overview of annual trends and to compare how many accidents occurred each year for bicycles and pedestrians separately. We also wanted a heat map so users could compare collision types relative to the time of year: which months had more reported accidents for each type. The key visualization was a street map to identify accident areas within Seattle and to allow a more filtered search of the data by specific conditions or dates and times.
We knew that a map-based visualization would be a key component, since our initial design question focused on identifying areas within the city of Seattle where there were large numbers of accidents involving bicycles. While a map is effective at displaying location information, it makes it difficult to identify trends that aren't related to frequency and position, which is why we added the other two visualizations.
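The aggregations behind the line chart and heat map described above can be sketched in a few lines. This is only an illustration, not the project's actual code: the record layout, the sample rows, and the function names `annual_totals` and `month_year_counts` are all assumptions.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (collision date, collision type).
# The real data set has many more fields; this layout is assumed.
SAMPLE = [
    (date(2015, 1, 12), "bicycle"),
    (date(2015, 7, 3), "pedestrian"),
    (date(2015, 7, 21), "bicycle"),
    (date(2016, 7, 9), "bicycle"),
    (date(2016, 11, 30), "pedestrian"),
]


def annual_totals(records):
    """Count collisions per (year, type): one series per type in the line chart."""
    return Counter((d.year, kind) for d, kind in records)


def month_year_counts(records, kind):
    """Count collisions of one type per (year, month): one cell each in the heat map."""
    return Counter((d.year, d.month) for d, k in records if k == kind)


totals = annual_totals(SAMPLE)
heat = month_year_counts(SAMPLE, "bicycle")
```

A `Counter` returns 0 for missing keys, so months with no reported collisions can be rendered as the lightest color block without special handling.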
Prototype Testing
We wanted our street map to be our most comprehensive visualization, so we conducted two rounds of testing: one to figure out what story to tell with the data, and the other to test how users interacted with our design. We limited our paper prototype testing to the street map because the interactivity of the other two visualizations was not as complex.
During our initial paper prototyping, we asked our participants what they felt would be interesting to sort by. Our participants listed a large number of filters: weather, road conditions, time of day, type of accident, severity of accident, number of deaths, season, etc. We chose to present users with as many filters as possible. Users were asked to complete three separate tasks to find out if they could successfully use the map's data filters, to observe how they would compare data, and to see whether they could navigate the map to analyze different areas of the city.
Results
Based on feedback and observations, we implemented a few minor changes to our prototype. We added "all" as an option for each dropdown filter so users did not feel they were required to make a selection for each one. We had intended to encode a variable in the size of each location point on the map to indicate areas where more collisions were recorded. However, when we tested with users, most did not notice the differences in point sizes and instead looked for concentrations of points as an indication of more collisions. We decided to remove point size as an indication of frequency. Another point of confusion was the "outcome" label for injury or non-injury collisions. Users felt that a third category, "fatalities," was a relevant filter and questioned its omission. Users also expected to be able to select a range of dates for comparison purposes.
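To show how an "all" option keeps each filter optional, here is a minimal sketch of the filter logic. It is a hypothetical illustration rather than the project's implementation: the `Collision` fields, the `ALL` sentinel, and the sample rows are assumptions. The fatality outcome and the date range are included only to show how the requests users raised could fit the same pattern.

```python
from dataclasses import dataclass
from datetime import date

ALL = "all"  # sentinel meaning "do not filter on this field"


@dataclass(frozen=True)
class Collision:
    """Hypothetical record; field names are assumed, not taken from the data set."""
    when: date
    kind: str     # "bicycle" or "pedestrian"
    weather: str
    outcome: str  # "injury", "non-injury", or "fatality"


def matches(c, kind=ALL, weather=ALL, outcome=ALL, start=None, end=None):
    """Return True if the collision passes every filter that is not left at 'all'."""
    if kind != ALL and c.kind != kind:
        return False
    if weather != ALL and c.weather != weather:
        return False
    if outcome != ALL and c.outcome != outcome:
        return False
    if start is not None and c.when < start:
        return False
    if end is not None and c.when > end:
        return False
    return True


def apply_filters(collisions, **filters):
    """Keep only the collisions that satisfy the selected filters."""
    return [c for c in collisions if matches(c, **filters)]


SAMPLE = [
    Collision(date(2014, 3, 2), "bicycle", "Raining", "injury"),
    Collision(date(2015, 6, 15), "pedestrian", "Clear", "fatality"),
    Collision(date(2015, 9, 1), "bicycle", "Clear", "non-injury"),
    Collision(date(2016, 1, 20), "bicycle", "Clear", "injury"),
]

clear_bike = apply_filters(SAMPLE, kind="bicycle", weather="Clear")
during_2015 = apply_filters(SAMPLE, start=date(2015, 1, 1), end=date(2015, 12, 31))
```

Because every parameter defaults to "all" (or to no date bound), calling `apply_filters(SAMPLE)` with no selections returns every point, which mirrors the map's unfiltered starting state.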
Outcomes
Our visualization "Bicycle and Pedestrian collisions in Seattle" helped us learn about designing a visualization, determining user needs when looking at a large dataset, and evaluating the usability of visualizations. We experimented with different types of graphs, filters, and encodings to find the most appropriate and useful visualization. User testing helped us reduce user confusion and frustration and enhance interesting areas of our data. As Seattleites, we wanted our visualization to be an in-depth analysis of bicycle and pedestrian collisions that would be useful to a variety of people.
* The complete data set is quite large, with over 186,000 data points, each representing 42 variables, and spans a period from 2004 through 2017. The variables include nominal data, such as intersection name, types of collisions, collision descriptions, road conditions, light conditions, severity descriptions, and weather. Quantitative data includes ratio types like incident numbers, distances, report numbers, fatalities, serious injuries, and latitude and longitude coordinates, as well as interval data in the form of collision dates. The complete data set is linked from the City of Seattle Open Data website.