two million calls to 311
02/13/2017
Two million calls to 311 was the most technically interesting of my open data explorations.
311 uses a flat classification scheme for incoming calls, based mostly on how, and to whom, responsibility for the response will be routed. As a result, there were hundreds of categories at wildly different levels of specificity. For example, the majority of noise complaints get lumped into the catch-all category of Residential Noise (200,000 calls), but there are related categories like helicopter noise (1,000 calls) and sanitation truck noise (200 calls) with far fewer reports. I was interested in showing these different categories in relative terms without obscuring the minority classes.
For the frontend I designed an interactive hierarchical treemap visualization, implemented in D3 (see above). The visualization lets you click through individual categorical buckets to view a detailed breakdown of their subcategories. Code.
However, to get this working I needed something that would take the flat list of categories and subcategories and turn it into a hierarchical tree of "large" categories (which stand on their own) and "small" categories (collected in an "Other" category). This turned out to be a nontrivial algorithmic challenge. To solve it, I wrote a small single-purpose npm package called threshold-tree.
threshold-tree uses a recursive pre-order tree-traversal algorithm to push small categories that fail a threshold test (they are much smaller than their siblings) to child "Other" nodes. "Other" children are then themselves subdivided. The final visualization featured three layers of subdivision, e.g.: Root > Other (190,000) > Crime (33,000) > Other (1,600) > Illegal Fireworks (200).
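To make the idea concrete, here is a minimal sketch of a threshold split over a flat list of counts. It is not the actual threshold-tree code or API: the function name splitByThreshold, the ratio parameter, and the rule "small means below ratio times the largest sibling" are all assumptions chosen for illustration, and the real package may define its threshold test differently. The sample counts are the approximate noise figures quoted earlier.

```javascript
// Hypothetical sketch, NOT the threshold-tree package's API.
// Assumption: an item is "small" when its count is below
// `ratio` times the count of its largest sibling.
function splitByThreshold(items, ratio) {
  if (items.length === 0) return [];

  // Largest first, so sorted[0] defines the cutoff for this level.
  const sorted = [...items].sort((a, b) => b.value - a.value);
  const cutoff = sorted[0].value * ratio;

  const large = [];
  const small = [];
  sorted.forEach((item, i) => {
    // The largest item always stands on its own, which guarantees
    // the "small" list shrinks and the recursion terminates.
    if (i === 0 || item.value >= cutoff) large.push(item);
    else small.push(item);
  });

  // Pre-order: emit this level's nodes first, then recurse into "Other".
  const nodes = large.map((item) => ({ name: item.name, value: item.value }));
  if (small.length > 0) {
    nodes.push({
      name: "Other",
      value: small.reduce((sum, item) => sum + item.value, 0),
      children: splitByThreshold(small, ratio), // subdivide "Other" again
    });
  }
  return nodes;
}

// Approximate counts from the noise example above.
const calls = [
  { name: "Residential Noise", value: 200000 },
  { name: "Helicopter Noise", value: 1000 },
  { name: "Sanitation Truck Noise", value: 200 },
];
const tree = splitByThreshold(calls, 0.1);
console.log(JSON.stringify(tree, null, 2));
```

With a ratio of 0.1, Residential Noise stays at the top level while the two smaller categories move into an "Other" node. Inside that node they are compared only against each other, so both remain visible as its children.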