Using the missingno package to visualize missing data
03/28/2016
Once your data is safely localized, one of the first and most important things you have to do at the beginning of any data analytics project is get the lay of the land. Data is fundamentally messy: full of oddities, noise, and incomplete entries. Getting a handle on this weirdness is an essential first step to getting anything actually done with it, and as much as 80% of project time can end up sunk into it.
To help with that process I built missingno, a missing data visualization tool and the subject of this post. The package (I still can't believe the name was never taken) exposes a series of top-level visualization functions that take a pandas DataFrame as input and produce tuned-up data nullity visualizations as output.
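Here is a minimal sketch of what that looks like in practice, assuming pandas, matplotlib, and missingno are all installed. The toy DataFrame and the helper names `nullity_summary` and `show_nullity_plots` are made up for illustration; `msno.matrix` and `msno.bar` are two of the package's plotting functions.

```python
import pandas as pd

# A toy DataFrame with a few holes in it (illustrative data only).
df = pd.DataFrame({
    "city": ["Oslo", None, "Lima", "Pune"],
    "temp_c": [3.5, float("nan"), 19.0, float("nan")],
    "humidity": [80, 65, 70, 55],
})


def nullity_summary(frame):
    """Count missing entries per column using plain pandas."""
    return frame.isnull().sum()


def show_nullity_plots(frame):
    """Draw two missingno views of the same DataFrame.

    The imports live inside the function so the pandas-only summary
    above still runs on a machine without the plotting libraries.
    """
    import matplotlib.pyplot as plt
    import missingno as msno

    msno.matrix(frame)  # one row per record, gaps where values are missing
    msno.bar(frame)     # per-column count of non-missing values
    plt.show()


if __name__ == "__main__":
    print(nullity_summary(df))
    show_nullity_plots(df)
```

The text summary is the number you would otherwise squint at in a notebook; the plots show the same information, plus where in the dataset the gaps actually fall.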
Head over to the GitHub repository to learn more.