A classification scheme (and Python function) for categorising file extensions into broad file types.
The file 'binfiles.py' contains the categories and their associated extensions, along with a simple function that checks an extension string against those lists and returns the matching category.
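As a rough illustration of the lookup described above, the sketch below shows one way such a function could work. It is not the code in 'binfiles.py': the names CATEGORIES and categorise_extension, the three example categories, their extensions, and the "unknown" fallback are all hypothetical and chosen only for this example.

```python
# Hypothetical categories and extensions, for illustration only.
# The actual lists are the ones defined in binfiles.py.
CATEGORIES = {
    "image": {"jpg", "jpeg", "png", "gif"},
    "document": {"doc", "docx", "pdf", "txt"},
    "audio": {"mp3", "wav", "flac"},
}


def categorise_extension(ext, default="unknown"):
    """Return the category whose extension list contains `ext`.

    The extension is normalised first (surrounding whitespace and
    leading dots removed, lower-cased), so ".JPG" and "jpg" match
    the same entry. `default` is returned when nothing matches.
    """
    normalised = ext.strip().lower().lstrip(".")
    for category, extensions in CATEGORIES.items():
        if normalised in extensions:
            return category
    return default


if __name__ == "__main__":
    for example in (".JPG", "pdf", "xyz"):
        print(example, "->", categorise_extension(example))
```

Normalising the input before the lookup is a design choice of this sketch; whether the real function accepts a leading dot or mixed case should be checked against 'binfiles.py' itself.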
The file 'results-tables.xls' provides descriptive statistics on how files of the categorised types make up six kinds of file collections, as seen in collected data and described in a forthcoming manuscript (submitted for review). The statistics are log-normal (e.g., log-normal mean*), as described by Limpert, Stahel, & Abbt (2001), and were computed after removing outliers using the interquartile range.
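For readers unfamiliar with log-normal summaries, the sketch below shows how statistics of this kind can be computed. It reads "mean*" as the geometric mean and pairs it with the multiplicative standard deviation, both in the sense of Limpert, Stahel, & Abbt (2001). It is not the procedure used to produce 'results-tables.xls': the function names, the 1.5 x IQR fence, and the choice to filter outliers on the raw scale before taking logarithms are assumptions made for illustration.

```python
import math
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: the conventional k = 1.5 fence, applied to the raw
    (untransformed) values. The source does not state either detail.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def log_normal_summary(values):
    """Return (mean*, sd*) for strictly positive values.

    mean* is the geometric mean, exp(mean(ln x)).
    sd* is the multiplicative standard deviation, exp(sd(ln x)).
    Non-positive values are skipped because their logarithm is undefined.
    """
    logs = [math.log(v) for v in values if v > 0]
    mean_star = math.exp(statistics.mean(logs))
    sd_star = math.exp(statistics.stdev(logs))
    return mean_star, sd_star


if __name__ == "__main__":
    counts = [1, 2, 3, 4, 5, 100]  # made-up example data
    cleaned = remove_outliers_iqr(counts)
    print(cleaned)
    print(log_normal_summary(cleaned))
```

Under a log-normal model, roughly two thirds of the data fall between mean* / sd* and mean* x sd*, which is why the two values are usually reported together.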