Rob Kitchin and Gavin McArdle have published a new paper entitled ‘The diverse nature of big data’, available as Programmable City Working Paper 15 on SSRN.
Abstract: Big data has been variously defined in the literature. In the main, definitions suggest that big data are those that possess a suite of key traits: volume, velocity and variety (the 3Vs), but also exhaustivity, resolution, indexicality, relationality, extensionality and scalability. However, these definitions lack ontological clarity, with the term acting as an amorphous, catch-all label for a wide selection of data. In this paper, we consider the question ‘what makes big data, big data?’, applying Kitchin’s (2013, 2014) taxonomy of seven big data traits to 26 datasets drawn from seven domains, each of which is considered in the literature to constitute big data. The results demonstrate that only a handful of datasets possess all seven traits, and some do not possess either volume and/or variety. Instead, there are multiple forms of big data. Our analysis reveals that the key definitional boundary markers are the traits of velocity and exhaustivity. We contend that big data as an analytical category needs to be unpacked, with the genus of big data further delineated and its various species identified. It is only through such ontological work that we will gain conceptual clarity about what constitutes big data, formulate how best to make sense of it, and identify how it might be best used to make sense of the world.
Key words: big data, ontology, taxonomy, types, characteristics