The first Programmable City Working Paper, written by Rob Kitchin and Tracey P. Lauriault, has been published on SSRN. It concerns the relationship between small and big data, the scaling-up of small data into data infrastructures, and how to conceptualize and make sense of such infrastructures.
Small data, data infrastructures and big data
The production of academic knowledge has progressed for the past few centuries using small data studies characterized by sampled data generated to answer specific questions. It is a strategy that has been remarkably successful, enabling the sciences, social sciences and humanities to advance in leaps and bounds. This approach is presently being challenged by the development of big data. Small data studies will, however, continue to be important in the future because of their utility in answering targeted queries. Nevertheless, small data are being made more big data-like through the development of new data infrastructures that pool, scale and link small data in order to create larger datasets, encourage sharing and re-use, and open them up to combination with big data and analysis using big data analytics. This paper examines the logic and value of small data studies, their relationship to emerging big data and data science, and the implications of scaling small data into data infrastructures, with a focus on spatial data examples. The final section provides a framework for conceptualizing and making sense of data and data infrastructures.
Key words: big data, small data, data infrastructures, data politics, spatial data infrastructures, cyber-infrastructures, epistemology