Towards geographies of and produced by data brokers

Today, Rob Kitchin participated in a panel session on spatialized information economies at the Association of American Geographers in Chicago, organized by Jeremy Crampton and Agnieszka Leszczynski.  Below is his script for an intervention titled ‘Towards geographies of and produced by data brokers’.

There have long been spatialized information economies – ever since maps, gazetteers and almanacs were first created and traded.  There’s also a well-established, century-old history of political polling and spatialised market research and data services.  With the development of digital data from the 1950s on, the markets for spatial data and information have steadily diversified in products and exploded in volume of trade, with the growth of new market sectors for creating and processing spatial data such as GIS and CAD, and new spatial info products such as geodemographics.  This is particularly the case in the era of big data, where there is now a deluge of diverse types of continuously produced georeferenced data (principally through GPS and zip code), including digital CCTV, clickstream, online and store transactions, CRM, sensors and scanners, social media, wearables, IoT, and so on.

The data produced from these sources have become a highly valuable commodity and have led to the rapid growth of a set of data brokers (sometimes called data aggregators, consolidators or re-sellers) who trade in a number of multi-billion dollar data markets.  Data brokers capture, gather together and repackage data for rent (for one-time use or use under licensing conditions) or re-sale.  By assembling data from a variety of sources, data brokers construct a vast relational data infrastructure.  For example, Epsilon is reputed to own data on 300 million company loyalty card members worldwide.  Acxiom is reputed to have constructed a databank concerning 500 million active consumers worldwide, with about 1,500 data points per person, and claims to be able to provide a ‘360-degree view’ on consumers (meshing off-line, online and mobile data).  It also separately manages customer databases for, or works with, 47 of the Fortune 100 companies.  Datalogix claims to store data relating to over a trillion dollars’ worth of offline purchases.  Other data broker and analysis companies include Alliance Data Systems, eBureau, ChoicePoint, Corelogic, Equifax, Experian, Facebook, ID Analytics, Infogroup, Innovis, Intelius, Recorded Future, Seisint and TransUnion.

Each company tends to specialize in different types of data and data products and services.  Products include:

  • lists of potential customers/clients who meet certain criteria and consumer and place profiles
  • search and background checks
  • derived data products wherein brokers have added value through integration and analytics
  • data analysis products that are used to micro-target advertising and marketing campaigns (by social characteristics and/or by location), assess credit worthiness and socially and spatially sort individuals, provide tracing services, predictive modelling as to what individuals might do under different circumstances and in different places, or how much risk a person constitutes, and supply detailed business analytics.

The worry of some, including Edith Ramirez, the chairperson of the Federal Trade Commission (FTC) in the US, is that such firms practice a form of ‘data determinism’ in which individuals are profiled and judged not just on the basis of what they have done, but on the prediction of what they might do in the future, using algorithms that are far from perfect, that may hold in-built biases relating to race, ethnicity, gender and sexuality, and yet are black-boxed and lack meaningful oversight and remediation procedures.  Moreover, these firms employ the data for purposes for which they were never generated, and hoard data speculatively in the hope that they may have future value, breaking data minimization rules that stipulate that only data of defined value should be retained.  And given the volume of sensitive personal records they hold, they are a prime target for criminals intent on identity theft.

Interestingly, given the volumes and diversity of personal and place-based data that data brokers and analysis companies possess, and how their products are used to socially and spatially sort and target individuals and households, there has been remarkably little critical attention paid to their operations.  Indeed, there is a dearth of academic and media analysis about such companies and the implications of their work and products.  This is in part because the industry is relatively low-profile and secretive, not wishing to draw public attention to, and undermine public trust in, its assets and activities, which in turn might lead to public campaigns for transparency, accountability and regulation.  Moreover, data brokers are generally unregulated and are not required by law to provide individuals access to the data held about them, nor are they obliged to correct errors relating to those individuals.  As such, there is a pressing need for us to conduct research on both the geographies of the data brokerage industry (in terms of where the companies are located and for what reasons) and the geographies produced by that industry; that is, to map out their associated spatial informational economies.  At present, we have little detailed understanding of either, which is why it is difficult – for me at least – to answer the list of questions posed by Agnieszka and Jeremy for this session (which are listed below).

Questions of labor, legal frameworks, and privacy

  • what is the legal status of geolocational privacy rights; privacy and national security (e.g. PCLOB); and legal rulings (e.g. United States v. Jones, Riley v. California)?
  • what is the landscape of legal geographies around spatial information/the spatialization of content?
  • are laborers in the spatialized information economy experiencing increasing “control” over them? If so, does this control reflect itself in everyday working conditions? Are these working conditions exploitive?
  • what is the status of regulation and/or oversight of labor in this sector of the economy?
  • what are the (global) geographies of divisions of labour in the spatial information economy?

Questions of innovation, investment and technology

  • what characterizes geospatial information/product lifecycles and growth trajectories?
  • what is the context of geospatial product development?
  • on the consumer side, how are geoweb technological innovations marketed?
  • are there geographical clusters of innovation and, if so, what is giving rise to their concentration (e.g. proximity to other capital, deregulated conditions, tax incentives)?
  • how is venture capital invested in the spatial info economy, and what are the sources of investment?

Questions of (cyber)security, surveillance, and cyberwarfare

  • in what ways is geolocation underwriting, and increasingly central to, the surveillance activities and practices of the security agencies?
  • how are the decentralized, global geographies of data (deterritorialized collection and flow, reterritorialized storage and analysis) complicating the cybersecurity/cyberwarfare equation?