Unexpected Data Bias in Smartphone Trace Data

This joint study with Professor Amanda Stathopoulos’ group examines how device representation bias in smartphone tracking data shifted after Apple’s 2021 privacy updates on user location tracking. It demonstrates that privacy regulations can significantly and unexpectedly affect the quality of these data, which are crucial for decision making across governmental, corporate, and academic institutions worldwide. The research also corrects misconceptions about representation bias previously speculated in the literature: device representation turns out to be negatively correlated with income, contrary to popular concerns about the under-representation of low-income populations in location-based services (LBS) data. Overall, the findings equip users of location-based device data with a better understanding of potential pitfalls, enabling them to anticipate the changes caused by the evolving regulatory landscape and to devise appropriate coping strategies.

Download the preprint here and read the abstract below:


As smartphones become ubiquitous, practitioners look to the data generated by location-tracking services enabled on these devices as a comprehensive, yet low-cost means of studying people’s daily activities. It is now widely accepted that smartphone data traces can serve as a powerful analytical tool for research and policymaking. As the use of these data grows, though, so too do concerns regarding the privacy regulations surrounding location tracking of private citizens. Here, we examine how Apple’s tightened privacy measures, designed to restrict location tracking on their devices, affect the quality of passively generated trace data. Using a large sample of such data collected in the Chicago metro area, we discover a significant drop in iOS data availability after the privacy measures were introduced. The results also reveal a surprising puzzle: the reduced tracking is not uniform and contradicts customary concerns about the under-representation of low-income populations. Instead, we find a negative correlation between device representation level and income, as well as population density. These findings reframe the debate over the increasing reliance on smartphone data, highlighting the need to understand evolving issues in tracking, coverage, and representation, which are essential for the validity of research and planning.
