
What cellphone data reveal about teleworking

My student Tianxing has been working hard to decipher cellphone data for some time now. Earlier this year, we completed a paper, in collaboration with Amanda’s group, showing unexpected representation biases in cellphone data that appear to have a direct link with privacy regulations. In the present study, we uncovered the work types of cellphone users using a clustering algorithm, and validated the results using survey data and regression analysis. You may download a preprint here. The abstract follows.


In a short period, the COVID-19 pandemic has transformed telework into a common practice for a significant portion of the workforce. This shift has profound implications for land use, urban development, and transportation. Traditional survey-based methods for tracking these changes are struggling to keep pace with the rapidity of this transformation. Here, we propose a method to identify different types of workers from mobile phone data, which allows us to closely examine the correlation between work arrangements, mobility patterns and key socio-demographic attributes. By applying a hierarchical clustering algorithm to a set of features extracted from a mobile phone data set, six different work types are identified and their validity is confirmed using different approaches. We find teleworkers tend to travel slower than regular workers but faster than non-workers. They also travel a shorter distance to reach their primary activity location than regular workers, but a longer distance to reach other activity locations than both regular and non-workers. Our regression analysis further shows that, largely in agreement with findings in literature, racial minority and low income groups are less likely to telework. Implications for the use of trace data to model the evolving relationship between mobility and worker-classification are discussed.
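As a rough illustration of the clustering step mentioned in the abstract, the sketch below runs a bare-bones agglomerative (hierarchical) clustering on a handful of made-up users. The two features (share of daytime spent at home and at a fixed workplace), the toy values, the Euclidean distance, the average linkage, and the choice of three clusters are all assumptions made for illustration; the paper's actual features, settings, and six work types are not reproduced here.

```python
from itertools import combinations
from math import dist


def average_linkage(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    The distance between two clusters is the mean pairwise Euclidean
    distance between their members (average linkage).
    Returns a sorted list of clusters, each a sorted list of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def gap(a, b):
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=lambda pair: gap(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


# Hypothetical features per user:
# (share of daytime at home, share of daytime at a fixed workplace)
users = [
    (0.10, 0.85), (0.15, 0.80),  # mostly at a workplace
    (0.80, 0.10), (0.85, 0.05),  # mostly at home, occasional workplace visits
    (0.50, 0.00), (0.45, 0.05),  # no clear workplace
]
groups = average_linkage(users, n_clusters=3)
print(groups)
```

In practice one would inspect the merge distances (the dendrogram) to decide how many clusters to keep, rather than fixing the number in advance as done here.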

Is Fare Free Transit Just?

I have been interested in fare-free transit ever since Michelle Wu was elected Mayor of Boston. She was the first female Asian mayor of the city, though her reputation as a disciple of Elizabeth Warren, the liberal firebrand in the U.S. Senate, probably overrode her other identities. Among her many agenda items was fare-free transit (FFT), which caught my attention not because it is especially progressive, but because it is a transit policy, which I happen to know something about. Another source of inspiration for this paper came from Stephen Dubner’s podcast on the subject a couple of years ago, which is entitled “Should Public Transit be Free”.

I shared the preprint with my department chair, Prof. Kim Gray, who is an environmental engineer but has a broad interest in anything related to sustainability and climate change. She was impressed and asked her assistant, Ms. Gina Twardosz, to write a news article to be posted on the department website. If you don’t want to read the paper itself, here is the link to that article. The abstract follows.


Abstract: Using a stylized transit design model, this study examines fare-free transit (FFT) through the lens of distributive justice. We pose a direct question: Is FFT just according to John Rawls’s theory of justice? Specifically, is it compatible with the resource allocation that maximizes the utility of the most disadvantaged travelers? We compare this egalitarian principle with a utilitarian one, which asserts that an allocation is optimal when it maximizes the total utility of all travelers. FFT is of course not free. In the absence of farebox revenue, a transit system must either cut services or turn to alternative sources, such as local dedicated taxes and fees levied on drivers. Thus, our model incorporates both finance and operational decisions, and captures the interaction between traffic congestion and travelers’ income level and mode choice. Using a case study built with empirical data in Chicago, we show that fare is not the first choice under either moral principle. For the egalitarian, the most desirable funding source is the driver fee, whereas taxation is preferred by the utilitarian. It follows that FFT can be both just and utility-maximizing, if one is allowed to raise taxes and charge drivers with impunity. However, as the flexibility in finance diminishes, so does the appeal of FFT. In such cases, the proposed model serves as a decision-support tool for finding sensible compromises that address the varied interests and ideologies at play. For example, it reveals that at the current tax rate of about 1% in Chicago, the Rawlsian egalitarian can justify FFT only if drivers pay about $1,800/year to fund transit, which amounts to about 18% of an average U.S. household’s driving cost.
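To make the two moral principles in the abstract concrete, here is a minimal sketch that scores a few funding plans under each. The utilitarian score is the population-weighted sum of utilities, and the Rawlsian (egalitarian) score is the utility of the worst-off group. The plan names, group names, utilities, and population shares are invented purely for illustration and are not results from the paper.

```python
# Hypothetical utilities by traveler group under three ways of funding transit.
# All numbers are invented for illustration only.
plans = {
    "fare":       {"low_income": 2.0, "mid_income": 5.0, "high_income": 8.0},
    "tax":        {"low_income": 3.0, "mid_income": 6.0, "high_income": 8.5},
    "driver_fee": {"low_income": 4.0, "mid_income": 5.5, "high_income": 7.0},
}
population = {"low_income": 30, "mid_income": 50, "high_income": 20}


def utilitarian_score(utils):
    """Total utility of all travelers (population-weighted sum)."""
    return sum(population[group] * u for group, u in utils.items())


def rawlsian_score(utils):
    """Utility of the most disadvantaged group (maximin criterion)."""
    return min(utils.values())


best_utilitarian = max(plans, key=lambda p: utilitarian_score(plans[p]))
best_rawlsian = max(plans, key=lambda p: rawlsian_score(plans[p]))
print(best_utilitarian, best_rawlsian)
```

With these made-up numbers the two principles pick different plans, which is the kind of disagreement the design model in the paper is meant to quantify.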

Unexpected Data Bias in Smartphone Trace Data

This study, conducted jointly with Professor Amanda Stathopoulos’s group, explores how device representation bias in smartphone tracking data shifted after Apple’s 2021 privacy updates on user location tracking. It demonstrates that privacy regulations can significantly and unexpectedly affect the quality of these data, which are crucial for decision making across governmental, corporate, and academic institutions worldwide. The research also corrects misconceptions about representation bias previously speculated in the literature: contrary to popular concerns about the under-representation of low-income populations in location-based service (LBS) data, device representation turns out to be negatively correlated with income. Overall, the findings equip users of location-based device data with a better understanding of potential pitfalls, enabling them to anticipate the changes caused by the evolving regulatory landscape and to devise appropriate coping strategies.

Download the preprint here and read the abstract below:


As smartphones become ubiquitous, practitioners look to the data generated by location-tracking services enabled on these devices as a comprehensive, yet low-cost means of studying people’s daily activities. It is now widely accepted that smartphone data traces can serve as a powerful analytical tool for research and policymaking. As the use of these data grows, though, so too do concerns regarding the privacy regulations surrounding location tracking of private citizens. Here, we examine how Apple’s tightened privacy measures, designed to restrict location-tracking on their devices, affect the quality of passively generated trace data. Using a large sample of such data collected in the Chicago metro area, we discover a significant drop in iOS data availability post-privacy regulations. The results also reveal a surprising puzzle: the reduced tracking is not uniform and contradicts customary concerns about the under-representation bias of low-income population. Instead, we find a negative correlation between device representation level and income, as well as population density. These findings reframe the debate over the increasing reliance on smartphone data, highlighting the need to understand evolving issues in tracking, coverage, and representation, which are essential for the validity of research and planning.
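The negative correlation reported in the abstract can be pictured with a small calculation. The sketch below assumes that "device representation level" means the number of observed devices divided by the resident population of an area, which is one plausible reading and not necessarily the paper's definition. The four census tracts and their numbers are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical tracts: (population, devices observed, median income in $1,000)
tracts = [
    (4000, 480, 35),
    (5200, 520, 52),
    (3100, 248, 78),
    (4500, 270, 110),
]

# Assumed definition: representation = observed devices per resident.
representation = [devices / pop for pop, devices, _ in tracts]
income = [inc for _, _, inc in tracts]

r = pearson(representation, income)
print(round(r, 3))
```

A negative value of `r` means that, in this toy sample, higher-income tracts are the ones with fewer devices per resident.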

Entropy maximization for multi-class assignment

The lack of uniqueness constitutes a serious concern for any analysis that relies on class-specific traffic assignment results, such as understanding the impact of a transport policy on the welfare of travelers from different income groups, sometimes known as vertical equity analysis. Entropy maximization is a standard approach to consistently selecting a unique class-specific solution for multi-class traffic assignment.

This is the third paper completed by the first author, Qianni Wang, who officially joined my group last year.

The paper is currently under review at Transportation Research Part B. You may download a preprint here, or read the abstract below.


Abstract: Entropy maximization is a standard approach to consistently selecting a unique class-specific solution for multi-class traffic assignment. Here, we show the conventional maximum entropy formulation fails to strictly observe the multi-class bi-criteria user equilibrium condition, because a class-specific solution matching the total equilibrium link flow may violate the equilibrium condition. We propose to fix the problem by requiring the class-specific solution, in addition to matching the total equilibrium link flow, also match the objective function value at the equilibrium. This leads to a new formulation that is solved using an exact algorithm based on dualizing the hard, equilibrium-related constraints. Our numerical experiments highlight the superior stability of the maximum entropy solution, in that it is affected by a perturbation in inputs much less than an untreated benchmark multi-class assignment solution. In addition to instability, the benchmark solution also exhibits varying degrees of arbitrariness, potentially rendering it unsuitable for assessing distributional effects across different groups, a capability crucial in applications concerning vertical equity and environmental justice. The proposed formulation and algorithm offer a practical remedy for these shortcomings.
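The non-uniqueness issue, and the conventional entropy-based way of resolving it, can be seen on a toy case. Suppose two classes share two parallel links and only the total flow on each link is pinned down. Many class-by-link splits match those totals; maximizing entropy picks one of them. The sketch below finds it by iterative proportional fitting from a uniform start, which is a standard way to obtain the maximum-entropy table with given row and column sums. The demands and link flows are made up, and the sketch covers only the conventional idea of matching total link flows; it does not include the additional constraint on the objective function value or the dual algorithm proposed in the paper.

```python
def max_entropy_split(class_demand, link_flow, iters=50):
    """Split total link flows among classes by iterative proportional fitting.

    Starting from a uniform table, rows are rescaled to match class demands
    and columns to match total link flows. The limit is the maximum-entropy
    table consistent with both sets of totals.
    """
    n_classes = len(class_demand)
    x = [[1.0 for _ in link_flow] for _ in class_demand]
    for _ in range(iters):
        for m, demand in enumerate(class_demand):
            row_sum = sum(x[m])
            x[m] = [value * demand / row_sum for value in x[m]]
        for a, flow in enumerate(link_flow):
            col_sum = sum(x[m][a] for m in range(n_classes))
            for m in range(n_classes):
                x[m][a] *= flow / col_sum
    return x


# Hypothetical inputs: two classes (demands 60 and 40) on two parallel links
# whose total equilibrium flows are 70 and 30.
split = max_entropy_split(class_demand=[60.0, 40.0], link_flow=[70.0, 30.0])
for row in split:
    print([round(value, 3) for value in row])
```

In this toy case each class ends up using the links in the same proportion as the total flow, which is what one expects when nothing else distinguishes the classes.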

Is competition for losers in bikesharing?

The rise and fall of the bikesharing industry in China offers a cautionary tale about the risks of an unregulated market with a low entry barrier. It is well known that, while low entry barriers can promote competition and innovation, they may also lead to higher market volatility and make profitability harder to achieve due to intensified rivalry. There are also limited economies of scale to be had, making it exceedingly difficult to establish a monopoly. As Peter Thiel noted, “competition is for losers” in such markets, and good entrepreneurs should simply stay away from them. However, writing off the bikesharing industry as unprofitable cannot be the only story here. After all, bikesharing has a genuinely positive societal impact and should have its place in many of our cities that are haunted by the disease of auto-dependency. The question is what, if anything, can be done to foster a healthy bikesharing market that is attractive to both users and private investors. We set out to answer this question here. You may download a preprint here, or read the abstract below.


Abstract: We model inter-operator competition in a dockless bikesharing (DLB) market as a non-cooperative game. To play the game, a DLB operator sets a strategic target (e.g., maximizing profit or maximizing ridership) and makes tactical decisions (e.g., pricing and fleet sizing). As each operator’s payoff and decision set are influenced by its own decisions as well as those of its competitors, the outcome of the game is a generalized Nash equilibrium (GNE). To analyze how competition may shape the choice of strategic targets, we further augment the game framework with a ranking scheme to properly evaluate the preference for different targets. Using a model calibrated with empirical data, we show that, if an operator is committed to maximizing its market share with a budget constraint, all other operators must respond in kind. Otherwise, they would be driven out of the market. When all operators compete for market dominance, Moreover, even if all operators agree to focus on making money rather than ruinously seeking dominance, profitability still plunges quickly with the number of players. Taken together, the results explain why the unregulated DLB market is often oversupplied and prone to collapse under competition. We also show this market failure may be prevented by a fleet cap policy, which sets an upper limit on each operator’s fleet size.
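The claim that profitability plunges with the number of players can be illustrated with a much simpler game than the one in the paper. The sketch below uses a textbook Cournot market, not the generalized Nash game of the paper: identical operators choose a quantity (think fleet size), the price falls linearly with total quantity, and the equilibrium is found by letting operators take turns playing their best response. The parameters `a`, `b`, and `c` are hypothetical.

```python
def cournot_equilibrium(n, a=10.0, b=1.0, c=2.0, sweeps=2000):
    """Sequential best-response iteration for n identical operators.

    Price = a - b * (total quantity); unit cost = c.
    Given the others' total quantity Q, an operator's best response is
    max(0, (a - c - b * Q) / (2 * b)).
    """
    q = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            others = sum(q) - q[i]
            q[i] = max(0.0, (a - c - b * others) / (2 * b))
    return q


def profit_per_operator(n, a=10.0, b=1.0, c=2.0):
    """Equilibrium profit of one operator when n operators compete."""
    q = cournot_equilibrium(n, a, b, c)
    price = a - b * sum(q)
    return (price - c) * q[0]


for n in (1, 2, 3, 5):
    print(n, round(profit_per_operator(n), 3))
```

In this toy market the per-operator profit equals (a - c)^2 / (b (n + 1)^2), so it shrinks roughly with the square of the number of competitors.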

Integrated bus-bike system

After much delay, the last paper Sida and I wrote together came out last week in Transportation Research Part C. Growing out of the last chapter of Sida’s PhD dissertation, the first draft of the paper was completed before he went back to China in the summer of 2020, at the height of the COVID-19 pandemic. In part, the long delay was due to Sida’s transition to his new job at Beijing Jiaotong University. I am glad he pressed on despite the early setbacks and eventually published the paper in a decent journal. Here is the abstract for those who wonder what an integrated bus-bike system is.


Abstract: This paper examines the design of a transit system that integrates a fixed-route bus service and a bike-sharing service. Bike availability – the average probability that a traveller can find a bike at a dock – is modelled as an analytical function of bike utilization rate, which depends on travel demand, bike usage and bike fleet size. The proposed system also recognizes the greater flexibility provided by biking. Specifically, a traveller can choose between the closest bus stop and a more distant stop for access, egress or both. Whether the closest stop is a better choice depends on the relative position of the traveller’s origin and destination, as well as system design parameters. This interdependence complicates the estimation of average system cost, which is conditional on route choice. Using a stylized analysis approach, we construct the optimal design problem as a mixed integer program with a small number of decision variables. Results from our numerical experiments show the integrated bus-bike system promises a modest but consistent improvement over several benchmark systems, especially in poorer cities with lower demand density. We find a large share of travellers, more than 20% in nearly all cases tested, opt not to use the nearest bus stop in an optimally designed system. The system also tends to maintain a high level of bike availability: the probability of finding a bike rarely drops below 90% except in very poor cities.

Wardrop equilibrium can be boundedly rational

As one of the most fundamental concepts in transportation science, Wardrop equilibrium (WE) has been the cornerstone of countless large mathematical models built over the past six decades to plan, design, and operate transportation systems around the world. However, like Nash equilibrium, its more famous cousin, WE has always had a somewhat flimsy behavioral foundation. The efforts to beef up this foundation have largely centered on reckoning with the imperfections in human decision-making processes, such as the lack of accurate information, limited computing power, and sub-optimal choices. This retreat from behavioral perfectionism was typically accompanied by a conceptual expansion of equilibrium. In place of WE, for example, transportation researchers have defined such generalized equilibrium concepts as stochastic user equilibrium (SUE) and boundedly rational user equilibrium (BRUE). Invaluable as these alternatives are to enriching our understanding of equilibrium and advancing modeling and computational tools, they advocate for the abandonment of WE, predicated on its incompatibility with more realistic behaviors. Our study aims to demonstrate that giving up perfect rationality need not force a departure from WE, since WE may be reached with global stability in a routing game played by boundedly rational travelers. To this end, we construct a day-to-day (DTD) dynamical model that mimics how travelers gradually adjust their valuations of routes, and hence their choice probabilities, based on past experiences.

Our model, called cumulative logit (CULO), resembles the classical DTD models but makes a crucial change: whereas the classical models assume routes are valued based on the cost averaged over historical data, ours values the routes based on the cost accumulated. To describe route choice behaviors, the CULO model only uses two parameters, one accounting for the rate at which the future route cost is discounted in the valuation relative to the past ones (the passivity measure) and the other describing the sensitivity of route choice probabilities to valuation differences (the dispersion parameter).  We prove that the CULO model always converges to WE, regardless of the initial point, as long as the passivity measure either shrinks to zero as time proceeds at a sufficiently slow pace or is held at a sufficiently small constant value. Importantly, at the aggregate (i.e., link flow) level, WE is independent of the behavioral parameters. Numerical experiments confirm that a population of travelers behaving differently reaches the same aggregate WE as a homogeneous population, even though in the heterogeneous population, travelers’ route choices may differ considerably at WE.
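A minimal sketch may help picture the dynamics described above. It is my simplified reading of the prose, not the paper's exact equations: each day the route valuations grow by that day's route costs weighted by the passivity measure, and travelers split between routes according to a logit rule on the accumulated valuations, scaled by the dispersion parameter. The two-route network and its linear cost functions are hypothetical.

```python
from math import exp


def route_costs(p1):
    """Costs on two parallel routes sharing one unit of demand (made-up functions)."""
    p2 = 1.0 - p1
    return 1.0 + 2.0 * p1, 2.0 + 1.0 * p2


def simulate(days=2000, passivity=0.1, dispersion=1.0, start=(0.0, 0.0)):
    """Return the share of travelers on route 1 after the given number of days."""
    s1, s2 = start  # accumulated valuations of routes 1 and 2
    p1 = 0.5
    for _ in range(days):
        # Logit choice: only the difference in accumulated valuations matters.
        p1 = 1.0 / (1.0 + exp(dispersion * (s1 - s2)))
        c1, c2 = route_costs(p1)
        # Accumulate (rather than average) today's costs, weighted by passivity.
        s1 += passivity * c1
        s2 += passivity * c2
    return p1


print(simulate())
print(simulate(dispersion=0.5, start=(5.0, 0.0)))
```

On this toy network the Wardrop equilibrium puts two thirds of the demand on route 1, where both routes cost 7/3. With a small constant passivity, the simulated share settles there from either starting point and for either dispersion value, in line with the statement that the aggregate equilibrium does not depend on the behavioral parameters.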

By equipping WE with a route choice theory compatible with bounded rationality, we uphold its role as a benchmark in transportation systems analysis. Compared to the incumbents, our theory requires no modifications of WE as a result of behavioral accommodation. This simplicity helps avoid the complications that come with a “moving benchmark”, be it caused by a multitude of equilibria or the dependence of equilibrium on certain behavioral traits. Moreover, by offering a plausible explanation for travelers’ preferences among equal-cost routes at WE, the theory resolves the theoretical challenge posed by Harsanyi’s instability problem. Note that we lay no claim on the behavioral truth about route choices. Real-world routing games take place in such complicated and ever-evolving environments that they may never reach a true stationary state, much less the prediction of a mathematical model riddled with a myriad of assumptions. Indeed, a relatively stable traffic pattern in a transportation network may be explained as a point in a BRUE set, an SUE tied to properly calibrated behavioral parameters, or simply a crude WE according to the CULO model. More empirical research is still needed to compare and vet these competing theories for target applications. However, one should no longer write off WE just because it has no reasonable behavioral foundation.

A preprint can be downloaded at arXiv or SSRN.

The sustainability appeal of URT

Few would deny that public transit has an important role to play in any sensible solution to transportation’s sustainability problem. Yet, the consensus often dissolves at the question of how. A case in point concerns urban rail transit (URT), which has expanded rapidly in recent decades. The ongoing debate about URT has been fueled by inconclusive, sometimes contradictory, empirical evidence reported in the literature. Has URT consistently reduced driving and/or auto ownership to affirm its appeal to sustainability? We set out to address this question head-on in this study.

You may read the abstract below, and download a preprint here.


Abstract: Urban rail transit (URT) has expanded rapidly since the dawn of the century. While the high cost of building and operating URT systems is increasingly justified by their presumed contribution to sustainability — by stimulating transit-oriented development, promoting the use of public transportation, and alleviating traffic congestion — the validity of these claims remains the subject of heated debates. Here we examine the impact of URT on auto ownership, traffic congestion, and bus usage and service, by applying fixed-effects panel regression to time series data sets compiled for major urban areas in China and the US. We find that URT development is strongly and negatively correlated with auto ownership in both countries. This URT effect has an absolute size (as measured by elasticity) in China three times that in the US, but is much larger in the US than in China, relative to other factors such as income and unemployment rate. Importantly, the benefit transpires only after a URT system reaches the tipping point that unleashes the network effect.  Where this condition is met, we estimate about 14,012 and 31,844 metric tons of greenhouse gas emissions can be eliminated each year in China and the US, respectively, for each additional million URT vehicle kilometers traveled. We also uncover convincing evidence of cannibalization by URT of bus market share in both countries. However, rather than undermining bus services, developing URT strongly stimulates their growth and adaptation. Finally, no conclusive evidence is found that confirms a significant association between URT and traffic congestion. While traffic conditions may respond positively to URT development in some cases, the relief is likely short-lived.
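For readers unfamiliar with fixed-effects panel regression, the sketch below shows the basic "within" estimator for a single regressor: subtract each city's own average from its observations, then regress what is left. This removes any city-specific baseline that does not change over time. The city names, variables, and numbers are made up, and the sketch handles only one-way (city) fixed effects with one explanatory variable, which is far simpler than the models estimated in the paper.

```python
def within_estimator(panel):
    """Slope of y on x after demeaning both within each unit.

    panel: {unit_name: [(x, y), ...]} with several observations per unit.
    """
    numerator = denominator = 0.0
    for observations in panel.values():
        mean_x = sum(x for x, _ in observations) / len(observations)
        mean_y = sum(y for _, y in observations) / len(observations)
        numerator += sum((x - mean_x) * (y - mean_y) for x, y in observations)
        denominator += sum((x - mean_x) ** 2 for x, _ in observations)
    return numerator / denominator


# Made-up panel: x = URT network length (km), y = autos per 1,000 residents.
# Each city has its own baseline (the fixed effect); the true slope is -0.5.
baseline = {"city_A": 300.0, "city_B": 500.0, "city_C": 400.0}
urt_km = {
    "city_A": [10, 20, 40],
    "city_B": [100, 150, 200],
    "city_C": [50, 60, 90],
}
panel = {
    city: [(x, baseline[city] - 0.5 * x) for x in urt_km[city]]
    for city in baseline
}

beta = within_estimator(panel)
print(round(beta, 6))
```

Here the city with the most rail also has the highest baseline auto ownership, so a regression that pooled all observations without city effects would be misleading, while the within estimator recovers the slope used to generate the data.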

Ethics-Aware Transit Design

In this paper we proposed a corridor transit design model that places accessibility and equity at the center of the trade-off. By guiding transit design with ethical theories, it promises to improve vertical equity. We reviewed and examined four different ethical principles but focused on the utilitarian principle (the status quo) and John Rawls’ difference principle (a form of egalitarianism). The main findings from our analysis of the design models are summarized as follows.

  • When the transit service is homogeneous in space, the utilitarian design model and the egalitarian design model are mathematically equivalent. Thus, they always produce identical designs for all forms of the opportunity distribution.
  • With supply heterogeneity, the egalitarian design has a prominent equity-enhancing effect, whereas the utilitarian design tends to exacerbate inequity, especially in the presence of large innate inequality.
  • Correcting innate inequality by applying the egalitarian principle often entails interventions that appear more “discriminatory” than the status quo. Whether or not such distributive measures are justified, the appearance of unfairness can be met with skepticism, if not outright opposition, from the general public.
  • Our ability to promote equity is restricted not only by the resources available but also by the structure of the problem at hand. The difference principle is useful because it defines the upper limit of equity that we may strive to reach but should not exceed.

It is worth recalling that the egalitarian design based on the difference principle tends to reduce the total accessibility of all residents, compared to the incumbent design regime of utilitarianism. When innate inequality is large, the loss of accessibility can be substantial, up to 40% according to our experiments. This, of course, is hardly a surprise, given that the primary concern of the difference principle is distributive justice, not total utility. One thing is clear though: the benefits to the most disadvantaged could come at a hefty price to society writ large. Stephen Dubner, the host of the popular podcast Freakonomics, likes to quip,

Economists know the price of everything but the value of nothing.

No doubt the same can be said about many if not most engineers. In some sense, our study constitutes an attempt to price social values in engineering practice. To be sure, these values are priceless to many an advocate, who would be quick to point out that the obsession with pricing everything is precisely what got us here in the first place. However, understanding the consequence of imposing certain values in engineering systems is still a crucial task, if only because we always need to secure public support or determine affordability.

The work is partially funded by Northwestern University’s Catalyst fund and NSF’s Smart and Connected Community (S&CC) Planning Grant. A preprint of the paper may be downloaded here.

Allocation problem for the platform of platforms

Another joint work with Ruijie Li, built on our previous research on ridesharing, including A-PASS and Pricing carpool.

In this study, we consider a general problem called the Allocation Problem for the Platform of Platforms, dubbed AP3. Such a problem might arise in a two-sided service market, where a third-party integrator tries to allocate customers to workers separately controlled by a set of online platforms in a manner that satisfies all stakeholders. The integrator, as a leader, influences the outcome of the game by pricing the service, whereas the platforms (followers) are given the freedom to accept or reject customers to maximize their own profit, given the prices set by the integrator (see the plot below for an illustration). A set of nonlinear constraints is imposed on the leader’s problem to eliminate artificial scarcity originating from the integrator’s monopoly power. We formulate AP3 as a Stackelberg bipartite matching problem, which is known to be NP-hard in general. Our main result concerns the proof that AP3 can be reduced to a polynomially solvable problem by taking advantage of, somewhat paradoxically, the hard requirement of ruling out artificial scarcity.

A preprint can be downloaded here.


Abstract: We study the Allocation Problem for the Platform of Platforms (abbreviated as AP3) in a two-sided service market, where a third-party integrator tries to allocate customers to workers separately controlled by a set of online platforms in a manner that satisfies all stakeholders. AP3 is a natural Stackelberg game. The integrator, as a leader, influences the outcome of the game by pricing the service, whereas the platforms (followers) are given the freedom to accept or reject customers to maximize their own profit, given the prices set by the integrator. A set of nonlinear constraints are imposed on the leader’s problem to eliminate artificial scarcity, derived from the integrator’s monopoly power. We formulate AP3 as a Stackelberg bipartite matching problem, which is known to be NP-hard in general. Our main result concerns the proof that AP3 can be reduced to a polynomially solvable problem by taking advantage of, somewhat paradoxically, the “hard” requirement of ruling out artificial scarcity. Numerical experiments are conducted using the ride-hail service market as a case study. We find artificial scarcity negatively affects the number of customers served, although the magnitude of the effect varies with market conditions. In most cases, the integrator takes the lion’s share of the profit, but the need to eliminate artificial scarcity sometimes forces them to concede the benefits of collaboration to the platforms. The tighter the supply relative to the demand, the more the platforms benefit from removing artificial scarcity. In an over-supplied market, however, the integrator has a consistent and overwhelming advantage bestowed by its monopoly position.