Current Term: Fall 2022
Synopsis: This advanced topics seminar considers theoretical topics in data economics. As data science transforms science and society, it is important to develop the economics of data: collecting data is costly, possessing data confers market power, sharing data has risks and benefits, and conclusions drawn from data depend on its quantity and quality. The readings are drawn from the recent and classic literature on data economics. Topics include valuing data, eliciting data, incentivizing data collection and sharing, adaptive data analysis, and game theory with data.
Coursework: Students will prepare and lead discussions on the selected papers. Coursework includes a survey paper and a preliminary study for a research project. Participation in the IDEAL Special Quarter on Data Economics is optional.
Prerequisites: Prior Ph.D.-level coursework in algorithms, microeconomics, mechanism design, data science, or econometrics.
Locale: Friday 2-5pm Central, Virtual
Schedule:
- Week 0 (Sept 23): Introduction
- Week 1 (Sept 30): Elicitation
- Savage, L. J. (1971). Elicitation of personal probabilities and expectations. Journal of the American Statistical Association, 66(336), 783-801.
- Lambert, N. S. (2018). Elicitation and evaluation of statistical forecasts. Preprint.
- (Additional Resource) McCarthy, J. (1956). Measures of the value of information. Proceedings of the National Academy of Sciences, 42(9), 654-655.
- (Additional Resource) Gneiting, T., & Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477), 359-378.
- Week 2 (Oct 7): Valuing Data
- Chen, Y., & Waggoner, B. (2016, October). Informational substitutes. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS) (pp. 239-247). IEEE.
- Frankel, A., & Kamenica, E. (2019). Quantifying information and uncertainty. American Economic Review, 109(10), 3650-3680.
- Ghorbani, A., & Zou, J. (2019, May). Data Shapley: Equitable valuation of data for machine learning. In International Conference on Machine Learning (pp. 2242-2251). PMLR.
- (Additional Resource) Howard, R. A. (1966). Information value theory. IEEE Transactions on Systems Science and Cybernetics, 2(1), 22-26.
- Week 3 (Oct 14): Workshop 1: Elicitation Mechanisms in Practice
- Week 4 (Oct 21): Peer Prediction
- Prelec, D. (2004). A Bayesian truth serum for subjective data. Science, 306(5695), 462-466.
- Kong, Y., & Schoenebeck, G. (2019). An information theoretic framework for designing information elicitation mechanisms that reward truth-telling. ACM Transactions on Economics and Computation (TEAC), 7(1), 1-33.
- (Additional Resource) Miller, N., Resnick, P., & Zeckhauser, R. (2005). Eliciting informative feedback: The peer-prediction method. Management Science, 51(9), 1359-1373.
- (Additional Resource) Kong, Y. (2020). Dominantly truthful multi-task peer prediction with a constant number of tasks. In Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms (pp. 2398-2411).
- Week 5 (Oct 28): Workshop 2: Elicitation and Evaluation
- Week 6 (Nov 4): Data Collection and Sharing
- Bergemann, D., Bonatti, A., & Gan, T. (2022). The economics of social data. The RAND Journal of Economics.
- Gradwohl, R., & Tennenholtz, M. (2022). Pareto-Improving Data-Sharing. arXiv preprint arXiv:2205.11295.
- (Additional Resource) Shahmoon, R., Smorodinsky, R., & Tennenholtz, M. (2022). Data Curation from Privacy-Aware Agents. arXiv preprint arXiv:2207.06929.
- Week 7 (Nov 11): Workshop 3: Challenges in Data Economics
- Week 8 (Nov 18): Machine Learning Connections
- Abernethy, J. D., & Frongillo, R. (2011). A collaborative mechanism for crowdsourcing prediction problems. Advances in Neural Information Processing Systems, 24.
- Agarwal, A., & Agarwal, S. (2015, June). On consistent surrogate risk minimization and property elicitation. In Conference on Learning Theory (pp. 4-22). PMLR.
- (Additional Resource) Aragones, E., Gilboa, I., Postlewaite, A., & Schmeidler, D. (2005). Fact-free learning. American Economic Review, 95(5), 1355-1368.
- (Additional Resource) Gneiting, T. (2011). Making and evaluating point forecasts. Journal of the American Statistical Association, 106(494), 746-762.
- Week 9 (Dec 2): Adaptive Data Analysis
- Jung, C., Ligett, K., Neel, S., Roth, A., Sharifi-Malvajerdi, S., & Shenfeld, M. (2020). A New Analysis of Differential Privacy’s Generalization Guarantees. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020).
- Woodworth, B. E., Feldman, V., Rosset, S., & Srebro, N. (2018). The everlasting database: statistical validity at a fair price. In Advances in Neural Information Processing Systems (pp. 6531-6540).