CS 497: Calibration (Foundations of Trustworthy ML)

Current Term: Fall 2024

Synopsis: This is an advanced topics seminar on theoretical topics related to calibration. Calibrated predictions are predictions whose stated probabilities match the empirical frequencies of the outcomes they predict. For example, a weather forecast is calibrated if, among the days on which the forecast is a p chance of rain, the empirical fraction of rainy days is also p. Calibrated predictions have the property that it is optimal for a decision maker to best respond as if the predictions were correct. Calibrated predictions also have applications to explainable AI and fairness. For these reasons, a considerable and rich recent literature has developed the algorithmic foundations of calibration. The readings of the course will be drawn from the recent and classic literature on calibration. Topics include: online learning and swap regret, prediction for decision making, measuring calibration error, online calibration, calibration and machine learning, multi-calibration, fairness, omni-prediction, correlated equilibrium, manipulation of learning algorithms, and calibration for language models.
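As a concrete illustration of the definition above (a sketch, not part of the course materials), the standard binned Expected Calibration Error from Week 3's topic list can be computed in a few lines of Python; the function name and binning scheme here are one common convention, not a fixed standard:

```python
import numpy as np

def expected_calibration_error(preds, outcomes, n_bins=10):
    """Binned ECE: group predictions into equal-width probability bins,
    then average |mean prediction - empirical frequency| over bins,
    weighted by the fraction of samples in each bin."""
    preds = np.asarray(preds, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    # Map each prediction in [0, 1] to a bin index in {0, ..., n_bins-1}.
    bins = np.minimum((preds * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            gap = abs(preds[mask].mean() - outcomes[mask].mean())
            ece += mask.mean() * gap
    return ece
```

For instance, a forecaster who always predicts a 0.5 chance of rain on days where it rains exactly half the time is calibrated and gets ECE 0, while one who predicts 0.9 on days where it never rains gets ECE 0.9.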

Coursework: Students will prepare and lead discussions on the selected papers. Coursework includes a survey paper and a preliminary study for a research project. Participation in the IDEAL Special Program on Interpretability, Privacy, and Fairness is optional.

  • Week 6: survey paper due
  • Week 11: project presentations
  • Week 12: final papers due

Prerequisites: Prior Ph.D. level coursework in algorithms, microeconomics, mechanism design, machine learning, data science, or econometrics.

Instructors: Jason Hartline and Yifan Wu.

Location: Tuesday, 2-5pm, University Hall 112, Northwestern University, Evanston Campus.

Tentative Reading List:

  • Week 1: Online Learning and Swap Regret
    • Topics: online learning, online prediction, exponential weights algorithm, Blum-Mansour algorithm, best-in-hindsight/external regret, swap/internal/calibrated regret.
    • Blum, A., & Mansour, Y. (2007). From external to internal regret. Journal of Machine Learning Research, 8(6). [link]
  • Week 2: Prediction for Decision Making
    • Topics: decision making, forecast elicitation, proper scoring rules, Blackwell’s informativeness theorem.
    • Gneiting, T. (2011). Making and evaluating point forecasts. Journal of the American Statistical Association, 106(494), 746-762. [link]
    • Blackwell, D. (1951). Comparison of experiments. In Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability (Vol. 1, pp. 93-102). [link]
  • Week 3: Measuring Calibration Error
    • Topics: Expected Calibration Error (ECE), smooth calibration error, calibration decision loss.
    • Błasiok, J., Gopalan, P., Hu, L., & Nakkiran, P. (2023, June). A unifying theory of distance from calibration. In Proceedings of the 55th Annual ACM Symposium on Theory of Computing (pp. 1727-1740). [link]
    • Hu, L., & Wu, Y. (2024). Calibration Error for Decision Making. Foundations of Computer Science (FOCS). [arxiv]
  • Week 4: Online Calibration
    • Topics: minimax theorem, online calibration.
    • Hart, S. (2022). Calibrated forecasts: The minimax proof. arXiv preprint arXiv:2209.05863. [arxiv]
    • Qiao, M., & Valiant, G. (2021, June). Stronger calibration lower bounds via sidestepping. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing (pp. 456-466). [link]
    • Qiao, M., & Zheng, L. (2024). On the Distance from Calibration in Sequential Prediction. arXiv preprint arXiv:2402.07458. [arxiv]
    • Arunachaleswaran, E. R., Collina, N., Roth, A., & Shi, M. (2024). An Elementary Predictor Obtaining $2\sqrt{T}$ Distance to Calibration. arXiv preprint arXiv:2402.11410. [arxiv]
  • Week 5: Calibration and Learning
    • Topics: proper scoring rules, multi-calibration, neural networks.
    • Błasiok, J., Gopalan, P., Hu, L., & Nakkiran, P. (2024). When does optimizing a proper loss yield calibration? Advances in Neural Information Processing Systems, 36. [link]
    • Błasiok, J., Gopalan, P., Hu, L., Kalai, A. T., & Nakkiran, P. (2024). Loss Minimization Yields Multicalibration for Large Neural Networks. In 15th Innovations in Theoretical Computer Science Conference (ITCS 2024). Schloss-Dagstuhl-Leibniz Zentrum für Informatik. [link]
  • Week 6: no class. Students are encouraged to attend the FOCS Workshop on Calibration.
  • Week 7: Multi-calibration, Fairness and Omni-prediction
    • Pleiss, G., Raghavan, M., Wu, F., Kleinberg, J., & Weinberger, K. Q. (2017). On fairness and calibration. Advances in neural information processing systems, 30. [link]
    • Gopalan, P., Kalai, A. T., Reingold, O., Sharan, V., & Wieder, U. (2022). Omnipredictors. In 13th Innovations in Theoretical Computer Science Conference (ITCS 2022) (Vol. 215, p. 79). Schloss Dagstuhl–Leibniz-Zentrum fur Informatik. [link]
  • Week 8: Learning Algorithms under Competition
    • Topics: robustness of learning algorithms, swap regret, Stackelberg game, correlated equilibria.
    • Deng, Y., Schneider, J., & Sivan, B. (2019). Strategizing against no-regret learners. Advances in neural information processing systems, 32. [arxiv]
    • Collina, N., Gupta, V., & Roth, A. (2024). Repeated Contracting with Multiple Non-Myopic Agents: Policy Regret and Limited Liability. EC 2024. [arxiv]
  • Week 9: Conformal Prediction
    • Angelopoulos, A. N., & Bates, S. (2021). A gentle introduction to conformal prediction and distribution-free uncertainty quantification. arXiv preprint arXiv:2107.07511. [arxiv]
    • Noarov, G., Ramalingam, R., Roth, A., & Xie, S. (2023). High-dimensional prediction for sequential decision making (Section 6). arXiv preprint arXiv:2310.17651. [arxiv]
  • Week 10: Calibration and Complexity Theory
    • Topics: indistinguishability, Regularity Lemma, Hardcore Lemma.
    • Dwork, C., Kim, M. P., Reingold, O., Rothblum, G. N., & Yona, G. (2021, June). Outcome indistinguishability. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing (pp. 1095-1108). [link]
    • Casacuberta, S., Dwork, C., & Vadhan, S. (2024, June). Complexity-Theoretic Implications of Multicalibration. In Proceedings of the 56th Annual ACM Symposium on Theory of Computing (pp. 1071-1082). [link]
  • Week 11: project presentations