INFORMS Open Forum

DM Webinar (April 29): Alignment in Large Language Models: Statistical and Game-Theoretic Perspectives


    Dear colleagues,
    The INFORMS Data Mining Society is pleased to invite you to an upcoming webinar:
    Title: Alignment in Large Language Models: Statistical and Game-Theoretic Perspectives

    Speaker: Weijie Su, Associate Professor, Wharton Statistics and Data Science Department, University of Pennsylvania

    Date & Time: Wednesday, April 29, 2026 | 1 PM ET

    Registration: https://us06web.zoom.us/webinar/register/WN_VyPOx9gSSvi_ydaT1SdEKA

    Abstract: Large language models (LLMs) are predominantly aligned with human preferences through reinforcement learning from human feedback (RLHF). In this talk, we explore the theoretical foundations of LLM alignment through the intertwined lenses of statistics and game theory. First, we show how the current formulation of RLHF induces a systematic bias we call preference collapse, and how this can be mitigated by introducing a tailored regularization term into the reward function. Next, we expose a fundamental bottleneck of reward-based alignment, demonstrating that cyclic human preferences cannot be faithfully represented by scalar reward models such as the Bradley-Terry model. More precisely, we establish that such cyclic inconsistencies give rise to a lower bound on the approximation error of any scalar reward fitting. Shifting to a game-theoretic perspective, we focus on Nash learning from human feedback and establish several social choice desiderata for this approach to alignment, including the preservation of preference diversity through the emergence of mixed strategies. Finally, we show that the zero-sum game approach generally cannot perfectly match a target preference distribution as a unique Nash equilibrium. This talk is based on arXiv 2405.16455, 2503.10990, 2505.20627, and 2506.12350.
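
    As a rough illustration of the cyclic-preference point in the abstract (not taken from the talk or the cited papers), the sketch below fits Bradley-Terry rewards to a three-item cycle A > B > C > A. The helper names (bt_prob, cycle_logit_sum, fit_bt) and the 0.8 preference strength are assumptions made up for this example. Under Bradley-Terry the log-odds around any cycle sum to zero, so a cycle in which every preference exceeds 0.5 cannot be matched, and the best scalar fit collapses to 0.5 on every pair.

```python
import math

def bt_prob(r_i, r_j):
    """Bradley-Terry probability that item i is preferred to item j."""
    return 1.0 / (1.0 + math.exp(-(r_i - r_j)))

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def cycle_logit_sum(p_ab, p_bc, p_ca):
    """Sum of log-odds around the cycle A -> B -> C -> A.

    For any Bradley-Terry model this is (rA - rB) + (rB - rC) + (rC - rA) = 0.
    """
    return logit(p_ab) + logit(p_bc) + logit(p_ca)

def fit_bt(prefs, steps=5000, lr=0.1):
    """Fit scalar rewards by gradient descent on the cross-entropy loss.

    prefs maps (i, j) -> target probability that i is preferred to j.
    (Hypothetical helper for illustration only.)
    """
    items = sorted({k for pair in prefs for k in pair})
    # Start from distinct rewards so the descent has something to do.
    r = {k: 0.5 * idx for idx, k in enumerate(items)}
    for _ in range(steps):
        grad = {k: 0.0 for k in items}
        for (i, j), p in prefs.items():
            err = bt_prob(r[i], r[j]) - p  # d(loss)/d(r_i - r_j)
            grad[i] += err
            grad[j] -= err
        for k in items:
            r[k] -= lr * grad[k]
    return r

# Assumed toy data: each item beats the next one in the cycle 80% of the time.
cyclic = {("A", "B"): 0.8, ("B", "C"): 0.8, ("C", "A"): 0.8}

rewards = fit_bt(cyclic)
fitted = {pair: bt_prob(rewards[pair[0]], rewards[pair[1]]) for pair in cyclic}

print("target log-odds around the cycle:", cycle_logit_sum(0.8, 0.8, 0.8))
for pair, p in fitted.items():
    print(pair, "target 0.8, fitted", round(p, 4))
```

    In this toy case the gap between 0.8 and the fitted 0.5 is an approximation error that no choice of scalar rewards can remove, which is the flavor of lower bound the abstract describes.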
    Speaker Bio:
    Weijie Su is an Associate Professor in the Wharton Statistics and Data Science Department at the University of Pennsylvania, with courtesy appointments in Computer and Information Science and Mathematics. He is also co-director of the Penn Research in Machine Learning (PRiML) Center. He received his Ph.D. from Stanford University in 2016 and his bachelor's degree from Peking University in 2011. His research focuses on the statistical foundations of generative AI, privacy-preserving data analysis, high-dimensional statistics, and optimization. He serves as an associate editor for several leading journals, including Operations Research, Journal of the American Statistical Association, Journal of Machine Learning Research, and Annals of Applied Statistics. He is currently serving as Integrity Chair for ICML 2026. His work has been recognized with numerous honors, including the NSF CAREER Award, Sloan Research Fellowship, SIAM Early Career Prize in Data Science, ASA Noether Early Career Award, IMS Peter Hall Prize, IMS Fellowship, and the COPSS Presidents' Award.



    ------------------------------
    Ying Lin
    Associate Professor
    Department of Industrial Engineering
    University of Houston
    Houston, TX
    ------------------------------
