Latent Neural Coupling of Risk and Time Preferences in LLMs
Abstract: Large language models (LLMs) increasingly act as proxy decision makers, but it is unclear whether their economic choices are organized by structured internal representations or by surface-level pattern matching. Across several current LLMs, we show that risk preference is encoded as an approximately one-dimensional linear direction in activation space: a single latent coordinate that moves behavior from risk aversion to risk seeking across many lotteries. Using standard prospect-theory, intertemporal-choice, and dictator-game vignettes, we first elicit familiar patterns such as risk aversion and temporal discounting. We then train a sparse linear probe on contrastive prompts (risk-averse vs. risk-seeking) to identify a risk axis whose projection reliably predicts and orders lottery choices. Crucially, small perturbations along this axis shift choices in the intended direction without changing the vignettes, providing causal evidence that the axis is part of the decision mechanism. These interventions also move intertemporal choice: steering toward risk seeking lowers implied discount rates, while steering toward risk aversion raises them—reversing the positive risk-aversion/patience correlation commonly observed in humans and suggesting a different internal coupling between risk and patience in LLMs. In contrast, dictator-game allocations change weakly and in a model-specific manner, indicating that social preferences are represented in largely separate subspaces. Together, the results identify an interpretable latent coordinate for risk, show that it also governs time preference, and provide a practical probe-and-steer method for mapping economic traits in LLMs.
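The probe-and-steer recipe described in the abstract can be sketched in miniature. The snippet below is an illustrative stand-in, not the authors' implementation: it uses synthetic vectors in place of LLM hidden states, estimates a "risk axis" as a sparsified difference of class means (a crude proxy for the sparse linear probe), and then steers a held-out activation by adding a scaled copy of that axis. All names (`true_axis`, `alpha`, the 0.5 sparsity threshold) are hypothetical choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for LLM hidden states: in the real method these
# would be activations from a fixed layer under contrastive prompts.
d = 64                       # activation dimensionality (assumed)
true_axis = np.zeros(d)
true_axis[:8] = 1.0          # risk encoded in a few coordinates (assumed)
true_axis /= np.linalg.norm(true_axis)

n = 200
labels = np.repeat([0, 1], n)                 # 0 = risk-averse, 1 = risk-seeking
acts = rng.normal(scale=0.5, size=(2 * n, d)) \
     + np.outer(2 * labels - 1, true_axis)    # classes shifted along the axis

# "Probe": difference-of-means direction, sparsified by zeroing
# small-magnitude coordinates, then normalized.
direction = acts[labels == 1].mean(0) - acts[labels == 0].mean(0)
direction[np.abs(direction) < 0.5 * np.abs(direction).max()] = 0.0
direction /= np.linalg.norm(direction)

# Projection onto the axis orders the two prompt classes.
proj = acts @ direction
assert proj[labels == 1].mean() > proj[labels == 0].mean()

# "Steer": a small perturbation along the axis moves a held-out
# activation toward the risk-seeking end.
h = rng.normal(size=d)
alpha = 3.0                  # steering strength (assumed)
h_steered = h + alpha * direction
assert h_steered @ direction > h @ direction
```

In the paper's setting, the same steering vector would be added to the residual stream at inference time, and the claim is that lottery and intertemporal choices shift accordingly while the prompts themselves stay fixed.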
Bio: Yan Leng is an Assistant Professor at the McCombs School of Business, The University of Texas at Austin, with courtesy appointments in the Computer Science Department and the School of Information. Her research lies at the intersection of computational social science, network science, and interpretable machine learning, using large-scale behavioral and platform data to study decision-making and interactions in networks. She also studies large language models from a behavioral-economics perspective. Her work appears in venues such as Management Science, Information Systems Research, ICML, and NeurIPS, and has been supported by the National Science Foundation and the National Institutes of Health.