Publication: Context Clues: Probing Proactive LLM Decision-Making in Ambiguous, Socially Contextualized Multi-Turn Interactions
Abstract
This study investigates the behavior of large language models (LLMs) in dynamic, multi-turn decision-making scenarios under conditions of ambiguity and social contextualization. Using a dual-agent framework, LLMs assume the roles of both questioner and answerer in two tasks: a 20 Questions game and a medical diagnosis simulation. We analyze how gender cues and target ambiguity, quantified via semantic entropy, affect reasoning efficiency and conversational success. Results show that ChatGPT-4o and Gemini 2.0 Flash exhibit sensitivity to social context and ambiguity, with ChatGPT-4o demonstrating superior reasoning in high-stakes settings. The findings highlight implicit biases and reasoning inefficiencies introduced by demographic context and underscore the limitations of single-turn evaluation metrics. This work emphasizes the need for interaction-aware evaluation frameworks for LLMs acting as proactive agents in socially sensitive domains.
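The abstract does not say how semantic entropy is computed for a target. One common formulation, assumed here purely for illustration, is the Shannon entropy of the distribution of sampled responses over clusters of equivalent meaning. The sketch below assumes the cluster labels have already been assigned by some separate equivalence check; the function name `semantic_entropy` and the label format are hypothetical and not taken from the publication.

```python
import math
from collections import Counter


def semantic_entropy(cluster_labels):
    """Shannon entropy, in bits, of the empirical distribution over meaning clusters.

    `cluster_labels` holds one label per sampled response; responses that share
    a label are assumed to have been judged semantically equivalent upstream.
    """
    if not cluster_labels:
        raise ValueError("need at least one cluster label")
    counts = Counter(cluster_labels)
    total = len(cluster_labels)
    # Sum p * log2(1 / p) over clusters, where p is the share of samples in a cluster.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical labels: four samples that all share one meaning vs. four distinct meanings.
unambiguous = semantic_entropy(["m0", "m0", "m0", "m0"])
ambiguous = semantic_entropy(["m0", "m1", "m2", "m3"])
```

Under this assumption, a target whose sampled responses all fall into one cluster has zero entropy, while four samples spread evenly over four clusters give 2 bits, so higher values correspond to more ambiguous targets.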