AIED
Can We Trust AI's Self-Assessment? Evaluating and Improving LLM Confidence Calibration in Educational Dialogue Coding
Mon Jun 29, 4:35 PM–5:00 PM · North 203
Generative AI & Large Language Models · Explainable & Trustworthy AI · NLP & Discourse Analysis · Automated Assessment & Scoring
Evaluates whether LLMs' self-reported confidence is well-calibrated when coding educational dialogue, and proposes methods to improve calibration.