FoL 2026 Seoul · Jun 27 – Jul 3

AIED

Measuring What Matters – or What’s Convenient?: Robustness of LLM-Based Scoring Systems to Construct-Irrelevant Factors

Mon Jun 29, 9:00 AM–9:15 AM · Room 101

Part of Validity and Reliability of AI-Based Educational Measurement

Automated Assessment & Scoring · Psychometrics & Educational Measurement · Generative AI & Large Language Models · Explainable & Trustworthy AI

Probes whether LLM-based scoring systems are sensitive to construct-irrelevant factors, raising questions about what these systems actually measure.

Authors

Cole Walsh, Rodica Ivan