AI-Assisted Assessment in Education: Transforming Assessment and Measuring Learning
Artificial intelligence is reshaping how teachers gauge learning, and this new volume shows you how to harness that change responsibly. Written by assessment veteran Dr. Goran Trajkovski and psychometrician Dr. Heather Hayes, the book starts with plain-language explanations of machine learning in assessment, then walks through practical methods for building AI-driven question formats, running adaptive tests, and interpreting the resulting data. Each chapter blends research findings with field examples to help you redesign quizzes, projects, and high-stakes exams for greater accuracy and speed.
Beyond technique, the authors devote full sections to fairness, privacy, and the human judgment that must guide any automated scoring system. Case studies from K-12, higher education, and workforce training reveal the real-world gains and trade-offs institutions face when algorithms give real-time feedback or flag at-risk learners. A concluding “Practical Guide” offers step-by-step checkpoints for piloting AI tools in your own courses or credential programs.
Researchers, administrators, instructional designers, and policy makers will find actionable advice for scaling personalized, equitable assessment while meeting emerging regulations. If you are ready to move past manual grading and toward data-rich evidence of learning, this book provides the roadmap.
Assessment Under Adversarial Pressure: Why AI Broke the Evidence Model in Higher Education
Generative AI did not create a cheating problem in higher education. It created an evidence problem.
For decades, assessment systems operated under a cooperative assumption: producing a convincing academic artifact required the cognitive effort the artifact was intended to represent. Essays, research papers, problem sets, and capstone projects served as reliable proxies for learning because the cost of producing them without learning was prohibitively high. That structural condition no longer holds. Generative AI severed the inferential link between what students produce and what students know, and it did so at a speed and scale that most institutions have not yet fully reckoned with.
The two dominant responses — detection and assignment redesign — fail for structural reasons that will not improve with better tools or more creative prompts. Detection tools operate in a permanently asymmetric arms race where evasion costs approach zero while detection costs compound indefinitely. Redesigned assignments, however pedagogically valuable, do not restore the broken inference chain if the final artifact can still be delegated.
This book argues for a different approach: adversarial assessment. Drawing on validity theory, evidence-centered design, and security architecture, it introduces a framework for designing assessment systems that produce trustworthy evidence of learning regardless of what tools students have access to. The core principle is structural rather than behavioral: instead of policing tool use, institutions should engineer the conditions under which evidence is generated — supervision, temporal coupling, interactional verification, and evidentiary redundancy.
The book proceeds in three movements. Chapters 1 and 2 close the door on detection and cosmetic redesign, establishing why both fail under adversarial conditions. Chapters 3 through 6 develop the adversarial assessment framework, moving from conceptual reframing through practical design patterns, institutional scaling, and accreditation alignment. Chapters 7 through 9 address leadership decisions, implementation sequencing, and the argument that the underlying evidence challenge extends well beyond generative AI.
Written for provosts, deans, assessment directors, accreditation liaisons, institutional effectiveness officers, and graduate students in higher education programs, the book includes an accreditation crosswalk mapping adversarial exposure across SACSCOC, HLC, MSCHE, and ABET standards, an institutional self-assessment diagnostic designed for committee use, and a glossary of key terms.
The argument is conservative in the best sense. It asks institutions to do what validity theory has always said they should do: verify the inference chain connecting observed performance to institutional claims about learning. Generative AI simply made the cost of not doing so visible.