Exploring the Limits and Possibilities of Legal LLM Evaluation
Aileen Nielsen (Visiting Assistant Professor of Law, Harvard University)
April 22, 2026 6:00 pm - 7:00 pm
Academic Conference Room, 11/F Cheng Yu Tung Tower, The University of Hong Kong
AI Ethics and Governance
This article evaluates recent empirical efforts to benchmark large language model (LLM) performance on legal tasks. I analyze some of the methodological assumptions underlying these studies, drawing on contemporary computer science literature and social science principles of experimental design. Focusing on replicability, robustness, and construct validity, I then examine whether and how current benchmarking protocols can advance our understanding of appropriate methodological choices in assessing LLM capacity to perform legal work. I conclude by synthesizing existing best-practice recommendations for LLM legal benchmarking and proposing refinements to them.
Aileen Nielsen is a Visiting Assistant Professor at Harvard Law School, where she teaches privacy law and torts. Her research focuses on the interplay of law and technology, drawing on empirical methods and private law topics. She holds degrees in anthropology, physics, and law. She has written two trade books on machine learning and has also worked in industry as a data scientist. She is a member of the New York bar.
Moderator: Benjamin Chen, Associate Professor & Director of the Law and Technology Centre, The University of Hong Kong Faculty of Law
To register, please go to https://hkuems1.hku.hk/hkuems/ec_regform.aspx?guest=Y&UEID=105896. A paper will be circulated in advance and attendees will be expected to have read the paper before the seminar.
We are applying for a CPD point with the Law Society of Hong Kong.
For inquiries, please contact Ms. Grace Chan at mcgrace@hku.hk / 39174727.