I am a DPhil student in computer science at the University of Oxford, conducting research in Natural Language Processing and Machine Learning.
I am grateful to be advised by Philip Torr and Yarin Gal from Oxford and Hinrich Schütze from LMU Munich as a student of the European Laboratory for Learning and Intelligent Systems (ELLIS Society).
I've served on the program committee for AAAI 2023-2026, and as a reviewer for ACL ARR 2023-2025 and NeurIPS 2025.
Measuring what Matters: Construct Validity in Large Language Model Benchmarks PDF
Andrew M. Bean, Ryan Othniel Kearns, Angelika Romanou, Franziska Sofia Hafner, Harry Mayne, Jan Batzner, Negar Foroutan, Chris Schmitz, Karolina Korgul, Hunar Batra, Oishi Deb, Emma Beharry, Cornelius Emde, Thomas Foster, Anna Gausen, María Grandury, Simeng Han, Valentin Hofmann, Lujain Ibrahim, Hazel Kim, Hannah Rose Kirk, Fangru Lin, Gabrielle Kaili-May Liu, Lennart Luettgau, Jabez Magomere, Jonathan Rystrøm, Anna Sotnikova, Yushi Yang, Yilun Zhao, Adel Bibi, Antoine Bosselut, Ronald Clark, Arman Cohan, Jakob Nicolaus Foerster, Yarin Gal, Scott A. Hale, Inioluwa Deborah Raji, Christopher Summerfield, Philip Torr, Cozmin Ududec, Luc Rocher, Adam Mahdi
NeurIPS 2025 Datasets & Benchmarks
Detecting LLM Hallucination through Layer-wise Information Deficiency PDF
Hazel Kim, Tom A. Lamb, Adel Bibi, Philip Torr, Yarin Gal
EMNLP 2025
ATHENA: Mathematical Reasoning with Thought Expansion PDF
JB. Kim, Hazel Kim, Joonghyuk Hahn, Yo-Sub Han
EMNLP 2023
ALP: Data Augmentation Using Lexicalized PCFGs for Few-Shot Text Classification PDF
Hazel Kim, Daecheol Woo, Seong Joon Oh, Jeong-Won Cha, Yo-Sub Han
AAAI 2022
LST: Lexicon-Guided Self-Training for Few-Shot Text Classification PDF
{Hazel Kim, Jaeman Son}*, Yo-Sub Han
arXiv
Wonderful People I've Met: I was fortunate to start my research with consistent support from Yo-Sub Han. I gained great insights into what to research, and how, from Seong Joon Oh. I was delighted to work with Sangdoo Yun, whose advice was humble yet astute, and lucky to have inspiring discussions with Kangmin Yoo. I was happy to mentor the enthusiastic JB. Kim as he wrote his first paper. I enjoyed working on my first paper alongside Daecheol Woo, with his sincerity and positive mindset. I am thankful to have met many great people while conducting research! I appreciate all of my collaborators for supporting me in many different ways :)
Acknowledgement
This website uses the design and template by Martin Saveski.