I am pursuing my PhD in Computer Science at the WING lab of the National University of Singapore (NUS), with research interests in human-centered and socially aware Natural Language Processing. I am fortunate to be supervised by Prof Kan Min-Yen from NUS, Dr Nancy F. Chen from ASTAR, and Prof Shafiq Joty from Salesforce. I also collaborate closely with the SALT Lab under the guidance of Prof Diyi Yang from Stanford University, and I was a visiting PhD student at the Stanford SALT Lab and NLP Group. I received my bachelor’s degree from NUS.

My research interests include (1) Human-Centric and Data-Centric NLP, (2) NLP for Social Good and (3) Socially Aware NLP System. I have published papers at top-tier NLP conferences such as ACL and EMNLP.

I like playing piano🎹, table tennis🏓 and badminton🏸. I enjoy listening to music and watching talk shows because they deliver inspiration and positive energy through words and rhythms. I also like traveling and exploring the world and good food!😋

🔥 News

  • 2025.05: 🎉🎉 Our paper (as first author) Mind the Gap: Static and Interactive Evaluations of Large Audio Models was accepted to ACL 2025.
  • 2025.05: 🎉🎉 Our paper (as collaborator) Distilling an End-to-End Voice Assistant Without Instruction Training Data was accepted to ACL 2025.
  • 2024.11: 🎉🎉 Our paper (as first author) DnA-Eval: Enhancing Large Language Model Evaluation through Decomposition and Aggregation was accepted to COLING 2025.
  • 2024.11: 🎉🎉 I received the Google PhD Fellowship (NLP direction)!
  • 2024.09: 🎉🎉 I started my visit at Stanford!
  • 2024.05: 🎉🎉 Our paper (as first author) Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future was accepted to ACL 2024 Findings.
  • 2024.01: 🎉🎉 I received the Research Achievement Award from the School of Computing!
  • 2023.12: 🎉🎉 I passed my PhD Qualification Exam!
  • 2023.10: 🎉🎉 Our paper (as first author) CoAnnotating: Uncertainty-Guided Work Allocation between Human and Large Language Models for Data Annotation was accepted to the EMNLP 2023 Main Conference.
  • 2023.10: 🎉🎉 Our paper (as collaborator) Retrieving Multimodal Information for Augmented Generation: A Survey was accepted to EMNLP 2023 Findings.
  • 2022.03: 🎉🎉 Our paper (as co-first author) Inducing Positive Perspectives with Text Reframing was accepted to the ACL 2022 Main Conference. We also received the Outstanding Paper Award!

📝 Publications

🎙 First (Co-)Author

(5) Mind the Gap: Static and Interactive Evaluations of Large Audio Models

Minzhi Li, William Barr Held (equal contribution), Michael J Ryan, Kunat Pipatanakul, Potsawee Manakul, Hao Zhu, Diyi Yang

ACL 2025

  • We introduce an interactive approach to evaluating Large Audio Models (LAMs) and collect 7,500 LAM interactions. We identify primary use cases for audio interfaces and analyze user preference rankings and qualitative feedback to determine which models best align with user needs. We also evaluate how well static benchmarks predict interactive performance, revealing a clear need to develop LAM evaluations that better correlate with user preferences.

(4) DnA-Eval: Enhancing Large Language Model Evaluation through Decomposition and Aggregation

Minzhi Li, Zhengyuan Liu, Shumin Deng, Shafiq Joty, Nancy F Chen, Min-Yen Kan

COLING 2025

  • We propose Decompose and Aggregate, which breaks down the evaluation process into different stages based on pedagogical practices. Our experiments illustrate that it not only provides a more interpretable window into how well LLMs evaluate, but also leads to improvements of up to 39.6% for different LLMs on a variety of meta-evaluation benchmarks.

(3) Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future

Minzhi Li, Weiyan Shi, Caleb Ziems, Diyi Yang

ACL 2024 Findings

  • We build a Social AI Data Infrastructure, which consists of a comprehensive social AI taxonomy and a data library of 480 NLP datasets. Our infrastructure allows us to analyze existing dataset efforts and to evaluate language models’ performance in different aspects of social intelligence. Our analyses demonstrate its utility in enabling a thorough understanding of the current data landscape and providing a holistic perspective on potential directions for future dataset development.

(2) CoAnnotating: Uncertainty-Guided Work Allocation between Human and Large Language Models for Data Annotation

Minzhi Li, Taiwei Shi, Caleb Ziems, Min-Yen Kan, Nancy F. Chen, Zhengyuan Liu, Diyi Yang

EMNLP 2023

  • This work proposes CoAnnotating, a novel paradigm for Human-LLM co-annotation of unstructured texts at scale. Under this framework, we use uncertainty to estimate LLMs’ annotation capability. Our empirical study on different datasets shows CoAnnotating to be an effective means of allocating work, with up to 21% performance improvement over a random baseline.

(1) Inducing Positive Perspectives with Text Reframing

Caleb Ziems, Minzhi Li (equal contribution), Anthony Zhang, Diyi Yang

ACL 2022

  • We introduce a large-scale benchmark, Positive Psychology Frames, with 8,349 sentence pairs and 12,755 structured annotations to explain positive reframing in terms of six theoretically-motivated reframing strategies. Then we evaluate a set of state-of-the-art text style transfer models, and conclude by discussing key challenges and directions for future work.

🎙 Co-author

  • (1) Ruochen Zhao, Hailin Chen, Weishi Wang, Fangkai Jiao, Xuan Long Do, Chengwei Qin, Bosheng Ding, Xiaobao Guo, Minzhi Li, Xingxuan Li, Shafiq Joty, 2023. Retrieving Multimodal Information for Augmented Generation: A Survey. EMNLP Findings (2023): 2207397.

🎖 Honors and Awards

  • 2024.11 Google PhD Fellowship (NLP Direction)
  • 2024.01 Research Achievement Award
  • 2022.08 ACIS Scholarship Award
  • 2022.01 21/22 S1 Dean’s List Award
  • 2021.01 20/21 S1 Dean’s List Award
  • 2018.08 Science and Technology Merit Scholarship
  • 2013.11 MOE SM1 Scholarship

📖 Education

  • 2024.09 - 2025.09, Visiting Ph.D., Department of Computer Science, Stanford University
  • 2022.08 - Present, Ph.D., Department of Computer Science, School of Computing, National University of Singapore, Singapore
  • 2018.08 - 2022.05, B.S., Business Analytics, School of Computing, National University of Singapore, Singapore
  • 2014.01 - 2017.11, Dunman High School, Singapore
  • 2010.08 - 2013.05, Nanjing Foreign Language School, Nanjing, China

💬 Research Interests

  • (1) Human-Centric and Data-Centric NLP
  • (2) NLP for Social Good
  • (3) Socially Aware NLP System