Chun-Yi Kuan


Hello! I’m Chun-Yi, a second-year Ph.D. student at the NTU Speech Processing and Machine Learning (SPML) Lab, supervised by Prof. Hung-yi Lee.

My research focuses on multimodal large language models, exploring how to establish robust audio-language alignment to address recent trustworthiness issues, such as hallucinations about sound events in audio.

My previous research centered on text-guided speech generation, investigating how textual information can guide the generation of high-quality speech with the desired style and prosody.

news

Jan 18, 2026 Our paper, “AQUA-Bench: Beyond Finding Answers to Knowing When There Are None in Audio Question Answering,” has been accepted to ICASSP 2026. See you in Barcelona!
May 28, 2025 Our paper, “Teaching Audio-Aware Large Language Models What Does Not Hear: Mitigating Hallucinations through Synthesized Negative Samples,” has been accepted to Interspeech 2025 🇳🇱.
Apr 15, 2024 Excited to share 🔱 Speech Trident - Awesome Speech LM

selected publications

  1. In Progress
    AQAScore: Evaluating Semantic Alignment in Text-to-Audio Generation via Audio Question Answering
    Chun-Yi Kuan, Kai-Wei Chang, and Hung-yi Lee
    arXiv preprint arXiv:2601.14728, 2026
  2. AQUA-Bench: Beyond Finding Answers to Knowing When There Are None in Audio Question Answering
    Chun-Yi Kuan and Hung-yi Lee
    In 2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2026
  3. TASLP
    From Alignment to Advancement: Bootstrapping Audio-Language Alignment With Synthetic Data
    Chun-Yi Kuan and Hung-yi Lee
    In IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2025
  4. Teaching Audio-Aware Large Language Models What Does Not Hear: Mitigating Hallucinations through Synthesized Negative Samples
    Chun-Yi Kuan and Hung-yi Lee
    In 2025 Conference of the International Speech Communication Association (INTERSPEECH), 2025
  5. Gender Bias in Instruction-Guided Speech Synthesis Models
    Chun-Yi Kuan and Hung-yi Lee
    In 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics, 2025
  6. Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning
    Chun-Yi Kuan and Hung-yi Lee
    In 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025
  7. Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation
    Chun-Yi Kuan, Chih-Kai Yang, Wei-Ping Huang, and 2 more authors
    In IEEE Spoken Language Technology Workshop 2024 (SLT), 2024
  8. Large Language Model as an Assignment Evaluator: Insights, Feedback, and Challenges in a 1000+ Student Course
    Cheng-Han Chiang, Wei-Chih Chen, Chun-Yi Kuan, and 2 more authors
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Nov 2024
  9. Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models
    Chun-Yi Kuan, Wei-Ping Huang, and Hung-yi Lee
    In 2024 Conference of the International Speech Communication Association (INTERSPEECH), Nov 2024
  10. Towards General-Purpose Text-Instruction-Guided Voice Conversion
    Chun-Yi Kuan, Chen-An Li, Tsu-Yuan Hsu, and 5 more authors
    In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Nov 2023