Chun-Yi Kuan


Hello! I’m Chun-Yi, a first-year Ph.D. student at the NTU Speech Processing and Machine Learning (SPML) Lab, supervised by Prof. Hung-yi Lee.

My research focuses on multimodal large language models, exploring how to establish robust audio-language alignment to address recent trustworthiness issues, such as hallucinations related to sound events in audio. I’m also involved in phases 1 and 2 of the Dynamic-SUPERB project, which benchmarks large audio-language models across universal speech, audio, and music tasks.

My previous research centered on text-guided speech generation, investigating how to use textual information to guide the generation of high-quality speech with the desired style and prosody.

news

May 28, 2025 Our paper, “Teaching Audio-Aware Large Language Models What Does Not Hear: Mitigating Hallucinations through Synthesized Negative Samples”, has been accepted to Interspeech 2025 🇳🇱.
Jul 16, 2024 🚀 Excited to share our real-world deployment of LLMs as automatic assignment evaluators in our Intro to Generative AI course at NTU, with over 1000 students! The effort was led by Prof. Hung-yi Lee, with tremendous contributions from Cheng-Han Chiang as the head TA; his dedication was crucial to the success of this work. Check out our findings and insights here: https://arxiv.org/abs/2407.05216
Apr 15, 2024 Excited to share 🔱 Speech Trident: Awesome Speech LM

selected publications

  1. Teaching Audio-Aware Large Language Models What Does Not Hear: Mitigating Hallucinations through Synthesized Negative Samples
    Chun-Yi Kuan and Hung-yi Lee
    2025
  2. Gender Bias in Instruction-Guided Speech Synthesis Models
    Chun-Yi Kuan and Hung-yi Lee
    2025
  3. Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning
    Chun-Yi Kuan and Hung-yi Lee
    In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025
  4. Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation
    Chun-Yi Kuan, Chih-Kai Yang, Wei-Ping Huang, and 2 more authors
    2024
  5. Large Language Model as an Assignment Evaluator: Insights, Feedback, and Challenges in a 1000+ Student Course
    Cheng-Han Chiang, Wei-Chih Chen, Chun-Yi Kuan, and 2 more authors
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Nov 2024
  6. Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models
    Chun-Yi Kuan, Wei-Ping Huang, and Hung-yi Lee
    Nov 2024
  7. Towards General-Purpose Text-Instruction-Guided Voice Conversion
    Chun-Yi Kuan, Chen-An Li, Tsu-Yuan Hsu, and 5 more authors
    In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Nov 2023