Chun-Yi Kuan

Hello! I’m Chun-Yi, a master's student at the NTU Speech Processing and Machine Learning (SPML) Lab, supervised by Prof. Hung-yi Lee.

My research focuses on text-guided speech generation, exploring how textual information can guide the generation of high-quality speech with desired styles and prosody. I’m also involved in phases 1 and 2 of the Dynamic-SUPERB project, which benchmarks large audio-language models across universal speech and audio tasks. Beyond common speech and audio tasks, the project also strives to create universal tasks that test how well current large audio-language models generalize.

news

Jul 16, 2024 🚀 Excited to share our real-world application of LLMs as automatic assignment evaluators in our Intro to Generative AI course at NTU, which had over 1000 students! The work was led by Prof. Hung-yi Lee, with tremendous contributions from Cheng-Han Chiang as the head TA; his dedication was crucial to its success. Check out our findings and insights here: https://arxiv.org/abs/2407.05216
Apr 15, 2024 Excited to share 🔱 Speech Trident - Awesome Speech LM

selected publications

  1. Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning
    Chun-Yi Kuan and Hung-yi Lee
    In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025
  2. Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation
    Chun-Yi Kuan, Chih-Kai Yang, Wei-Ping Huang, and 2 more authors
    2024
  3. Large Language Model as an Assignment Evaluator: Insights, Feedback, and Challenges in a 1000+ Student Course
    Cheng-Han Chiang, Wei-Chih Chen, Chun-Yi Kuan, and 2 more authors
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Nov 2024
  4. Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models
    Chun-Yi Kuan, Wei-Ping Huang, and Hung-yi Lee
    Nov 2024
  5. Towards General-Purpose Text-Instruction-Guided Voice Conversion
    Chun-Yi Kuan, Chen-An Li, Tsu-Yuan Hsu, and 5 more authors
    In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Nov 2023