Publications

And once the storm is over, you won’t remember how you made it through, how you managed to survive. You won’t even be sure, in fact, whether the storm is really over. But one thing is certain. When you come out of the storm, you won’t be the same person who walked in. That’s what this storm’s all about. ― Haruki Murakami, Kafka on the Shore.

2026

  1. In Progress
    AQAScore: Evaluating Semantic Alignment in Text-to-Audio Generation via Audio Question Answering
    Chun-Yi Kuan, Kai-Wei Chang, and Hung-yi Lee
    arXiv preprint arXiv:2601.14728, 2026
  2. AQUA-Bench: Beyond Finding Answers to Knowing When There Are None in Audio Question Answering
    Chun-Yi Kuan and Hung-yi Lee
    In ICASSP 2026 - 2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2026

2025

  1. TASLP
    From Alignment to Advancement: Bootstrapping Audio-Language Alignment With Synthetic Data
    Chun-Yi Kuan and Hung-yi Lee
    In IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2025
  2. Teaching Audio-Aware Large Language Models What Does Not Hear: Mitigating Hallucinations through Synthesized Negative Samples
    Chun-Yi Kuan and Hung-yi Lee
    In 2025 Conference of the International Speech Communication Association (INTERSPEECH), 2025
  3. Gender Bias in Instruction-Guided Speech Synthesis Models
    Chun-Yi Kuan and Hung-yi Lee
    In 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics, 2025
  4. Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning
    Chun-Yi Kuan and Hung-yi Lee
    In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025

2024

  1. Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks
    Chien-yu Huang, Wei-Chih Chen, Shu-wen Yang, and 8 more authors
    In The Thirteenth International Conference on Learning Representations (ICLR 2025), 2024
  2. BRO 2024 SUM
    Building a Taiwanese Mandarin Spoken Language Model: A First Attempt
    Chih-Kai Yang, Yu-Kuan Fu, Chen-An Li, and 8 more authors
    2024
  3. Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation
    Chun-Yi Kuan, Chih-Kai Yang, Wei-Ping Huang, and 2 more authors
    In IEEE Spoken Language Technology Workshop 2024 (SLT), 2024
  4. Large Language Model as an Assignment Evaluator: Insights, Feedback, and Challenges in a 1000+ Student Course
    Cheng-Han Chiang, Wei-Chih Chen, Chun-Yi Kuan, and 2 more authors
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Nov 2024
  5. Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models
    Yi-Cheng Lin, Tzu-Quan Lin, Chih-Kai Yang, and 4 more authors
    In IEEE Spoken Language Technology Workshop 2024 (SLT), Nov 2024
  6. Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models
    Chun-Yi Kuan, Wei-Ping Huang, and Hung-yi Lee
    In 2024 Conference of the International Speech Communication Association (INTERSPEECH), Nov 2024
  7. Dynamic-SUPERB: Towards a Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech
    Chien-yu Huang, Ke-Han Lu, Shih-Heng Wang, and 8 more authors
    In ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Nov 2024

2023

  1. Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision
    Chih-Kai Yang, Kuan-Po Huang, Ke-Han Lu, and 3 more authors
    In ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Nov 2023
  2. Towards General-Purpose Text-Instruction-Guided Voice Conversion
    Chun-Yi Kuan, Chen-An Li, Tsu-Yuan Hsu, and 5 more authors
    In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Nov 2023