I am a PhD student at the College of Computer Science and Technology, Zhejiang University (浙江大学计算机学院).
I am now working on the Audio Research Team at Zhejiang University, under the supervision of Prof. Zhou Zhao (赵洲). My current research focuses on spatial audio generation based on multi-modal prompts.
I graduated from Chu Kochen Honors College, Zhejiang University (浙江大学竺可桢学院), with dual bachelor's degrees in Computer Science and Automation.
I also worked as a visiting scholar at the University of Massachusetts Amherst, collaborating with Prof. Przemyslaw Grabowicz.
My research interests center on Multi-Modal Generative AI, specifically Singing and Music Synthesis and Spatial Audio Generation. I have published first-author papers at top international AI conferences, including NeurIPS, AAAI, and EMNLP.
I am actively seeking postdoctoral positions and research collaborations. Please feel free to contact me via email at [email protected].
- Personal Pages: https://aaronz345.github.io (updated recently🔥)
- LinkedIn: www.linkedin.com/in/yuzhang34
- Google Scholar: https://scholar.google.com/citations?user=kA9A6LsAAAAJ
- DBLP: https://dblp.org/pid/50/671-126.html
NeurIPS 2024 Spotlight
GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks, Yu Zhang, Changhao Pan, Wenxiang Guo, et al.
EMNLP 2024
TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control, Yu Zhang, Ziyue Jiang, Ruiqi Li, et al.
AAAI 2024
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis, Yu Zhang, Rongjie Huang, Ruiqi Li, et al.
AAAI 2025
TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching, Wenxiang Guo, Yu Zhang, Changhao Pan, et al.
ACL 2024
Robust Singing Voice Transcription Serves Synthesis, Ruiqi Li, Yu Zhang, Yongqi Wang, et al.