YuE: Scaling Open Foundation Models for Long-Form Music Generation Paper • 2503.08638 • Published Mar 11, 2025 • 71
Efficient Audio Captioning with Encoder-Level Knowledge Distillation Paper • 2407.14329 • Published Jul 19, 2024 • 5
SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound Paper • 2405.00233 • Published Apr 30, 2024 • 17
FlashSpeech: Efficient Zero-Shot Speech Synthesis Paper • 2404.14700 • Published Apr 23, 2024 • 32
AudioSR: Versatile Audio Super-resolution at Scale Paper • 2309.07314 • Published Sep 13, 2023 • 28
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining Paper • 2308.05734 • Published Aug 10, 2023 • 37
MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies Paper • 2308.01546 • Published Aug 3, 2023 • 18
WavJourney: Compositional Audio Creation with Large Language Models Paper • 2307.14335 • Published Jul 26, 2023 • 44
AudioLDM: Text-to-Audio Generation with Latent Diffusion Models Paper • 2301.12503 • Published Jan 29, 2023 • 1