megatts2+paper

2025-04-18 07:35:54


Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts, arXiv - CS...

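Zero-shot text-to-speech aims to synthesize voices from unseen speech prompts. Previous large-scale multi-speaker TTS models have achieved this goal using an enrolled recording of no more than 10 seconds. However, most of them are designed to use only short speech prompts. The limited information in a short speech prompt significantly hinders fine-grained identity imitation. In this paper, we introduce Mega-TTS 2, a generic zero-shot multi...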
...Mechanisms for Zero-Shot Speech Synthesis | Papers With Code

making it untransferable to each other. This paper introduces Mega-TTS 2, a generic prompting mechanism for zero-shot TTS, to tackle the aforementioned challenges. Specifically, we design a powerful acoustic autoencoder that separately encodes the prosody and timbre information into the compressed lat...