Emotion and style transfer by cloning. Cross-language voice cloning. Multi-lingual speech generation. 24 kHz sampling rate. Updates over XTTS-v1: 2 new languages (Hungarian and Korean). Architectural improvements for speaker conditioning. Enables the use of multiple speaker references and interpolation betwe...
Windows natural TTS voices are pretty good, but lack proper cadence and emotion. Also, I did not want to pay for an API; I wanted it to run locally. XTTS is excellent in that regard, and seems to pick up on cues without being fed any additional information. The sound quality is...
emotion.capitalize()
body = {
    "text": message.text,
    "name": "unnamed",
    "emotion": emotion,
}
if self.voice_prompt:
    body["prompt"] = self.voice_prompt
if self.voice_id:
    body["voice_id"] = self.voice_id
url, headers, body = self.get_request(message.text)
create_speech_span = ...
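The fragment above builds a request body in which some keys are always present and others are added only when set. A minimal standalone sketch of that pattern follows. The function name `build_speech_body` and its parameters are hypothetical, chosen only to mirror the keys visible in the fragment; capitalizing the emotion label is an assumption based on the `emotion.capitalize()` call, and the endpoint that consumes the body is not shown in the source.

```python
from typing import Optional


def build_speech_body(
    text: str,
    emotion: str,
    voice_prompt: Optional[str] = None,
    voice_id: Optional[str] = None,
) -> dict:
    """Assemble a request body (hypothetical helper mirroring the fragment's keys)."""
    body = {
        "text": text,
        "name": "unnamed",
        # Assumption: the service expects a capitalized emotion label.
        "emotion": emotion.capitalize(),
    }
    # Optional fields are included only when a value was provided.
    if voice_prompt:
        body["prompt"] = voice_prompt
    if voice_id:
        body["voice_id"] = voice_id
    return body
```

Calling `build_speech_body("Hello there", "happy", voice_id="v1")` yields a body with `text`, `name`, `emotion`, and `voice_id`, and no `prompt` key, so the receiving side can distinguish "not provided" from an empty value.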
First, it is a GUI editor for the Talkinghead emotion templates. Second, it can batch-generate static emotion sprites from a single Talkinghead image. The latter is useful if you want AI-powered posing (e.g. if you make new characters often), but don't want...
Default (6 emotions): nateraw/bert-base-uncased-emotion
Another solid option (28 emotions): joeddav/distilbert-base-uncased-go-emotions-student
For Chinese language: touch20032003/xuyuan-trial-sentiment-bert-chinese
--captioning-model: Load a custom captioning model. Expects a HuggingFace model ID....
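As a rough illustration of how one of the classification model IDs listed above might be used, the sketch below loads it with the Hugging Face `transformers` text-classification pipeline and picks the highest-scoring label. This is not taken from the source: the helper names `top_emotion` and `classify` are hypothetical, and the code assumes a recent `transformers` release in which `top_k=None` returns scores for every label.

```python
from typing import Dict, List

# Default model ID as listed above (6 emotions).
DEFAULT_MODEL = "nateraw/bert-base-uncased-emotion"


def top_emotion(scores: List[Dict]) -> str:
    """Return the label with the highest score from a list of {label, score} dicts."""
    return max(scores, key=lambda s: s["score"])["label"]


def classify(text: str, model_id: str = DEFAULT_MODEL) -> str:
    """Classify text with a HuggingFace model ID (downloads the model on first use)."""
    # Imported lazily so top_emotion can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id, top_k=None)
    result = classifier(text)
    # Depending on the library version, a single input may come back nested.
    if result and isinstance(result[0], list):
        result = result[0]
    return top_emotion(result)
```

The selection step can be tried without downloading anything by passing made-up scores, for example `top_emotion([{"label": "joy", "score": 0.9}, {"label": "anger", "score": 0.1}])`. Swapping in the 28-emotion or Chinese model is a matter of passing a different `model_id`.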