Merge branch '2noise:main' into webui
cronrpc authored May 29, 2024
2 parents d53272c + f4c8329 commit 8a544a7
Showing 3 changed files with 14 additions and 0 deletions.
8 changes: 8 additions & 0 deletions ChatTTS/core.py
@@ -95,6 +95,9 @@ def _load(
assert gpt_ckpt_path, 'gpt_ckpt_path should not be None'
gpt.load_state_dict(torch.load(gpt_ckpt_path, map_location='cpu'))
self.pretrain_models['gpt'] = gpt
spk_stat_path = os.path.join(os.path.dirname(gpt_ckpt_path), 'spk_stat.pt')
assert os.path.exists(spk_stat_path), f'Missing spk_stat.pt: {spk_stat_path}'
self.pretrain_models['spk_stat'] = torch.load(spk_stat_path).to(device)
self.logger.log(logging.INFO, 'gpt loaded.')

if decoder_config_path:
@@ -144,6 +147,11 @@ def infer(
wav = [self.pretrain_models['vocos'].decode(i).cpu().numpy() for i in mel_spec]

return wav

    def sample_random_speaker(self, ):

        dim = self.pretrain_models['gpt'].gpt.layers[0].mlp.gate_proj.in_features
        std, mean = self.pretrain_models['spk_stat'].chunk(2)
        return torch.randn(dim, device=std.device) * std + mean
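
The new `sample_random_speaker` method draws a speaker embedding by scaling standard-normal noise with a stored per-dimension `std` and shifting it by a stored `mean`. The following is a minimal, dependency-free sketch of that arithmetic, not the project's implementation: it swaps tensors for plain Python lists, assumes the statistics vector has even length with the `std` half first (as the unpacking order in the diff suggests), and uses made-up numbers. The helper `split_stats` and the `rng` parameter are hypothetical names introduced for illustration.

```python
import random


def split_stats(spk_stat):
    """Split a flat vector into two equal halves (assumed order: std, then mean).

    This mirrors what chunk(2) does for an even-length 1-D tensor.
    """
    half = len(spk_stat) // 2
    return spk_stat[:half], spk_stat[half:]


def sample_random_speaker(spk_stat, rng=None):
    """Draw one embedding: standard-normal noise * std + mean, per dimension."""
    rng = rng or random.Random()
    std, mean = split_stats(spk_stat)
    return [rng.gauss(0.0, 1.0) * s + m for s, m in zip(std, mean)]


# Hypothetical statistics for a 4-dimensional embedding:
# first half is std, second half is mean.
spk_stat = [0.5, 0.5, 0.0, 2.0, 1.0, -1.0, 3.0, 0.0]
emb = sample_random_speaker(spk_stat, random.Random(0))
```

A dimension whose `std` is zero always comes out equal to its `mean`, which is a quick sanity check that the two halves are being read in the intended order.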


3 changes: 3 additions & 0 deletions README.md
@@ -126,3 +126,6 @@ In the current released model, the only token-level control units are [laugh], [
- [fish-speech](https://github.com/fishaudio/fish-speech) reveals the capability of GVQ as an audio tokenizer for LLM modeling.
- [vocos](https://github.com/gemelo-ai/vocos) which is used as a pretrained vocoder.

---
## Special Appreciation
- [wlu-audio lab](https://audio.westlake.edu.cn/) for early algorithm experiments.
3 changes: 3 additions & 0 deletions README_CN.md
@@ -128,3 +128,6 @@ audio_array_en = chat.infer(inputs_en, params_refine_text=params_refine_text)
- [fish-speech](https://github.com/fishaudio/fish-speech), an excellent autoregressive TTS model, reveals the potential of GVQ for LLM tasks.
- [vocos](https://github.com/gemelo-ai/vocos) serves as the vocoder in the model.

---
## Special Appreciation
- [wlu-audio lab](https://audio.westlake.edu.cn/) for supporting our early algorithm experiments.
