forked from fishaudio/Bert-VITS2
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 9 changed files with 21,230 additions and 4 deletions.
@@ -0,0 +1,9 @@
*.bin.* filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tar.gz filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
@@ -0,0 +1,57 @@
---
language:
- zh
tags:
- bert
license: "apache-2.0"
---

# Please use 'Bert' related functions to load this model!
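For example, a minimal loading sketch with the `transformers` library (the local directory name below is an illustrative assumption; point it at wherever the files added in this commit live):

```python
# Minimal sketch: load this checkpoint with the Bert* classes, as the model
# card requests. The local path is an assumption, not part of this commit.
from transformers import BertTokenizer, BertForMaskedLM

model_dir = "./chinese-roberta-wwm-ext-large"  # hypothetical local directory

tokenizer = BertTokenizer.from_pretrained(model_dir)
model = BertForMaskedLM.from_pretrained(model_dir)

inputs = tokenizer("使用整词掩码的中文预训练[MASK]模型。", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch, sequence_length, vocab_size=21128)
```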

## Chinese BERT with Whole Word Masking
To further accelerate Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.

**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**
Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu

This repository is developed based on https://github.com/google-research/bert

You may also be interested in:
- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm
- Chinese MacBERT: https://github.com/ymcui/MacBERT
- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA
- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet
- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer

More resources by HFL: https://github.com/ymcui/HFL-Anthology

## Citation
If you find the technical report or resources useful, please cite the following technical report in your paper.
- Primary: https://arxiv.org/abs/2004.13922
```
@inproceedings{cui-etal-2020-revisiting,
  title = "Revisiting Pre-Trained Models for {C}hinese Natural Language Processing",
  author = "Cui, Yiming and
    Che, Wanxiang and
    Liu, Ting and
    Qin, Bing and
    Wang, Shijin and
    Hu, Guoping",
  booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings",
  month = nov,
  year = "2020",
  address = "Online",
  publisher = "Association for Computational Linguistics",
  url = "https://www.aclweb.org/anthology/2020.findings-emnlp.58",
  pages = "657--668",
}
```
- Secondary: https://arxiv.org/abs/1906.08101
```
@article{chinese-bert-wwm,
  title={Pre-Training with Whole Word Masking for Chinese BERT},
  author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},
  journal={arXiv preprint arXiv:1906.08101},
  year={2019}
}
```
@@ -0,0 +1 @@
{}
@@ -0,0 +1,28 @@
{
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "directionality": "bidi",
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "output_past": true,
  "pad_token_id": 0,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 21128
}
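As a quick sanity check, the configuration above can be inspected with `transformers` (a sketch; it assumes the JSON is saved under the conventional name `config.json` in the working directory):

```python
# Sketch: read the config above and confirm the model shape it describes.
from transformers import BertConfig, BertForMaskedLM

config = BertConfig.from_json_file("config.json")  # path is an assumption
print(config.num_hidden_layers)  # 24 transformer layers
print(config.hidden_size)        # 1024-dimensional hidden states
print(config.vocab_size)         # 21128-token Chinese WordPiece vocabulary

# A randomly initialised model with this architecture (no weights loaded):
model = BertForMaskedLM(config)
print(sum(p.numel() for p in model.parameters()))  # total parameter count (BERT-large scale)
```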
@@ -0,0 +1 @@
{"unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]"}
Large diffs are not rendered by default.
@@ -0,0 +1 @@
{"init_inputs": []}