Commit

Create tokenizer_config.json
Stardust-minus authored Oct 11, 2023
1 parent 95dc34f commit 7273c08
Showing 1 changed file with 10 additions and 0 deletions.
10 changes: 10 additions & 0 deletions bert/bert-base-japanese-v3/tokenizer_config.json
@@ -0,0 +1,10 @@
{
  "tokenizer_class": "BertJapaneseTokenizer",
  "model_max_length": 512,
  "do_lower_case": false,
  "word_tokenizer_type": "mecab",
  "subword_tokenizer_type": "wordpiece",
  "mecab_kwargs": {
    "mecab_dic": "unidic_lite"
  }
}
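
The config above selects a two-stage tokenizer: MeCab word segmentation with the `unidic_lite` dictionary, followed by WordPiece subword splitting, with no lowercasing and a 512-token limit. As a minimal sketch of how such a file could be sanity-checked before use, the Python snippet below parses the same JSON with the standard library only. The `EXPECTED_KEYS` set and the `load_tokenizer_config` helper are hypothetical names introduced for illustration; they are not part of this repository or of any library.

```python
import json

# Contents of tokenizer_config.json as added in this commit.
CONFIG_TEXT = """
{
  "tokenizer_class": "BertJapaneseTokenizer",
  "model_max_length": 512,
  "do_lower_case": false,
  "word_tokenizer_type": "mecab",
  "subword_tokenizer_type": "wordpiece",
  "mecab_kwargs": {
    "mecab_dic": "unidic_lite"
  }
}
"""

# Assumption: the keys this sketch checks for. This is an illustrative
# list taken from the file itself, not a schema published by a library.
EXPECTED_KEYS = {
    "tokenizer_class",
    "model_max_length",
    "do_lower_case",
    "word_tokenizer_type",
    "subword_tokenizer_type",
    "mecab_kwargs",
}


def load_tokenizer_config(text):
    """Parse the JSON text and fail loudly if an expected key is absent."""
    config = json.loads(text)
    missing = EXPECTED_KEYS - set(config)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return config


config = load_tokenizer_config(CONFIG_TEXT)

# Word-level segmentation runs first, then subword splitting.
print("word tokenizer:   ", config["word_tokenizer_type"])
print("subword tokenizer:", config["subword_tokenizer_type"])
print("MeCab dictionary: ", config["mecab_kwargs"]["mecab_dic"])
```

If the Hugging Face `transformers` package is installed, `AutoTokenizer.from_pretrained` pointed at the `bert/bert-base-japanese-v3` directory would normally read this file. That path is not exercised here, and it presumably also needs the model's vocabulary file plus the `fugashi` and `unidic-lite` packages for the MeCab backend.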
