Commit

support all bert variation
zhanlaoban committed Dec 29, 2019
1 parent b7e7cee commit ecfd759
Showing 68 changed files with 26,367 additions and 40 deletions.
57 changes: 51 additions & 6 deletions README.md
@@ -6,7 +6,7 @@



## Content
# Content

- dataset: stores the datasets
- pretrained_models: stores the pre-trained models
@@ -15,15 +15,60 @@



# TODO
# Support

- [x] bert+fc
- [ ] bert+cnn
- [ ] bert+lstm
- [ ] bert+gru
- [x] bert+cnn
- [x] bert+lstm
- [x] bert+gru
- [ ] xlnet
- [ ] xlnet+cnn
- [ ] xlnet+lstm
- [ ] xlnet+gru



# Usage

Modify the processor, following the example for the THUNews dataset.
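## 1. Using different models

**Set the `model_type` parameter in the shell file to select the model.**

For example, for BERT followed by a fully connected (FC) layer, set `model_type=bert`; for BERT followed by a CNN convolutional layer, set `model_type=bert_cnn`.

The `model_type` column of the `Performance` table in this README lists the `model_type` values supported for each pre-trained model in this project.

Finally, run the shell file directly in a terminal, for example: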

```
bash run_classifier.sh
```
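
The naming convention described above (`bert` for BERT plus an FC layer, `bert_cnn` for BERT plus a CNN layer) can be read as "encoder name, then an optional head suffix". The sketch below only illustrates that reading; `parse_model_type` and `KNOWN_HEADS` are hypothetical names, not code from this repository, and the FC default is stated in this README only for `bert`.

```python
# Hypothetical helper, not taken from this repository: it only illustrates
# how a model_type string such as "bert_cnn" can be read.
KNOWN_HEADS = {"fc", "cnn", "lstm", "gru"}


def parse_model_type(model_type):
    """Split a model_type string into (encoder, head)."""
    # "bert_cnn" -> ("bert", "cnn"); "bert" has no suffix, so head is "".
    encoder, _, head = model_type.partition("_")
    # Assumption: no suffix means the plain FC head, as the README says for `bert`.
    head = head or "fc"
    if head not in KNOWN_HEADS:
        raise ValueError(f"unknown head in model_type: {model_type!r}")
    return encoder, head


for name in ("bert", "bert_cnn", "bert_lstm", "bert_gru"):
    print(name, "->", parse_model_type(name))
```

How the actual scripts dispatch on `model_type` is not shown in this diff, so treat the mapping as a reading aid only.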

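## 2. Using a custom dataset

1. Put your custom dataset folder, e.g. `TestData`, inside the `dataset` folder.
2. In `utils.py` in the root directory, write your own class modeled on `class THUNewsProcessor`, e.g. `class TestDataProcessor`, and add the corresponding entries to the three dicts `tasks_num_labels`, `processors`, and `output_modes`.
3. Finally, in the shell file you want to run, set `TASK_NAME` to your task name, e.g. `TestData`.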
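
The steps above might look roughly like the following. This is a minimal sketch under stated assumptions, not the repository's `THUNewsProcessor`: the `InputExample` container, the method names, the `label,text` column order, the header row, the two-label set, and the `"classification"` output mode are all assumptions made for illustration. Only the class name `TestDataProcessor`, the three dict names, and the `train.csv`/`dev.csv`/`test.csv` file names come from this commit.

```python
import csv
import os
import tempfile
from collections import namedtuple

# Hypothetical stand-in for whatever example container utils.py really uses.
InputExample = namedtuple("InputExample", ["guid", "text_a", "label"])


class TestDataProcessor:
    """Sketch of a processor for dataset/TestData (assumed CSV layout)."""

    def get_train_examples(self, data_dir):
        return self._read_split(data_dir, "train")

    def get_dev_examples(self, data_dir):
        return self._read_split(data_dir, "dev")

    def get_test_examples(self, data_dir):
        return self._read_split(data_dir, "test")

    def get_labels(self):
        # Assumed label set; replace with the labels of your own dataset.
        return ["0", "1"]

    def _read_split(self, data_dir, split):
        path = os.path.join(data_dir, split + ".csv")
        examples = []
        with open(path, newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            next(reader, None)  # assumption: the first row is a header
            for i, row in enumerate(reader):
                label, text = row[0], row[1]  # assumption: label,text order
                examples.append(
                    InputExample(guid=f"{split}-{i}", text_a=text, label=label)
                )
        return examples


# The three dicts named in step 2. In the repository they already exist in
# utils.py; they are created empty here so the sketch runs on its own.
tasks_num_labels = {}
processors = {}
output_modes = {}

# Assumption: the key matches TASK_NAME exactly. If the script lowercases
# the task name, the key would need to be lowercase as well.
tasks_num_labels["TestData"] = len(TestDataProcessor().get_labels())
processors["TestData"] = TestDataProcessor
output_modes["TestData"] = "classification"  # assumed value

# Tiny self-check: write a two-row CSV and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    with open(os.path.join(tmp_dir, "train.csv"), "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(
            [["label", "text"], ["0", "first sample"], ["1", "second sample"]]
        )
    demo_examples = TestDataProcessor().get_train_examples(tmp_dir)
```

Check the real `THUNewsProcessor` in `utils.py` for the actual base class and method signatures before copying any of this.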



# Performance

| model_type | F1   | remark            |
| ---------- | ---- | ----------------- |
| bert       |      | BERT + FC layer   |
| bert_cnn   |      | BERT + CNN layer  |
| bert_lstm  |      | BERT + LSTM layer |
| bert_gru   |      | BERT + GRU layer  |
| xlnet      |      |                   |
| albert     |      |                   |



# Download Chinese Pre-trained Models

[NLP_PEMDC](https://github.com/zhanlaoban/NLP_PEMDC)




2 changes: 2 additions & 0 deletions dataset/THUNews/5_100/README.MD
@@ -0,0 +1,2 @@
5 classes in total
100 samples per class, split 80/10/10 into train/dev/test
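
As a quick arithmetic check of that split, the sketch below computes the expected row counts. `split_sizes` is a hypothetical helper, and reading the `5_5000` directory name as "5 classes, 5000 samples each" is an assumption; only the 5-class, 100-per-class, 80/10/10 figures come from this README.

```python
def split_sizes(num_classes, per_class, ratios=(80, 10, 10)):
    """Return the number of rows in train/dev/test for a given split ratio."""
    total = num_classes * per_class
    names = ("train", "dev", "test")
    # Integer arithmetic is exact here because the totals divide evenly.
    return {name: total * r // sum(ratios) for name, r in zip(names, ratios)}


sizes_5_100 = split_sizes(5, 100)
sizes_5_5000 = split_sizes(5, 5000)  # assumed meaning of the "5_5000" folder
print(sizes_5_100)   # {'train': 400, 'dev': 50, 'test': 50}
print(sizes_5_5000)  # {'train': 20000, 'dev': 2500, 'test': 2500}
```

The line counts reported in this commit (401/51/51 and 20,001/2,501/2,501) are each one higher than these values, which would be consistent with one extra line per file, such as a header row. The file contents are not rendered here, so that reading is unconfirmed.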
51 changes: 51 additions & 0 deletions dataset/THUNews/5_100/dev.csv

Large diffs are not rendered by default.

51 changes: 51 additions & 0 deletions dataset/THUNews/5_100/test.csv

Large diffs are not rendered by default.

401 changes: 401 additions & 0 deletions dataset/THUNews/5_100/train.csv

Large diffs are not rendered by default.

2,501 changes: 2,501 additions & 0 deletions dataset/THUNews/5_5000/dev.csv

Large diffs are not rendered by default.

2,501 changes: 2,501 additions & 0 deletions dataset/THUNews/5_5000/test.csv

Large diffs are not rendered by default.

20,001 changes: 20,001 additions & 0 deletions dataset/THUNews/5_5000/train.csv

Large diffs are not rendered by default.
