diff --git a/pycorrector/conv_seq2seq/README.md b/pycorrector/conv_seq2seq/README.md
index 8f33ef75..709cdf53 100644
--- a/pycorrector/conv_seq2seq/README.md
+++ b/pycorrector/conv_seq2seq/README.md
@@ -1,4 +1,4 @@
-# Neural Text Error Correction with CNN Sequence-to-Sequence Model
+# Neural Text Error Correction with Conv Seq2Seq Model
 
 ## Features
 
@@ -25,16 +25,7 @@ The OOV words UNK in summaries are manually replaced with words in source articl
 cd conv_seq2seq
 python preprocess.py
 ```
-
-- big train data
-
-download from https://pan.baidu.com/s/1BkDru60nQXaDVLRSr7ktfA 密码:m6fg [130W sentence pair,215MB]
-
-
-
-generate toy train data(`train.src` and `train.trg`) and valid data(`val.src` and `val.trg`), segment by char
-
-
+result:
 ```
 # train.src:
 吸 烟 对 人 的 健 康 有 害 处 , 这 是 各 个 人 都 知 道 的 事 实 。
@@ -47,6 +38,15 @@ generate toy train data(`train.src` and `train.trg`) and valid data(`val.src` an
 如 服 装 , 若 有 一 个 很 流 行 的 样 式 , 人 们 就 赶 快 地 追 求 。
 ```
 
+- big train data
+
+download from https://pan.baidu.com/s/1BkDru60nQXaDVLRSr7ktfA 密码:m6fg [130W sentence pair,215MB]
+
+
+
+generate toy train data(`train.src` and `train.trg`) and valid data(`val.src` and `val.trg`), segment by char
+
+
 ## train