Commit

upgrade README for paddlenlp 2.0
ZeyuChen committed Feb 9, 2021
1 parent 5f0821c commit 433869d
Showing 2 changed files with 28 additions and 23 deletions.
23 changes: 12 additions & 11 deletions README.md
@@ -12,18 +12,19 @@

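## Introduction

PaddleNLP 2.0 offers a rich model zoo, a concise and easy-to-use API, and high-performance distributed training. It aims to make text modeling more efficient for PaddlePaddle developers and provides NLP best practices built on PaddlePaddle 2.0.
PaddleNLP 2.0 offers **a model zoo covering multiple scenarios**, **a concise, easy-to-use, end-to-end API**, and **high-performance distributed training with unified dynamic and static graphs**. It aims to make text-domain modeling more efficient for PaddlePaddle developers and provides NLP best practices built on PaddlePaddle 2.0.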

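## Features

- **Rich Model Zoo**
- Covers state-of-the-art models for mainstream NLP applications, including Chinese word embeddings, pretrained models, lexical analysis, text classification, text matching, text generation, machine translation, general dialogue, and question answering. For details, see the [PaddleNLP Model Zoo](./docs/model_zoo.md).
- **Model Zoo Covering Multiple Scenarios**
- PaddleNLP integrates mainstream model architectures such as RNN and Transformer. It covers basic NLP techniques, from [word embeddings](./exmaples/word_embedding), [lexical analysis](./examples/lexical_analysis/), [named entity recognition](./examples/named_entity_recognition/), and [semantic representation](./examples/language_model), through to core NLP techniques such as [text classification](./examples/text_classification), [text matching](./examples/text_matching), [text generation](./examples/text_generation), and [text graph learning](./examples/text_graph/erniesage/README.md). It also provides core components and pretrained models for system applications such as [machine translation](./examples/machine_translation), [general dialogue](./examples/dialogue/), and [reading comprehension](./exampels/machine_reading_comprehension/). For details, see the [PaddleNLP application examples](./examples/).

- **Concise and Easy-to-use API**
- Deeply compatible with the PaddlePaddle 2.0 high-level API system. It provides reusable text modeling modules that substantially reduce the code needed for data processing, network construction, and training, improving text modeling productivity.

- **High-performance Distributed Training**
- With a deeply optimized mixed precision training strategy and the Fleet distributed training API, it can fully utilize GPU cluster resources to efficiently complete distributed training of large-scale pretrained models.
- **Concise, Easy-to-use, End-to-End API**
- Deeply compatible with the PaddlePaddle 2.0 [high-level API](https://www.paddlepaddle.org.cn/documentation/docs/zh/tutorial/quick_start/high_level_api/high_level_api.html) system, with built-in reusable text modeling modules ([Embedding](./docs/embeddings.md), [CRF](./paddlenlp/layers/crf.py), [Seq2Vec](./paddlenlp/seq2vec/crf.py), [Transformer](./docs/transformers.md)). It substantially reduces the development effort for data processing, model construction, training and evaluation, and inference deployment, speeding up the iteration and deployment of NLP tasks.

- **High-performance Distributed Training with Unified Dynamic and Static Graphs**
- Building on the unified dynamic and static graph feature of the PaddlePaddle 2.0 core framework and its leading mixed precision optimization strategy, combined with the Fleet distributed training API, it can fully utilize GPU cluster resources to efficiently complete distributed training of large-scale pretrained models.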


## Installation
@@ -81,10 +82,10 @@ gpt2 = GPT2ForPretraining.from_pretrained('gpt2-base-cn')
## Model Zoo and Applications

For an overview of the PaddleNLP model zoo, see [PaddleNLP Model Zoo](./docs/model_zoo.md).
For model application scenarios, see [PaddleNLP Examples](./examples/README.md).
For model application scenarios, see [PaddleNLP Examples](./examples/).

- [Word Embedding](./examples/word_embedding/README.md)
- [Lexical Analysis](./examples/lexical_analysis/README.md)
- [Word Embedding](./examples/word_embedding/)
- [Lexical Analysis](./examples/lexical_analysis/)
- [Language Model](./examples/language_model)
- [Text Classification](./examples/text_classification/README.md)
- [Text Generation](./examples/text_generation/README.md)
@@ -108,7 +109,7 @@ For an overview of the PaddleNLP model zoo, see [PaddleNLP Model Zoo](./docs/model_
- [Dataset API](./docs/datasets.md)
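* Dataset APIs, covering custom datasets, dataset contribution, and fast dataset loading.
- [Embedding API](./docs/embeddings.md)
* Word embedding APIs, covering one-step loading of pretrained Chinese word embeddings, VisualDL high-dimensional visualization, and related features.
* Word embedding APIs, covering one-step loading of pretrained Chinese word embeddings, VisulDL high-dimensional visualization, and related features.
- [Metrics API](./docs/metrics.md)
* Evaluation metrics for NLP scenarios, compatible with the high-level API of the PaddlePaddle 2.0 framework.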

28 changes: 16 additions & 12 deletions README_en.md
@@ -12,21 +12,19 @@ English | [简体中文](./README.md)

## Introduction

PaddleNLP aims to accelerate NLP applications through a powerful model zoo and an easy-to-use API with detailed tutorials. It also serves as the NLP best practice for the PaddlePaddle 2.0 API system.
PaddleNLP aims to accelerate NLP applications through a powerful model zoo, an easy-to-use API, and high-performance distributed training. It also serves as the NLP best practice for the PaddlePaddle 2.0 API system.

## Features

* **Rich and Powerful Model Zoo**
- Our Model Zoo covers mainstream NLP applications, including Lexical Analysis, Syntactic Parsing, Machine Translation, Text Classification, Text Generation, Text Matching, General Dialogue, and Question Answering.

* **Easy-to-use API**
* **Easy-to-use and End-to-End API**
- The API is fully integrated with the PaddlePaddle high-level API system. It minimizes the number of user actions required for common use cases like data loading, text pre-processing, training, and evaluation, which enables you to deal with text problems more productively.

* **High Performance and Large-scale Training**
- We provide a highly optimized distributed training implementation for BERT with the Fleet API; it can fully utilize GPU clusters for large-scale model pre-training. Please refer to our [benchmark](./benchmark/bert) for more information.
* **High Performance and Distributed Training**
- We provide a highly optimized distributed training implementation for BERT with the Fleet API. Based on the mixed precision training strategy of PaddlePaddle 2.0, it can fully utilize GPU clusters for large-scale model pre-training.

* **Detailed Tutorials and Industrial Practices**
- We offer detailed, interactive notebook tutorials to show you the best practices of PaddlePaddle 2.0.

## Installation

@@ -64,16 +62,18 @@ wordemb.cosine_sim("艺术", "火车")
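
The `wordemb.cosine_sim` call in the hunk context above compares two word embeddings. As a rough illustration of what a cosine similarity computes, here is a minimal pure-Python sketch. It is not PaddleNLP's implementation: the standalone `cosine_sim` function and the toy vectors are made up for illustration.

```python
import math


def cosine_sim(u, v):
    """Cosine similarity between two equal-length vectors (illustrative helper)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy 3-dimensional vectors standing in for two word embeddings (made-up values).
vec_a = [1.0, 2.0, 0.0]
vec_b = [2.0, 4.0, 0.0]
print(cosine_sim(vec_a, vec_b))
```

Vectors pointing in the same direction score close to 1, orthogonal vectors score 0, and opposite vectors score close to -1, regardless of their lengths.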

### Rich Chinese Pre-trained Models


```python
from paddlenlp.transformers import ErnieModel, BertModel, RobertaModel, ElectraModel
from paddlenlp.transformers import ErnieModel, BertModel, RobertaModel, ElectraModel, GPT2ForPretraining

ernie = ErnieModel.from_pretrained('ernie-1.0')
bert = BertModel.from_pretrained('bert-wwm-chinese')
roberta = RobertaModel.from_pretrained('roberta-wwm-ext')
electra = ElectraModel.from_pretrained('chinese-electra-small')
gpt2 = GPT2ForPretraining.from_pretrained('gpt2-base-cn')
```

For more pretrained model selection, please refer to [Pretrained-Models](./paddlenlp/transformers/README.md)
For more pretrained model selection, please refer to [Pretrained-Models](./docs/transformers.md)

## Model Zoo and Applications

@@ -89,6 +89,10 @@ For more pretrained model selection, please refer to [Pretrained-Models](./paddl
- [Machine Translation](./exmaples/machine_translation)
- [Question Answering](./exmaples/machine_reading_comprehension)
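
Several relative links in this diff spell the examples directory as `./exmaples/` or `./exampels/`, while others use `./examples/`. A small check along the following lines could flag relative Markdown link targets that do not exist in a checkout. This is only a sketch: the helper name `missing_relative_links` and the regular expression are illustrative assumptions, not part of PaddleNLP or its tooling, and the regex handles plain inline links only.

```python
import re
from pathlib import Path

# Matches inline Markdown links of the form [text](target) and captures the target.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def missing_relative_links(markdown_text, repo_root):
    """Return relative link targets in markdown_text that do not exist under repo_root."""
    root = Path(repo_root)
    missing = []
    for target in LINK_RE.findall(markdown_text):
        # Skip absolute URLs, in-page anchors, and mail links.
        if "://" in target or target.startswith(("#", "mailto:")):
            continue
        path = target.split("#", 1)[0]  # drop any fragment after the path
        if not (root / path).exists():
            missing.append(target)
    return missing


# One list item taken from the diff, checked against the current directory.
sample = "- [Machine Translation](./exmaples/machine_translation)"
print(missing_relative_links(sample, "."))
```

Running the function over the full text of `README.md` and `README_en.md` from the repository root would list every relative link whose target is missing.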

## Advanced Application

- [Model Compression](./examples/model_compression/)

## API Usage

- [Transformer API](./docs/transformers.md)
@@ -102,13 +106,13 @@ For more pretrained model selection, please refer to [Pretrained-Models](./paddl

Please refer to our official AI Studio account for more interactive tutorials: [PaddleNLP on AI Studio](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/574995)

* [What's Seq2Vec?](https://aistudio.baidu.com/aistudio/projectdetail/1283423) shows how to use LSTM to do sentiment analysis.
* [What's Seq2Vec?](https://aistudio.baidu.com/aistudio/projectdetail/1283423) shows how to use a simple API to build an LSTM model and solve a sentiment analysis task.

* [Sentiment Analysis with ERNIE](https://aistudio.baidu.com/aistudio/projectdetail/1294333) shows how to exploit the pretrained ERNIE to make sentiment analysis better.
* [Sentiment Analysis with ERNIE](https://aistudio.baidu.com/aistudio/projectdetail/1294333) shows how to exploit the pretrained ERNIE to solve a sentiment analysis problem.

* [Waybill Information Extraction with BiGRU-CRF Model](https://aistudio.baidu.com/aistudio/projectdetail/1317771) shows how to make use of bigru and crf to do information extraction.
* [Waybill Information Extraction with BiGRU-CRF Model](https://aistudio.baidu.com/aistudio/projectdetail/1317771) shows how to use Bi-GRU plus CRF for an information extraction task.

* [Waybill Information Extraction with ERNIE](https://aistudio.baidu.com/aistudio/projectdetail/1329361) shows how to exploit the pretrained ERNIE to do information extraction better.
* [Waybill Information Extraction with ERNIE](https://aistudio.baidu.com/aistudio/projectdetail/1329361) shows how to use ERNIE, the Chinese pre-trained model, to improve information extraction performance.


## Community