update english readme
ZeyuChen committed Jun 5, 2021
1 parent 3b02983 commit 07e7a6a
Showing 5 changed files with 67 additions and 77 deletions.
8 changes: 5 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
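Expand Up @@ -192,7 +192,7 @@ All PaddleNLP models are implemented with the new PaddlePaddle 2.0 API system, through

| Model | Description |
| :---------- | ------- |
| [STACL](./examples/simultaneous_translation/stacl) :star:| [STACL](https://www.aclweb.org/anthology/P19-1289/) is a simultaneous translation model based on the Prefix-to-Prefix framework, with some implicit anticipation ability; combined with the Wait-k policy, it achieves arbitrary word-level translation latency while maintaining high translation quality, and provides a visual demo|
| [STACL](./examples/simultaneous_translation/stacl) :star:| [STACL](https://www.aclweb.org/anthology/P19-1289/) is a simultaneous translation model based on the Prefix-to-Prefix framework; combined with the Wait-k policy, it achieves arbitrary word-level translation latency while maintaining high translation quality, and provides instructions for building a lightweight simultaneous interpretation system|

#### Dialogue System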

Expand Down Expand Up @@ -241,9 +241,7 @@ All PaddleNLP models are implemented with the new PaddlePaddle 2.0 API system, through
For more tutorials, see [PaddleNLP on AI Studio](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/574995)


## Version Updates

For more release notes, see the [ChangeLog](./docs/change_log.md)

## Community Contribution and Technical Exchange

Expand All @@ -259,6 +257,10 @@ All PaddleNLP models are implemented with the new PaddlePaddle 2.0 API system, through
<img src="./docs/imgs/qq.png" width="200" height="200" />
</div>

## Version Updates

For more release notes, see the [ChangeLog](./docs/changelog.md)

## License

PaddleNLP is released under the [Apache-2.0 open source license](./LICENSE)
101 changes: 60 additions & 41 deletions README_en.md
@@ -1,39 +1,45 @@
English | [简体中文](./README.md)

<p align="center">
<img src="./docs/imgs/paddlenlp.png" width="520" height ="100" />
<img src="./docs/imgs/paddlenlp.png" width="720" height ="100" />
</p>

---------------------------------------------------------------------------------

------------------------------------------------------------------------------------------
[![PyPI - PaddleNLP Version](https://img.shields.io/pypi/v/paddlenlp.svg?label=pip&logo=PyPI&logoColor=white)](https://pypi.org/project/paddlenlp/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/paddlenlp)](https://pypi.org/project/paddlenlp/)
[![PyPI Status](https://pepy.tech/badge/paddlenlp/month)](https://pepy.tech/project/paddlenlp)
![python version](https://img.shields.io/badge/python-3.6+-orange.svg)
![support os](https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-yellow.svg)
![GitHub](https://img.shields.io/github/license/paddlepaddle/paddlenlp)

## Introduction
## News <img src="./docs/imgs/news_icon.png" width="40"/>

PaddleNLP aims to accelerate NLP applications through a powerful model zoo, easy-to-use APIs, and high-performance distributed training. It is also the NLP best practice for the PaddlePaddle 2.0 API system.
* [2021-06-04] The [ERNIE-Gram](https://arxiv.org/abs/2010.12148) pre-trained model has been released! Install v2.0.2 to try it.
* [2021-05-20] PaddleNLP 2.0 has been officially released! :tada: For more information, please refer to the [Release Note](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.0.0).

## Features
## Introduction

* **Powerful Model Zoo for Rich Scenarios**
- Our Model Zoo covers mainstream NLP applications, including Lexical Analysis, Text Classification, Text Generation, Text Matching, Text Graph, Information Extraction, Machine Translation, General Dialogue, and Question Answering.
PaddleNLP 2.0 aims to accelerate NLP applications through a powerful model zoo, easy-to-use APIs, and high-performance distributed training. We also provide NLP best practices based on the PaddlePaddle 2.0 API system.

### Features

* **Easy-to-Use and End-to-End API**
- The API is fully integrated with the PaddlePaddle 2.0 high-level API system. It minimizes the number of user actions required for common use cases such as data loading, text pre-processing, training, and evaluation, so you can handle text problems more productively.
- The API is fully integrated with the PaddlePaddle 2.0 high-level API system. It minimizes the number of user actions required for common use cases such as data loading, text pre-processing, transformer model loading, training, and deployment, so you can handle text problems more productively.

* **High Performance and Distributed Training**
- We provide a highly optimized distributed training implementation for BERT with the Fleet API, and a mixed-precision training strategy based on PaddlePaddle 2.0; it can fully utilize GPU clusters for large-scale model pre-training.
* **Rich Application Examples**
- Our Model Zoo covers mainstream NLP applications, including Lexical Analysis, Text Classification, Text Generation, Text Matching, Text Graph, Information Extraction, Machine Translation, General Dialogue, and Question Answering.

* **High Performance Distributed Training**
- We provide a highly optimized distributed training implementation for BERT with the Fleet API, and a mixed-precision training strategy based on PaddlePaddle 2.0; it can fully utilize GPU clusters for large-scale model pre-training.


## Installation

### Prerequisites

* python >= 3.6
* paddlepaddle >= 2.0.1
* paddlepaddle >= 2.1

For more information about PaddlePaddle installation, please refer to [PaddlePaddle Install](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/conda/linux-conda.html)
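
The version floors listed under Prerequisites can be checked by hand with a small comparison like the sketch below. The helper names `parse_version` and `meets_minimum` are hypothetical and not part of PaddleNLP or PaddlePaddle. The sketch assumes plain dotted-integer versions such as `2.0.1`; a pre-release string such as `2.0.0rc` is not handled and would raise a `ValueError`.

```python
def parse_version(text):
    """Turn a dotted version such as '2.0.1' into a tuple of ints: (2, 0, 1)."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum):
    """Return True if the `installed` version is at least `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so that '2.1' and '2.1.0' compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# The old floor (2.0.1) does not satisfy the new requirement (>= 2.1).
print(meets_minimum("2.0.1", "2.1"))  # False
print(meets_minimum("2.1.0", "2.1"))  # True
```

Comparing tuples of integers rather than raw strings avoids the usual pitfall where `"2.10"` sorts before `"2.9"` lexicographically.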

Expand All @@ -43,14 +49,6 @@ More information about PaddlePaddle installation please refer to [PaddlePaddle I
pip install --upgrade paddlenlp -i https://pypi.org/simple
```

### Install from Source

```
pip install --upgrade git+https://github.com/PaddlePaddle/PaddleNLP.git
pip install --upgrade git+https://gitee.com/PaddlePaddle/PaddleNLP.git
```

## Quick Start

### Quick Dataset Loading
Expand All @@ -76,15 +74,16 @@ wordemb.cosine_sim("apple", "rail")
>>> 0.29207364
```

For more TokenEmbedding usage, please refer to [Embedding API](./docs/embeddings.md)
For more `TokenEmbedding` usage, please refer to [Embedding API](./docs/embeddings.md)
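
As a rough illustration of what a call like `wordemb.cosine_sim("apple", "rail")` reports, here is a minimal sketch of cosine similarity in plain Python. It assumes the method computes the standard cosine of the angle between the two word vectors, which this README does not spell out. The three-dimensional vectors are made up for illustration and are not real embeddings.

```python
import math


def cosine_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy vectors (hypothetical values, not taken from any pre-trained embedding).
apple = [1.0, 2.0, 0.0]
rail = [2.0, 0.0, 1.0]
print(cosine_sim(apple, rail))
```

Values close to 1 indicate vectors pointing in nearly the same direction, while values near 0 indicate little similarity.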

### Rich Chinese Pre-trained Models

```python
from paddlenlp.transformers import ErnieModel, BertModel, RobertaModel, ElectraModel, GPTForPretraining
from paddlenlp.transformers import *

ernie = ErnieModel.from_pretrained('ernie-1.0')
bert = BertModel.from_pretrained('bert-wwm-chinese')
albert = AlbertModel.from_pretrained('albert-chinese-tiny')
roberta = RobertaModel.from_pretrained('roberta-wwm-ext')
electra = ElectraModel.from_pretrained('chinese-electra-small')
gpt = GPTForPretraining.from_pretrained('gpt-cpm-large-cn')
Expand All @@ -105,35 +104,52 @@ text = tokenizer('自然语言处理')
sequence_output, pooled_output = model.forward(input_ids=paddle.to_tensor([text['input_ids']]))
```

## Model Zoo and Applications
### More API Usage

- [Transformer API](./docs/model_zoo/transformers.rst)
- [Data API](./docs/data.md)
- [Dataset API](./docs/datasets.md)
- [Embedding API](./docs/embeddings.md)
- [Metrics API](./docs/metrics.md)

Please find more API Reference from our [readthedocs](https://paddlenlp.readthedocs.io/).

For an introduction to the model zoo, please refer to [PaddleNLP Model Zoo](./docs/model_zoo.md). For application scenarios, please refer to [PaddleNLP Examples](./examples/).
## Rich Text Application Examples

PaddleNLP provides rich application examples covering mainstream NLP tasks to help developers solve problems faster.

### NLP Basic Technique

- [Word Embedding](./examples/word_embedding/)
- [Lexical Analysis](./examples/lexical_analysis/)
- [Named Entity Recognition](./examples/information_extraction/msra_ner/)
- [Language Model](./examples/language_model/)
- [Semantic Parsing (Text to SQL)](./examples/text_to_sql):star:


### NLP Core Technique

- [Text Classification](./examples/text_classification/)
- [Text Generation](./examples/text_generation/)
- [Semantic Matching](./examples/text_matching/)
- [Text Graph](./examples/text_graph/erniesage/)
- [Text Matching](./examples/text_matching/)
- [Text Generation](./examples/text_generation/)
- [Semantic Indexing](./examples/semantic_indexing/)
- [Information Extraction](./examples/information_extraction/)
- [General Dialogue](./examples/dialogue/)
- [Machine Translation](./examples/machine_translation/)
- [Machine Reading Comprehension](./examples/machine_reading_comprehension/)

## Advanced Application
-

- [Model Compression](./examples/model_compression/)
### NLP Application in Real System

## API Usage
- [Sentiment Analysis](./examples/sentiment_analysis/skep/):star2:
- [General Dialogue System](./examples/dialogue/)
- [Machine Translation](./examples/machine_translation/)
- [Simultaneous Translation](././examples/simultaneous_translation/)
- [Machine Reading Comprehension](./examples/machine_reading_comprehension/)

- [Transformer API](./docs/model_zoo/transformers.rst)
- [Data API](./docs/data.md)
- [Dataset API](./docs/datasets.md)
- [Embedding API](./docs/embeddings.md)
- [Metrics API](./docs/metrics.md)
### Extension Applications

- [Text Knowledge Linking](./examples/text_to_knowledge/):star2:
- [Machine Reading Comprehension](./examples/machine_reading_comprehension)
- [Model Compression](./examples/model_compression/)
- [Text Graph Learning](./examples/text_graph/erniesage/)
- [Time Series Prediction](./examples/time_series/)

## Tutorials

Expand All @@ -149,10 +165,9 @@ Please refer to our official AI Studio account for more interactive tutorials: [

* [Use TCN Model to predict COVID-19 confirmed cases](https://aistudio.baidu.com/aistudio/projectdetail/1290873)


## Community

### Special Interest Group(SIG)
### Special Interest Group (SIG)

You are welcome to join [PaddleNLP SIG](https://iwenjuan.baidu.com/?code=bkypg8) to contribute, e.g., datasets, models, and toolkits.

Expand All @@ -166,6 +181,10 @@ Join our QQ Technical Group for technical exchange right now! ⬇️
<img src="./docs/imgs/qq.png" width="200" height="200" />
</div>

## ChangeLog

For more information about our releases, please refer to the [ChangeLog](./docs/changelog.md)

## License

PaddleNLP is provided under the [Apache-2.0 License](./LICENSE).
31 changes: 0 additions & 31 deletions docs/change_log.md

This file was deleted.

2 changes: 1 addition & 1 deletion docs/get_started/quick_start.rst
Expand Up @@ -12,7 +12,7 @@

.. code-block::
>>> pip install --upgrade paddlenlp>=2.0.0rc -i https://pypi.org/simple
>>> pip install --upgrade paddlenlp -i https://pypi.org/simple
2. Load a pre-trained model in one step
========================================
Expand Down
2 changes: 1 addition & 1 deletion docs/index.rst
Expand Up @@ -6,7 +6,7 @@

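- **Easy-to-use text-domain API**

- Provides domain APIs that cover dataset loading, text preprocessing, model building, model evaluation, and inference acceleration: for example, one-step loading of Chinese datasets with the **Dataset API**, flexible and efficient data preprocessing with the Data API, and 60+ built-in pre-trained word embeddings in the**Embedding API**; the ecosystem foundation of 50+ pre-trained models is provided by the**Transformer API**, which can greatly improve the efficiency of NLP task modeling and iteration.
- Provides domain APIs that cover dataset loading, text preprocessing, model building, model evaluation, and inference acceleration: for example, one-step loading of Chinese datasets with the **Dataset API**, flexible and efficient data preprocessing with the Data API, and 60+ built-in pre-trained word embeddings in the **Embedding API**; the ecosystem foundation of 50+ pre-trained models is provided by the **Transformer API**, which can greatly improve the efficiency of NLP task modeling and iteration.

- **Application examples for multiple scenarios**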
Expand Down
