PaddleNLP aims to accelerate NLP applications through a powerful model zoo and an easy-to-use API with detailed tutorials. It also serves as the NLP best practice for the PaddlePaddle 2.0 API system.
- Rich and Powerful Model Zoo
  - Our Model Zoo covers mainstream NLP applications, including Lexical Analysis, Syntactic Parsing, Machine Translation, Text Classification, Text Generation, Text Matching, General Dialogue and Question Answering.
- Easy-to-use API
  - The API is fully integrated with the PaddlePaddle high-level API system. It minimizes the number of user actions required for common use cases such as data loading, text pre-processing, training and evaluation, which lets you handle text problems more productively.
- High Performance and Large-scale Training
  - We provide a highly optimized distributed training implementation for BERT based on the Fleet API, which can fully utilize GPU clusters for large-scale model pre-training. Please refer to our benchmark for more information.
- Detailed Tutorials and Industrial Practices
  - We offer detailed, interactive notebook tutorials that show the best practices of PaddlePaddle 2.0.
- python >= 3.6
- paddlepaddle >= 2.0.0
pip install "paddlenlp>=2.0.0rc"
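Before installing, it can help to confirm that the environment meets the requirements listed above. The following is a minimal sketch, not part of PaddleNLP: the helper names `parse_version` and `check_environment` are hypothetical, and it assumes PaddlePaddle is importable as `paddle` and exposes `paddle.__version__`.

```python
import importlib.util
import re
import sys

# Minimum versions taken from the requirements listed above.
MIN_PYTHON = (3, 6)
MIN_PADDLE = (2, 0, 0)


def parse_version(text):
    """Turn a string such as '2.0.0rc1' into (2, 0, 0).

    Pre-release suffixes are ignored, and a missing patch number counts as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("unrecognized version string: %r" % text)
    return tuple(int(part or 0) for part in match.groups())


def check_environment():
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("python >= 3.6 is required")
    if importlib.util.find_spec("paddle") is None:
        problems.append("paddlepaddle is not installed")
    else:
        import paddle  # assumption: the package exposes __version__
        if parse_version(paddle.__version__) < MIN_PADDLE:
            problems.append("paddlepaddle >= 2.0.0 is required")
    return problems
```

Calling `check_environment()` and printing the returned list gives a quick summary of anything that still needs to be installed or upgraded.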
# Load the train and test splits of the ChnSentiCorp dataset.
from paddlenlp.datasets import ChnSentiCorp
train_ds, test_ds = ChnSentiCorp.get_datasets(['train', 'test'])
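# Load a pretrained 300-dimensional Chinese word embedding.
from paddlenlp.embeddings import TokenEmbedding
wordemb = TokenEmbedding("w2v.baidu_encyclopedia.target.word-word.dim300")
# Cosine similarity between "国王" (king) and "王后" (queen).
print(wordemb.cosine_sim("国王", "王后"))
# 0.63395125
# Cosine similarity between "艺术" (art) and "火车" (train).
print(wordemb.cosine_sim("艺术", "火车"))
# 0.14792643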
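For readers unfamiliar with the metric, the scores above can be read as cosine similarities between word vectors. The sketch below assumes `cosine_sim` returns the standard cosine similarity, as its name suggests. It is a plain-Python illustration, not PaddleNLP's implementation, and the function name `cosine_similarity` and the toy vectors are made up for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy 2-dimensional vectors (hypothetical, not real embeddings).
same_direction = cosine_similarity([1.0, 0.0], [1.0, 0.0])  # identical direction
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])      # unrelated direction
```

Values near 1 indicate vectors pointing in nearly the same direction, while values near 0 indicate little relation. This matches the pattern above, where "国王" and "王后" score higher than "艺术" and "火车".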
# Build classical pretrained models by name with one line each.
from paddlenlp.transformers import ErnieModel, BertModel, RobertaModel, ElectraModel
ernie = ErnieModel.from_pretrained('ernie-1.0')
bert = BertModel.from_pretrained('bert-wwm-chinese')
roberta = RobertaModel.from_pretrained('roberta-wwm-ext')
electra = ElectraModel.from_pretrained('chinese-electra-small')
For more pretrained models, please refer to Pretrained-Models.
- Word Embedding
- Lexical Analysis
- Language Model
- Text Classification
- Text Generation
- Semantic Matching
- Named Entity Recognition
- Text Graph
- General Dialogue
- Machine Translation
- Question Answering
Please refer to our official AI Studio account for more interactive tutorials: PaddleNLP on AI Studio
- What's Seq2Vec? shows how to use an LSTM for sentiment analysis.
- Sentiment Analysis with ERNIE shows how to use pretrained ERNIE to improve sentiment analysis.
- Waybill Information Extraction with BiGRU-CRF Model shows how to use BiGRU and CRF for information extraction.
- Waybill Information Extraction with ERNIE shows how to use pretrained ERNIE to improve information extraction.
Join our QQ Technical Group to discuss PaddleNLP with other users and developers.
PaddleNLP is provided under the Apache-2.0 License.