Commit

Merge branch 'master' into master
LZYSaltedFish authored Feb 8, 2023
2 parents f7a869a + 89190f1 commit 8997c3b
Showing 188 changed files with 40,817 additions and 1,080 deletions.
5 changes: 3 additions & 2 deletions .github/workflows/unit_test.yml
@@ -18,7 +18,7 @@ jobs:
# This workflow contains a single job called "build"
build:
# The type of runner that the job will run on
runs-on: ubuntu-latest
runs-on: ubuntu-20.04

# Steps represent a sequence of tasks that will be executed as part of the job
steps:
@@ -28,7 +28,8 @@ jobs:
- name: Set up Python 3.6
uses: actions/setup-python@v2
with:
python-version: 3.6
python-version: '3.6' # specify the Python version


# Runs a single command using the runners shell
- name: Run a one-line script
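
The second hunk quotes the version string. For 3.6 the quoted and unquoted forms resolve to the same value, but a YAML parser reads an unquoted `3.10` as the float 3.1, so quoting is the safer habit. A minimal Python sketch of that round-trip follows; the helper name is made up for illustration, and it only mimics the float conversion rather than invoking a YAML parser:

```python
# Hypothetical illustration: what happens to a version number when it is
# read as a float (as an unquoted YAML scalar would be) and printed again.
def as_unquoted_scalar(version: str) -> str:
    """Mimic the float round-trip of an unquoted version number."""
    return str(float(version))


for version in ("3.6", "3.10"):
    # "3.6" survives the round-trip; "3.10" loses its trailing zero.
    print(version, "->", as_unquoted_scalar(version))
```

Quoting keeps the value a string, so the version requested is exactly the one written in the workflow file.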
7 changes: 6 additions & 1 deletion README.cn.md
@@ -22,7 +22,12 @@

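As pre-trained models such as BERT, Megatron, and GPT-3 have achieved remarkable results in NLP, more and more teams have taken up ultra-large-scale training, pushing model sizes from hundreds of millions of parameters to hundreds of billions or even trillions. Applying models of this scale in real-world scenarios still poses challenges, however. First, the sheer number of parameters makes training and inference slow and deployment extremely costly. Second, in many real-world scenarios insufficient data still limits the use of large models in few-shot settings, and improving the generalization of pre-trained models in those settings remains difficult. To address these problems, the PAI team launched EasyNLP, a Chinese NLP algorithm framework, to help large models be deployed quickly and efficiently.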


- [PAI-Diffusion模型来了!阿里云机器学习团队带您徜徉中文艺术海洋](https://zhuanlan.zhihu.com/p/590020134)
- [模型精度再被提升,统一跨任务小样本学习算法 UPT 给出解法!](https://zhuanlan.zhihu.com/p/590611518)
- [Span抽取和元学习能碰撞出怎样的新火花,小样本实体识别来告诉你!](https://zhuanlan.zhihu.com/p/590297824)
- [算法 KECP 被顶会 EMNLP 收录,极少训练数据就能实现机器阅读理解](https://zhuanlan.zhihu.com/p/590024650)
- [当大火的文图生成模型遇见知识图谱,AI画像趋近于真实世界](https://zhuanlan.zhihu.com/p/581870071)
- [EasyNLP发布融合语言学和事实知识的中文预训练模型CKBERT](https://zhuanlan.zhihu.com/p/574853281)
- [EasyNLP带你实现中英文机器阅读理解](https://zhuanlan.zhihu.com/p/568890245)
- [跨模态学习能力再升级,EasyNLP电商文图检索效果刷新SOTA](https://zhuanlan.zhihu.com/p/568512230)
- [EasyNLP玩转文本摘要(新闻标题)生成](https://zhuanlan.zhihu.com/p/566607127)
6 changes: 5 additions & 1 deletion README.md
@@ -35,7 +35,11 @@ EasyNLP is an easy-to-use NLP development and application toolkit in PyTorch, fi

We have a series of technical articles on the functionalities of EasyNLP.


- [阿里云PAI-Diffusion功能再升级,全链路支持模型调优,平均推理速度提升75%以上](https://zhuanlan.zhihu.com/p/604483551)
- [PAI-Diffusion模型来了!阿里云机器学习团队带您徜徉中文艺术海洋](https://zhuanlan.zhihu.com/p/590020134)
- [模型精度再被提升,统一跨任务小样本学习算法 UPT 给出解法!](https://zhuanlan.zhihu.com/p/590611518)
- [Span抽取和元学习能碰撞出怎样的新火花,小样本实体识别来告诉你!](https://zhuanlan.zhihu.com/p/590297824)
- [算法 KECP 被顶会 EMNLP 收录,极少训练数据就能实现机器阅读理解](https://zhuanlan.zhihu.com/p/590024650)
- [当大火的文图生成模型遇见知识图谱,AI画像趋近于真实世界](https://zhuanlan.zhihu.com/p/581870071)
- [EasyNLP发布融合语言学和事实知识的中文预训练模型CKBERT](https://zhuanlan.zhihu.com/p/574853281)
- [EasyNLP带你实现中英文机器阅读理解](https://zhuanlan.zhihu.com/p/568890245)
29 changes: 25 additions & 4 deletions easynlp/appzoo/__init__.py
@@ -25,21 +25,24 @@
"text_match.model": ['TextMatch', 'TextMatchTwoTower', 'DistillatoryTextMatch', 'FewshotSingleTowerTextMatch', 'CptFewshotSingleTowerTextMatch'],
"data_augmentation.model": ["DataAugmentation"],
"geep_classification.model": ["GEEPClassification"],
"text2video_retrieval.model": ["Text2VideoRetrieval"],
"clip.model": ["CLIPApp"],
# "latent_diffusion.model": ["LatentDiffusion"],
"latent_diffusion.model": ["LatentDiffusion","StableDiffusion"],
"wukong_clip.model": ["WukongCLIP"],
"text2image_generation.model": ["TextImageGeneration", "TextImageGeneration_knowl"],
"image2text_generation.model": ['VQGANGPTImageTextGeneration', 'CLIPGPTImageTextGeneration'],
"video2text_generation.model": ['CLIPGPTFrameTextGeneration'],
"sequence_generation.model": ["SequenceGeneration"],
"machine_reading_comprehension.model": ["MachineReadingComprehension"],
"open_domain_dialogue.model": ["OpenDomainDialogue"],
"information_extraction.model": ["InformationExtractionModel"],

"sequence_classification.evaluator": ['SequenceClassificationEvaluator', 'SequenceMultiLabelClassificationEvaluator'],
"sequence_labeling.evaluator": ['SequenceLabelingEvaluator'],
"language_modeling.evaluator": ['LanguageModelingEvaluator'],
"text_match.evaluator": ['TextMatchEvaluator'],
"geep_classification.evaluator": ['GEEPClassificationEvaluator'],
"text2video_retrieval.evaluator": ['Text2VideoRetrievalEvaluator'],
"clip.evaluator": ['CLIPEvaluator'],
"wukong_clip.evaluator": ['WukongCLIPEvaluator'],
"text2image_generation.evaluator": ["TextImageGenerationEvaluator"],
@@ -48,25 +51,30 @@
"sequence_generation.evaluator": ["SequenceGenerationEvaluator"],
"machine_reading_comprehension.evaluator": ["MachineReadingComprehensionEvaluator"],
"open_domain_dialogue.evaluator": ["OpenDomainDialogueEvaluator"],
"information_extraction.evaluator": ["InformationExtractionEvaluator"],
"latent_diffusion.evaluator": ["LatentDiffusionModelEvaluator"],

"sequence_classification.predictor": ['SequenceClassificationPredictor', 'FewshotSequenceClassificationPredictor', 'CptFewshotSequenceClassificationPredictor'],
"sequence_labeling.predictor": ['SequenceLabelingPredictor'],
"feature_vectorization.predictor": ['FeatureVectorizationPredictor'],
"text_match.predictor": ['TextMatchPredictor', 'TextMatchTwoTowerPredictor', 'FewshotSingleTowerTextMatchPredictor', 'CptFewshotSingleTowerTextMatchPredictor'],
"data_augmentation.predictor": ['DataAugmentationPredictor'],
"geep_classification.predictor": ['GEEPClassificationPredictor'],
"text2video_retrieval.predictor": ['Text2VideoRetrievalPredictor'],
"clip.predictor": ['CLIPPredictor'],
# "latent_diffusion.predictor": ['LatentDiffusionPredictor'],
"latent_diffusion.predictor": ['LatentDiffusionPredictor'],
"wukong_clip.predictor": ['WukongCLIPPredictor'],
"text2image_generation.predictor": ['TextImageGenerationPredictor', 'TextImageGenerationKnowlPredictor'],
"image2text_generation.predictor": ['VQGANGPTImageTextGenerationPredictor', 'CLIPGPTImageTextGenerationPredictor'],
"video2text_generation.predictor": ['CLIPGPTFrameTextGenerationPredictor'],
"sequence_generation.predictor": ['SequenceGenerationPredictor'],
"machine_reading_comprehension.predictor": ["MachineReadingComprehensionPredictor"],
"open_domain_dialogue.predictor": ["OpenDomainDialoguePredictor"],
"information_extraction.predictor": ["InformationExtractionPredictor"],

"geep_classification.data": ['GEEPClassificationDataset'],
"language_modeling.data": ['LanguageModelingDataset'],
"text2video_retrieval.data": ['Text2VideoRetrievalDataset'],
"clip.data": ['CLIPDataset'],
"wukong_clip.data": ['WukongCLIPDataset'],
"sequence_classification.data": ['ClassificationDataset', 'DistillatoryClassificationDataset', 'FewshotSequenceClassificationDataset'],
@@ -78,6 +86,9 @@
"sequence_generation.data": ['SequenceGenerationDataset'],
"machine_reading_comprehension.data": ["MachineReadingComprehensionDataset"],
"open_domain_dialogue.data": ['OpenDomainDialogueDataset'],
"information_extraction.data": ["InformationExtractionDataset"],
"latent_diffusion.data": ["LdmDataset"],

"dataset": ['BaseDataset', 'GeneralDataset', 'load_dataset', 'list_datasets'],
"api": ['get_application_dataset', 'get_application_model', 'get_application_model_for_evaluation', 'get_application_evaluator', 'get_application_predictor'],
}
@@ -90,21 +101,24 @@
from .text_match.model import TextMatch, TextMatchTwoTower, DistillatoryTextMatch, FewshotSingleTowerTextMatch, CptFewshotSingleTowerTextMatch
from .data_augmentation.model import DataAugmentation
from .geep_classification.model import GEEPClassification
from .text2video_retrieval.model import Text2VideoRetrieval
from .clip.model import CLIPApp
# from .latent_diffusion.model import LatentDiffusion
from .latent_diffusion.model import LatentDiffusion,StableDiffusion
from .wukong_clip.model import WukongCLIP
from .text2image_generation.model import TextImageGeneration, TextImageGeneration_knowl
from .image2text_generation.model import VQGANGPTImageTextGeneration, CLIPGPTImageTextGeneration
from .video2text_generation.model import CLIPGPTFrameTextGeneration
from .sequence_generation.model import SequenceGeneration
from .machine_reading_comprehension.model import MachineReadingComprehension
from .open_domain_dialogue.model import OpenDomainDialogue
from .information_extraction.model import InformationExtractionModel

from .sequence_classification.evaluator import SequenceClassificationEvaluator, SequenceMultiLabelClassificationEvaluator
from .sequence_labeling.evaluator import SequenceLabelingEvaluator
from .language_modeling.evaluator import LanguageModelingEvaluator
from .text_match.evaluator import TextMatchEvaluator
from .geep_classification.evaluator import GEEPClassificationEvaluator
from .text2video_retrieval.evaluator import Text2VideoRetrievalEvaluator
from .clip.evaluator import CLIPEvaluator
from .wukong_clip.evaluator import WukongCLIPEvaluator
from .text2image_generation.evaluator import TextImageGenerationEvaluator
@@ -113,28 +127,33 @@
from .sequence_generation.evaluator import SequenceGenerationEvaluator
from .machine_reading_comprehension.evaluator import MachineReadingComprehensionEvaluator
from .open_domain_dialogue.evaluator import OpenDomainDialogueEvaluator
from .information_extraction.evaluator import InformationExtractionEvaluator
from .latent_diffusion.evaluator import LatentDiffusionModelEvaluator

from .sequence_classification.predictor import SequenceClassificationPredictor, FewshotSequenceClassificationPredictor, CptFewshotSequenceClassificationPredictor
from .sequence_labeling.predictor import SequenceLabelingPredictor
from .feature_vectorization.predictor import FeatureVectorizationPredictor
from .text_match.predictor import TextMatchPredictor, TextMatchTwoTowerPredictor, FewshotSingleTowerTextMatchPredictor, CptFewshotSingleTowerTextMatchPredictor
from .data_augmentation.predictor import DataAugmentationPredictor
from .geep_classification.predictor import GEEPClassificationPredictor
from .text2video_retrieval.predictor import Text2VideoRetrievalPredictor
from .clip.predictor import CLIPPredictor
# from .latent_diffusion.predictor import LatentDiffusionPredictor
from .latent_diffusion.predictor import LatentDiffusionPredictor
from .wukong_clip.predictor import WukongCLIPPredictor
from .text2image_generation.predictor import TextImageGenerationPredictor, TextImageGenerationKnowlPredictor
from .image2text_generation.predictor import VQGANGPTImageTextGenerationPredictor, CLIPGPTImageTextGenerationPredictor
from .video2text_generation.predictor import CLIPGPTFrameTextGenerationPredictor
from .sequence_generation.predictor import SequenceGenerationPredictor
from .machine_reading_comprehension.predictor import MachineReadingComprehensionPredictor
from .open_domain_dialogue.predictor import OpenDomainDialoguePredictor
from .information_extraction.predictor import InformationExtractionPredictor

from .sequence_classification.data import ClassificationDataset, DistillatoryClassificationDataset, FewshotSequenceClassificationDataset
from .sequence_labeling.data import SequenceLabelingDataset, SequenceLabelingAutoDataset
from .language_modeling.data import LanguageModelingDataset
from .text_match.data import TwoTowerDataset, SingleTowerDataset, DistillatorySingleTowerDataset, FewshotSingleTowerTextMatchDataset, SiameseDataset
from .geep_classification.data import GEEPClassificationDataset
from .text2video_retrieval.data import Text2VideoRetrievalDataset
from .clip.data import CLIPDataset
from .wukong_clip.data import WukongCLIPDataset
from .text2image_generation.data import TextImageDataset, TextImageKnowlDataset
@@ -143,6 +162,8 @@
from .sequence_generation.data import SequenceGenerationDataset
from .machine_reading_comprehension.data import MachineReadingComprehensionDataset
from .open_domain_dialogue.data import OpenDomainDialogueDataset
from .information_extraction.data import InformationExtractionDataset
from .latent_diffusion.data import LdmDataset

from .dataset import BaseDataset, GeneralDataset
from .dataset import load_dataset, list_datasets
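
Every new application in this diff is registered twice: once as a `"submodule": [names]` entry in the dict and once as a matching `from .submodule import ...` line. The hunks do not show how the dict is consumed. Assuming it drives a lazy loader that imports a submodule only when one of its names is first accessed, the mechanism can be sketched as below. `LazyNamespace`, `IMPORT_STRUCTURE`, and the stdlib stand-in modules are illustrative assumptions, not code from EasyNLP:

```python
import importlib
from types import ModuleType

# Hypothetical registry in the same shape as the dict in this diff:
# "submodule path" -> names exported from it. Stdlib modules stand in
# for the easynlp.appzoo submodules so the sketch runs anywhere.
IMPORT_STRUCTURE = {
    "json.decoder": ["JSONDecoder", "JSONDecodeError"],
    "collections": ["OrderedDict"],
}


class LazyNamespace(ModuleType):
    """Resolve exported names to their submodule on first access."""

    def __init__(self, name, import_structure):
        super().__init__(name)
        # Invert the registry: exported name -> module that defines it.
        self._name_to_module = {
            attr: module
            for module, attrs in import_structure.items()
            for attr in attrs
        }
        self.__all__ = sorted(self._name_to_module)

    def __getattr__(self, attr):
        # Called only when normal lookup fails, i.e. before the name
        # has been imported and cached on this object.
        try:
            module_name = self._name_to_module[attr]
        except KeyError:
            raise AttributeError(
                f"{self.__name__!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(module_name), attr)
        setattr(self, attr, value)  # cache so later lookups skip __getattr__
        return value


appzoo = LazyNamespace("appzoo_sketch", IMPORT_STRUCTURE)
```

With a loader of this kind, a name missing from the registry fails with `AttributeError` even if its import line exists, which is one reason to keep the two lists in step as this commit does.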