corpus-dataset

Popular repositories

  1. CLUEDatasetSearch Public

    Forked from CLUEbenchmark/CLUEDatasetSearch

    Search all Chinese NLP datasets, with commonly used English NLP datasets included.

    Python 4 1

  2. CDial-GPT Public

    Forked from thu-coai/CDial-GPT

    A Large-scale Chinese Short-Text Conversation Dataset and Chinese pre-training dialog models

    Python 4 1

  3. CLUENER2020 Public

    Forked from CLUEbenchmark/CLUENER2020

    CLUENER2020: Fine-Grained Named Entity Recognition for Chinese

    Python 2 2

  4. word2word Public

    Forked from kakaobrain/word2word

    Easy-to-use word-to-word translations for 3,564 language pairs.

    Python

  5. hncynic Public

    Forked from leod/hncynic

    Generate Hacker News Comments from Titles

    Python

  6. funNLP Public

    Forked from fighting41love/funNLP

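    Chinese and English sensitive words, language detection, domestic and international mobile/landline number location and carrier lookup, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese personal name databases, Chinese abbreviation database, character decomposition dictionary, word sentiment scores, stop words, reactionary word list, violence and terrorism word list, Traditional/Simplified Chinese conversion, English approximation of Chinese pronunciation, Wang Feng lyrics generator, occupation name lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, segmentation of continuous English text, various Chinese word vectors, comprehensive company name list, classical poetry corpus, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, …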

    Python

Repositories

Showing 10 of 50 repositories
  • ChineseDiachronicCorpus Public Forked from yanshanjing/ChineseDiachronicCorpus

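    ChineseDiachronicCorpus, a Chinese diachronic corpus spanning more than sixty years, including Tencent news (2000-2016), People's Daily (1946-2003), and Reference News (参考消息, 1957-2002). As a diachronic corpus of circulated texts, it provides basic corpus support for computing diachronic language change, language monitoring, and research on social and cultural change.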

    0 56 0 0 Updated Jan 10, 2021
  • names.io Public Forked from Debdut/names.io

    A Global Exhaustive First and Last Name Database

    Shell 0 Apache-2.0 56 0 0 Updated Oct 8, 2020
  • CDial-GPT Public Forked from thu-coai/CDial-GPT

    A Large-scale Chinese Short-Text Conversation Dataset and Chinese pre-training dialog models

    Python 4 MIT 254 0 0 Updated Aug 12, 2020
  • corpus-1 Public Forked from SimmerChan/corpus

    Corpora for natural language processing and knowledge graphs, organized by task. PRs welcome.

    Python 0 154 0 0 Updated Jul 12, 2020
  • MATINF Public Forked from WHUIR/MATINF

    The dataset and PyTorch Implementation for ACL 2020 paper "MATINF: A Jointly Labeled Large-Scale Dataset for Classification, Question Answering and Summarization".

    Python 0 7 0 0 Updated May 3, 2020
  • AllDataPackages Public Forked from qianzhengyang/AllDataPackages

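    Word segmentation, word lists, core dictionaries, stop words, sensitive words, question answering, QA data, knowledge graphs, and more.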

    0 46 0 0 Updated Apr 29, 2020
  • Company-Names-Corpus Public Forked from wainshine/Company-Names-Corpus

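    Company name corpus and organization name corpus: company short names, abbreviations, brand terms, and enterprise names. Useful for Chinese word segmentation and organization named entity recognition.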

    0 Apache-2.0 375 0 0 Updated Mar 30, 2020
  • Chinese-Names-Corpus Public Forked from wainshine/Chinese-Names-Corpus

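    Chinese personal name corpus and name generator: Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Useful for Chinese word segmentation and person named entity recognition.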

    0 Apache-2.0 1,014 0 0 Updated Mar 30, 2020
  • CrossWOZ Public Forked from thu-coai/CrossWOZ

    A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset

    Python 0 Apache-2.0 114 0 0 Updated Mar 24, 2020
  • IEDatasets Public Forked from zxlzr/IEDatasets

    Information extraction dataset zoo.

    0 7 0 0 Updated Mar 20, 2020
