Zhejiang University
Hangzhou, China
Stars
Retrieval and Retrieval-augmented LLMs
Information extraction with pointer networks, covering named entity recognition, relation extraction, and event extraction.
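A collection of public resources for Chinese medical NLP: terminologies, corpora, word embeddings, pre-trained models, knowledge graphs, named entity recognition, QA, information extraction, models, papers, etc.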
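A collection of NLP competition strategy implementations, baselines for various tasks, competition experience posts (current, past, and practice competitions), NLP conference dates, popular self-media channels, GPU recommendations, and more; continuously updated.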
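Very long text classification (over 1,000 characters); document-level/discourse-level text classification; mainly addresses the long-range dependency problem.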
Entity and Relation Extraction Based on TensorFlow and BERT. A pipeline approach to entity and relation extraction; a solution for the information extraction task of the 2019 Language and Intelligence Challenge. Schema based Knowledge Extraction, SKE 2019
Automatic Detection of Sexist Statements Commonly Used at the Workplace (PAKDD LDRC '20)
Source code for the ACL 2021 Findings paper: CasEE: A Joint Learning Framework with Cascade Decoding for Overlapping Event Extraction.
PyTorch implementation of Supporting Clustering with Contrastive Learning, NAACL 2021
First-place solution for the iFLYTEK 2020 Event Extraction Challenge, plus a complete event extraction system
TensorFlow implementation of "Language Modeling with Gated Convolutional Networks"
CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark (a benchmark for Chinese medical information processing)
Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020)
A curated list of network embedding techniques.
Chinese NER (Named Entity Recognition) using BERT (Softmax, CRF, Span)
Multi-stage passage ranking: monoBERT + duoBERT
Transforms multi-label classification into a sentence-pair task, with more training data and information
YAGO is a large semantic knowledge base, derived from Wikipedia, WordNet, WikiData, GeoNames, and other data sources
Kashgari is a production-level NLP transfer learning framework built on top of tf.keras for text labeling and text classification; it includes Word2Vec, BERT, and GPT2 language embeddings.
Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus, and leaderboard