Stars
Transformer: PyTorch Implementation of "Attention Is All You Need"
A playbook for systematically maximizing the performance of deep learning models.
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
Code for the ICLR'22 paper "Improving Non-Autoregressive Translation Models Without Distillation"
ParaGen is a PyTorch deep learning framework for parallel sequence generation.
Token Drop mechanism for Neural Machine Translation
Unsupervised Word Segmentation for Neural Machine Translation and Text Generation
A masked language modeling objective to train a model to predict any subset of the target words, conditioned on both the input text and a partially masked target translation.
MIT Deep Learning Book in PDF format (complete and parts) by Ian Goodfellow, Yoshua Bengio and Aaron Courville
DisCo Transformer for Non-autoregressive MT
Code for Synchronous Bidirectional Neural Machine Translation (SB-NMT)
Original PyTorch implementation of Cross-lingual Language Model Pretraining.
Beam search for neural network sequence to sequence (encoder-decoder) models.
A general-purpose encoder-decoder framework for TensorFlow
A simple module that consistently outperforms self-attention and the Transformer model on major NMT datasets, achieving SoTA performance.
GIZA++ is a statistical machine translation toolkit that is used to train IBM Models 1-5 and an HMM word alignment model. This package also contains the source for the mkcls tool which generates th…
A PyTorch implementation of the 1D and 2D sinusoidal positional encoding/embedding (a minimal sketch of the 1D case follows this list).
This repo contains the source code from my personal column (https://zhuanlan.zhihu.com/zhaoyeyu), implemented in Python 3.6, including Natural Language Processing and Computer Vision projects, suc…
MASS: Masked Sequence to Sequence Pre-training for Language Generation
TED Parallel Corpora is a growing collection of bilingual parallel corpora, multilingual parallel corpora, and monolingual corpora extracted from TED talks (www.ted.com) for 109 world languages.
Natural language processing (NLP): Xiaojiang robot (retrieval-based chitchat chatbot), BERT sentence embeddings and similarity (Sentence Similarity), XLNet sentence embeddings and similarity (text xlnet embedding), text classification (Text classification), entity extraction (NER, BERT+BiLSTM+CRF), data augmentation (text augment, data enhance), paraphrase and synonym generation, sentence…
Code inspired by "Unsupervised Machine Translation Using Monolingual Corpora Only"
📁 This repo makes it easy to download the raw audio files from AudioSet (32.45 GB, 632 classes).
Command-line program to download videos from YouTube.com and other video sites
Beyond Accuracy: Behavioral Testing of NLP models with CheckList
Transformers without Tears: Improving the Normalization of Self-Attention
An implementation of Performer, a linear attention-based transformer, in Pytorch
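For reference, the 1D sinusoidal positional encoding entry above corresponds to the closed-form definition from "Attention Is All You Need". The sketch below is a minimal PyTorch version of that formula only; it is not the listed repository's actual code, and the function name `sinusoidal_positional_encoding` is illustrative.

```python
import torch

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Return a (seq_len, d_model) matrix of 1D sinusoidal positional encodings.

    Illustrative sketch of the standard formulation; assumes d_model is even.
    """
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)        # (seq_len, 1)
    # Frequencies 1 / 10000^(2i / d_model) for each pair of dimensions.
    div_term = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float32)
        * (-torch.log(torch.tensor(10000.0)) / d_model)
    )                                                                          # (d_model/2,)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even dims: sin(pos / 10000^(2i/d))
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dims:  cos(pos / 10000^(2i/d))
    return pe

# Usage: add the encoding to token embeddings before the first attention layer,
# e.g. x = token_embeddings + sinusoidal_positional_encoding(128, 512)
```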