Stars
Transformer: PyTorch Implementation of "Attention Is All You Need"
A playbook for systematically maximizing the performance of deep learning models.
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
Code for the ICLR'22 paper "Improving Non-Autoregressive Translation Models Without Distillation"
ParaGen is a PyTorch deep learning framework for parallel sequence generation.
Token Drop mechanism for Neural Machine Translation
Unsupervised Word Segmentation for Neural Machine Translation and Text Generation
A masked language modeling objective to train a model to predict any subset of the target words, conditioned on both the input text and a partially masked target translation.
MIT Deep Learning Book in PDF format (complete and parts) by Ian Goodfellow, Yoshua Bengio and Aaron Courville
DisCo Transformer for Non-autoregressive MT
Code for Synchronous Bidirectional Neural Machine Translation (SB-NMT)
Original PyTorch implementation of Cross-lingual Language Model Pretraining.
Beam search for neural network sequence to sequence (encoder-decoder) models.
A general-purpose encoder-decoder framework for TensorFlow
A simple module that consistently outperforms self-attention and the Transformer model on major NMT datasets, achieving SoTA performance.
GIZA++ is a statistical machine translation toolkit that is used to train IBM Models 1-5 and an HMM word alignment model. This package also contains the source for the mkcls tool which generates th…
A PyTorch implementation of the 1D and 2D sinusoidal positional encoding/embedding (a minimal sketch of the 1D case follows this list).
This repo contains the source code from my personal column (https://zhuanlan.zhihu.com/zhaoyeyu), implemented in Python 3.6, including Natural Language Processing and Computer Vision projects, suc…
MASS: Masked Sequence to Sequence Pre-training for Language Generation
TED Parallel Corpora is a growing collection of bilingual parallel corpora, multilingual parallel corpora, and monolingual corpora extracted from TED talks (www.ted.com) for 109 world languages.
Natural language processing (NLP): Xiaojiang robot (retrieval-based chitchat chatbot), BERT sentence embeddings and similarity (Sentence Similarity), XLNet sentence embeddings and similarity (text xlnet embedding), text classification (Text classification), entity extraction (NER, BERT+BiLSTM+CRF), data augmentation (text augment, data enhance), paraphrase and synonym generation, sentence…
Code inspired by "Unsupervised Machine Translation Using Monolingual Corpora Only"
📁 This repo makes it easy to download the raw audio files from AudioSet (32.45 GB, 632 classes).
Command-line program to download videos from YouTube.com and other video sites
Beyond Accuracy: Behavioral Testing of NLP models with CheckList
Transformers without Tears: Improving the Normalization of Self-Attention
An implementation of Performer, a linear attention-based transformer, in Pytorch
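For reference, the 1D sinusoidal positional encoding entry above corresponds to the closed-form definition from "Attention Is All You Need". The sketch below is a minimal PyTorch version of that formula only; it is not the listed repository's actual code, and the function name `sinusoidal_positional_encoding` is illustrative.

```python
import torch

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Return a (seq_len, d_model) matrix of 1D sinusoidal positional encodings.

    Illustrative sketch of the standard formulation; assumes d_model is even.
    """
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)        # (seq_len, 1)
    # Frequencies 1 / 10000^(2i / d_model) for each pair of dimensions.
    div_term = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float32)
        * (-torch.log(torch.tensor(10000.0)) / d_model)
    )                                                                          # (d_model/2,)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even dims: sin(pos / 10000^(2i/d))
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dims:  cos(pos / 10000^(2i/d))
    return pe

# Usage: add the encoding to token embeddings before the first attention layer,
# e.g. x = token_embeddings + sinusoidal_positional_encoding(128, 512)
```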