Tags: sxjscience/sockeye
Yet another fix for the data iterator. Added a test. (awslabs#188)
* Added a test that would catch this kind of problem.
* Bump minor version.
Hotfix: use correct vocab and add_bos setting for target side validation data (awslabs#186)
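The two settings this hotfix aligns are which vocabulary maps target-side validation tokens to ids and whether a BOS symbol is prepended, which must match training. A minimal sketch of that conversion; the ids and helper name here are hypothetical, not Sockeye's actual API:

```python
from typing import Dict, List

UNK_ID = 0  # hypothetical id for unknown tokens
BOS_ID = 1  # hypothetical id for the beginning-of-sentence symbol

def tokens_to_ids(tokens: List[str], vocab: Dict[str, int], add_bos: bool) -> List[int]:
    """Map tokens to ids with the given vocab, optionally prepending BOS."""
    ids = [vocab.get(token, UNK_ID) for token in tokens]
    return [BOS_ID] + ids if add_bos else ids

# Target side of the validation data: use the *target* vocab and the same
# add_bos setting as in training.
print(tokens_to_ids("das ist ein Test".split(), {"das": 5, "ist": 6}, add_bos=True))
```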
Reduce memory footprint of data I/O: added a SentenceReader that streams integer sequences from disk (awslabs#178)
* Removed label arrays from ParallelBucketSentenceIter; labels are created on the fly from the target sequence (both ideas sketched below).
* Added a limit flag and some streamlining of log messages.
* Iter check.
* Changelog.
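A hedged sketch of the two ideas; the function names and signatures below are illustrative, not Sockeye's actual SentenceReader:

```python
from typing import Iterator, List

EOS_ID = 3  # hypothetical id of the end-of-sentence symbol

def sentence_reader(path: str) -> Iterator[List[int]]:
    """Stream one integer sequence per line instead of loading the corpus
    into memory up front."""
    with open(path) as f:
        for line in f:
            yield [int(token) for token in line.split()]

def labels_from_target(target: List[int]) -> List[int]:
    """Labels are the target shifted left by one position, computed on the
    fly, so no separate label array needs to be stored."""
    return target[1:] + [EOS_ID]
```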
Merge pull request awslabs#142 from awslabs/convseq2seq
[1.8.0] Convolutional decoder.
Word batching update (awslabs#152)
* Word batching update: guarantee the default bucket has the largest batch size (sketched below).
* Comments/logic for clarity.
* Address PR comments:
  - Memory usage note.
  - NamedTuple for bucket batch sizes.
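One plausible reading of the guarantee, as a sketch (the function name and the exact rounding are assumptions, not Sockeye's code): with word-based batching, each bucket's sentence count is derived from a target word budget, and the default (last, largest) bucket is then assigned the largest resulting batch size so that memory allocated for it covers every other bucket:

```python
from typing import List

def bucket_batch_sizes(buckets: List[int], batch_num_words: int, step: int = 1) -> List[int]:
    """buckets: target sequence lengths, ascending; the last is the default
    bucket. Each batch holds roughly batch_num_words target words."""
    sizes = [max(step, round(batch_num_words / seq_len / step) * step)
             for seq_len in buckets]
    # Guarantee: the default bucket gets the largest batch size, so memory
    # shared across bucket executors is sized by the default bucket.
    sizes[-1] = max(sizes)
    return sizes

print(bucket_batch_sizes([10, 20, 30, 40], batch_num_words=400))  # [40, 20, 13, 40]
```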
Log exception and traceback for uncaught exceptions (awslabs#134)
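A common pattern for this kind of change, shown as a sketch rather than the exact Sockeye code: install a `sys.excepthook` that routes uncaught exceptions through the logger with their full traceback.

```python
import logging
import sys

logger = logging.getLogger(__name__)

def log_uncaught(exc_type, exc_value, exc_traceback):
    # exc_info accepts the (type, value, traceback) tuple, so the logger
    # formats the full traceback into the log output.
    logger.error("Uncaught exception", exc_info=(exc_type, exc_value, exc_traceback))

sys.excepthook = log_uncaught
```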
Add args for zero initialization and input sequence reversal. Multi-layer RNN fix. Generalized RNN residual connections. (awslabs#113)
* Change the default RNN model to initialize decoder states with zeros. Added an argument to use MLP initialization from the last encoder state.
* Correctly create only a single bi-RNN encoder layer.
* Correct Config inheritance.
* Add ReverseSequence encoder. Clean up the encoder interface.
* Add encoder.py to typechecked files.
* MLP init remains the default decoder state initializer.
* Addressed David's comments.
* Fix.
* Addressed Tobi's comment.
* Generalized RNN residual connections (awslabs#114): added a parameter for configuring the first layer with residual connections. Additionally, on the encoder side, residual connections can now optionally start from the second layer instead of the third (sketched below).
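The generalized residual wiring can be sketched with MXNet's symbolic RNN cells, which Sockeye builds on; `first_residual_layer` and the prefixes are illustrative, not necessarily Sockeye's argument names:

```python
import mxnet as mx

def stacked_rnn(num_hidden: int, num_layers: int,
                first_residual_layer: int) -> mx.rnn.SequentialRNNCell:
    """Stack LSTM layers; layers >= first_residual_layer (1-based) are
    wrapped in a residual connection (input dim must equal num_hidden)."""
    stacked = mx.rnn.SequentialRNNCell()
    for layer in range(1, num_layers + 1):
        cell = mx.rnn.LSTMCell(num_hidden=num_hidden, prefix="lstm_l%d_" % layer)
        if layer >= first_residual_layer:
            cell = mx.rnn.ResidualCell(cell)
        stacked.add(cell)
    return stacked

# Residual connections starting from the second layer instead of the third:
encoder_rnn = stacked_rnn(num_hidden=512, num_layers=4, first_residual_layer=2)
```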
Reworked dropout on RNN models (awslabs#112)
* Implement Bayesian dropout cell (credit to Philip Schulz).
* Test for BayesianDropoutCell.
* Add --rnn-dropout and --rnn-variational-dropout arguments.
* Add variational dropout on the hidden states of the RNN (see the sketch below).
* Added --conv-embed-dropout.
* Added embedding dropout (source & target). Changed VariationalDropoutCell to a ModifierCell. RNNs now always apply variational dropout BEFORE the residual connection. Added a reset method to the decoder interface for inference purposes.
* Added RNNDecoder MLP dropout. Refactored context gating and the MLP into separate methods. Bumped version.
* Addressed Tobi's comments.
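The key property of variational (Bayesian) dropout (Gal & Ghahramani), shown as a NumPy sketch rather than the actual BayesianDropoutCell: one dropout mask is sampled per sequence and reused at every time step, instead of being resampled per step.

```python
import numpy as np

def variational_dropout(states: np.ndarray, p: float, rng=np.random) -> np.ndarray:
    """states: (seq_len, batch, num_hidden). Samples a single (batch,
    num_hidden) mask and broadcasts it over all time steps."""
    keep = 1.0 - p
    mask = (rng.uniform(size=states.shape[1:]) < keep) / keep
    return states * mask  # same units dropped at every step of a sequence

states = np.ones((5, 2, 4))  # (seq_len, batch, num_hidden)
dropped = variational_dropout(states, p=0.5)
```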
Recombined arguments for source and target, e.g. support for things like --num-words <src>:<trg> (awslabs#107)
* Generalized to a multiple-values function.
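A sketch of how such combined values can be parsed; `parse_src_trg` is an illustrative helper, not Sockeye's actual multiple-values function. A single value applies to both sides, while `src:trg` sets them separately:

```python
from typing import Tuple

def parse_src_trg(value: str) -> Tuple[int, int]:
    """Parse '<num>' (applies to both sides) or '<src>:<trg>'."""
    parts = value.split(":")
    if len(parts) == 1:
        return int(parts[0]), int(parts[0])
    if len(parts) == 2:
        return int(parts[0]), int(parts[1])
    raise ValueError("expected '<num>' or '<src>:<trg>', got %r" % value)

assert parse_src_trg("50000") == (50000, 50000)
assert parse_src_trg("50000:80000") == (50000, 80000)
```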
Transformer models in Sockeye (awslabs#98)
* Positional encodings and initial arguments for the transformer.
* Stub for TransformerEncoder.
* WIP self-attention.
* ffn.
* Unmasked self-attention prototype.
* Cleaned up code. Still not tested.
* Put things together so we can run and debug; some cleanup.
* Separate layer construction from application in the encoder.
* Added masking for self-attention.
* More fixes; now runs on CPUs with default args.
* Removed unused code.
* Fix inference for the transformer.
* Docstrings.
* Added multi-head dot attention for the actual attention mechanism. Enable with --attention-type mhdot.
* Fixed existing tests.
* Import fix.
* Precompute positional encodings in variable initialization (sketched below).
* Temporary fix. Will change later.
* Pass max_seq_len to Embedding if needed for positional encodings.
* Fix import.
* More control over positional encodings.
* Fix masking for MultiheadAttention.
* Fix nasty bug with layer normalization quietly accepting 3d input.
* WIP: decoder.
* Added transformer test.
* WIP full transformer with decoder. Inference and RNN are currently broken; work in progress.
* Fix auto-regressive bias.
* Revised Configs and the Decoder interface.
* Moved attention into the (RNN) decoder.
* Defined a proper Decoder interface for inference. Rewrote RecurrentDecoder to adhere to the new interface.
* Fixed the bias variable/length problem by writing a custom operator.
* Custom operator for positional encodings.
* Added integration tests.
* Improve consistency.
* Fixed a last bug in inference regarding lengths. All tests pass now.
* Bump version.
* Update tests.
* Make mypy happy.
* Support transformer with convolutional embedding encoder.
* Fix to actually use layer normalization.
* Allow projecting segment embeddings to arbitrary size.
* Typo fix.
* Correct path in the pypi upload doc (awslabs#92).
* Uniform weight initialization (awslabs#93).
* Added transformer dropout.
* Learning rate warmup.
* Fix.
* Changed eps for layer normalization.
* Docstrings and cleanup.
* Better coverage for ConvolutionalEmbeddingEncoder.
* Warmup WIP.
* Fix Travis builds.
* Removed source_length from inference code; it is now computed in the encoder graph.
* Added the transformer module to doc generation.
* Small fixes.
* Fixed doc generation.
* Fix tests.
* Refactored the read_metrics_file method to separate out its multiple responsibilities. The new read_metrics_file method can now easily be used for other things, e.g. offline analysis.
* Removed old method.
* Fixed argument description.
* Revised arguments according to David's & Tobi's comments.
* Fix system tests.
* Removed duplicate query scaling in DotAttention.
* Addressed Tobi's comments.
* Pass correct argument to RNN attention num heads.
* Moved the check for the batch2timemajor encoder being the last encoder into the encoder sequence.
* Fixed RNN decoder after the decoder rewrite.
* fix awslabs#2
* Do not truncate the metrics file in the callback monitor constructor. Restructured saving and loading of the metrics file to make it consistent.
* Make pylint happy.
* Addressed Tobi's comments.
* Test averaging in integration/system tests.
* Addressed Tobi's (last?) comments.
* Revised abstract class.
* Addressed Tobi's comments.
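For the "precompute positional encodings" item above, a NumPy sketch of the standard sinusoidal encodings from "Attention Is All You Need" (assuming an even num_embed; not Sockeye's exact code). Precomputing them once means they can be baked into a variable at initialization instead of being recomputed per batch:

```python
import numpy as np

def positional_encodings(max_seq_len: int, num_embed: int) -> np.ndarray:
    """PE(pos, 2i) = sin(pos / 10000^(2i/d)), PE(pos, 2i+1) = cos(...)."""
    positions = np.arange(max_seq_len)[:, np.newaxis]   # (max_seq_len, 1)
    dims = np.arange(num_embed // 2)[np.newaxis, :]     # (1, num_embed/2)
    angles = positions / np.power(10000.0, 2 * dims / num_embed)
    encodings = np.zeros((max_seq_len, num_embed))
    encodings[:, 0::2] = np.sin(angles)  # even dimensions
    encodings[:, 1::2] = np.cos(angles)  # odd dimensions
    return encodings  # added to the (scaled) embeddings before the first layer

pe = positional_encodings(max_seq_len=100, num_embed=512)
```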