Tags: MXNetEdge/sockeye
Add support for source factor models (awslabs#275)

Source factors are enabled by passing --source-factors file1 [file2 ...] (-sf), where file1, etc. are token-parallel to the source (-s). This option can be passed either to sockeye.train or in the data preparation step, if data sharding is used. An analogous parameter, --validation-source-factors, is used to pass factors for validation data. The flag --source-factors-num-embed D1 [D2 ...] sets the embedding dimension of each factor. The factor embeddings are concatenated with the source word embedding (--num-embed), which can continue to be tied to the target (--weight-tying --weight-tying-type=src_trg).

At test time, the input sentence and its factors can be passed as multiple parallel files (--input and --input-factors) or through stdin with token-level annotations, separated by |. Another way is to send a string-serialized JSON object to the CLI through stdin; it needs a top-level key called 'text' and, optionally, a key 'factors' of type List[str].

* Cleanup of vocab functions
* Simplified vocab logic a bit. Removed pickle functionality since it has long been deprecated
* Refactor so that first factor corresponds to the source surface form (e.g. configs by default set num_factors to at least 1)
* Fixed a TODO. Slightly reworded the changelog
* Reworked inference interface. Added a bunch of TranslatorInput factory functions (including json)
* Removed max_seq_len_{source,target} from ModelConfig
* Separate data statistics relevant for inference from data information relevant only for training.
* Bumped Major Version to 1.17.0
* Do not throw exceptions while translating (awslabs#294)
* Remove bias parameters in Transformer attention layers as they bring no benefit. (awslabs#296)
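The two stdin formats described in this commit message (token-level annotations separated by | and a JSON object with 'text' and 'factors') can be related to each other with a short sketch. This is only an illustration under stated assumptions, not Sockeye's own code: the helper names `split_factored_tokens` and `to_json_input` are hypothetical, tokens are assumed to be whitespace-separated, and every token is assumed to carry the same number of factors.

```python
import json
from typing import Dict, List, Tuple


def split_factored_tokens(line: str, delimiter: str = "|") -> Tuple[str, List[str]]:
    """Split 'tok|f1|f2 ...' into the surface text and one string per factor.

    Hypothetical helper for illustration; it assumes every token carries the
    same number of delimiter-separated fields.
    """
    tokens = line.split()
    if not tokens:
        return "", []
    pieces = [token.split(delimiter) for token in tokens]
    num_fields = len(pieces[0])
    if any(len(piece) != num_fields for piece in pieces):
        raise ValueError("every token needs the same number of factors")
    # Column 0 is the surface form; the remaining columns are the factors.
    columns = [" ".join(piece[i] for piece in pieces) for i in range(num_fields)]
    return columns[0], columns[1:]


def to_json_input(text: str, factors: List[str]) -> str:
    """Serialize a sentence as a JSON object with 'text' and optional 'factors'."""
    num_tokens = len(text.split())
    for factor in factors:
        # Each factor string must be token-parallel to the source sentence.
        if len(factor.split()) != num_tokens:
            raise ValueError("factor is not token-parallel to the text")
    payload: Dict[str, object] = {"text": text}
    if factors:
        payload["factors"] = factors
    return json.dumps(payload)


# Example sentence with two (made-up) factors per token.
text, factors = split_factored_tokens("the|DT|O cat|NN|B sleeps|VBZ|O")
json_line = to_json_input(text, factors)
```

The token-parallel check mirrors the requirement stated for --source-factors files: each factor sequence must have exactly one entry per source token.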
Added custom speedometer to exactly track samples/sec and words/sec during training (awslabs#260)
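As a rough illustration of what tracking samples/sec and words/sec involves, the sketch below accumulates per-batch counts and divides by elapsed wall-clock time. It is not the speedometer added in this commit: the class name `ThroughputMeter`, its methods, and the injectable clock are all assumptions made for the example.

```python
import time
from typing import Callable, Tuple


class ThroughputMeter:
    """Accumulates sample and word counts and reports rates per second.

    Hypothetical class for illustration only.
    """

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._start = clock()
        self._samples = 0
        self._words = 0

    def update(self, num_samples: int, num_words: int) -> None:
        """Record one processed batch."""
        self._samples += num_samples
        self._words += num_words

    def rates(self) -> Tuple[float, float]:
        """Return (samples/sec, words/sec) since construction or the last reset."""
        elapsed = self._clock() - self._start
        if elapsed <= 0:
            return 0.0, 0.0
        return self._samples / elapsed, self._words / elapsed

    def reset(self) -> None:
        """Start a new measurement window."""
        self._start = self._clock()
        self._samples = 0
        self._words = 0
```

Counting words as well as samples matters because sentence lengths vary between batches, so samples/sec alone can hide large differences in the amount of work done.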
Update to mxnet==1.0 (awslabs#244)
* Fix inference dims for mxnet 1.0
* Improved mx.NDArray indexing: removed intermediate numpy arrays for scores and topk hyp/word indices
* Removed unused parameter
* Reformatting inference.py
* simplify _get_inference_input()
* Update dependencies to mxnet==1.0.0
* Expose nccl kvstore and gradient compression
* Update major version
Fixed the maximum input length calculation at inference. (awslabs#255)
* Fixed the maximum input length calculation at inference.
* doc string
Bugfix: --num-samples-per-shard must be int (awslabs#254)
* Bugfix: --num-samples-per-shard must be int
* bump version
Sharded data iterator. (awslabs#241)
* Sharded data iterator.
* Added remaining sockeye/*.py files to typechecked files (awslabs#242)
* Tests to see we get the right number of batches.
* Improved log message about vocabs a little bit
* Factored validation iter creation into separate function
* Covering prepare data in the system tests.
* Writing a data version.
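The general idea of splitting training data into fixed-size shards can be sketched as follows. This is a generic illustration, not the iterator implemented in this commit: the function name `make_shards` is hypothetical, and the integer check is only an assumption about why a shard size such as --num-samples-per-shard has to be an int (it is used as a slice step).

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def make_shards(samples: Sequence[T], num_samples_per_shard: int) -> Iterator[List[T]]:
    """Yield consecutive shards holding at most num_samples_per_shard samples.

    Hypothetical helper for illustration only.
    """
    # range() and slicing need an integer step, so reject floats such as 1e6.
    if not isinstance(num_samples_per_shard, int) or num_samples_per_shard < 1:
        raise ValueError("num_samples_per_shard must be a positive int")
    for start in range(0, len(samples), num_samples_per_shard):
        yield list(samples[start:start + num_samples_per_shard])


# Ten samples with a shard size of four: the last shard is smaller.
shards = list(make_shards(list(range(10)), 4))
```

Only one shard needs to be held in memory at a time, which is the usual motivation for sharding large training sets.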
Remove RNN parameter packing, FusedRNN support; refactored core model components (awslabs#189)
* Removed RNN parameter packing and FusedRNN support
* Refactor embedding and output layers (awslabs#196)
* Removed RNN parameter packing and FusedRNN support
* Refactoring of sockeye model: source embed/target embed/output layers are now separate components in model
* Make training and inference work. Remove lexical biasing code.
Yet another fix for the data iterator. Added a test. (awslabs#188)
* Yet another fix for the data iterator. Added a test that would catch this kind of problem
* Bump minor version