Tags: mjdenkowski/sockeye
Two fixes to SampleK (awslabs#1086)
* Fix: set device for best_hyp_indices in SampleK.
* Fix: Take top-k values.
* Changelog.
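The second fix above concerns taking the actual top-k values before sampling. As a minimal pure-Python sketch of that idea (illustrative names only, not Sockeye's SampleK implementation):

```python
import random

def sample_from_top_k(probs, k, rng=random.Random(0)):
    """Sample an index from the k highest-probability entries.

    `probs` is a list of probabilities over the vocabulary. The bug fixed
    in awslabs#1086 was of this flavor: sampling from values that were not
    actually the top-k of the distribution.
    """
    # Take the top-k (index, probability) pairs.
    top_k = sorted(enumerate(probs), key=lambda ip: ip[1], reverse=True)[:k]
    indices = [i for i, _ in top_k]
    weights = [p for _, p in top_k]
    # Renormalize over the top-k and sample one index.
    return rng.choices(indices, weights=weights, k=1)[0]

print(sample_from_top_k([0.1, 0.5, 0.2, 0.2], k=2))
```

With `k=2`, only indices 1 and 2 (the two highest probabilities) can ever be drawn.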
Code cleanup: refactoring, type checking, and formatting (awslabs#1076)
Neural vocabulary selection. (awslabs#1046) Co-authored-by: Tobias Domhan <[email protected]>
Ignore false-positive missing keys for traced modules (awslabs#1042)
Also trace SockeyeModel components when `inference_only == False` (includes CheckpointDecoder) (awslabs#1032)
* Trace checkpoint decoder
  - Remove inference_only checks for model tracing
  - Checkpoint decoder always runs in eval mode
* Version and changelog
* Grammar
* Whitespace
* Rename variable
Support target prefix with JSON format (awslabs#1025)

This feature allows Sockeye to add a target prefix and target prefix factors during inference with JSON input. During inference, a target prefix can be specified in JSON as follows:

{ "text": "The boy ate the waff@@ le .", "target_prefix": "2XX"}

If a model was trained with target factors, target prefix factors can be added during inference as follows:

{ "text": "The boy ate the waff@@ le .", "target_prefix_factors": ["O"]}

Both a target prefix and target prefix factors can also be specified at the same time, e.g.:

{ "text": "The boy ate the waff@@ le .", "target_prefix": "2XX", "target_prefix_factors": ["O"]}

Note that if an input is very long, Sockeye chunks the text and translates each chunk separately. By default, the target prefix and target prefix factors are added to all chunks in that case. Alternatively, use_target_prefix_all_chunks can be set to false to add them only to the first chunk, e.g.:

{ "text": "The boy ate the waff@@ le .", "target_prefix": "2XX", "target_prefix_factors": ["O"], "use_target_prefix_all_chunks": false}

* support target prefix
* revise illustration
* fix space
* add type ignore
* add target prefix, revision 2
* add target prefix factors
* slightly revise docs
* revise based on Michael's suggestions
* revise based on Felix's comments
* small type revision for pylint
* revise based on Tobias's suggestion
* revised based on Felix's comments
* one_hot_encoding_from_prefix function for a full tensor
* revise warning on empty prefix
* use clamp instead of masked_fill_
* cleaner adding of target_prefix_factors to decode_step
* put pt.index_select of vocab_slice_ids before beam loop
* revised with prefix masking
* revise a tiny comment
* pre_expand outside, suggested by Tobi
* pre_expand outside, a small fix
* completely pre_expand outside, gen_prefix_masking generated only once
* fix factors and add prefix to vocab_slice_ids with PyTorch
* small mypy fix
* minor revision
* avoid unnecessary tensor copy
* avoid duplicate padding
* extra revision suggested by Tobias
* Fix TranslatorInput with pylint

Co-authored-by: Hoang <[email protected]>
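The JSON input fields described above can be built programmatically before being passed to the translator. A minimal sketch using only the standard library (the field names come from the awslabs#1025 description; the rest is illustrative):

```python
import json

# Build a Sockeye-style JSON input line carrying a target prefix and
# target prefix factors, as described for awslabs#1025.
inp = {
    "text": "The boy ate the waff@@ le .",
    "target_prefix": "2XX",
    "target_prefix_factors": ["O"],
    # Apply the prefix only to the first chunk of a long, chunked input.
    "use_target_prefix_all_chunks": False,
}
line = json.dumps(inp)
print(line)
```

Each such line corresponds to one input sentence; note that JSON serializes Python's `False` as lowercase `false`, matching the examples above.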