v2.0.0

github-actions released this 21 Jan 07:31
· 8 commits to main since this release
012eb1b

What's Changed

  • engine: stop and release the model when the engine is released; remove the deprecated lock
  • sampling: heavily rework generate_op; remove its dependency on global tensors
  • prefix cache: fix several bugs; improve eviction performance
  • json mode: update the lmfe-cpp patch; add process_logits; sampling with top_k and top_p
  • span-attention: move span_attn decoderReshape to init
  • lora: add docs; fix a typo
  • ubuntu: add an Ubuntu Dockerfile; fix an install directory error
  • bugfix: fix the multi-batch repetition penalty bug

Full Changelog: v1.3.0...v2.0.0