Releases: lancedb/lance
v0.17.1-beta.2
What's Changed
Bug Fixes 🐛
- fix: calculate decode priority correctly by @westonpace in #2841
Performance Improvements 🚀
- perf: parallelize remapping to improve FTS compaction by @BubbleCal in #2834
- perf: concurrent loading FTS index files by @BubbleCal in #2787
Full Changelog: v0.17.1-beta.1...v0.17.1-beta.2
v0.17.1-beta.1
What's Changed
Bug Fixes 🐛
- fix: ensure the v2 file writer does not mix up the order of a column by @westonpace in #2836
Full Changelog: v0.17.0...v0.17.1-beta.1
v0.17.0
What's Changed
Breaking Changes 🛠
- feat!: index_statistics returns the concrete index type by @BubbleCal in #2716
- feat: stop writing _latest.manifest by @wjones127 in #2776
- refactor: build & search vector index with new indexer & IVF impl by @BubbleCal in #2552
New Features 🎉
- feat: add max_bytes_per_file and batch_size to CompactionOptions by @westonpace in #2728
- feat: add fixed size binary encoding to lance by @raunaks13 in #2707
- feat: make the I/O buffer size configurable by @westonpace in #2736
- feat: pushdown limit & offset into file reader if there is no filter by @westonpace in #2747
- feat: support phrase query for full text search by @BubbleCal in #2751
- feat: make the # of CPU threads configurable and document cpu/memory patterns by @westonpace in #2773
- feat: add logging to compaction by @westonpace in #2791
- feat(python): account for both multi-process and distributed torch workers by @tonyf in #2761
- feat: allow duckdb / polars pushdown to operate with non-substrait types in schema but not filter by @westonpace in #2796
- feat: add an environment variable LANCE_INITIAL_UPLOAD_SIZE by @westonpace in #2806
- feat: extend python bindings for the v2 reader/writer by @ankitvij-db in #2800
- feat: provide param to control whether to build FTS index with positions by @BubbleCal in #2795
- feat: allow creating fragments from v2 files, expose rewrite operation to python by @westonpace in #2811
- feat: use a considerably higher default for frag readahead in v2 by @westonpace in #2797
- feat: support update tags by @broccoliSpicy in #2813
- feat(java): add create vector search index, list index names, and vector search scan by @LuQQiu in #2782
- feat: make default I/O buffer size configurable via env var by @westonpace in #2826
- feat: constant-time manifest lookup on object stores by @wjones127 in #2798
Bug Fixes 🐛
- fix: drop indices if all fragment ids are missing by @wjones127 in #2720
- fix: ignore pages with zero rows by @westonpace in #2724
- fix: various changes to fix backpressure by @westonpace in #2721
- fix: raise exception if user passes unsorted indices to take_rows() of lance file api by @raunaks13 in #2729
- fix: flaky test caused by PQ distortion by @BubbleCal in #2732
- fix: don't panic when v2 scans end early (reader is dropped) by @westonpace in #2690
- fix: fix a crash that could sometimes happen reading largish string/binary data in a list by @westonpace in #2731
- fix: fix two cases in the v2 decoder where the decode order didn't match the scheduling priority by @westonpace in #2754
- fix: fix several situations where we were incorrectly inferring the storage version by @westonpace in #2756
- fix: rework priority handling for lists. We now properly schedule list item pages in priority order by @westonpace in #2769
- fix: try and fix an invalid data storage version automatically by @westonpace in #2759
- fix: add not-linux version of benches to avoid bench compiler error on mac by @westonpace in #2794
- fix: fix v2 error that can happen when writing list<struct<...>> with many empty lists by @westonpace in #2762
- fix: torch to_tensor for FixedShapeTensorType data by @jacketsj in #2824
- fix: cleanup external staging manifests by @wjones127 in #2792
- fix: the DataFile version doesn't respect the writer's version by @BubbleCal in #2825
- fix: merge-insert and update can't work on v2 format by @BubbleCal in #2833
Documentation 📚
- docs: clean up quickstart notebook and add tags example usage by @dsgibbons in #2735
- docs: fix version table so it renders by @wjones127 in #2745
- docs: document newly added compaction options in optimize.py by @westonpace in #2746
- docs: fix transaction conflict table by @wjones127 in #2744
- docs: fix #2748, "Merge Insert" doc error by @broccoliSpicy in #2788
Performance Improvements 🚀
- perf: coalesce ids before executing take by @westonpace in #2680
- perf: cache default session contexts by @dsgibbons in #2709
- perf: various fixes to improve shuffling performance at high scales by @westonpace in #2710
- perf: calculate the max scores for posting lists then have a tighter upper bound by @BubbleCal in #2763
- perf: stable row id prefilter by @wjones127 in #2706
- perf: add prefetching to hnsw greedy_search by @jacketsj in #2783
- perf: parallelize FTS indexing by @BubbleCal in #2807
- perf: split MaterializeIndex stream into batches by @wjones127 in #2770
Other Changes
- refactor: convert DataBlock to an enum, add conversion from arrow, normalize dictionaries by @westonpace in #2789
New Contributors
- @dentiny made their first contribution in #2740
- @jacketsj made their first contribution in #2783
- @tonyf made their first contribution in #2761
- @ankitvij-db made their first contribution in #2800
Full Changelog: v0.16.1...v0.17.0
v0.17.0-beta.13
What's Changed
New Features 🎉
- feat: make default I/O buffer size configurable via env var by @westonpace in #2826
Full Changelog: v0.17.0-beta.12...v0.17.0-beta.13
v0.17.0-beta.12
What's Changed
New Features 🎉
- feat: use a considerably higher default for frag readahead in v2 by @westonpace in #2797
- feat: support update tags by @broccoliSpicy in #2813
- feat(java): add create vector search index, list index names, and vector search scan by @LuQQiu in #2782
Bug Fixes 🐛
- fix: fix v2 error that can happen when writing list<struct<...>> with many empty lists by @westonpace in #2762
Performance Improvements 🚀
- perf: parallelize FTS indexing by @BubbleCal in #2807
Full Changelog: v0.17.0-beta.11...v0.17.0-beta.12
v0.17.0-beta.11
What's Changed
New Features 🎉
- feat: extend python bindings for the v2 reader/writer by @ankitvij-db in #2800
- feat: provide param to control whether to build FTS index with positions by @BubbleCal in #2795
- feat: allow creating fragments from v2 files, expose rewrite operation to python by @westonpace in #2811
New Contributors
- @ankitvij-db made their first contribution in #2800
Full Changelog: v0.17.0-beta.10...v0.17.0-beta.11
v0.17.0-beta.10
What's Changed
New Features 🎉
- feat: allow duckdb / polars pushdown to operate with non-substrait types in schema but not filter by @westonpace in #2796
- feat: add an environment variable LANCE_INITIAL_UPLOAD_SIZE by @westonpace in #2806
Other Changes
- refactor: convert DataBlock to an enum, add conversion from arrow, normalize dictionaries by @westonpace in #2789
Full Changelog: v0.17.0-beta.9...v0.17.0-beta.10
v0.17.0-beta.9
What's Changed
Breaking Changes 🛠
- feat: stop writing _latest.manifest by @wjones127 in #2776
- refactor: build & search vector index with new indexer & IVF impl by @BubbleCal in #2552
New Features 🎉
- feat: support phrase query for full text search by @BubbleCal in #2751
- feat: make the # of CPU threads configurable and document cpu/memory patterns by @westonpace in #2773
- feat: add logging to compaction by @westonpace in #2791
- feat(python): account for both multi-process and distributed torch workers by @tonyf in #2761
Bug Fixes 🐛
- fix: add not-linux version of benches to avoid bench compiler error on mac by @westonpace in #2794
Documentation 📚
- docs: fix #2748, "Merge Insert" doc error by @broccoliSpicy in #2788
Performance Improvements 🚀
- perf: calculate the max scores for posting lists then have a tighter upper bound by @BubbleCal in #2763
- perf: stable row id prefilter by @wjones127 in #2706
- perf: add prefetching to hnsw greedy_search by @jacketsj in #2783
Full Changelog: v0.17.0-beta.8...v0.17.0-beta.9
v0.17.0-beta.8
What's Changed
Bug Fixes 🐛
- fix: rework priority handling for lists. We now properly schedule list item pages in priority order by @westonpace in #2769
- fix: try and fix an invalid data storage version automatically by @westonpace in #2759
Full Changelog: v0.17.0-beta.7...v0.17.0-beta.8
v0.17.0-beta.7
What's Changed
New Features 🎉
- feat: pushdown limit & offset into file reader if there is no filter by @westonpace in #2747
Bug Fixes 🐛
- fix: fix two cases in the v2 decoder where the decode order didn't match the scheduling priority by @westonpace in #2754
- fix: fix several situations where we were incorrectly inferring the storage version by @westonpace in #2756
Performance Improvements 🚀
- perf: various fixes to improve shuffling performance at high scales by @westonpace in #2710
Full Changelog: v0.17.0-beta.6...v0.17.0-beta.7