Releases: lancedb/lance
v0.21.0
What's Changed
Breaking Changes 🛠
- fix!: correctly handle nulls in btree and bitmap indices by @westonpace in #3211
- feat!: support hamming distance & binary vector by @BubbleCal in #3198
- refactor(python)!: simplify marshalling of Fragment, DataFile, Operation, Transaction by @wjones127 in #3240
New Features 🎉
- feat: enhance repdef utilities to handle empty / null lists by @westonpace in #3200
- feat: support _rowid meta column for spark connector in java by @SaintBacchus in #3194
- feat: support blob api in pytorch loader by @eddyxu in #3217
- feat(python): add experimental parameter enable_move_stable_row_ids for pylance by @SaintBacchus in #3216
- feat: add the repetition index to the miniblock write path by @westonpace in #3208
- feat: packed struct encoding by @broccoliSpicy in #3186
- feat: support between sql clauses by @connellPortrait in #3225
- feat(java): support drop columns for dataset by @yanghua in #3237
- feat(java): expose uri method for Dataset instance by @yanghua in #3231
- feat: add file statistics by @broccoliSpicy in #3232
- feat: enable tracing for object storage by @wjones127 in #3244
- feat(java): support limit and offset interface for spark connector by @SaintBacchus in #3253
- feat: adds list decode support for mini-block encoded data by @westonpace in #3241
- feat(java): support topn pushdown in spark connector by @SaintBacchus in #3261
- feat: add replace_schema_metadata and replace_field_metadata by @westonpace in #3263
- feat: merge-insert supports inserting subset of columns by @wjones127 in #3100
- feat: support merge by row_id, row_addr by @chenkovsky in #3254
- feat: add the s3 retry config options for storage option by @SaintBacchus in #3268
- feat(java): support alter columns for dataset by @yanghua in #3259
- feat: support remapping for IVF_FLAT, IVF_PQ and IVF_SQ by @BubbleCal in #2708
- feat: change MSRV from 1.78 to 1.80.1 by @westonpace in #3279
- feat: support merge fragment with dataset by @chenkovsky in #3256
Bug Fixes 🐛
- fix: test failure in test_fsl_packed_struct by @broccoliSpicy in #3227
- fix: remove overzealous warning by @westonpace in #3239
- fix: correctly copy null buffer when making deep copy by @westonpace in #3238
- fix: allow LANCE_LOG to be set to trace by @westonpace in #3246
- fix: list indices always shows vector index type as IVF_PQ even when it's not by @BubbleCal in #3258
- fix: panic when getting stats from index over binary vectors by @BubbleCal in #3267
- fix(rust): adjust scan range to avoid unnecessary warnings by @takaebato in #3248
- fix: when taking struct fields they should be merged into the output in the correct order by @westonpace in #3277
- fix: full text search with limit may return incorrect results by @BubbleCal in #3284
- fix: refine type annotation by @chenkovsky in #3278
Documentation 📚
- docs: add the documentation about how to install packages for tests by @yanghua in #3213
- docs: add doc and test for 4bit PQ by @BubbleCal in #3212
- docs: blob api documents by @eddyxu in #3247
- docs: add java module into directory structure by @yanghua in #3273
Performance Improvements 🚀
- perf: in-register lookup table & SIMD for 4bit PQ by @BubbleCal in #3178
New Contributors
- @connellPortrait made their first contribution in #3225
- @takaebato made their first contribution in #3248
Full Changelog: v0.20.0...v0.21.0
v0.21.0-beta.5
What's Changed
New Features 🎉
- feat(java): support limit and offset interface for spark connector by @SaintBacchus in #3253
- feat: adds list decode support for mini-block encoded data by @westonpace in #3241
- feat(java): support topn pushdown in spark connector by @SaintBacchus in #3261
- feat: add replace_schema_metadata and replace_field_metadata by @westonpace in #3263
- feat: merge-insert supports inserting subset of columns by @wjones127 in #3100
- feat: support merge by row_id, row_addr by @chenkovsky in #3254
- feat: add the s3 retry config options for storage option by @SaintBacchus in #3268
Bug Fixes 🐛
- fix: allow LANCE_LOG to be set to trace by @westonpace in #3246
- fix: list indices always shows vector index type as IVF_PQ even when it's not by @BubbleCal in #3258
- fix: panic when getting stats from index over binary vectors by @BubbleCal in #3267
- fix(rust): adjust scan range to avoid unnecessary warnings by @takaebato in #3248
Documentation 📚
New Contributors
- @takaebato made their first contribution in #3248
Full Changelog: v0.21.0-beta.4...v0.21.0-beta.5
v0.21.0-beta.4
What's Changed
New Features 🎉
- feat: add file statistics by @broccoliSpicy in #3232
- feat: enable tracing for object storage by @wjones127 in #3244
Full Changelog: v0.21.0-beta.3...v0.21.0-beta.4
v0.21.0-beta.3
What's Changed
New Features 🎉
- feat(java): support drop columns for dataset by @yanghua in #3237
- feat(java): expose uri method for Dataset instance by @yanghua in #3231
Bug Fixes 🐛
- fix: remove overzealous warning by @westonpace in #3239
- fix: correctly copy null buffer when making deep copy by @westonpace in #3238
Full Changelog: v0.21.0-beta.2...v0.21.0-beta.3
v0.21.0-beta.2
What's Changed
Breaking Changes 🛠
- feat!: support hamming distance & binary vector by @BubbleCal in #3198
New Features 🎉
- feat: support blob api in pytorch loader by @eddyxu in #3217
- feat(python): add experimental parameter enable_move_stable_row_ids for pylance by @SaintBacchus in #3216
- feat: add the repetition index to the miniblock write path by @westonpace in #3208
- feat: packed struct encoding by @broccoliSpicy in #3186
- feat: support between sql clauses by @connellPortrait in #3225
Bug Fixes 🐛
- fix: test failure in test_fsl_packed_struct by @broccoliSpicy in #3227
Documentation 📚
- docs: add doc and test for 4bit PQ by @BubbleCal in #3212
New Contributors
- @connellPortrait made their first contribution in #3225
Full Changelog: v0.21.0-beta.1...v0.21.0-beta.2
v0.21.0-beta.1
What's Changed
Breaking Changes 🛠
- fix!: correctly handle nulls in btree and bitmap indices by @westonpace in #3211
New Features 🎉
- feat: enhance repdef utilities to handle empty / null lists by @westonpace in #3200
- feat: support _rowid meta column for spark connector in java by @SaintBacchus in #3194
Documentation 📚
Performance Improvements 🚀
- perf: in-register lookup table & SIMD for 4bit PQ by @BubbleCal in #3178
Full Changelog: v0.20.0...v0.21.0-beta.1
v0.20.0
What's Changed
Breaking Changes 🛠
- feat!: allow passing down existing dataset for write by @wjones127 in #3119
- fix!: low recall with cosine/dot on v3 index types by @BubbleCal in #3141
New Features 🎉
- feat: start recording index details in the manifest, cache index type lookup by @westonpace in #3131
- feat: make dataset version serializable by @albertlockett in #3143
- feat: support 4bit PQ on new IVF_PQ by @BubbleCal in #3144
- feat: add commit_batch API by @wjones127 in #3142
- feat: allow async stream for writing and appending to a dataset by @HoKim98 in #3146
- feat: add dictionary encoding by @broccoliSpicy in #3134
- feat(rust): make JSON serialization of DataType and Field public by @wjones127 in #3161
- feat: expose the table provider by @westonpace in #3162
- feat: support write multi fragments or empty fragment in one spark task by @SaintBacchus in #3183
- feat: add drop to dataset by @chenkovsky in #3184
- feat: upgrade arrow (to 53) & datafusion (to 42) by @westonpace in #3201
Bug Fixes 🐛
- fix: fix error about schema is not writable pd to pa by @Jay-ju in #3109
- fix: handle filter on empty partition by @eddyxu in #3151
- fix: fix dynamodb drop table by @LuQQiu in #3152
- fix: full text search index broken after optimize_indices() by @BubbleCal in #3145
- fix: fix performance regression introduced during reader refactor by @westonpace in #3170
- fix: panic if all docs are deleted in a posting list by @BubbleCal in #3163
- fix: full text search may produce duplicate results when searching over multiple columns by @BubbleCal in #3189
- fix: fix typing for _write_fragment by @chenkovsky in #3171
- fix: fix storage options for dataset builder by @chenkovsky in #3156
- fix: fix storage options for ray by @chenkovsky in #3164
Performance Improvements 🚀
- perf: optimize reading transactions in commit loop by @wjones127 in #3117
- perf: improve PQ computing distances by @BubbleCal in #3150
- perf: improve constructing dist table by @BubbleCal in #3155
- perf: improve dot distance computing by @BubbleCal in #3169
Other Changes
- refactor: remove the queue in LanceArrowWriter to reduce memory usage for spark sink by @SaintBacchus in #3110
New Contributors
- @Jay-ju made their first contribution in #3109
- @chenkovsky made their first contribution in #3171
- @imotai made their first contribution in #3078
- @yanghua made their first contribution in #3193
Full Changelog: v0.19.2...v0.20.0
v0.20.0-beta.3
What's Changed
New Features 🎉
- feat: add commit_batch API by @wjones127 in #3142
- feat: allow async stream for writing and appending to a dataset by @HoKim98 in #3146
- feat: add dictionary encoding by @broccoliSpicy in #3134
- feat(rust): make JSON serialization of DataType and Field public by @wjones127 in #3161
- feat: expose the table provider by @westonpace in #3162
Bug Fixes 🐛
- fix: fix dynamodb drop table by @LuQQiu in #3152
- fix: full text search index broken after optimize_indices() by @BubbleCal in #3145
- fix: fix performance regression introduced during reader refactor by @westonpace in #3170
- fix: panic if all docs are deleted in a posting list by @BubbleCal in #3163
Performance Improvements 🚀
- perf: improve PQ computing distances by @BubbleCal in #3150
- perf: improve constructing dist table by @BubbleCal in #3155
- perf: improve dot distance computing by @BubbleCal in #3169
Full Changelog: v0.20.0-beta.2...v0.20.0-beta.3
v0.20.0-beta.2
What's Changed
New Features 🎉
- feat: support 4bit PQ on new IVF_PQ by @BubbleCal in #3144
Bug Fixes 🐛
Performance Improvements 🚀
- perf: optimize reading transactions in commit loop by @wjones127 in #3117
Full Changelog: v0.20.0-beta.1...v0.20.0-beta.2
v0.20.0-beta.1
What's Changed
Breaking Changes 🛠
- feat!: allow passing down existing dataset for write by @wjones127 in #3119
- fix!: low recall with cosine/dot on v3 index types by @BubbleCal in #3141
New Features 🎉
- feat: make dataset version serializable by @albertlockett in #3143
Full Changelog: v0.19.3-beta.1...v0.20.0-beta.1