blaze-v4.0.1:

New Feature

Initial support for the ORC input file format.
Initial support for the RSS framework and the Apache Celeborn shuffle service.

Improvement

Optimize AggExec by implementing columnar-based aggregation.
Use a custom hashmap implementation for aggregation.
Support specialized count(0).
Optimize bloom filter by reusing the same bloom filter within an executor.
Optimize bloom filter by supporting shrinking.
Optimize Parquet file reading by supporting parallel reads.
Improve spill file deletion logic.

Bug fixes

Fix file-not-found error for paths with URL-encoded characters.
Fix Hashaggregate convert job throwing ScalaReflectionException.
Fix pruning error while reading parquet files with multiple row groups.
Fix incorrect number of tasks due to missing shuffleOrigin.
Fix record batch creation error when hash joining with empty input.

Other

Upgrade datafusion/arrow dependencies to v42/v53.
Replace gxhash with foldhash for better compatibility on some hardware.
Other minor improvements and fixes.

PRs

AggExec: implement columnar accumulator states. by @richox in kwai#646
Bump bigdecimal from 0.4.5 to 0.4.6 by @dependabot in kwai#638
Bump bytes from 1.7.2 to 1.8.0 by @dependabot in kwai#625
Bump bytes from 1.8.0 to 1.9.0 by @dependabot in kwai#671
Bump object_store from 0.11.0 to 0.11.1 by @dependabot in kwai#622
Bump sonic-rs from 0.3.13 to 0.3.14 by @dependabot in kwai#623
Bump sonic-rs from 0.3.14 to 0.3.16 by @dependabot in kwai#647
Bump tempfile from 3.13.0 to 3.14.0 by @dependabot in kwai#641
Bump tokio from 1.40.0 to 1.41.0 by @dependabot in kwai#629
Bump tokio from 1.41.0 to 1.41.1 by @dependabot in kwai#642
Bump tokio from 1.41.0 to 1.41.1 by @dependabot in kwai#676
Bump uuid from 1.10.0 to 1.11.0 by @dependabot in kwai#618
Create RecordBatch with num_rows option to avoid bhj error caused by empty output_schema by @wForget in kwai#683
Fix build on windows by @wForget in kwai#666
Fix file not found for path with url encoded character by @wForget in kwai#679
Followup to #674, add -r for rm by @wForget in kwai#681
Introduce base blaze sql test suite by @wForget in kwai#674
[BLAZE-287][FOLLOWUP] Use JavaUtils#newConcurrentHashMap to speed up ConcurrentHashMap#computeIfAbsent by @SteNicholas in kwai#615
[BLAZE-573][FOLLOWUP] Bump Spark from 3.4.3 to 3.4.4 by @SteNicholas in kwai#640
[BLAZE-627] Make ORC and Parquet format detection more generic by @dixingxing0 in kwai#628
[BLAZE-664] Bump Celeborn version from 0.5.1 to 0.5.2 by @SteNicholas in kwai#665
[MINOR] Avoid NPE when native lib is not found by @wForget in kwai#668
add new blaze logo by @richox in kwai#633
chore: Make spotless plugin happy by @zuston in kwai#653
code refactoring by @richox in kwai#658
code refactoring by @richox in kwai#677
doc: update tpc-h benchmark result by @richox in kwai#614
fix Hashaggregate convert job throw ScalaReflectionException by @leizhang5s in kwai#637
fix pruning error while reading parquet files with multiple row groups by @richox in kwai#616
fix running error for Spark 3.2.0 and 3.2.1 by @XorSum in kwai#602
fix(shuffle): Propagate shuffle origin to native exchange exec to make AQE rebalance valid by @zuston in kwai#663
fix(spill): Delete spill file when dropping for rust FileSpill by @zuston in kwai#660
fix(spill): Explicitly delete spill file for FileBasedSpillBuf after release by @zuston in kwai#654
improve NativeOrcScan by @richox in kwai#631
improve memory management by @richox in kwai#621
improvement: Add numOfPartitions metrics for exchange exec to align with vanilla spark by @zuston in kwai#669
optimize bloom filter by @richox in kwai#620
parquet reading improvements by @richox in kwai#650
release version v4.0.0 by @richox in kwai#613
replace gxhash with foldhash by @richox in kwai#624
supports specialized count(0) by @richox in kwai#619
tpcd benchmarkrunner : add orc format support by @leizhang5s in kwai#639
update to datafusion-v42 by @richox in kwai#574
use custom implemented hashmap for aggregation by @richox in kwai#617
