Insights: apache/datafusion
36 Pull requests merged by 19 people
-
Make aggr fuzzer query builder more configurable
#15851 merged
Apr 27, 2025 -
Minor: Interval singleton
#15859 merged
Apr 26, 2025 -
Fix `from_unixtime` function documentation
#15844 merged
Apr 25, 2025 -
Remove usage of `dbg!`
#15858 merged
Apr 25, 2025 -
chore(deps): bump clap from 4.5.36 to 4.5.37
#15853 merged
Apr 25, 2025 -
chore: More details to `No UDF registered` error
#15843 merged
Apr 25, 2025 -
Fix ScalarValue::List comparison when the compared lists have different lengths
#15856 merged
Apr 25, 2025 -
Fix build failure caused by new `CoalescePartitionsExec::with_fetch` method
#15849 merged
Apr 25, 2025 -
Fix `CoalescePartitionsExec` proto serialization
#15824 merged
Apr 25, 2025 -
Fix: fetch is missing in `plan_with_order_breaking_variants` method
#15842 merged
Apr 25, 2025 -
predicate pruning: support cast and try_cast for more types
#15764 merged
Apr 24, 2025 -
Feature/benchmark config from env
#15782 merged
Apr 24, 2025 -
Make `Diagnostic` easy/convenient to attach by using macro and avoiding `map_err`
#15796 merged
Apr 24, 2025 -
Fix `ILIKE` expression support in SQL unparser
#15820 merged
Apr 24, 2025 -
Minor: fix potential flaky test in aggregate.slt
#15829 merged
Apr 24, 2025 -
Fix: fetch is missing in `EnforceSorting` optimizer (two places)
#15822 merged
Apr 24, 2025 -
chore(deps): bump pyo3 from 0.24.1 to 0.24.2
#15838 merged
Apr 24, 2025 -
Minor: cleanup hash table after emit all
#15834 merged
Apr 24, 2025 -
Preserve projection for inline scan
#15825 merged
Apr 24, 2025 -
Add `MemoryPool::memory_limit` to expose setting memory usage limit
#15828 merged
Apr 24, 2025 -
Support unparsing `UNION` for distinct results
#15814 merged
Apr 23, 2025 -
chore(deps): bump env_logger from 0.11.7 to 0.11.8
#15823 merged
Apr 23, 2025 -
docs: add ArkFlow
#15826 merged
Apr 23, 2025 -
Support WITHIN GROUP syntax to standardize certain existing aggregate functions
#13511 merged
Apr 23, 2025 -
Speed up `optimize_projection`
#15787 merged
Apr 23, 2025 -
Fix: fetch is missing in `replace_order_preserving_variants` method during `EnforceDistribution` optimizer
#15808 merged
Apr 23, 2025 -
Add `or_fun_call` and `unnecessary_lazy_evaluations` lints on `core`
#15807 merged
Apr 22, 2025 -
chore(deps): bump half from 2.5.0 to 2.6.0
#15806 merged
Apr 22, 2025 -
doc: Adding Feldera as known user
#15799 merged
Apr 22, 2025 -
Minor: eliminate unnecessary struct creation in session state build
#15800 merged
Apr 22, 2025 -
Add try_new for LogicalPlan::Join
#15757 merged
Apr 21, 2025 -
chore(deps): bump sqllogictest from 0.28.0 to 0.28.1
#15788 merged
Apr 21, 2025 -
Minor: remove unused logic for limit pushdown
#15730 merged
Apr 21, 2025 -
Minor: fix flaky test in `aggregate.slt`
#15786 merged
Apr 21, 2025 -
fix: clickbench type err
#15773 merged
Apr 21, 2025 -
Show current SQL recursion limit in RecursionLimitExceeded error message
#15644 merged
Apr 21, 2025
21 Pull requests opened by 18 people
-
Add `FormatOptions` to Config
#15793 opened
Apr 21, 2025 -
Factor out Substrait consumers into separate files
#15794 opened
Apr 21, 2025 -
refactor filter pushdown apis
#15801 opened
Apr 22, 2025 -
fix: Add coercion rules for Float16 types
#15816 opened
Apr 22, 2025 -
pipe column orderings into pruning predicate creation
#15821 opened
Apr 23, 2025 -
Implement min max for dictionary types
#15827 opened
Apr 23, 2025 -
Update extending-operators.md
#15832 opened
Apr 23, 2025 -
fix: Avoid mistaken ILike to string equality optimization
#15836 opened
Apr 24, 2025 -
fix(avro): Respect projection order in Avro reader
#15840 opened
Apr 24, 2025 -
refactor: replace `unwrap_or` with `unwrap_or_else` for improved lazy…
#15841 opened
Apr 24, 2025 -
feat(benchmark): collect benchmarks for last 5 versions in line protocol format
#15846 opened
Apr 24, 2025 -
deprecate schema expressions
#15847 opened
Apr 25, 2025 -
Feat: introduce partition statistics API
#15852 opened
Apr 25, 2025 -
fix: fold cast null to substrait typed null
#15854 opened
Apr 25, 2025 -
feat(datafusion-functions-aggregate): add support for lists and other nested types in min and max
#15857 opened
Apr 25, 2025 -
chore: fix clippy::large_enum_variant for DataFusionError
#15861 opened
Apr 25, 2025 -
POC: Parse to Merge Logical Plan
#15862 opened
Apr 26, 2025 -
infer placeholder datatype for IN lists
#15864 opened
Apr 26, 2025 -
Map file-level column statistics to the table-level
#15865 opened
Apr 26, 2025 -
feat: simplify count distinct logical plan
#15867 opened
Apr 26, 2025 -
Substrait: Handle inner map fields in schema renaming
#15869 opened
Apr 26, 2025
11 Issues closed by 8 people
-
Support more types when pruning Parquet data
#15742 closed
Apr 24, 2025 -
benchmarks: Read SessionConfig from Environment
#15684 closed
Apr 24, 2025 -
Potential flaky tests
#15789 closed
Apr 24, 2025 -
Inline table scan drops projection
#15810 closed
Apr 24, 2025 -
Support exposing setting memory limit of memory pool
#15830 closed
Apr 24, 2025 -
The SQL Unparser does not correctly handle `UNION`
#15813 closed
Apr 23, 2025 -
Standardize APPROX_PERCENTILE_CONT / PERCENTILE_CONT and similar aggregation functions
#11732 closed
Apr 23, 2025 -
Eliminate the function call in `xxx_or (e.g. unwrap_or("".to_string())`
#15803 closed
Apr 22, 2025 -
Add `try_new` for `LogicalPlan::Join` `Join` and others
#14363 closed
Apr 21, 2025 -
`Cargo bench --bench sql_planner` is failing
#15753 closed
Apr 21, 2025 -
Improve SQL parser recursion limit error message
#15623 closed
Apr 21, 2025
26 Issues opened by 20 people
-
Tracking: improve aggregation fuzzer
#15870 opened
Apr 27, 2025 -
Substrait: Handle inner map fields in schema renaming
#15868 opened
Apr 26, 2025 -
Rust API - "contains" function expression wrongly declared, not usable
#15866 opened
Apr 26, 2025 -
Placeholders in IN lists are not inferred
#15863 opened
Apr 26, 2025 -
`NULL::<Data type>` can't be encoded to Substrait
#15855 opened
Apr 25, 2025 -
`select count(distinct ..)` query doesn't go to the specialized distinct accumulator
#15850 opened
Apr 25, 2025 -
Avro reader fails when query columns are reordered in SELECT statement
#15839 opened
Apr 24, 2025 -
Cannot use Projection::new_from_schema to set parquet field ids.
#15837 opened
Apr 24, 2025 -
ILike with no wildcards is mistakenly optimized to string equality
#15835 opened
Apr 24, 2025 -
Sorting is not maintained after using a window function
#15833 opened
Apr 23, 2025 -
Ensure Substrait producer for `BinaryExpr` includes `output_type`
#15831 opened
Apr 23, 2025 -
Custom sort order for column
#15819 opened
Apr 23, 2025 -
Bring ordering information for grouped aggregation
#15818 opened
Apr 23, 2025 -
Improve `candidate functions` suggestions
#15817 opened
Apr 23, 2025 -
Type coercion does not handle `Float16` correctly
#15815 opened
Apr 22, 2025 -
Pruning of floating point Parquet columns is incorrect when `NaN` is present
#15812 opened
Apr 22, 2025 -
Merging Statistics is slow when sum statistic is present
#15809 opened
Apr 22, 2025 -
Update to 2024 edition
#15804 opened
Apr 22, 2025 -
Eliminate the function call in `xxx_or (e.g. unwrap_or("".to_string())`
#15802 opened
Apr 22, 2025 -
Deprecate ExprSchemable functions
#15798 opened
Apr 21, 2025 -
Support metadata on literal values
#15797 opened
Apr 21, 2025 -
Migrate `datafusion-cli` tests to `insta`
#15795 opened
Apr 21, 2025 -
Migrate `logical_plan` tests to `insta`
#15792 opened
Apr 21, 2025 -
Migrate `core` tests to `insta`
#15791 opened
Apr 21, 2025
69 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Implement Parquet filter pushdown via new filter pushdown APIs
#15769 commented on
Apr 24, 2025 • 22 new comments -
Add Extension Type / Metadata support for Scalar UDFs
#15646 commented on
Apr 25, 2025 • 14 new comments -
replace reassign_predicate_columns helper with PhysicalExpr::with_schema
#15779 commented on
Apr 24, 2025 • 8 new comments -
Fix Infer prepare statement type tests
#15743 commented on
Apr 22, 2025 • 4 new comments -
Enable repartitioning on MemTable.
#15409 commented on
Apr 22, 2025 • 3 new comments -
TopK dynamic filter pushdown attempt 2
#15770 commented on
Apr 22, 2025 • 3 new comments -
Introduce Async User Defined Functions
#14837 commented on
Apr 24, 2025 • 2 new comments -
feat: Add option to adjust writer buffer size for query output
#15747 commented on
Apr 23, 2025 • 2 new comments -
Introduce selection vector repartitioning
#15423 commented on
Apr 21, 2025 • 1 new comment -
feat: Add `datafusion-spark` crate
#15168 commented on
Apr 26, 2025 • 1 new comment -
Improve documentation for `FileSource`, `DataSource` and `DataSourceExec`
#15766 commented on
Apr 23, 2025 • 1 new comment -
Improve `ListingTable` / `ListingTableOptions` docs
#15767 commented on
Apr 21, 2025 • 1 new comment -
Use `interleave` to speed up hash repartitioning
#15768 commented on
Apr 22, 2025 • 1 new comment -
feat: ORDER BY ALL
#15772 commented on
Apr 21, 2025 • 1 new comment -
Set HashJoin seed
#15783 commented on
Apr 25, 2025 • 1 new comment -
feat: Add Aggregate UDF to FFI crate
#14775 commented on
Apr 25, 2025 • 0 new comments -
chore: Return NativeType instead of DataType for get_example_types
#14778 commented on
Apr 22, 2025 • 0 new comments -
chore: migrated all the UDFs to invoke_with_args
#14779 commented on
Apr 22, 2025 • 0 new comments -
feat: implement contextualized ObjectStore
#14805 commented on
Apr 24, 2025 • 0 new comments -
feat: add `register_metadata` function for `GroupsAccumulator` to help create specialized impl
#15022 commented on
Apr 22, 2025 • 0 new comments -
Support metadata columns (`location`, `size`, `last_modified`) in `ListingTableProvider`
#15181 commented on
Apr 21, 2025 • 0 new comments -
Add union_tag scalar function
#14687 commented on
Apr 24, 2025 • 0 new comments -
Support Avg distinct for `float64` type
#15413 commented on
Apr 25, 2025 • 0 new comments -
chore(deps): bump tonic from 0.12.3 to 0.13.0
#15430 commented on
Apr 25, 2025 • 0 new comments -
Attach diagnostic for wrong arg number error
#15451 commented on
Apr 26, 2025 • 0 new comments -
Add `statistics_by_partition` API to ExecutionPlan
#15503 commented on
Apr 25, 2025 • 0 new comments -
Respect ignore_nulls in array_agg
#15544 commented on
Apr 22, 2025 • 0 new comments -
Implement intermediate result blocked approach sketch
#15591 commented on
Apr 27, 2025 • 0 new comments -
Updated extending operators documentation
#15612 commented on
Apr 27, 2025 • 0 new comments -
Add Cloud-Native Performance Monitoring System with GitHub Integration
#15624 commented on
Apr 21, 2025 • 0 new comments -
feat: Emit warning with Diagnostic when doing = Null
#15696 commented on
Apr 25, 2025 • 0 new comments -
Add slt tests for `datafusion.execution.parquet.coerce_int96` setting
#15723 commented on
Apr 21, 2025 • 0 new comments -
fix: enhance-CLI-query-header-for-cast-expressions-with-literals
#15736 commented on
Apr 27, 2025 • 0 new comments -
Support `GroupsAccumulator` for Avg duration
#15748 commented on
Apr 21, 2025 • 0 new comments -
Added SQL Example for `Aggregate Functions`
#15778 commented on
Apr 25, 2025 • 0 new comments -
Support integration with Parquet modular encryption
#15216 commented on
Apr 24, 2025 • 0 new comments -
Unnecessary casting in stats & filter evaluation
#15780 commented on
Apr 23, 2025 • 0 new comments -
Join on pandas dataframe from python API fails due to schema metadata
#15754 commented on
Apr 23, 2025 • 0 new comments -
Memory limited nested loop join
#15760 commented on
Apr 23, 2025 • 0 new comments -
Support full UTF-8 in CSV files
#15756 commented on
Apr 23, 2025 • 0 new comments -
Bad performance on wide tables (1000+ columns)
#7698 commented on
Apr 23, 2025 • 0 new comments -
Release DataFusion `48.0.0` (June 2025)
#15771 commented on
Apr 22, 2025 • 0 new comments -
[Discussion] Efficient Row Selection for Multi-Engine Support
#14816 commented on
Apr 21, 2025 • 0 new comments -
[Epic] Add snapshot tests (migrate to `insta` for tests)
#15178 commented on
Apr 21, 2025 • 0 new comments -
Make it easier to run TPCH queries with datafusion-cli
#14608 commented on
Apr 21, 2025 • 0 new comments -
Attach `Diagnostic` to "wrong number of arguments" error
#14432 commented on
Apr 21, 2025 • 0 new comments -
Add retract_batch method for median accumulator
#7664 commented on
Apr 21, 2025 • 0 new comments -
incorrect range frame implementation
#15714 commented on
Apr 21, 2025 • 0 new comments -
Reuse Rows allocation in SortPreservingMergeStream / `RowCursorStream`
#15720 commented on
Apr 21, 2025 • 0 new comments -
Remove nulls in joins during hash table lookup
#15784 commented on
Apr 21, 2025 • 0 new comments -
Move code in `user_defined_plan.rs` to the `extending-operators` doc
#15774 commented on
Apr 21, 2025 • 0 new comments -
Support zero copy hash repartitioning for Hash Aggregate
#15383 commented on
Apr 21, 2025 • 0 new comments -
Consolidate feature flags into configuration guide
#14657 commented on
Apr 24, 2025 • 0 new comments -
support simple/cross lateral joins
#14595 commented on
Apr 24, 2025 • 0 new comments -
feat: Add `array_min` function support
#14417 commented on
Apr 22, 2025 • 0 new comments -
Reduce size of `Expr` struct
#14366 commented on
Apr 22, 2025 • 0 new comments -
Support marking columns as system columns via Field's metadata
#14362 commented on
Apr 22, 2025 • 0 new comments -
hash join: add build-side join keys to memory accounting
#14222 commented on
Apr 23, 2025 • 0 new comments -
Support Null aware anti join by HashJoin
#10584 commented on
Apr 23, 2025 • 0 new comments -
ListingTable statistics improperly merges statistics when files have different schemas
#15689 commented on
Apr 27, 2025 • 0 new comments -
Decorrelate scalar subqueries with more complex filter expressions
#14554 commented on
Apr 27, 2025 • 0 new comments -
Make ClickBench Q23 Go Faster
#15177 commented on
Apr 27, 2025 • 0 new comments -
Allow UDFs to return custom `Diagnostic`
#15276 commented on
Apr 26, 2025 • 0 new comments -
[Epic] A collection of FFI related tasks
#15283 commented on
Apr 26, 2025 • 0 new comments -
Push Dynamic Join Predicates into Scan ("Sideways Information Passing", etc)
#7955 commented on
Apr 25, 2025 • 0 new comments -
Optimized spill file format
#14078 commented on
Apr 25, 2025 • 0 new comments -
Support dot graph output in explain (analyze)
#3606 commented on
Apr 24, 2025 • 0 new comments -
Tracking: speed up the logical optimizer
#15775 commented on
Apr 24, 2025 • 0 new comments -
Incorrect field indices for right‑side columns in Substrait ProjectRel after
#15765 commented on
Apr 24, 2025 • 0 new comments