-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UDAF refactor: Add PhysicalExpr trait dependency on datafusion-expr
and remove logical expressions requirement for creating physical aggregate expression
#11845
Conversation
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
@@ -44,6 +44,8 @@ arrow-schema = { workspace = true } | |||
datafusion-common = { workspace = true } | |||
datafusion-execution = { workspace = true } | |||
datafusion-expr = { workspace = true } | |||
datafusion-functions-aggregate-common = { workspace = true } | |||
datafusion-physical-expr = { workspace = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is no longer an issue given the datafusion-physical-expr
is somewhat thin now. We could move out expressions
(Column, Literal, ...) for minimum dependencies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by "thin"? I checked on this branch and here is what datafusion/physical-expr still has
(we could potentially move the groups accumulators into datafusion-functions-aggegate-common
as well, as a follow on PR)
$ find datafusion/physical-expr/src
datafusion/physical-expr/src
datafusion/physical-expr/src/physical_expr.rs
datafusion/physical-expr/src/partitioning.rs
datafusion/physical-expr/src/analysis.rs
datafusion/physical-expr/src/aggregate
datafusion/physical-expr/src/aggregate/mod.rs
datafusion/physical-expr/src/aggregate/stats.rs
datafusion/physical-expr/src/aggregate/groups_accumulator
datafusion/physical-expr/src/aggregate/groups_accumulator/adapter.rs
datafusion/physical-expr/src/aggregate/groups_accumulator/mod.rs
datafusion/physical-expr/src/lib.rs
datafusion/physical-expr/src/functions.rs
datafusion/physical-expr/src/utils
datafusion/physical-expr/src/utils/guarantee.rs
datafusion/physical-expr/src/utils/mod.rs
datafusion/physical-expr/src/planner.rs
datafusion/physical-expr/src/intervals
datafusion/physical-expr/src/intervals/test_utils.rs
datafusion/physical-expr/src/intervals/mod.rs
datafusion/physical-expr/src/intervals/cp_solver.rs
datafusion/physical-expr/src/intervals/utils.rs
datafusion/physical-expr/src/equivalence
datafusion/physical-expr/src/equivalence/properties.rs
datafusion/physical-expr/src/equivalence/mod.rs
datafusion/physical-expr/src/equivalence/projection.rs
datafusion/physical-expr/src/equivalence/class.rs
datafusion/physical-expr/src/equivalence/ordering.rs
datafusion/physical-expr/src/window
datafusion/physical-expr/src/window/row_number.rs
datafusion/physical-expr/src/window/ntile.rs
datafusion/physical-expr/src/window/window_expr.rs
datafusion/physical-expr/src/window/lead_lag.rs
datafusion/physical-expr/src/window/cume_dist.rs
datafusion/physical-expr/src/window/nth_value.rs
datafusion/physical-expr/src/window/built_in.rs
datafusion/physical-expr/src/window/built_in_window_function_expr.rs
datafusion/physical-expr/src/window/aggregate.rs
datafusion/physical-expr/src/window/rank.rs
datafusion/physical-expr/src/window/mod.rs
datafusion/physical-expr/src/window/sliding_aggregate.rs
datafusion/physical-expr/src/expressions
datafusion/physical-expr/src/expressions/like.rs
datafusion/physical-expr/src/expressions/negative.rs
datafusion/physical-expr/src/expressions/literal.rs
datafusion/physical-expr/src/expressions/cast.rs
datafusion/physical-expr/src/expressions/binary.rs
datafusion/physical-expr/src/expressions/is_not_null.rs
datafusion/physical-expr/src/expressions/unknown_column.rs
datafusion/physical-expr/src/expressions/mod.rs
datafusion/physical-expr/src/expressions/is_null.rs
datafusion/physical-expr/src/expressions/try_cast.rs
datafusion/physical-expr/src/expressions/column.rs
datafusion/physical-expr/src/expressions/no_op.rs
datafusion/physical-expr/src/expressions/binary
datafusion/physical-expr/src/expressions/binary/kernels.rs
datafusion/physical-expr/src/expressions/case.rs
datafusion/physical-expr/src/expressions/in_list.rs
datafusion/physical-expr/src/expressions/not.rs
datafusion/physical-expr/src/scalar_function.rs
datafusion/physical-expr/src/math_expressions.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think half of them are window
which will be moved to functions-window-common
in the future. Another half are equivalence
. However, I agree they are still unnecessary dependency for functions-aggregate
. 😞 I think we could iteratively move window
and maybe equivalence
🤔 out from physical-expr
and keep most of expressions under datafusion/physical-expr/src/expressions/
at the end if we expect minimum dependency for 3rd party.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can see if window
and/or equivalence
staying here causes any problems in practice. If so, I agree that they can be moved in the future. However, it is also possible that they do not pose any challenges in practice and can stay. We'll see soon enough 🚀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can see if
window
and/orequivalence
staying here causes any problems in practice. If so, I agree that they can be moved in the future. However, it is also possible that they do not pose any challenges in practice and can stay. We'll see soon enough 🚀
💯 agree
FWIW one project that would likely result in moving window functions would be #8709
I suspect that is a relatively smaller project than moving scalar or aggregates as there are so many fewer window functions
Signed-off-by: jayzhan211 <[email protected]>
I know this is still a draft, but can you remind us which issue/discussion this refactor is related to? Is it about removing logical stuff from aggregates? We can help more effectively with more context. Refactors are great when done judiciously but we should make sure we solve pressing pain points when we make major API changes or touch so many files at the same time. If removing logicals from aggregates is the intent here, that would be great to achieve and IMO would be worth to do a big-ish refactor for. |
Signed-off-by: jayzhan211 <[email protected]>
I agree it would be great to ensure this PR has the background / rationale for the changes |
Signed-off-by: jayzhan211 <[email protected]>
@ozankabak @alamb This PR is from #11359, the main goal is to eliminate logical expressions for creating physical expressions. |
Signed-off-by: jayzhan211 <[email protected]>
@@ -20,10 +20,13 @@ | |||
#[macro_use] | |||
mod binary; | |||
mod case; | |||
mod cast; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move cast, literal, column back to physical-expr, since physical-expr-common is now a common-level
crate
@@ -108,35 +106,6 @@ pub fn reverse_order_bys(order_bys: &[PhysicalSortExpr]) -> Vec<PhysicalSortExpr | |||
.collect() | |||
} | |||
|
|||
/// Converts `datafusion_expr::Expr` into corresponding `Arc<dyn PhysicalExpr>`. | |||
/// If conversion is not supported yet, returns Error. | |||
pub fn limited_convert_logical_expr_to_physical_expr_with_dfschema( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
workaround cleanup
/// an error if the given schema has fewer columns than the original schema. | ||
/// Note that the resulting expression may not be valid if data types in the | ||
/// new schema is incompatible with expression nodes. | ||
pub fn with_new_schema( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to physical_expr::expressions::column
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this takes in a general PhysicalExpr
and rewrites it, it is more natural for it to be a top-level utility. Do we have to nest it inside expressions::column
due to some dependency issues? Avoiding that nesting would be good due to the signature/function of this routine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The best place we could move to is probably expressions::mod
, it should be in the same crate with Column
, but I don't think there is much different, so I keep it besides column
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expressions::mod
sounds good to me 👍
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
let n = match acc_args.physical_exprs[1] | ||
.as_any() | ||
.downcast_ref::<Literal>() | ||
{ | ||
Some(lit) => match lit.value() { | ||
ScalarValue::Int64(Some(value)) => { | ||
if acc_args.is_reversed { | ||
-*value | ||
} else { | ||
*value | ||
} | ||
} | ||
_ => { | ||
return not_impl_err!( | ||
"{} not supported for n: {}", | ||
self.name(), | ||
&acc_args.physical_exprs[1] | ||
); | ||
} | ||
}, | ||
None => { | ||
return not_impl_err!( | ||
"{} not supported for n: {}", | ||
self.name(), | ||
&acc_args.physical_exprs[1] | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing code is readable in the current form.
But I wanted to explore ways for removing the repetition in the error handling block. Finally, ended up having to split it into two separate bindings.
I don't have any strong opinions about this. So please feel free to use it or not 😃
let scalar = acc_args.physical_exprs[1]
.as_any()
.downcast_ref::<Literal>()
.and_then(|lit| {
if let ScalarValue::Int64(inner) = lit.value() {
Some(*inner)
} else {
None
}
})
.flatten();
let n = if let Some(value) = scalar {
if acc_args.is_reversed {
-value
} else {
value
}
} else {
return not_impl_err!(
"{} not supported for n: {}",
self.name(),
&acc_args.physical_exprs[1]
);
};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I come out another way that is slightly more readable
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
use datafusion_expr::function::StateFieldsArgs; | ||
use datafusion_expr::type_coercion::aggregates::check_arg_count; | ||
use datafusion_expr::utils::AggregateOrderSensitivity; | ||
use datafusion_expr::AggregateUDF; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The blocker for us to move this crate into common-level
crate functions-aggregate-common
is AggregateUDF
. It has fn call() -> Expr
that tightly coupled with Expr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should remove call()
, but create another function like
aggregate_expr(sum_udaf(), vec![args]) -> Expr
And, aggregate_physical_expr(sum_udaf(), vec![physical_args]) -> Arc<dyn AggregateExpr>>
similarly. 🤔
Or we force to use ExprFuncBuilder for whatever
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for driving this forward.
I gave this a careful read and left some comments/suggestions. Most of them are stylistic, but there are also some with questions and possible discussion points. I will be happy to iterate with you to move this forward so we succeed in removing logical expressions from accumulators/aggregates.
One general point in addition to inline comments: Given that the structs will only contain physical expressions after this, I'm not sure if we need to be overly explicit and call the expression fields physical_exprs
. The prefix physical would make sense in the old design where both logical and physical expressions were there, but now we only have one type of expression. So calling it exprs
seem to make more sense to me (and the linter will show the type anyway).
use datafusion_physical_expr::expressions::col; | ||
use datafusion_physical_expr::expressions::lit; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use datafusion_physical_expr::expressions::col; | |
use datafusion_physical_expr::expressions::lit; | |
use datafusion_physical_expr::expressions::{col, lit}; |
use crate::type_coercion::binary::get_result_type; | ||
use crate::Operator; | ||
use arrow_buffer::{IntervalDayTime, IntervalMonthDayNano}; | ||
use arrow::datatypes::{IntervalDayTime, IntervalMonthDayNano}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can move this down next to other arrow::datatypes
imports and have one import statement instead of three for arrow datatypes
use datafusion_physical_expr::expressions::Column; | ||
use datafusion_physical_expr::expressions::{IsNotNullExpr, IsNullExpr}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use datafusion_physical_expr::expressions::Column; | |
use datafusion_physical_expr::expressions::{IsNotNullExpr, IsNullExpr}; | |
use datafusion_physical_expr::expressions::{Column, IsNotNullExpr, IsNullExpr}; |
use crate::expressions::{col, lit, try_cast, Literal}; | ||
|
||
use crate::expressions::Column; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use crate::expressions::{col, lit, try_cast, Literal}; | |
use crate::expressions::Column; | |
use crate::expressions::{col, lit, try_cast, Column, Literal}; | |
use super::Column; | ||
use super::Literal; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use super::Column; | |
use super::Literal; | |
use super::{Column, Literal}; |
@@ -395,6 +379,8 @@ impl BuiltInWindowFunctionExpr for WindowUDFExpr { | |||
} | |||
} | |||
|
|||
// TODO: Find a way to make clippy happy | |||
#[allow(clippy::needless_borrow)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why we get this warning now. Is it because we used to clone with expr
and now do it with &expr
? We should understand if this is a spurious warning or indicative of a code smell.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
error: this expression creates a reference which is immediately dereferenced by the compiler
--> datafusion/physical-plan/src/windows/mod.rs:401:28
|
401 | Arc::clone(&expr),
| ^^^^^ help: change this to: `expr`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow
= note: `-D clippy::needless-borrow` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(clippy::needless_borrow)]`
This is the clippy error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I presume if you just change to expr
, like how it was before, we get some other error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, actually it has no error. The error I got is from rust-analyzer.
expected &Arc<dyn PhysicalExpr, Global>, found Arc<dyn PhysicalExpr, Global>
😕
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
I cleanup dfschema and input_types together in this PR because they turned out to be more straightforward than I initially thought |
datafusion-expr
and remove logical expressions for creating physical aggregate expression
datafusion-expr
and remove logical expressions for creating physical aggregate expressiondatafusion-expr
and remove logical expressions for creating physical aggregate expression
datafusion-expr
and remove logical expressions for creating physical aggregate expressiondatafusion-expr
and remove logical expressions requirement for creating physical aggregate expression
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this. I looked at this in detail and it looks good to me.
Hopefully we will not need to do such big refactors in the future as we stabilize the architecture.
Given that it is a big PR with many changes (both structurally and API), it would be good to get multiple eyes on this. @alamb, can you please take a look? Also @andygrove, you were interested in removing logical expressions from aggregates, it'd be good if you can take a look as well. I also announced on Discord and Slack to attract more eyes.
Thanks @ozankabak for your review! |
pub is_distinct: bool, | ||
|
||
/// The physical expression of arguments the aggregate function takes. | ||
pub exprs: &'a [Arc<dyn PhysicalExpr>], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Introducing PhysicalExpr
and PhysicalSortExpr
to AccumulatorArgs
which used in AggregateUDFImpl
is the main reason for this large crate refactor
It would be great to visualize the dependency graph after this refactor. We can probably use diagrams like that in our documentation too. |
reduced datafusion dependencies.pdf I'm trying to wrap my head around the current dependency graph and what the purpose of each of the crates is. I generated this diagram using Hopefully this helps. |
FYI @andygrove and @viirya as I suspect this may impact comet (though I haven't looked at it thoroughly yet) Given the size of this change and that we are getting close (days if not sooner) to releasing DataFusion 41 -- #11476 -- it might be a good idea to wait to merge this PR until right after the RC is created That would give these changes the maximal "bake time" on main and give early adopters who integrate from main rather than releases on crates.io time to test and make adjustments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jayzhan211 and @ozankabak -- I think this is an epic refactor and very nicely / cleanly separates the logical from physical APIs ❤️
I am sure we can always improve things going forward too, but I think this is a great step in the right direction
I left some small documentation/comment suggestions but I think they could be done as a follow on PR as well.
I do think we should consider waiting to merge until we have made a 41.0.0 release candidate so we have the maximum time to work out any kinks this might cause
|
||
[package] | ||
name = "datafusion-expr-common" | ||
description = "Logical plan and expression representation for DataFusion query engine" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could adjust this to say it had commonly shared types
description = "Logical plan and expression representation for DataFusion query engine" | |
description = "Traits and types for logical plans and expressions for DataFusion query engine" |
@@ -17,7 +17,7 @@ | |||
|
|||
//! Vectorized [`GroupsAccumulator`] | |||
|
|||
use arrow_array::{ArrayRef, BooleanArray}; | |||
use arrow::array::{ArrayRef, BooleanArray}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
datafusion/expr-common/src/lib.rs
Outdated
//! Logical Expr Common packages for [DataFusion] | ||
//! | ||
//! [DataFusion]: <https://crates.io/crates/datafusion> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think some additional rationale here might help
//! Logical Expr Common packages for [DataFusion] | |
//! | |
//! [DataFusion]: <https://crates.io/crates/datafusion> | |
//! Logical Expr types and traits for [DataFusion] | |
//! | |
//! This crate contains types and traits that are used by both Logical and Physical expressions. | |
//! They are kept in their own crate to avoid physical expressions depending on logical expressions. | |
//! | |
//! | |
//! [DataFusion]: <https://crates.io/crates/datafusion> |
@@ -287,202 +280,3 @@ impl fmt::Display for Operator { | |||
write!(f, "{display}") | |||
} | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
|
||
[package] | ||
name = "datafusion-functions-aggregate-common" | ||
description = "Common aggregate function packages for the DataFusion query engine" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
description = "Common aggregate function packages for the DataFusion query engine" | |
description = "Utility functions for implementing aggregate functions for the DataFusion query engine" |
//! Contains the trait `AggregateExpr` which defines the interface all aggregate expressions | ||
//! (built-in and custom) need to satisfy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this first sentence is rendered as the summary on the docs page, I think it is nice to make a link to the relevant struct/trait if possible
for example https://docs.rs/datafusion/latest/datafusion/index.html
//! Contains the trait `AggregateExpr` which defines the interface all aggregate expressions | |
//! (built-in and custom) need to satisfy. | |
//! [`AggregateExpr`] which defines the interface all aggregate expressions | |
//! (built-in and custom) need to satisfy. |
//! Aggregate Function Common packages for [DataFusion] | ||
//! This package could be used to build for 3rd party aggregate function | ||
//! | ||
//! [DataFusion]: <https://crates.io/crates/datafusion> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
//! Aggregate Function Common packages for [DataFusion] | |
//! This package could be used to build for 3rd party aggregate function | |
//! | |
//! [DataFusion]: <https://crates.io/crates/datafusion> | |
//! Common Aggregate functionality for [DataFusion] | |
//! | |
//! This crate contains traits and utilities commonly used to implement aggregate functions | |
//! They are kept in their own crate to avoid physical expressions depending on logical expressions. | |
//! | |
//! [DataFusion]: <https://crates.io/crates/datafusion> |
@@ -44,6 +44,8 @@ arrow-schema = { workspace = true } | |||
datafusion-common = { workspace = true } | |||
datafusion-execution = { workspace = true } | |||
datafusion-expr = { workspace = true } | |||
datafusion-functions-aggregate-common = { workspace = true } | |||
datafusion-physical-expr = { workspace = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by "thin"? I checked on this branch and here is what datafusion/physical-expr still has
(we could potentially move the groups accumulators into datafusion-functions-aggegate-common
as well, as a follow on PR)
$ find datafusion/physical-expr/src
datafusion/physical-expr/src
datafusion/physical-expr/src/physical_expr.rs
datafusion/physical-expr/src/partitioning.rs
datafusion/physical-expr/src/analysis.rs
datafusion/physical-expr/src/aggregate
datafusion/physical-expr/src/aggregate/mod.rs
datafusion/physical-expr/src/aggregate/stats.rs
datafusion/physical-expr/src/aggregate/groups_accumulator
datafusion/physical-expr/src/aggregate/groups_accumulator/adapter.rs
datafusion/physical-expr/src/aggregate/groups_accumulator/mod.rs
datafusion/physical-expr/src/lib.rs
datafusion/physical-expr/src/functions.rs
datafusion/physical-expr/src/utils
datafusion/physical-expr/src/utils/guarantee.rs
datafusion/physical-expr/src/utils/mod.rs
datafusion/physical-expr/src/planner.rs
datafusion/physical-expr/src/intervals
datafusion/physical-expr/src/intervals/test_utils.rs
datafusion/physical-expr/src/intervals/mod.rs
datafusion/physical-expr/src/intervals/cp_solver.rs
datafusion/physical-expr/src/intervals/utils.rs
datafusion/physical-expr/src/equivalence
datafusion/physical-expr/src/equivalence/properties.rs
datafusion/physical-expr/src/equivalence/mod.rs
datafusion/physical-expr/src/equivalence/projection.rs
datafusion/physical-expr/src/equivalence/class.rs
datafusion/physical-expr/src/equivalence/ordering.rs
datafusion/physical-expr/src/window
datafusion/physical-expr/src/window/row_number.rs
datafusion/physical-expr/src/window/ntile.rs
datafusion/physical-expr/src/window/window_expr.rs
datafusion/physical-expr/src/window/lead_lag.rs
datafusion/physical-expr/src/window/cume_dist.rs
datafusion/physical-expr/src/window/nth_value.rs
datafusion/physical-expr/src/window/built_in.rs
datafusion/physical-expr/src/window/built_in_window_function_expr.rs
datafusion/physical-expr/src/window/aggregate.rs
datafusion/physical-expr/src/window/rank.rs
datafusion/physical-expr/src/window/mod.rs
datafusion/physical-expr/src/window/sliding_aggregate.rs
datafusion/physical-expr/src/expressions
datafusion/physical-expr/src/expressions/like.rs
datafusion/physical-expr/src/expressions/negative.rs
datafusion/physical-expr/src/expressions/literal.rs
datafusion/physical-expr/src/expressions/cast.rs
datafusion/physical-expr/src/expressions/binary.rs
datafusion/physical-expr/src/expressions/is_not_null.rs
datafusion/physical-expr/src/expressions/unknown_column.rs
datafusion/physical-expr/src/expressions/mod.rs
datafusion/physical-expr/src/expressions/is_null.rs
datafusion/physical-expr/src/expressions/try_cast.rs
datafusion/physical-expr/src/expressions/column.rs
datafusion/physical-expr/src/expressions/no_op.rs
datafusion/physical-expr/src/expressions/binary
datafusion/physical-expr/src/expressions/binary/kernels.rs
datafusion/physical-expr/src/expressions/case.rs
datafusion/physical-expr/src/expressions/in_list.rs
datafusion/physical-expr/src/expressions/not.rs
datafusion/physical-expr/src/scalar_function.rs
datafusion/physical-expr/src/math_expressions.rs
@@ -136,6 +138,35 @@ pub fn col(name: &str, schema: &Schema) -> Result<Arc<dyn PhysicalExpr>> { | |||
Ok(Arc::new(Column::new_with_schema(name, schema)?)) | |||
} | |||
|
|||
// TODO: Move expressions out of physical-expr? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to where would it go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leftover, it should be kept with expressions/
) | ||
let args = [col("b", schema)?]; | ||
|
||
AggregateExprBuilder::new(first_value_udaf(), args.to_vec()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😍
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Reasonable, agreed |
I plan on creating 41.0.0-rc1 today |
Signed-off-by: jayzhan211 <[email protected]>
RC1 is out https://lists.apache.org/thread/gz15fc6qbn4q99rp0of6xjsngvf8jbw0 so let's get this PR in! Thanks again @jayzhan211 and @ozankabak 👏 |
Thanks everyone 🚀 |
🚀 |
Which issue does this PR close?
Closes #11359
Closes #11761
Rationale for this change
Logical expressions are no longer required for creating physical aggregate expression
What changes are included in this PR?
Lots of crate created
expr
has dependency onfunctions-aggregate-common
+physical-expr-common
physical-expr
Other cleanup
create_aggregate_expr_with_dfschema
removedcreate_aggregate_expr
removedinput_types
inAccumulatorArgs
dfschema
inAccumulatorArgs
data_type
toreturn_type
inAccumulatorArgs
Are these changes tested?
Are there any user-facing changes?