-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make CommonSubexprEliminate
top-down like
#11683
Conversation
Some((common_aggr_exprs, mut aggr_list)) => { | ||
let new_aggr_expr = aggr_list.pop().unwrap(); | ||
|
||
let mut agg_exprs = common_aggr_exprs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is basically the same as it was: https://github.com/apache/datafusion/pull/11683/files#diff-351499880963d6a383c92e156e75019cd9ce33107724a9635853d7d4cd1898d0L522
.map_data(|(new_window_expr_list, new_input, window_expr_list)| { | ||
// If there were common expressions extracted, then we need to make sure | ||
// we restore the original column names. | ||
// TODO: Although `find_common_exprs()` inserts aliases around extracted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to fix the previous TODO
(https://github.com/apache/datafusion/pull/11683/files#diff-351499880963d6a383c92e156e75019cd9ce33107724a9635853d7d4cd1898d0L563) but realized that preserving names is still required, so I added 2 TODO
s where we have that logic.
I will try to get rid of them in a follow-up PR.
cc @alamb |
@@ -1963,6 +1944,52 @@ mod test { | |||
Ok(()) | |||
} | |||
|
|||
#[test] | |||
fn test_non_top_level_common_expression() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New test for the possible perf regression (1.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I verified without the code chages in this PR this test fails like this:
failed to optimize plan
thread 'common_subexpr_eliminate::test::test_non_top_level_common_expression' panicked at datafusion/optimizer/src/common_subexpr_eliminate.rs:1215:9:
failed to optimize plan
stack backtrace:
0: rust_begin_unwind
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/std/src/panicking.rs:652:5
1: core::panicking::panic_fmt
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/core/src/panicking.rs:72:14
2: datafusion_optimizer::common_subexpr_eliminate::test::assert_optimized_plan_eq
at ./src/common_subexpr_eliminate.rs:1215:9
3: datafusion_optimizer::common_subexpr_eliminate::test::test_non_top_level_common_expression
at ./src/common_subexpr_eliminate.rs:1984:9
4: datafusion_optimizer::common_subexpr_eliminate::test::test_non_top_level_common_expression::{{closure}}
at ./src/common_subexpr_eliminate.rs:1967:50
5: core::ops::function::FnOnce::call_once
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/core/src/ops/function.rs:250:5
6: core::ops::function::FnOnce::call_once
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
} | ||
|
||
#[test] | ||
fn test_nested_common_expression() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New test for the top-down improvement (2.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I verified this test fails without the changes in this PR:
assertion `left == right` failed
left: "Projection: __common_expr_1 AS c1, __common_expr_1 AS c2\n Projection: __common_expr_2 * __common_expr_2 AS __common_expr_1, test.a, test.b, test.c\n Projection: test.a + test.b AS __common_expr_2, test.a, test.b, test.c\n TableScan: test"
right: "Projection: __common_expr_1 AS c1, __common_expr_1 AS c2\n Projection: (test.a + test.b) * (test.a + test.b) AS __common_expr_1, test.a, test.b, test.c\n TableScan: test"
<Click to see difference>
thread 'common_subexpr_eliminate::test::test_nested_common_expression' panicked at datafusion/optimizer/src/common_subexpr_eliminate.rs:1218:9:
assertion `left == right` failed
left: "Projection: __common_expr_1 AS c1, __common_expr_1 AS c2\n Projection: __common_expr_2 * __common_expr_2 AS __common_expr_1, test.a, test.b, test.c\n Projection: test.a + test.b AS __common_expr_2, test.a, test.b, test.c\n TableScan: test"
right: "Projection: __common_expr_1 AS c1, __common_expr_1 AS c2\n Projection: (test.a + test.b) * (test.a + test.b) AS __common_expr_1, test.a, test.b, test.c\n TableScan: test"
stack backtrace:
0: rust_begin_unwind
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/std/src/panicking.rs:652:5
1: core::panicking::panic_fmt
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/core/src/panicking.rs:72:14
2: core::panicking::assert_failed_inner
3: core::panicking::assert_failed
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/core/src/panicking.rs:363:5
4: datafusion_optimizer::common_subexpr_eliminate::test::assert_optimized_plan_eq
at ./src/common_subexpr_eliminate.rs:1218:9
5: datafusion_optimizer::common_subexpr_eliminate::test::test_nested_common_expression
at ./src/common_subexpr_eliminate.rs:2007:9
6: datafusion_optimizer::common_subexpr_eliminate::test::test_nested_common_expression::{{closure}}
at ./src/common_subexpr_eliminate.rs:1990:43
7: core::ops::function::FnOnce::call_once
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/core/src/ops/function.rs:250:5
8: core::ops::function::FnOnce::call_once
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Namely the output projection looks like
Projection: __common_expr_1 AS c1, __common_expr_1 AS c2
Projection: (test.a + test.b) * (test.a + test.b) AS __common_expr_1, test.a, test.b, test.c
TableScan: test
Rather than
Projection: __common_expr_1 AS c1, __common_expr_1 AS c2
Projection: __common_expr_2 * __common_expr_2 AS __common_expr_1, test.a, test.b, test.c
Projection: test.a + test.b AS __common_expr_2, test.a, test.b, test.c
TableScan: test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the 2nd is a better plan and after this PR we reach it with executing the rule only once. (Before this PR the optimizer had to execute the rule one more time to reach the 2nd plan from the 1st.)
Thank you @peter-toth -- this is on my review list |
Interestingly, I ran the planning benchmarks and this branch actually seems to be ever so slightly slower. I'll rerun to see if I can reproduce the same results. cargo bench --bench sql_planner
|
Yeah, this is basically the trade-off between the time we spend in planning and execution. These benchmarks only measure planning time and that might be a bit more with this PR. #10473 could cause regression due to no longer extracting common expressions from non top level nodes and so causing longer execution times in some cases. (This side effect was not intentional, I just didn't notice it with the test available at that time.) This PR restores the common expression elimination from non top level nodes and so fix the possible execution time regression, but it comes with a small cost in planning time. BTW, the TPCH queries are not affected by the regression (there are no non top level nodes with common expressions in them) so this PR only makes their planning slower but doesn't improve their execution times. But I still think that the possible execution time regression that #10473 can cause needs to be fixed. Update: I've ellaborated on 1. in the PR description. |
I reran the planning benchmarks and I see the slowdown again Details
|
I am sorry for the delay in reviewing this PR -- my big hesitation is that after all the work we have done to improve planning time this makes planning time worse I understand that there is a tradeoff wher the plan execution time should decrease for other plans. However, there are no changes to existing tests, thus suggesting that this isn't a widely applicable optimziation So what I am hoping to do (or maybe someone will beat me to it) is figure out how to have my cake and eat it too (aka optimize this code so it doesn't slow down planning but still makes better plans) |
@alamb , I just realized that I have no idea how I missed that fact. Maybe because I was working on parts of the rule where it does call |
1b67311
to
6a62811
Compare
I believe I understood what happened here. The main problem with the top-down conversion is that in some cases Then I added So I think the right fix is to:
I've rebased the previous commit on the latest I ran some benchmarks and this PR is now better than
|
I've updated the PR description and PR is ready for review again. |
Thanks @peter-toth I also reran the benchmarks and no longer see any regression. I will get this PR reviewed carefully shortly (hopefully later today or tomorrow) Details
|
CI failure seemed infra related https://github.com/apache/datafusion/actions/runs/10267531872/job/28408513630?pr=11683 so I restarted it on this PR |
I am sorry -- something came up at work today and I am super behind on reviews. this is very high on my list for tomorow |
No worries @alamb, this PR can wait. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @peter-toth
I reviewed this PR quite carefully and I think it is very nice 👌
I had a suggestion on readability (use a named struct) in peter-toth#4 but we could also merge that as a follow on PR (or never)
@@ -688,7 +702,10 @@ impl OptimizerRule for CommonSubexprEliminate { | |||
} | |||
|
|||
fn apply_order(&self) -> Option<ApplyOrder> { | |||
Some(ApplyOrder::TopDown) | |||
// This rule handles recursion itself in a `ApplyOrder::TopDown` like manner. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯
fn rewrite_expr( | ||
/// 2. An optional tuple that contains the extracted common sub-expressions and the | ||
/// original `exprs_list`. | ||
fn find_common_exprs( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very nice refactor / rename -- calling it find_common_exprs
is 💯 and I think makes it much easier to follow
/// Rewrites the expression in `exprs_list` with common sub-expressions | ||
/// replaced with a new column and adds a ProjectionExec on top of `input` | ||
/// which computes any replaced common sub-expressions. | ||
/// Extracts common sub-expressions and rewrites `exprs_list`. | ||
/// | ||
/// Returns a tuple of: | ||
/// 1. The rewritten expressions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// 1. The rewritten expressions | |
/// 1. The (potentially) rewritten expressions |
Expr::Column(Column::from_name(expr_alias)).alias(out_name), | ||
); | ||
let input = unwrap_arc(input); | ||
// Extract common sub-expressions from the aggregate and grouping expressions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this new structure is very nice
)?; | ||
transformed |= rewrite_exprs.transformed; | ||
expr_mask: ExprMask, | ||
) -> Result<Transformed<(Vec<Vec<Expr>>, FindCommonExprResult)>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found reasoning about this return type (Vec<Vec<Expr>>, FindCommonExprResult)
hard (I couldn't keep track in my head of the three potential fields that were returned)
I made a PR here with a proposal to move it into a named enum
: peter-toth#4
(I can also make that a PR to main if we would prefer to merge this PR as is)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for PR, I've just merged it.
* Extract the result of find_common_exprs into a struct * Make naming consistent
Thanks @peter-toth -- I plan to merge this PR tomorrow (after the #11476 RC is cut) |
# Conflicts: # datafusion/optimizer/src/common_subexpr_eliminate.rs
Thanks again @peter-toth -- the code continues to look nicer and nicer |
Which issue does this PR close?
Part of #11194.
Rationale for this change
This PR contains 2 ideas:
In Stop copying LogicalPlan and Exprs in
CommonSubexprEliminate
(2-3% planning speed improvement) #10835 theCommonSubexprEliminate
rule was converted to aApplyOrder::TopDown
rule, but also the self.rewrite() call was kept.The main problem with the top-down conversion is that in some cases
CommonSubexprEliminate
collects adjacent nodes (e.g.Window
) into groups and once we eliminated subexpression among those groups we transform back the result into adjacent nodes. This kind of transformations can't be definied with a simple top-down optimizer rule. (Most likely this is why the rule handled recursion itself before Stop copying LogicalPlan and Exprs inCommonSubexprEliminate
(2-3% planning speed improvement) #10835.)This means that
CommonSubexprEliminate
should not be aApplyOrder::TopDown
rule, but it should handle recursion itself. But if we reverse the top-down conversion then there is another issue:Improve
CommonSubexprEliminate
identifier management (10% faster planning) #10473 changedto_arrays()
to return a boolean flag if it make sense to execute the 2nd rewriting traversal, that does the actual common expression extraction. E.g. iffound_common
is falserewrite_expr()
is not executed:datafusion/datafusion/optimizer/src/common_subexpr_eliminate.rs
Lines 609 to 622 in 204e1bc
The problem is that calling
self.rewrite()
is currently inrewrite_expr()
:datafusion/datafusion/optimizer/src/common_subexpr_eliminate.rs
Lines 278 to 285 in 204e1bc
(I.e. Improve
CommonSubexprEliminate
identifier management (10% faster planning) #10473 ruined the self recursion handling of the rule.)So what we need to do is:
self.rewrite()
call out of thefound_common
check.The current rule is not optimal as extracted common expressions are not sub-expression eliminated in the current rule exection. This is because the rule recurses into the child plan nodes with
self.rewrite()
and then adds the new projection from the extracted common expressions:The issue with this approach is that even the new projection can contain sub-expressions to eliminate.
E.g. a plan like
can be rewritten to:
but the current rule requires 2 rule executions (optimizer cycles) to reach the final plan.
This can be improved by swapping the order of adding the new project and calling
self.rewrite()
.What changes are included in this PR?
This PR:
apply_order()
to returnNone
.rewrite_expr()
intofind_common_exprs()
to extract common sub-expressions and rewrite an expression list. The step of recursing into child plan nodes is moved out from this method. This wayfind_common_exprs()
can safely leverage the boolean ofto_arrays()
to skip the 2nd traversal.try_unary_plan()
,try_optimize_aggregate()
andtry_optimize_window()
.Are these changes tested?
Yes, added new UTs.
Are there any user-facing changes?
Yes, it fixes a possible performance regression.