Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-50380][SQL] ReorderAssociativeOperator should respect the contract in ConstantFolding #48918

Closed
wants to merge 1 commit into from

Conversation

cloud-fan
Copy link
Contributor

@cloud-fan cloud-fan commented Nov 21, 2024

What changes were proposed in this pull request?

This PR fixes a long-standing issue in ReorderAssociativeOperator. In this rule, we flatten the Add/Multiply nodes, and combine the foldable operands into a single Add/Multiply, then evaluate it into a literal. This is fine normally, but we added a new contract in ConstantFolding with #36468 , due to the introduction of ANSI mode and we don't want to fail eagerly for expressions within conditional branches. ReorderAssociativeOperator does not follow this contract.

The solution in this PR is to leave the expression evaluation to ConstantFolding. ReorderAssociativeOperator should only match literals. This makes sure that the early expression evaluation follows all the contracts in ConstantFolding.

Why are the changes needed?

Avoid failing the query which should not fail. This also fixes a regression caused by #48395 , which does not introduce the bug, but makes the bug more likely to happen.

Does this PR introduce any user-facing change?

Yes, failed queries can run now.

How was this patch tested?

new test

Was this patch authored or co-authored using generative AI tooling?

no

@github-actions github-actions bot added the SQL label Nov 21, 2024
@cloud-fan
Copy link
Contributor Author

cc @yaooqinn @viirya

val optimized2 = Optimize.execute(originalQuery2)
val correctAnswer2 = testRelation.select(
If($"a" === 1, 1, $"b" + (Literal(Int.MaxValue) + 1)).as("col")).analyze
comparePlans(optimized2, correctAnswer2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any difference between originalQuery2 and correctAnswer2?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The association is different.

  • originalQuery2: (A + B) + C
  • correctAnswer2: A + (B + C)

@@ -106,4 +107,17 @@ class ReorderAssociativeOperatorSuite extends PlanTest {

comparePlans(optimized, correctAnswer)
}

test("SPARK-50380: conditional branches with error expression") {
val originalQuery1 = testRelation.select(If($"a" === 1, 1L, Literal(1).div(0) + $"b")).analyze
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding this.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you, @cloud-fan and @viirya .

@dongjoon-hyun
Copy link
Member

Unfortunately, it seems that the test case assumes ANSI setting only and broke non-ANSI CI. I made a follow-up, @cloud-fan and @viirya .

dongjoon-hyun added a commit that referenced this pull request Nov 24, 2024
…hes with error expression test

### What changes were proposed in this pull request?

This is a follow-up to recover non-ANSI CI.
- #48918

### Why are the changes needed?

The original PR broke non-ANSI CI because the test case assumes ANSI setting.

- https://github.com/apache/spark/actions/runs/11964792566
- https://github.com/apache/spark/actions/runs/11982859814

### Does this PR introduce _any_ user-facing change?

No, this is a test-only change.

### How was this patch tested?

Manual tests.

**BEFORE**

```
$ SPARK_ANSI_SQL_MODE=false build/sbt "catalyst/testOnly *.ReorderAssociativeOperatorSuite -- -z SPARK-50380"
...
[info] *** 1 TEST FAILED ***
[error] Failed tests:
[error] 	org.apache.spark.sql.catalyst.optimizer.ReorderAssociativeOperatorSuite
[error] (catalyst / Test / testOnly) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 8 s, completed Nov 23, 2024, 11:50:45 AM
```

**AFTER**

```
$ SPARK_ANSI_SQL_MODE=false build/sbt "catalyst/testOnly *.ReorderAssociativeOperatorSuite -- -z SPARK-50380"
...
[info] ReorderAssociativeOperatorSuite:
[info] - SPARK-50380: conditional branches with error expression (508 milliseconds)
[info] Run completed in 1 second, 413 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 11 s, completed Nov 23, 2024, 11:51:34 AM
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #48943 from dongjoon-hyun/SPARK-50380.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants