[analyzer] Avoid out-of-order node traversal on void return #117863

necto · 2024-11-27T09:59:23Z

The motivating example: https://compiler-explorer.com/z/WjsxYfs43

#include <stdlib.h>
void inf_loop_break_callee() {
  void* data = malloc(10);
  while (1) {
    (void)data; // line 3
    break; // -> execution continues on line 3 ?!!
  }
}

To correct the flow steps in this example (see the fixed version in the added test case) I changed two things in the engine:

Make processCallExit create a new StmtPoint only for return statements. If the last non-jump statement is not a return statement, e.g. (void)data;, it is no longer inserted in the exploded graph after the function exit.
Optionally skip the purge program points. In the example above, purge points are still inserted after the break; executes. Now, when the bug reporter is looking for the next statement executed after the function execution is finished, it will ignore the purge program points, so it won't confusingly pick the (void)data; statement.

CPP-5778

The motivating example: ```C++ void inf_loop_break_callee() { void* data = malloc(10); while (1) { (void)data; // line 4 break; // -> execution continues on line 4 ?!! } } void inf_loop_break_caller() { inf_loop_break_callee(); } ``` To correct the flow steps in this example (see the fixed version in the added test case) I changed two things in the engine: - Make `processCallExit` create a new StmtPoint only for return statements. If the last non-jump statement is not a return statement, e.g. `(void)data;`, it is no longer inserted in the exploded graph after the function exit. - Optionally skip the purge program points. In the example above, purge points are still inserted after the `break;` executes. Now, when the bug reporter is looking for the next statement executed after the function execution is finished, it will ignore the purge program points, so it won't confusingly pick the `(void)data;` statement.

llvmbot · 2024-11-27T09:59:59Z

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Arseniy Zaostrovnykh (necto)

Changes

The motivating example:

void inf_loop_break_callee() {
  void* data = malloc(10);
  while (1) {
    (void)data; // line 4
    break; // -&gt; execution continues on line 4 ?!!
  }
}
void inf_loop_break_caller() {
  inf_loop_break_callee();
}

To correct the flow steps in this example (see the fixed version in the added test case) I changed two things in the engine:

Make processCallExit create a new StmtPoint only for return statements. If the last non-jump statement is not a return statement, e.g. (void)data;, it is no longer inserted in the exploded graph after the function exit.
Optionally skip the purge program points. In the example above, purge points are still inserted after the break; executes. Now, when the bug reporter is looking for the next statement executed after the function execution is finished, it will ignore the purge program points, so it won't confusingly pick the (void)data; statement.

Full diff: https://github.com/llvm/llvm-project/pull/117863.diff

6 Files Affected:

(modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h (+3-1)
(modified) clang/lib/StaticAnalyzer/Core/BugReporter.cpp (+28-12)
(modified) clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp (+4-1)
(modified) clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp (+7-2)
(modified) clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp (+4-4)
(added) clang/test/Analysis/void-call-exit-modelling.c (+17)

diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h
index 2fb05ac46e8fad..d0a8aa434bb576 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h
@@ -278,7 +278,9 @@ class ExplodedNode : public llvm::FoldingSetNode {
   /// Useful for explaining control flow that follows the current node.
   /// If the statement belongs to a body-farmed definition, retrieve the
   /// call site for that definition.
-  const Stmt *getNextStmtForDiagnostics() const;
+  /// If skipPurge is true, skip the purge-dead-symbols nodes (that are often
+  /// inserted out-of-order by the endinge).
+  const Stmt *getNextStmtForDiagnostics(bool skipPurge) const;
 
   /// Find the statement that was executed immediately before this node.
   /// Useful when the node corresponds to a CFG block entrance.
diff --git a/clang/lib/StaticAnalyzer/Core/BugReporter.cpp b/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
index b67e6cd86c3d60..1f531ffc292511 100644
--- a/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
+++ b/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
@@ -589,7 +589,8 @@ static void removePiecesWithInvalidLocations(PathPieces &Pieces) {
 
 PathDiagnosticLocation PathDiagnosticBuilder::ExecutionContinues(
     const PathDiagnosticConstruct &C) const {
-  if (const Stmt *S = C.getCurrentNode()->getNextStmtForDiagnostics())
+  if (const Stmt *S =
+          C.getCurrentNode()->getNextStmtForDiagnostics(/*skipPurge=*/true))
     return PathDiagnosticLocation(S, getSourceManager(),
                                   C.getCurrLocationContext());
 
@@ -882,7 +883,8 @@ void PathDiagnosticBuilder::generateMinimalDiagForBlockEdge(
 
   case Stmt::GotoStmtClass:
   case Stmt::IndirectGotoStmtClass: {
-    if (const Stmt *S = C.getCurrentNode()->getNextStmtForDiagnostics())
+    if (const Stmt *S = C.getCurrentNode()->getNextStmtForDiagnostics(
+            /*skipPurge=*/false))
       C.getActivePath().push_front(generateDiagForGotoOP(C, S, Start));
     break;
   }
@@ -2416,6 +2418,28 @@ PathSensitiveBugReport::getRanges() const {
   return Ranges;
 }
 
+static bool exitingDestructor(const ExplodedNode *N) {
+  // Need to loop here, as some times the Error node is already outside of the
+  // destructor context, and the previous node is an edge that is also outside.
+  while (N && !N->getLocation().getAs<StmtPoint>()) {
+    N = N->getFirstPred();
+  }
+  return N && isa<CXXDestructorDecl>(N->getLocationContext()->getDecl());
+}
+
+static const Stmt *
+findReasonableStmtCloseToFunctionExit(const ExplodedNode *N) {
+  if (exitingDestructor(N)) {
+    // If we are exiting a destructor call, it is more useful to point to
+    // the next stmt which is usually the temporary declaration.
+    if (const Stmt *S = N->getNextStmtForDiagnostics(/*skipPurge=*/false))
+      return S;
+    // If next stmt is not found, it is likely the end of a top-level
+    // function analysis. find the last execution statement then.
+  }
+  return N->getPreviousStmtForDiagnostics();
+}
+
 PathDiagnosticLocation
 PathSensitiveBugReport::getLocation() const {
   assert(ErrorNode && "Cannot create a location with a null node.");
@@ -2433,18 +2457,10 @@ PathSensitiveBugReport::getLocation() const {
       if (const ReturnStmt *RS = FE->getStmt())
         return PathDiagnosticLocation::createBegin(RS, SM, LC);
 
-      // If we are exiting a destructor call, it is more useful to point to the
-      // next stmt which is usually the temporary declaration.
-      // For non-destructor and non-top-level calls, the next stmt will still
-      // refer to the last executed stmt of the body.
-      S = ErrorNode->getNextStmtForDiagnostics();
-      // If next stmt is not found, it is likely the end of a top-level function
-      // analysis. find the last execution statement then.
-      if (!S)
-        S = ErrorNode->getPreviousStmtForDiagnostics();
+      S = findReasonableStmtCloseToFunctionExit(ErrorNode);
     }
     if (!S)
-      S = ErrorNode->getNextStmtForDiagnostics();
+      S = ErrorNode->getNextStmtForDiagnostics(/*skipPurge=*/false);
   }
 
   if (S) {
diff --git a/clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp b/clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp
index 11cef00ada330b..81dc45b6923524 100644
--- a/clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp
@@ -347,8 +347,11 @@ const Stmt *ExplodedNode::getStmtForDiagnostics() const {
   return nullptr;
 }
 
-const Stmt *ExplodedNode::getNextStmtForDiagnostics() const {
+const Stmt *ExplodedNode::getNextStmtForDiagnostics(bool skipPurge) const {
   for (const ExplodedNode *N = getFirstSucc(); N; N = N->getFirstSucc()) {
+    if (skipPurge && N->getLocation().isPurgeKind()) {
+      continue;
+    }
     if (const Stmt *S = N->getStmtForDiagnostics()) {
       // Check if the statement is '?' or '&&'/'||'.  These are "merges",
       // not actual statement points.
diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp
index 2ca24d0c5ab22b..02facf786830d2 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp
@@ -353,14 +353,19 @@ void ExprEngine::processCallExit(ExplodedNode *CEBNode) {
   ExplodedNodeSet CleanedNodes;
   if (LastSt && Blk && AMgr.options.AnalysisPurgeOpt != PurgeNone) {
     static SimpleProgramPointTag retValBind("ExprEngine", "Bind Return Value");
-    PostStmt Loc(LastSt, calleeCtx, &retValBind);
+    auto Loc = isa<ReturnStmt>(LastSt)
+                   ? ProgramPoint{PostStmt(LastSt, calleeCtx, &retValBind)}
+                   : ProgramPoint{EpsilonPoint(calleeCtx, /*Data1=*/nullptr,
+                                               /*Data2=*/nullptr, &retValBind)};
+    const CFGBlock *PrePurgeBlock =
+        isa<ReturnStmt>(LastSt) ? Blk : &CEBNode->getCFG().getExit();
     bool isNew;
     ExplodedNode *BindedRetNode = G.getNode(Loc, state, false, &isNew);
     BindedRetNode->addPredecessor(CEBNode, G);
     if (!isNew)
       return;
 
-    NodeBuilderContext Ctx(getCoreEngine(), Blk, BindedRetNode);
+    NodeBuilderContext Ctx(getCoreEngine(), PrePurgeBlock, BindedRetNode);
     currBldrCtx = &Ctx;
     // Here, we call the Symbol Reaper with 0 statement and callee location
     // context, telling it to clean up everything in the callee's context
diff --git a/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp b/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp
index 53e72e7c5f4f7c..611e1d8255976c 100644
--- a/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp
+++ b/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp
@@ -163,8 +163,8 @@ class UnguardedFieldThroughMethodTest {
       Volume = 0;
       break;
     case A:
-      Area = 0; // expected-warning {{1 uninitialized field}}
-      break;
+      Area = 0;
+      break; // expected-warning {{1 uninitialized field}}
     }
   }
 
@@ -201,8 +201,8 @@ class UnguardedPublicFieldsTest {
       Volume = 0;
       break;
     case A:
-      Area = 0; // expected-warning {{1 uninitialized field}}
-      break;
+      Area = 0;
+      break; // expected-warning {{1 uninitialized field}}
     }
   }
 
diff --git a/clang/test/Analysis/void-call-exit-modelling.c b/clang/test/Analysis/void-call-exit-modelling.c
new file mode 100644
index 00000000000000..961aac55082e31
--- /dev/null
+++ b/clang/test/Analysis/void-call-exit-modelling.c
@@ -0,0 +1,17 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.Malloc -analyzer-output text -verify %s
+
+typedef __typeof(sizeof(int)) size_t;
+void *malloc(size_t size);
+
+void inf_loop_break_callee() {
+  void* data = malloc(10); // expected-note{{Memory is allocated}}
+  while (1) { // expected-note{{Loop condition is true}}
+    (void)data;
+    break; // No note that we jump to the line above from this break
+  } // expected-note@-1{{Execution jumps to the end of the function}}
+} // expected-warning{{Potential leak of memory pointed to by 'data'}}
+// expected-note@-1  {{Potential leak of memory pointed to by 'data'}}
+
+void inf_loop_break_caller() {
+  inf_loop_break_callee(); // expected-note{{Calling 'inf_loop_break_callee'}}
+}

llvmbot · 2024-11-27T10:00:00Z

@llvm/pr-subscribers-clang

Author: Arseniy Zaostrovnykh (necto)

Changes

The motivating example:

void inf_loop_break_callee() {
  void* data = malloc(10);
  while (1) {
    (void)data; // line 4
    break; // -&gt; execution continues on line 4 ?!!
  }
}
void inf_loop_break_caller() {
  inf_loop_break_callee();
}

To correct the flow steps in this example (see the fixed version in the added test case) I changed two things in the engine:

Make processCallExit create a new StmtPoint only for return statements. If the last non-jump statement is not a return statement, e.g. (void)data;, it is no longer inserted in the exploded graph after the function exit.
Optionally skip the purge program points. In the example above, purge points are still inserted after the break; executes. Now, when the bug reporter is looking for the next statement executed after the function execution is finished, it will ignore the purge program points, so it won't confusingly pick the (void)data; statement.

Full diff: https://github.com/llvm/llvm-project/pull/117863.diff

6 Files Affected:

(modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h (+3-1)
(modified) clang/lib/StaticAnalyzer/Core/BugReporter.cpp (+28-12)
(modified) clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp (+4-1)
(modified) clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp (+7-2)
(modified) clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp (+4-4)
(added) clang/test/Analysis/void-call-exit-modelling.c (+17)

diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h
index 2fb05ac46e8fad..d0a8aa434bb576 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h
@@ -278,7 +278,9 @@ class ExplodedNode : public llvm::FoldingSetNode {
   /// Useful for explaining control flow that follows the current node.
   /// If the statement belongs to a body-farmed definition, retrieve the
   /// call site for that definition.
-  const Stmt *getNextStmtForDiagnostics() const;
+  /// If skipPurge is true, skip the purge-dead-symbols nodes (that are often
+  /// inserted out-of-order by the endinge).
+  const Stmt *getNextStmtForDiagnostics(bool skipPurge) const;
 
   /// Find the statement that was executed immediately before this node.
   /// Useful when the node corresponds to a CFG block entrance.
diff --git a/clang/lib/StaticAnalyzer/Core/BugReporter.cpp b/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
index b67e6cd86c3d60..1f531ffc292511 100644
--- a/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
+++ b/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
@@ -589,7 +589,8 @@ static void removePiecesWithInvalidLocations(PathPieces &Pieces) {
 
 PathDiagnosticLocation PathDiagnosticBuilder::ExecutionContinues(
     const PathDiagnosticConstruct &C) const {
-  if (const Stmt *S = C.getCurrentNode()->getNextStmtForDiagnostics())
+  if (const Stmt *S =
+          C.getCurrentNode()->getNextStmtForDiagnostics(/*skipPurge=*/true))
     return PathDiagnosticLocation(S, getSourceManager(),
                                   C.getCurrLocationContext());
 
@@ -882,7 +883,8 @@ void PathDiagnosticBuilder::generateMinimalDiagForBlockEdge(
 
   case Stmt::GotoStmtClass:
   case Stmt::IndirectGotoStmtClass: {
-    if (const Stmt *S = C.getCurrentNode()->getNextStmtForDiagnostics())
+    if (const Stmt *S = C.getCurrentNode()->getNextStmtForDiagnostics(
+            /*skipPurge=*/false))
       C.getActivePath().push_front(generateDiagForGotoOP(C, S, Start));
     break;
   }
@@ -2416,6 +2418,28 @@ PathSensitiveBugReport::getRanges() const {
   return Ranges;
 }
 
+static bool exitingDestructor(const ExplodedNode *N) {
+  // Need to loop here, as some times the Error node is already outside of the
+  // destructor context, and the previous node is an edge that is also outside.
+  while (N && !N->getLocation().getAs<StmtPoint>()) {
+    N = N->getFirstPred();
+  }
+  return N && isa<CXXDestructorDecl>(N->getLocationContext()->getDecl());
+}
+
+static const Stmt *
+findReasonableStmtCloseToFunctionExit(const ExplodedNode *N) {
+  if (exitingDestructor(N)) {
+    // If we are exiting a destructor call, it is more useful to point to
+    // the next stmt which is usually the temporary declaration.
+    if (const Stmt *S = N->getNextStmtForDiagnostics(/*skipPurge=*/false))
+      return S;
+    // If next stmt is not found, it is likely the end of a top-level
+    // function analysis. find the last execution statement then.
+  }
+  return N->getPreviousStmtForDiagnostics();
+}
+
 PathDiagnosticLocation
 PathSensitiveBugReport::getLocation() const {
   assert(ErrorNode && "Cannot create a location with a null node.");
@@ -2433,18 +2457,10 @@ PathSensitiveBugReport::getLocation() const {
       if (const ReturnStmt *RS = FE->getStmt())
         return PathDiagnosticLocation::createBegin(RS, SM, LC);
 
-      // If we are exiting a destructor call, it is more useful to point to the
-      // next stmt which is usually the temporary declaration.
-      // For non-destructor and non-top-level calls, the next stmt will still
-      // refer to the last executed stmt of the body.
-      S = ErrorNode->getNextStmtForDiagnostics();
-      // If next stmt is not found, it is likely the end of a top-level function
-      // analysis. find the last execution statement then.
-      if (!S)
-        S = ErrorNode->getPreviousStmtForDiagnostics();
+      S = findReasonableStmtCloseToFunctionExit(ErrorNode);
     }
     if (!S)
-      S = ErrorNode->getNextStmtForDiagnostics();
+      S = ErrorNode->getNextStmtForDiagnostics(/*skipPurge=*/false);
   }
 
   if (S) {
diff --git a/clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp b/clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp
index 11cef00ada330b..81dc45b6923524 100644
--- a/clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp
@@ -347,8 +347,11 @@ const Stmt *ExplodedNode::getStmtForDiagnostics() const {
   return nullptr;
 }
 
-const Stmt *ExplodedNode::getNextStmtForDiagnostics() const {
+const Stmt *ExplodedNode::getNextStmtForDiagnostics(bool skipPurge) const {
   for (const ExplodedNode *N = getFirstSucc(); N; N = N->getFirstSucc()) {
+    if (skipPurge && N->getLocation().isPurgeKind()) {
+      continue;
+    }
     if (const Stmt *S = N->getStmtForDiagnostics()) {
       // Check if the statement is '?' or '&&'/'||'.  These are "merges",
       // not actual statement points.
diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp
index 2ca24d0c5ab22b..02facf786830d2 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp
@@ -353,14 +353,19 @@ void ExprEngine::processCallExit(ExplodedNode *CEBNode) {
   ExplodedNodeSet CleanedNodes;
   if (LastSt && Blk && AMgr.options.AnalysisPurgeOpt != PurgeNone) {
     static SimpleProgramPointTag retValBind("ExprEngine", "Bind Return Value");
-    PostStmt Loc(LastSt, calleeCtx, &retValBind);
+    auto Loc = isa<ReturnStmt>(LastSt)
+                   ? ProgramPoint{PostStmt(LastSt, calleeCtx, &retValBind)}
+                   : ProgramPoint{EpsilonPoint(calleeCtx, /*Data1=*/nullptr,
+                                               /*Data2=*/nullptr, &retValBind)};
+    const CFGBlock *PrePurgeBlock =
+        isa<ReturnStmt>(LastSt) ? Blk : &CEBNode->getCFG().getExit();
     bool isNew;
     ExplodedNode *BindedRetNode = G.getNode(Loc, state, false, &isNew);
     BindedRetNode->addPredecessor(CEBNode, G);
     if (!isNew)
       return;
 
-    NodeBuilderContext Ctx(getCoreEngine(), Blk, BindedRetNode);
+    NodeBuilderContext Ctx(getCoreEngine(), PrePurgeBlock, BindedRetNode);
     currBldrCtx = &Ctx;
     // Here, we call the Symbol Reaper with 0 statement and callee location
     // context, telling it to clean up everything in the callee's context
diff --git a/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp b/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp
index 53e72e7c5f4f7c..611e1d8255976c 100644
--- a/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp
+++ b/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp
@@ -163,8 +163,8 @@ class UnguardedFieldThroughMethodTest {
       Volume = 0;
       break;
     case A:
-      Area = 0; // expected-warning {{1 uninitialized field}}
-      break;
+      Area = 0;
+      break; // expected-warning {{1 uninitialized field}}
     }
   }
 
@@ -201,8 +201,8 @@ class UnguardedPublicFieldsTest {
       Volume = 0;
       break;
     case A:
-      Area = 0; // expected-warning {{1 uninitialized field}}
-      break;
+      Area = 0;
+      break; // expected-warning {{1 uninitialized field}}
     }
   }
 
diff --git a/clang/test/Analysis/void-call-exit-modelling.c b/clang/test/Analysis/void-call-exit-modelling.c
new file mode 100644
index 00000000000000..961aac55082e31
--- /dev/null
+++ b/clang/test/Analysis/void-call-exit-modelling.c
@@ -0,0 +1,17 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.Malloc -analyzer-output text -verify %s
+
+typedef __typeof(sizeof(int)) size_t;
+void *malloc(size_t size);
+
+void inf_loop_break_callee() {
+  void* data = malloc(10); // expected-note{{Memory is allocated}}
+  while (1) { // expected-note{{Loop condition is true}}
+    (void)data;
+    break; // No note that we jump to the line above from this break
+  } // expected-note@-1{{Execution jumps to the end of the function}}
+} // expected-warning{{Potential leak of memory pointed to by 'data'}}
+// expected-note@-1  {{Potential leak of memory pointed to by 'data'}}
+
+void inf_loop_break_caller() {
+  inf_loop_break_callee(); // expected-note{{Calling 'inf_loop_break_callee'}}
+}

steakhal · 2024-11-27T10:16:07Z

I already reviewed this change downstream.

NagyDonat

I didn't systematically analyze all the small details of this commit, but overall it looks good to me.

Xazax-hun

Great fix! I have one question inline.

Xazax-hun · 2024-11-27T11:22:20Z

clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h

-  const Stmt *getNextStmtForDiagnostics() const;
+  /// If skipPurge is true, skip the purge-dead-symbols nodes (that are often
+  /// inserted out-of-order by the endinge).
+  const Stmt *getNextStmtForDiagnostics(bool skipPurge) const;


What would happen if we always skipped purge-dead-symbol nodes? I wonder if we in general want to tie diagnostics to constructs written by the users instead of artificial constructs that we inserted as implementation details of the analysis.

Our policy was so far to minimize the impact to maximize the chance of approval on upstream review.
I think what you say makes sense.

I cannot speak generally, but it breaks on more test case: clang/test/Analysis/copy-elision.cpp.
In short for the following code, it moves the report further down the execution:

// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -std=c++11 \ // RUN: -analyzer-config elide-constructors=false -DNO_ELIDE_FLAG \ // RUN: -analyzer-config eagerly-assume=false -verify=expected,no-elide %s void clang_analyzer_eval(bool); void clang_analyzer_dump(int); template <typename T> struct AddressVector { T *buf[20]; int len; AddressVector() : len(0) {} void push(T *t) { buf[len] = t; ++len; } }; class ClassWithDestructor { AddressVector<ClassWithDestructor> &v; public: ClassWithDestructor(AddressVector<ClassWithDestructor> &v) : v(v) { push(); } ClassWithDestructor(ClassWithDestructor &&c) : v(c.v) { push(); } ClassWithDestructor(const ClassWithDestructor &c) : v(c.v) { push(); } ~ClassWithDestructor() { push(); } void push() { v.push(this); } }; struct TestCtorInitializer { ClassWithDestructor c; TestCtorInitializer(AddressVector<ClassWithDestructor> &v) : c(ClassWithDestructor(v)) {} // currently: <- address of stack var leaks via 'v' }; void testCtorInitializer() { AddressVector<ClassWithDestructor> v; { TestCtorInitializer t(v); // skipping-purge: <- address of stack var leaks via 'v' } }

I.e., it ends up skipping one more stack frame.
I don't think it is a large difference, although I slightly prefer the current report to the one resulting if we skip purge always. Moreover, I wanted to reduce the test footprint, so I minimized the behavior change.
Do you think it is better simplify the logic and skip purge always?

I find it hard to justify why would we ever not skip those nodes, and if there is a fallout we need to deal with it independently. I would prefer us to do the right thing and always skip the purge nodes (unless we discover why is it not a good idea, in that case we can document our findings in the form of comments). Having logic in the code that we do not fully the purpose of is technical debt. So I'd prefer minimizing that.

Here you go: 674befc skipPurge unconditionally

clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp

Xazax-hun

LGTM!

llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Nov 27, 2024

steakhal requested review from Xazax-hun and NagyDonat November 27, 2024 10:15

Test top-level function effect. Thanks, Balázs for this example

eb1b47c

NagyDonat approved these changes Nov 27, 2024

View reviewed changes

Xazax-hun reviewed Nov 27, 2024

View reviewed changes

[NFC] Remove curly braces around a single-statement block

4555ae3

necto requested a review from Xazax-hun November 27, 2024 12:18

necto added 2 commits November 27, 2024 13:39

skipPurge unconditionally

674befc

[NFC] fix format

525c6ab

Xazax-hun approved these changes Nov 27, 2024

View reviewed changes

steakhal approved these changes Nov 27, 2024

View reviewed changes

steakhal merged commit dddeec4 into llvm:main Nov 27, 2024
6 of 8 checks passed

necto deleted the az/void-return-model branch November 27, 2024 13:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[analyzer] Avoid out-of-order node traversal on void return #117863

[analyzer] Avoid out-of-order node traversal on void return #117863

necto commented Nov 27, 2024 •

edited

Loading

llvmbot commented Nov 27, 2024

llvmbot commented Nov 27, 2024

steakhal commented Nov 27, 2024

NagyDonat left a comment

Xazax-hun left a comment

Xazax-hun Nov 27, 2024

steakhal Nov 27, 2024 •

edited

Loading

necto Nov 27, 2024

Xazax-hun Nov 27, 2024 •

edited

Loading

necto Nov 27, 2024

Xazax-hun left a comment

[analyzer] Avoid out-of-order node traversal on void return #117863

[analyzer] Avoid out-of-order node traversal on void return #117863

Conversation

necto commented Nov 27, 2024 • edited Loading

llvmbot commented Nov 27, 2024

llvmbot commented Nov 27, 2024

steakhal commented Nov 27, 2024

NagyDonat left a comment

Choose a reason for hiding this comment

Xazax-hun left a comment

Choose a reason for hiding this comment

Xazax-hun Nov 27, 2024

Choose a reason for hiding this comment

steakhal Nov 27, 2024 • edited Loading

Choose a reason for hiding this comment

necto Nov 27, 2024

Choose a reason for hiding this comment

Xazax-hun Nov 27, 2024 • edited Loading

Choose a reason for hiding this comment

necto Nov 27, 2024

Choose a reason for hiding this comment

Xazax-hun left a comment

Choose a reason for hiding this comment

necto commented Nov 27, 2024 •

edited

Loading

steakhal Nov 27, 2024 •

edited

Loading

Xazax-hun Nov 27, 2024 •

edited

Loading