[RFC] Apply the trait NoMemoryEffect to most ReadOnly ops
#3891
+1,226 −714
In my understanding, the trait `ReadOnly` implies that the memory for input values does not get modified by the operation. This is similar to `NoMemoryEffect`, with a few exceptions. Ops like `prim.RaiseException` and `prim.Print` are `ReadOnly`, but have effects like terminating the program and printing, respectively.

There may be other `ReadOnly` ops with side effects. As a hypothetical example, there might be a comparison op that checks if two tensors are the same type, prints some debug info, and also returns a boolean value. In the event that such an op is discovered, I've added a keyword argument to the `emit_op` function to explicitly label an operation as `has_memory_effects=True` to avoid adding the `NoMemoryEffect` trait to the TableGen definition.
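To make that opt-out concrete, here is a minimal, self-contained sketch of the intended behavior. It is not the actual `torch_ods_gen.py` code; everything except the `has_memory_effects` keyword name is invented for illustration.

```python
# Sketch only: models how a has_memory_effects keyword could gate the
# NoMemoryEffect trait during op emission. The real emit_op in torch-mlir's
# torch_ods_gen.py writes full TableGen definitions; this only computes traits.
from typing import List, Optional


def emit_traits_sketch(
    op_name: str,
    is_read_only: bool,
    traits: Optional[List[str]] = None,
    has_memory_effects: bool = False,
) -> List[str]:
    """Return the trait list that would be emitted for an op (illustrative)."""
    traits = list(traits or [])
    # ReadOnly ops are treated as side-effect free unless explicitly opted out.
    if is_read_only and not has_memory_effects:
        traits.append("NoMemoryEffect")
    return traits


# A plain ReadOnly op picks up NoMemoryEffect, so CSE/DCE can act on it:
print(emit_traits_sketch("aten.size.int", is_read_only=True))
# prim.Print is ReadOnly but observable, so it opts out of the trait:
print(emit_traits_sketch("prim.Print", is_read_only=True, has_memory_effects=True))
```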
My primary motivation for this change is to remove the need for one-off `RemoveUnused` patterns for ops that we want to remove when dead, but adding this trait also has the very appealing effect of dramatically simplifying torch-IR for some models. Specifically for ONNX models, it is not uncommon to have patterns that repeatedly call `onnx.Shape` on the exact same tensor, only to extract a different element from the shape. That redundant IR doesn't get CSE'd until converting to a core MLIR dialect like linalg/tensor/arith.
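As an illustration of where that pattern tends to come from (a hypothetical example, not taken from any specific model): PyTorch source code that reads several entries of the same tensor's shape often exports to ONNX with a separate shape query per access, all reading the same tensor.

```python
# Hypothetical PyTorch module (invented for illustration) of the kind of source
# code that commonly yields repeated onnx.Shape reads of the same tensor when
# exported to ONNX with dynamic shapes.
import torch


class ShapeHeavy(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b = x.shape[0]  # each access typically becomes its own shape query
        c = x.shape[1]  # ... on the same input tensor
        h = x.shape[2]
        return x.reshape(b, c * h)
```

With `NoMemoryEffect` on the corresponding torch ops, CSE can fold those duplicate shape queries at the torch-IR level instead of waiting for the lowering to linalg/tensor/arith.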