rfcs: graph api: support swish operation #2156
Conversation
Thanks for the proposal.
For eltwise support, I would actually advocate using a composition of simpler ops (option 1) over introducing a new op for each new eltwise (option 2).
There are a few reasons for that:
- Framework users often write eltwise operations in their scripts as a composition of smaller ops (most likely because they started using those eltwise ops before the framework supported them). As an example, just search for swish in the HuggingFace GitHub.
- There are often "equivalent" formulas used in the wild. Think gelu_erf/gelu_tanh, or hardswish(x) = x * hardsigmoid(x), or bounded_relu(x, alpha) = clip(x, 0, alpha), and so on. It would be more scalable IMHO to match those patterns internally than to expose a new op for each flavor.
- As you mention, API simplicity: exposing a dedicated op would still make it possible for users to pass the composite op as in option 1. We would likely have to support both in the library.
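For illustration, a minimal sketch of option 1 as an end user might express it with the dnnl::graph C++ API (logical tensor ids, shapes, and op names below are made up for illustration, not taken from the RFC):

```cpp
// Sketch of option 1: swish(x) = x * sigmoid(x) expressed as a
// composition of existing Graph API ops (Sigmoid + Multiply).
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

graph build_swish_as_composition() {
    using dt = logical_tensor::data_type;
    using lt = logical_tensor::layout_type;
    const std::vector<int64_t> shape = {8, 1024}; // illustrative shape

    logical_tensor src {0, dt::f32, shape, lt::strided};
    logical_tensor sig_out {1, dt::f32, shape, lt::strided};
    logical_tensor dst {2, dt::f32, shape, lt::strided};

    op sigmoid {0, op::kind::Sigmoid, {src}, {sig_out}, "sigmoid"};
    op multiply {1, op::kind::Multiply, {src, sig_out}, {dst}, "multiply"};

    graph g {engine::kind::cpu};
    g.add_op(sigmoid);
    g.add_op(multiply);
    g.finalize();
    // The library can recognize this Sigmoid + Multiply pattern during
    // partitioning and dispatch it to a fused swish kernel internally.
    return g;
}
```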
- Operation Kind: `Swish` (C++), `dnnl_graph_op_swish` (C).
- Input/output: Single input, single output.
- Attribute: `beta` (optional). `beta = 1.f` if not provided.
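For comparison, a hypothetical sketch of option 2, assuming the proposed `Swish` op kind and optional `beta` attribute are added to the API as described above (the op kind does not exist yet; ids and shapes are illustrative):

```cpp
// Hypothetical option 2: a dedicated Swish op with an optional beta attribute.
using dt = logical_tensor::data_type;
using lt = logical_tensor::layout_type;

logical_tensor src {0, dt::f32, {8, 1024}, lt::strided};
logical_tensor dst {1, dt::f32, {8, 1024}, lt::strided};

op swish {0, op::kind::Swish, {src}, {dst}, "swish"}; // proposed op kind
swish.set_attr<float>(op::attr::beta, 1.f);           // optional; 1.f if not set

graph g {engine::kind::cpu};
g.add_op(swish);
g.finalize();
```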
(nit) why beta and not alpha?
Thank you for the review, @mgouicem. I didn't think too much about the naming here; it just followed the naming in OpenVINO and cuDNN. alpha is also fine with me if we think it's better to align with the eltwise primitive.
@mgouicem Thanks for the review. I have incorporated the comments into the RFC and left the conclusion section open for discussion.
Thanks. I should have mentioned that we already have this composition for swish in our code (link). It indeed worked for some requests from frameworks. But we also see some issues as I mentioned in the cons of option 1.
Thanks for the link. I copied the link to the RFC also. :)
Yes, this is really a good point. Using a composition of smaller operations will give us the flexibility to support more variants without breaking or adding API. I also added this to the proposals.
operations in oneDNN Graph is troublesome for some integrations.
- Currently, oneDNN Graph Sigmoid operation does not support a multiplication `factor`. We may need to extend either the proposed Swish graph or the Sigmoid operation to support cases where `factor != 1.f`.
Since PyTorch's SiLU operation only supports factor = 1.f, it can simply be decomposed into Sigmoid + Multiply. When we dispatch this pattern to the oneDNN swish kernel, we just set alpha = 1.f. It works perfectly.
But if we want to support factor != 1.f for other cases, the operation will be decomposed into Multiply + Sigmoid + Multiply, with the first Multiply computing factor * x. To dispatch this pattern to the oneDNN swish kernel, we have more things to consider:
- We need to check that the factor input of the first Multiply is a scalar.
- The value of the scalar should be constant across iterations.
- The value of the scalar should be known at the compilation stage, as it is required to create primitive descriptors. But for now, the compile() API only accepts the logical tensors of input tensors, not their values.
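For clarity, a sketch of the three-op pattern described above, assuming the factor enters the graph as a scalar input tensor (ids, shapes, and op names are illustrative):

```cpp
// Decomposition for factor != 1.f: swish(x, factor) = x * sigmoid(factor * x)
using dt = logical_tensor::data_type;
using lt = logical_tensor::layout_type;

logical_tensor x      {0, dt::f32, {8, 1024}, lt::strided};
logical_tensor factor {1, dt::f32, {1}, lt::strided}; // must be a scalar
logical_tensor scaled {2, dt::f32, {8, 1024}, lt::strided};
logical_tensor sig    {3, dt::f32, {8, 1024}, lt::strided};
logical_tensor out    {4, dt::f32, {8, 1024}, lt::strided};

op mul0 {0, op::kind::Multiply, {x, factor}, {scaled}, "factor_mul"};
op sigm {1, op::kind::Sigmoid,  {scaled},    {sig},    "sigmoid"};
op mul1 {2, op::kind::Multiply, {x, sig},    {out},    "output_mul"};
// Matching this to the oneDNN swish kernel requires knowing the scalar's
// value at compile() time, which the current API does not provide.
```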
That is correct. For that case, you would need an extra alpha attribute on Sigmoid.
Thanks! Or, can we simply rely on binary + eltwise post-op + binary post-op for the cases where we don't know the factor at compilation time?
That is an option as well. Though I would expect that for the specific case of swish, the alpha parameter is constant, and using the swish eltwise would be faster.
The other thing is GPU support: alpha might not be a memory object on GPU.
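For reference, a sketch of the fast path mentioned above, assuming the oneDNN 3.x primitive API: when the factor is a constant known up front, the pattern can be lowered to a single eltwise primitive with `algorithm::eltwise_swish` and `alpha` set to the factor (shapes are illustrative):

```cpp
#include "oneapi/dnnl/dnnl.hpp"
using namespace dnnl;

engine eng {engine::kind::cpu, 0};
memory::desc md {{8, 1024}, memory::data_type::f32, memory::format_tag::ab};

const float factor = 1.f; // constant factor, known when creating the primitive
auto pd = eltwise_forward::primitive_desc(
        eng, prop_kind::forward_inference, algorithm::eltwise_swish,
        md, md, /*alpha=*/factor, /*beta=*/0.f);
auto swish = eltwise_forward(pd); // computes dst = src * sigmoid(alpha * src)
```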
@mgouicem Could you please take another look at this? The conclusion part has been updated in the last commit (I will squash them once ready for merge). Thank you!
LGTM, thanks!
This RFC proposes adding support for a Swish operation in the Graph API.
Rendered version: link.