Remove pool2d MLRoundingType - Simplify the operand layout support of conv2d and pooling 2d operations #770

Open
wants to merge 8 commits into
base: main
Changes from 3 commits
52 changes: 13 additions & 39 deletions index.bs
@@ -5347,18 +5347,13 @@ partial dictionary MLOpSupportLimits {
### Pooling operations ### {#api-mlgraphbuilder-pool2d}
Compute a pooling operation across all the elements within the moving window over the input tensor.
<script type=idl>
enum MLRoundingType {
"floor",
"ceil"
};

dictionary MLPool2dOptions : MLOperatorOptions {
sequence<[EnforceRange] unsigned long> windowDimensions;
sequence<[EnforceRange] unsigned long> padding;
sequence<[EnforceRange] unsigned long> strides;
sequence<[EnforceRange] unsigned long> dilations;
MLInputOperandLayout layout = "nchw";
MLRoundingType roundingType = "floor";
sequence<[EnforceRange] unsigned long> outputSizes;
};

@@ -5410,16 +5405,16 @@ partial dictionary MLOpSupportLimits {
- input tensor: *[batches, height, width, inputChannels]*
- output tensor: *[batches, height, width, outputChannels]*

: <dfn>roundingType</dfn>
::
The rounding function used to compute the output shape.

: <dfn>outputSizes</dfn>
::
A list of length 2.
Specifies the sizes of the two spacial dimensions of the output tensor. When the output sizes are explicitly specified, the {{MLPool2dOptions/roundingType}} is ignored.
A list of length 2: *[outputHeight, outputWidth]*
Specifies the sizes of the two spatial dimensions of the output tensor.

If not specified, the output sizes are automatically computed.
The spatial dimensions of the output tensor can be calculated as follows:

*output size = ((input size - filter size + beginning padding + ending padding) / stride) + 1*

Then the caller either applies a floor or ceiling depending on whether partial window results are desired.
</dl>

<div dfn-for="MLGraphBuilder/averagePool2d(input, options), MLGraphBuilder/l2Pool2d(input, options), MLGraphBuilder/maxPool2d(input, options)" dfn-type=argument>
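With `roundingType` removed, the caller computes `outputSizes` and picks the rounding itself. The following is a minimal sketch of that calculation for one spatial dimension, based on the formula quoted in the hunk above. It assumes a dilation of 1, since the quoted formula has no dilation term, and `poolOutputSize` with its parameter names is a hypothetical helper, not part of the WebNN API.

```javascript
// Hypothetical helper (not part of WebNN): computes one spatial output size
// from the formula in the spec text, with the caller choosing the rounding.
// Assumes dilation of 1, since the quoted formula has no dilation term.
function poolOutputSize(inputSize, windowSize, beginPad, endPad, stride, round = Math.floor) {
  return round((inputSize - windowSize + beginPad + endPad) / stride) + 1;
}

// 7-wide input, 2-wide window, no padding, stride 2:
const floorSize = poolOutputSize(7, 2, 0, 0, 2, Math.floor); // 3: drops the partial window
const ceilSize = poolOutputSize(7, 2, 0, 0, 2, Math.ceil);   // 4: keeps the partial window
```

The height and width computed this way would then be passed together as `outputSizes: [outputHeight, outputWidth]`.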
@@ -5430,13 +5425,8 @@ partial dictionary MLOpSupportLimits {

**Returns:** an {{MLOperand}}. The output 4-D tensor that contains the
result of the reduction. The logical shape is interpreted according to the
value of *layout*. More specifically, if the *options.roundingType* is {{MLRoundingType/"floor"}}, the spatial dimensions of the output tensor can be calculated as follows:

`output size = floor(1 + (input size - filter size + beginning padding + ending padding) / stride)`

or if *options.roundingType* is {{MLRoundingType/"ceil"}}:

`output size = ceil(1 + (input size - filter size + beginning padding + ending padding) / stride)`
value of *layout*, taking the batch and channel count from the input with
the spatial sizes from *outputSizes*.
</div>

{{MLOpSupportLimits}} has following members for pooling operations:
@@ -5459,7 +5449,7 @@ partial dictionary MLOpSupportLimits {

<details open algorithm>
<summary>
To <dfn for=MLGraphBuilder>calculate pool2d output sizes</dfn> given {{MLInputOperandLayout}} |layout|, [=/list=] of 4 unsigned integers |inputShape|, {{MLRoundingType}} |roundingType|, [=/list=] of 2 unsigned integers |windowDimensions|, [=/list=] of 4 unsigned integers |padding|, [=/list=] of 2 unsigned integers |strides|, [=/list=] of 2 unsigned integers |dilations|, and optional [=/list=] of 2 unsigned integers |outputSizes|, perform these steps. They return a [=/list=] of 4 unsigned integers.
To <dfn for=MLGraphBuilder>calculate pool2d output sizes</dfn> given {{MLInputOperandLayout}} |layout|, [=/list=] of 4 unsigned integers |inputShape|, [=/list=] of 2 unsigned integers |windowDimensions|, [=/list=] of 4 unsigned integers |padding|, [=/list=] of 2 unsigned integers |strides|, [=/list=] of 2 unsigned integers |dilations|, and optional [=/list=] of 2 unsigned integers |outputSizes|, perform these steps. They return a [=/list=] of 4 unsigned integers.
fdwr marked this conversation as resolved.
</summary>
1. Switch on |layout|:
<dl class=switch>
@@ -5476,24 +5466,8 @@ partial dictionary MLOpSupportLimits {
1. Let |inputWidth| be |inputShape|[2].
1. Let |channels| be |inputShape|[3].
</dl>
1. If |outputSizes| is not given, then:
Contributor

outputSizes is optional, so we should keep this check. I think the existing code has an issue: shouldn't it use outputSizes if it is given?

Collaborator Author

@fdwr fdwr Oct 24, 2024

Yeah, the previous if looked like a typo (so that should be fixed in any case).

Well, with this change, outputSizes would be required (the caller is explicit now). Do you think this simplification would be problematic/burdensome? 🤔

Contributor

@huningxin huningxin Oct 24, 2024

Then outputSizes is not an option anymore; should it be moved to a separate parameter? For example:

 MLOperand averagePool2d(MLOperand input, sequence<[EnforceRange] unsigned long> outputSizes, optional MLPool2dOptions options = {});

Contributor

https://w3ctag.github.io/design-principles/#prefer-dictionaries suggests:

You should also consider accepting mandatory parameters through a dictionary, if it would make the API more readable, especially when they are of primitive types.

...but it also says:

The dictionary itself should be an optional argument, so that if the author is happy with all of the default options, they can avoid passing an extra argument.

I'd err on the side of putting required arguments into the dictionary - and making the dictionary itself required - because:

  • It's more readable, since primitive types are now described (as the Design Principles suggest)
    • averagePool2d(input, {outputSizes: [2, 3]}) > averagePool2d(input, [2, 3])
  • It's more readable, since it's otherwise not obvious to the caller why some configuration options belong in the options dictionary while others don't
    • averagePool2d(input, {outputSizes: [2, 3], layout: "nchw"}) > averagePool2d(input, [2, 3], {layout: "nchw"})
  • It's more extensible, since the options dict can be changed without changing the method signature (though making a non-required dictionary field required would still be a breaking change)

Contributor

I'd err on the side of putting required arguments into the dictionary - and making the dictionary itself required

SGTM. Thanks for the explanation, @a-sully !

Collaborator Author

@fdwr fdwr Nov 14, 2024

I'd err on the side of putting required arguments into the dictionary - and making the dictionary itself required

Fine with me. Will do.

and making the dictionary itself required

Hmm, the following doesn't work (Bikeshed reports an error, as "required" can be used on dictionary members but evidently not on parameters):

  MLOperand averagePool2d(MLOperand input, required MLPool2dOptions options);

So I presume you meant to just remove the "= {}" instead?

  MLOperand averagePool2d(MLOperand input, MLPool2dOptions options);
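
For illustration, here is a rough sketch of how a required outputSizes member would behave from the caller's side, under the proposal discussed in this thread. The function `requireOutputSizes` is hypothetical and only emulates the check in plain JavaScript. Real WebIDL bindings would reject the call before the method body runs, and the error messages here are illustrative, not taken from any implementation.

```javascript
// Hypothetical emulation of a required `outputSizes` dictionary member.
// Real WebIDL bindings would reject the call before the method body runs;
// the error messages here are illustrative only.
function requireOutputSizes(options) {
  if (options == null || options.outputSizes === undefined) {
    throw new TypeError("MLPool2dOptions: required member outputSizes is missing");
  }
  const outputSizes = Array.from(options.outputSizes);
  // The spec text describes outputSizes as a list of length 2.
  if (outputSizes.length !== 2) {
    throw new TypeError("outputSizes must be a list of length 2");
  }
  return outputSizes;
}

// All configuration, required or not, travels in one named dictionary.
const sizes = requireOutputSizes({ outputSizes: [2, 3], layout: "nchw" });
```

Under this shape, omitting the dictionary or the member fails with a TypeError instead of silently falling back to a computed size.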

1. Let |outputHeight| be |outputSizes|[0].
1. Let |outputWidth| be |outputSizes|[1].
1. Otherwise:
1. Let |outputSizes| be the result of [=MLGraphBuilder/calculating conv2d output sizes=] given |inputHeight|, |inputWidth|, |windowDimensions|[0], |windowDimensions|[1], |padding|, |strides|, and |dilations|.
1. Let |outputHeight| be |outputSizes|[0].
1. Let |outputWidth| be |outputSizes|[1].
1. Switch on |roundingType|:
huningxin marked this conversation as resolved.
<dl class=switch>
: {{MLRoundingType/"floor"}}
::
1. Set |outputWidth| to floor(|outputWidth|).
1. Set |outputHeight| to floor(|outputHeight|).
: {{MLRoundingType/"ceil"}}
::
1. Set |outputWidth| to ceiling(|outputWidth|).
1. Set |outputHeight| to ceiling(|outputHeight|).
</dl>
1. Let |outputHeight| be |outputSizes|[0].
fdwr marked this conversation as resolved.
1. Let |outputWidth| be |outputSizes|[1].
1. Switch on |layout|:
<dl class=switch>
: {{MLInputOperandLayout/"nchw"}}
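The simplified steps in the hunk above can be sketched roughly as follows, assuming outputSizes is always supplied by the caller as this PR proposes. The function name `pool2dOutputShape` is hypothetical, and the remaining parameters of the spec algorithm (window dimensions, padding, strides, dilations) are omitted because they no longer affect the returned shape in this version.

```javascript
// Sketch of the simplified "calculate pool2d output sizes" steps, assuming
// outputSizes is always supplied by the caller. The function name is hypothetical.
function pool2dOutputShape(layout, inputShape, outputSizes) {
  const [outputHeight, outputWidth] = outputSizes;
  switch (layout) {
    case "nchw": {
      // Input is [batches, channels, height, width].
      const [batches, channels] = inputShape;
      return [batches, channels, outputHeight, outputWidth];
    }
    case "nhwc": {
      // Input is [batches, height, width, channels].
      const batches = inputShape[0];
      const channels = inputShape[3];
      return [batches, outputHeight, outputWidth, channels];
    }
    default:
      throw new TypeError(`Unknown layout: ${layout}`);
  }
}

const nchwShape = pool2dOutputShape("nchw", [1, 3, 8, 8], [4, 4]);
const nhwcShape = pool2dOutputShape("nhwc", [1, 8, 8, 3], [4, 4]);
```

Batch and channel counts come from the input, and only the two spatial dimensions come from outputSizes.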
@@ -5526,7 +5500,7 @@ partial dictionary MLOpSupportLimits {
1. If |options|.{{MLPool2dOptions/dilations}}'s [=list/size=] is not 2, then [=exception/throw=] a {{TypeError}}.
1. If any value in |options|.{{MLPool2dOptions/dilations}} is not greater than 0, then [=exception/throw=] a {{TypeError}}.
1. Let |desc| be a copy of |input|.{{MLOperand/[[descriptor]]}}.
1. Let |outputShape| be the result of [=MLGraphBuilder/calculating pool2d output sizes=] given |options|.{{MLPool2dOptions/layout}}, |input|'s [=MLOperand/shape=], |options|.{{MLPool2dOptions/roundingType}}, |options|.{{MLPool2dOptions/windowDimensions}}, |options|.{{MLPool2dOptions/padding}}, |options|.{{MLPool2dOptions/strides}}, |options|.{{MLPool2dOptions/dilations}}, and |options|.{{MLPool2dOptions/outputSizes}} (if it [=map/exists=]).
1. Let |outputShape| be the result of [=MLGraphBuilder/calculating pool2d output sizes=] given |options|.{{MLPool2dOptions/layout}}, |input|'s [=MLOperand/shape=], |options|.{{MLPool2dOptions/windowDimensions}}, |options|.{{MLPool2dOptions/padding}}, |options|.{{MLPool2dOptions/strides}}, |options|.{{MLPool2dOptions/dilations}}, and |options|.{{MLPool2dOptions/outputSizes}} (if it [=map/exists=]).

The outputSizes validation above in 13.2 doesn't appear to be correct--shouldn't it be similar to this logic?

1. If any [=list/item=] in |outputShape| is not a [=valid dimension=], then [=exception/throw=] a {{TypeError}}.
1. Set |desc|.{{MLOperandDescriptor/shape}} to |outputShape|.
1. *Make graph connections:*