Type bounds for type Param and related types in traits #350

itrumper · 2023-05-06T13:26:47Z

Working with argmin has been a pleasure. I really appreciate the work that has gone into implementing these algorithms in Rust! Thank you for your hard work.

My question concerns the use of custom types with argmin: I don't understand how implementing traits like CostFunction and Gradient with my own types for their associated types results in a valid optimization problem. I am able to define

type Param = MyStruct;
type Output = SomeOtherStruct;

and this satisfies the CostFunction trait. That all seems reasonable. When I initialize my Executor the initial state is passed in with the correct parameter type and we can compute the cost based on that. However, when it comes to implementing the Gradient trait, and I use the same type for my input parameter, how does argmin know what the gradient means for these structs?

If we are always working with numeric types (f64 or f32), I understand how the linear algebra of defining a Jacobian or some other matrix can work. You don't need to know what each parameter means, just their values. This isn't true when using structs, like in my example. Could you please explain further how the algorithm handles these cases?

I guess the other possibility is that argmin is not meant to handle arbitrary associated types in these traits, in which case I would have expected to have them restricted by the compiler. In my opinion, this is the correct way to go, or else everything has to be very generic!

stefan-k · 2023-05-09T05:39:29Z

Maintainer

Apologies for the late reply and thanks for the kind words!

argmin does not know anything about your Param and Gradient, but it does require them to provide certain functionality via traits. Consider for instance Landweber, in particular this line. Here, P is required to implement the trait ArgminScaledSub<G, F, P>, which defines how to subtract a gradient of type G scaled by a scalar of type F from a parameter vector of type P, with an output of type P.
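
To make this concrete, here is a minimal sketch of what implementing such a trait for a custom parameter struct could look like. The `ScaledSub` trait below is a local stand-in that only mirrors the shape described above (subtract a gradient of type G, scaled by a scalar of type F, from a parameter of type P, returning a P). It is not copied from argmin-math, so check the actual `ArgminScaledSub` definition before relying on the method name or signature. `LensParams`, `LensGradient` and `step` are hypothetical names.

```rust
/// Local stand-in mirroring the shape of argmin-math's `ArgminScaledSub<G, F, P>`.
/// Assumption: the real trait's method name and signature may differ.
pub trait ScaledSub<G, F, P> {
    /// Computes `self - factor * grad`.
    fn scaled_sub(&self, factor: &F, grad: &G) -> P;
}

/// Hypothetical parameter type with named fields.
#[derive(Debug, Clone, PartialEq)]
pub struct LensParams {
    pub radius: f64,
    pub thickness: f64,
}

/// Hypothetical gradient type: one partial derivative per parameter field.
#[derive(Debug, Clone, PartialEq)]
pub struct LensGradient {
    pub d_radius: f64,
    pub d_thickness: f64,
}

impl ScaledSub<LensGradient, f64, LensParams> for LensParams {
    fn scaled_sub(&self, factor: &f64, grad: &LensGradient) -> LensParams {
        // This is where the struct's meaning is encoded: each field is paired
        // with the partial derivative that belongs to it.
        LensParams {
            radius: self.radius - factor * grad.d_radius,
            thickness: self.thickness - factor * grad.d_thickness,
        }
    }
}

/// One gradient-style update, generic over any type that provides `ScaledSub`.
pub fn step<P, G>(param: &P, grad: &G, step_size: f64) -> P
where
    P: ScaledSub<G, f64, P>,
{
    param.scaled_sub(&step_size, grad)
}
```

The generic `step` function never looks inside `P` or `G`; it only needs the trait bound, which is the same way a solver stays agnostic about what the fields mean.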

For Vec<F>s, ndarray and nalgebra types, these math-related traits are already implemented in argmin-math. If you provide your own types, you have to implement the relevant traits for those types. You can look at what the solver of your choice requires and implement the relevant argmin-math traits (as opposed to implementing all of them, which may not be feasible).

Why do you want your own structs? Typically one would resort to using one of the math backends. If you need to carry around additional data you may be better off moving it to the problem struct.

itrumper · 2023-05-09T12:52:22Z

Author

Not a problem! Totally reasonable time for a reply :)

That all makes sense. I think my surprise comes from where those trait bounds are applied. As I was following along in examples, I was implementing CostFunction and Gradient traits for my problem struct and I used a struct for the Param to get named quantities so that their meaning is communicated, not just their value. For example, using just a Vec<f64>, you don't know what each value in that set represents, but a struct with its fields should help clarify the meaning.

I see now that if needed, I could implement the math-related traits for any struct that I wanted. So to reframe the discussion, I am asking if there is some way for those trait bounds to be applied to the various traits that we implement for our problem structs as well as the solvers. This way, I know my implementation of a problem will work with at least some solvers.

I think the difficulty comes because the solvers do not require the same trait bounds (I assume). Is there a common trait among them all, for example the output is ArgminFloat?

If it is not possible to constrain the types, I think it would help to mention in the Book that the traits we implement for the problem are restricted by the solver we choose, so we should look at the solver's requirements before implementing the problem to avoid surprising type conflicts later on.

stefan-k · 2023-05-10T09:50:20Z

Maintainer

I used a struct for the Param to get named quantities so that their meaning is communicated, not just their value. For example, using just a Vec<f64>, you don't know what each value in that set represents, but a struct with its fields should help clarify the meaning.

That absolutely makes sense!

I think the difficulty comes because the solvers do not require the same trait bounds (I assume).

That's correct. Different solvers have different requirements.

I am asking if there is some way for those trait bounds to be applied to the various traits that we implement for our problem structs as well as the solvers. This way, I know my implementation of a problem will work with at least some solvers.
...
Is there a common trait among them all, for example the output is ArgminFloat?

No, there is no supertrait (I think that's what they are called, but I'm not sure). Do I understand correctly that you are asking for a trait which covers all math-related traits, and as such can be used to constrain for instance Param in the CostFunction trait? And the reason for this is that the compiler would error earlier (when defining the problem, rather than when combined with a solver)?

This has substantial downsides, in particular with the wider ecosystem:

It is difficult to define such a supertrait, because the math-related traits themselves have generics
These constraints often have to be splattered all over the place in the code, even in places where they don't seem to make sense (this one is a bit difficult for me to explain, and maybe it's not true in this particular case, but I've had certain experiences ;) )
SemVer: If a trait is added to the supertrait, that is a breaking change for downstream code, whereas without a supertrait this doesn't affect downstream code
It won't cover everything anyway, because external crates which use argmin as a "solver runtime" could add additional constraints on the types, therefore you can't be sure that your problem works with all solvers anyways
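
The first and third points can be illustrated with a small sketch. `ScaledSub` and `Dot` are local stand-ins for generic math traits (assumed shapes, not argmin-math's definitions), and `ParamMath` and `descend` are hypothetical names, not something that exists in argmin.

```rust
/// Local stand-ins for two generic math traits
/// (assumed shapes, not argmin-math's definitions).
pub trait ScaledSub<G, F, P> {
    fn scaled_sub(&self, factor: &F, grad: &G) -> P;
}
pub trait Dot<T, F> {
    fn dot(&self, other: &T) -> F;
}

/// Hypothetical supertrait bundling the math traits. The gradient type `G` and
/// float type `F` must be carried along as generics, so every place that names
/// `ParamMath` also has to name `G` and `F`.
pub trait ParamMath<G, F>: ScaledSub<G, F, Self> + Dot<G, F> + Sized {}

/// Blanket impl: anything providing all bundled traits is a `ParamMath`.
/// Adding one more bound here later would remove this impl from downstream
/// types that lack the new trait, which is a breaking change.
impl<T, G, F> ParamMath<G, F> for T where T: ScaledSub<G, F, T> + Dot<G, F> {}

impl ScaledSub<Vec<f64>, f64, Vec<f64>> for Vec<f64> {
    fn scaled_sub(&self, factor: &f64, grad: &Vec<f64>) -> Vec<f64> {
        self.iter().zip(grad).map(|(p, g)| p - factor * g).collect()
    }
}
impl Dot<Vec<f64>, f64> for Vec<f64> {
    fn dot(&self, other: &Vec<f64>) -> f64 {
        self.iter().zip(other).map(|(a, b)| a * b).sum()
    }
}

/// A function constrained by the supertrait instead of by individual traits.
/// It only uses `scaled_sub`, yet `P` must also implement `Dot`.
pub fn descend<P: ParamMath<G, f64>, G>(param: &P, grad: &G, step: f64) -> P {
    param.scaled_sub(&step, grad)
}
```

Note that `descend` demands `Dot` even though it never calls it, which is the kind of over-constraining a catch-all supertrait tends to cause.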

I personally think the fact that only the solver defines which traits must be implemented is a good design. For instance, if you want to use a custom type and you're only interested in L-BFGS, then you only need to implement the traits which are really needed.

Now back to your problem. Assuming you are using the Vec backend, you could do the following: Your custom type could implement From/Into for Vec<f64>. Your Param would be Vec<f64>, and when you pass the initial parameter into the Executor, you simply do initial_params.into(). After the optimization is done, you can transform the result back into your custom type: let final_params: MyStruct = result_vec.into(). The overhead is negligible, you can use the existing backends, and you still get the explicitness of your MyStruct.
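
A minimal sketch of that round trip, assuming a hypothetical `MyStruct` with two made-up fields. The actual Executor run is left out and replaced by a placeholder so the example stays self-contained; `round_trip` is only an illustrative helper.

```rust
/// Hypothetical custom parameter type with named fields.
#[derive(Debug, Clone, PartialEq)]
pub struct MyStruct {
    pub amplitude: f64,
    pub frequency: f64,
}

/// Flatten into the `Vec<f64>` that the solver works on.
/// The field order here defines the layout of the vector.
impl From<MyStruct> for Vec<f64> {
    fn from(p: MyStruct) -> Self {
        vec![p.amplitude, p.frequency]
    }
}

/// Rebuild the named struct from the solver's result.
/// Panics if the vector does not have exactly two entries.
impl From<Vec<f64>> for MyStruct {
    fn from(v: Vec<f64>) -> Self {
        assert_eq!(v.len(), 2, "expected exactly 2 parameters");
        MyStruct { amplitude: v[0], frequency: v[1] }
    }
}

pub fn round_trip(initial_params: MyStruct) -> MyStruct {
    // What would be handed to the Executor as the initial parameter.
    let init: Vec<f64> = initial_params.into();
    // Placeholder for the optimization run: the vector is passed through unchanged.
    let result_vec = init;
    // Back to the explicit type once the solver is done.
    result_vec.into()
}
```

Both conversions have to agree on the field order, so it is worth keeping the two impls next to each other.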

itrumper · 2023-05-10T13:14:02Z

Author

No, there is no supertrait (I think that's what they are called, but I'm not sure). Do I understand correctly that you are asking for a trait which covers all math-related traits, and as such can be used to constrain for instance Param in the CostFunction trait? And the reason for this is that the compiler would error earlier (when defining the problem, rather than when combined with a solver)?

That is correct, I was looking for a supertrait (I like the sound of it anyway and I'm pretty sure I've seen it in the Rust book) so that the compiler would error earlier when defining the problem versus when combining it with a solver.

Thank you for highlighting the reasons why a supertrait would make the ergonomics worse. This is really what I am seeking, so hearing the larger context for its impact helps put the idea in perspective.

I personally think the fact that only the solver defines which traits must be implemented is a good design. For instance, if you want to use a custom type and you're only interested in L-BFGS, then you only need to implement the traits which are really needed.

I agree, it is very clean and nice to only have to implement the traits for the particular solver, and setting up the problem definition is typically not fraught with type errors (like I have experienced using other solvers in the Rust ecosystem). This helps us get started quicker, but there could be some unforeseen type conflicts later on when we use a solver (I'll give my thoughts on how best to address this below).

Your custom type could implement From/Into for Vec<f64>

That's a great idea! Thanks for sharing. This is nice for when you're setting up the problem, but for people doing the implementation of the problem they would still have to work with the ambiguous Vec. To help mitigate this, I have been putting doc-style comments on the type Param definitions, so you can at least express some intent. It's not at the compiler level, but it's still nicer, in my opinion.
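
For illustration, this is roughly the pattern I mean. `Problem` is a simplified local stand-in with the same associated types as CostFunction (an assumption for the example, not argmin's actual trait), and `Pendulum` is made up. The named index constants go one step beyond doc comments and are an optional extra.

```rust
/// Local stand-in with the associated types discussed above
/// (assumption: simplified, not argmin's actual `CostFunction`).
pub trait Problem {
    type Param;
    type Output;
    fn cost(&self, param: &Self::Param) -> Self::Output;
}

/// Hypothetical problem: squared deviation of a pendulum's parameters from targets.
pub struct Pendulum {
    pub target_length: f64,
    pub target_mass: f64,
}

/// Named indices into `Param`, so the implementation avoids bare numbers.
pub const LENGTH: usize = 0;
pub const MASS: usize = 1;

impl Problem for Pendulum {
    /// Layout: `[length in m, mass in kg]`.
    type Param = Vec<f64>;
    /// Sum of squared deviations from the targets.
    type Output = f64;

    fn cost(&self, param: &Self::Param) -> Self::Output {
        let dl = param[LENGTH] - self.target_length;
        let dm = param[MASS] - self.target_mass;
        dl * dl + dm * dm
    }
}
```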

As I mentioned above, my objective with adding a supertrait would be to improve the ergonomics of the library. Based on your comment:

It won't cover everything anyway, because external crates which use argmin as a "solver runtime" could add additional constraints on the types, therefore you can't be sure that your problem works with all solvers anyways

Could we define a supertrait for each solver that expresses what their bounds are and then include that in the problem definition? This could either be done by requiring a separate trait called Solver where the implementation is just another type definition specifying which solvers you want to use this problem with. Then, in any of the other traits like CostFunction or Gradient we would use the Solver trait to further constrain the types.

These are just my musings; I'm not sure if this is even possible. I may take a look at implementing it to find out, but I've got other features needed for my application that I want to work on, so my time will go into those first :)

In the meantime, I think all of this is mitigated by having a section in the book that covers this topic. I can make an initial attempt at writing such a section if that sounds reasonable to you?

Thanks for the great discussion :)

stefan-k · 2023-05-11T19:12:37Z

Maintainer

This is nice for when you're setting up the problem, but for people doing the implementation of the problem they would still have to work with the ambiguous Vec.

That's a very good point.

Could we define a supertrait for each solver that expresses what their bounds are...

We had that for one solver once and I recall not liking it, but I don't recall the reason anymore (probably just taste).

... and then include that in the problem definition? This could either be done by requiring a separate trait called Solver where the implementation is just another type definition specifying which solvers you want to use this problem with. Then, in any of the other traits like CostFunction or Gradient we would use the Solver trait to further constrain the types.

Apologies, I have trouble following this thought. Could you provide a code example, please? I think this would make it easier for me to understand. (Side note: there is already a Solver trait ;)). I'd also be interested in your use case, because it sounds as if there are social factors involved which require a more detailed communication of intent (if that makes any sense).

In the meantime, I think all of this is mitigated by having a section in the book that covers this topic. I can make an initial attempt at writing such a section if that sounds reasonable to you?

Contributions to the docs are always highly appreciated! :) I just had a look at the docs and admittedly, it is not easy to see the trait bounds. Pointing people to the right place would definitely help.

itrumper · 2023-05-13T03:27:01Z

Author

We had that for one solver once and I recall not liking it, but I don't recall the reason anymore (probably just taste).

Ah, I can see that. With regard to putting together a code example, I will work on that once I nail down what exactly I mean :)

I'll put together that PR for the docs soon.
