Replies: 3 comments
-
Unfortunately we don't have good documentation on these parameters internally, and I think we need help from @petergjoel to fully understand how to achieve your desired results.
-
Most of the parameters have a direct equivalent in Fig. 2 of On Time with Minimal Expected Cost!. For your particular case, you need to limit the learning to one iteration, as in your example.
This assumes that your model ALWAYS reaches the trace-termination condition.
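As a rough illustration only (not UPPAAL's actual implementation), the iterative learning scheme trains a batch of runs per iteration, so the total number of training episodes is the product of the iteration count and the runs per iteration. Limiting to one iteration therefore caps training at a single batch:

```python
def train(iterations, runs_per_iteration):
    """Sketch of an iterative learning loop: the total number of
    training episodes is iterations * runs_per_iteration."""
    episodes = 0
    for _ in range(iterations):
        # In each iteration, a fixed batch of traces is simulated
        # and used to refine the candidate strategy.
        for _ in range(runs_per_iteration):
            episodes += 1  # one simulated trace / training episode
    return episodes

# With a single iteration, episodes == runs_per_iteration.
assert train(1, 10) == 10
# With more iterations, the episode count multiplies.
assert train(2, 10) == 20
```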
-
Setting
-
By default, when using reinforcement learning to generate a strategy with minE or maxE queries, UPPAAL automatically chooses a number of traces to train on. On a very simple example, it seems to default to 9000 runs. However, for experiments it can be beneficial to fix the number of training episodes while changing other variables. How can this be done? I have previously tried learning parameters like these to train for 10 traces only:
But this (seemingly) results in twice that number of episodes being trained.
Is the number in the result box correct, and should I enter half as many training episodes as I want in the boxes? Or is the reporting misleading somehow, and did it actually train for the 10 traces I wanted?
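For reference, a strategy-learning query of the kind in question looks roughly like this (the model's cost expression, time bound, and goal location here are hypothetical, not taken from the example above):

```
// Learn a strategy minimising expected accumulated cost within 100 time
// units; 'cost' and 'done' are placeholder names for some model.
strategy opt = minE (cost) [<=100] {} -> {} : <> done
```

The learning parameters (number of runs, iterations, etc.) are then set in the query's Options pane rather than in the query text itself.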