Set job group count to the number of nodes in a node pool #681

josegonzalez · 2023-07-26T03:16:48Z

For jobs where the scaling method is to match the number of client nodes, node pools offer an effective, native way to describe a cluster of resources. It would be great to automatically match the job group count with the number of instances in a node pool. This would allow users to scale the underlying cluster based on metrics such as incoming request count or CPU utilization and have service jobs placed on each node.

Note that for this to be effective, one would have to ensure that allocations land on distinct hosts, and that scaling down infrastructure doesn't impact the remaining running allocations but simply removes the ones that were placed on the now-removed nodes.
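A minimal sketch of the first requirement, assuming the group carries Nomad's `distinct_hosts` constraint so that no two of its allocations share a client. The fragment follows the shape of Nomad's JSON job format as I understand it (`Constraints` entries with `LTarget`, `RTarget`, and `Operand`); the helper name `group_fragment` and the group name `web` are hypothetical, used only for illustration.

```python
import json


def group_fragment(name: str, count: int) -> dict:
    """Build a task group fragment whose allocations must be on distinct clients.

    Assumption: field names mirror Nomad's JSON job format; verify them
    against the job specification before relying on this.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    return {
        "Name": name,
        "Count": count,
        "Constraints": [
            # distinct_hosts: never place two allocations of this group on one node
            {"LTarget": "", "RTarget": "true", "Operand": "distinct_hosts"},
        ],
    }


if __name__ == "__main__":
    # Hypothetical group "web" sized for a three-node pool.
    print(json.dumps(group_fragment("web", 3), indent=2))
```

With this constraint in place, a group count equal to the number of eligible nodes in the pool should result in at most one allocation per node.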

lgfa29 · 2023-12-22T03:53:43Z

Hi @josegonzalez 👋

Interestingly, I think I just answered a question like this in #797 (comment) 😄

The tricky part is that the `job_summary` metric I used in the query doesn't have a `node_pool` label, and I don't think it's even possible to add one, given that queued allocs are not running on any client. We could read the value from the job, but the `all` node pool would need special consideration.

We could also simplify things quite a bit by adding a new query operation to the Nomad APM to just return client counts.
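As a rough sketch of what such a client-count operation might compute (not how the Nomad APM actually works), the function below counts ready, scheduling-eligible nodes in a pool. It assumes node records shaped like the stubs returned by Nomad's node list API, with `Status`, `SchedulingEligibility`, and `NodePool` fields; the name `count_clients` and the sample data are hypothetical. The built-in `all` pool is special-cased because it contains every node regardless of the pool recorded on the node.

```python
from typing import Iterable, Mapping

ALL_POOL = "all"  # Nomad's built-in pool that contains every node


def count_clients(nodes: Iterable[Mapping], node_pool: str) -> int:
    """Count ready, scheduling-eligible nodes that belong to node_pool."""
    total = 0
    for node in nodes:
        if node.get("Status") != "ready":
            continue  # skip down or initializing clients
        if node.get("SchedulingEligibility") != "eligible":
            continue  # skip draining or manually ineligible clients
        if node_pool != ALL_POOL and node.get("NodePool") != node_pool:
            continue  # every node counts toward the "all" pool
        total += 1
    return total


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    nodes = [
        {"Status": "ready", "SchedulingEligibility": "eligible", "NodePool": "web"},
        {"Status": "ready", "SchedulingEligibility": "eligible", "NodePool": "web"},
        {"Status": "down", "SchedulingEligibility": "eligible", "NodePool": "web"},
        {"Status": "ready", "SchedulingEligibility": "ineligible", "NodePool": "web"},
        {"Status": "ready", "SchedulingEligibility": "eligible", "NodePool": "default"},
    ]
    print(count_clients(nodes, "web"))  # 2
    print(count_clients(nodes, "all"))  # 3
```

The resulting number could then be used as the target count for the group, which is the value the proposal wants the policy to track.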

Then there's also the problem mentioned in the comment linked above:

> Unfortunately this doesn't work as well because the group policy will not be able to take into consideration the number of queued allocations. So you will be able to scale up the number of clients, but not down 😅

So lots to improve, but node pools do open up some interesting points of exploration.

Thanks for the suggestion!

lgfa29 added the stage/accepted, type/enhancement, and theme/policy (Policy source, parsing and validation) labels on Dec 22, 2023
