For jobs where the scaling method is to match the number of client nodes, node pools offer an effective, native way to describe a cluster of resources. It would be great to automatically match the job group count with the number of instances in a node pool. This would allow users to scale the underlying clusters based on metrics such as incoming request count or CPU utilization and have service jobs placed on each node.
Note that for effective usage, one would have to ensure allocations are on distinct hosts, and that scaling down infrastructure doesn't impact running allocations but simply removes what was allocated on the now removed nodes.
Interestingly I think I just answered a question like this #797 (comment) 😄
The tricky part is that the job_summary metric I used in the query doesn't have a node_pool label, and I don't think it's even possible to add one, given that queued allocs are not running on any client. We could read the value from the job, but the "all" node pool would need special consideration.
We could also simplify things quite a bit by adding a new query operation to the Nomad APM to just return client counts.
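As a rough illustration of what such a client-count query could compute, here is a minimal Go sketch. It is not the autoscaler's actual code: the `nodeStub` type and the `countReadyClients` function are hypothetical names made up for this example, and it assumes the node list has already been fetched from Nomad. It also shows one possible way to handle the "all" node pool, by treating every node as a member of it.

```go
package main

import "fmt"

// nodeStub is a hypothetical, trimmed-down view of a Nomad client node.
// It is an assumption for this sketch, not a type from the Nomad API package.
type nodeStub struct {
	Name     string
	NodePool string
	Status   string
}

// allPool is the name of the node pool that every node belongs to.
const allPool = "all"

// countReadyClients returns how many ready clients belong to the given pool.
// A query for the "all" pool counts every ready node regardless of the pool
// recorded on the node itself.
func countReadyClients(nodes []nodeStub, pool string) int {
	count := 0
	for _, n := range nodes {
		if n.Status != "ready" {
			continue // skip clients that cannot run allocations
		}
		if pool == allPool || n.NodePool == pool {
			count++
		}
	}
	return count
}

func main() {
	// Made-up sample data for illustration only.
	nodes := []nodeStub{
		{Name: "client-1", NodePool: "gpu", Status: "ready"},
		{Name: "client-2", NodePool: "gpu", Status: "down"},
		{Name: "client-3", NodePool: "default", Status: "ready"},
	}
	fmt.Println(countReadyClients(nodes, "gpu"))
	fmt.Println(countReadyClients(nodes, allPool))
}
```

A scaling policy could then use the returned count directly as the target group count, instead of deriving it from job metrics that carry no node pool label.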
Then there's also the problem mentioned in the comment linked above:
Unfortunately this doesn't work as well because the group policy will not be able to take into consideration the number of queued allocations. So you will be able to scale up the number of clients, but not down 😅
So there's lots to improve, but node pools do open up some interesting points of exploration.