Add opt-in support for per bosh deployment DRS rules #378

Open
1 of 3 tasks
gberche-orange opened this issue Apr 22, 2024 · 3 comments

Comments

@gberche-orange
Contributor

gberche-orange commented Apr 22, 2024

Feature Request

Detailed Description

Given dynamically created bosh deployments (e.g. mysql-cluster-1, mysql-cluster-2, ... mysql-cluster-N), each with an instance group "mysql" of 3 instances,
In order for DRS to avoid placing the mysql instances of a given deployment on the same vSphere ESXi host,
I need the vSphere bosh cpi to support a DRS rule per deployment.

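Currently, either

  • a static DRS rule is declared per cluster in the cloud-config, for example through a vm_extension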

https://github.com/cloudfoundry/docs-bosh/blob/96fdb6fff79d7eed1f78b6fb05ce064de2acfea0/content/vsphere-cpi.md?plain=1#L15-L17

        * **drs_rules** [Array, optional]: Array of DRS rules applied to [constrain VM placement](vm-anti-affinity.md#vsphere). Must have only one.
            * **name** [String, required]: Name of a DRS rule that the Director will create.
            * **type** [String, required]: Type of a DRS rule. Currently only `separate_vms` is supported.
- type: replace
  path: /vm_extensions?/-
  value:
    name: drs-antiaffinity-r4
    cloud_properties:
      datacenters:
      - name: ((/secrets/vsphere_4_vcenter_dc))
        clusters:
        #r4-z1 cluster              
        - ((/secrets/vsphere_4_1_vcenter_cluster)):
            drs_rules:
            - name: ((/secrets/site_type))-bosh-coab-drs-antiaffinity
              type: separate_vms      
  • or systematic dynamic DRS rules can be turned on for a given bosh director, and then apply to all deployments

vcenter.enable_auto_anti_affinity_drs_rules:
  description: Creates anti-affinity DRS rules for each instance group of each bosh deployment to place VMs on separate hosts. The DRS rules are named from template <bosh-director-name>-<bosh-deployment-name>-<instance-group-name>
  default: false

def should_create_auto_drs_rule(vm_config, cluster)
  return @enable_auto_anti_affinity_drs_rules && vm_config.drs_rule(cluster).nil? && !vm_config.bosh_group.nil?
end

def create_drs_rules(vm_config, vm_mob, cluster)
  if should_create_auto_drs_rule(vm_config, cluster) then
    drs_rule_name = vm_config.bosh_group
  elsif !vm_config.drs_rule(cluster).nil? then
    drs_rule_res = vm_config.drs_rule(cluster)
    drs_rule_name = drs_rule_res['name']
  else
    return
  end
  # ... (the excerpt stops here; the rest of the method is not quoted)
end

def bosh_group
  if !agent_env['bosh'].nil? then
    return agent_env['bosh']['group']
  else
    return nil
  end
end

Given that env.bosh.group is systematically defined by the bosh director in https://github.com/cloudfoundry/bosh/blob/dec31de320fcd29a574db8685f6abf697138f788/src/bosh-director/lib/bosh/director/deployment_plan/steps/create_vm_step.rb#L135, DRS rules are created for each instance group of each deployment. The DRS rules are named from the template: <bosh-director-name>-<bosh-deployment-name>-<instance-group-name>

This results in a large number of auto-created DRS rules on bosh directors that already have a large number of deployments.
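
To make the scale concern concrete, here is a minimal Ruby sketch of the rule-name selection shown in the excerpt above, run over a made-up inventory. The `VmConfigStub` struct, the `drs_rule_name_for` helper, and the director, deployment and instance group names are assumptions for illustration, not CPI code; only the decision order follows the excerpt.

```ruby
require 'set'

# Hypothetical stand-in for the CPI's VmConfig: it carries only the two
# inputs that the excerpt consults.
VmConfigStub = Struct.new(:bosh_group, :static_drs_rule, keyword_init: true)

# Same decision order as the excerpt: auto rule first (flag on, no static
# rule, group present), then a static rule, otherwise no rule (nil).
def drs_rule_name_for(vm_config, enable_auto:)
  if enable_auto && vm_config.static_drs_rule.nil? && !vm_config.bosh_group.nil?
    vm_config.bosh_group
  elsif !vm_config.static_drs_rule.nil?
    vm_config.static_drs_rule['name']
  end
end

# Assumed inventory: 50 deployments, 2 instance groups each, 3 VMs per group.
vm_configs = (1..50).flat_map do |n|
  %w[mysql proxy].flat_map do |group|
    Array.new(3) { VmConfigStub.new(bosh_group: "my-director-mysql-cluster-#{n}-#{group}") }
  end
end

# One DRS rule per distinct name, so count the distinct names.
rule_names = vm_configs.map { |c| drs_rule_name_for(c, enable_auto: true) }.compact.to_set
puts rule_names.size # prints 100
```

With the flag on, every instance group of every deployment gets its own rule, whether or not it needs one, which is the behavior this issue asks to make selective.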

While there is theoretically no limit to the number of DRS rules, enabling this property does not seem recommended on a bosh director with a large number of deployments (unless every single instance group in all deployments requires an anti-affinity DRS rule).
https://communities.vmware.com/t5/VMware-vCenter-Discussions/Maximum-Number-of-DRS-Rules-per-Cluster/td-p/2744546

It is recommend to use DRS rules sparingly, hence it is better not to use them unless it is absolutely required. As the number of rules gets increased, it will restrict DRS opportunities of balancing the cluster. It is operationally challenging in managing them as well.

Context

Why is this change important to you? How would you use it?

In order to benefit from vSphere HA across distinct ESXi hosts, I need DRS anti-affinity on the relevant instance groups of selected deployments. This is important for the many dynamic bosh deployments which cannot leverage static DRS rules declared in the cloud-config.

Alternative Implementations

VM Types / VM Extensions support for enable_auto_anti_affinity_drs_rules

In addition to being supported at the global level, the enable_auto_anti_affinity_drs_rules property would also be supported in a vm_types or vm_extensions block, overriding the global value.

Inspired by the similar property upgrade_hw_version:

if vm_config.upgrade_hw_version?(vm_config.vm_type.upgrade_hw_version, @upgrade_hw_version)
  created_vm.upgrade_vm_virtual_hardware
end

def upgrade_hw_version?(vmtype_hw_version, global_hw_version)
  vmtype_hw_version.nil? ? global_hw_version : vmtype_hw_version
end

https://github.com/orange-cloudfoundry/bosh-vsphere-cpi-release/blob/87b8474f18046e6920d4c44478138f084cb3cdf3/src/vsphere_cpi/spec/unit/cloud/vsphere/vm_config_spec.rb#L24-L50
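
Applied to the proposed property, the same override pattern could look like the sketch below. The method name `auto_anti_affinity_drs_rules?` and its arguments are hypothetical; the CPI has no such method today, and this is only one possible shape for it.

```ruby
# Hypothetical helper modelled on upgrade_hw_version?: a value set in the
# vm_types / vm_extensions cloud_properties wins, even when it is false;
# only nil (property absent) falls back to the global CPI setting.
def auto_anti_affinity_drs_rules?(vmtype_value, global_value)
  vmtype_value.nil? ? global_value : vmtype_value
end

# Global flag off, one vm_extension opts in:
puts auto_anti_affinity_drs_rules?(true, false)  # true
# Global flag on, one vm_type opts out:
puts auto_anti_affinity_drs_rules?(false, true)  # false
# Property absent: the global value applies.
puts auto_anti_affinity_drs_rules?(nil, false)   # false
```

The explicit `nil?` check matters: a plain `vmtype_value || global_value` would ignore an explicit `false` at the vm_type level, so opting out under a globally enabled flag would not work.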

New cpi property

EDIT: this proposal is likely too complex

Add a new cpi flag vcenter.restrict_auto_anti_affinity_drs_rules_to_marked_instance_groups, which adds opt-in behavior without breaking existing behavior

  vcenter.enable_auto_anti_affinity_drs_rules:
    description: Creates a DRS rule for each instance group to place VMs on separate hosts. Conditional to the deployment manifest to set a non-nil `env.bosh.group` field in an instance group. The DRS rules are named from template: <bosh-director-name>-<bosh-deployment-name>-<instance-group-name>
    default: false

  vcenter.restrict_auto_anti_affinity_drs_rules_to_marked_instance_groups:
    description: When `enable_auto_anti_affinity_drs_rules=true`, restrict auto generated DRS rules to instance groups declaring `env.bosh.enable_auto_anti_affinity_drs_rules=true` in the deployment manifest
    default: false
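
The interplay of the two flags can be sketched as follows. This is only an illustration of the proposed semantics: the helper `auto_drs_rule_wanted?` is hypothetical, the `env.bosh.enable_auto_anti_affinity_drs_rules` marker is the one proposed in the description above and not an existing BOSH field, and the check for an already configured static DRS rule is left out for brevity.

```ruby
# Hypothetical decision helper. `agent_env` is the VM's env hash, as read by
# the bosh_group excerpt earlier in this issue.
def auto_drs_rule_wanted?(agent_env, enable_auto:, restrict_to_marked:)
  bosh_env = agent_env['bosh'] || {}
  return false unless enable_auto            # feature off globally
  return false if bosh_env['group'].nil?     # no group, no rule name
  return true unless restrict_to_marked      # existing behavior: all groups

  # Proposed behavior: only instance groups that opt in get a rule.
  bosh_env['enable_auto_anti_affinity_drs_rules'] == true
end

# Example env hashes with made-up group names.
marked   = { 'bosh' => { 'group' => 'd-mysql-cluster-1-mysql',
                         'enable_auto_anti_affinity_drs_rules' => true } }
unmarked = { 'bosh' => { 'group' => 'd-mysql-cluster-1-proxy' } }

puts auto_drs_rule_wanted?(marked,   enable_auto: true, restrict_to_marked: true)  # true
puts auto_drs_rule_wanted?(unmarked, enable_auto: true, restrict_to_marked: true)  # false
puts auto_drs_rule_wanted?(unmarked, enable_auto: true, restrict_to_marked: false) # true
```

Because the restriction defaults to false, directors that already enable auto rules keep their current behavior.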

Complexity

  • Low - Simple enhancement or bug fix, no architectural changes or refactoring
  • Medium - Change requires some thought, but is relatively isolated
  • High - Significant architectural change or large refactor
gberche-orange added a commit to orange-cloudfoundry/bosh-vsphere-cpi-release that referenced this issue Apr 22, 2024
See related cloudfoundry#378

Extracted from https://www.pivotaltracker.com/n/projects/1133984/stories/133642741
 >  We can create a DRS rule based on env.bosh.group (skip if this field is not provided).
@jpalermo jpalermo moved this from Inbox to Waiting for Changes | Open for Contribution in Foundational Infrastructure Working Group Apr 25, 2024
gberche-orange added a commit to orange-cloudfoundry/bosh-vsphere-cpi-release that referenced this issue Apr 26, 2024
See related cloudfoundry#378

Extracted from https://www.pivotaltracker.com/n/projects/1133984/stories/133642741
 >  We can create a DRS rule based on env.bosh.group (skip if this field is not provided).
selzoc pushed a commit that referenced this issue Apr 26, 2024
See related #378

Extracted from https://www.pivotaltracker.com/n/projects/1133984/stories/133642741
 >  We can create a DRS rule based on env.bosh.group (skip if this field is not provided).
@gberche-orange
Contributor Author

@selzoc would you accept a PR implementing this proposal ?

@selzoc
Member

selzoc commented Apr 29, 2024

 >  @selzoc would you accept a PR implementing this proposal ?

Well, it's not up to me! But I see this issue is in the Waiting for Changes | Open for Contribution part of the working group project, so we'd probably review it.

@gberche-orange
Contributor Author

@cunnie would you by chance have the historical background to review and comment on this updated proposal, in particular the "VM Types / VM Extensions support for enable_auto_anti_affinity_drs_rules" section above?

Labels
None yet
Projects
Status: Waiting for Changes | Open for Contribution
Development

No branches or pull requests

2 participants