Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement GitRepo Deletion #32

Closed
3 tasks
srueg opened this issue Mar 30, 2020 · 12 comments · Fixed by #76
Closed
3 tasks

Implement GitRepo Deletion #32

srueg opened this issue Mar 30, 2020 · 12 comments · Fixed by #76
Labels
enhancement New feature or request

Comments

@srueg
Copy link
Contributor

srueg commented Mar 30, 2020

Implement the Git repository cleanup process with a retention policy feature (retain, delete) and by leveraging the finalizer pattern for actually deleting repositories.

Task Deliverables

  • Controller is able to delete managed Git repositories
  • Deletion policy is implemented
  • Finalizer pattern is used for actual deletion process
@srueg srueg added the enhancement New feature or request label Mar 30, 2020
@tobru
Copy link
Contributor

tobru commented Jun 8, 2020

To discuss:

  • Recycle policies like on PVC
    • Delayed deletion? Complicated with finalizer
    • Archive git repo?
  • Should a snapshot be kept before deleting stuff so that a restore could be possible?

To figure out:

  • What to actually delete?
    • GitRepos
    • File listing in Tenant object (cluster.yml)
    • Vault secrets
    • ...?

@Kidswiss
Copy link
Contributor

Kidswiss commented Jun 12, 2020

Here's my suggestion:

Use recycle policies:

  • delete -> obvious one
  • retain -> also pretty obvious
  • archive -> if the the implementation (git, vault, etc.) supports it, archive the project, if not fallback to retain

What to delete:

I'd suggest the default policy is archive and if you really want something gone, you'll have to set it explicitly beforehand.

@srueg
Copy link
Contributor Author

srueg commented Jun 12, 2020

In K8s PVC it's called reclaimPolicy which can be Retain, Delete or Recycle.
It's not really "reclaim" in our use case, so I'd suggest to name it deletionPolicy with Delete, Retain and Archive as options.

I'd suggest the default policy is archive and if you really want something gone, you'll have to set it explicitly beforehand.

Let's make the default configurable (i.e. on dev we probably want delete as default).

An interesting concept from Hive is delete protection:

When enabled, Hive will add the "hive.openshift.io/protected-delete" annotation to new ClusterDeployments. Once a ClusterDeployment has been installed, a user must remove the annotation from a ClusterDeployment prior to deleting it.

So we could add a syn.tools/protected-delete=true to prevent deletion of an object. To implement this properly an admission webhook is required. A "soft" version of it could work without a webhook though: if the annotation is set, the controller won't do any deprovisioning and leave the finalizer in place. The object would still be marked as deleted but it won't disappear as long as there are finalizers set.

@Kidswiss
Copy link
Contributor

I started with the basic framework to support deletionPolicy (though currently called recyclePolicy, but can be changed easily).

Also started to implement Delete and Archive functionality for Vault and Gitlab.

I like your suggestion with the delete protection annotation. I'll look into it.

@tobru
Copy link
Contributor

tobru commented Jun 12, 2020

What should the exact meaning of Retain? Would that mean the objects in Kubernetes are deleted, but no external resources are touched and just stay there as nothing happened? Aka the same behavior we have today?

related to #45: deleting a tenant should propagate to all clusters. e.g setting owner references accordingly.

Yes, but I think before we can implement this one we really need good safeguards as I already see this generating a huge clusterf*ck one day.

I did some research how others are doing that kind of stuff, to have a better understanding of it:

KubeDB

KubeDB also has some concepts regarding deletion: https://kubedb.com/docs/v0.13.0-rc.0/concepts/what-is-kubedb/overview/:

  • "Apply deletion lock to avoid accidental deletion of database."
  • "Keep track of deleted databases, cleanup prior snapshots with a single command."

They are calling it TerminationPolicy, with the possible values of DoNotTerminate, Pause (Default), Delete and WipeOut. See f.e. https://kubedb.com/docs/0.9.0/concepts/databases/postgres/#specterminationpolicy.

It is implemented as a ValidationWebhook.

Crossplane

The are calling it reclaimPolicy, with the possible values of Retain and Delete .

ReclaimPolicy specifies what will happen to this managed resource when its resource claim is deleted, and what will happen to the underlying external resource when the managed resource is deleted. The "Delete" policy causes the managed resource to be deleted when its bound resource claim is deleted, and in turn causes the external resource to be deleted when its managed resource is deleted. The "Retain" policy causes the managed resource to be retained, in binding phase "Released", when its resource claim is deleted, and in turn causes the external resource to be retained when its managed resource is deleted. The "Retain" policy is used when no policy is specified.

See https://github.com/crossplane/crossplane/blob/master/design/one-pager-resource-reclaim-policy.md

Conclusion

I guess the long-term goal should be to try to reuse as much as possible from Crossplane, even replacing parts of the Lieutenant Operator by Crossplane could be an option. But for the time being I think we should still implement it in Lieutenant Operator.

@Kidswiss
Copy link
Contributor

@tobru yes retain would simply leave the external resources be.

I like crossplane's concept a lot, but it might be quite overkill to implement in our lieutenant operator.

So is the long-term idea that we (VSHN) will submit code to crossplane as well to "outsource" these kinds of workloads away from our operator(s)?

@Kidswiss
Copy link
Contributor

I read up on how finalizers are implemented with the operator-sdk, as well as tobru's suggestions.

I also started to implement some finalizer logic into the gitrepo reconcile loop.

@Kidswiss
Copy link
Contributor

Did some plumbing for the finalizer handling.

Tested and implemented:

  • Vault soft and hard deletion
  • Gitlab repository soft and hard deletion

To do:

  • Delete cluster files from the tenant repository

Also we need to define how we want to handle the annotation to prevent accidental deletions. Should the operator set it on any CR by default?

@srueg
Copy link
Contributor Author

srueg commented Jun 15, 2020

Should the operator set it on any CR by default?

I'd say so, yes. But please make it configurable (e.g. OPERATOR_ENABLE_DELTE_PROTECTION=true), so we can skip in for example in dev envs.

@Kidswiss
Copy link
Contributor

Started implementing the deletion protection via label.

@Kidswiss
Copy link
Contributor

While testing the functionality I found some UX issues. When setting the annotation to false on a cluster, it will delete the vault token. But it won't delete the gitrepo, as it's a separate CR.

The annotation value should be propagate throughout the chain, if changed.

@Kidswiss Kidswiss mentioned this issue Jun 17, 2020
@Kidswiss
Copy link
Contributor

@srueg opened a PR for the changes: #76

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants