Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor checkpointing #51

Open
wants to merge 1 commit into
base: dra-1.31
Choose a base branch
from

Conversation

bart0sh
Copy link

@bart0sh bart0sh commented Jul 18, 2024

What type of PR is this?

/kind bug
/kind cleanup

What this PR does / why we need it:

Current implementation doesn't guarantee that checkpointing would work after changing DRA data structures, which was explained in the @123552

This PR changes checkpointing implementation by isolating DRA checkpointing into a separate API maintained as other Kubernetes APIs.

Which issue(s) this PR fixes:

Hopefully fixes kubernetes#123552

Special notes for your reviewer:

This is just a first step, a base for further work. The next step would be checkpointing data optimisation, i.e. only the data that's required should be checkpointed, which is currently not the case.

@bart0sh
Copy link
Author

bart0sh commented Jul 18, 2024

/cc @pohly @klueska @moshe010

@bart0sh bart0sh force-pushed the PR150-dra-refactor-checkpoint branch 2 times, most recently from b69d976 to 1e9402b Compare July 18, 2024 07:20
@pohly pohly force-pushed the dra-1.31 branch 5 times, most recently from 13c74c9 to 85f0965 Compare July 21, 2024 15:26
@bart0sh bart0sh force-pushed the PR150-dra-refactor-checkpoint branch from 1e9402b to 8693ebc Compare July 21, 2024 16:12
@pohly pohly force-pushed the dra-1.31 branch 3 times, most recently from 818c6c1 to a011083 Compare July 21, 2024 18:57
@bart0sh bart0sh force-pushed the PR150-dra-refactor-checkpoint branch from 8693ebc to ce39107 Compare July 21, 2024 20:08
@bart0sh bart0sh marked this pull request as ready for review July 21, 2024 20:09
@bart0sh bart0sh force-pushed the PR150-dra-refactor-checkpoint branch from ce39107 to 5452056 Compare July 21, 2024 20:14
@bart0sh
Copy link
Author

bart0sh commented Jul 21, 2024

/assign @pohly @klueska @moshe010

PTAL if this approach makes sense, thanks.

@pohly pohly force-pushed the dra-1.31 branch 2 times, most recently from fb59476 to 68ad8b2 Compare July 22, 2024 07:55
// plugin should be invoked to process this ResourceHandle's data once it
// lands on a node. This may differ from the DriverName set in
// ResourceClaimStatus this ResourceHandle is embedded in.
DriverName string `json:"driverName,omitempty" protobuf:"bytes,1,opt,name=driverName"`
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need protobuf, do we?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the json tag also can be dropped. It should be able to decode 1.30 files without it (JSON isn't case-sensitive, if I'm not mistaken).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

)

// V1 data structures are copies of the correspondent structures from state_checkpoint.go (k/k 1.30)
// This is done to avoid breaking changes
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the goal is to define two versions as a proof-of-concept, right?

Some commit message explaining why you are making these changes would help with a review.

Also, this isn't complete yet, is it? I was looking for the code that switches between the two versions depending on what is in a checkpoint file, but haven't found it.

cc @klueska

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually an older comment. Might be obsolete now, I haven't checked again yet.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, obsolete as the rest of the comments due to the latest update.

@@ -85,9 +85,6 @@ func (manager *impl) GetCheckpoint(checkpointKey string, checkpoint Checkpoint)
return err
}
err = checkpoint.UnmarshalCheckpoint(blob)
if err == nil {
err = checkpoint.VerifyChecksum()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bart0sh: can you add a commit message explaining the "what" and "why" of this commit?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as we moved to another approach this is not needed anymore. checksum is always verified. see details in the types.go

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bart0sh I'm not quite sure what you mean? This is in the generic library code, and Patrick made changes to the devicemanager and the cpumanager to adhere to this API. I don't think we can just remove this because of that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion. This is an old code and it's not in this PR anymore.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you forgot to remove it? I am commenting on this from the "Files Changed" tab of the PR...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my bad. removed. Thank you for pointing out.

@bart0sh bart0sh force-pushed the PR150-dra-refactor-checkpoint branch from 5452056 to 70a9844 Compare July 22, 2024 09:41
@pohly pohly force-pushed the dra-1.31 branch 3 times, most recently from 43567fc to d11b58e Compare July 22, 2024 16:09
@bart0sh bart0sh force-pushed the PR150-dra-refactor-checkpoint branch 2 times, most recently from 6bdb873 to 19b170b Compare July 22, 2024 18:32
@bart0sh
Copy link
Author

bart0sh commented Jul 22, 2024

@pohly @klueska Updated code to use common API management approach, PTAL.

@bart0sh bart0sh force-pushed the PR150-dra-refactor-checkpoint branch from 19b170b to c94f83f Compare July 22, 2024 18:52
@bart0sh
Copy link
Author

bart0sh commented Jul 22, 2024

Glancing through https://github.com/pohly/kubernetes/tree/master/pkg/kubelet/apis/config, this PR doesn't seem to adhere to the organization we would need to eventually support automatic generation of the conversion code.

From what I can tell, most of what you currently have under checkpoint/v1alpha1 would need to be moved up to checkpoint and then the folder for v1alpha1 (or just v1 as I have suggested) would only contain a single file, with contents similar to the following: https://github.com/pohly/kubernetes/blob/master/pkg/kubelet/apis/config/v1alpha1/doc.go

Unlike kubelet/apis/config, this is an internal-only API, so it doesn't need internal and external parts. When we create v2, the conversion-gen machinery should be able to generate v1->v2 conversion code.

@pohly Is this a correct assumption?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this file, even if we don't ever plan to expose this over the API server? Seems silly to have a whole file just for this one constant.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we don't. And we don't need doc.go either it seems. I'm going to remove both and also move APIs(New, MarshalCheckpoint, etc) out of this directory to avoid duplicating them for every API version.

"k8s.io/kubernetes/pkg/kubelet/checkpointmanager"
"k8s.io/kubernetes/pkg/kubelet/checkpointmanager/errors"
checkpointapi "k8s.io/kubernetes/pkg/kubelet/cm/dra/checkpoint/v1"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we not just alias this to checkpoint? Calling checkpointapi.New() is more awkward than simply calling checkpoint.New().

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do that, but we may need to import checkpoint APIs from the k8s.io/kubernetes/pkg/kubelet/cm/dra/checkpoint. How would you suggest to call this import?

pc := &Checkpoint{
TypeMeta: metav1.TypeMeta{
Kind: "DRACheckpoint",
APIVersion: "checkpoint.dra.kubelet.k8s.io/v1",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use the GroupName variable here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean moving GroupName constant from register.go here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added GroupName constant to the types.go and used it in the api.go

checkpoint "k8s.io/kubernetes/pkg/kubelet/cm/dra/checkpoint"
checkpointapi "k8s.io/kubernetes/pkg/kubelet/cm/dra/checkpoint/v1"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't expect us to have to pull both of these in here. Why is ClaimInfoStateList defined in checkpoint but ClaimInfoState is defined in checkpoint/v1?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ClaimInfoStateList is here to avoid duplication in every API version. It's just a helper type for the list of ClaimInfoStates. If you think it shouldn't be outside of checkpoint/ I can either move it there or get rid of it and simply use []checkpoint.ClaimInfoState instead of it. Which way would you prefer?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't expect us to have to pull both of these in here.

that's a result of getting rid of state subdirectory, I guess. Should I return it back and put state_checkpoint* files there?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved state_checkpoint.go and its tests to the pkg/kubelet/cm/dra/state/. Does it look better now?

@klueska
Copy link

klueska commented Jul 22, 2024

nlike kubelet/apis/config, this is an internal-only API, so it doesn't need internal and external parts. When we create v2, the conversion-gen machinery should be able to generate v1->v2 conversion code.

@pohly Is this a correct assumption?

I would think that the code in pkg/kubelet/cm/dra/checkpoint should always be the latest, with any generated conversion code living in checkpoint/v1, checkpoint/v2, etc. Where (for now) the code under checkpoint/v1 is simply an "identity" conversion (i.e. it is a 1-1 mapping where no real conversion takes place), but will be updated if / when we ever add a v2 -- at which point v2 will be the "identity" conversion mapping to the latest changes in pkg/kubelet/cm/dra/checkpoint.

@pohly
Copy link
Owner

pohly commented Jul 23, 2024

Unlike kubelet/apis/config, this is an internal-only API, so it doesn't need internal and external parts. When we create v2, the conversion-gen machinery should be able to generate v1->v2 conversion code.

@pohly Is this a correct assumption?

I am not sure. The code generator should be able to do it, but I am less sure about how to configure the scheme to use such a direct conversion path. I know for "normal" APIs, the conversion path is always "version A" <-> "internal" <-> "version B". That means conversion back and forth is only needed between each version and the internal representation. I've not seen it being used with just conversion from "version A" -> "version B".

No matter whether this works or not, we can start with just encoding/decoding "version A", without any conversion. Then if, and only if we need conversion, we can add it, either with such an internal representation or directly.

@bart0sh
Copy link
Author

bart0sh commented Jul 23, 2024

No matter whether this works or not, we can start with just encoding/decoding "version A", without any conversion. Then if, and only if we need conversion, we can add it, either with such an internal representation or directly.

That was my intention. I didn't do any conversion as we have only one version. I can try to add a new verson locally and see if conversion-gen is able to create v1->v2 and v2->v1 conversion, but in my understanding it should. That's why I think that we don't need "internal" part of the API, it would be just a duplication of the latest API version.

@bart0sh bart0sh force-pushed the PR150-dra-refactor-checkpoint branch from 6e92d20 to 02a5ad3 Compare July 23, 2024 09:38
@bart0sh
Copy link
Author

bart0sh commented Jul 23, 2024

@pohly btw, do we really ever need to convert from v2 to v1? to do that we'd need to backport v2 types and conversion code to the previous kubelet version as far as I understand. Please, clarify.

@bart0sh bart0sh force-pushed the PR150-dra-refactor-checkpoint branch from 02a5ad3 to 18b7cde Compare July 23, 2024 09:48
@pohly
Copy link
Owner

pohly commented Jul 23, 2024

btw, do we really ever need to convert from v2 to v1? to do that we'd need to backport v2 types and conversion code to the previous kubelet version as far as I understand. Please, clarify.

I think we need to support downgrades. It's not okay if kubelet v1.32 encounters a checkpoint file written by kubelet v1.33 and then refuses to start. That's the opposite direction of what we encountered before, but it's also a problem.

To answer your question: writing a checkpoint v2 in kubelet v1.33 and then failing to handle it in kubelet v1.32.0 (no backporting!) has that problem, so it that was the plan, then we need to reconsider.

@bart0sh bart0sh force-pushed the PR150-dra-refactor-checkpoint branch 2 times, most recently from 9cab240 to 649411e Compare July 23, 2024 15:41
@bart0sh
Copy link
Author

bart0sh commented Jul 23, 2024

@pohly @klueska I've just pushed new changes:

  • used checkpoint structure proposed by @pohly
  • tested downgrades and upgrades
  • moved checkpoint APIs (not versioned) to the dra/checkpoint/
  • moved versioned API to the dra/state/v1
  • renamed non-versioned apis and structures as @klueska suggested

@bart0sh
Copy link
Author

bart0sh commented Jul 23, 2024

@pohly Is it ok to use versioned checkpoint structures (e.g ClaimInfoStateList) in the manager.go and claiminfo.go or we should avoid it? If yes, how to better do it? Should we use "internal" API and conversions for this?

@bart0sh bart0sh force-pushed the PR150-dra-refactor-checkpoint branch 3 times, most recently from 74af2e5 to 38ed8c2 Compare July 23, 2024 16:06
// Device is how a DRA driver described an allocated device in a claim
// to kubelet. RequestName and CDI device IDs are optional.
// +k8s:deepcopy-gen=true
type Device struct {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For symmetry...

Suggested change
type Device struct {
type DeviceState struct {

Copy link
Author

@bart0sh bart0sh Jul 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I also rename DriverState.Devices -> DriverState.DeviceStates ?

Store(state.ClaimInfoStateList) error
}

type checkpointer struct {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be called checkpointerV1

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it should always work with the latest API version.

Co-authored-by: Kevin Klues <[email protected]>
@bart0sh
Copy link
Author

bart0sh commented Jul 23, 2024

@klueska @pohly let's continue in the k/k upstream PR: kubernetes#126303

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants