Conversation

@adrianreber
Member

As described in sig-wg-lifecycle.md, this PR is the next step after sending an email to dev@kubernetes.io about the creation of the Working Group Checkpoint Restore.

CC: @rst0git, @viktoriaas, @xhejtman

@k8s-ci-robot added the do-not-merge/work-in-progress (indicates that a PR should not merge because it is a work in progress) and cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) labels Jul 3, 2025
@k8s-ci-robot requested review from deads2k and macsko July 3, 2025 13:33
@k8s-ci-robot
Contributor

Welcome @adrianreber!

It looks like this is your first PR to kubernetes/community 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/community has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot added the sig/api-machinery (categorizes an issue or PR as relevant to SIG API Machinery), needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test), size/M (denotes a PR that changes 30-99 lines, ignoring generated files), and sig/cli (categorizes an issue or PR as relevant to SIG CLI) labels Jul 3, 2025
@k8s-ci-robot
Contributor

Hi @adrianreber. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot added the sig/node (categorizes an issue or PR as relevant to SIG Node) and sig/scheduling (categorizes an issue or PR as relevant to SIG Scheduling) labels Jul 3, 2025
@github-project-automation bot moved this to Needs Triage in SIG Scheduling Jul 3, 2025
@k8s-ci-robot added the do-not-merge/invalid-owners-file (indicates that a PR should not merge because it has an invalid OWNERS file in it) label Jul 3, 2025
@kannon92
Contributor

/ok-to-test

@k8s-ci-robot added the ok-to-test (indicates a non-member PR verified by an org member that is safe to test) label and removed the needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test) label Jul 10, 2025
@kannon92
Contributor

Looking at #8519,

I see that we are missing a charter.

@adrianreber
Member Author

> Looking at #8519,
>
> I see that we are missing a charter.

In https://github.com/kubernetes/community/blob/master/sig-wg-lifecycle.md#GitHub it says to add a charter once this initial PR has been merged. That's why I skipped it.

sigs.yaml Outdated
the integration of Checkpoint/Restore functionality into Kubernetes.
charter_link: charter.md
stakeholder_sigs:
Member

SIG Auth may have a big say in the security of this whole restoration pipeline

Member

@rst0git Jul 19, 2025

Thank you for pointing this out! Security is definitely an important topic that we need to discuss with sig-auth, both for the checkpoint API and the restoration pipeline. The following paper and master's thesis describe our recent work on this topic:

Member Author

I added SIG Auth to the list of stakeholder SIGs.

Member

this showed up in the sig-auth meeting, we may have missed the discussion around this WG

if this WG is contemplating taking state from a running pod / saving it / letting it be consumed on another node or from another pod or another namespace, then sig-auth is definitely interested in making sure the permissions model around that exists and is ~consistent with similar things Kubernetes does elsewhere (like PVC / snapshots)

We're happy to consult on that, I'm not sure our awareness / involvement rises to the level of sponsoring the WG :)

cc @kubernetes/sig-auth-leads

Member

@mikebrow Sep 9, 2025

nod.. definitely needs an extra level of security due to customer data being serialized and available in the checkpoint, esp if not encrypted, but also due to windows of opportunity to do transactions/data manipulation.. then "undo" them by restoring a checkpoint

@k8s-ci-robot added the committee/steering (denotes an issue or PR intended to be handled by the steering committee) label and removed the size/M (denotes a PR that changes 30-99 lines, ignoring generated files) label Jul 20, 2025
@helayoty moved this from Needs Triage to Backlog in SIG Scheduling Sep 9, 2025
@helayoty moved this from Backlog to Needs Review in SIG Scheduling Sep 9, 2025
@k8s-ci-robot added the needs-rebase (indicates a PR cannot be merged because it has merge conflicts with HEAD) label Sep 9, 2025
@lujinda

lujinda commented Sep 14, 2025

Is there a slack channel where we can discuss C/R related ideas? Thanks

@adrianreber
Member Author

> Is there a slack channel where we can discuss C/R related ideas? Thanks

You are not the first to ask. We are kind of waiting for the proposal to be accepted before we have a Slack channel. Not sure if there is another way to have a Slack channel without having the proposal merged.

@rst0git
Member

rst0git commented Sep 15, 2025

> Is there a slack channel where we can discuss C/R related ideas?

@lujinda Please reach out to us in the Kubernetes slack. You can find Viktoria, Adrian, and myself there :)

Member

@BenTheElder left a comment

I can't tell if this has been raised to all the relevant SIGs yet (via their mailing list, Slack, meetings or similar, particularly to the leads to +1)

Otherwise I think this is looking pretty good ...

@adrianreber
Member Author

@rst0git @viktoriaas are currently trying to bring this WG up in all the mentioned SIGs.

Which SIGs haven't been officially informed about this?

@viktoriaas

> @rst0git @viktoriaas are currently trying to bring this WG up in all the mentioned SIGs.
>
> Which SIGs haven't been officially informed about this?

We attended SIG Node (September 16) and SIG Scheduling (September 18), so SIG API Machinery, SIG Auth, and SIG Apps are left.

@BenTheElder
Member

For SIG Apps I think we have @janetkuo and @soltysh on this thread, cc @kow3ns.

For auth @ritazh @liggitt have commented here but I think we also need to discuss with others, x-ref https://github.com/kubernetes/community/pull/8508/files#r2274401485

Going to the meeting is a good approach, but you could also try raising it on the mailing lists / Slack channels for earlier feedback.

@k8s-ci-robot removed the needs-rebase (indicates a PR cannot be merged because it has merge conflicts with HEAD) label Oct 8, 2025
@kow3ns
Member

kow3ns commented Oct 13, 2025

After hearing the presentation, SIG Apps is +1. We don't anticipate this work, in the near term, impacting the in-tree components that fall under SIG Apps. However, we think we can be advocates for users who run workloads and applications that use the infrastructure, to ensure that it meets their use cases.

Comment on lines +16 to +19
* [SIG Apps](/sig-apps)
* [SIG Auth](/sig-auth)
* [SIG Node](/sig-node)
* [SIG Scheduling](/sig-scheduling)
Member

I see SIG Apps +1 in the comments. Do we have +1 from the other 3 SIGs in the comments?

Member Author

@haircommander Are you the right person to +1 from SIG Node?

@viktoriaas Who at SIG Scheduling did you talk to? Can they +1 here?

@rst0git plans to present at the SIG-Auth meeting on Wednesday, October 22nd.

Member

> Who at SIG scheduling did you talk to. Can they +1 here?

@dom4ha As a follow-up on the discussion we had at the SIG-Scheduling meeting on September 18th, would you be able to add +1 for our working group?

Contributor

I think an ack from sig-scheduling is still needed.

Member

+1 from sig-scheduling. We discussed it at our SIG meeting and are looking forward to any integration with the scheduler logic.

Co-authored-by: Viktória Spišaková <spisakova@ics.muni.cz>
Co-authored-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
Comment on lines +44 to +47
- SIG Node
- SIG Scheduling
- SIG Auth
- SIG Apps
Member

Is SIG Network a stakeholder, given the details around checkpointing/restoring the network of a container?

Member

There was some discussion above #8508 (comment)

@adrianreber
Member Author

At this point we have presented our idea to all possibly involved SIGs (I think). Is anything missing to create this WG?

@BenTheElder
Member

BenTheElder commented Oct 28, 2025

cc @kubernetes/steering-committee I think all of the SIG ACKs are done and there don't seem to be suggestions / concerns raised from the related SIGs.

/lgtm
/approve
/hold

Holding for review + ACK from other steering members.

@k8s-ci-robot added the do-not-merge/hold (indicates that a PR should not merge because someone has issued a /hold command) label Oct 28, 2025
@k8s-ci-robot added the lgtm ("Looks good to me", indicates that a PR is ready to be merged) label Oct 28, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adrianreber, BenTheElder

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved (indicates a PR has been approved by an approver from all required OWNERS files) label Oct 28, 2025
@aojea
Member

aojea commented Oct 28, 2025

+1 (steering)

Member

@pacoxu left a comment

/lgtm
(steering)

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
committee/steering Denotes an issue or PR intended to be handled by the steering committee.
do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command.
lgtm "Looks good to me", indicates that a PR is ready to be merged.
ok-to-test Indicates a non-member PR verified by an org member that is safe to test.
sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.
sig/apps Categorizes an issue or PR as relevant to SIG Apps.
sig/auth Categorizes an issue or PR as relevant to SIG Auth.
sig/cli Categorizes an issue or PR as relevant to SIG CLI.
sig/node Categorizes an issue or PR as relevant to SIG Node.
sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling.
size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

Status: In Review
Status: Needs Review
