Conversation

@tunedev

@tunedev tunedev commented Oct 21, 2025

What type of PR is this?
feature

What this PR does / why we need it:
This PR upgrades the Kubernetes dependencies to v1.34 and replaces references to concrete structs with the interfaces introduced in the new version.
It includes the code, YAML, and Docker changes needed to support this transition.

Which issue(s) this PR fixes:
fixes #4671

Special notes for your reviewer:
Does this PR introduce a user-facing change?

@volcano-sh-bot
Contributor

Welcome @tunedev!

It looks like this is your first PR to volcano-sh/volcano.

Thank you, and welcome to Volcano. 😃

@volcano-sh-bot added the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) Oct 21, 2025
@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign hwdef
You can assign the PR to them by writing /assign @hwdef in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot added the size/XXL label (Denotes a PR that changes 1000+ lines, ignoring generated files.) Oct 21, 2025
@hzxuzhonghu
Member

/ok-to-test

@volcano-sh-bot added the ok-to-test label (Indicates a non-member PR verified by an org member that is safe to test.) Oct 22, 2025
@JesseStutler changed the title from "code changes, add updates for new interfaces being introduced" to "Update volcano k8s dependencies to v1.34" Oct 22, 2025
@JesseStutler
Member

@tunedev Also please check all of the failed CIs

@JesseStutler
Member

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request upgrades Kubernetes dependencies to v1.34, which involves significant changes to adapt to the new scheduler framework interfaces. The changes are mostly mechanical and look correct. However, I've identified a few critical issues that would prevent the code from compiling or cause runtime errors. One is a recursive function call that would lead to a stack overflow, and the others are attempts to call methods on an interface type that doesn't have them. Please see the detailed comments for suggestions on how to fix these issues.

// ignore this err since apiserver doesn't properly validate affinity terms
// and we can't fix the validation for backwards compatibility.
podInfo, _ := k8sframework.NewPodInfo(pod)
node.AddPodInfo(podInfo)

critical

The node variable is of type fwk.NodeInfo, which is an interface. This interface does not have an AddPodInfo method. You need to perform a type assertion to the concrete type *k8sframework.NodeInfo before calling this method. This will cause a compilation error.

Suggested change
- node.AddPodInfo(podInfo)
+ if n, ok := node.(*k8sframework.NodeInfo); ok {
+     n.AddPodInfo(podInfo)
+ } else {
+     klog.Errorf("Failed to cast node to *k8sframework.NodeInfo for pod %s/%s", pod.Namespace, pod.Name)
+ }

Member

Looks like Gemini is hallucinating here? The node does have an AddPodInfo method.

Author

@tunedev tunedev Oct 23, 2025

yeah it is
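
For anyone following this thread, here is a minimal sketch of the point being made, using stand-in types rather than the real fwk.NodeInfo or k8sframework.NodeInfo. The PodInfo, NodeInfo, nodeInfo, and addPod names below are assumptions made up for illustration. When a method is part of an interface's method set, it can be called through the interface value directly, and no type assertion is needed:

```go
package main

import "fmt"

// PodInfo is a stand-in for the scheduler framework's pod info type.
// It is NOT the real Kubernetes definition.
type PodInfo struct{ Name string }

// NodeInfo is a hypothetical interface that, as the maintainers describe for
// the real one, already exposes AddPodInfo.
type NodeInfo interface {
	AddPodInfo(p *PodInfo)
	PodCount() int
}

// nodeInfo is a stand-in concrete implementation of NodeInfo.
type nodeInfo struct{ pods []*PodInfo }

func (n *nodeInfo) AddPodInfo(p *PodInfo) { n.pods = append(n.pods, p) }
func (n *nodeInfo) PodCount() int         { return len(n.pods) }

// Compile-time check: the build fails if *nodeInfo stops satisfying NodeInfo.
var _ NodeInfo = (*nodeInfo)(nil)

// addPod calls AddPodInfo through the interface value. No type assertion is
// required because the method belongs to the interface's method set.
func addPod(node NodeInfo, name string) int {
	node.AddPodInfo(&PodInfo{Name: name})
	return node.PodCount()
}

func main() {
	var node NodeInfo = &nodeInfo{}
	fmt.Println(addPod(node, "demo-pod"))
}
```

A type assertion would only be needed for a method that exists on the concrete type but not on the interface.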

Signed-off-by: Babatunde Sanusi <swisskid95@gmail.com>
Signed-off-by: Babatunde Sanusi <swisskid95@gmail.com>
@tunedev force-pushed the feat-update-k8s-dependencies-to-1.34 branch from 55065b4 to 4bd920b October 23, 2025 17:40
tunedev and others added 3 commits October 26, 2025 11:12
Signed-off-by: Babatunde Sanusi <swisskid95@gmail.com>
Signed-off-by: Babatunde Sanusi <swisskid95@gmail.com>
@tunedev tunedev marked this pull request as ready for review October 26, 2025 10:21
@volcano-sh-bot removed the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) Oct 26, 2025

// Align default feature-gates with the connected cluster's version.
if err := setupComponentGlobals(config); err != nil {
klog.Errorf("failed to set component globals: %v", err)
Member

We should return the error here when we hit it, rather than just logging it.

"k8s.io/klog/v2"

// Register gcp auth
utilruntime "k8s.io/apimachinery/pkg/util/runtime"
Member

This import should go before the "Register gcp auth" comment.


componentGlobalsRegistry := basecompatibility.NewComponentGlobalsRegistry()
if componentGlobalsRegistry.EffectiveVersionFor(basecompatibility.DefaultKubeComponent) == nil {
utilruntime.Must(componentGlobalsRegistry.Register(
Member

We can refactor to

err = componentGlobalsRegistry.Register(
    basecompatibility.DefaultKubeComponent,
    kubeEffectiveVersion,
    utilfeature.DefaultMutableFeatureGate,
)
if err != nil {
    return fmt.Errorf("failed to register component globals: %w", err)
}

We don't need to panic here

@JesseStutler
Member

@tunedev
I think the main reason for all the failed CI jobs is that the DRA plugin now takes a DynamicResourcesArgs argument and validates it:
https://github.com/kubernetes/kubernetes/blob/03a5f06c2695805059278c9d6b47edc3bdcf51b1/pkg/scheduler/framework/plugins/dynamicresources/dynamicresources.go#L193-L199
But the Volcano predicates plugin currently passes a nil argument when it initializes the DRA plugin:

plugin, err = dynamicresources.New(context.TODO(), nil, handle, features)

This causes a fatal error:
F1027 09:23:48.126316 1 predicates.go:356] failed to create dra plugin with err: got args of type <nil>, want *DynamicResourcesArgs
The scheduler then restarts, so all the CI jobs may fail or time out.

So we need to initialize a DynamicResourcesArgs, just as is done for VolumeBindingArgs in this file:
https://github.com/volcano-sh/volcano/blob/master/pkg/scheduler/plugins/predicates/helper.go

We can first set a default DynamicResourcesArgs in the predicates plugin and let users override it by setting FilterTimeout through framework.Arguments. BTW, DynamicResourcesArgs currently contains only a FilterTimeout field, whose default value is 10s:
https://github.com/kubernetes/kubernetes/blob/03a5f06c2695805059278c9d6b47edc3bdcf51b1/pkg/scheduler/apis/config/v1/defaults.go#L249

@tunedev

@JesseStutler
Member

@tunedev BTW, once you fix all the CIs and fix the review comments, please squash your commits into one

Member

DRA in v1.34 has a lot of new feature gates, so we also need to add them here:

features := feature.Features{
EnableStorageCapacityScoring: utilFeature.DefaultFeatureGate.Enabled(features.StorageCapacityScoring),
EnableNodeInclusionPolicyInPodTopologySpread: utilFeature.DefaultFeatureGate.Enabled(features.NodeInclusionPolicyInPodTopologySpread),
EnableMatchLabelKeysInPodTopologySpread: utilFeature.DefaultFeatureGate.Enabled(features.MatchLabelKeysInPodTopologySpread),
EnableSidecarContainers: utilFeature.DefaultFeatureGate.Enabled(features.SidecarContainers),
EnableDRAAdminAccess: utilFeature.DefaultFeatureGate.Enabled(features.DRAAdminAccess),
EnableDynamicResourceAllocation: utilFeature.DefaultFeatureGate.Enabled(features.DynamicResourceAllocation),
EnableVolumeAttributesClass: utilFeature.DefaultFeatureGate.Enabled(features.VolumeAttributesClass),
EnableCSIMigrationPortworx: utilFeature.DefaultFeatureGate.Enabled(features.CSIMigrationPortworx),
}

The new DRA feature gates that need to be added are listed here: https://github.com/kubernetes/kubernetes/blob/03a5f06c2695805059278c9d6b47edc3bdcf51b1/pkg/scheduler/framework/plugins/feature/feature.go#L53-L60

Signed-off-by: Babatunde Sanusi <swisskid95@gmail.com>
@tunedev force-pushed the feat-update-k8s-dependencies-to-1.34 branch from 402953c to 963c6b0 October 28, 2025 04:49

Labels

ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Upgrade volcano k8s dependencies to v1.34

4 participants