VeraQin
commented
Apr 7, 2026
- One-line PR description: Add the KEP doc with proposal, design and initial plans.
- Issue link: Support Default Pod Sysctls in Kubelet #5996
- Other comments: N/A
/ok-to-test
| # The following PRR answers are required at beta release | ||
| metrics: | ||
| - TBD |
Just remove the metrics section.
Unless we need metrics for some PRR answer. But I think a simple spike in failed-pod metrics can be used for that.
| * Cover list: `kernel.shm*`, `kernel.msg*`, `kernel.sem`, `fs.mqueue.*`, `net.*`, `kernel.domainname`, `user.*`. | ||
| * Ensure that sysctls are only applied if the pod is namespaced for the corresponding subsystem (i.e., not using HostNetwork or HostIPC). | ||
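The host-namespace rule in the goal above could be sketched roughly as follows. This is only an illustration, not the KEP's implementation: `sysctlNamespace` and `filterForPod` are hypothetical names, and the sketch classifies only the IPC and network entries of the cover list, because the goal mentions only HostNetwork and HostIPC. Names it does not classify (for example `kernel.domainname` or `user.*`) are passed through unchanged, which is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// sysctlNamespace returns "ipc" or "net" for names from the cover list,
// and "" for names this sketch does not classify (assumption).
func sysctlNamespace(name string) string {
	switch {
	case name == "kernel.sem",
		strings.HasPrefix(name, "kernel.shm"),
		strings.HasPrefix(name, "kernel.msg"),
		strings.HasPrefix(name, "fs.mqueue."):
		return "ipc"
	case strings.HasPrefix(name, "net."):
		return "net"
	}
	return ""
}

// filterForPod drops every default whose kernel namespace the pod shares
// with the host, so a default never leaks into the host's own settings.
func filterForPod(defaults map[string]string, hostNetwork, hostIPC bool) map[string]string {
	out := make(map[string]string, len(defaults))
	for name, value := range defaults {
		ns := sysctlNamespace(name)
		if (ns == "net" && hostNetwork) || (ns == "ipc" && hostIPC) {
			continue
		}
		out[name] = value
	}
	return out
}

func main() {
	defaults := map[string]string{
		"net.core.somaxconn": "1024",
		"kernel.shmmax":      "68719476736",
	}
	// A HostNetwork pod keeps only the IPC default.
	fmt.Println(filterForPod(defaults, true, false))
}
```

Whether skipped defaults should be logged or reported is a separate design question that this sketch leaves open.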
| ### Non-Goals |
Add a non-goal: no additional filtering on which Pods these defaults apply to. E.g., we do not want per-Pod-namespace sets of sysctls.
(not in this version at least)
| #### Story 1 | ||
| As a cluster admin, I would like to be able to set namespaced sysctls for all pods to share the same kernel environments. And pods keep the ability to customize specific sysctl in its own namespace. |
For me this is too broad of a user story. Can you give an example with a specific sysctl that a system administrator may need to adjust across all Pods, and why?
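For discussion, the override behavior the story describes (node-level defaults, with each pod still able to set its own value) could look roughly like the sketch below. The `Sysctl` type is a local stand-in with the same `Name`/`Value` shape as the Pod spec entry, and `mergeSysctls` is a hypothetical helper, not something the KEP defines. Placing pod-specified entries first is an arbitrary choice for this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// Sysctl is a local stand-in for a name/value pair from the Pod spec.
type Sysctl struct {
	Name  string
	Value string
}

// mergeSysctls returns the pod's own sysctls followed by every node default
// the pod did not set itself. Pod values always win over defaults.
func mergeSysctls(defaults map[string]string, pod []Sysctl) []Sysctl {
	out := append([]Sysctl(nil), pod...)
	seen := make(map[string]bool, len(pod))
	for _, s := range pod {
		seen[s.Name] = true
	}
	// Sort the remaining default names so the result is deterministic.
	names := make([]string, 0, len(defaults))
	for name := range defaults {
		if !seen[name] {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	for _, name := range names {
		out = append(out, Sysctl{Name: name, Value: defaults[name]})
	}
	return out
}

func main() {
	defaults := map[string]string{
		"net.core.somaxconn": "1024",
		"kernel.shmmax":      "68719476736",
	}
	pod := []Sysctl{{Name: "net.core.somaxconn", Value: "4096"}}
	for _, s := range mergeSysctls(defaults, pod) {
		fmt.Printf("%s=%s\n", s.Name, s.Value)
	}
}
```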
| Currently, Kubernetes allows users to specify sysctls for individual pods via the [securityContext.sysctls](https://kubernetes.io/docs/tasks/administer-cluster/sysctl-cluster/#setting-sysctls-for-a-pod) field in the Pod spec. However, in many production environments, node administrators often need to enforce consistent kernel parameter tuning across all workloads on a node or within a node pool/group. For example, high-performance networking or messaging applications may require specific `net.*` or `kernel.shm*` values to be set globally for all containers on specialized nodes. | ||
| Setting these parameters manually in every Pod spec is error-prone and redundant. Furthermore, cluster operators may wish to manage these defaults at the infrastructure level (e.g., via Kubelet configuration) to ensure performance and stability without requiring application developers to be aware of the underlying host kernel tuning needs. | ||
| ### Goals |
Specify that static pods are also in scope. Mentioning them explicitly helps us not forget to add tests for them.
| The field will be guarded by a new feature gate called `DefaultPodSysctls` added in [kube_features.go](https://github.com/kubernetes/kubernetes/blob/master/pkg/features/kube_features.go). | ||
| The namespace-level verification will be performed while applying to pods in `Application Logic` below, since it depends on the pod namespace information. Unnamespaced flags will be ignored silently to avoid failure. |
Let's say there will be a log message at least, so it is not completely silent.
I am not sure that ignoring is the right approach instead of failing fast. Failing fast will help detect the configuration issue.
Also, I would suggest failing on non-existent sysctls: if the cluster admin made a typo, they likely want to know about it.
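To make the fail-fast suggestion concrete, here is a minimal sketch of startup validation that rejects both unnamespaced and non-existent sysctls. Everything in it is an assumption for illustration: `validateDefaults`, `isNamespaced`, and `existsUnder` are hypothetical names, the allowlist simply mirrors the cover list from the Goals section, and existence is checked by mapping the dotted name to a path under `/proc/sys`, which may not match what the container runtime would accept.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"sort"
	"strings"
)

// isNamespaced mirrors the cover list from the Goals section (assumption).
func isNamespaced(name string) bool {
	if name == "kernel.sem" || name == "kernel.domainname" {
		return true
	}
	for _, prefix := range []string{"kernel.shm", "kernel.msg", "fs.mqueue.", "net.", "user."} {
		if strings.HasPrefix(name, prefix) {
			return true
		}
	}
	return false
}

// existsUnder returns a checker that maps "a.b.c" to root/a/b/c and stats it.
func existsUnder(root string) func(string) bool {
	return func(name string) bool {
		_, err := os.Stat(filepath.Join(root, strings.ReplaceAll(name, ".", "/")))
		return err == nil
	}
}

// validateDefaults collects one error per bad entry instead of silently
// ignoring it, so a typo surfaces when the configuration is loaded.
func validateDefaults(defaults map[string]string, exists func(string) bool) error {
	names := make([]string, 0, len(defaults))
	for name := range defaults {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic error order

	var errs []error
	for _, name := range names {
		switch {
		case !isNamespaced(name):
			errs = append(errs, fmt.Errorf("sysctl %q is not namespaced", name))
		case !exists(name):
			errs = append(errs, fmt.Errorf("sysctl %q does not exist on this node", name))
		}
	}
	return errors.Join(errs...) // nil when errs is empty
}

func main() {
	defaults := map[string]string{"net.core.somaxconn": "1024", "vm.swappiness": "10"}
	if err := validateDefaults(defaults, existsUnder("/proc/sys")); err != nil {
		fmt.Println("invalid default sysctls:", err)
	}
}
```

Injecting the existence check as a function keeps the validation testable without touching the real `/proc/sys`.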
| } | ||
| ``` | ||
| ### Validation |
I think in general, we want early validation in the kubelet to be as aligned with the container runtime as possible. Basically making it the same as just passing everything to the container runtime.
| - "@SergeyKanzhelev" | ||
| - "@BenTheElder" | ||
| approvers: | ||
| - TBD |
Approvers will be assigned at the KEP triage.