Commit Graph

43518 Commits

Author SHA1 Message Date
Francesco Romani
23abdab2b7 smtalign: propagate policy options to policies
Consume in the static policy the cpu manager policy options from
the cpumanager instance.
Validate in the none policy if any option is given, and fail if so -
this is almost surely a configuration mistake.

Add new cpumanager.Options type to hold the options and translate from
user arguments to flags.

Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-07-08 23:15:37 +02:00
Francesco Romani
6dcec345df smtalign: cm: factor out admission response
Introduce a new `admission` subpackage to factor out the responsability
to create `PodAdmitResult` objects. This enables resource manager
to report specific errors in Allocate() and to bubble up them
in the relevant fields of the `PodAdmitResult`.

To demonstrate the approach we refactor TopologyAffinityError as a
proper error.

Co-authored-by: Kevin Klues <kklues@nvidia.com>
Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-07-08 23:15:37 +02:00
Francesco Romani
c5cb263dcf smtalign: propagate policy options to cpumanager
The CPUManagerPolicyOptions received from the kubelet config/command line args
is propogated to the Container Manager.

We defer the consumption of the options to a later patch(set).

Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-07-08 23:15:35 +02:00
Francesco Romani
6dccad45b4 smtalign: add auto generated code
Files generate after running `make generated_files`.

Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-07-08 23:14:59 +02:00
Swati Sehgal
cc76a756e4 smtalign: add cpu-manager-policy-options flag in Kubelet
In this patch we enhance the kubelet configuration to support
cpuManagerPolicyOptions.

In order to introduce SMT-awareness in CPU Manager, we introduce a
new flag in Kubelet to allow the user to specify an additional flag
called `cpumanager-policy-options` to allow the user to modify the
behaviour of static policy to strictly guarantee allocation of whole
core.

Co-authored-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-07-08 23:14:59 +02:00
Kubernetes Prow Robot
4d78db54a5 Merge pull request #103580 from tkestack/fix-version-format
fix kubelet panic when DynamicKubeletConfig enabled
2021-07-08 14:02:24 -07:00
Kubernetes Prow Robot
a9d7526864 Merge pull request #102970 from tkestack/feature-memory-qos
Feature: Support memory qos with cgroups v2
2021-07-08 14:01:36 -07:00
Kubernetes Prow Robot
b814b83392 Merge pull request #102122 from Nordix/conn_reuse_mode
Don't set sysctl net.ipv4.vs.conn_reuse_mode for kernels >=5.9
2021-07-08 14:01:19 -07:00
Kubernetes Prow Robot
7c84064a4f Merge pull request #99000 from verb/1.21-kubelet-metrics
Add kubelet metrics for ephemeral containers
2021-07-08 14:00:55 -07:00
James Sturtevant
d5d9327351 Only use dualstack if the node and config supports it 2021-07-08 11:39:20 -07:00
Peri Thompson
8e2b728c68 Explicitly skip host file mounting for windows 2021-07-08 19:38:49 +01:00
Aldo Culquicondor
2dd2622188 Track Job Pods completion in status
Through Job.status.uncountedPodUIDs and a Pod finalizer

An annotation marks if a job should be tracked with new behavior

A separate work queue is used to remove finalizers from orphan pods.

Change-Id: I1862e930257a9d1f7f1b2b0a526ed15bc8c248ad
2021-07-08 17:48:05 +00:00
Kubernetes Prow Robot
b765496650 Merge pull request #98817 from alculquicondor/job-completion-api
Add Job.status.uncountedTerminatedPods for Job tracking
2021-07-08 10:44:54 -07:00
Peter Hunt
a9b7dcc8c2 kubelet: update remote runtimes for cri stat changes
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2021-07-08 13:17:04 -04:00
Aldo Culquicondor
bb56a0bd04 Add Job.status.uncountedPodUIDs
For tracking Job Pods that have finished but are not yet counted as failed or succeeded

And feature gate JobTrackingWithFinalizers

Change-Id: I3e080f3ec090922640384b692e88eaf9a544d3b5
2021-07-08 15:31:59 +00:00
Kubernetes Prow Robot
81065fd085 Merge pull request #103532 from thockin/fix-91459-service-update-allocs
Service: Fix semantics for Update wrt allocations
2021-07-08 05:59:05 -07:00
Li Bo
79e230ea21 fix kubelet panic when DynamicKubeletConfig enabled 2021-07-08 16:20:51 +08:00
Lars Ekman
b6b3a69284 Don't set sysctl net.ipv4.vs.conn_reuse_mode for kernels >=5.9 2021-07-08 09:41:12 +02:00
boenn
369c4a2b98 Use cmp.Diff() replace reflect and diagnosis 2021-07-08 15:13:11 +08:00
Li Bo
c3d9b10ca8 feature: support Memory QoS for cgroups v2 2021-07-08 09:26:46 +08:00
Kubernetes Prow Robot
16af282ee7 Merge pull request #103520 from swetharepakula/truncate-endpoints
Truncate endpoints over a 1000 addresses
2021-07-07 18:09:21 -07:00
Kubernetes Prow Robot
8fb777efb0 Merge pull request #103451 from swetharepakula/ga-proxy-gates
Graduate EndpointSliceProxying and WindowsEndpointSliceProxying Gates
2021-07-07 18:09:13 -07:00
Kubernetes Prow Robot
36a7426aa5 Merge pull request #99144 from bart0sh/PR0094-promote-HugePageStorageMediumSize-to-GA
promote huge page storage medium size to GA
2021-07-07 18:09:05 -07:00
Kubernetes Prow Robot
ebbe63f116 Merge pull request #92863 from AkihiroSuda/rootless-pr
kubelet & kube-proxy: ignore sysctl errors and rlimit errors when running in UserNS (for rootless)
2021-07-07 18:08:53 -07:00
Tim Hockin
80dda49ce2 Service: Fix semantics for Update wrt allocations
It is not uncommon for users to Create a Service and not specify things
like ClusterIP and NodePort, which we then allocate for them.  They same
that YAML somewhere and later use it again in an Update, but then it
fails.

That's because we detected them trying to set a ClusterIP from a value
to "", which is not allowed.  If it was just NodePort, they would
actually succeed and reallocate a new port.

After this change, we try to "patch" updates where the user did not
specify those values from the old object.
2021-07-07 17:09:12 -07:00
Kubernetes Prow Robot
7bfd0b0503 Merge pull request #103467 from thockin/svc-alloc-lb-nodeports-bug
Fix small bug with AllocateLoadBalancerNodePorts
2021-07-07 17:05:40 -07:00
Kubernetes Prow Robot
6ed98b60f0 Merge pull request #103383 from Huang-Wei/move-up-pods
sched: provide an option for plugin developers to move pods to activeQ
2021-07-07 17:05:22 -07:00
Kubernetes Prow Robot
8e56a34195 Merge pull request #102966 from SergeyKanzhelev/deprecateDynamicKubeletConfig
deprecate and disable by default DynamicKubeletConfig feature flag
2021-07-07 17:05:15 -07:00
Kubernetes Prow Robot
e67979eaf6 Merge pull request #103550 from tkashem/apf-bootstrap-log-message
apf: fix bootstrap ensurer log message
2021-07-07 14:20:36 -07:00
Swetha Repakula
0a42f7b989 Graduate EndpointSliceProxying and WindowsEndpointSliceProxying Gates 2021-07-07 13:33:30 -07:00
Wei Huang
fb9cafc99b sched: provide an option for plugin developers to move pods to activeQ 2021-07-07 12:50:12 -07:00
Swetha Repakula
9bd857ca04 Truncate endpoints over a 1000 addresses
* set `endpoints.kubernetes.io/over-capacity` to "truncated" when
 number of addresses has been truncated to a 1000
 * ready addresses are prioritized over non-ready addresses
 * addresses are proportionally truncated across subsets
2021-07-07 12:48:43 -07:00
Kubernetes Prow Robot
ac6a1b1821 Merge pull request #103414 from ravisantoshgudimetla/fix-pdb-status
[disruptioncontroller] Don't error for unmanaged pods
2021-07-07 12:40:35 -07:00
Abu Kashem
d9e3fbff94 apf: fix bootstrap ensurer log message 2021-07-07 15:01:46 -04:00
Nayef Ghattas
bb3fe633b4 add test for triggering race condition 2021-07-07 20:17:22 +02:00
Nayef Ghattas
ab1807f2bc copy podStatus.ContainerStatuses before sorting it 2021-07-07 20:14:53 +02:00
Kubernetes Prow Robot
896cf744cb Merge pull request #103420 from raisaat/pods-api-test-fix
Fix pkg/api/pod/util tests to ensure feature gate is set
2021-07-07 10:43:53 -07:00
Vikram Jadhav
a9a3c4bb9a Refactor of TestValidateIngressClass and TestValidateIngressClassUpdate methods by adding Boilerplate in helper functions #FIXES: 99005 2021-07-07 22:35:35 +05:30
Kubernetes Prow Robot
eaba61b4de Merge pull request #103276 from NetApp/data-source-ref
Add DataSourceRef field to PVC spec
2021-07-07 08:56:44 -07:00
ravisantoshgudimetla
2c116055f7 [disruptioncontroller] Don't error for unmanaged pods
As of now, we allow PDBs to be applied to pods via
selectors, so there can be unmanaged pods(pods that
don't have backing controllers) but still have PDBs associated.
Such pods are to be logged instead of immediately throwing
a sync error. This ensures disruption controller is
not frequently updating the status subresource and thus
preventing excessive and expensive writes to etcd.
2021-07-07 10:42:24 -04:00
Kubernetes Prow Robot
17f6f28621 Merge pull request #103468 from Huang-Wei/fix-sched-cc
instantiates scheduler ComponentConfig after parsing feature gates
2021-07-07 01:22:43 -07:00
Akihiro Suda
26e83ac4d4 kubelet: ignore /dev/kmsg error when running in userns
oomwatcher.NewWatcher returns "open /dev/kmsg: operation not permitted" error,
when running with sysctl value `kernel.dmesg_restrict=1`.

The error is negligible for KubeletInUserNamespace.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-07-07 14:23:31 +09:00
Akihiro Suda
192790c52f kube-proxy: allow running in userns
Ignore an error during setting RLIMIT_NOFILE.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-07-07 14:23:31 +09:00
Akihiro Suda
dbe0155139 kubelet/cm: ignore sysctl error when running in userns
Errors during setting the following sysctl values are ignored:
- vm.overcommit_memory
- vm.panic_on_oom
- kernel.panic
- kernel.panic_on_oops
- kernel.keys.root_maxkeys
- kernel.keys.root_maxbytes

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-07-07 14:23:29 +09:00
Akihiro Suda
b16323e37c New feature gate: KubeletInUserNamespace
Enables support for running kubelet in a user namespace.
The user namespace has to be created before running kubelet.
All the node components such as CRI need to be running in the same user namespace.

See kubernetes/enhancements PR 1371 (merged) and issue 2033.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-07-07 14:22:55 +09:00
Kubernetes Prow Robot
2547c5bb97 Merge pull request #103307 from aojea/kubelet_podIPs
podIPs order match node IP family preference (Downward API)
2021-07-06 22:11:20 -07:00
Kubernetes Prow Robot
561959f682 Merge pull request #102823 from ehashman/kep-2400-swap
Alpha node swap support
2021-07-06 22:11:11 -07:00
Kubernetes Prow Robot
7df432f78f Merge pull request #99582 from chendave/fix_config
custom plugin config should take precedence over default plugin config
2021-07-06 22:10:43 -07:00
Shiming Zhang
d8fe255f41 Add test for validateProbe 2021-07-07 11:31:23 +08:00
Shiming Zhang
e378600c90 Add validation for Prober TerminationGracePeriodSeconds 2021-07-07 10:51:30 +08:00