Commit Graph

42603 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
2423813207 Merge pull request #103573 from chendave/fix_index
Fix index out of range if multiple default plugins are overridden
2021-07-09 08:43:23 -07:00
Kubernetes Prow Robot
a6c2cd7d18 Merge pull request #103291 from wzshiming/fix/nodeshutdown-restart
Fix Data Race in nodeshutdown restart
2021-07-09 08:43:14 -07:00
Kubernetes Prow Robot
03fa68099e Merge pull request #98238 from alculquicondor/job-completion
Track Job completion through Pod finalizers and status
2021-07-09 08:42:54 -07:00
Kubernetes Prow Robot
29652248eb Merge pull request #103596 from andrewsykim/endpointslice-terminating
Promote EndpointSliceTerminatingCondition to Beta
2021-07-09 06:01:42 -07:00
Kubernetes Prow Robot
8daced4d3f Merge pull request #103508 from boenn/UseDiff
Use cmp.Diff() replace reflect and diagnosis
2021-07-09 06:01:13 -07:00
Dave Chen
1727cea64c Fix index out of range if multiple default plugins are overridden
Signed-off-by: Dave Chen <dave.chen@arm.com>
2021-07-09 19:56:14 +08:00
Kubernetes Prow Robot
617064d732 Merge pull request #101432 from swatisehgal/smtaware
node: cpumanager: add options to reject non SMT-aligned workload
2021-07-08 21:04:53 -07:00
Kubernetes Prow Robot
83baa708df Merge pull request #103429 from saschagrunert/metrics-test-fix
Fix resource metrics e2e test
2021-07-08 17:58:53 -07:00
Kubernetes Prow Robot
dab6f6a43d Merge pull request #102344 from smarterclayton/keep_pod_worker
Prevent Kubelet from incorrectly interpreting "not yet started" pods as "ready to terminate pods" by unifying responsibility for pod lifecycle into pod worker
2021-07-08 16:48:53 -07:00
Kubernetes Prow Robot
57716897eb Merge pull request #103434 from perithompson/windows-etchostcreate-skip
Explicitly skip host file mounting for Windows when HostProcess pod
2021-07-08 15:36:53 -07:00
Andrew Sy Kim
826a5219da promote EndpointSliceTerminatingCondition to Beta
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
2021-07-08 17:34:10 -04:00
Francesco Romani
23abdab2b7 smtalign: propagate policy options to policies
Consume in the static policy the cpu manager policy options from
the cpumanager instance.
Validate in the none policy if any option is given, and fail if so -
this is almost surely a configuration mistake.

Add new cpumanager.Options type to hold the options and translate from
user arguments to flags.

Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-07-08 23:15:37 +02:00
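The option handling described in this commit could look roughly like the sketch below. It is a minimal illustration, not the code from the patch: the type `StaticPolicyOptions`, the functions `NewStaticPolicyOptions` and `ValidateNonePolicy`, and the option name `full-pcpus-only` are all assumptions introduced here.

```go
package main

import (
	"fmt"
	"strconv"
)

// StaticPolicyOptions is a hypothetical typed view of the user-supplied
// options; the field and the option name below are illustrative assumptions.
type StaticPolicyOptions struct {
	FullPhysicalCPUsOnly bool
}

// NewStaticPolicyOptions translates raw key/value arguments into typed
// options and rejects any name it does not recognise.
func NewStaticPolicyOptions(raw map[string]string) (StaticPolicyOptions, error) {
	var opts StaticPolicyOptions
	for name, value := range raw {
		switch name {
		case "full-pcpus-only": // assumed option name, not taken from the commit
			enabled, err := strconv.ParseBool(value)
			if err != nil {
				return opts, fmt.Errorf("bad value %q for option %q: %w", value, name, err)
			}
			opts.FullPhysicalCPUsOnly = enabled
		default:
			return opts, fmt.Errorf("unsupported option %q", name)
		}
	}
	return opts, nil
}

// ValidateNonePolicy fails when any option is given, since options passed to
// the none policy are almost surely a configuration mistake.
func ValidateNonePolicy(raw map[string]string) error {
	if len(raw) > 0 {
		return fmt.Errorf("the none policy accepts no options, got %d", len(raw))
	}
	return nil
}

func main() {
	opts, err := NewStaticPolicyOptions(map[string]string{"full-pcpus-only": "true"})
	fmt.Println(opts.FullPhysicalCPUsOnly, err)
	fmt.Println(ValidateNonePolicy(map[string]string{"full-pcpus-only": "true"}))
}
```

Failing early in the none policy surfaces the misconfiguration at startup instead of silently ignoring the options.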
Francesco Romani
6dcec345df smtalign: cm: factor out admission response
Introduce a new `admission` subpackage to factor out the responsibility
of creating `PodAdmitResult` objects. This enables resource managers
to report specific errors in Allocate() and to bubble them up
in the relevant fields of the `PodAdmitResult`.

To demonstrate the approach we refactor TopologyAffinityError as a
proper error.

Co-authored-by: Kevin Klues <kklues@nvidia.com>
Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-07-08 23:15:37 +02:00
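One way to picture the refactoring in this commit is a helper that maps an allocation error to an admission result. This is a sketch under stated assumptions: the struct fields, the `typedError` interface, the function name `GetPodAdmitResult`, and the reason and message strings are hypothetical and chosen for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// PodAdmitResult is a simplified stand-in for the kubelet type of the same name.
type PodAdmitResult struct {
	Admit   bool
	Reason  string
	Message string
}

// typedError is a hypothetical interface for errors that carry their own
// admission reason.
type typedError interface {
	error
	Type() string
}

// TopologyAffinityError is modelled as a proper error type, as the commit describes.
type TopologyAffinityError struct{}

func (TopologyAffinityError) Error() string {
	return "resources cannot be allocated with topology locality" // assumed wording
}

func (TopologyAffinityError) Type() string { return "TopologyAffinityError" }

// GetPodAdmitResult builds the admission response from an Allocate() error,
// using the specific reason when the error provides one.
func GetPodAdmitResult(err error) PodAdmitResult {
	if err == nil {
		return PodAdmitResult{Admit: true}
	}
	reason := "UnexpectedAdmissionError" // assumed fallback reason
	var te typedError
	if errors.As(err, &te) {
		reason = te.Type()
	}
	return PodAdmitResult{Reason: reason, Message: "allocate failed: " + err.Error()}
}

func main() {
	fmt.Println(GetPodAdmitResult(nil).Admit)
	wrapped := fmt.Errorf("cpu manager: %w", TopologyAffinityError{})
	fmt.Println(GetPodAdmitResult(wrapped).Reason)
}
```

Because the lookup goes through `errors.As`, a manager can wrap the typed error with extra context and the specific reason still reaches the result.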
Francesco Romani
c5cb263dcf smtalign: propagate policy options to cpumanager
The CPUManagerPolicyOptions received from the kubelet config/command line args
are propagated to the Container Manager.

We defer the consumption of the options to a later patch(set).

Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-07-08 23:15:35 +02:00
Francesco Romani
6dccad45b4 smtalign: add auto generated code
Files generated by running `make generated_files`.

Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-07-08 23:14:59 +02:00
Swati Sehgal
cc76a756e4 smtalign: add cpu-manager-policy-options flag in Kubelet
In this patch we enhance the kubelet configuration to support
cpuManagerPolicyOptions.

To introduce SMT-awareness in the CPU Manager, we add a new Kubelet
flag, `cpu-manager-policy-options`, that lets the user modify the
behaviour of the static policy to strictly guarantee allocation of
whole cores.

Co-authored-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-07-08 23:14:59 +02:00
Kubernetes Prow Robot
4d78db54a5 Merge pull request #103580 from tkestack/fix-version-format
fix kubelet panic when DynamicKubeletConfig enabled
2021-07-08 14:02:24 -07:00
Kubernetes Prow Robot
a9d7526864 Merge pull request #102970 from tkestack/feature-memory-qos
Feature: Support memory qos with cgroups v2
2021-07-08 14:01:36 -07:00
Kubernetes Prow Robot
b814b83392 Merge pull request #102122 from Nordix/conn_reuse_mode
Don't set sysctl net.ipv4.vs.conn_reuse_mode for kernels >=5.9
2021-07-08 14:01:19 -07:00
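The kernel version gate named in this title can be sketched as a major.minor comparison on the release string. The helper names `leadingInt`, `kernelAtLeast` and `shouldSetConnReuseMode` are hypothetical, and the parsing rules are an assumption rather than the logic used in the pull request.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// leadingInt parses the digits at the start of s, so "9-rc1" yields 9.
func leadingInt(s string) (int, error) {
	end := 0
	for end < len(s) && s[end] >= '0' && s[end] <= '9' {
		end++
	}
	return strconv.Atoi(s[:end])
}

// kernelAtLeast reports whether a release string such as "5.10.0-8-amd64"
// is at or above the given major.minor version.
func kernelAtLeast(release string, major, minor int) (bool, error) {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("unexpected kernel release %q", release)
	}
	gotMajor, err := leadingInt(parts[0])
	if err != nil {
		return false, fmt.Errorf("bad major version in %q: %w", release, err)
	}
	gotMinor, err := leadingInt(parts[1])
	if err != nil {
		return false, fmt.Errorf("bad minor version in %q: %w", release, err)
	}
	return gotMajor > major || (gotMajor == major && gotMinor >= minor), nil
}

// shouldSetConnReuseMode follows the rule in the commit title: the sysctl is
// left alone on kernels >= 5.9.
func shouldSetConnReuseMode(release string) (bool, error) {
	newer, err := kernelAtLeast(release, 5, 9)
	if err != nil {
		return false, err
	}
	return !newer, nil
}

func main() {
	for _, release := range []string{"4.19.0", "5.9.0", "5.10.0-8-amd64"} {
		set, err := shouldSetConnReuseMode(release)
		fmt.Println(release, set, err)
	}
}
```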
Kubernetes Prow Robot
7c84064a4f Merge pull request #99000 from verb/1.21-kubelet-metrics
Add kubelet metrics for ephemeral containers
2021-07-08 14:00:55 -07:00
Peri Thompson
8e2b728c68 Explicitly skip host file mounting for windows
2021-07-08 19:38:49 +01:00
Aldo Culquicondor
2dd2622188 Track Job Pods completion in status
Through Job.status.uncountedPodUIDs and a Pod finalizer

An annotation marks if a job should be tracked with new behavior

A separate work queue is used to remove finalizers from orphan pods.

Change-Id: I1862e930257a9d1f7f1b2b0a526ed15bc8c248ad
2021-07-08 17:48:05 +00:00
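A rough model of the tracking flow described in this commit: finished pods are first recorded as uncounted, then lose their finalizer, and only then are folded into the counters. The types, the function `syncFinished`, and the finalizer name are hypothetical simplifications, and the three-step ordering is an assumption about how such a controller would stay consistent across restarts.

```go
package main

import "fmt"

// trackingFinalizer is a hypothetical finalizer name used only in this sketch.
const trackingFinalizer = "example.com/job-tracking"

// Pod and JobStatus are simplified stand-ins for the API objects.
type Pod struct {
	UID        string
	Finished   bool
	Succeeded  bool
	Finalizers []string
}

type JobStatus struct {
	Succeeded          int
	Failed             int
	UncountedSucceeded []string
	UncountedFailed    []string
}

func hasFinalizer(p *Pod) bool {
	for _, f := range p.Finalizers {
		if f == trackingFinalizer {
			return true
		}
	}
	return false
}

func removeFinalizer(p *Pod) {
	kept := p.Finalizers[:0]
	for _, f := range p.Finalizers {
		if f != trackingFinalizer {
			kept = append(kept, f)
		}
	}
	p.Finalizers = kept
}

// syncFinished accounts for finished pods in three steps. In a real
// controller each step would be a separate API write.
func syncFinished(status *JobStatus, pods []*Pod) {
	// Step 1: record finished pods that still carry the finalizer as uncounted.
	var finished []*Pod
	for _, p := range pods {
		if !p.Finished || !hasFinalizer(p) {
			continue
		}
		finished = append(finished, p)
		if p.Succeeded {
			status.UncountedSucceeded = append(status.UncountedSucceeded, p.UID)
		} else {
			status.UncountedFailed = append(status.UncountedFailed, p.UID)
		}
	}
	// Step 2: remove the finalizer so the pods can be deleted.
	for _, p := range finished {
		removeFinalizer(p)
	}
	// Step 3: fold the uncounted pods into the counters and clear the lists.
	status.Succeeded += len(status.UncountedSucceeded)
	status.Failed += len(status.UncountedFailed)
	status.UncountedSucceeded, status.UncountedFailed = nil, nil
}

func main() {
	pods := []*Pod{
		{UID: "a", Finished: true, Succeeded: true, Finalizers: []string{trackingFinalizer}},
		{UID: "b", Finished: true, Finalizers: []string{trackingFinalizer}},
		{UID: "c", Finalizers: []string{trackingFinalizer}},
	}
	var status JobStatus
	syncFinished(&status, pods)
	fmt.Println(status.Succeeded, status.Failed)
}
```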
Kubernetes Prow Robot
b765496650 Merge pull request #98817 from alculquicondor/job-completion-api
Add Job.status.uncountedTerminatedPods for Job tracking
2021-07-08 10:44:54 -07:00
Aldo Culquicondor
bb56a0bd04 Add Job.status.uncountedPodUIDs
For tracking Job Pods that have finished but are not yet counted as failed or succeeded

And feature gate JobTrackingWithFinalizers

Change-Id: I3e080f3ec090922640384b692e88eaf9a544d3b5
2021-07-08 15:31:59 +00:00
Kubernetes Prow Robot
81065fd085 Merge pull request #103532 from thockin/fix-91459-service-update-allocs
Service: Fix semantics for Update wrt allocations
2021-07-08 05:59:05 -07:00
Li Bo
79e230ea21 fix kubelet panic when DynamicKubeletConfig enabled
2021-07-08 16:20:51 +08:00
Lars Ekman
b6b3a69284 Don't set sysctl net.ipv4.vs.conn_reuse_mode for kernels >=5.9
2021-07-08 09:41:12 +02:00
boenn
369c4a2b98 Use cmp.Diff() replace reflect and diagnosis
2021-07-08 15:13:11 +08:00
Li Bo
c3d9b10ca8 feature: support Memory QoS for cgroups v2
2021-07-08 09:26:46 +08:00
Kubernetes Prow Robot
16af282ee7 Merge pull request #103520 from swetharepakula/truncate-endpoints
Truncate endpoints over a 1000 addresses
2021-07-07 18:09:21 -07:00
Kubernetes Prow Robot
8fb777efb0 Merge pull request #103451 from swetharepakula/ga-proxy-gates
Graduate EndpointSliceProxying and WindowsEndpointSliceProxying Gates
2021-07-07 18:09:13 -07:00
Kubernetes Prow Robot
36a7426aa5 Merge pull request #99144 from bart0sh/PR0094-promote-HugePageStorageMediumSize-to-GA
promote huge page storage medium size to GA
2021-07-07 18:09:05 -07:00
Kubernetes Prow Robot
ebbe63f116 Merge pull request #92863 from AkihiroSuda/rootless-pr
kubelet & kube-proxy: ignore sysctl errors and rlimit errors when running in UserNS (for rootless)
2021-07-07 18:08:53 -07:00
Tim Hockin
80dda49ce2 Service: Fix semantics for Update wrt allocations
It is not uncommon for users to Create a Service and not specify things
like ClusterIP and NodePort, which we then allocate for them.  They save
that YAML somewhere and later use it again in an Update, but then it
fails.

That's because we detected them trying to set a ClusterIP from a value
to "", which is not allowed.  If it was just NodePort, they would
actually succeed and reallocate a new port.

After this change, we try to "patch" updates: where the user did not
specify those values, we fill them in from the old object.
2021-07-07 17:09:12 -07:00
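The "patch" behaviour described in this commit can be sketched as copying allocated values from the old object whenever the update leaves them unset. The `Service` and `ServicePort` types are simplified stand-ins, `patchAllocatedValues` is a hypothetical name, and matching ports by port number is an assumption made for the example.

```go
package main

import "fmt"

// ServicePort and Service are simplified stand-ins for the API types.
type ServicePort struct {
	Port     int
	NodePort int
}

type Service struct {
	ClusterIP string
	Ports     []ServicePort
}

// patchAllocatedValues fills in ClusterIP and NodePort values that the user
// left unset in the update, reusing what was allocated for the old object.
// Values the user did specify are never overwritten.
func patchAllocatedValues(newSvc *Service, oldSvc Service) {
	if newSvc.ClusterIP == "" {
		newSvc.ClusterIP = oldSvc.ClusterIP
	}
	// Index the old allocations by port number (an assumption of this sketch).
	allocated := make(map[int]int, len(oldSvc.Ports))
	for _, p := range oldSvc.Ports {
		if p.NodePort != 0 {
			allocated[p.Port] = p.NodePort
		}
	}
	for i := range newSvc.Ports {
		if newSvc.Ports[i].NodePort != 0 {
			continue
		}
		if nodePort, ok := allocated[newSvc.Ports[i].Port]; ok {
			newSvc.Ports[i].NodePort = nodePort
		}
	}
}

func main() {
	oldSvc := Service{ClusterIP: "10.0.0.1", Ports: []ServicePort{{Port: 80, NodePort: 30080}}}
	update := Service{Ports: []ServicePort{{Port: 80}, {Port: 443}}}
	patchAllocatedValues(&update, oldSvc)
	fmt.Printf("%+v\n", update)
}
```

With this in place, re-applying the original YAML keeps the existing ClusterIP and node port instead of failing or reallocating.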
Kubernetes Prow Robot
7bfd0b0503 Merge pull request #103467 from thockin/svc-alloc-lb-nodeports-bug
Fix small bug with AllocateLoadBalancerNodePorts
2021-07-07 17:05:40 -07:00
Kubernetes Prow Robot
6ed98b60f0 Merge pull request #103383 from Huang-Wei/move-up-pods
sched: provide an option for plugin developers to move pods to activeQ
2021-07-07 17:05:22 -07:00
Kubernetes Prow Robot
8e56a34195 Merge pull request #102966 from SergeyKanzhelev/deprecateDynamicKubeletConfig
deprecate and disable by default DynamicKubeletConfig feature flag
2021-07-07 17:05:15 -07:00
Kubernetes Prow Robot
e67979eaf6 Merge pull request #103550 from tkashem/apf-bootstrap-log-message
apf: fix bootstrap ensurer log message
2021-07-07 14:20:36 -07:00
Swetha Repakula
0a42f7b989 Graduate EndpointSliceProxying and WindowsEndpointSliceProxying Gates
2021-07-07 13:33:30 -07:00
Wei Huang
fb9cafc99b sched: provide an option for plugin developers to move pods to activeQ
2021-07-07 12:50:12 -07:00
Swetha Repakula
9bd857ca04 Truncate endpoints over a 1000 addresses
* set `endpoints.kubernetes.io/over-capacity` to "truncated" when the
  number of addresses has been truncated to 1000
* ready addresses are prioritized over non-ready addresses
* addresses are proportionally truncated across subsets
2021-07-07 12:48:43 -07:00
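The truncation rules listed in this commit can be sketched on address counts alone. This is an illustration under assumptions, not the controller code: `Subset`, `allocate` and `truncate` are hypothetical names, and the rounding strategy (floor first, then hand out leftover slots one by one) is one possible way to make the proportional split add up exactly.

```go
package main

import "fmt"

// Subset is a simplified stand-in for an Endpoints subset, holding only counts.
type Subset struct {
	Ready    int
	NotReady int
}

// allocate splits budget across counts proportionally (rounding down), then
// hands leftover slots one at a time to entries that still have spare items.
func allocate(counts []int, budget int) []int {
	total := 0
	for _, c := range counts {
		total += c
	}
	out := make([]int, len(counts))
	if total <= budget {
		copy(out, counts)
		return out
	}
	used := 0
	for i, c := range counts {
		out[i] = c * budget / total
		used += out[i]
	}
	for i := 0; used < budget; i = (i + 1) % len(counts) {
		if out[i] < counts[i] {
			out[i]++
			used++
		}
	}
	return out
}

// truncate caps the total number of addresses at max. Ready addresses are
// kept first; not-ready addresses only use whatever capacity remains. The
// boolean reports whether anything was dropped, which is when the
// over-capacity annotation would be set to "truncated".
func truncate(subsets []Subset, max int) ([]Subset, bool) {
	ready := make([]int, len(subsets))
	notReady := make([]int, len(subsets))
	totalReady, totalNotReady := 0, 0
	for i, s := range subsets {
		ready[i], notReady[i] = s.Ready, s.NotReady
		totalReady += s.Ready
		totalNotReady += s.NotReady
	}
	out := make([]Subset, len(subsets))
	if totalReady+totalNotReady <= max {
		copy(out, subsets)
		return out, false
	}
	keepReady, keepNotReady := ready, make([]int, len(subsets))
	if totalReady >= max {
		keepReady = allocate(ready, max)
	} else {
		keepNotReady = allocate(notReady, max-totalReady)
	}
	for i := range out {
		out[i] = Subset{Ready: keepReady[i], NotReady: keepNotReady[i]}
	}
	return out, true
}

func main() {
	subsets := []Subset{{Ready: 400, NotReady: 400}, {Ready: 200, NotReady: 400}}
	kept, truncated := truncate(subsets, 1000)
	fmt.Println(kept, truncated)
}
```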
Kubernetes Prow Robot
ac6a1b1821 Merge pull request #103414 from ravisantoshgudimetla/fix-pdb-status
[disruptioncontroller] Don't error for unmanaged pods
2021-07-07 12:40:35 -07:00
Abu Kashem
d9e3fbff94 apf: fix bootstrap ensurer log message
2021-07-07 15:01:46 -04:00
Kubernetes Prow Robot
896cf744cb Merge pull request #103420 from raisaat/pods-api-test-fix
Fix pkg/api/pod/util tests to ensure feature gate is set
2021-07-07 10:43:53 -07:00
Kubernetes Prow Robot
eaba61b4de Merge pull request #103276 from NetApp/data-source-ref
Add DataSourceRef field to PVC spec
2021-07-07 08:56:44 -07:00
ravisantoshgudimetla
2c116055f7 [disruptioncontroller] Don't error for unmanaged pods
As of now, we allow PDBs to be applied to pods via
selectors, so there can be unmanaged pods (pods that
don't have backing controllers) that still have PDBs associated.
Such pods are now logged instead of immediately throwing
a sync error. This ensures the disruption controller is
not frequently updating the status subresource, which
prevents excessive and expensive writes to etcd.
2021-07-07 10:42:24 -04:00
Kubernetes Prow Robot
17f6f28621 Merge pull request #103468 from Huang-Wei/fix-sched-cc
instantiates scheduler ComponentConfig after parsing feature gates
2021-07-07 01:22:43 -07:00
Akihiro Suda
26e83ac4d4 kubelet: ignore /dev/kmsg error when running in userns
oomwatcher.NewWatcher returns "open /dev/kmsg: operation not permitted" error,
when running with sysctl value `kernel.dmesg_restrict=1`.

The error is negligible for KubeletInUserNamespace.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-07-07 14:23:31 +09:00
Akihiro Suda
192790c52f kube-proxy: allow running in userns
Ignore an error during setting RLIMIT_NOFILE.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-07-07 14:23:31 +09:00
Akihiro Suda
dbe0155139 kubelet/cm: ignore sysctl error when running in userns
Errors during setting the following sysctl values are ignored:
- vm.overcommit_memory
- vm.panic_on_oom
- kernel.panic
- kernel.panic_on_oops
- kernel.keys.root_maxkeys
- kernel.keys.root_maxbytes

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-07-07 14:23:29 +09:00
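The behaviour described in this commit, tolerating sysctl failures only when running in a user namespace, can be sketched as follows. The names `setter`, `applySysctls` and `denyAll` are hypothetical, the writer is injected so the sketch runs without touching `/proc/sys`, and the values in `main` are placeholders rather than the kubelet's real settings.

```go
package main

import (
	"errors"
	"fmt"
)

// setter writes one sysctl value. It is injected so that the sketch can run
// without root privileges; a real implementation would write under /proc/sys.
type setter func(name string, value int) error

var errNotPermitted = errors.New("operation not permitted")

// denyAll simulates an environment where every sysctl write is refused.
func denyAll(name string, value int) error { return errNotPermitted }

// applySysctls tries to set every desired value. In a user namespace,
// failures are collected and skipped; otherwise the first failure is returned.
func applySysctls(desired map[string]int, set setter, inUserNS bool) ([]string, error) {
	var ignored []string
	for name, value := range desired {
		if err := set(name, value); err != nil {
			if inUserNS {
				ignored = append(ignored, name)
				continue
			}
			return ignored, fmt.Errorf("setting %s=%d: %w", name, value, err)
		}
	}
	return ignored, nil
}

func main() {
	// Placeholder values chosen for illustration only.
	desired := map[string]int{"vm.overcommit_memory": 1, "kernel.panic": 10}
	ignored, err := applySysctls(desired, denyAll, true)
	fmt.Println(len(ignored), err)
	_, err = applySysctls(desired, denyAll, false)
	fmt.Println(err != nil)
}
```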