Commit Graph

43125 Commits

Author SHA1 Message Date
Jordan Liggitt
81fa9855c1 Fix quota controller hotloop in integration tests 2021-10-06 11:34:32 -04:00
Kubernetes Prow Robot
debd6c1e9e Merge pull request #104526 from jingxu97/aug/volumeattach
Fix issue in node status updating VolumeAttached list
2021-10-05 17:30:32 -07:00
Kubernetes Prow Robot
907d62eac8 Merge pull request #105462 from ehashman/merge-terminal-phase
Ensure terminal pods maintain terminal status
2021-10-05 13:12:58 -07:00
Jing Xu
69b9f9b1f0 Fix issue in node status updating VolumeAttached list
During volume detach, the following might happen in reconciler

1. Pod is deleting
2. remove volume from reportedAsAttached, so node status updater will
update volumeAttached list
3. detach failed due to some issue
4. volume is added back in reportedAsAttached
5. reconciler loops again the volume, remove volume from
reportedAsAttached
6. detach will not be trigged because exponential back off, detach call
will fail with exponential backoff error
7. another pod is added which using the same volume on the same node
8. reconciler loops and it will NOT try to tigger detach anymore

At this point, volume is still attached and in actual state, but
volumeAttached list in node status does not has this volume anymore, and
will block volume mount from kubelet.

The fix in first round is to add volume back into the volume list that
need to reported as attached at step 6 when detach call failed with
error (exponentical backoff). However this might has some performance
issue if detach fail for a while. During this time, volume will be keep
removing/adding back to node status which will cause a surge of API
calls.

So we changed to logic to check first whether operation is safe to retry which
means no pending operation or it is not in exponentical backoff time
period before calling detach. This way we can avoid keep removing/adding
volume from node status.

Change-Id: I5d4e760c880d72937d34b9d3e904ecad125f802e
2021-10-05 09:44:35 -07:00
Elana Hashman
3005ef34f2 Ensure terminal pods maintain terminal status 2021-10-05 09:26:27 -07:00
Kubernetes Prow Robot
6c292ce270 Merge pull request #105245 from yibozhuang/lost-pvc-prefilter-optimization
Scheduler volumebinding plugin - handle Lost PVC as UnschedulableAndUnresolvable
2021-10-05 08:23:09 -07:00
Kubernetes Prow Robot
c91f9bdc60 Merge pull request #104689 from cynepco3hahue/memory_manager_restricted_policy_fix
kubelet: memory manager: fix preferred topology hints calculation
2021-10-05 06:47:08 -07:00
Kubernetes Prow Robot
519b164db1 Merge pull request #105222 from cyclinder/remove_node_lease_GA
remove nodeLease feature GA
2021-10-05 05:41:21 -07:00
Kubernetes Prow Robot
efa9029a0d Merge pull request #104920 from tkashem/response-writer-cleanup
apiserver: decorate http.ResponseWriter correctly
2021-10-05 00:53:09 -07:00
Author cyclinder
e61b901628 remove nodeLease feature GA
Signed-off-by: cyclinder <qifeng.guo@daocloud.io>
2021-10-05 12:23:27 +08:00
Kubernetes Prow Robot
0a29e2a73a Merge pull request #105197 from alculquicondor/job-tracking
Roll-forward: Beta requirements for JobTrackingWithFinalizers
2021-10-04 18:57:49 -07:00
Kensei Nakada
0bb4c14519 scheduler: delete docs related to equivalence cache from scheduler (#105417)
* Delete: delete docs related to equivalence cache from scheduler

* Fix: re-add comment about ServiceAffinity
2021-10-04 16:23:49 -07:00
Kubernetes Prow Robot
2358c8ae5b Merge pull request #105144 from umangachapagain/fix-logs
remove format specifiers from structured logs
2021-10-04 14:12:51 -07:00
Aldo Culquicondor
5929ccd391 Track expected removals of Pod finalizers
Add the UIDs of Pods for which we are removing finalizers to an in-memory cache.

The controller removes UIDs from the cache as Pod updates or deletes come in.

This avoids double counting finished Pods when Pod updates arrive after Job status updates.

https://github.com/kubernetes/kubernetes/issues/105200
2021-10-04 16:09:58 -04:00
Elana Hashman
5ff6c2396d Do not sync Waiting statuses for Terminated pods 2021-10-04 11:05:54 -07:00
Kubernetes Prow Robot
04f747d09f Merge pull request #104782 from kerthcet/cleanup/remove-cc-v1beta1
remove scheduler component config v1beta1
2021-10-04 08:53:08 -07:00
Abu Kashem
0d50c969c5 apiserver: wrap ResponseWriter using abstraction 2021-10-04 10:59:11 -04:00
Kubernetes Prow Robot
c724abae81 Merge pull request #105375 from jonyhy96/fix-clock-util
node: test file use k8s.io/utils/clock instead
2021-10-04 04:19:31 -07:00
Kubernetes Prow Robot
0ac956ff2b Merge pull request #104968 from twpayne/twpayne/ones-count-64
Speed up counting of bits in allocator
2021-10-01 19:27:05 -07:00
Rob Scott
d7a640d831 Excluding Control Plane Nodes from Topology Hints calculations 2021-10-01 15:12:38 -07:00
Kubernetes Prow Robot
e414cf7641 Merge pull request #100482 from pohly/generic-ephemeral-volume-checks
generic ephemeral volume checks
2021-10-01 10:47:22 -07:00
haoyun
8d9fa1decc fix: use k8s.io/utils/clock instead
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-10-02 00:08:53 +08:00
Tom Payne
21755f9ec0 Speed up counting of bits in allocator
Benchmark:

goos: linux
goarch: amd64
pkg: k8s.io/kubernetes/pkg/registry/core/service/allocator
cpu: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz

Before:

BenchmarkCountBits-8     9459236               140.4 ns/op

After:

BenchmarkCountBits-8    140667842                9.541 ns/op
2021-10-01 17:09:56 +02:00
Tom Payne
125312a8cf Add extra test and benchmark for count bits 2021-10-01 17:08:36 +02:00
Patrick Ohly
1d181ad84f scheduler: fail the volume attach limit when PVC is missing
Previously, the situation was ignored, which might have had the effect that Pod
scheduling continued (?) even though the Pod+PVC weren't known to be in an
acceptable state.
2021-10-01 17:03:44 +02:00
Patrick Ohly
1e26115df5 consider ephemeral volumes for host path and node limits check
When adding the ephemeral volume feature, the special case for
PersistentVolumeClaim volume sources in kubelet's host path and node
limits checks was overlooked. An ephemeral volume source is another
way of referencing a claim and has to be treated the same way.
2021-10-01 17:03:44 +02:00
Kubernetes Prow Robot
c46d84b991 Merge pull request #105293 from verb/1.23-ec-validation
Validate PodSpec in EphemeralContainersUpdate
2021-10-01 07:31:34 -07:00
Aldo Culquicondor
95c2a8024c Parallelize pod updates in job test
To potentially reduce the number of job controller syncs.

Also reduce the maximum number of pods to sync in tests.
2021-10-01 09:55:53 -04:00
Kubernetes Prow Robot
883250145c Merge pull request #104788 from 249043822/memorymanager-br
Fix initContainersReusableMemory delete bug in MemoryManager
2021-10-01 05:27:22 -07:00
Yibo Zhuang
b8fe514232 Scheduler volumebinding plugin - handle Lost PVC as
UnschedulableAndUnresolvable

This change adds an additional check in the volumebinding scheduler
plugin to handle PVC with phase ClaimLost which will allow the
scheduler to return UnschedulableAndUnresolvable during the PreFilter
stage and skip the rest of the node evaluation since the PVC is
bound to a PV that does not exist.

Without this change, the FailedScheduling error message would look like:

0/10 nodes are available: 2 node(s) had taint {node/test: true},
that the pod didn't tolerate, 6 node(s) had taint {node/unhealthy: true},
that the pod didn't tolerate, 2 pvc(s) bound to non-existent pv(s)

Which is still evaluating every single node to determine that the pod
cannot be scheduled because the PVC is bound to a non-existent PV

With this change, the FailedScheduling error message would look like:

0/10 nodes are available: 1 persistentvolumeclaim "foo" bound
to non-existent persistentvolume "bar"

Signed-off By: Yibo Zhuang <yibzhuang@gmail.com>
2021-09-30 21:49:46 -07:00
Kubernetes Prow Robot
7d76d519ca Merge pull request #105374 from xing-yang/update_volume_csi_owners
Bubble up to pkg/volume/OWNERS file
2021-09-30 21:17:21 -07:00
Kubernetes Prow Robot
0ff7486e21 Merge pull request #105385 from elweb9858/remove_owner
Removing elweb9858 from winkernel kube-proxy approver+reviewer lists
2021-09-30 19:47:21 -07:00
Kubernetes Prow Robot
e136faa1c4 Merge pull request #105379 from pohly/volume-util-owners
pkg/volume/util: remove out-dated OWNERS
2021-09-30 18:33:45 -07:00
Kubernetes Prow Robot
cab54856f1 Merge pull request #104933 from vikramcse/automate_mockery
conversion of tests from mockery to mockgen
2021-09-30 18:33:21 -07:00
Shuhei Kitagawa
ef0eff14ab Add tests kubelet default config (#105116)
* Use utilpointer to get a pointer

* Add tests for kubelet default configs

* Change copyright year from 2015 to 2021

* Run gofmt

* Add all negative and all positive test cases
2021-09-30 17:29:33 -07:00
Lee Verberne
8b24dc07ff Test ephemeral container/pod conflicting fields
This adds a test case to cover the scenario where the fields of an
ephemeral container conflict with other fields in the pod and must be
detected by full PodSpec validation.
2021-09-30 21:47:19 +02:00
elweb9858
365c5e5687 Removing elweb9858 from winkernel kube-proxy approver+reviewer lists 2021-09-30 11:40:37 -07:00
Patrick Ohly
07f6571a49 pkg/volume/util: remove out-dated OWNERS
There is no reason for having separate owners for this folder. The parent
folder has a much better OWNERS file with references to the SIG-Storage
aliases.
2021-09-30 17:54:46 +02:00
xing-yang
2833aee8ff Bubble up to pkg/volume/OWNERS file 2021-09-30 14:01:28 +00:00
Umanga Chapagain
e262278772 fix incorrect structured log patterns
proxy/winkernel/proxier.go was using format specifier with
structured logging pattern which is wrong. This commit removes
use of format specifiers to align with the pattern.

Signed-off-by: Umanga Chapagain <chapagainumanga@gmail.com>
2021-09-30 11:10:13 +05:30
Kubernetes Prow Robot
6f47878926 Merge pull request #105107 from cici37/addFG
Add feature gate CustomResourceValidationExpressions
2021-09-29 08:08:48 -07:00
Kubernetes Prow Robot
598a8829c1 Merge pull request #105267 from llhuii/fix-topology-hint-bug
TopologyAwareHints: fix getHintsByZone bug
2021-09-28 20:54:48 -07:00
llhuii
e7350d126e TopologyAwareHints: add getHintsByZone unit test 2021-09-29 10:39:03 +08:00
Kubernetes Prow Robot
e138afc35d Merge pull request #105213 from yxxhero/remove_StartedPodsErrorsTotal_metrice_message
Remove StartedPodsErrorsTotal metric message
2021-09-28 10:45:16 -07:00
Kubernetes Prow Robot
9005160245 Merge pull request #105272 from wojtek-t/add_jittering_for_kubelet
Add jittering for Kubelet status computing
2021-09-28 00:20:42 -07:00
kerthcet
75a255d2ed remove scheduler component config v1beta1
Signed-off-by: kerthcet <kerthcet@gmail.com>
2021-09-28 13:13:17 +08:00
Kubernetes Prow Robot
16fdb2f391 Merge pull request #105196 from yibozhuang/improve-scheduler-pv-not-exist-err
Enhance ErrReasonPVNotExist in volumebinding scheduler plugin
2021-09-27 19:04:42 -07:00
llhuii
528cd30145 TopologyAwareHints: fix getHintsByZone bug
The bug could result in the EndpointSlice controller unnecessarily updating
EndpointSlices associated with a Service that had Topology Aware Hints enabled.
2021-09-28 09:02:24 +08:00
Kubernetes Prow Robot
ce4958d908 Merge pull request #105009 from harjas27/printer_event_count
kubectl: remove extra +1 for printing event count
2021-09-27 16:24:43 -07:00
Yibo Zhuang
603a4e1931 Enhance ErrReasonPVNotExist in volumebinding scheduler plugin
This change will make the message more clear when there
is a case of PVC(s) bound to PV(s) that no longer exists
and scheduler does not select the node due to this issue.

Previous error message would look like:
0/2 nodes are available: 2 pvc(s) bound to non-existent pv(s)

Updated message looks like:
0/2 nodes are available: 2 node(s) unavailable due to one or more
pvc(s) bound to non-existent pv(s)

For larger clusters with many different reasons of nodes that
are not available, the current message can be very misleading for
users to think that there are many PVCs lost due to PVs deleted but
in fact it could be just a single PVC case but many nodes not selected
by the scheduler due to this case.

Signed-off By: Yibo Zhuang <yibzhuang@gmail.com>
2021-09-27 15:02:35 -07:00