Commit Graph

38338 Commits

Author SHA1 Message Date
Clayton Coleman
8bc5cb01a9 kubelet: Clear the podStatusChannel before invoking syncBatch
The status manager syncBatch() method processes the current state
of the cache, which should include all entries in the channel. Flush
the channel before we call a batch to avoid unnecessary work and
to unblock pod workers when the node is congested.

Discovered while investigating long shutdown intervals on the node
where the status channel stayed full for tens of seconds.

Add a for loop around the select statement to avoid unnecessary
invocations of the wait.Forever closure each time.
2020-03-04 13:34:25 -05:00
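A minimal sketch of the flush described in 8bc5cb01a9, not the kubelet's actual code: a non-blocking drain loop discards everything already queued on a buffered channel so that a single batch sync can follow. The names podStatusUpdate and drainPending are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// podStatusUpdate is a hypothetical stand-in for whatever is queued on the
// status channel; the real kubelet type is not shown in the commit message.
type podStatusUpdate struct {
	podUID string
}

// drainPending discards every update already queued on ch without blocking
// and returns how many it removed. The assumption (taken from the commit
// message) is that the cache already reflects these entries, so one batch
// sync afterwards covers all of them.
func drainPending(ch <-chan podStatusUpdate) int {
	drained := 0
	for {
		select {
		case _, ok := <-ch:
			if !ok {
				// Channel was closed: nothing more will ever arrive.
				return drained
			}
			drained++
		default:
			// Channel is empty right now: stop and let the caller sync.
			return drained
		}
	}
}

func main() {
	ch := make(chan podStatusUpdate, 4)
	ch <- podStatusUpdate{podUID: "a"}
	ch <- podStatusUpdate{podUID: "b"}
	ch <- podStatusUpdate{podUID: "a"}

	fmt.Println(drainPending(ch), len(ch))
}
```

Draining frees buffer slots immediately, which is what lets senders blocked on a full channel make progress.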
Clayton Coleman
8722c834e5 kubelet: Never restart containers in deleting pods
When constructing the API status of a pod, if the pod is marked for
deletion no containers should be started. Previously, if a container
inside of a terminating pod failed to start due to a container
runtime error (that populates reasonCache) the reasonCache would
remain populated (it is only updated by syncPod for non-terminating
pods) and the delete action on the pod would be delayed until the
reasonCache entry expired due to other pods.

This dramatically reduces the amount of time the Kubelet waits to
delete pods that are terminating and encountered a container runtime
error.
2020-03-04 13:34:25 -05:00
Yu-Ju Hong
2364c10e2e kubelet: Don't delete pod until all container status is available
After a pod reaches a terminal state and all containers are complete
we can delete the pod from the API server. The dispatchWork method
needs to wait for all container status to be available before invoking
delete. Even after the worker stops, status updates will continue to
be delivered and the sync handler will continue to sync the pods, so
dispatchWork gets multiple opportunities to see status.

The previous code assumed that a pod in Failed or Succeeded had no
running containers, but eviction or deletion of running pods could
still have running containers whose status needed to be reported.

This modifies an earlier test to guarantee that the "fallback" exit
code 137 is never reported to match the expectation that all pods
exit with valid status for all containers (unless some exceptional
failure like eviction were to occur while the test is running).
2020-03-04 13:34:25 -05:00
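The condition described in 2364c10e2e can be sketched roughly as follows. This is an illustration under stated assumptions, not the dispatchWork implementation: the containerView type, its fields, and safeToDelete are hypothetical names.

```go
package main

import "fmt"

// containerView is a hypothetical summary of what the kubelet knows about
// one container at the time deletion is considered.
type containerView struct {
	name           string
	running        bool // the runtime still reports the container as running
	hasFinalStatus bool // a terminated status (exit code etc.) has been observed
}

// safeToDelete reports whether the pod may be removed from the API server:
// the pod must be in a terminal phase AND every container must have stopped
// with its final status available. A terminal phase alone is not enough,
// because an evicted or deleted pod can still have running containers.
func safeToDelete(phaseTerminal bool, containers []containerView) bool {
	if !phaseTerminal {
		return false
	}
	for _, c := range containers {
		if c.running || !c.hasFinalStatus {
			return false
		}
	}
	return true
}

func main() {
	evicted := []containerView{
		{name: "app", running: true, hasFinalStatus: false},
	}
	done := []containerView{
		{name: "app", running: false, hasFinalStatus: true},
	}
	fmt.Println(safeToDelete(true, evicted), safeToDelete(true, done))
}
```

Because the check is cheap and repeatable, it can be re-evaluated on every later sync until the missing status arrives.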
Clayton Coleman
ad3d8949f0 kubelet: Preserve existing container status when pod terminated
The kubelet must not allow a container that was reported failed in a
restartPolicy=Never pod to be reported to the apiserver as success.
If a client deletes a restartPolicy=Never pod, the dispatchWork and
status manager race to update the container status. When dispatchWork
(specifically podIsTerminated) returns true, it means all containers
are stopped, which means status in the container is accurate. However,
the TerminatePod method then clears this status. This results in a
pod that has been reported with status.phase=Failed getting reset to
status.phase=Succeeded, which is a violation of the guarantees around
terminal phase.

Ensure the Kubelet never reports that a container succeeded when it
hasn't run or been executed by guarding the terminate pod loop from
ever reporting 0 in the absence of container status.
2020-03-04 13:34:24 -05:00
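One way to picture the guard described in ad3d8949f0 is sketched below. It is not the TerminatePod code: the containerStatus type and terminateStatuses are hypothetical, and using 137 as the fallback is an assumption borrowed from the "fallback" exit code mentioned in 2364c10e2e.

```go
package main

import "fmt"

// fallbackExitCode is an assumption for this sketch: a non-zero code used
// when no real exit status was ever observed for a container.
const fallbackExitCode = 137

// containerStatus is a hypothetical, simplified container status record.
type containerStatus struct {
	name       string
	terminated bool
	exitCode   int
}

// terminateStatuses marks every container as terminated while preserving any
// exit code that was already reported. A container with no terminated status
// is never given exit code 0, so a pod cannot flip from Failed to Succeeded.
func terminateStatuses(in []containerStatus) []containerStatus {
	out := make([]containerStatus, len(in))
	for i, s := range in {
		if !s.terminated {
			s.terminated = true
			s.exitCode = fallbackExitCode
		}
		out[i] = s
	}
	return out
}

func main() {
	statuses := []containerStatus{
		{name: "app", terminated: true, exitCode: 1},
		{name: "sidecar", terminated: false},
	}
	for _, s := range terminateStatuses(statuses) {
		fmt.Println(s.name, s.terminated, s.exitCode)
	}
}
```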
Kubernetes Prow Robot
497a998ba6 Merge pull request #88654 from ddebroy/gmsa-disable1
Promote GMSA support for Windows to GA
2020-03-04 02:32:01 -08:00
Kubernetes Prow Robot
4d19c6f2ad Merge pull request #87537 from uthark/oatamanenko/apiversion
Fixes #87506 Add apiVersion to involvedObject
2020-03-04 02:31:47 -08:00
Kubernetes Prow Robot
0535520f6e Merge pull request #88758 from soltysh/hide_last_applied
Hide kubectl.kubernetes.io/last-applied-configuration in describe
2020-03-03 21:06:01 -08:00
Kubernetes Prow Robot
cd23e78c3d Merge pull request #88684 from saad-ali/updateMountLib
Update AzureFile and CephFS to use MountSensitive
2020-03-03 21:05:48 -08:00
Deep Debroy
16d221e407 Promote GMSA to GA
Signed-off-by: Deep Debroy <ddebroy@docker.com>
2020-03-04 02:56:21 +00:00
Kubernetes Prow Robot
aeb88b6ecd Merge pull request #88587 from cmluciano/cml/v1beta1paths
Adding PathType to Ingress
2020-03-03 18:13:47 -08:00
Kubernetes Prow Robot
0773f108c7 Merge pull request #88710 from SataQiu/ipvs-readme-20200302
kube-proxy: small cleanup for ipvs readme
2020-03-03 12:18:22 -08:00
Kubernetes Prow Robot
9d0cbb7503 Merge pull request #88673 from jsafrane/block-feature-ga
Promote block volumes to GA
2020-03-03 12:17:12 -08:00
Kubernetes Prow Robot
bfb3fb54b4 Merge pull request #88240 from soltysh/pod_conditions
Present more concrete information about pod readiness
2020-03-03 12:15:42 -08:00
Kubernetes Prow Robot
62dc3ea6d1 Merge pull request #87368 from 928234269/fix_staticcheck01
fix staticcheck errors in pkg/controller/daemon.
2020-03-03 12:15:28 -08:00
saad-ali
3784438b56 Prevent CephFS from logging sensitive options
2020-03-03 11:20:08 -08:00
saad-ali
548b297a00 Prevent AzureFile from logging sensitive options
2020-03-03 11:20:08 -08:00
saad-ali
727582311f Fix MountError Test
2020-03-03 11:20:08 -08:00
Rob Scott
f38904d6f4 Adding PathType to Ingress
Co-authored-by: Christopher M. Luciano <cmluciano@us.ibm.com>
2020-03-03 11:11:16 -08:00
Maciej Szulik
02cd65d7bb Squash pkg/describe/versioned/ into pkg/describe/
2020-03-03 19:20:06 +01:00
Kubernetes Prow Robot
06b798781a Merge pull request #88591 from smarterclayton/status_update
kubelet: Avoid sending no-op patches
2020-03-03 09:43:38 -08:00
James Munnelly
4144a2a1cf Add unit tests for IsKubeletClientCSR and IsKubeletServingCSR
2020-03-03 13:14:32 +00:00
Kubernetes Prow Robot
c86aec0564 Merge pull request #88745 from mborsz/slice3
Implement simple endpoint slice batching
2020-03-03 03:03:38 -08:00
Kubernetes Prow Robot
ac55a51034 Merge pull request #85056 from pohsienshih/volume/golint
Fix golint issues for pkg/volume/rbd
2020-03-03 01:37:37 -08:00
Maciej Borsz
49b11b5431 Implement simple endpoint slice batching
2020-03-03 08:16:42 +01:00
Kubernetes Prow Robot
f221dbb91b Merge pull request #88505 from liggitt/pod-ip-patch
Honor status.podIP over status.podIPs and node.spec.podCIDR over node.spec.PodCIDRs when mismatched
2020-03-02 16:16:36 -08:00
Kubernetes Prow Robot
c51ad0cb61 Merge pull request #88735 from pancernik/plugin-args-api-improvements
Improve plugin args JSON tags
2020-03-02 14:51:06 -08:00
Kubernetes Prow Robot
1b4b155332 Merge pull request #88671 from alculquicondor/feat/default-spreading
Add default constraints to PodTopologySpread plugin
2020-03-02 14:50:25 -08:00
Kubernetes Prow Robot
01593144e6 Merge pull request #88657 from chendotjs/validate-ipvs-timeout
validate configuration of kube-proxy IPVS tcp,tcpfin,udp timeout
2020-03-02 14:50:16 -08:00
Jan Safranek
3af671011a Generated API
2020-03-02 22:21:42 +01:00
Kubernetes Prow Robot
7e2394cbb0 Merge pull request #88660 from jsafrane/block-uncertain
Implement uncertain mount for block volumes
2020-03-02 11:43:08 -08:00
Jordan Liggitt
60da52a24a Honor status.podIP over status.podIPs, node.spec.podCIDR over node.spec.podCIDRs
2020-03-02 14:21:22 -05:00
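The precedence rule named in 60da52a24a might look roughly like the sketch below. It rests on an assumption not stated in the commit title: that the first entry of the plural field is expected to mirror the singular field. The function reconcilePodIPs is hypothetical and is not the code from the commit.

```go
package main

import "fmt"

// reconcilePodIPs returns the list of IPs to trust for a pod.
// Assumption for this sketch: podIPs[0] is supposed to equal podIP. When the
// two disagree, the singular field wins and the plural list is replaced.
func reconcilePodIPs(podIP string, podIPs []string) []string {
	if podIP == "" {
		// Nothing to compare against: keep the plural field as reported.
		return podIPs
	}
	if len(podIPs) > 0 && podIPs[0] == podIP {
		// Fields agree: keep the full list (it may carry a second family).
		return podIPs
	}
	// Mismatch or empty plural field: honor the singular value.
	return []string{podIP}
}

func main() {
	fmt.Println(reconcilePodIPs("10.0.0.1", []string{"10.0.0.2", "fd00::2"}))
	fmt.Println(reconcilePodIPs("10.0.0.1", []string{"10.0.0.1", "fd00::1"}))
}
```

The same shape would apply to node.spec.podCIDR versus node.spec.podCIDRs, with the same caveat.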
Aldo Culquicondor
73ad38593a Add default constraints to PodTopologySpread
And update benchmark for even pod spreading to use default constraints

Signed-off-by: Aldo Culquicondor <acondor@google.com>
2020-03-02 13:50:21 -05:00
Kubernetes Prow Robot
62e993ce09 Merge pull request #88401 from gongguan/volume_binder
refactor volume binder
2020-03-02 09:16:44 -08:00
Rafal Wicha
09598d48f6 Improve plugin args JSON tags
2020-03-02 15:20:44 +00:00
Jan Safranek
afcbb68386 Fix unit test to fail with proper final gRPC code
Plain "errors.New" is interpreted as a transient error.
2020-03-02 12:54:03 +01:00
Jan Safranek
8536787133 Add unit tests
2020-03-02 12:54:02 +01:00
Jan Safranek
c11427fef5 Call NodeUnstage after NodeStage timeout
When NodeStage times out without preparing the destination device and the user
deletes the corresponding pod, the driver may continue staging the volume in
the background. Kubernetes must call NodeUnstage to "cancel" this operation.

Therefore TearDownDevice should be called even when the target directory
does not exist (yet).
2020-03-02 12:54:02 +01:00
Jan Safranek
f6fc73573c Call NodeUnpublish after NodePublish timeout
When NodePublish times out and the user deletes the corresponding pod, the driver
may continue publishing the volume. In order to "cancel" this operation,
Kubernetes must issue NodeUnpublish and wait until it finishes.

Therefore, NodeUnpublish should be called even if the target directory
(created by the driver) does not exist yet.
2020-03-02 12:54:02 +01:00
Jan Safranek
86a5bd98b6 Add uncertain map state to block volumes
Volume mount should be marked as uncertain after a NodeStage / NodePublish
timeout or similar error, when the driver can continue with the operation in
the background.
2020-03-02 12:54:02 +01:00
Kubernetes Prow Robot
39ed64ec4c Merge pull request #88569 from andyzhangx/csi-corrupt-mnt-fix
fix: corrupted mount point in csi driver node stage/publish
2020-03-02 03:30:43 -08:00
SataQiu
b60c0b5c24 small cleanup for ipvs readme
2020-03-02 10:56:29 +08:00
chendotjs
e79f49ebba validate configuration of kube-proxy IPVS tcp,tcpfin,udp timeout
2020-03-02 10:28:52 +08:00
Rob Scott
132d2afca0 Adding IngressClass to networking/v1beta1
Co-authored-by: Christopher M. Luciano <cmluciano@us.ibm.com>
2020-03-01 18:17:09 -08:00
pohsienshih
9bfe818229 Fixed golint issues in RBD code
2020-02-29 23:36:58 +08:00
Kubernetes Prow Robot
03b7f272c8 Merge pull request #88246 from munnerz/csr-signername-controllers
Update CSR controllers & kubelet to respect signerName field
2020-02-28 23:38:39 -08:00
louisgong
c6b94e4606 refactor volume binder
2020-02-29 12:03:39 +08:00
Kubernetes Prow Robot
268d0a1d3a Merge pull request #85870 from Jefftree/authn-netproxy
Use Network Proxy with Authentication & Authorizer Webhooks
2020-02-28 18:44:39 -08:00
Kubernetes Prow Robot
901a884c71 Merge pull request #88338 from egernst/PodOverhead-beta
Upgrade PodOverhead to beta
2020-02-28 15:12:40 -08:00
Jan Safranek
2c1b743766 Promote block volume features to GA
2020-02-28 20:48:38 +01:00
Patrick Ohly
2e7ce8cea0 bazel update
2020-02-28 10:09:19 +01:00