Commit Graph

326 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
fa1d682bd7 Merge pull request #103353 from njuptlzf/fix_datarace
fix data race for Test_Run_Positive_VolumeMountControllerAttachEnabledRace
2021-08-04 19:00:23 -07:00
Kubernetes Prow Robot
a674fb496c Merge pull request #103261 from markusthoemmes/kubelet-volume-logs
Add pod context to volume lifecycle logs
2021-08-04 19:00:15 -07:00
Markus Thömmes
c820824711 Add pod context to volume lifecycle logs 2021-08-03 13:12:22 +02:00
njuptlzf
1555877cc5 fix data race for Test_Run_Positive_VolumeMountControllerAttachEnabledRace 2021-07-26 17:17:16 +08:00
Kubernetes Prow Robot
6aa160f3ba Merge pull request #103181 from 249043822/bugfix-volumemanager
Add sync reconstructed volume from desired state of world for volumemanager
2021-07-19 15:04:52 -07:00
Kubernetes Prow Robot
2da4d48e6d Merge pull request #100567 from jingxu97/mar/mark
Mark volume mount as uncertain in case of volume expansion fails
2021-07-13 22:20:26 -07:00
KeZhang
65618bfd69 Add sync reconstructed volume from desired state of world for volumemanager 2021-07-13 12:51:37 +08:00
Jing Xu
0fa01c371c Mark volume mount as uncertain in case of volume expansion fails
Mark the volume mount in the actual state even if volume expansion fails, so
that the reconciler can tear down the volume when needed. To prevent pods from
starting to use it, mark the volume as uncertain instead of mounted.

Will add unit test after the logic is reviewed.

Change-Id: I5aebfa11ec93235a87af8f17bea7f7b1570b603d
2021-07-08 16:00:34 -07:00
Clayton Coleman
3eadd1a9ea Keep pod worker running until pod is truly complete
A number of race conditions exist when pods are terminated early in
their lifecycle because components in the kubelet need to know "no
running containers" or "containers can't be started from now on" but
were relying on outdated state.

Only the pod worker knows whether containers are being started for
a given pod, which is required to know when a pod is "terminated"
(no running containers, none coming). Move that responsibility and
podKiller function into the pod workers, and have everything that
was killing the pod go into the UpdatePod loop. Split syncPod into
three phases - setup, terminate containers, and cleanup pod - and
have transitions between those methods be visible to other
components. After this change, to kill a pod you tell the pod worker
to UpdatePod({UpdateType: SyncPodKill, Pod: pod}).

Several places in the kubelet were incorrect about whether they
were handling terminating (should stop running, might have
containers) or terminated (no running containers) pods. The pod worker
exposes methods that allow other loops to know when to set up or tear
down resources based on the state of the pod - these methods remove
the possibility of race conditions by ensuring a single component is
responsible for knowing each pod's allowed state and other components
simply delegate to checking whether they are in the window by UID.

Removing containers no longer blocks final pod deletion in the
API server and is handled as background cleanup. Node shutdown
no longer marks pods as failed, as they can be restarted in the
next step.

See https://docs.google.com/document/d/1Pic5TPntdJnYfIpBeZndDelM-AbS4FN9H2GTLFhoJ04/edit# for details
2021-07-06 15:55:22 -04:00
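
The phase split described in commit 3eadd1a9ea can be pictured with a minimal Go sketch. It assumes a single owner of per-pod state keyed by UID; the names `podPhase`, `podWorker`, `StartPod`, `KillPod`, `ContainersStopped` and `CouldHaveRunningContainers` are hypothetical and chosen for this illustration only, and the sketch is not the kubelet's implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// podPhase is a hypothetical enum for the phases named in the commit message.
type podPhase int

const (
	phaseSetup       podPhase = iota // containers may still be started
	phaseTerminating                 // should stop running, might have containers
	phaseTerminated                  // no running containers, none coming
)

// podWorker is an illustrative stand-in for the single component that owns
// every pod's phase, keyed by pod UID.
type podWorker struct {
	mu     sync.Mutex
	phases map[string]podPhase
}

func newPodWorker() *podWorker {
	return &podWorker{phases: make(map[string]podPhase)}
}

// StartPod registers a pod in the setup phase; a known pod is left untouched.
func (w *podWorker) StartPod(uid string) {
	w.mu.Lock()
	defer w.mu.Unlock()
	if _, known := w.phases[uid]; !known {
		w.phases[uid] = phaseSetup
	}
}

// KillPod requests termination. Phases only move forward.
func (w *podWorker) KillPod(uid string) {
	w.mu.Lock()
	defer w.mu.Unlock()
	if p, ok := w.phases[uid]; ok && p == phaseSetup {
		w.phases[uid] = phaseTerminating
	}
}

// ContainersStopped records that the last container of a terminating pod exited.
func (w *podWorker) ContainersStopped(uid string) {
	w.mu.Lock()
	defer w.mu.Unlock()
	if p, ok := w.phases[uid]; ok && p == phaseTerminating {
		w.phases[uid] = phaseTerminated
	}
}

// CouldHaveRunningContainers is what other loops would ask before tearing
// down resources such as volumes.
func (w *podWorker) CouldHaveRunningContainers(uid string) bool {
	w.mu.Lock()
	defer w.mu.Unlock()
	p, ok := w.phases[uid]
	return ok && p != phaseTerminated
}

func main() {
	w := newPodWorker()
	w.StartPod("pod-1")
	w.KillPod("pod-1")
	fmt.Println(w.CouldHaveRunningContainers("pod-1")) // terminating: containers may still run
	w.ContainersStopped("pod-1")
	fmt.Println(w.CouldHaveRunningContainers("pod-1")) // terminated: cleanup may proceed
}
```

Because only `podWorker` mutates the phase map, other components in this sketch never keep their own copy of the state; they ask by UID each time.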
Chris Henzie
2b98f8edc7 Enforce ReadWriteOncePod access mode during mount 2021-06-28 21:25:37 -07:00
Jan Safranek
d3dfe124da Update mounter interface in volume manager
Update mounter interface in volume manager's ActualStateOfWorld every time.
Otherwise kubelet uses the first mounter it gets, which may not have the
latest information.

This fixes set up of CSI volumes, which store information about SELinux
support in their `mounter` interface implementation. With each MountVolume()
retry, a new mounter is instantiated and only the final mounter that succeeds
has the right information about whether the volume supports SELinux and can
later return the right attributes on a GetAttributes() call.
2021-06-24 14:11:31 +02:00
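
The difference between keeping the first mounter and updating it every time (commit d3dfe124da) can be shown with a small Go sketch. The `mounter` interface here is a hypothetical one-method stand-in, and `fakeMounter`, `actualState`, `recordKeepFirst` and `recordLatest` are names invented for this illustration; the real volume manager types differ.

```go
package main

import "fmt"

// mounter is a hypothetical, trimmed-down stand-in for a volume mounter.
type mounter interface {
	SupportsSELinux() bool
}

// fakeMounter is an illustrative implementation carrying one flag.
type fakeMounter struct{ selinux bool }

func (m fakeMounter) SupportsSELinux() bool { return m.selinux }

// actualState maps a volume name to the mounter recorded for it.
type actualState struct {
	mounters map[string]mounter
}

func newActualState() *actualState {
	return &actualState{mounters: make(map[string]mounter)}
}

// recordKeepFirst models the old behaviour: later mounters are ignored.
func (s *actualState) recordKeepFirst(volume string, m mounter) {
	if _, exists := s.mounters[volume]; !exists {
		s.mounters[volume] = m
	}
}

// recordLatest models the fix: every call replaces the stored mounter.
func (s *actualState) recordLatest(volume string, m mounter) {
	s.mounters[volume] = m
}

func main() {
	firstAttempt := fakeMounter{selinux: false} // assumed: an earlier attempt lacking the final info
	finalAttempt := fakeMounter{selinux: true}  // assumed: the retry that succeeds

	old := newActualState()
	old.recordKeepFirst("vol", firstAttempt)
	old.recordKeepFirst("vol", finalAttempt)

	fixed := newActualState()
	fixed.recordLatest("vol", firstAttempt)
	fixed.recordLatest("vol", finalAttempt)

	// prints "false true"
	fmt.Println(old.mounters["vol"].SupportsSELinux(), fixed.mounters["vol"].SupportsSELinux())
}
```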
Jan Safranek
d5da73032f Add unit test for DSWP with uncertain volume
desiredStateOfWorldPopulator.findAndRemoveDeletedPods() should remove
volumes from DSW when a pod is deleted on the API server and the volume is
uncertain in ASW.
2021-06-16 18:41:44 +02:00
Jan Safranek
f795b02f4f Refactor dswp unit tests
Change existing desiredStateOfWorldPopulator.findAndAddNewPods tests to use
a common initialization function.
2021-06-16 18:41:43 +02:00
Jan Safranek
2fcb5e9cf7 Add PodRemovedFromVolume
To know when a volume has been fully unmounted (incl. uncertain mounts).
2021-06-16 18:41:41 +02:00
Jan Safranek
ca934b8f5c Add GetPossiblyMountedVolumesForPod to let kubelet know all volumes were unmounted
podVolumesExist() should also consider uncertain volumes (where kubelet
does not know if a volume was fully unmounted) when checking for a pod's
volumes. Added GetPossiblyMountedVolumesForPod for that.

Adding uncertain mounts to GetMountedVolumesForPod would potentially break
other callers (e.g. `verifyVolumesMountedFunc`).
2021-06-16 18:39:12 +02:00
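
Commit ca934b8f5c adds a separate query rather than widening the existing one; a minimal Go sketch shows why. The types and the lower-case methods `mountedVolumes` and `possiblyMountedVolumes` are hypothetical simplifications written for this illustration, not the signatures of GetMountedVolumesForPod or GetPossiblyMountedVolumesForPod.

```go
package main

import (
	"fmt"
	"sort"
)

// mountState is a hypothetical two-value state for a pod's volume mount.
type mountState int

const (
	mounted   mountState = iota // mount is known to have succeeded
	uncertain                   // kubelet does not know whether the volume is mounted
)

// actualState maps pod UID -> volume name -> mount state.
type actualState struct {
	pods map[string]map[string]mountState
}

func newActualState() *actualState {
	return &actualState{pods: make(map[string]map[string]mountState)}
}

// mark records the state of one volume for one pod.
func (a *actualState) mark(pod, volume string, s mountState) {
	if a.pods[pod] == nil {
		a.pods[pod] = make(map[string]mountState)
	}
	a.pods[pod][volume] = s
}

// volumes returns the sorted volume names for a pod, optionally
// including mounts in the uncertain state.
func (a *actualState) volumes(pod string, includeUncertain bool) []string {
	var out []string
	for name, s := range a.pods[pod] {
		if s == mounted || includeUncertain {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

// mountedVolumes keeps the strict meaning existing callers rely on.
func (a *actualState) mountedVolumes(pod string) []string {
	return a.volumes(pod, false)
}

// possiblyMountedVolumes also reports uncertain mounts, for cleanup checks.
func (a *actualState) possiblyMountedVolumes(pod string) []string {
	return a.volumes(pod, true)
}

func main() {
	asw := newActualState()
	asw.mark("pod-1", "data", mounted)
	asw.mark("pod-1", "scratch", uncertain)

	fmt.Println(asw.mountedVolumes("pod-1"))         // prints [data]
	fmt.Println(asw.possiblyMountedVolumes("pod-1")) // prints [data scratch]
}
```

Callers that need to verify a successful mount keep using the strict query, while a check such as "does this pod still have volumes" uses the wider one.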
sanwishe
9e257ec194 Optimize logging format for pkg/kubelet
Signed-off-by: sanwishe <jiang.mingzhi35@zte.com.cn>
2021-05-25 08:52:08 +08:00
andyzhangx
e10d3948f5 fix: azure file namespace issue in csi translation
fix build failure

fix comments
2021-04-20 07:23:09 +00:00
Kubernetes Prow Robot
bacce2eca6 Merge pull request #100215 from pacoxu/fix/data-race
fix a data race in volume reconciler ut #99815
2021-03-24 20:01:29 -07:00
Chris Henzie
f756bd5189 Fix nil ptr dereference in log line 2021-03-22 16:06:51 -07:00
Kubernetes Prow Robot
b5c6434f6b Merge pull request #98850 from yangjunmyfm192085/run-test14
Structured Logging migration: modify volume and container part logs o…
2021-03-17 11:45:19 -07:00
JunYang
01a4e4face Structured Logging migration: modify volume and container part logs of kubelet.
Signed-off-by: JunYang <yang.jun22@zte.com.cn>
2021-03-17 08:59:03 +08:00
pacoxu
0bb911f90a fix two data races in volume reconciler ut 99815 2021-03-14 13:20:34 +08:00
Kubernetes Prow Robot
410d092d8a Merge pull request #99643 from pohly/generic-ephemeral-volume-beta
generic ephemeral volume beta
2021-03-09 17:39:26 -08:00
Kubernetes Prow Robot
c3bed939ff Merge pull request #97659 from chenyw1990/fixedPodTerminatingWithSubpathNotEmpty
don't delete pod from desiredStateOfWorld when pod's sandbox is running
2021-03-09 16:07:26 -08:00
Kubernetes Prow Robot
a56fa34d6b Merge pull request #99942 from jsafrane/refactor-migration-featuregates
Refactor CSI migration plugin manager to get featureGates as a parameter
2021-03-09 04:27:46 -08:00
Patrick Ohly
555d4a12bf generic ephemeral volumes: drop ReadOnly field
As discussed during the alpha review, the ReadOnly field is not really
needed because volume mounts can also be read-only. It's a historical
oddity that can be avoided for generic ephemeral volumes as part
of the promotion to beta.
2021-03-09 08:22:48 +01:00
Kubernetes Prow Robot
f34b63acee Merge pull request #99620 from jingxu97/mar/owner
Add jingxu97 to volumemanager owner
2021-03-08 22:53:31 -08:00
Jan Safranek
219cbc818a Refactor CSI migration plugin manager to get featureGates as a parameter
This allows caller to provide fake ones for testing of various corner cases
(migration on A/D controller disabled while enabled on kubelet).
2021-03-08 13:50:01 +01:00
Krzysztof Gibuła
e46b280f96 Replace klog with testing.T logging in pkg/kubelet tests 2021-03-07 23:10:02 +01:00
chenyw1990
289990db65 don't delete pod from desired state of world when pod's sandbox is running, because volume is the resource of pod. 2021-03-05 11:33:04 +08:00
Patrick Ohly
68370c8aa6 kubelet: more tests for generic ephemeral volumes
This simulates various error scenarios (PVC not created for pod,
feature disabled) and switching between feature disabled and enabled.
2021-03-03 10:13:05 +01:00
Patrick Ohly
edb9a8584c kubelet: better error when generic ephemeral volume is disabled
Silently ignoring the unsupported volume type leads to:

  Warning  FailedMount       8s    kubelet            Unable to attach or mount volumes: unmounted volumes=[my-csi-volume default-token-bsnbz], unattached volumes=[my-csi-volume default-token-bsnbz]: failed to get Plugin from volumeSpec for volume "my-csi-volume" err=no volume plugin matched

The new message is easier to understand:
  Warning  FailedMount       6s (x5 over 49s)  kubelet            Unable to attach or mount volumes: unmounted volumes=[my-csi-volume], unattached volumes=[my-csi-volume default-token-rwlpp]: volume my-csi-volume is a generic ephemeral volume, but that feature is disabled in kubelet
2021-03-03 10:13:05 +01:00
Jing Xu
763c951238 Add jingxu97 to volumemanager owner
add myself as volumemanager owner

Change-Id: I6e6746d6efdfd3374aef6967c1c9b4cd4637cebc
2021-03-01 20:47:40 -08:00
Benjamin Elder
56e092e382 hack/update-bazel.sh 2021-02-28 15:17:29 -08:00
Hemant Kumar
6f9a3374b1 Allow uncertain mount tests to run in parallel 2021-02-18 09:37:43 -05:00
Chris Henzie
9d8f994d4e Separate test Kubelet and AttachDetach VolumeHost types
fakeVolumeHost previously implemented both the KubeletVolumeHost and
AttachDetachVolumeHost interfaces. This design makes it difficult to test the
CSIAttacher since it behaves differently depending on what type of
VolumeHost is supplied.
2020-12-17 15:17:04 -08:00
Shihang Zhang
d2859cd89b plumb service account token down to csi driver 2020-11-12 09:26:43 -08:00
Kubernetes Prow Robot
a9e9cabbea Merge pull request #94676 from JornShen/fix_Test_Run_Positive_VolumeMountControllerAttachEnabledRace_data_trace
Fix flaky unit test Test_Run_Positive_VolumeMountControllerAttachEnabledRace data race
2020-10-27 23:31:56 -07:00
jornshen
b6b462beba Fix flaky unit test Test_Run_Positive_VolumeMountControllerAttachEnabledRace data race
ref: https://github.com/kubernetes/kubernetes/issues/94568
2020-10-19 16:29:15 +08:00
Kubernetes Prow Robot
6ac2930ef0 Merge pull request #94574 from auxten/pkg-kubelet-staticchecks
Fix pkg/kubelet static checks
2020-09-21 21:22:47 -07:00
Srini Brahmaroutu
fbe5daed73 Change code to use staging/k8s.io/mount-utils 2020-09-16 21:51:24 -07:00
auxten
a9c1acc044 Fix staticchecks ST1005,S1002,S1008,S1039 in pkg/kubelet 2020-09-07 10:53:43 +08:00
Jiawei Wang
a6d8e6c5c2 Detect change of volume attachability in the middle of attaching
- Add Unit tests for both volumemanager and attach/detach controller
- Add E2E test
2020-08-24 17:15:11 -07:00
Hemant Kumar
b8c0435bc2 Handle volume-in-use error 2020-07-11 09:02:58 -04:00
Patrick Ohly
ff3e5e06a7 GenericEphemeralVolume: initial implementation
The implementation consists of
- identifying all places where VolumeSource.PersistentVolumeClaim has
  a special meaning and then ensuring that the same code path is taken
  for an ephemeral volume, with the ownership check
- adding a controller that produces the PVCs for each embedded
  VolumeSource.EphemeralVolume
- relaxing the PVC protection controller such that it removes
  the finalizer already before the pod is deleted (only
  if the GenericEphemeralVolume feature is enabled): this is
  needed to break a cycle where foreground deletion of the pod
  blocks on removing the PVC, which waits for deletion of the pod

The controller was derived from the endpointslices controller.
2020-07-09 23:29:24 +02:00
Kubernetes Prow Robot
1e3eeba9fa Merge pull request #91577 from knabben/kubelet-bootstrap
kubelet: remove the --bootstrap-checkpoint-path feature
2020-07-09 00:03:41 -07:00
Amim Knabben
0ed41c3f10 Deprecating --bootstrap-checkpoint-path flag 2020-06-09 15:27:01 -04:00
liuxu
2367569f13 fix emptyDir's sizeLimit not working when the ephemeral-storage limit is not set 2020-05-23 13:36:56 +08:00
Davanum Srinivas
07d88617e5 Run hack/update-vendor.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:33 -04:00
Davanum Srinivas
442a69c3bd switch over k/k to use klog v2
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:27 -04:00