Commit Graph

873 Commits

Author SHA1 Message Date
Abirdcfly
00b9ead02c cleanup: remove duplicate import
Signed-off-by: Abirdcfly <fp544037857@gmail.com>
2022-07-14 11:25:19 +08:00
Jan Safranek
3b94ac228a Don't force detach volume from healthy nodes
6 minute force-deatch timeout should be used only for nodes that are not
healthy. 

In case a CSI driver is being upgraded or it's simply slow, NodeUnstage
can take more than 6 minutes. In that case, Pod is already deleted from the
API server and thus A/D controller will force-detach a mounted volume,
possibly corrupting the volume and breaking CSI - a CSI driver expects
NodeUnstage to succeed before Kubernetes can call ControllerUnpublish.
2022-06-24 12:51:41 +02:00
Jiawei Wang
760365d5c9 CSIMigration feature gate to GA 2022-06-06 21:19:19 +00:00
Deepak Kinni
a7b1fcba40 Add or Remove PV deletion protection finalizer considering the recalimPolicy
Signed-off-by: Deepak Kinni <dkinni@vmware.com>
2022-04-07 00:48:35 +05:30
Hemant Kumar
a99466ca86 check existing size before querying new size from api-server 2022-03-28 11:32:49 -04:00
Hemant Kumar
ed217f4140 rename SetVolumeSize to InitializeVolumeSize 2022-03-28 11:32:49 -04:00
Hemant Kumar
7a43406138 Do not update PVC if it already has updated size 2022-03-28 11:32:49 -04:00
Hemant Kumar
e4f62d6c41 Modify code to use new interface functions 2022-03-28 11:32:49 -04:00
Ashutosh Kumar
c00975370a
Handle Non-graceful Node Shutdown (#108486)
Signed-off-by: Ashutosh Kumar <sonasingh46@gmail.com>

Co-authored-by: Ashutosh Kumar <sonasingh46@gmail.com>

Co-authored-by: xing-yang <xingyang105@gmail.com>
2022-03-26 09:23:21 -07:00
Deepak Kinni
836ace46a0 Default enable flag for beta feature HonorPVReclaimPolicy
Signed-off-by: Deepak Kinni <dkinni@vmware.com>
2022-03-26 06:48:28 +05:30
Konstantin Misyutin
438d224f0e Cleanup k8s.io/component-helpers/storage/volume package
Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
2022-03-16 15:43:09 +08:00
Konstantin Misyutin
4ba98a8610 cleanup: remove unnecessary import aliases 2022-03-16 15:43:09 +08:00
Konstantin Misyutin
1d7cefe9c4 Move volume helpers to "k8s.io/component-helpers/storage/volume".
This patch aims to simplify decoupling "pkg/scheduler/framework/plugins"
from internal "k8s.io/kubernetes" packages. More described in
issue #89930 and PR #102953.

Some helpers from "k8s.io/kubernetes/pkg/controller/volume/persistentvolume"
package moved to "k8s.io/component-helpers/storage/volume" package:

- IsDelayBindingMode
- GetBindVolumeToClaim
- IsVolumeBoundToClaim
- FindMatchingVolume
- CheckVolumeModeMismatches
- CheckAccessModes
- GetVolumeNodeAffinity

Also "CheckNodeAffinity" from "k8s.io/kubernetes/pkg/volume/util"
package moved to "k8s.io/component-helpers/storage/volume" package
to prevent diamond dependency conflict.

Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
2022-03-16 15:43:09 +08:00
Deepak Kinni
d37f14d0ae Support for in-tree PV Deletion protection finalizer
Signed-off-by: Deepak Kinni <dkinni@vmware.com>
2022-03-10 21:37:43 -08:00
Kubernetes Prow Robot
66daef4aa7
Merge pull request #108167 from jfremy/fix-107973
Fix nodes volumesAttached status not being updated
2022-03-01 12:49:54 -08:00
Kubernetes Prow Robot
06e107081e
Merge pull request #104732 from mengjiao-liu/remove-flag-experimental-check-node-capabilities-before-mount
kubelet: Remove the deprecated flag `--experimental-check-node-capabilities-before-mount`
2022-02-24 07:56:30 -08:00
Jean-Francois Remy
e83184568d Add unit tests
- actual_state_of_world_test.go: test the new method GetVolumesToReportAttachedForNode
  for an existing node and a non-existing node
- node_status_updater_test.go: test UpdateNodeStatuses and UpdateNodeStatuses in nominal
  case with 2 nodes getting one volume each. Test UpdateNodeStatuses with the first call
  to node.patch failing but the following one succeeding
- add comment in node_status_updater.go
- fix log line in reconciler.go
- rename variable in actual_state_of_world.go
2022-02-22 12:21:58 -08:00
Jean-Francois Remy
f1717baaaa Fix nodes volumesAttached status not updated
The UpdateNodeStatuses code stops too early in case there is
an error when calling updateNodeStatus. It will return immediately
which means any remaining node won't have its update status put back
to true.

Looking at the call sites for UpdateNodeStatuses, it appears this is
not the only issue. If the lister call fails with anything but a Not Found
error, it's silently ignored which is wrong in the detach path.
Also the reconciler detach path calls UpdateNodeStatuses but the real intent
is to only update the node currently processed in the loop and not proceed
with the detach call if there is an error updating that specifi node volumesAttached
property. With the current implementation, it will not proceed if there is
an error updating another node (which is not completely bad but not ideal) and
worse it will proceed if there is a lister error on that node which means the
node volumesAttached property won't have been updated.

To fix those issues, introduce the following changes:
- [node_status_updater] introduce UpdateNodeStatusForNode which does what
  UpdateNodeStatuses does but only for the provided node
- [node_status_updater] if the node lister call fails for anything but a Not
  Found error, we will return an error, not ignore it
- [node_status_updater] if the update of a node volumesAttached properties fails
  we continue processing the other nodes
- [actual_state_of_world] introduce GetVolumesToReportAttachedForNode which
  does what GetVolumesToReportAttached but for the node whose name is provided
  it returns a bool which indicates if the node in question needs an update as
  well as the volumesAttached list. It is used by UpdateNodeStatusForNode
- [actual_state_of_world] use write lock in updateNodeStatusUpdateNeeded, we're
  modifying the map content
- [reconciler] use UpdateNodeStatusForNode in the detach loop
2022-02-22 12:20:53 -08:00
yujunwang
8f96600907 perf:logic-optimiz-for-DetermineVolumeAction 2022-01-22 23:45:29 +08:00
Kubernetes Prow Robot
184daed0db
Merge pull request #107559 from liggitt/invalid-selectors
Handle invalid selectors properly
2022-01-19 14:49:31 -08:00
Jordan Liggitt
c0af728f43 Handle invalid selectors properly 2022-01-14 12:11:02 -05:00
Wojciech Tyczyński
551790729f Remove selflink references in different testing-related files 2022-01-14 12:58:05 +01:00
Patrick Ohly
9eaa2dc554 avoid klog Info calls without verbosity
In the following code pattern, the log message will get logged with v=0 in JSON
output although conceptually it has a higher verbosity:

   if klog.V(5).Enabled() {
       klog.Info("hello world")
   }

Having the actual verbosity in the JSON output is relevant, for example for
filtering out only the important info messages. The solution is to use
klog.V(5).Info or something similar.

Whether the outer if is necessary at all depends on how complex the parameters
are. The return value of klog.V can be captured in a variable and be used
multiple times to avoid the overhead for that function call and to avoid
repeating the verbosity level.
2022-01-12 07:48:36 +01:00
Mengjiao Liu
beda4cafb6 kubelet: Remove the deprecated flag --experimental-check-node-capabilities-before-mount 2022-01-06 11:47:11 +08:00
Kubernetes Prow Robot
abfe397ceb
Merge pull request #107166 from jsafrane/fix-pv-controller-tests
Fix PV controller unit test 5-7
2022-01-04 23:03:27 -08:00
Jan Safranek
0f9832d095 Fix PV name in unit test
Test 5-5 should use PV with "5-5"i in the name. It makes log analysys
much easier.
2022-01-03 15:20:12 +01:00
Kubernetes Prow Robot
f0dbc32ed9
Merge pull request #106853 from gnufied/disable-exp-backoff-volume-not-inuse
When volume is not marked in-use, do not backoff
2021-12-22 19:46:37 -08:00
Jan Safranek
045ca75c03 Deflake PV metrics unit test
Test 5-7 tries to delete a PVC at the very same time when it detects that
the PV controller started processing the PVC. The controller then sometimes
can't update the PVC and generate an event for it that the test expects.

From PV controller logs (not shown in CI):
> I1221 14:36:34.548160  104481 pv_controller.go:815] updating PersistentVolumeClaim[default/claim5-7] status: set phase Lost failed: cannot update claim claim5-7: claim not found

Typical error in CI:
> FAIL: TestControllerSync (83.22s)
> framework_test.go:202: Event "Warning ClaimLost" not emitted

Therefore wait for the PVC to be fully processed before deleting the PVC to
avoid races.
2021-12-21 15:36:44 +01:00
Jan Safranek
e323306ce0 Add new watchers to PV controller tests
Add fake Pod and Node watchers to the tests. It only reduces test noise:

Failed to watch *v1.Pod: unhandled watch: testing.WatchActionImpl{ActionImpl:testing.ActionImpl{Namespace:"", Verb:"watch", Resource:schema.GroupVersionResource{Group:"", Version:"v1", Resource:"pods"}, Subresource:""}, WatchRestrictions:testing.WatchRestrictions{Labels:labels.internalSelector(nil), Fields:fields.andTerm{}, ResourceVersion:""}}
2021-12-21 15:36:34 +01:00
Hemant Kumar
7989f27044 use node informer to check volumes attachment status before backoff
fix unit tests
2021-12-20 11:57:05 -05:00
Davanum Srinivas
9405e9b55e
Check in OWNERS modified by update-yamlfmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-09 21:31:26 -05:00
Kubernetes Prow Robot
12901b95c9
Merge pull request #106344 from ikeeip/fix_import_formatting
Fix golang imports in k8s.io/pkg/controller/volume/persistentvolume package
2021-12-07 17:26:40 -08:00
Kubernetes Prow Robot
6805e6ee41
Merge pull request #104722 from leiyiz/migration
turning on the CSIMigrationGCE feature flag
2021-11-16 15:28:32 -08:00
Kubernetes Prow Robot
f151a40d8d
Merge pull request #106154 from gnufied/recover-expansion-failure-123
Recover expansion failure
2021-11-16 13:21:34 -08:00
Léiyì Zhang
275fdf0884 fixing unit test failures induced by turning on CSIMigrationGCE
disable CSIMigrationGCE in some unit tests
2021-11-16 19:26:30 +00:00
Hemant Kumar
1ddd598d31 Implement controller and kubelet changes for recovery from resize
failures
2021-11-16 11:06:46 -05:00
Kubernetes Prow Robot
ce98eda406
Merge pull request #106376 from jsafrane/stabilize-unit-test
Fix deletion protection unit test
2021-11-15 13:04:48 -08:00
Neha Lohia
fa1b6765d5
move pkg/util/node to component-helpers/node/util (#105347)
Signed-off-by: Neha Lohia <nehapithadiya444@gmail.com>
2021-11-12 07:52:27 -08:00
Jan Safranek
bb8157d780 Fix deletion protection unit test
The test should not depend on current set of default feature gates, it
should always ensure the ones necessary for the tests are set.
2021-11-12 10:47:15 +01:00
Konstantin Misyutin
7434fdf1d4 fix import formatting in k8s.io/pkg/controller/volume/persistentvolume package
Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
2021-11-11 16:31:38 +08:00
Deepak Kinni
bfd5f23a0b PV controller changes to support PV Deletion protection finalizer
Signed-off-by: Deepak Kinni <dkinni@vmware.com>
2021-11-08 10:35:58 -08:00
Konstantin Misyutin
808c8f42d5 Remove StorageObjectInUseProtection feature gate logic
This feature has graduated to GA in v1.11 and will always be
enabled. So no longe need to check if enabled.

Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
2021-11-03 00:13:50 +03:00
Mike Dame
4960d0976a Wire contexts to Core controllers 2021-11-01 10:29:00 -04:00
Kubernetes Prow Robot
c592bd40f2
Merge pull request #105609 from pohly/generic-ephemeral-volume-ga
generic ephemeral volume GA
2021-10-28 17:36:50 -07:00
Konstantin Misyutin
dbc9d7b71a Remove tests when StorageObjectInUseProtection feature is disabled
As well as feature gate are locked, the tests when this feature is
disabled will crash. So we should remove them together with locking
the feature.

Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
2021-10-15 19:39:37 +08:00
Kubernetes Prow Robot
baaa53db64
Merge pull request #105211 from xiaopingrubyist/fix-pv-controller-claim-cache-issue
fix:claim cached in pvcontroller is not the newest may cause unexpected issue
2021-10-14 05:47:18 -07:00
torubylist
f28a8d7f2b fix:cached claim is not the newest will cause unexpected issue 2021-10-13 20:03:00 +08:00
Patrick Ohly
a8c930ef46 generic ephemeral volume: graduation to GA
The feature gate gets locked to "true", with the goal to remove it in two
releases.

All code now can assume that the feature is enabled. Tests for "feature
disabled" are no longer needed and get removed.

Some code wasn't using the new helper functions yet. That gets changed while
touching those lines.
2021-10-11 20:54:20 +02:00
Kubernetes Prow Robot
b0eac84937
Merge pull request #105345 from pohly/generic-ephemeral-volume-util
generic ephemeral volume util, base code and controller
2021-10-07 08:19:47 -07:00
Patrick Ohly
4ae0eecb34 controller: use generic ephemeral volume helper functions
The name concatenation and ownership check were originally considered small
enough to not warrant dedicated functions, but the intent of the code is more
readable with them.

There also was a missing owner check in the attach controller.
2021-10-06 14:01:44 +02:00