5776 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
182e0989ec
Merge pull request #111646 from alculquicondor/fix_failed_suceeded
Fix JobTrackingWithFinalizers when a pod succeeds after the job fails
2022-08-02 17:45:52 -07:00
Aldo Culquicondor
ca8cebe5ba Fix JobTrackingWithFinalizers when a pod succeeds after the job fails
Change-Id: I3be351fb3b53216948a37b1d58224f8fbbf22b47
2022-08-02 19:33:06 -04:00
Kubernetes Prow Robot
90f9a52db6
Merge pull request #111467 from RomanBednar/retro-sc-assignment
Allow retroactive storage class assignment to PVCs
2022-08-02 15:05:57 -07:00
Kubernetes Prow Robot
369a465fae
Merge pull request #111301 from mattcary/migration-feature
Upgrade CSIMigrationGCE feature gate to GA
2022-08-02 13:58:57 -07:00
Roman Bednar
2f533cd572 add tests for pv controller 2022-08-02 20:52:04 +02:00
Roman Bednar
a0a5aa3680 allow retroactive storage class assignment in pv controller 2022-08-02 20:52:04 +02:00
Matthew Cary
e5d387c5d6 Upgrade CSIMigrationGCE feature gate to GA
Change-Id: I620bc4913765c0d6562eb1008216a72e8b0a2970
2022-08-02 09:14:27 -07:00
Aldo Culquicondor
4188d9b646 Add worker to clean up stale DisruptionTarget condition
Change-Id: I907fbdf01e7ff08d823fb23aa168ff271d8ff1ee
2022-08-02 11:25:01 -04:00
Aldo Culquicondor
dad8454ebb Add clock interface to disruption controller
To be able to write more precise unit tests in the future

Change-Id: I8f45947dfacca501acd856849bd978fad0f735cd
2022-08-02 11:17:29 -04:00
Michal Wozniak
04fcbd721c Introduction of a pod condition type indicating disruption. Its reason field indicates the reason:
- PreemptionByKubeScheduler (Pod preempted by kube-scheduler)
- DeletionByTaintManager (Pod deleted by taint manager due to NoExecute taint)
- EvictionByEvictionAPI (Pod evicted by Eviction API)
- DeletionByPodGC (an orphaned Pod deleted by PodGC)
2022-08-02 11:12:16 +02:00
Kubernetes Prow Robot
1e18ff5b37
Merge pull request #111479 from wongma7/migrationawsga
Promote CSIMigrationAWS to GA
2022-08-01 13:18:29 -07:00
Kubernetes Prow Robot
42b6b2887c
Merge pull request #110888 from likakuli/feature_ignoreeventforgc
feat: ignore all event resource for gc
2022-08-01 12:10:28 -07:00
Matthew Wong
777f43062c Remove unit tests that set & test CSIMigrationAWS false since it's now locked to true 2022-07-29 13:52:06 -07:00
Jakub Przychodzeń
7dd4e89a99 Enable 'running_managed_controllers' for KCM nodeipam controller 2022-07-27 14:30:40 +00:00
Davanum Srinivas
a9593d634c
Generate and format files
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2022-07-26 13:14:05 -04:00
Kubernetes Prow Robot
7156c96e5d
Merge pull request #111194 from ravisantoshgudimetla/promote-maxSurge-ga
Promote DS max surge to GA
2022-07-25 06:20:46 -07:00
Kubernetes Prow Robot
a6afdf45dd
Merge pull request #110359 from MadhavJivrajani/remove-api-call-under-lock
controller/nodelifecycle: Refactor to not make API calls under lock
2022-07-25 06:20:34 -07:00
Madhav Jivrajani
3c0bc26d90 controller/nodelifecycle: Refactor to not make API calls under lock
The evictorLock only protects zonePodEvictor and zoneNoExecuteTainter.
processTaintBaseEviction showed indications of increased lock contention
among goroutines (see issue 110341 for more details).

The refactor done is to ensure that all codepaths in that function that
hold the evictorLock AND make API calls under the lock, are now making
API calls outside the lock and the lock is held only for accessing either
zonePodEvictor or zoneNoExecuteTainter or both.

Two other places where the refactor was done are the doEvictionPass and
doNoExecuteTaintingPass functions which make multiple API calls under
the evictorLock.

Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
2022-07-25 15:16:26 +05:30
Michal Wozniak
2f61b6105c Add integration tests for podgc 2022-07-20 15:17:14 +02:00
Kubernetes Prow Robot
ddeb3ab90b
Merge pull request #111084 from mimowo/retriable-pod-failures-refactor-taint-mngr
Refactor taint_manager to not use getPod and getNode stubs
2022-07-19 06:54:06 -07:00
Kubernetes Prow Robot
9cf4f15884
Merge pull request #110633 from wojtek-t/fix_leaking_goroutines_10
Fix leaking goroutines in multiple integration tests
2022-07-18 21:56:05 -07:00
Kubernetes Prow Robot
1c1efde70d
Merge pull request #109639 from Abirdcfly/fixduplicateimport
cleanup: remove all duplicate import
2022-07-18 16:55:23 -07:00
Ravi Gudimetla
7397c029e8 Promote DS MaxSurge to GA 2022-07-18 07:54:59 -04:00
Kubernetes Prow Robot
3987c8ad91
Merge pull request #111134 from ldsdsy/modify1
Improve the accuracy of output msg in pkg/controller/endpoint/endpoints_controller.go
2022-07-17 23:51:15 -07:00
Kubernetes Prow Robot
e5f4f8d71b
Merge pull request #110896 from ravisantoshgudimetla/promote-minReadySec-sts-update-ga
Promote minReadySeconds to GA
2022-07-14 09:45:09 -07:00
Michal Wozniak
4ec8cf08da This PR refactors taint_manager to eliminate the getPod and getNode stubs. 2022-07-14 18:00:44 +02:00
Kubernetes Prow Robot
27110bd821
Merge pull request #111070 from mimowo/retriable-pod-failures-refactor-gc-controller
Refactor gc_controller to not use the deletePod stub
2022-07-14 06:11:09 -07:00
Wojciech Tyczyński
13e4f2b554 Clean shutdown of volume integration tests 2022-07-14 11:25:57 +02:00
ldsdsy
eacddf9f28 Optimising print information 2022-07-14 15:03:03 +08:00
Abirdcfly
00b9ead02c cleanup: remove duplicate import
Signed-off-by: Abirdcfly <fp544037857@gmail.com>
2022-07-14 11:25:19 +08:00
Ravi Gudimetla
9144250a92 Promote minReadySeconds to GA 2022-07-13 11:37:10 -04:00
Michal Wozniak
778b8300bc fix nits 2022-07-12 10:16:00 +02:00
Michal Wozniak
2730d285cf do not store context 2022-07-12 10:13:47 +02:00
Michal Wozniak
4a3d51359a Refactor GC controller to not use the deletePod stub 2022-07-12 10:13:47 +02:00
Andy Goldstein
a899441484
quota: add an update filter
Fix a TODO to plumb an update filter from above in the resource quota
monitor code that was handling update events for quota-able objects,
instead of hard-coding the logic in the resource quota monitor.

Signed-off-by: Andy Goldstein <andy.goldstein@redhat.com>
2022-07-08 18:39:55 -04:00
Aldo Culquicondor
b492f49c9f Do not skip job requeue in conflict error
Change-Id: Ie97977887a1cc3de58922d73dce92ae1965965bf
2022-07-08 16:14:32 +00:00
Kubernetes Prow Robot
b3be343bc8
Merge pull request #110811 from Abirdcfly/clock
Update golangci-lint to 1.46.2 and fix errors
2022-07-06 16:03:32 -07:00
likakuli
74a3b8f4a9 feat: fix a bug where not all events were ignored by the gc controller
Signed-off-by: likakuli <1154584512@qq.com>
2022-07-04 18:00:54 +08:00
Abirdcfly
2bca77a3d9 Update golangci-lint to 1.46.2 and fix errors
Signed-off-by: Abirdcfly <fp544037857@gmail.com>
2022-06-29 17:42:46 +08:00
Kubernetes Prow Robot
7f920da442
Merge pull request #110827 from Abirdcfly/simple2
cleanup: use append rather than a for loop
2022-06-28 19:58:15 -07:00
Kubernetes Prow Robot
6269784cd0
Merge pull request #109250 from d-honeybadger/fix-cronjob-scheduling-every-syntax
Fix requeueing of cronjobs with every-style schedule
2022-06-28 04:37:57 -07:00
Abirdcfly
8e9a896483
cleanup: use append rather than a for loop
Signed-off-by: Abirdcfly <fp544037857@gmail.com>
2022-06-28 16:31:59 +08:00
Kubernetes Prow Robot
aefb71d7ef
Merge pull request #110721 from jsafrane/fix-force-detach
Don't force detach volume from healthy nodes
2022-06-27 07:49:12 -07:00
Kubernetes Prow Robot
11686e1386
Merge pull request #110771 from alculquicondor/increase_timeout
Wait for cache sync in TestSyncPastDeadlineJobFinished
2022-06-24 13:28:59 -07:00
Aldo Culquicondor
62a25920e6 Wait for cache sync in TestSyncPastDeadlineJobFinished
Change-Id: I6f023ca6999108f4f86a0f57831d47704cdbb42b
2022-06-24 09:22:59 -04:00
Jan Safranek
3b94ac228a Don't force detach volume from healthy nodes
The 6-minute force-detach timeout should be used only for nodes that are not
healthy.

In case a CSI driver is being upgraded or it's simply slow, NodeUnstage
can take more than 6 minutes. In that case, Pod is already deleted from the
API server and thus A/D controller will force-detach a mounted volume,
possibly corrupting the volume and breaking CSI - a CSI driver expects
NodeUnstage to succeed before Kubernetes can call ControllerUnpublish.
2022-06-24 12:51:41 +02:00
Kubernetes Prow Robot
ae3537120b
Merge pull request #110639 from aojea/slice_no_node
EndpointSlice with Pods without an existing Node
2022-06-22 10:43:42 -07:00
Kubernetes Prow Robot
b60978629d
Merge pull request #110700 from alculquicondor/increase_timeout
Increase timeout for TestSyncPastDeadlineJobFinished
2022-06-22 08:23:56 -07:00
Kubernetes Prow Robot
18b5efceda
Merge pull request #110410 from Jiawei0227/master
CSIMigration feature gate to GA
2022-06-22 04:05:48 -07:00
Antonio Ojea
b8ba6ab005 endpointslices: node missing on Pod scenario
When a Pod is referencing a Node that doesn't exist on the local
informer cache, the current behavior was to return an error to
retry later and stop processing.
However, a missing node can leave a Slice stuck: it can neither
reflect other changes nor be created.
Also, this doesn't respect the publishNotReadyAddresses option
on Services, which considers it ok to publish Pod addresses that are
known to not be ready.

The new behavior keeps retrying the problematic Service, but it
keeps processing the updates, reflecting the current state on the
EndpointSlice. If publishNotReadyAddresses is set, a missing
node on a Pod is not treated as an error.
2022-06-22 09:45:16 +02:00
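
The behavior change described in the commit above might look roughly like the sketch below. It is a simplified illustration, not the EndpointSlice reconciler: the pod type, buildEndpoints, and the knownNodes map are hypothetical, and leaving the problematic Pod out of the result is an assumption about what "keeps processing the updates" means.

```go
package main

import "fmt"

// pod is a hypothetical, pared-down view of a Pod.
type pod struct {
	name, nodeName, ip string
}

// buildEndpoints collects Pod addresses for a Service. A Pod whose node
// is missing from the (hypothetical) knownNodes cache no longer aborts
// processing: it is skipped and an error is returned so the caller can
// retry the Service later. With publishNotReady set, the missing node is
// not treated as an error and the Pod is included.
func buildEndpoints(pods []pod, knownNodes map[string]bool, publishNotReady bool) ([]string, error) {
	var endpoints []string
	var firstErr error
	for _, p := range pods {
		if !knownNodes[p.nodeName] && !publishNotReady {
			if firstErr == nil {
				firstErr = fmt.Errorf("node %q not found for pod %q", p.nodeName, p.name)
			}
			continue // keep processing the remaining Pods
		}
		endpoints = append(endpoints, p.ip)
	}
	return endpoints, firstErr
}

func main() {
	pods := []pod{
		{"web-0", "node-1", "10.0.0.1"},
		{"web-1", "node-gone", "10.0.0.2"},
	}
	nodes := map[string]bool{"node-1": true}

	eps, err := buildEndpoints(pods, nodes, false)
	fmt.Println(eps, err)

	eps, err = buildEndpoints(pods, nodes, true)
	fmt.Println(eps, err)
}
```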