Commit Graph

187 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
27e23bad7d
Merge pull request #116529 from pohly/controllers-with-name
kube-controller-manager: convert to structured logging
2023-03-14 14:12:55 -07:00
Ziqi Zhao
d1aa73312c
pkg/controller/util support contextual logging (#115049)
Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
2023-03-14 12:38:14 -07:00
Patrick Ohly
99151c39b7 kube-controller-manager: convert to structured logging
Most of the individual controllers were already converted earlier. Some log
calls were missed or added and then not updated during a rebase. Some of those
get updated here to fill those gaps.

Adding of the name to the logger used by each controller gets
consolidated in this commit. By using the name under which the
controller is registered we ensure that the names in the log
are consistent.
2023-03-14 19:16:32 +01:00
Damien Grisonnet
ac394c5c19 Cleanup deprecated metrics
Remove the following deprecated metrics:
- node_collector_evictions_number
- scheduler_e2e_scheduling_duration_seconds

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2023-03-13 22:55:34 +01:00
Andrea Tosatto
cae19f9e85 Remove deprecated pod-eviction-timeout flag from controller-manager 2023-03-07 18:14:18 +00:00
kerthcet
e5c812bbe7 Remove CLI flag enable-taint-manager
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-03-07 18:11:49 +00:00
Kubernetes Prow Robot
37326f7cea
Merge pull request #112670 from yangjunmyfm192085/delklogV0
use contextual logging(nodeipam and nodelifecycle part)
2023-03-07 09:40:33 -08:00
JunYang
780ef3afb0 use klog.InfoS instead of klog.V(0),Info 2023-03-07 15:50:01 +08:00
Claudiu Belu
5ba74c81ca unit tests: Skip flaky tests on Windows
Some of the unit tests are currently flaky on Windows. This commit
skips them until they are resolved.
2023-03-06 20:46:05 +00:00
binacs
84ff621309 cleanup(controller): use IsSuperset to avoid interim slice 2023-02-19 21:49:58 +08:00
JunYang
29086e2b04 use klog instead of klog.V(0) 2023-01-14 21:15:50 +08:00
Kubernetes Prow Robot
1b8692ce46
Merge pull request #114296 from cbroglie/concurrent-monitor-node-health
controller/nodelifecycle: Make monitorNodeHealth process nodes concurrently
2023-01-12 12:42:54 -08:00
Christopher Broglie
3c88de52c8 controller/nodelifecycle: Make monitorNodeHealth process nodes concurrently
Marking the pods not ready on a node requires looping over them and
updating each pod's status one at a time. This is performed serially,
and can take a while if we're processing each node serially as well.

Since the time is spent waiting on io, there's an opportunity to go
faster by processing multiple nodes concurrently. This change modifies
the loop to process nodes in parallel, using the same number of workers
as doNodeProcessingPassWorker.

This change also introduces histogram metrics to better observe
monitorNodeHealth.
2023-01-11 12:34:39 -08:00
ialidzhikov
aede3fbf40 pkg/controller: Replace deprecated func usage from the k8s.io/utils/pointer pkg 2022-11-23 17:40:23 +02:00
Michal Wozniak
a910ca563b Fix race conditions 2022-11-14 10:11:26 +01:00
Michal Wozniak
3b5c3acd61 Improve stability if the taint_manager tests 2022-11-13 19:40:18 +01:00
Michal Wozniak
c803892bd8 Enable the feature into beta 2022-11-09 09:02:40 +01:00
Michal Wozniak
fea883687f SSA to add pod failure conditions - ready for review 2022-10-27 18:21:33 +02:00
Jakub Przychodzeń
de25c5fdcf NodeLifecycleController: Remove race condition
Patch request does not support RV by default, we need to include them explicitly and patching lists actually overwrites whole field. It means that there is a race condition, in which we can overwrite changes to taints that happened between GET and PATCH requests.
2022-10-24 19:36:58 +00:00
Han Kang
2bbd445f50 remove rate limiter metric as it is not in use
Change-Id: I91157653e3860eeecc3f572aee88da6ffc65faed
2022-10-13 13:07:11 -07:00
Michal Wozniak
bb561e0324 Fix controller policy and improve logging of related errors
Improve error logging from timed workers which are used for pod eviction

Co-authored-by: Aldo Culquicondor <1299064+alculquicondor@users.noreply.github.com>
2022-09-23 16:53:32 +02:00
kerthcet
b4277e7ce4 Fix potential goroutine leakages in taint manager tests
Signed-off-by: kerthcet <kerthcet@gmail.com>
2022-08-04 00:00:48 +08:00
Michal Wozniak
04fcbd721c Introduction of a pod condition type indicating disruption. Its reason field indicates the reason:
- PreemptionByKubeScheduler (Pod preempted by kube-scheduler)
- DeletionByTaintManager (Pod deleted by taint manager due to NoExecute taint)
- EvictionByEvictionAPI (Pod evicted by Eviction API)
- DeletionByPodGC (an orphaned Pod deleted by PodGC)PreemptedByScheduler (Pod preempted by kube-scheduler)
2022-08-02 11:12:16 +02:00
Davanum Srinivas
a9593d634c
Generate and format files
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2022-07-26 13:14:05 -04:00
Kubernetes Prow Robot
a6afdf45dd
Merge pull request #110359 from MadhavJivrajani/remove-api-call-under-lock
controller/nodelifecycle: Refactor to not make API calls under lock
2022-07-25 06:20:34 -07:00
Madhav Jivrajani
3c0bc26d90 controller/nodelifecycle: Refactor to not make API calls under lock
The evictorLock only protects zonePodEvictor and zoneNoExecuteTainter.
processTaintBaseEviction showed indications of increased lock contention
among goroutines (see issue 110341 for more details).

The refactor done is to ensure that all codepaths in that function that
hold the evictorLock AND make API calls under the lock, are now making
API calls outside the lock and the lock is held only for accessing either
zonePodEvictor or zoneNoExecuteTainter or both.

Two other places where the refactor was done is the doEvictionPass and
doNoExecuteTaintingPass functions which make multiple API calls under
the evictorLock.

Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
2022-07-25 15:16:26 +05:30
Michal Wozniak
4ec8cf08da This PR refactors taint_manager to eliminate the getPod and getNode stubs. 2022-07-14 18:00:44 +02:00
Wojciech Tyczyński
006ff4510b Clean shutdown of nodecontroller integration tests 2022-06-06 20:33:20 +02:00
Kubernetes Prow Robot
fcffb6de7e
Merge pull request #108089 from mysunshine92/nodeController-20220212
fix typo for nodelifecycle controller
2022-05-07 03:33:18 -07:00
Kubernetes Prow Robot
f979b4094e
Merge pull request #99292 from yangjunmyfm192085/run-test23
replace all occurrences of "node", nodeName to "node", klog.KRef("", nodeName)
2022-03-17 12:22:42 -07:00
wangyamei
3808e193d9 fix typo for nodelifecycle controller 2022-02-28 14:30:28 +08:00
Kubernetes Prow Robot
3bd422dc76
Merge pull request #107293 from dims/jan-1-owners-cleanup
Cleanup OWNERS files - Jan 2021 Week 1
2022-01-13 10:30:30 -08:00
Patrick Ohly
9eaa2dc554 avoid klog Info calls without verbosity
In the following code pattern, the log message will get logged with v=0 in JSON
output although conceptually it has a higher verbosity:

   if klog.V(5).Enabled() {
       klog.Info("hello world")
   }

Having the actual verbosity in the JSON output is relevant, for example for
filtering out only the important info messages. The solution is to use
klog.V(5).Info or something similar.

Whether the outer if is necessary at all depends on how complex the parameters
are. The return value of klog.V can be captured in a variable and be used
multiple times to avoid the overhead for that function call and to avoid
repeating the verbosity level.
2022-01-12 07:48:36 +01:00
Davanum Srinivas
9682b7248f
OWNERS cleanup - Jan 2021 Week 1
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2022-01-10 08:14:29 -05:00
JUN YANG
13fa2b1949 replace all occurrences of "node", nodeName to "node", klog.KRef("", nodeName) 2022-01-06 21:19:29 +08:00
Kubernetes Prow Robot
1426587e08
Merge pull request #106436 from dims/cleanup-owners-files-no-activity-in-a-year
Cleanup OWNERS files (No Activity in the last year)
2021-12-15 12:07:51 -08:00
Davanum Srinivas
497e9c1971
Cleanup OWNERS files (No Activity in the last year)
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-15 10:34:02 -05:00
Kubernetes Prow Robot
ba200841fd
Merge pull request #106366 from cyclinder/evictions_number_stable
adding evictions_total metric and marking evictions_number deprecated
2021-12-12 23:19:59 -08:00
cyclinder
b88b51c6e5 adding evictions_total metric and marking evictions_number deprecated
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-13 10:36:02 +08:00
Davanum Srinivas
9405e9b55e
Check in OWNERS modified by update-yamlfmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-09 21:31:26 -05:00
Neha Lohia
fa1b6765d5
move pkg/util/node to component-helpers/node/util (#105347)
Signed-off-by: Neha Lohia <nehapithadiya444@gmail.com>
2021-11-12 07:52:27 -08:00
Mike Dame
4960d0976a Wire contexts to Core controllers 2021-11-01 10:29:00 -04:00
Kubernetes Prow Robot
519b164db1
Merge pull request #105222 from cyclinder/remove_node_lease_GA
remove nodeLease feature GA
2021-10-05 05:41:21 -07:00
Author cyclinder
e61b901628 remove nodeLease feature GA
Signed-off-by: cyclinder <qifeng.guo@daocloud.io>
2021-10-05 12:23:27 +08:00
haoyun
8d9fa1decc fix: use k8s.io/utils/clock instead
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-10-02 00:08:53 +08:00
wojtekt
e233feb99b Migrate to k8s.io/utils/clock in pkg/controller 2021-09-10 11:42:32 +02:00
Stephen Augustus
481cf6fbe7
generated: Run hack/update-gofmt.sh
Signed-off-by: Stephen Augustus <foo@auggie.dev>
2021-08-24 15:47:49 -04:00
Kubernetes Prow Robot
9067b5691d
Merge pull request #97818 from damemi/remove-util-node-dep
Scheduler: remove direct dependency for k8s.io/kubernetes/pkg/util/node
2021-03-05 04:06:32 -08:00
Mike Dame
af045087d9 Move GetZoneKey function to component-helpers 2021-03-04 10:32:38 -05:00
Sergey Kanzhelev
d24ed4ace1 sped up scheduler tests by using fake clock 2021-03-03 19:12:53 +00:00