Patrick Ohly
9eaa2dc554
avoid klog Info calls without verbosity
...
In the following code pattern, the log message will get logged with v=0 in JSON
output although conceptually it has a higher verbosity:
    if klog.V(5).Enabled() {
        klog.Info("hello world")
    }
Having the actual verbosity in the JSON output is relevant, for example for
filtering out only the important info messages. The solution is to use
klog.V(5).Info or something similar.
Whether the outer if is necessary at all depends on how complex the parameters
are. The return value of klog.V can be captured in a variable and be used
multiple times to avoid the overhead for that function call and to avoid
repeating the verbosity level.
2022-01-12 07:48:36 +01:00
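The pattern described in 9eaa2dc554 can be illustrated with a small, stdlib-only sketch. The `logger`, `verbose`, and `entry` types below are hypothetical stand-ins that only mimic the shape of `klog.V(...)`; they are not klog's implementation, and the recorded `V` field stands in for the verbosity a JSON backend would emit.

```go
package main

import "fmt"

// entry is what a structured (JSON-like) backend would record per message.
type entry struct {
	V   int
	Msg string
}

// logger is a hypothetical stand-in for klog, used only for illustration.
type logger struct {
	threshold int
	entries   []entry
}

// verbose mimics the shape of the value returned by klog.V.
type verbose struct {
	l       *logger
	level   int
	enabled bool
}

// V returns a handle that remembers the requested verbosity level.
func (l *logger) V(level int) verbose {
	return verbose{l: l, level: level, enabled: level <= l.threshold}
}

// Enabled reports whether messages at this level would be emitted.
func (v verbose) Enabled() bool { return v.enabled }

// Info records the message together with the handle's verbosity.
func (v verbose) Info(msg string) {
	if v.enabled {
		v.l.entries = append(v.l.entries, entry{V: v.level, Msg: msg})
	}
}

// Info on the logger itself carries no verbosity, so it records v=0.
func (l *logger) Info(msg string) {
	l.entries = append(l.entries, entry{V: 0, Msg: msg})
}

// demo logs the same message with both patterns and returns the entries.
func demo() []entry {
	l := &logger{threshold: 5}

	// Problematic pattern: the check uses level 5, the entry says v=0.
	if l.V(5).Enabled() {
		l.Info("hello world")
	}

	// Preferred pattern: capture the handle once and log through it.
	if v := l.V(5); v.Enabled() {
		v.Info("hello world")
	}
	return l.entries
}

func main() {
	for _, e := range demo() {
		fmt.Printf("v=%d msg=%q\n", e.V, e.Msg)
	}
	// Prints:
	// v=0 msg="hello world"
	// v=5 msg="hello world"
}
```

Capturing the handle in a variable keeps the level in one place, which is the point the commit message makes about avoiding repetition.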
Kubernetes Prow Robot
b5103f6117
Merge pull request #107426 from yanghesong/remove_validate_runtime
...
Remove runtime in validate
2022-01-11 20:50:36 -08:00
Kubernetes Prow Robot
cadbe8dfb5
Merge pull request #107250 from cndoit18/use-errors
...
cleanup(kubelet): use errors.Is(err, os.ErrProcessDone)
2022-01-11 10:49:01 -08:00
Kubernetes Prow Robot
19069665f9
Merge pull request #107094 from adisky/d-container-runtime
...
Mark container-runtime kubelet flag as deprecated
2022-01-11 10:48:46 -08:00
Kubernetes Prow Robot
7eb5046064
Merge pull request #106470 from qmloong/qmloong/fix
...
fix: some typos and syncPod outdated workflow annotation
2022-01-11 10:48:38 -08:00
Kubernetes Prow Robot
5f4914604d
Merge pull request #106353 from gjkim42/remove-false-pleg-errors
...
kubelet: Remove false PLEG errors
2022-01-11 10:48:26 -08:00
Kubernetes Prow Robot
a0dfd958d5
Merge pull request #107163 from cyclinder/fix_leak_goroutine
...
fix goroutine leaks in TestConfigurationChannels
2022-01-10 17:23:16 -08:00
cyclinder
928e686877
fix goroutine leaks in TestConfigurationChannels
...
Signed-off-by: cyclinder <qifeng.guo@daocloud.io>
2022-01-10 19:51:16 +08:00
yanghesong
6905fef761
Remove runtime in validate
...
Validate is useless as dockershim is removed
Signed-off-by: yanghesong <hesong.yang@foxmail.com>
2022-01-09 09:11:49 +08:00
wq
4f38d4aaa1
fix a typo in the comment of ImageCredentialProviderConfigFile
2022-01-09 00:07:43 +09:00
Kubernetes Prow Robot
d1a5513cb0
Merge pull request #107006 from gnufied/add-total-mount-time-metrics
...
Add metric for reporting total end-to-end mount time
2022-01-07 06:19:31 -08:00
Kubernetes Prow Robot
09fccc3533
Merge pull request #106796 from jonyhy96/fix-timer
...
kubelet: use newtimer instead in nodeshutdown manager
2022-01-06 11:47:12 -08:00
Kubernetes Prow Robot
03ee86c09c
Merge pull request #104837 from eggiter/fix-release-reused-cpus
...
fix(cpumanager): Do not release CPUs of init containers while they are being reused in app containers
2022-01-06 11:46:38 -08:00
Kubernetes Prow Robot
0b9ad84973
Merge pull request #107116 from yxxhero/add_more_msg_for_no_podsandbox_container
...
add more message for no PodSandbox container
2022-01-06 08:58:09 -08:00
Kubernetes Prow Robot
b457ae72f5
Merge pull request #106644 from ahrtr/add_info_counter_perfcounter
...
Add more info when failing to call PdhAddEnglishCounter
2022-01-06 06:45:01 -08:00
Aditi Sharma
e03d7d3fdd
Mark container-runtime flag as deprecated
...
Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
2022-01-06 10:23:03 +05:30
Kubernetes Prow Robot
73b68f5233
Merge pull request #106979 from a2ush/fix_typo
...
Fix comment typo (from resolve.conf to resolv.conf) and change the constant name (from maxResolveConfLength to maxResolvConfLength)
2022-01-05 16:08:26 -08:00
Kubernetes Prow Robot
afd254a18f
Merge pull request #106756 from victory460/feature_helpers
...
code cleanup for container/helpers.go
2022-01-05 08:20:42 -08:00
Kubernetes Prow Robot
19591a1324
Merge pull request #105829 from yuanchen8911/master
...
Fix and improve comments on kubelet metrics
2022-01-04 23:02:32 -08:00
Kubernetes Prow Robot
abfbbe4dda
Merge pull request #107119 from hakman/remove_dockerless
...
Remove dockerless build tag and DockerLegacyService interface
2022-01-04 11:27:21 -08:00
cndoit18
601d02b90f
refactor(kubelet): use errors.Is(err, os.ErrProcessDone)
...
use errors.Is(err, os.ErrProcessDone) here and remove "process already finished" string comparison.
Signed-off-by: cndoit18 <cndoit18@outlook.com>
2021-12-29 18:10:06 +08:00
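As a rough illustration of the change in 601d02b90f, the sketch below contrasts `errors.Is(err, os.ErrProcessDone)` with a string comparison. The helper names and the exact form of the older comparison are assumptions made for this example, not the kubelet code itself.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// isProcessDone reports whether err means the process has already exited.
// errors.Is also matches when the sentinel has been wrapped with %w.
func isProcessDone(err error) bool {
	return errors.Is(err, os.ErrProcessDone)
}

// isProcessDoneByString is an assumed form of the older, brittle check:
// it only matches the exact, unwrapped message text.
func isProcessDoneByString(err error) bool {
	return err != nil && err.Error() == "os: process already finished"
}

// wrappedDone is a hypothetical error, as a caller might return it after
// adding context to os.ErrProcessDone.
func wrappedDone() error {
	return fmt.Errorf("failed to signal pid %d: %w", 1234, os.ErrProcessDone)
}

func main() {
	err := wrappedDone()
	fmt.Printf("errors.Is: %v, string compare: %v\n",
		isProcessDone(err), isProcessDoneByString(err))
	// Prints: errors.Is: true, string compare: false
}
```

The sentinel comparison keeps working when intermediate layers add context, which a literal message match does not.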
Kubernetes Prow Robot
f0dbc32ed9
Merge pull request #106853 from gnufied/disable-exp-backoff-volume-not-inuse
...
When volume is not marked in-use, do not backoff
2021-12-22 19:46:37 -08:00
Hemant Kumar
7989f27044
use node informer to check volumes attachment status before backoff
...
fix unit tests
2021-12-20 11:57:05 -05:00
Ciprian Hacman
5bae9b9288
Clean up DockerLegacyService interface
...
Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>
2021-12-18 12:24:54 +02:00
Ciprian Hacman
6cdb1c225d
Clean up dockerless build tag
...
Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>
2021-12-18 12:18:25 +02:00
yxxhero
a90b149be0
add more message for no PodSandbox container
...
Signed-off-by: yxxhero <aiopsclub@163.com>
2021-12-18 09:52:03 +08:00
Davanum Srinivas
497e9c1971
Cleanup OWNERS files (No Activity in the last year)
...
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-15 10:34:02 -05:00
a2ush
393dec26f6
Change the name of the constant
2021-12-14 22:42:57 +09:00
Hemant Kumar
55b5e6dc33
Add metric for reporting total end-to-end mount time
...
This metric includes time spent in waiting for devices to be attached,
any RPC calls and performing recursive chown etc.
2021-12-13 16:23:01 -05:00
a2ush
d775483381
Fix comment typo
2021-12-11 22:27:38 +09:00
Kubernetes Prow Robot
1d66302c42
Merge pull request #106458 from dims/lint-yaml-in-owners-files
...
Lint/Beautify yaml in OWNERS files
2021-12-10 06:39:12 -08:00
Kubernetes Prow Robot
1b0d83f1d6
Merge pull request #106599 from klueska/fix-numa-bug
...
Fix Bugs in CPUManager distribute NUMA policy option
2021-12-10 04:41:12 -08:00
haoyun
92fa957dd1
feat: use clock instead
...
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-12-10 13:59:12 +08:00
Kubernetes Prow Robot
15e5f2a19a
Merge pull request #106291 from sbs2001/fix_invalid_comment
...
Remove invalid comment in legacyregistry
2021-12-09 19:03:10 -08:00
Davanum Srinivas
9405e9b55e
Check in OWNERS modified by update-yamlfmt.sh
...
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-09 21:31:26 -05:00
David Porter
95264a418d
kubelet: set failed phase during graceful shutdown
...
Revert to previous behavior in 1.21/1.20 of setting pod phase to failed
during graceful node shutdown.
Setting pods to failed phase will ensure that external controllers that
manage pods like deployments will create new pods to replace those that
are shut down. Many customers have taken a dependency on this behavior
and it was a breaking change in 1.22, so this change reverts to the
previous behavior.
Signed-off-by: David Porter <david@porter.me>
2021-12-09 13:17:40 -08:00
Kubernetes Prow Robot
cdf3ad823a
Merge pull request #97252 from dims/drop-dockershim
...
Completely remove in-tree dockershim from kubelet
2021-12-08 12:51:46 -08:00
Kubernetes Prow Robot
f356ae4ad9
Merge pull request #101719 from SergeyKanzhelev/removeReallyCrashForTesting
...
Remove ReallyCrashForTesting and cleaned up some references to Handle…
2021-12-07 23:39:45 -08:00
Kubernetes Prow Robot
b685b3982d
Merge pull request #105360 from shuheiktgw/refactor_kubelet_config_validation_tests
...
Refactor kubelet config validation tests
2021-12-07 17:25:43 -08:00
Davanum Srinivas
bc78dff42e
update files to drop dockershim
...
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-07 15:15:13 -05:00
Davanum Srinivas
83265c9171
drop files deleted from pkg/kubelet/dockershim
...
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-07 15:15:13 -05:00
Hemant Kumar
5b7b2e2f6c
When volume is not marked in-use, do not backoff
2021-12-07 11:50:15 -05:00
Sascha Grunert
a063a2ba3e
Revert dockershim CRI v1 changes
...
We should not touch the dockershim ahead of removal and therefore
default to `v1alpha2` CRI instead of `v1`.
Partially reverts changes from https://github.com/kubernetes/kubernetes/pull/106501
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2021-12-03 18:37:11 +01:00
xuweiwei
21238c2593
code cleanup for container/helpers.go
2021-12-01 11:17:33 +08:00
Sergey Kanzhelev
a11453efbc
remove ReallyCrashForTesting and cleaned up some references to HandleCrash behavior
2021-11-29 20:00:10 +00:00
menglong.qi
12eff56460
fix: syncPod outdated workflow comment
2021-11-28 17:21:29 +08:00
Kevin Klues
f8511877e2
Add regression test for CPUManager distribute NUMA algorithm
...
We witnessed this exact allocation attempt in a live cluster and witnessed the
algorithm fail with an accounting error. This test was added to verify that
this case is now handled by the updates to the algorithm and that we don't
regress from it in the future.
"test" description="ensure previous failure encountered on live machine has been fixed (1/1)"
"combo remainderSet balance" combo=[2 4 6] remainderSet=[2 4 6] distribution=9 remainder=1 available=[14 2 4 4 0 3 4 1] balance=4.031
"combo remainderSet balance" combo=[2 4 6] remainderSet=[2 4] distribution=9 remainder=1 available=[0 3 4 1 14 2 4 4] balance=4.031
"combo remainderSet balance" combo=[2 4 6] remainderSet=[2 6] distribution=9 remainder=1 available=[1 14 2 4 4 0 3 4] balance=4.031
"combo remainderSet balance" combo=[2 4 6] remainderSet=[4 6] distribution=9 remainder=1 available=[1 3 4 0 14 2 4 4] balance=4.031
"combo remainderSet balance" combo=[2 4 6] remainderSet=[2] distribution=9 remainder=1 available=[4 0 3 4 1 14 2 4] balance=4.031
"combo remainderSet balance" combo=[2 4 6] remainderSet=[4] distribution=9 remainder=1 available=[3 4 0 14 2 4 4 1] balance=4.031
"combo remainderSet balance" combo=[2 4 6] remainderSet=[6] distribution=9 remainder=1 available=[1 13 2 4 4 1 3 4] balance=3.606
"bestCombo found" distribution=9 bestCombo=[2 4 6] bestRemainder=[6]
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 20:49:58 +00:00
Kevin Klues
e284c74d93
Add unit test for CPUManager distribute NUMA algorithm verifying fixes
...
Before Change:
"test" description="ensure bestRemainder chosen with NUMA nodes that have enough CPUs to satisfy the request"
"combo remainderSet balance" combo=[0 1 2 3] remainderSet=[0 1] distribution=8 remainder=2 available=[-1 -1 0 6] balance=2.915
"combo remainderSet balance" combo=[0 1 2 3] remainderSet=[0 2] distribution=8 remainder=2 available=[-1 0 -1 6] balance=2.915
"combo remainderSet balance" combo=[0 1 2 3] remainderSet=[0 3] distribution=8 remainder=2 available=[5 -1 0 0] balance=2.345
"combo remainderSet balance" combo=[0 1 2 3] remainderSet=[1 2] distribution=8 remainder=2 available=[0 -1 -1 6] balance=2.915
"combo remainderSet balance" combo=[0 1 2 3] remainderSet=[1 3] distribution=8 remainder=2 available=[0 -1 0 5] balance=2.345
"combo remainderSet balance" combo=[0 1 2 3] remainderSet=[2 3] distribution=8 remainder=2 available=[0 0 -1 5] balance=2.345
"bestCombo found" distribution=8 bestCombo=[0 1 2 3] bestRemainder=[0 3]
--- FAIL: TestTakeByTopologyNUMADistributed (0.01s)
--- FAIL: TestTakeByTopologyNUMADistributed/ensure_bestRemainder_chosen_with_NUMA_nodes_that_have_enough_CPUs_to_satisfy_the_request (0.00s)
cpu_assignment_test.go:867: unexpected error [accounting error, not enough CPUs allocated, remaining: 1]
After Change:
"test" description="ensure bestRemainder chosen with NUMA nodes that have enough CPUs to satisfy the request"
"combo remainderSet balance" combo=[0 1 2 3] remainderSet=[3] distribution=8 remainder=2 available=[0 0 0 4] balance=1.732
"bestCombo found" distribution=8 bestCombo=[0 1 2 3] bestRemainder=[3]
SUCCESS
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 20:45:37 +00:00
Kevin Klues
031f11513d
Fix accounting bug in CPUManager distribute NUMA policy
...
Without this fix, the algorithm may decide to allocate "remainder" CPUs from a
NUMA node that has no more CPUs to allocate. Moreover, it was only considering
allocation of remainder CPUs from NUMA nodes such that each NUMA node in the
remainderSet could only allocate 1 (i.e. 'cpuGroupSize') more CPUs. With these
two issues in play, one could end up with an accounting error where not enough
CPUs were allocated by the time the algorithm runs to completion.
The updated algorithm will now omit any NUMA nodes that have 0 CPUs left from
the set of NUMA nodes considered for allocating remainder CPUs. Additionally,
we now consider *all* combinations of nodes from the remainder set of size
1..len(remainderSet). This allows us to find a better solution if allocating
CPUs from a smaller set leads to a more balanced allocation. Finally, we loop
through all NUMA nodes 1-by-1 in the remainderSet until all remainder CPUs have
been accounted for and allocated. This ensures that we will not hit an
accounting error later on because we explicitly remove CPUs from the remainder
set until there are none left.
A follow-on commit adds a set of unit tests that fail before these
changes but succeed after them.
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 19:18:11 +00:00
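The remainder-allocation idea described in 031f11513d can be sketched roughly as follows. This is a simplified illustration, not the CPUManager code: it assumes a group size of 1, takes the per-node CPU counts left after the even distribution as input, and uses the population standard deviation as the "balance" score, which is consistent with the balance values shown in the test logs of the neighbouring commits. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// stddev returns the population standard deviation of xs.
func stddev(xs []int) float64 {
	mean := 0.0
	for _, x := range xs {
		mean += float64(x)
	}
	mean /= float64(len(xs))
	v := 0.0
	for _, x := range xs {
		d := float64(x) - mean
		v += d * d
	}
	return math.Sqrt(v / float64(len(xs)))
}

// bestRemainder picks the subset of NUMA nodes from which to take the
// remainder CPUs so that the resulting per-node availability is most balanced.
func bestRemainder(available []int, remainder int) (best []int, after []int, ok bool) {
	// Omit nodes that have no CPUs left.
	var candidates []int
	for node, n := range available {
		if n > 0 {
			candidates = append(candidates, node)
		}
	}
	bestBalance := math.Inf(1)
	// Enumerate every non-empty subset (sizes 1..len(candidates)) via a bitmask.
	for mask := 1; mask < 1<<len(candidates); mask++ {
		var subset []int
		for i, node := range candidates {
			if mask&(1<<i) != 0 {
				subset = append(subset, node)
			}
		}
		trial := append([]int(nil), available...)
		left := remainder
		// Take CPUs one at a time, looping over the subset until none remain.
		for left > 0 {
			progressed := false
			for _, node := range subset {
				if left == 0 {
					break
				}
				if trial[node] > 0 {
					trial[node]--
					left--
					progressed = true
				}
			}
			if !progressed {
				break // subset exhausted before the remainder was covered
			}
		}
		if left > 0 {
			continue // this subset cannot satisfy the remainder
		}
		if b := stddev(trial); b < bestBalance {
			bestBalance, best, after, ok = b, subset, trial, true
		}
	}
	return best, after, ok
}

func main() {
	// Mirrors the "After Change" log: only node 3 has CPUs left, remainder=2.
	best, after, _ := bestRemainder([]int{0, 0, 0, 6}, 2)
	fmt.Printf("%v %v %.3f\n", best, after, stddev(after))
	// Prints: [3] [0 0 0 4] 1.732
}
```

Because the loop removes CPUs explicitly until the remainder reaches zero, a subset that cannot cover the remainder is rejected up front instead of surfacing later as an accounting error.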
Kevin Klues
5317a2e2ac
Fix error handling in CPUManager distribute NUMA tests
...
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 16:51:31 +00:00