Commit Graph

10234 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
b0254c8a0b
Merge pull request #108758 from fengzixu/improvement-volume-health
re-push "add volume kubelet_volume_stats_health_abnormal to kubelet #105585"
2022-03-29 17:35:34 -07:00
Kubernetes Prow Robot
1266744002
Merge pull request #108693 from gnufied/enable-rwx-call-all-nodes
Enable node-expansion to be called on all nodes for RWX volumes
2022-03-29 17:35:05 -07:00
Shiming Zhang
61b3c028ba
Field status.hostIPs added for Pod (#101566)
* Add FeatureGate PodHostIPs

* Add HostIPs field and update PodIPs field

* Types conversion

* Add dropDisabledStatusFields

* Add HostIPs for kubelet

* Add fuzzer for PodStatus

* Add status.hostIPs in ConvertDownwardAPIFieldLabel

* Add status.hostIPs in validEnvDownwardAPIFieldPathExpressions

* Downward API support for status.hostIPs

* Add DownwardAPI validation for status.hostIPs

* Add e2e to check that hostIPs works

* Add e2e to check that Downward API works

* Regenerate
2022-03-29 11:46:07 -07:00
Kubernetes Prow Robot
6c96ac04ff
Merge pull request #101218 from gjkim42/add-taint-toleration-check
kubelet: check taint/toleration before accepting pods
2022-03-29 09:16:56 -07:00
Kir Kolyshkin
37761a329e
pkg/kubelet: changes to update runc to 1.1.0
The changes (mostly in pkg/kubelet/cm) are there to adopt changed
runc 1.1 API, and simplify things a bit. In particular:

1. simplify cgroup manager instantiation, using a new, easier way of
   libcontainers/cgroups/manager.New;

2. replace libcontainerAdapter with a boolean variable (all it did
   was passing on whether systemd manager should be used);

3. trivial change due to removed cgroupfs.HugePageSizes and added
    cgroups.HugePageSizes();

4. do not calculate cgroup paths in update / destroy, since libcontainer
   cgroup managers now calculate the paths upon creation (previously,
   they were doing that only in Apply, so using e.g. Set or Destroy right
   after creation was impossible without specifying paths).

We currently still calculate cgroup paths in Exists -- this is to be
addressed separately.

Co-Authored-By: Elana Hashman <ehashman@redhat.com>
2022-03-28 16:23:20 -07:00
Kubernetes Prow Robot
4fdca04f35
Merge pull request #109059 from danwinship/kube-iptables-hint
Create a KUBE-IPTABLES-HINT chain
2022-03-28 15:24:04 -07:00
Hemant Kumar
dee48d3c36 Add more tests for volume recovery cases 2022-03-28 11:59:43 -04:00
Hemant Kumar
a99466ca86 check existing size before querying new size from api-server 2022-03-28 11:32:49 -04:00
Hemant Kumar
1809094389 address review comments for rwx volume types 2022-03-28 11:32:49 -04:00
Hemant Kumar
ed217f4140 rename SetVolumeSize to InitializeVolumeSize 2022-03-28 11:32:49 -04:00
Hemant Kumar
4d52dbb9f8 Remove legacyCallNodeExpandOnPlugin when RecoverVolumeExpansionFailure 2022-03-28 11:32:49 -04:00
Hemant Kumar
7a43406138 Do not update PVC if it already has updated size 2022-03-28 11:32:49 -04:00
Hemant Kumar
c0fbd83cde Fix code for desired state of the world populator 2022-03-28 11:32:49 -04:00
Hemant Kumar
e4f62d6c41 Modify code to use new interface functions 2022-03-28 11:32:49 -04:00
Hemant Kumar
2e54686f1b Add a function to record volume size in dsow 2022-03-28 11:32:49 -04:00
Hemant Kumar
10f91a9951 Refactor volume attach code 2022-03-28 11:32:49 -04:00
Hemant Kumar
6eea80ec97 Record size of volume in desired and actual state of the world 2022-03-28 11:32:49 -04:00
Kubernetes Prow Robot
dbd37cb8a8
Merge pull request #108831 from waynepeking348/skip_re_allocate_logic_if_pod_id_already_removed
skip re-allocate logic if pod is already removed to avoid panic
2022-03-27 11:37:21 -07:00
Dan Winship
edbce228cb Create a KUBE-IPTABLES-HINT chain for other components
Components that run in a container but modify the host network
namespace iptables rules need to know whether the system is using
iptables-legacy or iptables-nft. Given that kubelet will run before
any container-based components, it is well-positioned to help them
figure this out. So create a chain with a well-known name that they
can look for.
2022-03-27 14:12:36 -04:00
Kubernetes Prow Robot
d796dd7d0f
Merge pull request #108193 from utkarsh348/myfeature
Fixed race condition in test manager shutdown
2022-03-27 05:55:21 -07:00
waynepeking348
6157d3cc4a skip deleted activePods and return nil 2022-03-27 20:35:09 +08:00
fengzixu
38d8aae408 fix: add nil check 2022-03-27 08:38:20 +00:00
Dan Winship
749df8e022 Move iptables consts to kubelet_network_linux.go. 2022-03-26 11:22:51 -04:00
Kubernetes Prow Robot
c239b406f0
Merge pull request #108929 from gnufied/move-expansion-feature-gate-ga
Move all volume expansion feature gates to GA
2022-03-25 18:08:16 -07:00
Benjamin Jorand
3c65728ede
kubelet: fix panic triggered when playing with a wip CRI 2022-03-26 00:23:35 +01:00
Kubernetes Prow Robot
ea006f5246
Merge pull request #108531 from tallclair/redirects
Don't follow redirects with spdy
2022-03-25 15:34:23 -07:00
Kubernetes Prow Robot
e7845861a5
Merge pull request #108986 from gnufied/use-temp-dir-shutdown-tests
Use tempdir for shutdown tests
2022-03-25 05:17:51 -07:00
Kubernetes Prow Robot
68cf2a60c6
Merge pull request #108847 from adisky/update-credential-api
Move kubelet credential provider feature flag to beta and update the api's
2022-03-24 20:05:53 -07:00
Aditi Sharma
ed16ef2206 Move feature flag credential provider to beta
Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
2022-03-24 22:43:38 +05:30
Hemant Kumar
13b34d9c77 Use tempdir for shutdown tests 2022-03-24 11:58:49 -04:00
Hemant Kumar
cdfb841a52 remove ExpandInUsePersistentVolume feature gate 2022-03-24 11:19:42 -04:00
Hemant Kumar
966e1b6dd0 Fix code to not use the feature gate 2022-03-24 10:37:49 -04:00
Patrick Ohly
edffc700a4 enhance and fix log calls
Some of these changes are cosmetic (repeatedly calling klog.V instead of
reusing the result), others address real issues:

- Logging a message only above a certain verbosity threshold without
  recording that verbosity level (if klog.V().Enabled() { klog.Info... }):
  this matters when using a logging backend which records the verbosity
  level.

- Passing a format string with parameters to a logging function that
  doesn't do string formatting.

All of these locations were found by the enhanced logcheck tool from
https://github.com/kubernetes/klog/pull/297.

In some cases it reports false positives, but those can be suppressed with
source code comments.
2022-03-24 11:13:50 +01:00
Kubernetes Prow Robot
190f974dd8
Merge pull request #108902 from kolyshkin/bump-golangci-lint
Fix verify:* after go 1.18 upgrade
2022-03-24 02:59:06 -07:00
Kubernetes Prow Robot
22db936de3
Merge pull request #107750 from shiftstack/issues/cloud-provider/56
Prefer user-provided node IP
2022-03-24 02:58:42 -07:00
Kubernetes Prow Robot
68a0fccfb9
Merge pull request #108363 from houjun41544/20220226-kubeletvolume
Fix error logging statement to make it easier to understand
2022-03-23 22:30:52 -07:00
Kubernetes Prow Robot
3a2509b60e
Merge pull request #108841 from tengqm/fix-kubeletcfg-docstring
Fix doc strings for kubelet config APIs
2022-03-23 13:22:27 -07:00
Kubernetes Prow Robot
2f7d53bbf1
Merge pull request #108442 from NikhilSharmaWe/volMan
Managing nil pointer in VolumeManager
2022-03-23 13:21:55 -07:00
Kubernetes Prow Robot
75b19b242c
Merge pull request #108597 from kolyshkin/prepare-for-runc-1.1
kubelet/cm: refactor, prepare for runc 1.1 bump
2022-03-23 11:20:30 -07:00
Kir Kolyshkin
4513de06a8 Regen mocks using go 1.18
Generated by ./hack/update-mocks.sh using go 1.18

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2022-03-23 10:19:38 -07:00
Kubernetes Prow Robot
a6e65a246c
Merge pull request #107986 from wzshiming/promote/shutdown-based-on-pod-priority
Promote graceful shutdown based on pod priority to beta
2022-03-23 08:06:09 -07:00
Kubernetes Prow Robot
df98f75e93
Merge pull request #107845 from smarterclayton/wait_on_create
kubelet: If the container status is created, we are waiting
2022-03-22 12:21:59 -07:00
Kubernetes Prow Robot
41501c4fcf
Merge pull request #108704 from MartinForReal/feat/add_bootid_for_windows
Add bootid support for windows node.
2022-03-22 10:36:11 -07:00
Matthew Booth
928a5db93b
cloud-provider handles kubelet's --node-ip
When using a legacy cloud provider, if kubelet is passed a node address
in --node-ip it will use this address in preference to the
addresses provided by the cloud provider.

When using an external cloud provider, kubelet will annotate the Node
with the first --node-ip for use by the cloud provider. The cloud
provider validates this annotation but does not otherwise use it,
meaning that --node-ip has no effect.

This change moves the node address filtering code from kubelet to
component-helpers and updates both kubelet and cloud-provider to use it.
There is no functional change to kubelet, but cloud-provider now honours
kubelet's --node-ip.
2022-03-22 16:58:37 +00:00
Nikhil Sharma
4224b524d5 Managing nil pointer in VolumeManager 2022-03-22 22:04:24 +05:30
ZhangKe10140699
f69fb544fa filter out terminated containers in cadvisor_stats_provider 2022-03-22 16:36:46 +08:00
waynepeking348
51883193c5 keep original cpu limit logic to be in line with the comments 2022-03-22 10:09:10 +08:00
waynepeking348
0b6d27002f fix test cases for cpu shares 2022-03-21 21:54:32 +08:00
Qiming Teng
4567032b5f Fix doc strings for kubelet config APIs 2022-03-21 16:35:21 +08:00
waynepeking348
4c87589300 fix bugs of container cpu shares when cpu request set to zero 2022-03-20 21:53:22 +08:00
waynepeking348
35a456b0c6 skip reallocate logic if pod is already removed 2022-03-20 21:09:47 +08:00
MartinForReal
d529b7e10b add bootid support for windows node.
Signed-off-by: MartinForReal <fanshangxiang@gmail.com>
2022-03-18 02:17:52 +00:00
Kubernetes Prow Robot
56062f7f4f
Merge pull request #108010 from endocrimes/dani/eviction-flake
eviction: Deflake TestStart
2022-03-17 12:22:54 -07:00
Kubernetes Prow Robot
9e50a332d8
Merge pull request #108366 from smarterclayton/terminating_not_terminated
Delay writing a terminal phase until the pod is terminated
2022-03-17 08:29:21 -07:00
Kubernetes Prow Robot
a504daa048
Merge pull request #108441 from pacoxu/pod-overload-ga
mark PodOverhead to GA in v1.24; remove in v1.26
2022-03-17 06:33:22 -07:00
Kubernetes Prow Robot
ba1c42892f
Merge pull request #100424 from yangjunmyfm192085/run-test30
Add test cases of kubelet_pods_test.go.
2022-03-17 00:41:19 -07:00
Kubernetes Prow Robot
5cb6fab8f6 Merge pull request #105585 from fengzixu/improvement-volume-health
add volume kubelet_volume_stats_health_abnormal to kubelet
2022-03-17 01:32:38 +00:00
fengzixu
7d675381f8 fix: fix panic bug when volumeHealthStatus is nil 2022-03-17 01:32:24 +00:00
Paco Xu
acd696266e mark PodOverhead to GA in v1.24; remove in v1.26 2022-03-17 09:30:14 +08:00
David Porter
c70f1955c4
test: Add E2E for job completions with cpu reservation
Create an E2E test that creates a job that spawns a pod that should
succeed. The job reserves a fixed amount of CPU and has a large number
of completions and parallelism. Use to repro github.com/kubernetes/kubernetes/issues/106884

Signed-off-by: David Porter <david@porter.me>
2022-03-16 13:15:03 -04:00
Clayton Coleman
69a3820214
kubelet: Delay writing a terminal phase until the pod is terminated
Other components must know when the Kubelet has released critical
resources for terminal pods. Do not set the phase in the apiserver
to terminal until all containers are stopped and cannot restart.

As a consequence of this change, the Kubelet must explicitly transition
a terminal pod to the terminating state in the pod worker which is
handled by returning a new isTerminal boolean from syncPod.

Finally, if a pod with init containers hasn't been initialized yet,
don't default container statuses or not yet attempted init containers
to the unknown failure state.
2022-03-16 13:15:00 -04:00
Maciej Borsz
aa95513982
Revert "add volume kubelet_volume_stats_health_abnormal to kubelet" 2022-03-16 13:44:09 +01:00
Shiming Zhang
ced991cb00 Emit Metrics in the shutdown process 2022-03-16 10:14:55 +08:00
Kubernetes Prow Robot
096cd9df63
Merge pull request #108699 from xing-yang/update_owners
Update sig-storage owners files
2022-03-15 14:28:00 -07:00
Kubernetes Prow Robot
1a5abe5d1f
Merge pull request #105585 from fengzixu/improvement-volume-health
add volume kubelet_volume_stats_health_abnormal to kubelet
2022-03-15 05:58:11 -07:00
Kubernetes Prow Robot
7858fc93e5
Merge pull request #108004 from equinix-ms/kubelet-include-oommetrics
kubelet: expose OOM metrics
2022-03-14 23:14:13 -07:00
xing-yang
aae1f2c476 Update sig-storage owners file 2022-03-14 18:57:52 +00:00
chymy
5374f6fad8 Fix comment typo
Signed-off-by: chymy <chang.min1@zte.com.cn>
2022-03-14 16:53:29 +08:00
chymy
7ed6fa7b2e Method call 'err.Error()' might lead to a nil pointer dereference for pkg/kubelet/cm/cpumanager/cpu_assignment_test.go
Signed-off-by: chymy <chang.min1@zte.com.cn>
2022-03-14 16:35:11 +08:00
Shiming Zhang
a1fadab4b0 Atomic write status file 2022-03-11 17:50:33 +08:00
Shiming Zhang
4aed18935e Add test for storage 2022-03-11 17:31:10 +08:00
Shiming Zhang
5eb3e88f6b Support metrics for node shutdown 2022-03-11 17:31:10 +08:00
Kubernetes Prow Robot
c227403973
Merge pull request #108568 from stevekuznetsov/skuznets/verbose-error
kubelet: cgroups: be verbose about validation
2022-03-10 11:59:07 -08:00
Steve Kuznetsov
8f2bc39f72
kubelet: cgroups: be verbose about validation
Previously, callers of `Exists()` would not know why the cGroup was or
was not existing. In one call-site in particular, the `kubelet` would
entirely fail to start if the cGroup validation did not succeed. In
these cases we MUST explain what went wrong and pass that information
clearly to the caller. Previously, some but not all of the reasons for
invalidation were logged at a low log-level instead. This led to poor
UX.

The original method was retained on the interface so as to make this
diff small.

Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
2022-03-10 07:25:33 -08:00
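The idea in the commit above (keep the boolean `Exists()` but add a call that explains the failure) can be sketched roughly as follows. This is a minimal illustration, not the kubelet code: `fakeCgroups`, the `required` controller list, and the method name `Validate` are all hypothetical and chosen only for this example.

```go
package main

import "fmt"

// fakeCgroups maps a cgroup name to the controllers available for it.
// Hypothetical stand-in for the real cgroup filesystem.
type fakeCgroups map[string][]string

// required is an assumed list of controllers a usable cgroup must have.
var required = []string{"cpu", "memory", "pids"}

// Validate (hypothetical name) reports why a cgroup is not usable,
// or nil if it passes every check.
func (f fakeCgroups) Validate(name string) error {
	mounted, ok := f[name]
	if !ok {
		return fmt.Errorf("cgroup %q does not exist", name)
	}
	have := map[string]bool{}
	for _, c := range mounted {
		have[c] = true
	}
	var missing []string
	for _, c := range required {
		if !have[c] {
			missing = append(missing, c)
		}
	}
	if len(missing) > 0 {
		return fmt.Errorf("cgroup %q is missing controllers: %v", name, missing)
	}
	return nil
}

// Exists keeps the original boolean contract by discarding the reason,
// so existing callers do not need to change.
func (f fakeCgroups) Exists(name string) bool {
	return f.Validate(name) == nil
}

func main() {
	f := fakeCgroups{"kubepods": {"cpu", "memory"}}
	if err := f.Validate("kubepods"); err != nil {
		fmt.Println("validation failed:", err)
	}
}
```

A caller that must abort on failure can surface the error from `Validate` directly, while callers that only need a yes/no answer keep using `Exists`.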
Kubernetes Prow Robot
98ada45442
Merge pull request #108402 from Shoothzj/fix-typo-in-watch_based_manager_test
Fix typo in watch_based_manager_test
2022-03-08 20:04:21 -08:00
Kir Kolyshkin
de5a69d847 pkg/kubelet/cm: fix potential nil dereference in enforceExistingCgroup
Move the rl == nil check to before we dereference it.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2022-03-08 17:05:46 -08:00
Kir Kolyshkin
9652d0cedc pkg/kubelet/cm: move common code to libctCgroupConfig
Instead of doing (almost) the same thing from the three different
methods (Create, Update, Destroy), move the functionality to
libctCgroupConfig, replacing updateSystemdCgroupInfo.

The needResources bool is needed because we do not need resources
during Destroy, so we skip the unneeded resource conversion.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2022-03-08 17:05:46 -08:00
Kir Kolyshkin
11b0d57c93 pkg/kubelet/cm/cgroup_manager: simplify setting hugetlb
Commit 79be8be10e made hugetlb settings optional if cgroup v2 is used and
hugetlb is not available, fixing issue 92933. Note at that time this was only
needed for v2, because for v1 the resources were set one-by-one, and only for
supported resources.

Commit d312ef7eb6 switched the code to using Set from runc/libcontainer
cgroups manager, and expanded the check to cgroup v1 as well.

Move this check earlier, to inside m.toResources, so instead of
converting all hugetlb resources from ResourceConfig to libcontainers's
Resources.HugetlbLimit, and then setting it to nil, we can skip the
conversion entirely if hugetlb is not supported, thus not doing the work
that is not needed.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2022-03-08 17:05:46 -08:00
Kir Kolyshkin
59148e22d0 pkg/kubelet/cm: rm dup code
Commit ecd6361f added setting PidsLimit to Create and Update.

Commit bce9d5f2 added setting PidsLimit to m.toResources.

Now, PidsLimit is assigned twice.

Remove the duplicate.

Fixes: bce9d5f2
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2022-03-08 17:05:46 -08:00
Kir Kolyshkin
a673b64864 kubelet/cm: speed up cgroup creation
There's no need to call m.Update (which will create another instance of
libcontainer cgroup manager, convert all the resources and then set
them). All this is already done here, except for Set().

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2022-03-08 17:05:46 -08:00
Kubernetes Prow Robot
29ed12e76b
Merge pull request #108527 from ddebroy/instrumentedgc1
Pass instrumented runtime service to containerGC
2022-03-08 10:24:49 -08:00
Deep Debroy
023d6fb8f4 Pass instrumented runtime service to containergc
Signed-off-by: Deep Debroy <ddebroy@gmail.com>
2022-03-08 14:33:37 +00:00
Tim Allclair
e1069c6495 Don't follow redirects with spdy 2022-03-04 16:08:58 -08:00
Kubernetes Prow Robot
5d6ef39406
Merge pull request #96004 from serathius/datapolicy-kubelet-pkg
Add datapolicy tags to pkg/kubelet/
2022-03-04 15:34:51 -08:00
Kubernetes Prow Robot
422001df8b
Merge pull request #108154 from klueska/fix-topology-manager
Update TopologyManager algorithm for selecting "best" non-preferred hint
2022-03-02 04:13:13 -08:00
Kubernetes Prow Robot
604ab4fc6c
Merge pull request #108340 from ArangoGutierrez/misspelled/1
Fix typo in pkg/kubelet/pluginmanager/cache/actual_state_of_world
2022-03-01 15:45:55 -08:00
Kubernetes Prow Robot
5d6a793221
Merge pull request #96828 from panjf2000/opt-epoll-eventfd
kubelet/eviction: eliminate redundant allocations when handling eventfd
2022-03-01 13:59:54 -08:00
Kubernetes Prow Robot
0e8e307567
Merge pull request #106570 from odinuge/fix-cpu-shares-on-big-systems
Fix cpu share issues on systems with large amounts of cpu
2022-03-01 10:15:55 -08:00
Kevin Klues
e370b7335c Add extensive unit testing for TopologyManager hint generation algorithm
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-03-01 17:30:24 +00:00
Kevin Klues
99c57828ce Update TopologyManager algorithm for selecting "best" non-preferred hint
For the 'single-numa' and 'restricted' TopologyManager policies, pods are only
admitted if all of their containers have perfect alignment across the set of
resources they are requesting. The best-effort policy, on the other hand, will
prefer allocations that have perfect alignment, but fall back to a non-preferred
alignment if perfect alignment can't be achieved.

The existing algorithm of how to choose the best hint from the set of
"non-preferred" hints is fairly naive and often results in choosing a
sub-optimal hint. It works fine in cases where all resources would end up
coming from a single NUMA node (even if it's not the same NUMA node), but
breaks down as soon as multiple NUMA nodes are required for the "best"
alignment.  We will never be able to achieve perfect alignment with these
non-preferred hints, but we should try and do something more intelligent than
simply choosing the hint with the narrowest mask.

In an ideal world, we would have the TopologyManager return a set of
"resource-relative" hints (as opposed to a common hint for all resources as is
done today). Each resource-relative hint would indicate how many other
resources could be aligned to it on a given NUMA node, and a hint provider
would use this information to allocate its resources in the most aligned way
possible. There are likely some edge cases to consider here, but such an
algorithm would allow us to do partial-perfect-alignment of "some" resources,
even if all resources could not be perfectly aligned.

Unfortunately, supporting something like this would require a major redesign to
how the TopologyManager interacts with its hint providers (as well as how those
hint providers make decisions based on the hints they get back).

That said, we can still do better than the naive algorithm we have today, and
this patch provides a mechanism to do so.

We start by looking at the set of hints passed into the TopologyManager for
each resource and generate a list of the minimum number of NUMA nodes required
to satisfy an allocation for a given resource. Each entry in this list then
contains the 'minNUMAAffinity.Count()' for a given resource. Once we have this
list, we find the *maximum* 'minNUMAAffinity.Count()' from the list and mark
that as the 'bestNonPreferredAffinityCount' that we would like to have
associated with whatever "bestHint" we ultimately generate. The intuition being
that we would like to (at the very least) get alignment for those resources
that *require* multiple NUMA nodes to satisfy their allocation. If we can't
quite get there, then we should try to come as close to it as possible.

Once we have this 'bestNonPreferredAffinityCount', the algorithm proceeds as
follows:

If the mergedHint and bestHint are both non-preferred, then try and find a hint
whose affinity count is as close to (but not higher than) the
bestNonPreferredAffinityCount as possible. To do this we need to consider the
following cases and react accordingly:

  1. bestHint.NUMANodeAffinity.Count() >  bestNonPreferredAffinityCount
  2. bestHint.NUMANodeAffinity.Count() == bestNonPreferredAffinityCount
  3. bestHint.NUMANodeAffinity.Count() <  bestNonPreferredAffinityCount

For case (1), the current bestHint is larger than the
bestNonPreferredAffinityCount, so updating to any narrower mergedHint is
preferred over staying where we are.

For case (2), the current bestHint is equal to the
bestNonPreferredAffinityCount, so we would like to stick with what we have
*unless* the current mergedHint is also equal to bestNonPreferredAffinityCount
and it is narrower.

For case (3), the current bestHint is less than bestNonPreferredAffinityCount,
so we would like to creep back up to bestNonPreferredAffinityCount as close as
we can. There are three cases to consider here:

  3a. mergedHint.NUMANodeAffinity.Count() >  bestNonPreferredAffinityCount
  3b. mergedHint.NUMANodeAffinity.Count() == bestNonPreferredAffinityCount
  3c. mergedHint.NUMANodeAffinity.Count() <  bestNonPreferredAffinityCount

For case (3a), we just want to stick with the current bestHint because choosing
a new hint that is greater than bestNonPreferredAffinityCount would be
counter-productive.

For case (3b), we want to immediately update bestHint to the current
mergedHint, making it now equal to bestNonPreferredAffinityCount.

For case (3c), we know that *both* the current bestHint and the current
mergedHint are less than bestNonPreferredAffinityCount, so we want to choose
one that brings us back up as close to bestNonPreferredAffinityCount as
possible. There are three cases to consider here:

  3ca. mergedHint.NUMANodeAffinity.Count() >  bestHint.NUMANodeAffinity.Count()
  3cb. mergedHint.NUMANodeAffinity.Count() <  bestHint.NUMANodeAffinity.Count()
  3cc. mergedHint.NUMANodeAffinity.Count() == bestHint.NUMANodeAffinity.Count()

For case (3ca), we want to immediately update bestHint to mergedHint because
that will bring us closer to the (higher) value of
bestNonPreferredAffinityCount.

For case (3cb), we want to stick with the current bestHint because choosing the
current mergedHint would strictly move us further away from the
bestNonPreferredAffinityCount.

Finally, for case (3cc), we know that the current bestHint and the current
mergedHint are equal, so we simply choose the narrower of the 2.

This patch implements this algorithm for the case where we must choose from a
set of non-preferred hints and provides a set of unit-tests to verify its
correctness.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-03-01 14:38:26 +00:00
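The case analysis in the commit message above can be sketched in a few lines of Go. This is a simplified illustration under stated assumptions, not the TopologyManager implementation: affinities are plain `uint64` bitmasks, and the helper names `targetCount`, `narrower`, and `pickNonPreferred` are hypothetical. "Narrower" is assumed to mean fewer bits set, with ties broken by the lower numeric mask value.

```go
package main

import (
	"fmt"
	"math/bits"
)

// targetCount computes the bestNonPreferredAffinityCount: for each resource,
// take the minimum NUMA-node count over its hints, then take the maximum
// of those minimums across resources.
func targetCount(perResource [][]uint64) int {
	target := 0
	for _, masks := range perResource {
		lowest := -1
		for _, m := range masks {
			if c := bits.OnesCount64(m); lowest == -1 || c < lowest {
				lowest = c
			}
		}
		if lowest > target {
			target = lowest
		}
	}
	return target
}

// narrower reports whether a is narrower than b (assumed tie-break: lower value).
func narrower(a, b uint64) bool {
	ca, cb := bits.OnesCount64(a), bits.OnesCount64(b)
	if ca != cb {
		return ca < cb
	}
	return a < b
}

// pickNonPreferred chooses between the current best and a newly merged
// non-preferred hint, following cases (1), (2), (3a)-(3c) and (3ca)-(3cc).
func pickNonPreferred(best, merged uint64, target int) uint64 {
	b, m := bits.OnesCount64(best), bits.OnesCount64(merged)
	switch {
	case b > target: // case 1: any narrower merged hint wins
		if narrower(merged, best) {
			return merged
		}
		return best
	case b == target: // case 2: only switch to an equal-count, narrower hint
		if m == target && narrower(merged, best) {
			return merged
		}
		return best
	}
	// case 3: best is below the target
	switch {
	case m > target: // 3a
		return best
	case m == target: // 3b
		return merged
	case m > b: // 3ca
		return merged
	case m < b: // 3cb
		return best
	}
	// 3cc: equal counts, keep the narrower of the two
	if narrower(merged, best) {
		return merged
	}
	return best
}

func main() {
	target := targetCount([][]uint64{{0b01, 0b10}, {0b11}})
	fmt.Println("target:", target)
	fmt.Printf("picked: %03b\n", pickNonPreferred(0b001, 0b110, target))
}
```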
ZhangJian He
d09947a5b5 Fix typo in watch_based_manager_test 2022-03-01 10:21:17 +08:00
Kubernetes Prow Robot
bef9d807a0
Merge pull request #108325 from pacoxu/donotReturnErrWhenPauseLose
do not return err when PodSandbox not exist
2022-02-28 18:15:46 -08:00
Kubernetes Prow Robot
e9ba9dc4e4
Merge pull request #107201 from pacoxu/add-metrics-volume-stats-cal
add VolumeStatCalDuration metrics for fsquota monitoring benchmark
2022-02-28 16:07:46 -08:00
Kevin Klues
f8601cb5a3 Refactor TopologyManager to be more explicit about bestHint calculation
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-02-28 20:30:01 +00:00
houjun
df099ed923 Fix error logging statement to make it easier to understand 2022-02-26 15:25:56 +08:00
Carlos Eduardo Arango Gutierrez
bbb8ef1d10
Fix typo in pkg/kubelet/pluginmanager/cache/actual_state_of_world
Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>
2022-02-24 16:20:24 -05:00
Kubernetes Prow Robot
06e107081e
Merge pull request #104732 from mengjiao-liu/remove-flag-experimental-check-node-capabilities-before-mount
kubelet: Remove the deprecated flag `--experimental-check-node-capabilities-before-mount`
2022-02-24 07:56:30 -08:00
jonyhy96
60cd896602 fix: pod worker test
Signed-off-by: jonyhy96 <hy352144278@gmail.com>
2022-02-24 16:35:33 +08:00
chenyw1990
e26df3594c do not return err when PodSandbox not exist
Co-authored-by: pacoxu <paco.xu@daocloud.io>
2022-02-24 14:58:39 +08:00
Kubernetes Prow Robot
08c31088c1
Merge pull request #106858 from cmssczy/add_RegisterWithTaints_validation_test
add kubelet config validation test for RegisterWithTaints
2022-02-23 12:51:58 -08:00
Kubernetes Prow Robot
eacbf87bfe
Merge pull request #108156 from jsafrane/rename-selinuxsupport
Rename SupportsSELinux to SELinuxRelabel
2022-02-22 20:12:20 -08:00
utkarsh348
eaee96efd3 Fixed race condition test manager shutdown 2022-02-18 11:20:02 +05:30
Kubernetes Prow Robot
2d2a7272fc
Merge pull request #107670 from 249043822/br-notfound
Suppress container not found errors in container runtime getPodStatuses
2022-02-16 10:00:37 -08:00
Jan Safranek
525b8e5cd6 Rename SupportsSELinux to SELinuxRelabel
The field in fact says that the container runtime should relabel a volume
when running a container with it, it does not say that the volume supports
SELinux. For example, NFS can support SELinux, but we don't want NFS
volumes relabeled, because they can be shared among several Pods.
2022-02-16 10:54:08 +01:00
KeZhang
3946d99904 Ignore container notfound error while getPodstatuses 2022-02-16 08:55:19 +08:00
Peter Hunt
0b7629d2cc kubelet/stats: add unit test for when container logs are found
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2022-02-15 16:34:54 -05:00
Peter Hunt
1c3357db76 kubelet/stats: take container log stats into account when checking ephemeral stats
this commit updates checkEphemeralStorage to be able to add container log stats, if applicable.

It also updates the old check when container log stats aren't found to be more accurate.
Specifically, this check previously worked because of a fluke programming accident:

according to this block in pkg/kubelet/stats/helper.go:113
```
if result.Rootfs != nil {
    rootfsUsage := *cfs.BaseUsageBytes
    result.Rootfs.UsedBytes = &rootfsUsage
}
```

BaseUsageBytes should be the value added, not TotalUsageBytes. However, since in this case
one also needs to account for the calculated log size, which is TotalUsageBytes - BaseUsageBytes
using TotalUsageBytes value accidentally worked.

Updating the case to use the correct value AND log offset fixes this accident and makes
the behavior more in line with what happens when calculating ephemeral storage.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2022-02-15 16:30:25 -05:00
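The arithmetic behind the "accidentally worked" remark above can be checked with a small sketch. It assumes only what the commit message states: the rootfs usage is `BaseUsageBytes` and the log size is `TotalUsageBytes - BaseUsageBytes`. The `fsUsage` type and the two helper functions are hypothetical names for illustration, not the kubelet's types.

```go
package main

import "fmt"

// fsUsage is a hypothetical stand-in for the container filesystem stats.
type fsUsage struct {
	BaseUsageBytes  uint64
	TotalUsageBytes uint64
}

// logBytes is the calculated log size: total minus base (clamped at zero).
func logBytes(u fsUsage) uint64 {
	if u.TotalUsageBytes < u.BaseUsageBytes {
		return 0
	}
	return u.TotalUsageBytes - u.BaseUsageBytes
}

// ephemeralBytes adds the correct rootfs value (base) and the log offset.
func ephemeralBytes(u fsUsage) uint64 {
	return u.BaseUsageBytes + logBytes(u)
}

func main() {
	u := fsUsage{BaseUsageBytes: 100, TotalUsageBytes: 130}
	fmt.Println("logs:", logBytes(u), "ephemeral:", ephemeralBytes(u))
}
```

Because base plus (total minus base) equals total, using `TotalUsageBytes` alone produced the same number, which is why the earlier check appeared correct.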
Kubernetes Prow Robot
efa5692c0b
Merge pull request #108045 from hakman/deprecate_pod-infra-container-image
Mark pod-infra-container-image flag as deprecated
2022-02-15 13:17:19 -08:00
Peter Hunt
ab0f274a6f kubelet/stats: update cadvisor stats provider with new log location
in https://github.com/kubernetes/kubernetes/pull/74441,
the namespace and name were added to the pod log location.

However, cAdvisor stats provider wasn't correspondingly updated.

since CRI-O uses cAdvisor stats provider by default, despite being a CRI implementation,
eviction with ephemeral storage and container logs doesn't work as expected, until now!

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2022-02-15 16:04:16 -05:00
Kubernetes Prow Robot
64e83a7e43
Merge pull request #107945 from saschagrunert/cri-verbose
Add support for CRI `verbose` fields
2022-02-14 17:58:12 -08:00
Ciprian Hacman
57638ae7a1 Mark pod-infra-container-image flag as deprecated
Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>
2022-02-14 09:11:51 +02:00
Matthias Bertschy
9500ee9d9c container_manager: use oomScoreAdj instead of default when set 2022-02-12 15:23:13 +01:00
Kubernetes Prow Robot
1659924a97
Merge pull request #108070 from jsafrane/remove-selinux
Remove util/selinux package
2022-02-11 18:19:47 -08:00
Kubernetes Prow Robot
8580bbf7d7
Merge pull request #107594 from hakman/remove_container-runtime_logic
Clean up logic for deprecated flag --container-runtime in kubelet
2022-02-11 12:57:47 -08:00
Kubernetes Prow Robot
e24b5333e5
Merge pull request #108052 from klueska/fix-topology-manager
Fix bug in TopologyManager with merging hints when NUM_NUMA > 2
2022-02-11 07:37:34 -08:00
Jan Safranek
77aa06d0c8 Remove util/selinux package
The package says:

> the libcontainer SELinux package is only built for Linux, so it is
> necessary to have a NOP wrapper which is built for non-Linux platforms

This is not true, Kubernetes now imports
github.com/opencontainers/selinux/go-selinux and it has proper
multiplatform support (i.e. NOOP on non-Linux platforms).

Removing the whole package and calling go-selinux directly.
2022-02-11 15:20:35 +01:00
Kubernetes Prow Robot
7cfe0ca828
Merge pull request #107774 from calvin0327/fix-data-race
fix: data race when hijack klog
2022-02-10 23:32:15 -08:00
Cheng Xing
b152fa9b6c Remove verult from OWNERS files 2022-02-10 18:25:38 -08:00
Kevin Klues
155562dd2e Fix bug in TopologyManager with merging hints when NUM_NUMA > 2
Before this fix, hint permutations such as:

	permutation: [{11 true} {0101 true}]

Could result in merged hints of:

	mergedHint: {01 true}

This was possible because both hints in the permutation contain a "preferred"
allocation (i.e. the full set of NUMA nodes set in the affinity bitmask are
*required* to satisfy the allocation). With this in place, the simplified logic
we had simply kept the merged hint as preferred as well.

However, what we really want is to ensure that the merged hint is only
preferred if *true* alignment of all resources is possible (i.e. if all hints
in the permutation are preferred AND their affinities are exactly equal).

The only exception to this is if *no* topology information is provided by a
given hint provider. In this case, we assume alignment doesn't matter and only
consider the resources that actually have hints provided for them.

This changes the semantics of permutations of the form:

	permutation: [{111 true} {011 true}]

To now result in the merged hint of:

	mergedHint: {011 false}

Instead of:

	mergedHint: {011 true}

This is arguably how it should always have been though (because a hint should
not be preferred if true alignment isn't possible), and two tests have had to
change to accommodate these new semantics.

This commit changes the merge function to implement the updated logic, adds a
test to verify it is functioning correctly, and updates the two tests mentioned
above to adjust to the new semantics.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-02-10 22:07:51 +00:00
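The updated merge rule described above can be sketched as follows. This is a simplified illustration, not the actual merge function: the `hint` type, the `mask` helper, and `mergeHints` are hypothetical, affinities are `uint64` bitmasks, and a nil affinity is assumed to mean the provider gave no topology information.

```go
package main

import "fmt"

// hint is a simplified stand-in for a topology hint.
// A nil affinity means "no topology information provided".
type hint struct {
	affinity  *uint64
	preferred bool
}

// mask is a small helper to take the address of a literal.
func mask(v uint64) *uint64 { return &v }

// mergeHints ANDs the affinities of one permutation. The result is preferred
// only if every hint is preferred AND all provided affinities are equal.
func mergeHints(allNodes uint64, hints []hint) (uint64, bool) {
	merged := allNodes
	preferred := true
	var first *uint64
	for _, h := range hints {
		if !h.preferred {
			preferred = false
		}
		if h.affinity == nil {
			continue // no topology info: alignment does not matter here
		}
		merged &= *h.affinity
		if first == nil {
			first = h.affinity
		} else if *first != *h.affinity {
			preferred = false // affinities differ, true alignment is impossible
		}
	}
	return merged, preferred
}

func main() {
	m, p := mergeHints(0b111, []hint{{mask(0b111), true}, {mask(0b011), true}})
	fmt.Printf("mergedHint: {%03b %v}\n", m, p)
}
```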
Sascha Grunert
effbcd3a0a
Add support for CRI verbose fields
The remote runtime implementation now supports the `verbose` fields,
which are required for consumers like cri-tools to enable multi CRI
version support.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2022-02-10 17:12:26 +01:00
Ciprian Hacman
0819451ea6 Clean up logic for deprecated flag --container-runtime in kubelet
Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>
2022-02-10 13:26:59 +02:00
Kubernetes Prow Robot
3b4a9cdfff
Merge pull request #108007 from endocrimes/dani/cm-remove-docker
cm: Remove legacy docker references
2022-02-10 03:23:47 -08:00
Gunju Kim
eb4cd9ab4e
Check taint/toleration before accepting pods, except for static pods 2022-02-10 19:39:26 +09:00
Kubernetes Prow Robot
518a3c2f70
Merge pull request #107108 from linxiulei/fix_pid
Read number of running processes from /proc/loadavg.
2022-02-10 01:15:47 -08:00
Kubernetes Prow Robot
40c2d04946
Merge pull request #107112 from linxiulei/fix_pidmax
Consider threads-max when deciding MaxPID.
2022-02-09 20:49:45 -08:00
Kubernetes Prow Robot
0dcd6eaa0d
Merge pull request #103934 from boenn/tainttoleration
De-duplicate predicate (known as filter now) logic shared in kubelet and scheduler
2022-02-09 16:53:46 -08:00
Kubernetes Prow Robot
8d01b02c60
Merge pull request #107096 from hakman/remove_non-masquerade-cidr
Remove deprecated flag --non-masquerade-cidr in kubelet
2022-02-08 12:42:50 -08:00
Danielle Lancashire
3630328fd9 eviction: Deflake TestStart
TestStart was previously flaky. In approx 100_000 local runs, it failed
about 70% of the time, and has been mentioned as a flaky unit test in
the past.

This flake was due to a race condition between the logic as written and the
Go scheduler. UpdateThreshold calls `notifier.Start(events)` in a new
goroutine, which is not guaranteed to run immediately.

This meant that if `m.Start()` was called before `notifier.Start()`, the
test would fail, as the notifier would not have been started before the
4 events were processed and lock released.

Here, we update the test to more closely match the intended application
behaviour, and have events passed to the channel when `Start` is called
on the notifier.

This ensures that `Start` gets called and additionally validates
that the correct channel is provided to the notifier.

Stop was never called previously, as it only gets called on a subsequent
call to UpdateThreshold. `AnyTimes()` hid that this did not occur.
2022-02-08 17:03:44 +01:00
Danielle Lancashire
c198062da4 cm: Remove legacy docker references
Dockershim and built-in Docker support are gone. Cleans up dead code
references to them.
2022-02-08 16:25:04 +01:00
Jorik Jonker
27b8f13763 kubelet: expose OOM metrics
cAdvisor has code to expose OOM metrics since 0.40.0, but this was not
included in Kubelet so far. This commit enables it.

Signed-off-by: Jorik Jonker <jorik.jonker@eu.equinix.com>
2022-02-08 12:24:25 +01:00
Jordan Liggitt
3a132bd206 Fix kubelet cri round trip test 2022-02-05 17:59:29 -05:00
Kubernetes Prow Robot
469c4c4a30
Merge pull request #106715 from aojea/dual_hostnet_pods
set secondary address on host-network pods
2022-02-04 12:17:30 -08:00
Antonio Ojea
bc8e7ac1a0 ignore CRI PodSandboxNetworkStatus for host network pods 2022-02-04 18:41:57 +01:00
Gunju Kim
3ce5c944a8
kubelet: Clean up a static pod that has been terminated before starting
- Allow a podWorker to start if it is blocked by a pod that has been
  terminated before starting
- When a pod can't start AND has already been terminated, exit cleanly
- Add a unit test that exercises race conditions in pod workers
2022-02-02 16:05:32 -05:00
Clayton Coleman
b638bd8b03 kubelet: If the container status is created, we are waiting
If CRI returns a container that has been created but is not running,
it is not safe to assume it is terminal, as our connection to CRI
may have failed. Instead, created is treated as waiting, as in
"waiting for this container to start". Either syncPod or
syncTerminatingPod is responsible for handling this state.
2022-01-28 18:32:15 -05:00
Jordan Liggitt
1d27942efc Include pod UID in secret/configmap cache key 2022-01-27 22:21:52 -05:00
Kubernetes Prow Robot
4dba52cdf4
Merge pull request #107821 from liggitt/kubelet-secret-manager
Move kubelet secret and configmap manager calls to sync_Pod functions
2022-01-27 08:26:58 -08:00
Jordan Liggitt
085693eff2 Move kubelet secret and configmap manager calls to sync_Pod functions 2022-01-27 10:09:13 -05:00
Kubernetes Prow Robot
8712a903cb
Merge pull request #107608 from marseel/fake_prober_in_kubemark
Use FakeProber in kubemark clusters
2022-01-26 19:42:49 -08:00
Jyoti Mahapatra
0e0abd602f
parse ipv6 address before comparison (#107736)
* parse ipv6 address before comparison

Signed-off-by: Jyoti Mahapatra <jyotima@amazon.com>

* use parse sloppy

Signed-off-by: Jyoti Mahapatra <jyotima@amazon.com>

* use parse sloppy

Signed-off-by: Jyoti Mahapatra <jyotima@amazon.com>

* use node address from cloudprovider as is

Signed-off-by: Jyoti Mahapatra <jyotima@amazon.com>
2022-01-26 18:38:49 -08:00
Marcel Zięba
b4b4b8fd6d Use FakeProber in kubemark clusters 2022-01-26 09:29:04 +00:00
Kubernetes Prow Robot
38e9a29620
Merge pull request #106932 from SergeyKanzhelev/removeDynamicKubeletConfig
Remove dynamic kubelet config
2022-01-25 19:20:25 -08:00
Ryan Phillips
25f95f2bde kubelet: fix podstatus not containing pod full name 2022-01-25 13:21:04 -06:00
calvin
d9ab5e18d3 fix: data race when hijack klog
Signed-off-by: calvin <wen.chen@daocloud.io>
2022-01-24 15:01:49 +08:00
fengzixu
9808ae48a0 change the volume health status metrics name 2022-01-23 02:44:10 +00:00
Jack
7655702313 add container probe duration metrics 2022-01-20 16:50:02 -08:00
yanghesong
4cab028a92 Remove dockershim comments in kubelet
Signed-off-by: yanghesong <hesong.yang@foxmail.com>
2022-01-20 16:15:29 +08:00
Sergey Kanzhelev
7e7bc6d53b remove DynamicKubeletConfig logic from kubelet 2022-01-19 22:38:04 +00:00
Paco Xu
6611c36372 add volume type and separated histogram for volume stat collection 2022-01-19 22:33:37 +08:00
Ciprian Hacman
21809043b5 Remove deprecated flag --non-masquerade-cidr in kubelet
Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>
2022-01-19 09:17:26 +02:00
Kubernetes Prow Robot
feb758027c
Merge pull request #106907 from cyclinder/remove_dockershim_flags
Clean up dockershim flags in the kubelet
2022-01-18 09:09:09 -08:00
Eric Lin
fea15977c8 Consider threads-max when deciding MaxPID.
Fixes kubernetes#107111
2022-01-17 21:51:59 +00:00
Antonio Ojea
a20b2088ac set secondary address on host-network pods
host-network pods IPs are obtained from the reported kubelet nodeIPs.

Historically, host-network podIPs are immutable once set, but when
we added dual-stack support, we didn't consider that the secondary
IP address may not be present at the same time as the primary nodeIP.

If a secondary IP address is added to a node after the host-network pods
IPs are set, we can add the secondary host-network pod IP address
maintaining the current behavior of not updating the current podIPs on
host-network pods.
2022-01-17 18:05:42 +01:00
Paco Xu
e3745a10aa add warning log if volume calculation took longer than 1 second 2022-01-17 10:40:49 +08:00
Kubernetes Prow Robot
22a03f893d
Merge pull request #107207 from ehashman/deprecate-log-sanitization
Deprecate dynamic log sanitization
2022-01-15 15:19:26 -08:00
songlh
50840f5039 change to use require.NoError 2022-01-14 21:46:12 -05:00
cyclinder
07999dac70 Clean up dockershim flags in the kubelet
Signed-off-by: cyclinder <qifeng.guo@daocloud.io>
Co-authored-by: Ciprian Hacman <ciprian@hakman.dev>
Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>
2022-01-14 16:02:50 +02:00
Kubernetes Prow Robot
8c6b910e68
Merge pull request #107550 from wojtek-t/remove_selflink_from_kubelet
Remove no-longer used selflink code from kubelet
2022-01-14 03:28:27 -08:00
Wojciech Tyczyński
6088fe4221 Remove no-longer used selflink code from kubelet 2022-01-14 10:38:23 +01:00
Kubernetes Prow Robot
3bd422dc76
Merge pull request #107293 from dims/jan-1-owners-cleanup
Cleanup OWNERS files - Jan 2021 Week 1
2022-01-13 10:30:30 -08:00
JUN YANG
2247b76ab1 Add test cases of kubelet_pods_test.go.
Signed-off-by: JUN YANG <yang.jun22@zte.com.cn>
2022-01-13 14:37:31 +08:00
Patrick Ohly
9eaa2dc554 avoid klog Info calls without verbosity
In the following code pattern, the log message will get logged with v=0 in JSON
output although conceptually it has a higher verbosity:

   if klog.V(5).Enabled() {
       klog.Info("hello world")
   }

Having the actual verbosity in the JSON output is relevant, for example for
filtering out only the important info messages. The solution is to use
klog.V(5).Info or something similar.

Whether the outer if is necessary at all depends on how complex the parameters
are. The return value of klog.V can be captured in a variable and be used
multiple times to avoid the overhead for that function call and to avoid
repeating the verbosity level.
2022-01-12 07:48:36 +01:00
Kubernetes Prow Robot
b5103f6117
Merge pull request #107426 from yanghesong/remove_validate_runtime
Remove runtime in validate
2022-01-11 20:50:36 -08:00
Eric Lin
5fdf24baca Read number of running processes from /proc/loadavg.
Fallback to using sysinfo syscall if failed.

Fix kubernetes#107107
2022-01-11 21:33:53 +00:00
Kubernetes Prow Robot
cadbe8dfb5
Merge pull request #107250 from cndoit18/use-errors
cleanup(kubelet): use errors.Is(err, os.ErrProcessDone)
2022-01-11 10:49:01 -08:00
Kubernetes Prow Robot
19069665f9
Merge pull request #107094 from adisky/d-container-runtime
Mark container-runtime kubelet flag as deprecated
2022-01-11 10:48:46 -08:00
Kubernetes Prow Robot
7eb5046064
Merge pull request #106470 from qmloong/qmloong/fix
fix: some typos and syncPod outdated workflow annotation
2022-01-11 10:48:38 -08:00
Kubernetes Prow Robot
5f4914604d
Merge pull request #106353 from gjkim42/remove-false-pleg-errors
kubelet: Remove false PLEG errors
2022-01-11 10:48:26 -08:00
fengzixu
5d544d3f01 fix comment 2022-01-11 14:28:31 +00:00
fengzixu
f96449f2e2 fix unit test 2022-01-11 13:50:18 +00:00
fengzixu
e2b5b5465a improve metrics comment 2022-01-11 13:50:18 +00:00
fengzixu
c1a58d715c fix unit test 2022-01-11 13:50:18 +00:00
fengzixu
5593e27429 improve metrics comment 2022-01-11 13:50:18 +00:00
fengzixu
1cdc694ac2 fix unit test 2022-01-11 13:50:18 +00:00
fengzixu
4a72f08a28 add useful comment for volume stats metrics 2022-01-11 13:50:18 +00:00
fengzixu
b885deffe3 fix unit test 2022-01-11 13:50:17 +00:00
fengzixu
ed7fd0ced5 add volumeHealth label to metrics 2022-01-11 13:50:17 +00:00
fengzixu
bab1755274 fix: correct metrics expression 2022-01-11 13:50:17 +00:00
fengzixu
d71e21e01e add volume kubelet_volume_stats_health_abnormal to kubelet 2022-01-11 13:50:17 +00:00
Dingzhu Lurong
1de2f3cc7d add writer error handler 2022-01-11 11:47:25 +08:00
Kubernetes Prow Robot
a0dfd958d5
Merge pull request #107163 from cyclinder/fix_leak_goroutine
fix goroutine leaks in TestConfigurationChannels
2022-01-10 17:23:16 -08:00
Davanum Srinivas
9682b7248f
OWNERS cleanup - Jan 2021 Week 1
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2022-01-10 08:14:29 -05:00
cyclinder
928e686877 fix goroutine leaks in TestConfigurationChannels
Signed-off-by: cyclinder <qifeng.guo@daocloud.io>
2022-01-10 19:51:16 +08:00
yanghesong
6905fef761 Remove runtime in validate
Validate is useless as dockershim is removed

Signed-off-by: yanghesong <hesong.yang@foxmail.com>
2022-01-09 09:11:49 +08:00
wq
4f38d4aaa1 fix a typo in the comment of ImageCredentialProviderConfigFile 2022-01-09 00:07:43 +09:00
Kubernetes Prow Robot
d1a5513cb0
Merge pull request #107006 from gnufied/add-total-mount-time-metrics
Add metric for reporting total end-to-end mount time
2022-01-07 06:19:31 -08:00
Kubernetes Prow Robot
09fccc3533
Merge pull request #106796 from jonyhy96/fix-timer
kubelet: use newtimer instead in nodeshutdown manager
2022-01-06 11:47:12 -08:00
Kubernetes Prow Robot
03ee86c09c
Merge pull request #104837 from eggiter/fix-release-reused-cpus
fix(cpumanager): Do not release CPUs of init containers while they are being reused in app containers
2022-01-06 11:46:38 -08:00
Kubernetes Prow Robot
0b9ad84973
Merge pull request #107116 from yxxhero/add_more_msg_for_no_podsandbox_container
add more message for no PodSandbox container
2022-01-06 08:58:09 -08:00
Kubernetes Prow Robot
b457ae72f5
Merge pull request #106644 from ahrtr/add_info_counter_perfcounter
Add more info when failing to call PdhAddEnglishCounter
2022-01-06 06:45:01 -08:00
Aditi Sharma
e03d7d3fdd Mark container-runtime flag as deprecated
Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
2022-01-06 10:23:03 +05:30
Mengjiao Liu
beda4cafb6 kubelet: Remove the deprecated flag --experimental-check-node-capabilities-before-mount 2022-01-06 11:47:11 +08:00
Kubernetes Prow Robot
73b68f5233
Merge pull request #106979 from a2ush/fix_typo
Fix comment typo (from resolve.conf to resolv.conf) and rename the constant (from maxResolveConfLength to maxResolvConfLength)
2022-01-05 16:08:26 -08:00
Kubernetes Prow Robot
afd254a18f
Merge pull request #106756 from victory460/feature_helpers
code cleanup for container/helpers.go
2022-01-05 08:20:42 -08:00
Kubernetes Prow Robot
19591a1324
Merge pull request #105829 from yuanchen8911/master
Fix and improve comments on kubelet metrics
2022-01-04 23:02:32 -08:00
Kubernetes Prow Robot
abfbbe4dda
Merge pull request #107119 from hakman/remove_dockerless
Remove dockerless build tag and DockerLegacyService interface
2022-01-04 11:27:21 -08:00
Paco Xu
c5d8354e0e add "kubelet_volume_stat_cal_duration_seconds_bucket" VolumeStatCalDuration metrics for fsquota monitoring benchmark 2021-12-31 11:39:40 +08:00
cndoit18
601d02b90f
refactor(kubelet): use errors.Is(err, os.ErrProcessDone)
use errors.Is(err, os.ErrProcessDone) here and remove "process already finished" string comparison.

Signed-off-by: cndoit18 <cndoit18@outlook.com>
2021-12-29 18:10:06 +08:00
Elana Hashman
dbd50d9f50
Remove dynamic log sanitization fields from Kubelet config validation 2021-12-23 13:03:13 -08:00
Kubernetes Prow Robot
f0dbc32ed9
Merge pull request #106853 from gnufied/disable-exp-backoff-volume-not-inuse
When volume is not marked in-use, do not backoff
2021-12-22 19:46:37 -08:00
Hemant Kumar
7989f27044 use node informer to check volumes attachment status before backoff
fix unit tests
2021-12-20 11:57:05 -05:00
songlh
e03a0bc105 fixing the panic in TestVersion 2021-12-18 19:20:15 -05:00
Ciprian Hacman
5bae9b9288 Clean up DockerLegacyService interface
Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>
2021-12-18 12:24:54 +02:00
Ciprian Hacman
6cdb1c225d Clean up dockerless build tag
Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>
2021-12-18 12:18:25 +02:00
yxxhero
a90b149be0 add more message for no PodSandbox container
Signed-off-by: yxxhero <aiopsclub@163.com>
2021-12-18 09:52:03 +08:00
Davanum Srinivas
497e9c1971
Cleanup OWNERS files (No Activity in the last year)
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-15 10:34:02 -05:00
a2ush
393dec26f6 Change the name of the constant 2021-12-14 22:42:57 +09:00
Hemant Kumar
55b5e6dc33 Add metric for reporting total end-to-end mount time
This metric includes time spent in waiting for devices to be attached,
any RPC calls and performing recursive chown etc.
2021-12-13 16:23:01 -05:00
a2ush
d775483381 Fix comment out typo 2021-12-11 22:27:38 +09:00
Kubernetes Prow Robot
1d66302c42
Merge pull request #106458 from dims/lint-yaml-in-owners-files
Lint/Beautify yaml in OWNERS files
2021-12-10 06:39:12 -08:00
Kubernetes Prow Robot
1b0d83f1d6
Merge pull request #106599 from klueska/fix-numa-bug
Fix Bugs in CPUManager distribute NUMA policy option
2021-12-10 04:41:12 -08:00
haoyun
92fa957dd1 feat: use clock instead
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-12-10 13:59:12 +08:00
Kubernetes Prow Robot
15e5f2a19a
Merge pull request #106291 from sbs2001/fix_invalid_comment
Remove invalid comment in legacyregistry
2021-12-09 19:03:10 -08:00
Davanum Srinivas
9405e9b55e
Check in OWNERS modified by update-yamlfmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-09 21:31:26 -05:00
David Porter
95264a418d kubelet: set failed phase during graceful shutdown
Revert to previous behavior in 1.21/1.20 of setting pod phase to failed
during graceful node shutdown.

Setting pods to failed phase will ensure that external controllers that
manage pods like deployments will create new pods to replace those that
are shut down. Many customers have taken a dependency on this behavior
and it was a breaking change in 1.22, so this change reverts back to the
previous behavior.

Signed-off-by: David Porter <david@porter.me>
2021-12-09 13:17:40 -08:00
Kubernetes Prow Robot
cdf3ad823a
Merge pull request #97252 from dims/drop-dockershim
Completely remove in-tree dockershim from kubelet
2021-12-08 12:51:46 -08:00
Kubernetes Prow Robot
f356ae4ad9
Merge pull request #101719 from SergeyKanzhelev/removeReallyCrashForTesting
Remove ReallyCrashForTesting and cleaned up some references to Handle…
2021-12-07 23:39:45 -08:00
caozhiyuan
1a59bcb142 add validation test for RegisterWithTaints 2021-12-08 10:36:43 +08:00
Kubernetes Prow Robot
b685b3982d
Merge pull request #105360 from shuheiktgw/refactor_kubelet_config_validation_tests
Refactor kubelet config validation tests
2021-12-07 17:25:43 -08:00
Davanum Srinivas
bc78dff42e
update files to drop dockershim
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-07 15:15:13 -05:00
Davanum Srinivas
83265c9171
drop files deleted from pkg/kubelet/dockershim
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-07 15:15:13 -05:00
Hemant Kumar
5b7b2e2f6c When volume is not marked in-use, do not backoff 2021-12-07 11:50:15 -05:00
Sascha Grunert
a063a2ba3e
Revert dockershim CRI v1 changes
We should not touch the dockershim ahead of removal and therefore
default to `v1alpha2` CRI instead of `v1`.

Partially reverts changes from https://github.com/kubernetes/kubernetes/pull/106501

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2021-12-03 18:37:11 +01:00
xuweiwei
21238c2593 code cleanup for container/helpers.go 2021-12-01 11:17:33 +08:00
Sergey Kanzhelev
a11453efbc remove ReallyCrashForTesting and cleaned up some references to HandleCrash behavior 2021-11-29 20:00:10 +00:00
menglong.qi
12eff56460 fix: syncPod outdated workflow comment 2021-11-28 17:21:29 +08:00
boenn
cec2aae1e5 rebase master 2021-11-25 11:21:12 +08:00
Kevin Klues
f8511877e2 Add regression test for CPUManager distribute NUMA algorithm
We witnessed this exact allocation attempt in a live cluster and witnessed the
algorithm fail with an accounting error. This test was added to verify that
this case is now handled by the updates to the algorithm and that we don't
regress from it in the future.

"test" description="ensure previous failure encountered on live machine has been fixed (1/1)"
"combo remainderSet balance" combo=[2 4 6] remainderSet=[2 4 6] distribution=9 remainder=1 available=[14 2 4 4 0 3 4 1] balance=4.031
"combo remainderSet balance" combo=[2 4 6] remainderSet=[2 4] distribution=9 remainder=1 available=[0 3 4 1 14 2 4 4] balance=4.031
"combo remainderSet balance" combo=[2 4 6] remainderSet=[2 6] distribution=9 remainder=1 available=[1 14 2 4 4 0 3 4] balance=4.031
"combo remainderSet balance" combo=[2 4 6] remainderSet=[4 6] distribution=9 remainder=1 available=[1 3 4 0 14 2 4 4] balance=4.031
"combo remainderSet balance" combo=[2 4 6] remainderSet=[2] distribution=9 remainder=1 available=[4 0 3 4 1 14 2 4] balance=4.031
"combo remainderSet balance" combo=[2 4 6] remainderSet=[4] distribution=9 remainder=1 available=[3 4 0 14 2 4 4 1] balance=4.031
"combo remainderSet balance" combo=[2 4 6] remainderSet=[6] distribution=9 remainder=1 available=[1 13 2 4 4 1 3 4] balance=3.606
"bestCombo found" distribution=9 bestCombo=[2 4 6] bestRemainder=[6]

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 20:49:58 +00:00
Kevin Klues
e284c74d93 Add unit test for CPUManager distribute NUMA algorithm verifying fixes
Before Change:
"test" description="ensure bestRemainder chosen with NUMA nodes that have enough CPUs to satisfy the request"
"combo remainderSet balance" combo=[0 1 2 3] remainderSet=[0 1] distribution=8 remainder=2 available=[-1 -1 0 6] balance=2.915
"combo remainderSet balance" combo=[0 1 2 3] remainderSet=[0 2] distribution=8 remainder=2 available=[-1 0 -1 6] balance=2.915
"combo remainderSet balance" combo=[0 1 2 3] remainderSet=[0 3] distribution=8 remainder=2 available=[5 -1 0 0] balance=2.345
"combo remainderSet balance" combo=[0 1 2 3] remainderSet=[1 2] distribution=8 remainder=2 available=[0 -1 -1 6] balance=2.915
"combo remainderSet balance" combo=[0 1 2 3] remainderSet=[1 3] distribution=8 remainder=2 available=[0 -1 0 5] balance=2.345
"combo remainderSet balance" combo=[0 1 2 3] remainderSet=[2 3] distribution=8 remainder=2 available=[0 0 -1 5] balance=2.345
"bestCombo found" distribution=8 bestCombo=[0 1 2 3] bestRemainder=[0 3]

--- FAIL: TestTakeByTopologyNUMADistributed (0.01s)
    --- FAIL: TestTakeByTopologyNUMADistributed/ensure_bestRemainder_chosen_with_NUMA_nodes_that_have_enough_CPUs_to_satisfy_the_request (0.00s)
        cpu_assignment_test.go:867: unexpected error [accounting error, not enough CPUs allocated, remaining: 1]

After Change:
"test" description="ensure bestRemainder chosen with NUMA nodes that have enough CPUs to satisfy the request"
"combo remainderSet balance" combo=[0 1 2 3] remainderSet=[3] distribution=8 remainder=2 available=[0 0 0 4] balance=1.732
"bestCombo found" distribution=8 bestCombo=[0 1 2 3] bestRemainder=[3]

SUCCESS

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 20:45:37 +00:00
Kevin Klues
031f11513d Fix accounting bug in CPUManager distribute NUMA policy
Without this fix, the algorithm may decide to allocate "remainder" CPUs from a
NUMA node that has no more CPUs to allocate. Moreover, it was only considering
allocation of remainder CPUs from NUMA nodes such that each NUMA node in the
remainderSet could only allocate 1 (i.e. 'cpuGroupSize') more CPUs. With these
two issues in play, one could end up with an accounting error where not enough
CPUs were allocated by the time the algorithm runs to completion.

The updated algorithm will now omit any NUMA nodes that have 0 CPUs left from
the set of NUMA nodes considered for allocating remainder CPUs. Additionally,
we now consider *all* combinations of nodes from the remainder set of size
1..len(remainderSet). This allows us to find a better solution if allocating
CPUs from a smaller set leads to a more balanced allocation. Finally, we loop
through all NUMA nodes 1-by-1 in the remainderSet until all remainder CPUs have
been accounted for and allocated. This ensures that we will not hit an
accounting error later on because we explicitly remove CPUs from the remainder
set until there are none left.

A follow-on commit adds a set of unit tests that fail before these
changes, but succeed after them.
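
The enumeration described above can be sketched roughly as follows. This is a simplified illustration and not the CPUManager code: the function names are hypothetical, and the scoring of each candidate by balance is omitted. NUMA nodes with no CPUs left are dropped first, and then every combination of size 1..len(remainderSet) is visited.

```go
package main

import "fmt"

// nodesWithCPUs keeps only the NUMA nodes that still have CPUs available.
func nodesWithCPUs(nodes []int, available map[int]int) []int {
	var out []int
	for _, n := range nodes {
		if available[n] > 0 {
			out = append(out, n)
		}
	}
	return out
}

// combinations calls visit with every k-element combination of set.
func combinations(set []int, k int, visit func([]int)) {
	var rec func(start int, cur []int)
	rec = func(start int, cur []int) {
		if len(cur) == k {
			visit(append([]int(nil), cur...)) // hand out a copy
			return
		}
		for i := start; i < len(set); i++ {
			rec(i+1, append(cur, set[i]))
		}
	}
	rec(0, nil)
}

// remainderCandidates returns all combinations of size 1..len(set).
func remainderCandidates(set []int) [][]int {
	var out [][]int
	for k := 1; k <= len(set); k++ {
		combinations(set, k, func(c []int) { out = append(out, c) })
	}
	return out
}

func main() {
	available := map[int]int{0: 0, 2: 3, 4: 1, 6: 5}
	usable := nodesWithCPUs([]int{0, 2, 4, 6}, available)
	for _, c := range remainderCandidates(usable) {
		fmt.Println(c)
	}
}
```

For a usable set of three nodes this yields seven candidate remainder sets, each of which could then be scored for balance.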

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 19:18:11 +00:00
Kevin Klues
5317a2e2ac Fix error handling in CPUManager distribute NUMA tests
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 16:51:31 +00:00
Kevin Klues
dc4430b663 Add a sum() helper to the CPUManager cpuassignment logic
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 16:51:29 +00:00
Kevin Klues
cfacc22459 Allow the map.Values() function in the CPUManager to take a set of keys
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 16:51:28 +00:00
Kevin Klues
a160d9a8cd Fix CPUManager algo to calculate min NUMA nodes needed for distribution
Previously the algorithm was too restrictive because it tried to calculate the
minimum based on the number of *available* NUMA nodes and the number of
*available* CPUs on those NUMA nodes. Since there was no (easy) way to tell how
many CPUs an individual NUMA node happened to have, the average across them was
used. Using this value however, could result in thinking you need more NUMA
nodes to possibly satisfy a request than you actually do.

By using the *total* number of NUMA nodes and CPUs per NUMA node, we can get
the true minimum number of nodes required to satisfy a request. For a given
"current" allocation this may not be the true minimum, but it's better to start
with fewer and move up than to start with too many and miss out on a better
option.
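
Assuming every NUMA node has the same total number of CPUs, the lower bound described here reduces to a ceiling division. The sketch below is only an illustration of that arithmetic; the function name `minNUMANodes` is an assumption, and the real algorithm may compute the value differently.

```go
package main

import "fmt"

// minNUMANodes returns the fewest NUMA nodes that could possibly hold
// numCPUs, given the *total* number of CPUs per NUMA node (not the number
// currently available). It returns 0 for non-positive inputs.
func minNUMANodes(numCPUs, cpusPerNUMANode int) int {
	if numCPUs <= 0 || cpusPerNUMANode <= 0 {
		return 0
	}
	// Ceiling division without floating point.
	return (numCPUs + cpusPerNUMANode - 1) / cpusPerNUMANode
}

func main() {
	fmt.Println(minNUMANodes(48, 20)) // 3
	fmt.Println(minNUMANodes(40, 20)) // 2
}
```

Starting the search at this bound and moving up matches the "start with fewer and move up" reasoning in the message.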

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 16:51:26 +00:00
Kevin Klues
209cd20548 Fix unit tests following bug fix in CPUManager for map functions (2/2)
Now that the algorithm for balancing CPU distributions across NUMA nodes is
correct, this test actually behaves differently for the "packed" vs.
"distributed" allocation algorithms (as it should).

In the "packed" case we need to ensure that CPUs are allocated such that they
are packed onto cores. Since one CPU is already allocated from a core on NUMA
node 0, we want the next CPU to be its hyperthreaded pair (even though the
first available CPU id is on Socket 1).

In the "distributed" case, however, we want to ensure CPUs are allocated such
that we have a balanced distribution of CPUs across all NUMA nodes. This
points to allocating from Socket 1 if the only other CPU allocated has been
done on Socket 0.

To allow CPU allocations to be packed onto full cores, one can allocate them
from the "distributed" algorithm with a 'cpuGroupSize' equal to the number of
hyperthreads per core (in this case 2). We added an explicit test case for this,
demonstrating that we get the same result as the "packed" algorithm does, even
though the "distributed" algorithm is in use.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 16:51:24 +00:00
Kevin Klues
67f719cb1d Fix unit tests following bug fix in CPUManager for map functions (1/2)
This fixes two related tests to better test our "balanced" distribution algorithm.

The first test originally provided an input with the following number of CPUs
available on each NUMA node:

Node 0: 16
Node 1: 20
Node 2: 20
Node 3: 20

It then attempted to distribute 48 CPUs across them with an expectation that
each of the first 3 NUMA nodes would have 16 CPUs taken from them (leaving Node
0 with no more CPUs in the end).

This would have resulted in the following amount of CPUs on each node:

Node 0: 0
Node 1: 4
Node 2: 4
Node 3: 20

Which results in a standard deviation of 7.6811

However, a more balanced solution would actually be to pull 16 CPUs from NUMA
nodes 1, 2, and 3, and leave 0 untouched, i.e.:

Node 0: 16
Node 1: 4
Node 2: 4
Node 3: 4

Which results in a standard deviation of 5.1961524227066

To fix this test we changed the original number of available CPUs to start with
4 fewer CPUs on NUMA node 3, and 2 more CPUs on NUMA node 0, i.e.:

Node 0: 18
Node 1: 20
Node 2: 20
Node 3: 16

So that we end up with a result of:

Node 0: 2
Node 1: 4
Node 2: 4
Node 3: 16

Which pulls the CPUs from where we want and results in a standard deviation of 5.5452

For the second test, we simply reverse the number of CPUs available for Nodes 0
and 3 as:

Node 0: 16
Node 1: 20
Node 2: 20
Node 3: 18

Which forces the allocation to happen just as it did for the first test, except
now on NUMA nodes 1, 2, and 3 instead of NUMA nodes 0,1, and 2.
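
The standard deviations quoted above can be reproduced with a population standard deviation over the CPUs remaining on each node. The helpers below are a minimal sketch for checking the arithmetic, not the CPUManager implementation; the names `mean`, `stddev`, and `round3` are chosen for this example, with `round3` rounding to the nearest 1000th.

```go
package main

import (
	"fmt"
	"math"
)

// mean returns the arithmetic mean of xs.
func mean(xs []int) float64 {
	sum := 0
	for _, x := range xs {
		sum += x
	}
	return float64(sum) / float64(len(xs))
}

// stddev returns the population standard deviation of xs.
func stddev(xs []int) float64 {
	m := mean(xs)
	var sq float64
	for _, x := range xs {
		d := float64(x) - m
		sq += d * d
	}
	return math.Sqrt(sq / float64(len(xs)))
}

// round3 rounds x to the nearest 1000th.
func round3(x float64) float64 {
	return math.Round(x*1000) / 1000
}

func main() {
	fmt.Println(round3(stddev([]int{0, 4, 4, 20}))) // 7.681
	fmt.Println(round3(stddev([]int{16, 4, 4, 4}))) // 5.196
	fmt.Println(round3(stddev([]int{2, 4, 4, 16}))) // 5.545
}
```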

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 16:51:23 +00:00
Kevin Klues
4008ea0b4c Fix bug in CPUManager map.Keys() and map.Values() implementations
Previously these would return lists that were too long because we appended to
pre-initialized lists with a specific size.

Since the primary place these functions are used is in the mean and standard
deviation calculations for the NUMA distribution algorithm, it meant that the
results of these calculations were often incorrect.

As a result, some of the unit tests we have are actually incorrect (because the
results we expect do not actually produce the best balanced
distribution of CPUs across all NUMA nodes for the input provided).

These tests will be patched up in subsequent commits.
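
The class of bug described above can be reproduced in a few lines. This is a generic sketch on a plain `map[int]int`, not the CPUManager's own map type, and the function names are made up for the example: appending to a slice created with a non-zero length leaves the zero-valued elements in front, so the result is twice as long as intended.

```go
package main

import "fmt"

// keysBuggy appends to a slice that already has len(m) zero-valued
// elements, so the result has 2*len(m) entries.
func keysBuggy(m map[int]int) []int {
	keys := make([]int, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	return keys
}

// keysFixed allocates capacity only, so the result has exactly len(m)
// entries.
func keysFixed(m map[int]int) []int {
	keys := make([]int, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	return keys
}

func main() {
	m := map[int]int{0: 16, 1: 20, 2: 20}
	fmt.Println(len(keysBuggy(m))) // 6
	fmt.Println(len(keysFixed(m))) // 3
}
```

The extra zeros skew any mean or standard deviation computed over the returned values, which is consistent with the incorrect results mentioned in the message.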

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 16:51:21 +00:00
Kevin Klues
446c58e0e7 Ensure we balance across *all* NUMA nodes in NUMA distribution algo
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 16:51:19 +00:00
Kevin Klues
c8559bc43e Short-circuit CPUManager distribute NUMA algo for unusable cpuGroupSize
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 16:51:16 +00:00
Kevin Klues
b28c1392d7 Round the CPUManager mean and stddev calculations to the nearest 1000th
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-11-24 16:51:13 +00:00
ahrtr
b7f22801fe add more info when failing to call PdhAddEnglishCounter 2021-11-24 13:49:34 +08:00
Kubernetes Prow Robot
ddfc53922c
Merge pull request #106414 from jonyhy96/kubelet-fix-flake
kubelet: fix npe in test
2021-11-19 07:06:51 -08:00
haoyun
65ac99eef5 fix: npe in kubelet test
Signed-off-by: haoyun <yun.hao@daocloud.io>
Co-authored-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
2021-11-19 17:44:05 +08:00
shuheiktgw
2acdaeb361 Refactor Kubelet config validation tests 2021-11-18 22:38:01 +09:00
shuheiktgw
35ad91ab37 Refactor Kubelet config validations 2021-11-18 22:31:31 +09:00
Shivam Sandbhor
6652c54d83 Remove invalid comment in legacyregistry
Signed-off-by: Shivam Sandbhor <shivam.sandbhor@gmail.com>
2021-11-18 15:05:00 +05:30
Kubernetes Prow Robot
d766ab88f7
Merge pull request #106501 from ehashman/cri-graduation-v1
Make CRI v1 the default and allow a fallback to v1alpha2
2021-11-17 19:57:01 -08:00
Kubernetes Prow Robot
91b7fb4dc9
Merge pull request #102915 from wzshiming/feat/graceful-shutdown-based-on-pod-priority
Graceful Node Shutdown Based On Pod Priority
2021-11-17 18:45:03 -08:00
Kubernetes Prow Robot
321e22d365
Merge pull request #106505 from ehashman/revert-103980-dkc-metrics
Revert "Bump DynamicKubeConfig metric deprecation to 1.23"
2021-11-17 16:55:03 -08:00
Kubernetes Prow Robot
e4952f32b7
Merge pull request #106463 from SergeyKanzhelev/grpcProbe
Implement grpc probe action
2021-11-17 12:43:54 -08:00
Elana Hashman
b35c500541
Revert "Bump DynamicKubeConfig metric deprecation to 1.23" 2021-11-17 11:48:49 -08:00
Elana Hashman
31c4273f66
Add test for memory equivalence
See https://github.com/kubernetes/kubernetes/pull/106006#issuecomment-971004230

Co-Authored-By: Jordan Liggitt <liggitt@google.com>
2021-11-17 11:07:09 -08:00
Sascha Grunert
de37b9d293
Make CRI v1 the default and allow a fallback to v1alpha2
This patch makes the CRI `v1` API the new project-wide default version.
To allow backwards compatibility, a fallback to `v1alpha2` has been added
as well. This fallback is determined automatically by the kubelet.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2021-11-17 11:05:05 -08:00
Sergey Kanzhelev
b7affcced1 implement :grpc probe action 2021-11-17 17:31:23 +00:00
Antonio Ojea
d126b14838 migrate nolint comments to golangci-lint 2021-11-17 13:58:53 +01:00
Hanna Lee
e78b3e8dfe Use nolint directive instead of stopping ticker, per liggitt's suggestion 2021-11-17 08:56:57 +01:00
Hanna Lee
69d029bddb Add syncTicker.Stop() 2021-11-17 08:56:57 +01:00
Hanna Lee
07a883d8e6 Remove //lint:ignore pragmas that aren't being used anymore 2021-11-17 08:56:54 +01:00
Hanna Lee
1fbf06f5ad Use time.NewTicker instead of time.Tick to avoid leaking 2021-11-17 08:56:00 +01:00
Hanna Lee
0f3836dcc5 Ignore deprecation warnings with //nolint:staticcheck 2021-11-17 08:55:57 +01:00
Kubernetes Prow Robot
6c357f9996
Merge pull request #106041 from jonyhy96/volumemanager-reconciler-codefmt
kubelet: extract multiple ignore errors validate logic to isExpectedError
2021-11-16 22:55:53 -08:00
Shiming Zhang
7a6f792ff3 Add validation for GracefulNodeShutdownBasedOnPodPriority
Co-authored-by: Elana Hashman <ehashman@users.noreply.github.com>
2021-11-17 11:47:12 +08:00
Shiming Zhang
545313bdc7 Implement graceful shutdown based on Pod priority 2021-11-17 11:47:12 +08:00
Shiming Zhang
d82f606970 Add field for KubeletConfiguration and Regenerate 2021-11-17 11:47:12 +08:00
Kubernetes Prow Robot
1f6d5caa9a
Merge pull request #105437 from cmssczy/update-kubelet-configuration
migrate --register-with-taints to KubeletConfiguration
2021-11-16 17:44:00 -08:00
menglong.qi
b886b9b108 fix: typo 2021-11-17 09:22:57 +08:00
Kubernetes Prow Robot
42d8b2f3b9
Merge pull request #106289 from CatherineF-dev/fix-metrics-AlreadyRegisteredError-in-unit-test
Fix metrics AlreadyRegisteredError on TestRecordOperation and TestGetHistogramVecFromGatherer unit test
2021-11-16 16:36:15 -08:00
Kubernetes Prow Robot
6805e6ee41
Merge pull request #104722 from leiyiz/migration
turning on the CSIMigrationGCE feature flag
2021-11-16 15:28:32 -08:00
Léiyì Zhang
275fdf0884 fixing unit test failures induced by turning on CSIMigrationGCE
disable CSIMigrationGCE in some unit tests
2021-11-16 19:26:30 +00:00
CatherineF-dev
5646120fbb Use Reset at first 2021-11-16 18:57:24 +00:00
haoyun
b5409adaeb refactor: extract multiple ignore errors validate to ignoreError
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-11-16 20:43:50 +08:00
caozhiyuan
bad4faf1b9 migrate --register-with-taints to KubeletConfiguration 2021-11-16 19:10:36 +08:00
Kubernetes Prow Robot
1d1d462d2f
Merge pull request #104287 from jsturtevant/windows-stats
Reduce the number of expensive calls in the Windows stats queries for dockershim
2021-11-15 18:51:37 -08:00
Kubernetes Prow Robot
0473cab823
Merge pull request #103299 from wgahnagl/addPinned
prevents garbage collection from removing pinned images
2021-11-15 18:51:25 -08:00
Kubernetes Prow Robot
39af75af30
Merge pull request #106201 from yxxhero/fea_106111
Add more msg when exec probe timeout
2021-11-15 17:51:37 -08:00
Kubernetes Prow Robot
463802765d
Merge pull request #104650 from yxxhero/initcontainer_oomkiil_as_a_failure
fix init container oomkilled as a failure
2021-11-15 17:51:25 -08:00
Kubernetes Prow Robot
b7c4962472
Merge pull request #105685 from liggitt/kubelet-file-test
Simplify kubelet file config field allowlists
2021-11-15 14:06:48 -08:00
Odin Ugedal
de0ece541c
Fix cpu share issues on systems with large amounts of cpu
On systems where the calculated cpu shares result in a value above the
max value in linux, containers getting that value are unable to start.
This occurs on systems with 300+ cpu cores, and where containers are
given such a value.

This issue was fixed for the pod and qos control groups in the similar
cm.MilliCPUToShares that also has tests verifying the behavior. Since
this code already has a dependency on kubelet/cm, let's reuse that code
instead.
2021-11-14 19:49:19 +00:00
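The clamping described in the commit message above can be sketched roughly as follows. This is an illustrative approximation and not the actual cm.MilliCPUToShares source: the lowercase names are hypothetical, and the bounds 2 and 262144 are assumed here to be the cgroup v1 cpu.shares limits. With 1024 shares per CPU, a request of 300 CPUs would compute to 307200 shares, which exceeds the assumed maximum and is therefore capped.

```go
package main

import "fmt"

// Illustrative constants; names are hypothetical. The bounds are assumed
// to match the cgroup v1 cpu.shares limits.
const (
	minShares     = 2      // assumed lowest accepted cpu.shares value
	maxShares     = 262144 // assumed highest accepted cpu.shares value
	sharesPerCPU  = 1024   // shares corresponding to one full CPU
	milliCPUToCPU = 1000   // millicores per CPU
)

// milliCPUToShares converts a millicore request into cpu shares and
// clamps the result into [minShares, maxShares].
func milliCPUToShares(milliCPU int64) uint64 {
	if milliCPU == 0 {
		// No request: fall back to the minimum share value.
		return minShares
	}
	shares := (milliCPU * sharesPerCPU) / milliCPUToCPU
	if shares < minShares {
		return minShares
	}
	if shares > maxShares {
		// Without this cap, very large requests would produce a value
		// the kernel rejects, which is the failure the commit describes.
		return maxShares
	}
	return uint64(shares)
}

func main() {
	for _, m := range []int64{0, 500, 1000, 300000} {
		fmt.Printf("%d millicores -> %d shares\n", m, milliCPUToShares(m))
	}
}
```

Reusing one clamped conversion for containers, pods, and QoS groups keeps the three code paths from drifting apart, which appears to be the motivation stated in the commit.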
Kubernetes Prow Robot
e4c795168b
Merge pull request #106332 from bobbypage/disable-memcg-notifier
kubelet: cgroupv2 disable memcg notifications
2021-11-12 18:36:46 -08:00
CatherineF-dev
d9737eabf4 Use HandlerFor 2021-11-12 23:09:51 +00:00
CatherineF-dev
49d341aa2b Use defer in non-loop 2021-11-12 23:03:38 +00:00
Kubernetes Prow Robot
1f6aa87a93
Merge pull request #105744 from jsturtevant/windows-containerd-networkstats
Get Windows network stats directly for Containerd
2021-11-12 12:36:41 -08:00
Kubernetes Prow Robot
5f0a94b23c
Merge pull request #104743 from gjkim42/ensure-pod-uniqueness
Ensure there is one running static pod with the same full name
2021-11-12 12:36:28 -08:00
Kubernetes Prow Robot
6c04f87470
Merge pull request #106382 from rphillips/fix_close_log
kubelet: fix file descriptor leak in log rotations
2021-11-12 09:22:40 -08:00
Neha Lohia
fa1b6765d5
move pkg/util/node to component-helpers/node/util (#105347)
Signed-off-by: Neha Lohia <nehapithadiya444@gmail.com>
2021-11-12 07:52:27 -08:00
CatherineF-dev
a30af261f1 remove lint 2021-11-12 15:03:44 +00:00
Ryan Phillips
d6f9df424a defer close the rotated log open 2021-11-12 08:13:24 -06:00
CatherineF-dev
a8324a3bb7 clean 2021-11-12 03:52:19 +00:00
CatherineF-dev
744785ee40 remove prometheus.DefaultRegisterer 2021-11-12 02:17:28 +00:00
Kubernetes Prow Robot
3ca3daac76
Merge pull request #103415 from tiloso/staticcheck-kubelet
Fix staticcheck failure in pkg/kubelet/cm/cpuset
2021-11-11 15:15:13 -08:00
Gunju Kim
2dd4a00509
kubelet: Remove false PLEG errors 2021-11-12 00:03:01 +09:00
David Porter
f5140d3145 kubelet: cgroupv2 disable memcg notifications
The current memory notifier on cgroupv2 relies on reading
`cgroup.event_control` which is unsupported on cgroupv2. For now, let's
disable the feature on cgroupv2.
2021-11-10 15:40:59 -08:00
ravisantoshgudimetla
696abecada [test][kubelet]: Fix out of bounds in TestSyncLabels unit 2021-11-10 16:53:59 -05:00
James Sturtevant
ab2e58c416 Get network stats directly 2021-11-10 12:43:56 -08:00
James Sturtevant
c39945c116 Add unit tests to existing code 2021-11-10 11:50:04 -08:00
James Sturtevant
3564cd5beb Reduce calls to docker from dockershim for stats 2021-11-10 11:25:03 -08:00
Kubernetes Prow Robot
b56dc43458
Merge pull request #106282 from bobbypage/cadvisor-v043
vendor: Bump cAdvisor to v0.43.0
2021-11-10 08:17:38 -08:00
CatherineF-dev
8290400e9c format 2021-11-10 03:29:13 +00:00
CatherineF-dev
ef0b2dfbf4 Fix metrics AlreadyRegisteredError on TestRecordOperation and TestGetHistogramVecFromGatherer unit test 2021-11-10 03:23:54 +00:00
Kubernetes Prow Robot
5d60c8d857
Merge pull request #102393 from mengjiao-liu/fix-sysctl-regex
Upgrade preparation to verify sysctl values containing forward slashes by regex
2021-11-09 18:23:26 -08:00