kubernetes

Author	SHA1	Message	Date
Abirdcfly	e35cfbb5a7	fix: some function should pass context parameter Change-Id: Ib509573a72c8bd0c61233ade415fef470c61bf5f	2022-03-04 00:42:45 +08:00
Kubernetes Prow Robot	88f9728339	Merge pull request #108309 from zshihang/token no auto-generation of secret-based service account token	2022-03-02 06:19:15 -08:00
Kubernetes Prow Robot	422001df8b	Merge pull request #108154 from klueska/fix-topology-manager Update TopologyManager algorithm for selecting "best" non-preferred hint	2022-03-02 04:13:13 -08:00
Kubernetes Prow Robot	0dc83fe696	Merge pull request #108078 from tnqn/endpoints-resync Skip updating Endpoints if no relevant fields change	2022-03-01 23:23:13 -08:00
Kubernetes Prow Robot	2de37aa9fa	Merge pull request #108276 from AllenZMC/improve_util_test_coverage improve test coverage	2022-03-01 16:57:13 -08:00
Kubernetes Prow Robot	604ab4fc6c	Merge pull request #108340 from ArangoGutierrez/misspelled/1 Fix typo in pkg/kubelet/pluginmanager/cache/actual_state_of_world	2022-03-01 15:45:55 -08:00
Kubernetes Prow Robot	5d6a793221	Merge pull request #96828 from panjf2000/opt-epoll-eventfd kubelet/eviction: eliminate redundant allocations when handling eventfd	2022-03-01 13:59:54 -08:00
Kubernetes Prow Robot	66daef4aa7	Merge pull request #108167 from jfremy/fix-107973 Fix nodes volumesAttached status not being updated	2022-03-01 12:49:54 -08:00
Kubernetes Prow Robot	0e8e307567	Merge pull request #106570 from odinuge/fix-cpu-shares-on-big-systems Fix cpu share issues on systems with large amounts of cpu	2022-03-01 10:15:55 -08:00
Kevin Klues	e370b7335c	Add extensive unit testing for TopologyManager hint generation algorithm Signed-off-by: Kevin Klues <kklues@nvidia.com>	2022-03-01 17:30:24 +00:00
Kubernetes Prow Robot	46e78c1b80	Merge pull request #108407 from kerthcet/feature/graduate-defaultPodTopologySpread-to-ga-in-kube-feature update feature gate DefaultPodTopologySpread release note	2022-03-01 08:53:47 -08:00
Kevin Klues	99c57828ce	Update TopologyManager algorithm for selecting "best" non-preferred hint
For the 'single-numa' and 'restricted' TopologyManager policies, pods are only admitted if all of their containers have perfect alignment across the set of resources they are requesting. The best-effort policy, on the other hand, will prefer allocations that have perfect alignment, but fall back to a non-preferred alignment if perfect alignment can't be achieved.
The existing algorithm for choosing the best hint from the set of "non-preferred" hints is fairly naive and often results in choosing a sub-optimal hint. It works fine in cases where all resources would end up coming from a single NUMA node (even if it's not the same NUMA node), but breaks down as soon as multiple NUMA nodes are required for the "best" alignment. We will never be able to achieve perfect alignment with these non-preferred hints, but we should try to do something more intelligent than simply choosing the hint with the narrowest mask.
In an ideal world, we would have the TopologyManager return a set of "resource-relative" hints (as opposed to a common hint for all resources as is done today). Each resource-relative hint would indicate how many other resources could be aligned to it on a given NUMA node, and a hint provider would use this information to allocate its resources in the most aligned way possible. There are likely some edge cases to consider here, but such an algorithm would allow us to do partial-perfect-alignment of "some" resources, even if all resources could not be perfectly aligned.
Unfortunately, supporting something like this would require a major redesign of how the TopologyManager interacts with its hint providers (as well as how those hint providers make decisions based on the hints they get back). That said, we can still do better than the naive algorithm we have today, and this patch provides a mechanism to do so.
We start by looking at the set of hints passed into the TopologyManager for each resource and generate a list of the minimum number of NUMA nodes required to satisfy an allocation for a given resource. Each entry in this list then contains the 'minNUMAAffinity.Count()' for a given resource. Once we have this list, we find the maximum 'minNUMAAffinity.Count()' from the list and mark that as the 'bestNonPreferredAffinityCount' that we would like to have associated with whatever "bestHint" we ultimately generate. The intuition is that we would like to (at the very least) get alignment for those resources that require multiple NUMA nodes to satisfy their allocation. If we can't quite get there, then we should try to come as close to it as possible.
Once we have this 'bestNonPreferredAffinityCount', the algorithm proceeds as follows: if the mergedHint and bestHint are both non-preferred, then try to find a hint whose affinity count is as close to (but not higher than) the bestNonPreferredAffinityCount as possible. To do this we need to consider the following cases and react accordingly:
  1. bestHint.NUMANodeAffinity.Count() > bestNonPreferredAffinityCount
  2. bestHint.NUMANodeAffinity.Count() == bestNonPreferredAffinityCount
  3. bestHint.NUMANodeAffinity.Count() < bestNonPreferredAffinityCount
For case (1), the current bestHint is larger than the bestNonPreferredAffinityCount, so updating to any narrower mergedHint is preferred over staying where we are.
For case (2), the current bestHint is equal to the bestNonPreferredAffinityCount, so we would like to stick with what we have unless the current mergedHint is also equal to bestNonPreferredAffinityCount and it is narrower.
For case (3), the current bestHint is less than bestNonPreferredAffinityCount, so we would like to creep back up to bestNonPreferredAffinityCount as close as we can. There are three cases to consider here:
  3a. mergedHint.NUMANodeAffinity.Count() > bestNonPreferredAffinityCount
  3b. mergedHint.NUMANodeAffinity.Count() == bestNonPreferredAffinityCount
  3c. mergedHint.NUMANodeAffinity.Count() < bestNonPreferredAffinityCount
For case (3a), we just want to stick with the current bestHint because choosing a new hint that is greater than bestNonPreferredAffinityCount would be counter-productive.
For case (3b), we want to immediately update bestHint to the current mergedHint, making it now equal to bestNonPreferredAffinityCount.
For case (3c), we know that both the current bestHint and the current mergedHint are less than bestNonPreferredAffinityCount, so we want to choose one that brings us back up as close to bestNonPreferredAffinityCount as possible. There are three cases to consider here:
  3ca. mergedHint.NUMANodeAffinity.Count() > bestHint.NUMANodeAffinity.Count()
  3cb. mergedHint.NUMANodeAffinity.Count() < bestHint.NUMANodeAffinity.Count()
  3cc. mergedHint.NUMANodeAffinity.Count() == bestHint.NUMANodeAffinity.Count()
For case (3ca), we want to immediately update bestHint to mergedHint because that will bring us closer to the (higher) value of bestNonPreferredAffinityCount.
For case (3cb), we want to stick with the current bestHint because choosing the current mergedHint would strictly move us further away from the bestNonPreferredAffinityCount.
Finally, for case (3cc), we know that the current bestHint and the current mergedHint are equal, so we simply choose the narrower of the two.
This patch implements this algorithm for the case where we must choose from a set of non-preferred hints and provides a set of unit tests to verify its correctness.
Signed-off-by: Kevin Klues <kklues@nvidia.com>	2022-03-01 14:38:26 +00:00
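The case analysis in the commit message above can be sketched in Go roughly as follows. This is a minimal illustration, not the code from the patch: the `mask` type and the `narrowerThan`, `pickNonPreferred`, and `bestNonPreferredAffinityCount` helpers are hypothetical names, and "narrower" is assumed here to mean fewer NUMA nodes, with ties broken by the lower numeric mask value.

```go
package main

import (
	"fmt"
	"math/bits"
)

// mask is a hypothetical stand-in for a NUMA affinity bitmask:
// bit i set means NUMA node i is part of the hint.
type mask uint64

// count returns the number of NUMA nodes in the mask.
func (m mask) count() int { return bits.OnesCount64(uint64(m)) }

// narrowerThan is an assumed definition of "narrower": fewer bits set,
// with ties broken by the lower numeric value.
func (m mask) narrowerThan(o mask) bool {
	if m.count() == o.count() {
		return m < o
	}
	return m.count() < o.count()
}

// bestNonPreferredAffinityCount returns the maximum of the per-resource
// minimum NUMA node counts, as described in the commit message.
func bestNonPreferredAffinityCount(minCounts []int) int {
	best := 0
	for _, c := range minCounts {
		if c > best {
			best = c
		}
	}
	return best
}

// pickNonPreferred chooses between the current best hint and a newly merged
// hint when both are non-preferred, following cases 1, 2, 3a-3c and 3ca-3cc.
func pickNonPreferred(best, merged mask, target int) mask {
	switch {
	case best.count() > target: // case 1: any narrower hint is an improvement
		if merged.narrowerThan(best) {
			return merged
		}
		return best
	case best.count() == target: // case 2: only switch to a narrower hint of equal count
		if merged.count() == target && merged.narrowerThan(best) {
			return merged
		}
		return best
	}
	// case 3: best is below the target, so try to creep back up toward it.
	switch {
	case merged.count() > target: // 3a: overshoots the target
		return best
	case merged.count() == target: // 3b: hits the target exactly
		return merged
	case merged.count() > best.count(): // 3ca: closer to the target
		return merged
	case merged.count() < best.count(): // 3cb: further from the target
		return best
	}
	// 3cc: equal counts, keep the narrower of the two.
	if merged.narrowerThan(best) {
		return merged
	}
	return best
}

func main() {
	target := bestNonPreferredAffinityCount([]int{1, 2, 1})
	fmt.Printf("target=%d chosen=%03b\n", target, pickNonPreferred(0b001, 0b110, target))
}
```

In this sketch a caller would fold `pickNonPreferred` over every merged hint, carrying the running best forward; handling of preferred hints and empty masks is left out for brevity.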
Kubernetes Prow Robot	50e07c9a12	Merge pull request #108381 from thockin/add-generated-openapi Add the last zz_generated.openapi.go file	2022-03-01 06:35:48 -08:00
kerthcet	9c5898ba90	feat: update feature gate DefaultPodTopologySpread release note Signed-off-by: kerthcet <kerthcet@gmail.com>	2022-03-01 19:06:24 +08:00
Kubernetes Prow Robot	bef9d807a0	Merge pull request #108325 from pacoxu/donotReturnErrWhenPauseLose do not return err when PodSandbox not exist	2022-02-28 18:15:46 -08:00
Kubernetes Prow Robot	e9ba9dc4e4	Merge pull request #107201 from pacoxu/add-metrics-volume-stats-cal add VolumeStatCalDuration metrics for fsquota monitoring benchmark	2022-02-28 16:07:46 -08:00
Kevin Klues	f8601cb5a3	Refactor TopologyManager to be more explicit about bestHint calculation Signed-off-by: Kevin Klues <kklues@nvidia.com>	2022-02-28 20:30:01 +00:00
Kubernetes Prow Robot	f4df7d0eb2	Merge pull request #108393 from ialidzhikov/nit/ingressclassnamespacedparams Correct comment related to IngressClassNamespacedParams feature gate	2022-02-28 12:23:46 -08:00
Tim Hockin	f9e19fc83e	Add the last zz_generated.openapi.go file We had 4 of 5 checked in.	2022-02-28 10:17:54 -08:00
ialidzhikov	856cf94d55	Correct comment related to IngressClassNamespacedParams feature gate Signed-off-by: ialidzhikov <i.alidjikov@gmail.com>	2022-02-28 18:55:13 +02:00
Kubernetes Prow Robot	bf7b9119f0	Merge pull request #108278 from kerthcet/feature/graduate-defaultPodTopologySpread-to-ga graduate default pod topology spread to ga	2022-02-28 08:02:57 -08:00
Kubernetes Prow Robot	d3ece70f0b	Merge pull request #108269 from kerthcet/refactor/rename-schedulercache-to-cache refactor: rename SchedulerCache to Cache in Scheduler	2022-02-24 14:46:13 -08:00
Carlos Eduardo Arango Gutierrez	bbb8ef1d10	Fix typo in pkg/kubelet/pluginmanager/cache/actual_state_of_world Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>	2022-02-24 16:20:24 -05:00
Kubernetes Prow Robot	06e107081e	Merge pull request #104732 from mengjiao-liu/remove-flag-experimental-check-node-capabilities-before-mount kubelet: Remove the deprecated flag `--experimental-check-node-capabilities-before-mount`	2022-02-24 07:56:30 -08:00
chenyw1990	e26df3594c	do not return err when PodSandbox not exist Co-authored-by: pacoxu <paco.xu@daocloud.io>	2022-02-24 14:58:39 +08:00
kerthcet	eafbaad9f7	refactor: rename SchedulerCache to Cache in Scheduler Signed-off-by: kerthcet <kerthcet@gmail.com>	2022-02-24 09:47:21 +08:00
kerthcet	09623be0b1	refactor: rename schedulerCache to cacheImpl in internal cache Signed-off-by: kerthcet <kerthcet@gmail.com>	2022-02-24 09:42:51 +08:00
Kubernetes Prow Robot	2fcdbd098c	Merge pull request #107993 from deads2k/simplify prevent enabling beta by default for new api groups	2022-02-23 16:03:35 -08:00
Kubernetes Prow Robot	77eb1a03df	Merge pull request #94637 from liggitt/namespace-before-admission set/validate object namespace before admission	2022-02-23 14:35:58 -08:00
Shihang Zhang	fb6c727fde	no auto-generation of secret-based service account token	2022-02-23 14:17:30 -08:00
Kubernetes Prow Robot	08c31088c1	Merge pull request #106858 from cmssczy/add_RegisterWithTaints_validation_test add kubelet config validation test for RegisterWithTaints	2022-02-23 12:51:58 -08:00
David Eads	af99d192cf	prevent enabling beta by default for new api groups	2022-02-23 13:51:43 -05:00
David Eads	a59b92e8c0	reduce API surface area of whether a resource is enabled	2022-02-23 13:36:33 -05:00
Kubernetes Prow Robot	343125cc6c	Merge pull request #107997 from d-honeybadger/fix-tracking-cronjob-owned-jobs Fix cronjob status reconciliation when job template labels change	2022-02-23 07:14:18 -08:00
d-honeybadger	fb094dc44e	cronjob_controllerv2: do not filter jobs to be reconciled by labels	2022-02-23 09:10:33 -05:00
kerthcet	4439fc3590	feat: graduate DefaultPodTopologySpread to GA Co-authored-by: drfish <drfish.me@gmail.com> Signed-off-by: kerthcet <kerthcet@gmail.com>	2022-02-23 19:45:27 +08:00
Kubernetes Prow Robot	296bf4f016	Merge pull request #108230 from sanposhiho/fake-extender-name Support ExtenderName in FakeExtender	2022-02-22 21:36:18 -08:00
Kubernetes Prow Robot	eacbf87bfe	Merge pull request #108156 from jsafrane/rename-selinuxsupport Rename SupportsSELinux to SELinuxRelabel	2022-02-22 20:12:20 -08:00
sanposhiho	0b16a7fefa	Support ExtenderName in FakeExtender	2022-02-23 12:14:39 +09:00
Kubernetes Prow Robot	5211a4b214	Merge pull request #103061 from SergeyKanzhelev/removeAlphaRuntimeClass Remove RuntimeClass feature gate and stop serving older versions of RuntimeClass	2022-02-22 19:08:18 -08:00
Kubernetes Prow Robot	bb610d0816	Merge pull request #108280 from liggitt/secrets Update secrets field API doc	2022-02-22 17:48:18 -08:00
Kubernetes Prow Robot	8f3636e8ac	Merge pull request #108224 from danwinship/kube-proxy-logging Only log full iptables-restore input at V(9)	2022-02-22 16:42:18 -08:00
Kubernetes Prow Robot	a2adaf75b7	Merge pull request #108205 from dkkb/fix/typo Fix typo allcoated -> allocated	2022-02-22 14:35:03 -08:00
Jean-Francois Remy	e83184568d	Add unit tests
  - actual_state_of_world_test.go: test the new method GetVolumesToReportAttachedForNode for an existing node and a non-existing node
  - node_status_updater_test.go: test UpdateNodeStatuses and UpdateNodeStatusForNode in the nominal case with 2 nodes getting one volume each. Test UpdateNodeStatuses with the first call to node.patch failing but the following one succeeding
  - add comment in node_status_updater.go
  - fix log line in reconciler.go
  - rename variable in actual_state_of_world.go	2022-02-22 12:21:58 -08:00
Jean-Francois Remy	f1717baaaa	Fix nodes volumesAttached status not updated
The UpdateNodeStatuses code stops too early when there is an error calling updateNodeStatus. It returns immediately, which means any remaining node won't have its update status put back to true.
Looking at the call sites for UpdateNodeStatuses, it appears this is not the only issue. If the lister call fails with anything but a Not Found error, it's silently ignored, which is wrong in the detach path. Also, the reconciler detach path calls UpdateNodeStatuses, but the real intent is to update only the node currently processed in the loop and not proceed with the detach call if there is an error updating that specific node's volumesAttached property. With the current implementation, it will not proceed if there is an error updating another node (which is not completely bad but not ideal) and, worse, it will proceed if there is a lister error on that node, which means the node's volumesAttached property won't have been updated.
To fix those issues, introduce the following changes:
  - [node_status_updater] introduce UpdateNodeStatusForNode, which does what UpdateNodeStatuses does but only for the provided node
  - [node_status_updater] if the node lister call fails for anything but a Not Found error, return an error instead of ignoring it
  - [node_status_updater] if the update of a node's volumesAttached property fails, continue processing the other nodes
  - [actual_state_of_world] introduce GetVolumesToReportAttachedForNode, which does what GetVolumesToReportAttached does but only for the node whose name is provided; it returns a bool indicating whether that node needs an update, as well as the volumesAttached list. It is used by UpdateNodeStatusForNode
  - [actual_state_of_world] use the write lock in updateNodeStatusUpdateNeeded, since we're modifying the map content
  - [reconciler] use UpdateNodeStatusForNode in the detach loop	2022-02-22 12:20:53 -08:00
Sergey Kanzhelev	06ee2969ef	do not serve node.k8s.io, version v1alpha1	2022-02-22 18:30:24 +00:00
Kubernetes Prow Robot	b917653296	Merge pull request #108263 from deads2k/more-resthandlers migrate more rest handlers to select by resource enablement	2022-02-22 10:15:16 -08:00
Jordan Liggitt	6b09e232cd	Update secrets field API doc	2022-02-22 13:12:03 -05:00
David Eads	0ec20f97d2	migrate more rest handlers to select by resource enablement	2022-02-22 12:07:43 -05:00
Kubernetes Prow Robot	108e8136e2	Merge pull request #107393 from danwinship/filter-endpoints kube-proxy endpoint filtering unit test refactoring	2022-02-22 08:55:15 -08:00


44022 Commits