kubernetes

Author	SHA1	Message	Date
Kevin Klues	dc4430b663	Add a sum() helper to the CPUManager cpuassignment logic Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:29 +00:00
Kevin Klues	cfacc22459	Allow the map.Values() function in the CPUManager to take a set of keys Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:28 +00:00
Kevin Klues	a160d9a8cd	Fix CPUManager algo to calculate min NUMA nodes needed for distribution Previously the algorithm was too restrictive because it tried to calculate the minimum based on the number of available NUMA nodes and the number of available CPUs on those NUMA nodes. Since there was no (easy) way to tell how many CPUs an individual NUMA node happened to have, the average across them was used. Using this value, however, could result in thinking you need more NUMA nodes to possibly satisfy a request than you actually do. By using the total number of NUMA nodes and CPUs per NUMA node, we can get the true minimum number of nodes required to satisfy a request. For a given "current" allocation this may not be the true minimum, but it's better to start with fewer and move up than to start with too many and miss out on a better option. Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:26 +00:00
Kevin Klues	209cd20548	Fix unit tests following bug fix in CPUManager for map functions (2/2) Now that the algorithm for balancing CPU distributions across NUMA nodes is correct, this test actually behaves differently for the "packed" vs. "distributed" allocation algorithms (as it should). In the "packed" case we need to ensure that CPUs are allocated such that they are packed onto cores. Since one CPU is already allocated from a core on NUMA node 0, we want the next CPU to be its hyperthreaded pair (even though the first available CPU id is on Socket 1). In the "distributed" case, however, we want to ensure CPUs are allocated such that we have a balanced distribution of CPUs across all NUMA nodes. This points to allocating from Socket 1 if the only other CPU allocated has been done on Socket 0. To allow CPU allocations to be packed onto full cores, one can allocate them from the "distributed" algorithm with a 'cpuGroupSize' equal to the number of hyperthreads per core (in this case 2). We added an explicit test case for this, demonstrating that we get the same result as the "packed" algorithm does, even though the "distributed" algorithm is in use. Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:24 +00:00
Kevin Klues	67f719cb1d	Fix unit tests following bug fix in CPUManager for map functions (1/2) This fixes two related tests to better test our "balanced" distribution algorithm. The first test originally provided an input with the following number of CPUs available on each NUMA node: Node 0: 16 Node 1: 20 Node 2: 20 Node 3: 20 It then attempted to distribute 48 CPUs across them with an expectation that each of the first 3 NUMA nodes would have 16 CPUs taken from them (leaving Node 0 with no more CPUs in the end). This would have resulted in the following number of CPUs on each node: Node 0: 0 Node 1: 4 Node 2: 4 Node 3: 20 Which results in a standard deviation of 7.6811 However, a more balanced solution would actually be to pull 16 CPUs from NUMA nodes 1, 2, and 3, and leave 0 untouched, i.e.: Node 0: 16 Node 1: 4 Node 2: 4 Node 3: 4 Which results in a standard deviation of 5.1961524227066 To fix this test we changed the original number of available CPUs to start with 4 fewer CPUs on NUMA node 3, and 2 more CPUs on NUMA node 0, i.e.: Node 0: 18 Node 1: 20 Node 2: 20 Node 3: 16 So that we end up with a result of: Node 0: 2 Node 1: 4 Node 2: 4 Node 3: 16 Which pulls the CPUs from where we want and results in a standard deviation of 5.5452 For the second test, we simply reverse the number of CPUs available for Nodes 0 and 3 as: Node 0: 16 Node 1: 20 Node 2: 20 Node 3: 18 Which forces the allocation to happen just as it did for the first test, except now on NUMA nodes 1, 2, and 3 instead of NUMA nodes 0, 1, and 2. Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:23 +00:00
Kevin Klues	4008ea0b4c	Fix bug in CPUManager map.Keys() and map.Values() implementations Previously these would return lists that were too long because we appended to pre-initialized lists with a specific size. Since the primary place these functions are used is in the mean and standard deviation calculations for the NUMA distribution algorithm, it meant that the results of these calculations were often incorrect. As a result, some of the unit tests we have are actually incorrect (because the results we expect do not actually produce the best balanced distribution of CPUs across all NUMA nodes for the input provided). These tests will be patched up in subsequent commits. Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:21 +00:00
Kevin Klues	446c58e0e7	Ensure we balance across all NUMA nodes in NUMA distribution algo Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:19 +00:00
Kevin Klues	c8559bc43e	Short-circuit CPUManager distribute NUMA algo for unusable cpuGroupSize Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:16 +00:00
Kevin Klues	b28c1392d7	Round the CPUManager mean and stddev calculations to the nearest 1000th Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:13 +00:00
Dave Chen	8609358975	Graduate `PreferNominatedNode` to GA Signed-off-by: Dave Chen <dave.chen@arm.com>	2021-11-24 14:50:53 +08:00
ahrtr	b7f22801fe	add more info when failing to call PdhAddEnglishCounter	2021-11-24 13:49:34 +08:00
Kubernetes Prow Robot	a5622f3f6e	Merge pull request #106616 from mattcary/pvc-race Clean up deep copy needed for UpdateStatefulSet	2021-11-23 09:38:17 -08:00
Matthew Cary	0e2b901762	Clean up deep copy needed for UpdateStatefulSet Change-Id: Id732358183d682d1a945cfee56f83bcaac0d7c31	2021-11-23 06:48:54 -08:00
Kubernetes Prow Robot	f572e4d5b4	Merge pull request #106518 from SergeyKanzhelev/tryProbeFix Fix the bug with GRPC probe	2021-11-22 15:38:54 -08:00
Aohan Yang	ad4fe13528	fix the error when cleaning up jobs for cronjob	2021-11-22 17:06:22 +08:00
Amim Knabben	8b37bfec8e	Enabling kube-proxy metrics on windows kernel mode	2021-11-21 21:23:55 -03:00
Kubernetes Prow Robot	084b28f6d5	Merge pull request #106510 from robscott/topology-ready-fix-controller Updating TopologyCache to disregard unready endpoints in calculations	2021-11-19 17:07:11 -08:00
Kubernetes Prow Robot	37ae94f9ed	Merge pull request #106507 from robscott/topology-ready-fix Updating kube-proxy to ignore unready endpoints for Topology Hints	2021-11-19 17:06:59 -08:00
Sergey Kanzhelev	f390d49e24	fix the grpc probes	2021-11-20 00:23:53 +00:00
Kubernetes Prow Robot	8f9dd0a14c	Merge pull request #105916 from kevindelgado/validation-unify-all Server Side Strict Field Validation	2021-11-19 14:27:22 -08:00
Kevin Delgado	e50e2bbc88	Server Side Field Validation Implements server side field validation behind the `ServerSideFieldValidation` feature gate. With the feature enabled, any create/update/patch request with the `fieldValidation` query param set to "Strict" will error if the object in the request body has unknown fields. A value of "Warn" (also the default when the feature is enabled) will succeed the request with a warning. When the feature is disabled (or the query param has a value of "Ignore"), the request will succeed as it previously had with no indication of any unknown or duplicate fields.	2021-11-19 21:24:36 +00:00
Kubernetes Prow Robot	ddfc53922c	Merge pull request #106414 from jonyhy96/kubelet-fix-flake kubelet: fix npe in test	2021-11-19 07:06:51 -08:00
haoyun	65ac99eef5	fix: npe in kubelet test Signed-off-by: haoyun <yun.hao@daocloud.io> Co-authored-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>	2021-11-19 17:44:05 +08:00
Rob Scott	1983f41065	Updating kube-proxy to ignore unready endpoints for Topology Hints	2021-11-18 14:04:44 -08:00
Rob Scott	9813ec7e8a	Updating TopologyCache to disregard unready endpoints in calculations	2021-11-18 13:54:09 -08:00
shuheiktgw	2acdaeb361	Refactor Kubelet config validation tests	2021-11-18 22:38:01 +09:00
shuheiktgw	35ad91ab37	Refactor Kubelet config validations	2021-11-18 22:31:31 +09:00
Kubernetes Prow Robot	b8af116327	Merge pull request #99728 from mattcary/control StatefulSet PVC auto-delete implementation	2021-11-18 03:37:02 -08:00
Shivam Sandbhor	6652c54d83	Remove invalid comment in legacyregistry Signed-off-by: Shivam Sandbhor <shivam.sandbhor@gmail.com>	2021-11-18 15:05:00 +05:30
Kubernetes Prow Robot	d766ab88f7	Merge pull request #106501 from ehashman/cri-graduation-v1 Make CRI v1 the default and allow a fallback to v1alpha2	2021-11-17 19:57:01 -08:00
Kubernetes Prow Robot	91b7fb4dc9	Merge pull request #102915 from wzshiming/feat/graceful-shutdown-based-on-pod-priority Graceful Node Shutdown Based On Pod Priority	2021-11-17 18:45:03 -08:00
hyschumi	7ad629c864	rm makeNodeWithExtendedResource method in noderesources unit test && doc: update least_allocated strategy detail doc	2021-11-18 09:15:26 +08:00
Kubernetes Prow Robot	321e22d365	Merge pull request #106505 from ehashman/revert-103980-dkc-metrics Revert "Bump DynamicKubeConfig metric deprecation to 1.23"	2021-11-17 16:55:03 -08:00
Matthew Cary	53b3a6c1d9	controller change for statefulset auto-delete (tests) Change-Id: I16b50e6853bba65fc89c793d2b9b335581c02407	2021-11-17 16:48:50 -08:00
Matthew Cary	bce87a3e4f	controller change for statefulset auto-delete (implementation)	2021-11-17 16:48:50 -08:00
Matthew Cary	f1d5d4df5a	tests for statefulset PersistentVolumeClaimDeletePolicy api change	2021-11-17 16:46:48 -08:00
Matthew Cary	98c37f9e2a	statefulset PersistentVolumeClaimDeletePolicy api change	2021-11-17 16:46:47 -08:00
Matthew Cary	0f5ffe6cf8	Add StatefulSetAutoDeletePVC feature gate	2021-11-17 15:35:49 -08:00
Kubernetes Prow Robot	e4952f32b7	Merge pull request #106463 from SergeyKanzhelev/grpcProbe Implement grpc probe action	2021-11-17 12:43:54 -08:00
Elana Hashman	b35c500541	Revert "Bump DynamicKubeConfig metric deprecation to 1.23"	2021-11-17 11:48:49 -08:00
Elana Hashman	31c4273f66	Add test for memory equivalence See https://github.com/kubernetes/kubernetes/pull/106006#issuecomment-971004230 Co-Authored-By: Jordan Liggitt <liggitt@google.com>	2021-11-17 11:07:09 -08:00
Sascha Grunert	de37b9d293	Make CRI `v1` the default and allow a fallback to `v1alpha2` This patch makes the CRI `v1` API the new project-wide default version. To allow backwards compatibility, a fallback to `v1alpha2` has been added as well. This fallback can be automatically determined by the kubelet. Signed-off-by: Sascha Grunert <sgrunert@redhat.com>	2021-11-17 11:05:05 -08:00
Sergey Kanzhelev	b7affcced1	implement grpc probe action	2021-11-17 17:31:23 +00:00
Antonio Ojea	d126b14838	migrate nolint comments to golangci-lint	2021-11-17 13:58:53 +01:00
Hanna Lee	e78b3e8dfe	Use nolint directive instead of stopping ticker, per liggitt's suggestion	2021-11-17 08:56:57 +01:00
Hanna Lee	69d029bddb	Add syncTicker.Stop()	2021-11-17 08:56:57 +01:00
Hanna Lee	07a883d8e6	Remove //lint:ignore pragmas that aren't being used anymore	2021-11-17 08:56:54 +01:00
Hanna Lee	c8fde197f5	Add more //nolint:staticcheck for failures caught in PR tests	2021-11-17 08:56:02 +01:00
Hanna Lee	1fbf06f5ad	Use time.NewTicker instead of time.Tick to avoid leaking	2021-11-17 08:56:00 +01:00
Hanna Lee	30ea05ae7b	Update IPVar and IPPortVar functions to have pointer receivers to fix 'ineffective assignment'	2021-11-17 08:56:00 +01:00
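
Commit 4008ea0b4c says that map.Keys() and map.Values() returned lists that were too long because values were appended to lists pre-initialized with a specific size. The Go sketch below illustrates that class of bug and one common fix. It is not the CPUManager code: the function names buggyValues and fixedValues and the map type are hypothetical, chosen only for illustration.

```go
package main

import "fmt"

// buggyValues (hypothetical name) reproduces the kind of bug described in
// commit 4008ea0b4c: the slice starts with len(m) zero values, and append
// then adds len(m) more elements after them.
func buggyValues(m map[int]int) []int {
	values := make([]int, len(m)) // length len(m), already filled with zeros
	for _, v := range m {
		values = append(values, v)
	}
	return values
}

// fixedValues (hypothetical name) starts with length 0 and capacity len(m),
// so append fills the slice without leaving zero-valued padding in front.
func fixedValues(m map[int]int) []int {
	values := make([]int, 0, len(m))
	for _, v := range m {
		values = append(values, v)
	}
	return values
}

func main() {
	// CPUs available per NUMA node, taken from the example in commit 67f719cb1d.
	available := map[int]int{0: 16, 1: 20, 2: 20, 3: 20}
	fmt.Println(len(buggyValues(available)), len(fixedValues(available))) // prints "8 4"
}
```

The extra zero entries matter because, as the commit message notes, these lists feed the mean and standard deviation calculations, so padding them skews both results.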

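The standard deviations quoted in commit 67f719cb1d can be checked with a short Go sketch. It assumes a population standard deviation (dividing by the number of nodes), which is the form that reproduces the quoted figures, and it rounds to the nearest 1000th in the spirit of commit b28c1392d7. The helper names mean, stddev, and round3 are hypothetical and are not taken from the CPUManager source.

```go
package main

import (
	"fmt"
	"math"
)

// round3 rounds x to the nearest 1000th.
func round3(x float64) float64 {
	return math.Round(x*1000) / 1000
}

// mean returns the arithmetic mean of xs. It assumes xs is non-empty.
func mean(xs []int) float64 {
	sum := 0
	for _, x := range xs {
		sum += x
	}
	return float64(sum) / float64(len(xs))
}

// stddev returns the population standard deviation of xs
// (sum of squared deviations divided by len(xs), not len(xs)-1).
func stddev(xs []int) float64 {
	mu := mean(xs)
	var sumSq float64
	for _, x := range xs {
		d := float64(x) - mu
		sumSq += d * d
	}
	return math.Sqrt(sumSq / float64(len(xs)))
}

func main() {
	// Remaining CPUs per NUMA node for the three outcomes in the commit message.
	fmt.Println(round3(stddev([]int{0, 4, 4, 20})))  // prints 7.681
	fmt.Println(round3(stddev([]int{16, 4, 4, 4})))  // prints 5.196
	fmt.Println(round3(stddev([]int{2, 4, 4, 16}))) // prints 5.545
}
```

A lower value means the remaining CPUs are spread more evenly across nodes, which is why the commit treats the 5.196 outcome as better balanced than the 7.681 one.
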
44631 Commits