931 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
833f585104 Merge pull request #85760 from yutedz/chkpt-write-err
Log error when writing checkpoint fails
2019-12-02 10:27:06 -08:00
Ted Yu
84a9803741 Log error when writing checkpoint fails 2019-11-29 19:47:17 -08:00
Ted Yu
6415fa765e Remove nodes slice in loop of takeByTopology 2019-11-29 12:12:22 -08:00
Kubernetes Prow Robot
80eed952f0 Merge pull request #84854 from BSWANG/fix-hugetlb-cgroup
fix kubelet failed to start on setting hugetlb limits
2019-11-27 12:29:03 -08:00
Ted Yu
86f3bc25e1 Reduce unnecessary Set in updateAllocatedDevices 2019-11-27 08:48:06 -08:00
Travis Rhoden
0c5c3d8bb9 Remove pkg/util/mount (moved out of tree)
This patch removes pkg/util/mount completely, and replaces it with the
mount package now located at k8s.io/utils/mount. The code found at
k8s.io/utils/mount was moved there from pkg/util/mount, so the code is
identical, just no longer in-tree to k/k.
2019-11-15 08:29:12 -07:00
Kubernetes Prow Robot
30e6238795 Merge pull request #85147 from yutedz/devmgr-rm-contents
Continue removing file in ManagerImpl#removeContents
2019-11-14 16:38:28 -08:00
Ted Yu
fb046f7787 Continue removing file in ManagerImpl#removeContents 2019-11-13 06:00:34 -08:00
Kubernetes Prow Robot
ed10b5b17f Merge pull request #85047 from yutedz/dev-mgr-err-handling
Handle error return from allocatePodResources
2019-11-12 11:51:27 -08:00
Kubernetes Prow Robot
897ce3073c Merge pull request #84533 from davidz627/fix/deprecatedPath
Remove plugin watching of deprecated directory and CSI v0 support in accordance with deprecation policy
2019-11-12 04:48:20 -08:00
David Zhu
802fe12803 Remove plugin watching of deprecated directory {kubelet_root_dir}/plugins and support for CSI V0 in accordance with deprecation announcement in https://v1-13.docs.kubernetes.io/docs/setup/release/notes/ 2019-11-11 11:42:58 -08:00
Ted Yu
db0f616974 Handle error return from allocatePodResources 2019-11-09 16:25:15 -08:00
Travis Rhoden
1fd8921546 Move mount/fake.go to mount/fake_mount.go
This patch moves fake.go to mount_fake.go, and follows the principle of
always returning a discrete type rather than an Interface. All callers
of "FakeMounter" are changed to instead use "NewFakeMounter()". The
FakeMounter "Log" struct member is changed to not be exported, and
is instead only accessed through a new "GetLog()" method.
2019-11-08 08:07:41 -07:00
mrobson
e401ee9158 Errors from cgroup destroy and pid kills are swallowed. Log a warning when that happens. 2019-11-07 07:47:57 -05:00
Kubernetes Prow Robot
73b2c82b28 Merge pull request #83592 from jianzzha/opt-reserved-cpus
added --reserved-cpus kubelet command option
2019-11-06 22:14:42 -08:00
Kubernetes Prow Robot
695c3061dd Merge pull request #82809 from liggitt/go-1.13-no-modules
update to use go1.13.4
2019-11-06 17:02:43 -08:00
Kubernetes Prow Robot
08e5781b41 Merge pull request #84525 from klueska/upstream-fix-hint-generation-after-kubelet-restart
Fix bug in TopologyManager hint generation after kubelet restart
2019-11-06 15:33:50 -08:00
Jordan Liggitt
297570e06a hack/update-vendor.sh 2019-11-06 17:42:34 -05:00
Kubernetes Prow Robot
46472773cb Merge pull request #84836 from yuxiaobo96/k8s-checks
Correct spelling mistakes
2019-11-06 12:21:11 -08:00
Kevin Klues
4d4d4bdd61 Ensure devicemanager TopologyHints are regenerated after kubelet restart
This patch also includes test to make sure the newly added logic works
as expected.
2019-11-06 15:01:34 +00:00
Jianzhu Zhang
89dfd24483 added --reserved-cpus kubelet command option 2019-11-06 07:33:52 -05:00
yuxiaobo
81e9f21f83 Correct spelling mistakes
Signed-off-by: yuxiaobo <yuxiaobogo@163.com>
2019-11-06 20:25:19 +08:00
bingshen.wbs
47642a0bad fix kubelet failed to start on setting hugetlb limits in non-exist cgroup dir
This is caused by kubelet startup being interrupted while setting up the list of cgroups.
'cgroupManagerImpl.Exists' does not check and recreate the hugetlb cgroup dir, so setting the limits in the nonexistent cgroup dir then causes kubelet startup to fail.

Signed-off-by: bingshen.wbs <bingshen.wbs@alibaba-inc.com>
2019-11-06 16:39:55 +08:00
Kubernetes Prow Robot
0c0408c790 Merge pull request #76407 from yanghaichao12/dev0411
change directory permissions from 0755 to 0750
2019-11-05 19:30:59 -08:00
Kevin Klues
9dc116eb08 Ensure CPUManager TopologyHints are regenerated after kubelet restart
This patch also includes test to make sure the newly added logic works
as expected.
2019-11-05 15:48:51 +00:00
Kevin Klues
a338c8f7fd Add some more comments to GetTopologyHints() in the devicemanager 2019-11-05 13:06:23 +00:00
Kevin Klues
58f3554ebe Sync all CPU and device state before generating TopologyHints for them
This ensures that we have the most up-to-date state when generating
topology hints for a container. Without this, it's possible that some
resources will be seen as allocated, when they are actually free.
2019-11-05 13:00:20 +00:00
Kevin Klues
d9adf20360 Abstract removeStaleState from reconcileState in CPUManager
This will become especially important as we move to a model where
exclusive CPUs are assigned at pod admission time rather than at pod
creation time.

Having this function will allow us to do garbage collection on these
CPUs anytime we are about to allocate CPUs to a new set of containers,
in addition to reclaiming state periodically in the reconcileState()
loop.
2019-11-05 12:45:11 +00:00
Kevin Klues
b5f52e6072 Modularize TopologyManager policy Merge() tests
These changes make it so that a set of common test cases can be used for
all merge strategies, with specific test cases being able to be
specified on a policy-by-policy basis.
2019-11-04 18:43:07 +01:00
Kevin Klues
7ea1fc9be4 Move TopologyManager TestPolicyMerge() to shared test file 2019-11-04 18:43:07 +01:00
Kevin Klues
d7d7bfcda0 Abstract TopologyManager Policy Merge() tests into their own function 2019-11-04 18:43:07 +01:00
Adrian Chiris
dee22d1fbc Fix comments in TopologyManager 2019-11-04 18:43:07 +01:00
Adrian Chiris
5f7db54d3c Move function from top-level TopologyManager to best-effort policy
This is in preparation for removing the special-case of the
SingleNumaNode policy in mergeProvidersHints() in favor of a custom
merging strategy with much less overhead.
2019-11-04 18:43:07 +01:00
Adrian Chiris
d95464645c Add Merge() API to TopologyManager Policy abstraction
This abstraction moves the responsibility of merging topology hints to
the individual policies themselves. As part of this, it removes the
CanAdmitPodResult() API from the policy abstraction, and rolls it into a
second return value from Merge()
2019-11-04 18:43:07 +01:00
Adrian Chiris
78d7856288 Globalize a few TopologyManager functions
This is in preparation for a larger refactoring effort that will add a
'Merge()'  API to the TopologyManager policy API.
2019-11-04 18:43:07 +01:00
Adrian Chiris
e72847676f Pass a list of NUMA nodes to the various TopologyManager policies
This is in preparation for a larger refactoring effort that will add a
'Merge()'  API to the TopologyManager policy API.
2019-11-04 18:43:07 +01:00
Adrian Chiris
6fd8a6eb69 Make restricted TopologyManager policy inherit from best-effort policy
These policies only differ on whether they admit the pod or not when a
TopologyHint is preferred or not. As such, the restricted policy should
simply inherit whatever it can from the best effort policy and only
overwrite what is necessary.

This does not matter for now, but will become important when we add a
new 'Merge()' abstraction to a Policy later on.
2019-11-04 18:43:07 +01:00
Adrian Chiris
3391daeb00 Break TopologyManager.calculateAffinity() into more modular functions
This modularization is in preparation for a larger refactoring effort
that will add a 'Merge()'  API to the TopologyManager policy API.
2019-11-04 18:43:07 +01:00
Adrian Chiris
b17706b149 Added LessThan() and IsEqual() methods for TopologyHints 2019-11-04 18:43:07 +01:00
yanghaichao12
5cbafba457 change directory permissions from 0755 to 0750 2019-11-04 17:04:37 +08:00
Kubernetes Prow Robot
002dbf6a4c Merge pull request #83777 from lmdaly/fix-single-numa-node-with-best-effort-pods
Fixed bug in TopologyManager with SingleNUMANode Policy
2019-11-01 04:53:23 -07:00
Kubernetes Prow Robot
17a57f99d5 Merge pull request #81344 from zouyee/cpm
fix cpumanager reconcileState without sourceready
2019-10-30 23:33:36 -07:00
nolancon
b0a85177d2 Clean-up and additional test cases for socket-mask unit test. 2019-10-18 04:16:06 +01:00
Kubernetes Prow Robot
017842d49d Merge pull request #83492 from ConnorDoyle/topo-align-all-qos
Topology manager aligns pods of all QoS classes.
2019-10-11 03:03:40 -07:00
Louise Daly
a353247d44 Fixed bug in TopologyManager with SingleNUMANode Policy
This patch fixes an issue where best-effort pods were not admitted
to the node if the single-numa-node policy was set.

This was because the Admit policy in the single-numa-node policy does
not admit any pod where the hint is anything but a single NUMA node. The 'best hint' in this case is {<set bits for num. NUMA nodes on machine>, true}.
So on a machine with 2 NUMA nodes, the best hint for a best-effort pod is {11, true}, as best-effort pods have no topology preferences.

The single-numa-node policy fails any pod with a not-preferred hint OR a hint where more than 1 bit is set, so the above example results in terminated pods with a Topology Affinity Error.

This is a short term fix for the single-numa-node policy, as there will be code refactoring for the 1.17 release.
2019-10-11 07:00:37 +01:00
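
The admission rule described in commit a353247d44 above can be illustrated with a minimal Go sketch. This is not the kubelet's actual code: the `hint` type, its fields, and the `singleNUMAAdmit` function are hypothetical names chosen for illustration, and the NUMA affinity is assumed to be a plain bitmask where bit i stands for NUMA node i.

```go
package main

import (
	"fmt"
	"math/bits"
)

// hint is a hypothetical stand-in for a TopologyHint: a NUMA node
// bitmask plus a "preferred" flag, written {mask, preferred} in the
// commit message.
type hint struct {
	mask      uint64 // bit i set => NUMA node i is part of the affinity
	preferred bool
}

// singleNUMAAdmit mirrors the rule stated in the commit message: a pod
// is rejected if its hint is not preferred OR if more than one bit is
// set in the mask.
func singleNUMAAdmit(h hint) bool {
	if !h.preferred || bits.OnesCount64(h.mask) > 1 {
		return false
	}
	return true
}

func main() {
	// On a 2-node machine, a best-effort pod ends up with {11, true}.
	bestEffort := hint{mask: 0b11, preferred: true}
	fmt.Println(singleNUMAAdmit(bestEffort)) // false

	// A hint pinned to a single node is admitted.
	pinned := hint{mask: 0b01, preferred: true}
	fmt.Println(singleNUMAAdmit(pinned)) // true
}
```

Under this rule the {11, true} hint of a best-effort pod is always rejected, which is the behavior the commit describes as a bug.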
Kubernetes Prow Robot
4561b67971 Merge pull request #83697 from klueska/fix-single-numa-with-one-provider
Fixed bug in TopologyManager with SingleNUMANode Policy
2019-10-10 19:00:33 -07:00
Kubernetes Prow Robot
3db6d3abcf Merge pull request #83551 from dims/move-external-facing-kubelet-apis-to-staging
Move external facing kubelet apis to staging
2019-10-10 13:41:36 -07:00
Connor Doyle
a598369e3c Gofmt. 2019-10-10 12:16:21 -07:00
Connor Doyle
a9203ebdcf Topology manager aligns pods of all QoS classes. 2019-10-10 12:16:21 -07:00
Kevin Klues
5501f542cd Fixed bug in TopologyManager with SingleNUMANode Policy
This patch fixes an issue in the TopologyManager that wouldn't allow
pods to be admitted if pods were launched with the SingleNUMANode policy
and any of the hint providers had no NUMA preferences.

This is due to 2 factors:

1) Any hint provider that passes back a `nil` as its hints, has its hint
automatically transformed into a single {11 true} hint before merging

2) We added a special casing for the SingleNumaNodePolicy() in the
TopologyManager that essentially turns these hints into a
{11 false} anytime a {11 true} is seen.

The current patch reworks this logic so that the TopologyManager can
tell the difference between a "don't care" hint and a true "{11 true}"
hint returned by the hint provider. Only true "{11 true}" hints will be
converted by the special casing for the SingleNumaNodePolicy(), while
"don't care" hints will not.

This is a short term fix for this issue until we do a larger refactoring
of this code for the 1.17 release.
2019-10-09 17:41:08 -07:00
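
The distinction drawn in commit 5501f542cd above, between a "don't care" hint and a true {11 true} hint, can be sketched in Go as follows. This is only an illustration under stated assumptions, not the patch itself: the `hint` type, the `dontCare` field, and the `fromProvider` and `applySingleNUMASpecialCase` functions are all hypothetical, and a 2-node machine is assumed.

```go
package main

import "fmt"

// allNodes is the bitmask covering both NUMA nodes of an assumed
// 2-node machine ("11" in the commit message).
const allNodes = 0b11

// hint is a hypothetical stand-in for a TopologyHint. The dontCare
// field is an illustrative marker, not a field of the real type.
type hint struct {
	mask      uint64
	preferred bool
	dontCare  bool
}

// fromProvider models factor 1: a provider that returns nil hints gets
// a single {11 true} hint, here additionally tagged as "don't care".
func fromProvider(hints []hint) []hint {
	if hints == nil {
		return []hint{{mask: allNodes, preferred: true, dontCare: true}}
	}
	return hints
}

// applySingleNUMASpecialCase models factor 2 after the fix: only a
// true {11 true} hint is turned into {11 false}; a "don't care" hint
// is left untouched.
func applySingleNUMASpecialCase(h hint) hint {
	if !h.dontCare && h.mask == allNodes && h.preferred {
		h.preferred = false
	}
	return h
}

func main() {
	noPreference := fromProvider(nil)[0]
	fmt.Println(applySingleNUMASpecialCase(noPreference).preferred) // true

	explicit := hint{mask: allNodes, preferred: true}
	fmt.Println(applySingleNUMASpecialCase(explicit).preferred) // false
}
```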