Commit Graph

65 Commits

Author SHA1 Message Date
wojtekt
53ce79a18a Migrate to k8s.io/utils/clock in pkg/kubelet 2021-09-10 12:20:09 +02:00
Clayton Coleman
3eadd1a9ea
Keep pod worker running until pod is truly complete
A number of race conditions exist when pods are terminated early in
their lifecycle because components in the kubelet need to know "no
running containers" or "containers can't be started from now on" but
were relying on outdated state.

Only the pod worker knows whether containers are being started for
a given pod, which is required to know when a pod is "terminated"
(no running containers, none coming). Move that responsibility and
podKiller function into the pod workers, and have everything that
was killing the pod go into the UpdatePod loop. Split syncPod into
three phases - setup, terminate containers, and cleanup pod - and
have transitions between those methods be visible to other
components. After this change, to kill a pod you tell the pod worker
to UpdatePod({UpdateType: SyncPodKill, Pod: pod}).

Several places in the kubelet were incorrect about whether they
were handling terminating (should stop running, might have
containers) or terminated (no running containers) pods. The pod worker
exposes methods that allow other loops to know when to set up or tear
down resources based on the state of the pod - these methods remove
the possibility of race conditions by ensuring a single component is
responsible for knowing each pod's allowed state and other components
simply delegate to checking whether they are in the window by UID.

Removing containers now no longer blocks final pod deletion in the
API server and are handled as background cleanup. Node shutdown
no longer marks pods as failed as they can be restarted in the
next step.

See https://docs.google.com/document/d/1Pic5TPntdJnYfIpBeZndDelM-AbS4FN9H2GTLFhoJ04/edit# for details
2021-07-06 15:55:22 -04:00
Kubernetes Prow Robot
8b057cdfa4
Merge pull request #99095 from maxlaverse/fix_kubelet_stuck_in_diskpressure
Prevent Kubelet from getting stuck in DiskPressure when imagefs minReclaim is set
2021-04-23 18:23:14 -07:00
Niekvdplas
fec272a7b2 Fixed several spelling mistakes 2021-03-30 23:02:09 +02:00
Maxime Lagresle
848dc9e65f
Check that minReclaim is taken into account in reclaimNodeLevelResources() 2021-02-22 11:58:36 +01:00
Marek Siarkowicz
7d309e0104 Move Kubelet Summary API to staging repo 2020-09-22 18:23:28 +02:00
yuxiaobo
81e9f21f83 Correct spelling mistakes
Signed-off-by: yuxiaobo <yuxiaobogo@163.com>
2019-11-06 20:25:19 +08:00
ianlang
22d8e054bc unit test: TestAdmitUnderNodeConditions 2019-10-28 11:37:18 +08:00
Ted Yu
0939f90103 Check whether mirror pod is ciritical in managerImpl#evictPod 2019-10-01 11:12:18 -07:00
draveness
495faa22db feat: cleanup pod critical pod annotations feature 2019-08-09 08:41:23 +08:00
draveness
d83526d253 Revert "feat: cleanup pod critical pod annotations feature"
This reverts commit b6d41ee5cc.
2019-07-18 13:31:12 +08:00
draveness
b6d41ee5cc feat: cleanup pod critical pod annotations feature 2019-07-11 08:54:19 +08:00
draveness
ca6003bc75 feat: cleanup PodPriority features gate 2019-06-23 11:57:24 +08:00
Andrew Kim
c919139245 update import of generic featuregate code from k8s.io/apiserver/pkg/util/feature -> k8s.io/component-base/featuregate 2019-05-08 10:01:50 -04:00
Wei Huang
d67e7fd47f
kubelet: updated logic of verifying a static critical pod
- check if a pod is static by its static pod info
- meanwhile, check if a pod is critical by its corresponding mirror pod info
2019-03-12 23:40:20 -07:00
Jordan Liggitt
4dca07ef7e Fixup incorrect use of DefaultFeatureGate.Set in tests 2018-11-21 11:51:33 -05:00
Christoph Blecker
97b2992dc1
Update gofmt for go1.11 2018-10-05 12:59:38 -07:00
David Ashpole
93b6d026d9 fix memcg fd leak 2018-06-11 11:37:50 -07:00
David Ashpole
960856f4e8 collect metrics on the /kubepods cgroup on-demand 2018-02-17 12:32:40 -08:00
David Ashpole
e0830d0b71 reevaluate eviction thresholds after reclaim functions 2018-02-16 08:35:24 -08:00
David Ashpole
8b3bd5ae60 take disk requests into account during evictions 2017-11-21 10:21:30 -08:00
David Ashpole
527611ee41 remove disk allocatable evictions 2017-11-18 10:34:59 -08:00
Dr. Stefan Schimanski
012b085ac8 pkg/apis/core: mechanical import fixes in dependencies 2017-11-09 12:14:08 +01:00
David Ashpole
539fddb49d kubelet evictions take priority into account 2017-10-12 13:15:05 -07:00
David Ashpole
8659676408 feature gate local storage allocatable eviction 2017-10-11 09:53:56 -07:00
Jing Xu
8f98230f20 Map a resource to multiple signals in eviction manager
It is possible to have multiple signals that point to the same type of
resource, e.g., both SignalNodeFsAvailable and
SignalAllocatableNodeFsAvailable refer to the same resource NodeFs.
Change the map from map[v1.ResourceName]evictionapi.Signal to
map[v1.ResourceName][]evictionapi.Signal
2017-09-01 12:54:37 -07:00
Kubernetes Submit Queue
9350afd772 Merge pull request #48976 from supereagle/cleanup-api-package
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)

Remove duplicated import and wrong alias name of api package

**What this PR does / why we need it**:

**Which issue this PR fixes**: fixes #48975

**Special notes for your reviewer**:
/assign @caesarxuchao

**Release note**:
```release-note
NONE
```
2017-07-25 12:14:38 -07:00
supereagle
adc0eef43e remove duplicated import and wrong alias name of api package 2017-07-25 10:04:25 +08:00
Jing Xu
bb1920edcc Fix issues for local storage allocatable feature
This PR fixes the following issues:
1. Use ResourceStorageScratch instead of ResourceStorage API to represent
local storage capacity
2. In eviction manager, use container manager instead of node provider
(kubelet) to retrieve the node capacity and reserved resources. Node
provider (kubelet) has a feature gate so that storagescratch information
may not be exposed if feature gate is not set. On the other hand,
container manager has all the capacity and allocatable resource
information.
2017-07-13 12:06:19 -07:00
Chao Xu
60604f8818 run hack/update-all 2017-06-22 11:31:03 -07:00
Chao Xu
f2d3220a11 run root-rewrite-import-client-go-api-types 2017-06-22 11:30:59 -07:00
Chao Xu
f4989a45a5 run root-rewrite-v1-..., compile 2017-06-22 10:25:57 -07:00
David Ashpole
889afa5e2d trigger aggressive container garbage collection when under disk pressure 2017-06-03 07:52:36 -07:00
Clayton Coleman
3e095d12b4
Refactor move of client-go/util/clock to apimachinery 2017-05-20 14:19:48 -04:00
Michael Taufen
cbad320205 Reorganize kubelet tree so apis can be independently versioned 2017-05-12 10:02:33 -07:00
David Ashpole
ac612eab8e eviction manager changes for allocatable 2017-03-02 07:36:24 -08:00
Vishnu Kannan
cc5f5474d5 add support for node allocatable phase 2 to kubelet
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
2017-02-27 21:24:44 -08:00
Vishnu kannan
26f9598279 admit critical pods under resource pressure\n evict critical pods that are not static
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-02-19 19:19:09 -08:00
Vishnu Kannan
c967ab7b99 Avoid evicting critical pods in Kubelet if a special feature gate is enabled
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
2017-02-02 11:32:20 -08:00
Vishnu Kannan
ffd7dda234 Revert "Kubelet admits critical pods even under memory pressure"
This reverts commit afd676d94c.
2017-02-02 10:41:24 -08:00
deads2k
a106d9f848 switch kubelet to use external (client-go) object references for events 2017-01-31 19:15:33 -05:00
deads2k
8a12000402 move client/record 2017-01-31 19:14:13 -05:00
Dr. Stefan Schimanski
bc6fdd925d pkg/api/resource: move to apimachinery 2017-01-29 21:41:44 +01:00
deads2k
5a8f075197 move authoritative client-go utils out of pkg 2017-01-24 08:59:18 -05:00
deads2k
c47717134b move utils used in restclient to client-go 2017-01-19 07:55:14 -05:00
deads2k
6a4d5cd7cc start the apimachinery repo 2017-01-11 09:09:48 -05:00
bprashanth
afd676d94c Kubelet admits critical pods even under memory pressure 2016-12-15 18:58:09 -08:00
Chao Xu
5e1adf91df cmd/kubelet 2016-11-23 15:53:09 -08:00
Kubernetes Submit Queue
6515e3573e Merge pull request #34818 from nebril/eviction-test-cleanup
Automatic merge from submit-queue

Cleanup kubelet eviction manager tests

It cleans up kubelet eviction manager tests

Extracted parts of tests that were similar to each other to functions
2016-11-09 02:36:46 -08:00
David Ashpole
9aca40dee6 revert #33218. dont need #36180. We only use diskpressure 2016-11-04 08:29:27 -07:00