Commit Graph

9478 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
617064d732 Merge pull request #101432 from swatisehgal/smtaware
node: cpumanager: add options to reject non SMT-aligned workload
2021-07-08 21:04:53 -07:00
Kubernetes Prow Robot
83baa708df Merge pull request #103429 from saschagrunert/metrics-test-fix
Fix resource metrics e2e test
2021-07-08 17:58:53 -07:00
Kubernetes Prow Robot
dab6f6a43d Merge pull request #102344 from smarterclayton/keep_pod_worker
Prevent Kubelet from incorrectly interpreting "not yet started" pods as "ready to terminate pods" by unifying responsibility for pod lifecycle into pod worker
2021-07-08 16:48:53 -07:00
Kubernetes Prow Robot
57716897eb Merge pull request #103434 from perithompson/windows-etchostcreate-skip
Explicitly skip host file mounting for Windows when HostProcess pod
2021-07-08 15:36:53 -07:00
Francesco Romani
23abdab2b7 smtalign: propagate policy options to policies
Consume in the static policy the cpu manager policy options from
the cpumanager instance.
Validate in the none policy if any option is given, and fail if so -
this is almost surely a configuration mistake.

Add new cpumanager.Options type to hold the options and translate from
user arguments to flags.

Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-07-08 23:15:37 +02:00
Francesco Romani
6dcec345df smtalign: cm: factor out admission response
Introduce a new `admission` subpackage to factor out the responsability
to create `PodAdmitResult` objects. This enables resource manager
to report specific errors in Allocate() and to bubble up them
in the relevant fields of the `PodAdmitResult`.

To demonstrate the approach we refactor TopologyAffinityError as a
proper error.

Co-authored-by: Kevin Klues <kklues@nvidia.com>
Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-07-08 23:15:37 +02:00
Francesco Romani
c5cb263dcf smtalign: propagate policy options to cpumanager
The CPUManagerPolicyOptions received from the kubelet config/command line args
is propogated to the Container Manager.

We defer the consumption of the options to a later patch(set).

Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-07-08 23:15:35 +02:00
Francesco Romani
6dccad45b4 smtalign: add auto generated code
Files generate after running `make generated_files`.

Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-07-08 23:14:59 +02:00
Swati Sehgal
cc76a756e4 smtalign: add cpu-manager-policy-options flag in Kubelet
In this patch we enhance the kubelet configuration to support
cpuManagerPolicyOptions.

In order to introduce SMT-awareness in CPU Manager, we introduce a
new flag in Kubelet to allow the user to specify an additional flag
called `cpumanager-policy-options` to allow the user to modify the
behaviour of static policy to strictly guarantee allocation of whole
core.

Co-authored-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-07-08 23:14:59 +02:00
Kubernetes Prow Robot
4d78db54a5 Merge pull request #103580 from tkestack/fix-version-format
fix kubelet panic when DynamicKubeletConfig enabled
2021-07-08 14:02:24 -07:00
Kubernetes Prow Robot
a9d7526864 Merge pull request #102970 from tkestack/feature-memory-qos
Feature: Support memory qos with cgroups v2
2021-07-08 14:01:36 -07:00
Kubernetes Prow Robot
7c84064a4f Merge pull request #99000 from verb/1.21-kubelet-metrics
Add kubelet metrics for ephemeral containers
2021-07-08 14:00:55 -07:00
Peri Thompson
8e2b728c68 Explicitly skip host file mounting for windows 2021-07-08 19:38:49 +01:00
Li Bo
79e230ea21 fix kubelet panic when DynamicKubeletConfig enabled 2021-07-08 16:20:51 +08:00
Li Bo
c3d9b10ca8 feature: support Memory QoS for cgroups v2 2021-07-08 09:26:46 +08:00
Kubernetes Prow Robot
36a7426aa5 Merge pull request #99144 from bart0sh/PR0094-promote-HugePageStorageMediumSize-to-GA
promote huge page storage medium size to GA
2021-07-07 18:09:05 -07:00
Kubernetes Prow Robot
ebbe63f116 Merge pull request #92863 from AkihiroSuda/rootless-pr
kubelet & kube-proxy: ignore sysctl errors and rlimit errors when running in UserNS (for rootless)
2021-07-07 18:08:53 -07:00
Kubernetes Prow Robot
8e56a34195 Merge pull request #102966 from SergeyKanzhelev/deprecateDynamicKubeletConfig
deprecate and disable by default DynamicKubeletConfig feature flag
2021-07-07 17:05:15 -07:00
Akihiro Suda
26e83ac4d4 kubelet: ignore /dev/kmsg error when running in userns
oomwatcher.NewWatcher returns "open /dev/kmsg: operation not permitted" error,
when running with sysctl value `kernel.dmesg_restrict=1`.

The error is negligible for KubeletInUserNamespace.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-07-07 14:23:31 +09:00
Akihiro Suda
dbe0155139 kubelet/cm: ignore sysctl error when running in userns
Errors during setting the following sysctl values are ignored:
- vm.overcommit_memory
- vm.panic_on_oom
- kernel.panic
- kernel.panic_on_oops
- kernel.keys.root_maxkeys
- kernel.keys.root_maxbytes

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-07-07 14:23:29 +09:00
Kubernetes Prow Robot
2547c5bb97 Merge pull request #103307 from aojea/kubelet_podIPs
podIPs order match node IP family preference (Downward API)
2021-07-06 22:11:20 -07:00
Kubernetes Prow Robot
561959f682 Merge pull request #102823 from ehashman/kep-2400-swap
Alpha node swap support
2021-07-06 22:11:11 -07:00
Antonio Ojea
a7469cf680 sort and filter exposed Pod IPs
runtimes may return an arbitrary number of Pod IPs, however, kubernetes
only takes into consideration the first one of each IP family.

The order of the IPs are the one defined by the Kubelet:
- default prefer IPv4
- if NodeIPs are defined, matching the first nodeIP family

PodIP is always the first IP of PodIPs.

The downward API must expose the same IPs and in the same order than
the pod.Status API object.
2021-07-07 00:15:31 +02:00
Elana Hashman
5584725605 Explicitly set LimitedSwap case with fallthrough 2021-07-06 13:50:09 -07:00
Clayton Coleman
3eadd1a9ea Keep pod worker running until pod is truly complete
A number of race conditions exist when pods are terminated early in
their lifecycle because components in the kubelet need to know "no
running containers" or "containers can't be started from now on" but
were relying on outdated state.

Only the pod worker knows whether containers are being started for
a given pod, which is required to know when a pod is "terminated"
(no running containers, none coming). Move that responsibility and
podKiller function into the pod workers, and have everything that
was killing the pod go into the UpdatePod loop. Split syncPod into
three phases - setup, terminate containers, and cleanup pod - and
have transitions between those methods be visible to other
components. After this change, to kill a pod you tell the pod worker
to UpdatePod({UpdateType: SyncPodKill, Pod: pod}).

Several places in the kubelet were incorrect about whether they
were handling terminating (should stop running, might have
containers) or terminated (no running containers) pods. The pod worker
exposes methods that allow other loops to know when to set up or tear
down resources based on the state of the pod - these methods remove
the possibility of race conditions by ensuring a single component is
responsible for knowing each pod's allowed state and other components
simply delegate to checking whether they are in the window by UID.

Removing containers now no longer blocks final pod deletion in the
API server and are handled as background cleanup. Node shutdown
no longer marks pods as failed as they can be restarted in the
next step.

See https://docs.google.com/document/d/1Pic5TPntdJnYfIpBeZndDelM-AbS4FN9H2GTLFhoJ04/edit# for details
2021-07-06 15:55:22 -04:00
Kubernetes Prow Robot
eae87bfe7e Merge pull request #103483 from odinuge/revert-102508-runc-1.0
Revert "Update runc to 1.0.0"
2021-07-06 10:42:56 -07:00
Artyom Lukianov
bb6d5b1f95 memory manager: provide unittests for init containers re-use
- provide tests for static policy allocation, when init containers
requested memory bigger than the memory requested by app containers
- provide tests for static policy allocation, when init containers
requested memory smaller than the memory requested by app containers
- provide tests to verify that init containers removed from the state
file once the app container started

Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
2021-07-05 20:52:25 +03:00
Artyom Lukianov
960da7895c memory manager: remove init containers once app container started
Remove init containers from the state file once the app container started,
it will release the memory allocated for the init container and can intense
the density of containers on the NUMA node in cases when the memory allocated
for init containers is bigger than the memory allocated for app containers.

Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
2021-07-05 20:52:25 +03:00
Artyom Lukianov
b965502c49 memory manager: re-use the memory allocated for init containers
The idea that during allocation phase we will:

- during call to `Allocate` and `GetTopologyHints`  we will take into account the init containers reusable memory,
which means that we will re-use the memory and update container memory blocks accordingly.
For example for the pod with two init containers that requested: 1Gi and 2Gi,
and app container that requested 4Gi, we can re-use 2Gi of memory.

Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
2021-07-05 20:52:25 +03:00
Odin Ugedal
61d88af9e4 Revert "Update runc to 1.0.0" 2021-07-05 14:03:04 +02:00
Sascha Grunert
2d0f99fba1 Fix resource metrics e2e test
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2021-07-05 11:16:05 +02:00
Sergey Kanzhelev
dffc2a60a2 deprecate and disable by default DynamicKubeletConfig feature flag 2021-07-02 23:53:11 +00:00
Kubernetes Prow Robot
659c7e709f Merge pull request #99494 from enj/enj/i/not_after_ttl_hint
csr: add expirationSeconds field to control cert lifetime
2021-07-01 23:02:12 -07:00
Monis Khan
cd91e59f7c csr: add expirationSeconds field to control cert lifetime
This change updates the CSR API to add a new, optional field called
expirationSeconds.  This field is a request to the signer for the
maximum duration the client wishes the cert to have.  The signer is
free to ignore this request based on its own internal policy.  The
signers built-in to KCM will honor this field if it is not set to a
value greater than --cluster-signing-duration.  The minimum allowed
value for this field is 600 seconds (ten minutes).

This change will help enforce safer durations for certificates in
the Kube ecosystem and will help related projects such as
cert-manager with their migration to the Kube CSR API.

Future enhancements may update the Kubelet to take advantage of this
field when it is configured in a way that can tolerate shorter
certificate lifespans with regular rotation.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-07-01 23:38:15 -04:00
Kubernetes Prow Robot
062bc359ca Merge pull request #102444 from sanwishe/resourceStartTime
Expose container start time in kubelet /metrics/resource endpoint
2021-07-01 14:27:51 -07:00
Kir Kolyshkin
ab5b77944e kubelet/cm: don't set Devices
Since runc 1.0.0 it is now sufficient to have SkipDevices: true.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-06-30 16:17:35 -07:00
Elana Hashman
39f32d7286 Ensure MemorySwapConfig can't be set without feature flag 2021-06-29 12:08:25 -07:00
Elana Hashman
d4041cb80f Add generated files for swap API changes 2021-06-29 12:08:25 -07:00
Elana Hashman
d3fd1362ca Rename NoSwap to LimitedSwap as workloads may still swap
Also made the options a kubelet type, address API review feedback
2021-06-29 12:08:21 -07:00
Elana Hashman
0deef4610e Set MemorySwapLimitInBytes for CRI when NodeSwapEnabled 2021-06-29 11:59:02 -07:00
Elana Hashman
7342acb0b8 Add validation for KubeletConfig MemorySwap 2021-06-29 11:59:01 -07:00
Elana Hashman
bda03b4818 API change: add MemorySwap to KubeletConfiguration 2021-06-29 11:58:59 -07:00
Kubernetes Prow Robot
01819dd322 Merge pull request #102028 from chrishenzie/read-write-once-pod-access-mode
ReadWriteOncePod access mode for PVs and PVCs
2021-06-29 10:04:40 -07:00
Kubernetes Prow Robot
756203fda0 Merge pull request #102576 from dobsonj/101911
kubelet: do not call RemoveAll on volumes directory for orphaned pods
2021-06-29 06:54:40 -07:00
Chris Henzie
2b98f8edc7 Enforce ReadWriteOncePod access mode during mount 2021-06-28 21:25:37 -07:00
Kubernetes Prow Robot
15d3c3a5e2 Merge pull request #102821 from ehashman/phase-fix
Ensure kubelet statuses can handle loss of container runtime state
2021-06-28 15:38:40 -07:00
Kubernetes Prow Robot
07358f1663 Merge pull request #103146 from tech-geek29/fix-95380
Change log level to Debug
2021-06-25 07:44:45 -07:00
Kubernetes Prow Robot
49ab9ac160 Merge pull request #103154 from jsafrane/fix-asw-mounter
Update mounter interface in volume manager
2021-06-24 14:18:05 -07:00
Kubernetes Prow Robot
2e93b3924a Merge pull request #101943 from saschagrunert/seccomp-default
Add kubelet `SeccompDefault` alpha feature
2021-06-24 13:07:41 -07:00
Kubernetes Prow Robot
79494183b7 Merge pull request #102869 from mengjiao-liu/json-register-move
Remove default JSON logging format registration from k8s.io/component-base/logs package
2021-06-24 11:59:41 -07:00