Automatic merge from submit-queue (batch tested with PRs 44061, 46614, 46419, 46628, 46134)
cleanup kubelet new node status test
**What this PR does / why we need it**:
this scopes the test to just testing allocatable values. extra parts of the original test were copied from another test that was not relevant.
Automatic merge from submit-queue
kubelet: group all container-runtime-specific flags/options into a separate struct
They don't belong in the KubeletConfig.
This addresses #43253
Automatic merge from submit-queue
add myself and liggitt to pkg/kubelet/certificats OWNERs
For as long a kubelet is using the internal client, this certificate
manager is bound to the kubelet. Once kubelet has moved to client-go we
plan to extract this library to be general purpose. In the meantime,
liggitt and I should handle reviews of this code.
@liggitt @timstclair
For as long a kubelet is using the internal client, this certificate
manager is bound to the kubelet. Once kubelet has moved to client-go we
plan to extract this library to be general purpose. In the meantime,
liggitt and I should handle reviews of this code.
Automatic merge from submit-queue
use make slice to store objects to improve efficiency
Signed-off-by: allencloud <allen.sun@daocloud.io>
**What this PR does / why we need it**:
we we know the slice length in advance, I think we had better use make to create the specified length of slice. This will improve some kind of performance. Since if we create a slice with []type{}, we did not know how much space runtime should reserve, since slice implementation should be continuous in memory. While when we make a slice with specified length, runtime would reserve a continuous memory space which will not result in slice movement in case of current space is not enough.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
NONE
**Special notes for your reviewer**:
NONE
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45488, 45335, 45909, 46074, 46615)
Fix screwed-up log message format
It had two %-verbs and three arguments
**What this PR does / why we need it**:
Fixes kubelet log lines like this:
May 08 11:49:04 brya-1 kubelet[23248]: W0508 11:49:04.248123 23248 eviction_manager.go:128] Failed to admit pod kube-proxy-g3hjs_kube-system(55c1fbbb-33e4-11e7-b83c-42010a800002) - node has conditions: %v%!(EXTRA []v1.NodeConditionType=[MemoryPressure])
to remove the `%v%!(EXTRA`
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46302, 44597, 44742, 46554)
Do not install do-nothing iptables rules
Deprecate kubelet non-masquerade-cidr.
Do not install iptables rules if it is set to 0.0.0.0/0.
Fixes#46553
Automatic merge from submit-queue (batch tested with PRs 46252, 45524, 46236, 46277, 46522)
Support sandbox images from private registries
**What this PR does / why we need it**:
The --pod-infra-container-image parameter allows the user to specify
an arbitrary image to be used as the pod infra container (AKA
sandbox), an internal piece of the dockershim implementation of the
Container Runtime Interface.
The dockershim does not have access to any of the pod-level image pull
credentials configuration, so if the user specifies an image from a
private registry, the image pull will fail.
This change allows the dockershim to read local docker configuration
(e.g. /root/.docker/config.json) and use it when pulling the pod infra
container image.
**Which issue this PR fixes**: fixes#45738
**Special notes for your reviewer**:
The changes to fake_client for writing local config files deserve some
attention.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46450, 46272, 46453, 46019, 46367)
Move MountVolume.SetUp succeeded to debug level
This message is verbose and repeated over and over again in log files
creating a lot of noise. Leave the message in, but require a -v in
order to actually log it.
**What this PR does / why we need it**: Moves a verbose log message to actually be verbose.
**Which issue this PR fixes** fixes#46364Fixes#29059
Automatic merge from submit-queue (batch tested with PRs 45809, 46515, 46484, 46516, 45614)
CRI: add methods for container stats
**What this PR does / why we need it**:
Define methods in CRI to get container stats.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Part of https://github.com/kubernetes/features/issues/290; addresses #27097
**Special notes for your reviewer**:
This PR defines the *minimum required* container metrics for the existing components to function, loosely based on the previous discussion on [core metrics](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/core-metrics-pipeline.md) as well as the existing cadvisor/summary APIs.
Two new RPC calls are added to the RuntimeService: `ContainerStats` and `ListContainerStats`. The former retrieves stats for a given container, while the latter gets stats for all containers in one call.
The stats gathering time of each subsystem can vary substantially (e.g., cpu vs. disk), so even though the on-demand model preferred due to its simplicity, we’d rather give the container runtime more flexibility to determine the collection frequency for each subsystem*. As a trade-off, each piece of stats for the subsystem must contain a timestamp to let kubelet know how fresh/recent the stats are. In the future, we should also recommend a guideline for how recent the stats should be in order to ensure the reliability (e.g., eviction) and the responsiveness (e.g., autoscaling) of the kubernetes cluster.
The next step is to plumb this through kubelet so that kubelet can choose consume container stats from CRI or cadvisor.
**Alternatively, we can add calls to get stats of individual subsystems. However, kubelet does not have the complete knowledge of the runtime environment, so this would only lead to unnecessary complexity in kubelet.*
**Release note**:
```release-note
Augment CRI to support retrieving container stats from the runtime.
```
Automatic merge from submit-queue (batch tested with PRs 45809, 46515, 46484, 46516, 45614)
kubelet was sending negative allocatable values
**What this PR does / why we need it**:
if you set reservations > node capacity, the node sent negative values for allocatable values on create. setting negative values on update is rejected.
**Which issue this PR fixes**
xref https://bugzilla.redhat.com/show_bug.cgi?id=1455420
**Special notes for your reviewer**:
at this time, the node is allowed to set status on create. without this change, a node was being registered with negative allocatable values. i think we need to revisit letting node set status on create, and i will send a separate pr to debate the merits of that point.
```release-note
Prevent kubelet from setting allocatable < 0 for a resource upon initial creation.
```
Automatic merge from submit-queue (batch tested with PRs 42256, 46479, 45436, 46440, 46417)
Log out digest when digest is invalid
Notice this in frakti: missing image ref when logging it out.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 42256, 46479, 45436, 46440, 46417)
Fix naming and comments in Container Manage
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
**What this PR does / why we need it**:
The --pod-infra-container-image parameter allows the user to specify
an arbitrary image to be used as the pod infra container (AKA
sandbox), an internal piece of the dockershim implementation of the
Container Runtime Interface.
The dockershim does not have access to any of the pod-level image pull
credentials configuration, so if the user specifies an image from a
private registry, the image pull will fail.
This change allows the dockershim to read local docker configuration
(e.g. /root/.docker/config.json) and use it when pulling the pod infra
container image.
**Which issue this PR fixes**: fixes#45738
**Special notes for your reviewer**:
The changes to fake_client for writing local config files deserve some
attention.
**Release note**:
```release-note
NONE
```
This message is verbose and repeated over and over again in log files
creating a lot of noise. Leave the messsage in, but require a -v in
order to actually log it.
Fixes#29059
Automatic merge from submit-queue (batch tested with PRs 46501, 45944, 46473)
fix func comment in helpers.go
**What this PR does / why we need it**:
fix func comment in helpers.go
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
NONE
**Special notes for your reviewer**:
NONE
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46429, 46308, 46395, 45867, 45492)
Implement FakeVolumePlugin's ConstructVolumeSpec method according to interface expectation.
This fixes#45803 and #46204.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46124, 46434, 46089, 45589, 46045)
Support TCP type runtime endpoint for kubelet
**What this PR does / why we need it**:
Currently the grpc server for kubelet and dockershim has a hardcoded endpoint: unix socket '/var/run/dockershim.sock', which is not applicable on non-unix OS.
This PR is to support TCP endpoint type besides unix socket.
**Which issue this PR fixes**
This is a first attempt to address issue https://github.com/kubernetes/kubernetes/issues/45927
**Special notes for your reviewer**:
Before this change, running on Windows node results in:
```
Container Manager is unsupported in this build
```
After adding the cm stub, error becomes:
```
listen unix /var/run/dockershim.sock: socket: An address incompatible with the requested protocol was used.
```
This PR is to fix those two issues.
After this change, still meets 'seccomp' related issue when running on Windows node, needs more updates later.
**Release note**:
Automatic merge from submit-queue (batch tested with PRs 45949, 46009, 46320, 46423, 46437)
Unregister some metrics
delete some registered metrics since they are not observed
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
fix regression in UX experience for double attach volume
send event when volume is not allowed to multi-attach
Fixes#46012
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Fix some typo of comment in kubelet.go
**What this PR does / why we need it**:
The PR is to fix some typo in kubelet.go
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
N/A
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue
Double `StopContainer` request timeout.
Doubled `StopContainer` request timeout to leave some time for `SIGKILL` container.
@yujuhong @feiskyer
Automatic merge from submit-queue (batch tested with PRs 46022, 46055, 45308, 46209, 43590)
Eviction does not evict unless the previous pod has been cleaned up
Addresses #43166
This PR makes two main changes:
First, it makes the eviction loop re-trigger immediately if there may still be pressure. This way, if we already waited 10 seconds to delete a pod, we dont need to wait another 10 seconds for the next synchronize call.
Second, it waits for the pod to be cleaned up (including volumes, cgroups, etc), before moving on to the next synchronize call. It has a timeout for this operation currently set to 30 seconds.
Automatic merge from submit-queue (batch tested with PRs 46022, 46055, 45308, 46209, 43590)
Remove Save() from iptables interface
This is what @thockin requested in one of the reviews.
Automatic merge from submit-queue
Fix kubelet event recording
**What this PR does / why we need it**:
There are numerous areas where the kubelet was not properly recording events due to an incorrect type.
To keep this small, I updated all references to `RefManager` that result in throwing an event to ensure it does a conversion.
**Which issue this PR fixes**
Fixes https://github.com/kubernetes/kubernetes/issues/46241Fixes#44348Fixes#44652
**Special notes for your reviewer**:
I updated all references I could find to the existing RefManager in kubelet.
**Release note**:
```release-note
fix kubelet event recording for selected events.
```
Automatic merge from submit-queue
Moved qos to api.helpers.
**What this PR does / why we need it**:
The `GetPodQoS` is also used by other components, e.g. kube-scheduler and it's not bound to kubelet; moved it to api helpers so client-go.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #N/A
**Release note**:
```release-note-none
```
Automatic merge from submit-queue
fix pleg relist time
This PR fix pleg reslist time. According to current implementation, we have a `Healthy` method periodically check the relist time. If current timestamp subtracts latest relist time is longer than `relistThreshold`(default is 3 minutes), we should return an error to indicate the error of runtime.
`relist` method is also called periodically. If runtime(docker) hung, the relist method should return immediately without updating the latest relist time. If we update latest relist time no matter runtime(docker) hung(default timeout is 2 minutes), the `Healthy` method will never return an error.
```release-note
Kubelet PLEG updates the relist timestamp only after successfully relisting.
```
/cc @yujuhong @Random-Liu @dchen1107