Automatic merge from submit-queue
Switch DisruptionBudget api from bool to int allowed disruptions [only v1beta1]
Continuation of #34546. Apparently it there is some bug that prevents us from having 2 different incompatibile version of API in integration tests. So in this PR v1alpha1 is removed until testing infrastructure is fixed.
Base PR comment:
Currently there is a single bool in disruption budget api that denotes whether 1 pod can be deleted or not. Every time a pod is deleted the apiserver filps the bool to false and the disruptionbudget controller sets it to true if more deletions are allowed. This works but it is far from optimal when the user wants to delete multiple pods (for example, by decreasing replicaset size from 10000 to 8000).
This PR adds a new api version v1beta1 and changes bool to int which contains a number of pods that can be deleted at once.
cc: @davidopp @mml @wojtek-t @fgrzadkowski @caesarxuchao
Stopping a sandbox includes reclaiming the network resources. By always
stopping the sandbox before removing it, we reduce the possibility of leaking
resources in some corner cases.
Pods which are evicted by the nodecontroller due to network
malfunction, or unresponsive kubelet should be differentiated
from termination initiated by other sources. The reason/message
are consumed by kubectl to provide a better summary using get/describe.
--v=2 is low noise (record changes), can be default
--v=3 will shows per request logging
Note: due to the code path with which we integrate with
skydns, we don't see non-PILLAR_DOMAIN requests, so these
will never be logged.
Automatic merge from submit-queue
always allow decoding of status when returned from the API
`unversioned.Status` should be able to come back from any API version and still be properly decoded. This doesn't happen today by default.
@smarterclayton Our projectrequest endpoint returns a `Status` object on a 200 return from list to indicate everything went well. This (or something like it) is needed to make the API accepted by `kubectl`. Alternatively, we change the API to return a different (still not a `Project`) value from list, which still feels wrong.
Automatic merge from submit-queue
Populate Node.Status.Addresses with Hostname
This PR is supposed to address #22063
Currently `NodeName` has to be a resolvable dns address on the master to allow apiserver -> kubelet communication (exec, log, port-forward operations on a pod). In some situations this is unfortunate (see the discussions on the issue).
The PR aims to do the following:
- Populate the `Type: Hostname` in the `Node.Status.Addresses` array, the type is already defined, but was not used so far.
- Add logic to resolve a Node's Hostname when the apiserver initiates communication with the Kubelet, instead of using the Nodename string as Hostname.
```release-note
The hostname of the node (as autodetected by the kubelet, specified via --hostname-override, or determined by the cloudprovider) is now recorded as an address of type "Hostname" in the status of the Node API object. The hostname is expected to be resolveable from the apiserver.
```
Automatic merge from submit-queue
promote /healthz and /metrics to genericapiserver
Promotes `/healthz` to genericapiserver with methods to add healthz checks before running.
Promotes `/metrics` to genericapiserver gated by config flag.
@lavalamp adds the healthz checks linked to `postStartHooks` as promised.
Automatic merge from submit-queue
clean up client version negotiation to handle no legacy API
Version negotiation fails if the legacy API endpoint isn't available.
This tightens up the negotiation interface based to more clearly express what each stage is doing and what the constraints on negotiation are. This is needed to speak to generic API servers.
@kubernetes/kubectl
Automatic merge from submit-queue
Don't rely on device name provided by Cinder
See issue #33128
We can't rely on the device name provided by Cinder, and thus must perform
detection based on the drive serial number (aka It's cinder ID) on the
kubelet itself.
This patch re-works the cinder volume attacher to ignore the supplied
deviceName, and instead defer to the pre-existing GetDevicePath method to
discover the device path based on it's serial number and /dev/disk/by-id
mapping.
This new behavior is controller by a config option, as falling back
to the cinder value when we can't discover a device would risk devices
not showing up, falling back to cinder's guess, and detecting the wrong
disk as attached.
Automatic merge from submit-queue
Add BindNetwork to genericapiserver Config
This is needed for downstream use:
`BindNetwork` is the type of network to bind to - defaults to "tcp4", accepts "tcp", "tcp4", and "tcp6".
Automatic merge from submit-queue
Corect filtering of OpenStack LBaaS resources to delete
Neutron's API ignores unknown parameters. When listing pools etc, K8
attempts to filter on "LoadBalancerID", which is not a valid filter.
As such, it is ignored by Neutron, and a list of all pools is
returned. K8 then proceeds to delete each of the pools.
Instead, we now double check the resources really belong to the LB
we're trying to delete.
Fixes issue #33759
Automatic merge from submit-queue
[Kubelet] Use the custom mounter script for Nfs and Glusterfs only
This patch reduces the scope for the containerized mounter to NFS and GlusterFS on GCE + GCI clusters
This patch also enabled the containerized mounter on GCI nodes
Shepherding multiple PRs through the submit queue is painful. Hence I combined them into this PR. Please review each commit individually.
cc @jingxu97 @saad-ali
https://github.com/kubernetes/kubernetes/pull/35652 has also been reverted as part of this PR
Automatic merge from submit-queue
pod and qos level cgroup support
```release-note
[Kubelet] Add alpha support for `--cgroups-per-qos` using the configured `--cgroup-driver`. Disabled by default.
```
When looking for overlapping deployments, we should also find other deployments that select current deployment's pods,
not just the ones whose pods are selected by current deployment.
--force is required for --grace-period=0. --now is == --grace-period=1.
Improve command help to explain what graceful deletion is and warn about
force deletion.
Automatic merge from submit-queue
Avoid double decoding all client responses
Fixes#35982
The linked issue uncovered that we were always double decoding the response in restclient for get, list, update, create, and patch. That's fairly expensive, most especially for list. This PR refines the behavior of the rest client to avoid double decoding, and does so while minimizing the changes to rest client consumers.
restclient must be able to deal with multiple types of servers. Alter the behavior of restclient.Result#Raw() to not process the body on error, but instead to return the generic error (which still matches the error checking cases in api/error like IsBadRequest). If the caller uses
.Error(), .Into(), or .Get(), try decoding the body as a Status.
For older servers, continue to default apiVersion "v1" when calling restclient.Result#Error(). This was only for 1.1 servers and the extensions group, which we have since fixed.
This removes a double decode of very large objects (like LIST) - we were trying to DecodeInto status, but that ends up decoding the entire result and then throwing it away. This makes the decode behavior specific to the type of action the user wants.
```release-note
The error handling behavior of `pkg/client/restclient.Result` has changed. Calls to `Result.Raw()` will no longer parse the body, although they will still return errors that react to `pkg/api/errors.Is*()` as in previous releases. Callers of `Get()` and `Into()` will continue to receive errors that are parsed from the body if the kind and apiVersion of the body match the `Status` object.
This more closely aligns rest client as a generic RESTful client, while preserving the special Kube API extended error handling for the `Get` and `Into` methods (which most Kube clients use).
```
See issue #33128
We can't rely on the device name provided by Cinder, and thus must perform
detection based on the drive serial number (aka It's cinder ID) on the
kubelet itself.
This patch re-works the cinder volume attacher to ignore the supplied
deviceName, and instead defer to the pre-existing GetDevicePath method to
discover the device path based on it's serial number and /dev/disk/by-id
mapping.
This new behavior is controller by a config option, as falling back
to the cinder value when we can't discover a device would risk devices
not showing up, falling back to cinder's guess, and detecting the wrong
disk as attached.
Automatic merge from submit-queue
Making the pod.alpha.kubernetes.io/initialized annotation optional in PetSet pods
**What this PR does / why we need it**: As of now, the absence of the annotation `pod.alpha.kubernetes.io/initialized` in PetSets causes the PetSet controller to effectively "pause". Being a debug hook, users expect that its absence has no effect on the working of a PetSet. This PR inverts the logic so that we let the PetSet controller operate as expected in the absence of the annotation.
Letting the annotation remain alpha seems ok. Renaming it to something more meaningful needs further discussion.
**Which issue this PR fixes** _(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)_: fixes https://github.com/kubernetes/kubernetes/issues/35498
**Special notes for your reviewer**:
**Release note**:
``` release-note
The annotation "pod.alpha.kubernetes.io/initialized" on StatefulSets (formerly PetSets) is now optional and only encouraged for debug use.
```
cc @erictune @smarterclayton @bprashanth @kubernetes/sig-apps
@kow3ns The examples will need to be cleaned up as well I think later on to remove them.
Automatic merge from submit-queue
Fix build break on non-Linux OS introduced in 87aaf4c0
**What this PR does / why we need it**: simple fix for build breakage on non-Linux OS.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
**Special notes for your reviewer**:
```console
+++ [1102 11:16:58] linux/amd64: go build started
+++ [1102 11:20:11] linux/amd64: go build finished
+++ [1102 11:16:58] darwin/amd64: go build started
# k8s.io/kubernetes/pkg/kubelet/dockershim/cm
pkg/kubelet/dockershim/cm/container_manager_unsupported.go:33: undefined: fmt in fmt.Errorf
+++ [1102 11:16:58] windows/amd64: go build started
# k8s.io/kubernetes/pkg/kubelet/dockershim/cm
pkg/kubelet/dockershim/cm/container_manager_unsupported.go:33: undefined: fmt in fmt.Errorf
Makefile:79: recipe for target 'all' failed
make[1]: *** [all] Error 1
Makefile:255: recipe for target 'cross' failed
make: *** [cross] Error 1
Makefile:239: recipe for target 'release' failed
make: *** [release] Error 1
```
**Release note**:
```release-note
NONE
```
We are more liberal in what we accept as a volume id in k8s, and indeed
we ourselves generate names that look like `aws://<zone>/<id>` for
dynamic volumes.
This volume id (hereafter a KubernetesVolumeID) cannot directly be
compared to an AWS volume ID (hereafter an awsVolumeID).
We introduce types for each, to prevent accidental comparison or
confusion.
Issue #35746