Automatic merge from submit-queue
Eviction Manager Enforces Allocatable Thresholds
This PR modifies the eviction manager to enforce node allocatable thresholds for memory as described in kubernetes/community#348.
This PR should be merged after #41234.
cc @kubernetes/sig-node-pr-reviews @kubernetes/sig-node-feature-requests @vishh
** Why is this a bug/regression**
Kubelet uses `oom_score_adj` to enforce QoS policies. But the `oom_score_adj` is based on overall memory requested, which means that a Burstable pod that requested a lot of memory can lead to OOM kills for Guaranteed pods, which violates QoS. Even worse, we have observed system daemons like kubelet or kube-proxy being killed by the OOM killer.
Without this PR, v1.6 will have node stability issues and regressions in an existing GA feature `out of Resource` handling.
Automatic merge from submit-queue (batch tested with PRs 42443, 38924, 42367, 42391, 42310)
Dell EMC ScaleIO Volume Plugin
**What this PR does / why we need it**
This PR implements the Kubernetes volume plugin to allow pods to seamlessly access and use data stored on ScaleIO volumes. [ScaleIO](https://www.emc.com/storage/scaleio/index.htm) is a software-based storage platform that creates a pool of distributed block storage using locally attached disks on every server. The code for this PR supports persistent volumes using PVs, PVCs, and dynamic provisioning.
You can find examples of how to use and configure the ScaleIO Kubernetes volume plugin in [examples/volumes/scaleio/README.md](examples/volumes/scaleio/README.md).
**Special notes for your reviewer**:
To facilitate code review, commits for source code implementation are separated from other artifacts such as generated, docs, and vendored sources.
```release-note
ScaleIO Kubernetes Volume Plugin added enabling pods to seamlessly access and use data stored on ScaleIO volumes.
```
Automatic merge from submit-queue (batch tested with PRs 41919, 41149, 42350, 42351, 42285)
enable cgroups tiers and node allocatable enforcement on pods by default.
```release-note
Pods are launched in a separate cgroup hierarchy than system services.
```
Depends on #41753
cc @derekwaynecarr
Automatic merge from submit-queue (batch tested with PRs 41919, 41149, 42350, 42351, 42285)
kubelet: enable qos-level memory limits
```release-note
Experimental support to reserve a pod's memory request from being utilized by pods in lower QoS tiers.
```
Enables the QoS-level memory cgroup limits described in https://github.com/kubernetes/community/pull/314
**Note: QoS level cgroups have to be enabled for any of this to take effect.**
Adds a new `--experimental-qos-reserved` flag that can be used to set the percentage of a resource to be reserved at the QoS level for pod resource requests.
For example, `--experimental-qos-reserved="memory=50%`, means that if a Guaranteed pod sets a memory request of 2Gi, the Burstable and BestEffort QoS memory cgroups will have their `memory.limit_in_bytes` set to `NodeAllocatable - (2Gi*50%)` to reserve 50% of the guaranteed pod's request from being used by the lower QoS tiers.
If a Burstable pod sets a request, its reserve will be deducted from the BestEffort memory limit.
The result is that:
- Guaranteed limit matches root cgroup at is not set by this code
- Burstable limit is `NodeAllocatable - Guaranteed reserve`
- BestEffort limit is `NodeAllocatable - Guaranteed reserve - Burstable reserve`
The only resource currently supported is `memory`; however, the code is generic enough that other resources can be added in the future.
@derekwaynecarr @vishh
Automatic merge from submit-queue (batch tested with PRs 42365, 42429, 41770, 42018, 35055)
kubeadm: Add --cert-dir, --cert-altnames instead of --api-external-dns-names
**What this PR does / why we need it**:
- For the beta kubeadm init UX, we need this change
- Also adds the `kubeadm phase certs selfsign` command that makes the phase invokable independently
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
This PR depends on https://github.com/kubernetes/kubernetes/pull/41897
**Release note**:
```release-note
```
@dmmcquay @pires @jbeda @errordeveloper @mikedanese @deads2k @liggitt
Automatic merge from submit-queue
Remove defaults from string flags
- The default is printed automatically
- The string text did not match the actual default
**What this PR does / why we need it**:
Adjust the documentation for flags on `client-gen`.
**Special notes for your reviewer**:
Doc change. String text only.
**Release note**:
```release-note
NONE
```
Before:
```
client-gen --help
Usage of ./client-gen:
--build-tag string A Go build tag to use to identify files generated by this command. Should be unique. (default "ignore_autogenerated")
--clientset-api-path string the value of default API path.
-n, --clientset-name string the name of the generated clientset package. (default "internalclientset")
--clientset-only when set, client-gen only generates the clientset shell, without generating the individual typed clients
--clientset-path string the generated clientset will be output to <clientset-path>/<clientset-name>. Default to "k8s.io/kubernetes/pkg/client/clientset_generated/" (default "k8s.io/kubernetes/pkg/client/clientset_generated/")
--fake-clientset when set, client-gen will generate the fake clientset that can be used in tests (default true)
-h, --go-header-file string File containing boilerplate header text. The string YEAR will be replaced with the current 4-digit year. (default "/Users/mhb/go/src/k8s.io/gengo/boilerplate/boilerplate.go.txt")
--included-types-overrides stringSlice list of group/version/type for which client should be generated. By default, client is generated for all types which have genclient=true in types.go. This overrides that. For each groupVersion in this list, only the types mentioned here will be included. The default check of genclient=true will be used for other group versions.
--input stringSlice group/versions that client-gen will generate clients for. At most one version per group is allowed. Specified in the format "group1/version1,group2/version2...". Default to "api/,extensions/,autoscaling/,batch/,rbac/" (default [api/,authentication/,authorization/,autoscaling/,batch/,certificates/,extensions/,rbac/,storage/,apps/,policy/])
--input-base string base path to look for the api group. Default to "k8s.io/kubernetes/pkg/apis" (default "k8s.io/kubernetes/pkg/apis")
-i, --input-dirs stringSlice Comma-separated list of import paths to get input types from.
-o, --output-base string Output base; defaults to $GOPATH/src/ or ./ if $GOPATH is not set. (default "/Users/mhb/go/src")
-O, --output-file-base string Base name (without .go suffix) for output files.
-p, --output-package string Base package path.
-t, --test set this flag to generate the client code for the testdata
--verify-only If true, only verify existing output, do not write anything.
```
After:
```
client-gen --help
Usage of ./client-gen:
--build-tag string A Go build tag to use to identify files generated by this command. Should be unique. (default "ignore_autogenerated")
--clientset-api-path string the value of default API path.
-n, --clientset-name string the name of the generated clientset package. (default "internalclientset")
--clientset-only when set, client-gen only generates the clientset shell, without generating the individual typed clients
--clientset-path string the generated clientset will be output to <clientset-path>/<clientset-name>. (default "k8s.io/kubernetes/pkg/client/clientset_generated/")
--fake-clientset when set, client-gen will generate the fake clientset that can be used in tests (default true)
-h, --go-header-file string File containing boilerplate header text. The string YEAR will be replaced with the current 4-digit year. (default "/Users/mhb/go/src/k8s.io/gengo/boilerplate/boilerplate.go.txt")
--included-types-overrides stringSlice list of group/version/type for which client should be generated. By default, client is generated for all types which have genclient=true in types.go. This overrides that. For each groupVersion in this list, only the types mentioned here will be included. The default check of genclient=true will be used for other group versions.
--input stringSlice group/versions that client-gen will generate clients for. At most one version per group is allowed. Specified in the format "group1/version1,group2/version2...". (default [api/,authentication/,authorization/,autoscaling/,batch/,certificates/,extensions/,rbac/,storage/,apps/,policy/])
--input-base string base path to look for the api group. (default "k8s.io/kubernetes/pkg/apis")
-i, --input-dirs stringSlice Comma-separated list of import paths to get input types from.
-o, --output-base string Output base; defaults to $GOPATH/src/ or ./ if $GOPATH is not set. (default "/Users/mhb/go/src")
-O, --output-file-base string Base name (without .go suffix) for output files.
-p, --output-package string Base package path.
-t, --test set this flag to generate the client code for the testdata
--verify-only If true, only verify existing output, do not write anything.
```
Automatic merge from submit-queue (batch tested with PRs 41984, 41682, 41924, 41928)
RC/RS: Fully Respect ControllerRef
**What this PR does / why we need it**:
This is part of the completion of the [ControllerRef](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/controller-ref.md) proposal. It brings ReplicaSet and ReplicationController into full compliance with ControllerRef. See the individual commit messages for details.
**Which issue this PR fixes**:
Although RC/RS had partially implemented ControllerRef, they didn't use it to determine which controller to sync, or to update expectations. This could lead to instability or controllers getting stuck.
Ref: https://github.com/kubernetes/kubernetes/issues/24433
**Special notes for your reviewer**:
**Release note**:
```release-note
```
cc @erictune @kubernetes/sig-apps-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 42128, 42064, 42253, 42309, 42322)
kubeadm: Rename some flags for beta UI and fixup some logic
**What this PR does / why we need it**:
In this PR:
- `--api-advertise-addresses` becomes `--apiserver-advertise-address`
- The API Server's logic here is that if the address is `0.0.0.0`, it chooses the host's default interface's address. kubeadm here uses exactly the same logic. This arg is then passed to `--advertise-address`, and the API Server will advertise that one for the service VIP.
- `--api-port` becomes `--apiserver-bind-port` for clarity
ref the meeting notes: https://docs.google.com/document/d/1deJYPIF4LmhGjDVaqrswErIrV7mtwJgovtLnPCDxP7U/edit#
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
@jbeda @dmmcquay @pires @lukemarsden @dgoodwin @mikedanese
Automatic merge from submit-queue (batch tested with PRs 42128, 42064, 42253, 42309, 42322)
Add storage.k8s.io/v1 API
This is combined version of reverted #40088 (first 4 commits) and #41646. The difference is that all controllers and tests use old `storage.k8s.io/v1beta1` API so in theory all tests can pass on GKE.
Release note:
```release-note
StorageClassName attribute has been added to PersistentVolume and PersistentVolumeClaim objects and should be used instead of annotation `volume.beta.kubernetes.io/storage-class`. The beta annotation is still working in this release, however it will be removed in a future release.
```
Automatic merge from submit-queue (batch tested with PRs 41931, 39821, 41841, 42197, 42195)
Apiserver: wait for Etcd to become available on startup
fixes#37704
export functions from pkg/api/validation
add settings API
add settings to pkg/registry
add settings api to pkg/master/master.go
add admission control plugin for pod preset
add new admission control plugin to kube-apiserver
add settings to import_known_versions.go
add settings to codegen
add validation tests
add settings to client generation
add protobufs generation for settings api
update linted packages
add settings to testapi
add settings install to clientset
add start of e2e
add pod preset plugin to config-test.sh
Signed-off-by: Jess Frazelle <acidburn@google.com>
Automatic merge from submit-queue
HPA Controller: Use Custom Metrics API
This commit switches over the HPA controller to use the custom metrics
API. It also converts the HPA controller to use the generated client
in k8s.io/metrics for the resource metrics API.
In order to enable support, you must enable
`--horizontal-pod-autoscaler-use-rest-clients` on the
controller-manager, which will switch the HPA controller's MetricsClient
implementation over to use the standard rest clients for both custom
metrics and resource metrics. This requires that at the least resource
metrics API is registered with kube-aggregator, and that the controller
manager is pointed at kube-aggregator. For this to work, Heapster
must be serving the new-style API server (`--api-server=true`).
Before this merges, this will need kubernetes/metrics#2 to merge, and a godeps update to pull that in.
It's also semi-dependent on kubernetes/heapster#1537, but that is not required in order for this to merge.
**Release note**:
```release-note
Allow the Horizontal Pod Autoscaler controller to talk to the metrics API and custom metrics API as standard APIs.
```
Automatic merge from submit-queue
Extensible Userspace Proxy
This PR refactors the userspace proxy to allow for custom proxy socket implementations.
It changes the the ProxySocket interface to ensure that other packages can properly implement it (making sure all arguments are publicly exposed types, etc), and adds in a mechanism for an implementation to create an instance of the userspace proxy with a non-standard ProxySocket.
Custom ProxySockets are useful to inject additional logic into the actual proxying. For example, our idling proxier uses a custom proxy socket to hold connections and notify the cluster that idled scalable resources need to be woken up.
Also-Authored-By: Ben Bennett bbennett@redhat.com
This commit switches over the HPA controller to use the custom metrics
API. It also converts the HPA controller to use the generated client
in k8s.io/metrics for the resource metrics API.
In order to enable support, you must enable
`--horizontal-pod-autoscaler-use-rest-clients` on the
controller-manager, which will switch the HPA controller's MetricsClient
implementation over to use the standard rest clients for both custom
metrics and resource metrics. This requires that at the least resource
metrics API is registered with kube-aggregator, and that the controller
manager is pointed at kube-aggregator. For this to work, Heapster
must be serving the new-style API server (`--api-server=true`).
Automatic merge from submit-queue
Extend experimental support to multiple Nvidia GPUs
Extended from #28216
```release-note
`--experimental-nvidia-gpus` flag is **replaced** by `Accelerators` alpha feature gate along with support for multiple Nvidia GPUs.
To use GPUs, pass `Accelerators=true` as part of `--feature-gates` flag.
Works only with Docker runtime.
```
1. Automated testing for this PR is not possible since creation of clusters with GPUs isn't supported yet in GCP.
1. To test this PR locally, use the node e2e.
```shell
TEST_ARGS='--feature-gates=DynamicKubeletConfig=true' FOCUS=GPU SKIP="" make test-e2e-node
```
TODO:
- [x] Run manual tests
- [x] Add node e2e
- [x] Add unit tests for GPU manager (< 100% coverage)
- [ ] Add unit tests in kubelet package
Automatic merge from submit-queue (batch tested with PRs 41921, 41695, 42139, 42090, 41949)
kubeadm: join ux changes
**What this PR does / why we need it**: Update `kubeadm join` UX according to https://github.com/kubernetes/community/pull/381
**Which issue this PR fixes**: fixes # https://github.com/kubernetes/kubeadm/issues/176
**Special notes for your reviewer**: /cc @luxas @jbeda
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
kubeadm: Turn off insecure apiserver access on localhost:8080
**What this PR does / why we need it**:
ref: https://github.com/kubernetes/kubeadm/issues/181
depends on: https://github.com/kubernetes/kubernetes/pull/41897
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Insecure access to the API Server at localhost:8080 will be turned off in v1.6 when using kubeadm
```
@jbeda @liggitt @deads2k @pires @lukemarsden @mikedanese @errordeveloper
Automatic merge from submit-queue (batch tested with PRs 42200, 39535, 41708, 41487, 41335)
Update kube-proxy support for Windows
**What this PR does / why we need it**:
The kube-proxy is built upon the sophisticated iptables NAT rules. Windows does not have an equivalent capability. This introduces a change to the architecture of the user space mode of the Windows version of kube-proxy to match the capabilities of Windows.
The proxy is organized around service ports and portals. For each service a service port is created and then a portal, or iptables NAT rule, is opened for each service ip, external ip, node port, and ingress ip. This PR merges the service port and portal into a single concept of a "ServicePortPortal" where there is one connection opened for each of service IP, external ip, node port, and ingress IP.
This PR only affects the Windows kube-proxy. It is important for the Windows kube-proxy because it removes the limited portproxy rule and RRAS service and enables full tcp/udp capability to services.
**Special notes for your reviewer**:
**Release note**:
```
Add tcp/udp userspace proxy support for Windows.
```
- Add a new type PortworxVolumeSource
- Implement the kubernetes volume plugin for Portworx Volumes under pkg/volume/portworx
- The Portworx Volume Driver uses the libopenstorage/openstorage specifications and apis for volume operations.
Changes for k8s configuration and examples for portworx volumes.
- Add PortworxVolume hooks in kubectl, kube-controller-manager and validation.
- Add a README for PortworxVolume usage as PVs, PVCs and StorageClass.
- Add example spec files
Handle code review comments.
- Modified READMEs to incorporate to suggestions.
- Add a test for ReadWriteMany access mode.
- Use util.UnmountPath in TearDown.
- Add ReadOnly flag to PortworxVolumeSource
- Use hostname:port instead of unix sockets
- Delete the mount dir in TearDown.
- Fix link issue in persistentvolumes README
- In unit test check for mountpath after Setup is done.
- Add PVC Claim Name as a Portworx Volume Label
Generated code and documentation.
- Updated swagger spec
- Updated api-reference docs
- Updated generated code under pkg/api/v1
Godeps update for Portworx Volume Driver
- Adds github.com/libopenstorage/openstorage
- Adds go.pedge.io/pb/go/google/protobuf
- Updates Godep Licenses
Automatic merge from submit-queue (batch tested with PRs 35094, 42095, 42059, 42143, 41944)
add aggregation integration test
Wires up an integration test which runs a full kube-apiserver, the wardle server, and the kube-aggregator and creates the APIservice object for the wardle server. Without services and DNS the aggregator doesn't proxy, but it does ensure we don't have an obvious panic or bring up failure.
@sttts @ncdc
Automatic merge from submit-queue
clean up generic apiserver options
Clean up generic apiserver options before we tag any levels. This makes them more in-line with "normal" api servers running on the platform.
Also remove dead example code.
@sttts