Automatic merge from submit-queue
Enforce Node Allocatable via cgroups
This PR enforces node allocatable across all pods using a top level cgroup as described in https://github.com/kubernetes/community/pull/348
This PR also provides an option to enforce `kubeReserved` and `systemReserved` on user specified cgroups.
By default, this PR makes the kubelet create the top-level cgroups even if `kubeReserved` and `systemReserved` are not specified, in which case `Allocatable = Capacity`.
```release-note
A new Kubelet flag `--enforce-node-allocatable`, with a default value of `pods`, makes the kubelet create a top-level cgroup for all pods in order to enforce Node Allocatable. Optionally, `system-reserved` and `kube-reserved` values can also be specified (comma-separated) to enforce node allocatable on the cgroups specified via `--system-reserved-cgroup` and `--kube-reserved-cgroup` respectively; note that the latter flags default to "".
This feature requires a **Node Drain** prior to upgrade; otherwise pods will be restarted if possible, or terminated if they have a `Never` restart policy.
```
cc @kubernetes/sig-node-pr-reviews @kubernetes/sig-node-feature-requests
TODO:
- [x] Adjust effective Node Allocatable to subtract hard eviction thresholds
- [x] Add unit tests
- [x] Complete pending e2e tests
- [x] Manual testing
- [x] Get the proposal merged
@dashpole is working on adding support for evictions so that Node Allocatable is enforced more gracefully. That work will show up in a subsequent PR for v1.6.
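For illustration, the arithmetic behind the checklist item above (subtracting hard eviction thresholds) works out to `Allocatable = Capacity - kube-reserved - system-reserved - hard eviction thresholds`. A minimal Go sketch of that computation, assuming the `k8s.io/apimachinery` resource API; the function and example values are hypothetical, not the kubelet's actual code:
```go
// Illustrative only: how effective Node Allocatable is derived once reservations
// and hard eviction thresholds are taken into account.
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

// nodeAllocatable subtracts the reserved amounts and the hard eviction
// threshold from the node's capacity for a single resource.
func nodeAllocatable(capacity, kubeReserved, systemReserved, hardEviction resource.Quantity) resource.Quantity {
	result := capacity.DeepCopy()
	result.Sub(kubeReserved)
	result.Sub(systemReserved)
	result.Sub(hardEviction)
	return result
}

func main() {
	mem := nodeAllocatable(
		resource.MustParse("8Gi"),   // node memory capacity
		resource.MustParse("512Mi"), // --kube-reserved memory
		resource.MustParse("512Mi"), // --system-reserved memory
		resource.MustParse("100Mi"), // memory.available hard eviction threshold
	)
	fmt.Println("allocatable memory:", mem.String()) // 8Gi - 512Mi - 512Mi - 100Mi
}
```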
Automatic merge from submit-queue (batch tested with PRs 41205, 42196, 42068, 41588, 41271)
[CRI] enable kubenet traffic shaping
ref: https://github.com/kubernetes/kubernetes/issues/37316
Another way to do this is to expose another interface in the network host to allow network plugins to retrieve annotations, but that seems unnecessary and more complicated.
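For context, kubenet's traffic shaping is driven by the pod bandwidth annotations (`kubernetes.io/ingress-bandwidth` and `kubernetes.io/egress-bandwidth`). A minimal sketch of parsing those annotations into limits; the helper name is illustrative, not the plugin's actual code:
```go
// Illustrative sketch: turning the well-known bandwidth annotations on a pod
// into shaping limits. Not the kubenet plugin's actual code.
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

const (
	ingressBandwidthKey = "kubernetes.io/ingress-bandwidth"
	egressBandwidthKey  = "kubernetes.io/egress-bandwidth"
)

// podBandwidth is a hypothetical helper that extracts shaping limits from pod annotations.
func podBandwidth(annotations map[string]string) (*resource.Quantity, *resource.Quantity, error) {
	var ingress, egress *resource.Quantity
	if v, ok := annotations[ingressBandwidthKey]; ok {
		q, err := resource.ParseQuantity(v)
		if err != nil {
			return nil, nil, err
		}
		ingress = &q
	}
	if v, ok := annotations[egressBandwidthKey]; ok {
		q, err := resource.ParseQuantity(v)
		if err != nil {
			return nil, nil, err
		}
		egress = &q
	}
	return ingress, egress, nil
}

func main() {
	in, out, _ := podBandwidth(map[string]string{
		ingressBandwidthKey: "1M",
		egressBandwidthKey:  "1M",
	})
	fmt.Println("ingress:", in.String(), "egress:", out.String())
}
```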
Automatic merge from submit-queue (batch tested with PRs 42053, 41282, 42056, 41663, 40927)
Allow getting logs directly from deployment, job and statefulset
**Special notes for your reviewer**:
@smarterclayton you asked for it in OpenShift
```release-note
kubectl logs allows getting logs directly from deployment, job and statefulset
```
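Conceptually, `kubectl logs deployment/<name>` resolves the workload's label selector to one of its pods and streams that pod's logs. A rough client-go sketch of that flow, assuming a recent client-go; this is not kubectl's actual implementation, and namespace/name are placeholders:
```go
// Rough sketch of what "logs for a deployment" boils down to: resolve the
// deployment's selector, pick one matching pod, stream its logs.
package main

import (
	"context"
	"io"
	"os"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", os.Getenv("KUBECONFIG"))
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	ctx := context.Background()

	dep, err := client.AppsV1().Deployments("default").Get(ctx, "nginx", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	sel, err := metav1.LabelSelectorAsSelector(dep.Spec.Selector)
	if err != nil {
		panic(err)
	}
	pods, err := client.CoreV1().Pods("default").List(ctx, metav1.ListOptions{LabelSelector: sel.String()})
	if err != nil || len(pods.Items) == 0 {
		panic("no pods found for deployment")
	}

	// Stream logs from the first matching pod, as the CLI would pick one pod.
	stream, err := client.CoreV1().Pods("default").GetLogs(pods.Items[0].Name, &corev1.PodLogOptions{}).Stream(ctx)
	if err != nil {
		panic(err)
	}
	defer stream.Close()
	io.Copy(os.Stdout, stream)
}
```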
Automatic merge from submit-queue (batch tested with PRs 42053, 41282, 42056, 41663, 40927)
Update kubeadm token to work as expected
**What this PR does / why we need it**:
Follows up: https://github.com/kubernetes/kubernetes/pull/41509
Updates `kubeadm token` to work as discussed in https://docs.google.com/document/d/1deJYPIF4LmhGjDVaqrswErIrV7mtwJgovtLnPCDxP7U/edit#
Promotes the command out of the `ex` subcommand, which is now named `alpha` for clarity. (This will later become `kubeadm alpha phase`.)
**Special notes for your reviewer**:
Example UX:
```console
sudo ./kubeadm token --help
This command will manage Bootstrap Token for you.
Please note this usage of this command is optional, and mostly for advanced users.
In short, Bootstrap Tokens are used for establishing bidirectional trust between a client and a server.
A Bootstrap Token can be used when a client (for example a node that's about to join the cluster) needs
to trust the server it is talking to. Then a Bootstrap Token with the "signing" usage can be used.
Bootstrap Tokens can also function as a way to allow short-lived authentication to the API Server
(the token serves as a way for the API Server to trust the client), for example for doing the TLS Bootstrap.
What is a Bootstrap Token more exactly?
- It is a Secret in the kube-system namespace of type "bootstrap.kubernetes.io/token".
- A Bootstrap Token must be of the form "[a-z0-9]{6}.[a-z0-9]{16}"; the former part is the public Token ID,
and the latter is the Token Secret, which must be kept private at all circumstances.
- The name of the Secret must be named "bootstrap-token-(token-id)".
You can read more about Bootstrap Tokens in this proposal:
https://github.com/kubernetes/community/blob/master/contributors/design-proposals/bootstrap-discovery.md
Usage:
kubeadm token [flags]
kubeadm token [command]
Available Commands:
create Create bootstrap tokens on the server.
delete Delete bootstrap tokens on the server.
generate Generate and print a bootstrap token, but do not create it on the server.
list List bootstrap tokens on the server.
Flags:
--kubeconfig string The KubeConfig file to use for talking to the cluster (default "/etc/kubernetes/admin.conf")
Use "kubeadm token [command] --help" for more information about a command.
lucas@THENINJA:~/luxas/kubernetes$ sudo ./kubeadm token list
TOKEN                     TTL         EXPIRES     USAGES                   DESCRIPTION
70c388.41a07b703aa4bedf   <forever>   <never>     authentication,signing   The default bootstrap token generated by 'kubeadm init'.
lucas@THENINJA:~/luxas/kubernetes$ sudo ./kubeadm token create
c57e6a.abb75fa1debe555f
lucas@THENINJA:~/luxas/kubernetes$ sudo ./kubeadm token list
TOKEN                     TTL         EXPIRES     USAGES                   DESCRIPTION
70c388.41a07b703aa4bedf   <forever>   <never>     authentication,signing   The default bootstrap token generated by 'kubeadm init'.
c57e6a.abb75fa1debe555f   <forever>   <never>     authentication,signing   <none>
lucas@THENINJA:~/luxas/kubernetes$ sudo ./kubeadm token create s
token ["s"] was not of form ["^([a-z0-9]{6})\\.([a-z0-9]{16})$"]
lucas@THENINJA:~/luxas/kubernetes$ sudo ./kubeadm token create c57e6a.abb75fa1debe555f
a token with id "c57e6a" already exists
lucas@THENINJA:~/luxas/kubernetes$ sudo ./kubeadm token delete c57e6a.abb75fa1debe555f
bootstrap token with id "c57e6a" deleted
```
**Release note**:
```release-note
NONE
```
@dmmcquay @jbeda @mikedanese @errordeveloper @pires
Automatic merge from submit-queue (batch tested with PRs 42053, 41282, 42056, 41663, 40927)
Fully remove hand-written listers and informers
Note: the first commit is from #41927. Adding do-not-merge for now as we'll want that to go in first, and then I'll rebase this on top.
Update statefulset controller to use a lister for PVCs instead of a client request. Also replace a unit test's dependency on legacylisters with the generated ones. cc @kargakis @kow3ns @foxish @kubernetes/sig-apps-pr-reviews
Remove all references to pkg/controller/informers and pkg/client/legacylisters, and remove those packages.
@smarterclayton @deads2k this should be it!
cc @gmarek @wojtek-t @derekwaynecarr @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 41116, 41804, 42104, 42111, 42120)
make kubectl taint command respect effect NoExecute
**What this PR does / why we need it**:
Part of the forgiveness feature implementation: make the `kubectl taint` command respect the `NoExecute` effect.
**Which issue this PR fixes**:
Related Issue: #1574
Related PR: #39469
**Release note**:
```release-note
make kubectl taint command respect effect NoExecute
```
Automatic merge from submit-queue (batch tested with PRs 41116, 41804, 42104, 42111, 42120)
Remove SandboxReceived event
This PR removes the SandboxReceived event emitted during pod sync.
> This event seems somewhat meaningless, and clouds the event records for a pod. Do we actually need it? Pulling and pod received on the node are very relevant, this seems much less so. Would suggest we either remove it, or turn it into a message that clearly indicates why it has value.
See d65309399a (commitcomment-21052453).
cc @smarterclayton @yujuhong
Automatic merge from submit-queue (batch tested with PRs 41116, 41804, 42104, 42111, 42120)
Add support for attacher/detacher interface in Flex volume
Add support for attacher/detacher interface in Flex volume
This change breaks backward compatibility and requires a release note.
```release-note
The Flex volume plugin is updated to support attach/detach interfaces. This breaks backward compatibility; please update your drivers and implement the new callouts.
```
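For reference, a simplified sketch of what an attach/detach-capable volume plugin surface looks like; the interfaces and method names below are illustrative, not the exact Kubernetes volume plugin interfaces or the Flex driver callout contract:
```go
// Simplified, illustrative interfaces for an attach/detach-capable volume plugin.
// These only convey the shape of the new responsibilities; they are NOT the
// actual Kubernetes volume plugin interfaces.
package volumesketch

import "time"

// NodeName identifies the node a volume should be attached to.
type NodeName string

// VolumeSpec stands in for the plugin-specific description of a volume.
type VolumeSpec struct {
	Name    string
	Options map[string]string
}

// Attacher is implemented by plugins (or drivers they shell out to) that can
// attach a volume to a node and wait until the device shows up.
type Attacher interface {
	// Attach attaches the volume to the given node and returns the device path.
	Attach(spec *VolumeSpec, node NodeName) (devicePath string, err error)
	// WaitForAttach blocks until the device is visible on the node or the timeout expires.
	WaitForAttach(spec *VolumeSpec, devicePath string, timeout time.Duration) (string, error)
}

// Detacher is the inverse: it detaches the volume from the node.
type Detacher interface {
	Detach(volumeName string, node NodeName) error
}
```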
Automatic merge from submit-queue (batch tested with PRs 41962, 42055, 42062, 42019, 42054)
dockershim puts pause container in pod cgroup
**What this PR does / why we need it**:
The CRI was not launching the pause container in the pod level cgroup. The non-CRI code path was.
Automatic merge from submit-queue (batch tested with PRs 42044, 41694, 41927, 42050, 41987)
Add apply set-last-applied subcommand
Implements part of https://github.com/kubernetes/community/pull/287; will rebase after https://github.com/kubernetes/kubernetes/pull/41699 is merged. EDIT: since the output format bug has been confirmed, the output format behavior will be updated soon.
cc @kubernetes/sig-cli-pr-reviews @AdoHe @pwittrock
```release-note
Support the `kubectl apply set-last-applied` command to update the last-applied-configuration annotation
```
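For context, the command rewrites the `kubectl.kubernetes.io/last-applied-configuration` annotation on the live object to match a local manifest. A minimal sketch of that operation; object handling is simplified and this is not kubectl's actual code:
```go
// Minimal sketch: setting the last-applied-configuration annotation to the
// serialized form of a local manifest.
package main

import "fmt"

const lastAppliedAnnotation = "kubectl.kubernetes.io/last-applied-configuration"

// setLastApplied overwrites the annotation with the given manifest content,
// which is what `kubectl apply set-last-applied -f <file>` conceptually does.
func setLastApplied(annotations map[string]string, manifestJSON []byte) map[string]string {
	if annotations == nil {
		annotations = map[string]string{}
	}
	annotations[lastAppliedAnnotation] = string(manifestJSON)
	return annotations
}

func main() {
	anns := setLastApplied(nil, []byte(`{"apiVersion":"v1","kind":"ConfigMap","metadata":{"name":"example"}}`))
	fmt.Println(anns[lastAppliedAnnotation])
}
```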
Automatic merge from submit-queue (batch tested with PRs 35408, 41915, 41992, 41964, 41925)
azure: document config file (+ remove unused field)
**What this PR does / why we need it**:
* documents the config file used by the Azure cloudprovider
* removes an unused field that shouldn't have been added
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 35408, 41915, 41992, 41964, 41925)
add secret option to flag
To resolve the security issue (PR #35030),
> @smarterclayton commented 5 days ago
> This is unfortunately not all flags that could be secrets. The best option would be to add support in spf13/pflag to tag a flag as a secret, and then use that bit to determine the list.
>
> Also, Command() could be used in contexts that need exact parameters (for subshell execution), so we would need to add a new method or extend the signature here to allow exact flags to be retrieved.
we could add a secret option to the flags.
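One way to express that idea with spf13/pflag as it exists today is to tag sensitive flags via per-flag annotations and redact them when reconstructing a command line. A hedged sketch: the `secret` annotation key and the redaction loop are made up for illustration, while `SetAnnotation` and `Visit` are existing pflag APIs.
```go
// Sketch of tagging flags as secrets using pflag's per-flag annotations and
// redacting them when reconstructing a command line.
package main

import (
	"fmt"

	"github.com/spf13/pflag"
)

const secretAnnotation = "example.io/secret" // hypothetical annotation key

func main() {
	fs := pflag.NewFlagSet("kubelet", pflag.ContinueOnError)
	fs.String("hostname-override", "", "Node name override")
	fs.String("bootstrap-token", "", "Token used to join the cluster")
	// Mark the sensitive flag; SetAnnotation is an existing pflag API.
	if err := fs.SetAnnotation("bootstrap-token", secretAnnotation, []string{"true"}); err != nil {
		panic(err)
	}

	_ = fs.Parse([]string{"--hostname-override=node-1", "--bootstrap-token=abcdef.0123456789abcdef"})

	// Rebuild the command line, redacting any flag tagged as secret.
	var args []string
	fs.Visit(func(f *pflag.Flag) {
		val := f.Value.String()
		if len(f.Annotations[secretAnnotation]) > 0 {
			val = "<redacted>"
		}
		args = append(args, fmt.Sprintf("--%s=%s", f.Name, val))
	})
	fmt.Println(args)
}
```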
Automatic merge from submit-queue
make iscsi portals optional
**What this PR does / why we need it**: Make iSCSI portals optional
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 41954, 40528, 41875, 41165, 41877)
Updating apiserver to return 202 when resource is being deleted asynchronously via cascading deletion
As per https://github.com/kubernetes/kubernetes/issues/33196#issuecomment-278440622.
cc @kubernetes/sig-api-machinery-pr-reviews @smarterclayton @caesarxuchao @bgrant0607 @kubernetes/api-reviewers
```release-note
Updated the apiserver to return HTTP status code 202 for a delete request when the resource is not immediately deleted, because the user requested cascading deletion using DeleteOptions.OrphanDependents=false.
```
Automatic merge from submit-queue (batch tested with PRs 41701, 41818, 41897, 41119, 41562)
Allow updates to pod tolerations.
Opening this PR to continue the discussion of updating pod spec tolerations after a pod has already been scheduled. This PR is built on top of https://github.com/kubernetes/kubernetes/pull/38957.
@kubernetes/sig-scheduling-pr-reviews @liggitt @davidopp @derekwaynecarr @kubernetes/rh-cluster-infra
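For illustration, the kind of tolerations update being discussed, expressed with the core v1 types; the taint key and values below are arbitrary:
```go
// Illustrative only: appending a NoExecute toleration to an existing pod spec,
// the kind of in-place tolerations update this PR is about.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	pod := corev1.Pod{}
	seconds := int64(300)
	pod.Spec.Tolerations = append(pod.Spec.Tolerations, corev1.Toleration{
		Key:               "node.example.com/unreachable", // hypothetical taint key
		Operator:          corev1.TolerationOpExists,
		Effect:            corev1.TaintEffectNoExecute,
		TolerationSeconds: &seconds, // tolerate the taint for 5 minutes before eviction
	})
	fmt.Printf("%+v\n", pod.Spec.Tolerations)
}
```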
Automatic merge from submit-queue
Admit critical pods under resource pressure
And evict critical pods that are not static.
Depends on #40952.
For #40573
Automatic merge from submit-queue (batch tested with PRs 41994, 41969, 41997, 40952, 40576)
Updating kubectl to send delete requests with orphanDependents=false if --cascade is true
Ref https://github.com/kubernetes/kubernetes/issues/40568#38897
Updating kubectl to always set `DeleteOptions.orphanDependents=false` when deleting a resource with `--cascade=true`.
This is primarily for federation where we want to use server side cascading deletion.
Impact on kubernetes: kubectl will do another GET after sending a DELETE and wait until the resource is actually deleted. This can have an impact if the resource has a finalizer: kubectl will wait until the finalizer is removed and the resource is deleted, which is the right thing to do but a notable change in behavior.
cc @caesarxuchao @lavalamp @smarterclayton @kubernetes/sig-federation-pr-reviews @kubernetes/sig-cli-pr-reviews
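The API-level equivalent of `kubectl delete --cascade=true` after this change is a delete request with `orphanDependents=false`. A rough client-go sketch, assuming a recent client-go; namespace and name are placeholders and this is not kubectl's actual code:
```go
// Rough sketch: issuing a delete with OrphanDependents=false so the server
// performs cascading deletion.
package main

import (
	"context"
	"os"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", os.Getenv("KUBECONFIG"))
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	orphan := false
	err = client.AppsV1().Deployments("default").Delete(context.Background(), "example",
		metav1.DeleteOptions{OrphanDependents: &orphan})
	if err != nil {
		panic(err)
	}
	// kubectl additionally polls with GET until the object is gone (including finalizers).
}
```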
Automatic merge from submit-queue (batch tested with PRs 41994, 41969, 41997, 40952, 40576)
Guaranteed admission for Critical Pods
This is the first step in implementing node-level preemption for critical pods.
It defines the AdmissionFailureHandler interface, which allows callers, like the kubelet, to define how failed predicates are handled, and take steps to correct failures if necessary.
In the kubelet's implementation, admission failure triggers preemption if the pod being admitted is critical and the only failed predicates are InsufficientResourceErrors; in that case it preempts (not yet implemented) other pods to allow admission of the critical pod.
cc: @vishh
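A sketch of what such a hook can look like; the interface and type names below are illustrative, not the kubelet's actual types:
```go
// Illustrative sketch of an admission-failure hook: a handler gets a chance to
// resolve failed predicates (e.g. by preempting pods) before admission is
// finally rejected.
package admissionsketch

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// PredicateFailure describes one reason a pod could not be admitted,
// e.g. an insufficient-resource error for CPU or memory.
type PredicateFailure struct {
	Reason               string
	InsufficientResource bool
}

// AdmissionFailureHandler lets the caller decide what to do when a pod fails
// the admission predicates; returning nil means "retry admission".
type AdmissionFailureHandler interface {
	HandleAdmissionFailure(pod *corev1.Pod, failures []PredicateFailure) error
}

// criticalPodPreemptor is a toy handler: if every failure is an
// insufficient-resource error, it would preempt other pods and retry.
type criticalPodPreemptor struct{}

func (criticalPodPreemptor) HandleAdmissionFailure(pod *corev1.Pod, failures []PredicateFailure) error {
	for _, f := range failures {
		if !f.InsufficientResource {
			return fmt.Errorf("cannot admit %s: %s", pod.Name, f.Reason)
		}
	}
	// A real implementation would evict lower-priority pods here to free resources.
	return nil
}
```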
Automatic merge from submit-queue
Add namespaced role to inspect particular configmap for delegated authentication
Builds on https://github.com/kubernetes/kubernetes/pull/41814 and https://github.com/kubernetes/kubernetes/pull/41922 (those are already lgtm'ed) with the ultimate goal of making an extension API server zero-config for "normal" authentication cases.
This part creates a namespaced role in `kube-system` that can *only* look at the configmap which backs the delegated authentication check. When a cluster-admin grants the SA running the extension API server the power to run delegated authentication checks, they should also bind this role in this namespace.
@sttts Should we add a flag to aggregated API servers to indicate they want to look this up so they can crashloop on startup? The alternative is sometimes having it and sometimes not. I guess we could try to key on explicit "disable front-proxy" which may make more sense.
@kubernetes/sig-api-machinery-misc
@ncdc I spoke to @liggitt about this before he left and he was ok in concept. Can you take a look at the details?
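For reference, the shape of such a role built with the rbac/v1 API types. The configmap in question is `extension-apiserver-authentication` in `kube-system`; the role name below is illustrative:
```go
// Illustrative: a namespaced Role in kube-system that can only read the
// extension-apiserver-authentication configmap used for delegated authentication.
package rbacsketch

import (
	rbacv1 "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func authenticationReaderRole() *rbacv1.Role {
	return &rbacv1.Role{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "extension-apiserver-authentication-reader", // illustrative name
			Namespace: "kube-system",
		},
		Rules: []rbacv1.PolicyRule{{
			Verbs:         []string{"get"},
			APIGroups:     []string{""},
			Resources:     []string{"configmaps"},
			ResourceNames: []string{"extension-apiserver-authentication"},
		}},
	}
}
```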
Automatic merge from submit-queue (batch tested with PRs 41857, 41864, 40522, 41835, 41991)
kubectl: Allow 'drain --force' to remove orphaned pods
If the managing resource of a given pod (e.g. DaemonSet/ReplicaSet/etc) is deleted (effectively orphaning the pod), and ``kubectl drain --force`` is invoked on the node hosting the pod, the command would fail with an error indicating that the managing resource was not found. This PR reduces the error to a warning if ``--force`` is specified, allowing nodes with orphaned pods to be drained.
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=1424678
cc: @derekwaynecarr
```release-note
Allow drain --force to remove pods whose managing resource is deleted.
```
Automatic merge from submit-queue (batch tested with PRs 41814, 41922, 41957, 41406, 41077)
add kubectl can-i to see if you can perform an action
Adds `kubectl auth can-i <verb> <resource> [<name>]` so that a user can see if they are allowed to perform an action.
@kubernetes/sig-cli-pr-reviews @fabianofranz
This particular command satisfies the immediate need of knowing if you can perform an action without trying that action. When using RBAC in a script that is adding permissions, there is a lag between adding the permission and the permission being realized in the RBAC cache. As a user on the CLI, you almost never see it, but as a script adding a binding and then using that new power, you hit it quite often.
There are natural follow-ons in the same area (hence the `auth` subcommand): figuring out whether someone else can perform an action, what actions you can perform in total, and who can perform a given action. Checking someone else is an API we already have, what-can-i-do was a proposed API a while back and a very useful one for interfaces, and who-can is a common question when someone is administering a namespace.
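Under the hood, `kubectl auth can-i` issues a SelfSubjectAccessReview. A rough client-go sketch of asking the same question programmatically; namespace, verb, and resource are placeholders and this is not kubectl's actual code:
```go
// Rough sketch: the API call behind "can I create deployments in namespace dev?".
package main

import (
	"context"
	"fmt"
	"os"

	authorizationv1 "k8s.io/api/authorization/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", os.Getenv("KUBECONFIG"))
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	review := &authorizationv1.SelfSubjectAccessReview{
		Spec: authorizationv1.SelfSubjectAccessReviewSpec{
			ResourceAttributes: &authorizationv1.ResourceAttributes{
				Namespace: "dev",
				Verb:      "create",
				Group:     "apps",
				Resource:  "deployments",
			},
		},
	}
	resp, err := client.AuthorizationV1().SelfSubjectAccessReviews().Create(
		context.Background(), review, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("allowed:", resp.Status.Allowed)
}
```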
Automatic merge from submit-queue (batch tested with PRs 41814, 41922, 41957, 41406, 41077)
pv_controller: Do not report exponential backoff as error.
It's not an error when a recycle/delete/provision operation cannot be started because it failed recently; it will be restarted automatically when the backoff expires.
This just pollutes logs without any useful information:
```
E0214 08:00:30.428073 77288 pv_controller.go:1410] error scheduling operaion "delete-pvc-1fa0e8b4-f2b5-11e6-a8bb-fa163ecb84eb[1fbd52ee-f2b5-11e6-a8bb-fa163ecb84eb]": Failed to create operation with name "delete-pvc-1fa0e8b4-f2b5-11e6-a8bb-fa163ecb84eb[1fbd52ee-f2b5-11e6-a8bb-fa163ecb84eb]". An operation with that name failed at 2017-02-14 08:00:15.631133152 -0500 EST. No retries permitted until 2017-02-14 08:00:31.631133152 -0500 EST (16s). Last error: "Cannot delete the volume \"11a4faea-bfc7-4713-88b3-dec492480dba\", it's still attached to a node".
```
```release-note
NONE
```
@kubernetes/sig-storage-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 41814, 41922, 41957, 41406, 41077)
Use consistent helper for getting secret names from pod
Kubelet secret-manager and mirror-pod admission both need to know what secrets a pod spec references. Eventually, a node authorizer will also need to know the list of secrets.
This creates a single (well, double, because of API versions) helper that can be used to traverse the secret names referenced from a pod, optionally short-circuiting (for places that are just looking to see if any secrets are referenced, like admission, or are looking for a particular secret ref, like authorization).
Fixes:
* secret manager not handling secrets used by env/envFrom in initcontainers
* admission allowing mirror pods with secret references
@smarterclayton @wojtek-t
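A condensed sketch of such a traversal, covering init containers and supporting short-circuiting; this is a simplification for illustration, not the actual shared helper:
```go
// Condensed sketch of a pod secret-name visitor: walks image pull secrets,
// volumes, and (init) container env/envFrom, and stops early if the visitor
// returns false.
package podsketch

import corev1 "k8s.io/api/core/v1"

// visitPodSecretNames calls visitor for every secret name referenced by the pod;
// returning false from visitor short-circuits the traversal.
func visitPodSecretNames(pod *corev1.Pod, visitor func(name string) bool) bool {
	for _, s := range pod.Spec.ImagePullSecrets {
		if !visitor(s.Name) {
			return false
		}
	}
	for _, v := range pod.Spec.Volumes {
		if v.Secret != nil && !visitor(v.Secret.SecretName) {
			return false
		}
	}
	containers := append(append([]corev1.Container{}, pod.Spec.InitContainers...), pod.Spec.Containers...)
	for _, c := range containers {
		for _, env := range c.EnvFrom {
			if env.SecretRef != nil && !visitor(env.SecretRef.Name) {
				return false
			}
		}
		for _, env := range c.Env {
			if env.ValueFrom != nil && env.ValueFrom.SecretKeyRef != nil && !visitor(env.ValueFrom.SecretKeyRef.Name) {
				return false
			}
		}
	}
	return true
}
```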
Automatic merge from submit-queue
make reconciliation generic to handle roles and clusterroles
We have a need to reconcile regular roles, so this pull moves the reconciliation code to use interfaces (still tightly coupled) rather than structs.
@liggitt @kubernetes/sig-auth-pr-reviews
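A sketch of the interface-based approach; the types below are illustrative, not the actual reconciliation code:
```go
// Illustrative: abstracting Role and ClusterRole behind a small interface so
// one reconciliation routine can handle both.
package reconcilesketch

import rbacv1 "k8s.io/api/rbac/v1"

// ruleOwner is anything that owns a set of RBAC policy rules.
type ruleOwner interface {
	GetRules() []rbacv1.PolicyRule
	SetRules([]rbacv1.PolicyRule)
}

type roleAdapter struct{ role *rbacv1.Role }

func (a roleAdapter) GetRules() []rbacv1.PolicyRule  { return a.role.Rules }
func (a roleAdapter) SetRules(r []rbacv1.PolicyRule) { a.role.Rules = r }

type clusterRoleAdapter struct{ role *rbacv1.ClusterRole }

func (a clusterRoleAdapter) GetRules() []rbacv1.PolicyRule  { return a.role.Rules }
func (a clusterRoleAdapter) SetRules(r []rbacv1.PolicyRule) { a.role.Rules = r }

// reconcileRules shows the shape of a reconcile routine that no longer cares
// whether it is handed a Role or a ClusterRole: it only sees a ruleOwner.
func reconcileRules(owner ruleOwner, expected []rbacv1.PolicyRule) {
	if len(owner.GetRules()) == 0 {
		owner.SetRules(expected)
	}
	// A real reconciler would compute a union/diff of the rules here.
}
```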