Automatic merge from submit-queue (batch tested with PRs 42854, 43105, 43090)
Update ScaleIO volume plugin default readOnly value
This commit updates the code to set readOnly attribute to be set to false.
**What this PR does / why we need it**:
This PR is a minor fix that updates the default value of `readOnly` attribute to `false`.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 42775, 42991, 42968, 43029)
Remove 'beta' from default storage class annotation
I forgot to update default storage class annotation in my storage.k8s.io/v1beta1 -> v1 PRs. Let's fix it before 1.6 is released.
I consider it as a bugfix, in #40088 I already updated the release notes to include non-beta annotation `storageclass.kubernetes.io/is-default-class`
```release-note
NONE
```
@kubernetes/sig-storage-pr-reviews
@msau42, please help with merging.
Automatic merge from submit-queue (batch tested with PRs 42775, 42991, 42968, 43029)
`kubectl taint` update for `NoExecute`.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
Fixes#42774
**Release note**:
```release-note
```
Automatic merge from submit-queue
Add DaemonSet templateGeneration validation and tests, and fix a bunch of validation test errors
For DaemonSet update:
1. Validate that templateGeneration is increased when and only when template is changed
1. Validate that templateGeneration is never decreased
1. Added validation tests for templateGeneration
1. Fix a bunch of errors in validate tests
- fake tests: almost all validation test error cases failed on "missing resource version", "name changes", "missing update strategy", "selector/template labels mismatch", not on the real validation we wanted to test
- some error cases should be success cases
@kargakis @lukaszo @kubernetes/sig-apps-bugs
*I've verified locally that all DaemonSet e2e tests pass with this change.*
This commit updates the code to set the default value of the readOnly attribute to false.
It also updates the example docs to add full list of supported plugin attributes and doc.
Automatic merge from submit-queue (batch tested with PRs 42942, 42935)
pkg/util/flock: Fix the flock so it actually locks.
With this PR, the second call to `Acquire()` will block unless the lock is released (process exits).
Also removed the memory mutex in the previous code since we don't need `Release()` here so no need to save and protect the local fd.
Fix#42929.
Automatic merge from submit-queue (batch tested with PRs 42942, 42935)
[Bug] Handle container restarts and avoid using runtime pod cache while allocating GPUs
Fixes#42412
**Background**
Support for multiple GPUs is an experimental feature in v1.6.
Container restarts were handled incorrectly which resulted in stranding of GPUs
Kubelet is incorrectly using runtime cache to track running pods which can result in race conditions (as it did in other parts of kubelet). This can result in same GPU being assigned to multiple pods.
**What does this PR do**
This PR tracks assignment of GPUs to containers and returns pre-allocated GPUs instead of (incorrectly) allocating new GPUs.
GPU manager is updated to consume a list of active pods derived from apiserver cache instead of runtime cache.
Node e2e has been extended to validate this failure scenario.
**Risk**
Minimal/None since support for GPUs is an experimental feature that is turned off by default. The code is also isolated to GPU manager in kubelet.
**Workarounds**
In the absence of this PR, users can mitigate the original issue by setting `RestartPolicyNever` in their pods.
There is no workaround for the race condition caused by using the runtime cache though.
Hence it is worth including this fix in v1.6.0.
cc @jianzhangbjz @seelam @kubernetes/sig-node-pr-reviews
Replaces #42560
With this PR, the second call to `Acquire()` will block unless the lock is released (process exits).
Also removed the memory mutex in the previous code since we don't need `Release()` here so no need to save and protect the local fd.
Fix#42929.
Automatic merge from submit-queue
Fixed incorrect result of getMinTolerationTime.
For the following case, `getMinTolerationTime` should return one; but it returned -1 :
1. for tolerations[0], TolerationSeconds is nil, minTolerationTime is not set
2. for tolerations[1], it's TolerationSeconds (1) is bigger than `minTolerationTime`, so minTolerationTime is still -1 which means infinite.
```
+ {
+ tolerations: []v1.Toleration{
+ {
+ TolerationSeconds: nil,
+ },
+ {
+ TolerationSeconds: &one,
+ },
+ },
+ },
```
Automatic merge from submit-queue
Fix taint based pod eviction for clusters where controller manager is not running with allocate-node-cidrs set
Fixes https://github.com/kubernetes/kubernetes/issues/42733
In my cluster, I have not set allocate-node-cidr, and It is causing taint based pod eviction to fail.
@gmarek @kubernetes/sig-scheduling-bugs @davidopp @derekwaynecarr
Automatic merge from submit-queue (batch tested with PRs 41794, 42349, 42755, 42901, 42933)
Fix DefaultTolerationSeconds admission plugin
DefaultTolerationSeconds is not working as expected. It is supposed to add default tolerations (for unreachable and notready conditions). but no pod was getting these toleration. And api server was throwing this error:
```
Mar 08 13:43:57 fedora25 hyperkube[32070]: E0308 13:43:57.769212 32070 admission.go:71] expected pod but got Pod
Mar 08 13:43:57 fedora25 hyperkube[32070]: E0308 13:43:57.789055 32070 admission.go:71] expected pod but got Pod
Mar 08 13:44:02 fedora25 hyperkube[32070]: E0308 13:44:02.006784 32070 admission.go:71] expected pod but got Pod
Mar 08 13:45:39 fedora25 hyperkube[32070]: E0308 13:45:39.754669 32070 admission.go:71] expected pod but got Pod
Mar 08 14:48:16 fedora25 hyperkube[32070]: E0308 14:48:16.673181 32070 admission.go:71] expected pod but got Pod
```
The reason for this error is that the input to admission plugins is internal api objects not versioned objects so expecting versioned object is incorrect. Due to this, no pod got desired tolerations and it always showed:
```
Tolerations: <none>
```
After this fix, the correct tolerations are being assigned to pods as follows:
```
Tolerations: node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s
```
@davidopp @kevin-wangzefeng @kubernetes/sig-scheduling-pr-reviews @kubernetes/sig-scheduling-bugs @derekwaynecarr
Fixes https://github.com/kubernetes/kubernetes/issues/42716
Automatic merge from submit-queue
Invalid environment var names are reported and pod starts
When processing EnvFrom items, all invalid keys are collected and
reported as a single event.
The Pod is allowed to start.
fixes#42583
Automatic merge from submit-queue
[Federation] Kubefed Init should use the right RBAC API version clientset
**What this PR does / why we need it**:
Implements the need as described in https://github.com/kubernetes/kubernetes/issues/41263
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/41263
**Special notes for your reviewer**:
@madhusudancs @shashidharatd @marun
cc @kubernetes/sig-federation-bugs
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 38805, 42362, 42862)
Let GC print specific message for RESTMapping failure
Make the error messages reported in https://github.com/kubernetes/kubernetes/issues/39816 to be more specific, also only print the message once.
I'll also update the garbage collector's doc to clearly state we don't support tpr yet.
We'll wait for the watchable discovery feature (@sttts are you going to work on that?) to land in 1.7, and then enable the garbage collector to handle TPR.
cc @hongchaodeng @MikaelCluseau @djMax
Automatic merge from submit-queue (batch tested with PRs 38805, 42362, 42862)
Fix deployment generator after introducing deployments in apps/v1beta1
This PR does two things:
1. Switches all generator to produce versioned objects, to bypass the problem of having an object in multiple versions, which then results in not having stable generator (iow. producing exactly the same object).
2. Introduces new generator for `apps/v1beta1` deployments.
@kargakis @janetkuo ptal
@kubernetes/sig-apps-pr-reviews @kubernetes/sig-cli-pr-reviews ptal
This is a followup to https://github.com/kubernetes/kubernetes/pull/39683, so I'm adding 1.6 milestone.
```release-note
Introduce new generator for apps/v1beta1 deployments
```
Automatic merge from submit-queue (batch tested with PRs 42608, 42444)
Return nil when deleting non-exist GCE PD
When gce cloud tries to delete a disk, if the disk could not be found
from the zones, the function should return nil error. This modified behavior is also consistent with AWS
Automatic merge from submit-queue (batch tested with PRs 42877, 42853)
Remove unused functions and make logs slightly better
Zero risk cleanup, removing function that are not used anymore, and adding few more logs to help debugging problems.
cc @aveshagarwal
Automatic merge from submit-queue
Use Prometheus instrumentation conventions
The `System` and `Subsystem` parameters are subject to removal.
(x-ref: https://github.com/prometheus/client_golang/issues/240)
All metrics should use base units, which is seconds in the duration
case.
Counters should always end in `_total` and metrics should avoid
referring to potential label dimensions. Those should rather be
mentioned in the documentation string.
@kubernetes/sig-instrumentation
Reference docs:
https://prometheus.io/docs/practices/instrumentation/https://prometheus.io/docs/practices/naming/
**Release note**:
```
Breaking change: Renamed REST client Prometheus metrics to follow the instrumentation conventions ("request_latency_microseconds" -> "rest_client_request_latency_seconds", "request_status_codes" -> "rest_client_requests_total"). Please update your alerting pipeline if you rely on them.
```
Automatic merge from submit-queue
Add new DaemonSetStatus to kubectl printer and describer
@kargakis @lukaszo @kubernetes/sig-apps-pr-reviews @kubernetes/sig-cli-pr-reviews
```release-note
Add new DaemonSet status fields to kubectl printer and describer.
```
The `System` and `Subsystem` parameters are subject to removal.
(x-ref: https://github.com/prometheus/client_golang/issues/240)
All metrics should use base units, which is seconds in the duration
case.
Counters should always end in `_total` and metrics should avoid
referring to potential label dimensions. Those should rather be
mentioned in the documentation string.
Automatic merge from submit-queue (batch tested with PRs 42811, 42859)
Validation PVs for mount options
We are going to move the validation in its own package and we will be calling validation for individual volume types as needed.
Fixes https://github.com/kubernetes/kubernetes/issues/42573
Automatic merge from submit-queue (batch tested with PRs 42024, 42780, 42808, 42640)
kubectl: respect DaemonSet strategy parameters for rollout status
It handles "after-merge" comments from #41116
cc @kargakis @janetkuo
I will add one more e2e test later. I need to handle some in company stuff.
Automatic merge from submit-queue (batch tested with PRs 42024, 42780, 42808, 42640)
Node controller test flake 39975 with delay for try function
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#39975
/cc @ncdc @gmarek @liggitt
Automatic merge from submit-queue
Remove VCenterPort from vsphere cloud provider.
**What this PR does / why we need it**:
Address a bug inside vsphere cloud provider when a port number other than 443 is specified inside the config file.
The url which is used for communicating with govmomi should not include port number.
A port number other than 443 will result in 404 error.
VCenterPort stays in VSphereConfig structure for backward compatibility.
**Which issue this PR fixes** : fixes https://github.com/kubernetes/kubernetes-anywhere/issues/338
Automatic merge from submit-queue (batch tested with PRs 42734, 42745, 42758, 42814, 42694)
Dropped docker 1.9.x support. Changed the minimumDockerAPIVersion to
1.22
cc/ @Random-Liu @yujuhong
We talked about dropping docker 1.9.x support for a while. I just realized that we haven't really done it yet.
```release-note
Dropped the support for docker 1.9.x and the belows.
```