Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix an issue in the inter-pod affinity predicate that caused affinity to self to be processed incorrectly
**What this PR does / why we need it**:
Fixes the anti-affinity issue explained in #62232.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62232
**Special notes for your reviewer**:
**Release note**:
```release-note
Fix an issue in the inter-pod affinity predicate that caused affinity to self to be processed incorrectly
```
/sig scheduling
Automatic merge from submit-queue (batch tested with PRs 62676, 62612). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix WaitForAttach failure issue for azure disk
**What this PR does / why we need it**:
Since v1.10, `devicePath` is updated as a result of the following code change:
568afb4ecc/pkg/volume/util/operationexecutor/operation_generator.go (L517-L518)
As a result, in v1.10.0 `MountVolume.WaitForAttach` fails when an azure disk is remounted, with error logs like the following:
```
MountVolume.WaitForAttach failed for volume "pvc-f1562ecb-3e5f-11e8-ab6b-000d3af9f967" : azureDisk - Wait for attach expect device path as a lun number, instead got: /dev/disk/azure/scsi1/lun1 (strconv.Atoi: parsing "/dev/disk/azure/scsi1/lun1": invalid syntax)
Warning FailedMount 1m (x10 over 21m) kubelet, k8s-agentpool-66825246-0 Unable to mount volumes for pod
```
This PR no longer relies on `devicePath`, since it may change. Instead, it uses `diskController.GetDiskLun(diskName, volumeSource.DataDiskURI, nodeName)` to get the disk LUN; this ARM API call costs about 0.12s.
The GCE disk does not have this issue since `devicePath` is not used in its [WaitForAttach func](https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/gce_pd/attacher.go#L133), while the AWS disk also uses `devicePath` in its [WaitForAttach func](https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/aws_ebs/attacher.go#L145), so I think there is a potential issue for aws_ebs.
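To illustrate the failure mode in the log above, here is a minimal sketch. `lunFromDevicePath` is a hypothetical helper, not the actual attacher code; it only assumes, as the error message suggests, that the old code passed `devicePath` to `strconv.Atoi`.

```go
package main

import (
	"fmt"
	"strconv"
)

// lunFromDevicePath is a hypothetical stand-in for the pre-fix assumption
// that devicePath holds a bare LUN number such as "1".
func lunFromDevicePath(devicePath string) (int, error) {
	lun, err := strconv.Atoi(devicePath)
	if err != nil {
		return -1, fmt.Errorf("expect device path as a lun number, instead got: %s (%v)", devicePath, err)
	}
	return lun, nil
}

func main() {
	// The first value parses; the second is the updated devicePath and fails.
	for _, p := range []string{"1", "/dev/disk/azure/scsi1/lun1"} {
		lun, err := lunFromDevicePath(p)
		fmt.Printf("devicePath=%q lun=%d err=%v\n", p, lun, err)
	}
}
```

Once `devicePath` can hold a full device path, any parse of this kind breaks, which is why looking the LUN up from the disk controller is more robust.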
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62540
**Special notes for your reviewer**:
should cherry-pick to v1.10
**Release note**:
```
fix WaitForAttach failure issue for azure disk
```
/assign @feiskyer
/sig azure
FYI @khenidak
Automatic merge from submit-queue (batch tested with PRs 56040, 62627). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Node-level Checkpointing manager: Migrate dockershim and device plugin manager checkpointing
**What this PR does / why we need it**:
This PR abstracts the checkpoint manager at the kubelet level. Currently, `dockershim` and `deviceplugin` have their own native checkpointing primitives, and most recently `cpumanager` also added package-native checkpointing primitives. This adds redundancy at the implementation level and degrades code readability and consistency.
To address this:
1. The checkpointing interface is abstracted at the kubelet level as the `checkpointmanager` package.
2. The `dockershim` and `deviceplugin` packages are modified to use `checkpointmanager` instead of native checkpointing.
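The idea of a shared checkpointing abstraction can be sketched as follows. This is a simplified illustration only: the `Checkpoint` interface, `Manager` type, and `DeviceCheckpoint` payload are hypothetical names, the store is in memory, and none of it is the actual `checkpointmanager` API.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sync"
)

// Checkpoint is a hypothetical interface: anything that can serialize itself.
type Checkpoint interface {
	Marshal() ([]byte, error)
	Unmarshal(data []byte) error
}

// ErrNotFound is returned when no checkpoint exists under a key.
var ErrNotFound = errors.New("checkpoint not found")

// Manager is a hypothetical shared checkpoint manager. It keeps data in
// memory here; a real one would persist it.
type Manager struct {
	mu    sync.Mutex
	store map[string][]byte
}

func NewManager() *Manager { return &Manager{store: make(map[string][]byte)} }

// Create serializes the checkpoint and stores it under key.
func (m *Manager) Create(key string, c Checkpoint) error {
	data, err := c.Marshal()
	if err != nil {
		return err
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	m.store[key] = data
	return nil
}

// Get loads the checkpoint stored under key into c.
func (m *Manager) Get(key string, c Checkpoint) error {
	m.mu.Lock()
	data, ok := m.store[key]
	m.mu.Unlock()
	if !ok {
		return ErrNotFound
	}
	return c.Unmarshal(data)
}

// DeviceCheckpoint is an illustrative payload a device plugin manager might save.
type DeviceCheckpoint struct {
	PodDevices map[string][]string `json:"podDevices"`
}

func (d *DeviceCheckpoint) Marshal() ([]byte, error)    { return json.Marshal(d) }
func (d *DeviceCheckpoint) Unmarshal(data []byte) error { return json.Unmarshal(data, d) }

func main() {
	m := NewManager()
	in := &DeviceCheckpoint{PodDevices: map[string][]string{"pod-a": {"gpu-0"}}}
	if err := m.Create("device-plugin", in); err != nil {
		panic(err)
	}
	out := &DeviceCheckpoint{}
	if err := m.Get("device-plugin", out); err != nil {
		panic(err)
	}
	fmt.Println(out.PodDevices["pod-a"])
}
```

Each component then only defines its own payload type, while storage and error handling live in one place.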
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
```release-note
None
```
cc @jeremyeder @vishh @derekwaynecarr @sjenning @yujuhong @dchen1107 @RenaudWasTaken @ConnorDoyle @jiayingz @mindprince @timstclair
/sig node
Automatic merge from submit-queue (batch tested with PRs 62650, 62303, 62545, 62375). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix user visible files creation for windows
**What this PR does / why we need it**:
Fix user visible files creation for windows. Without this, [createUserVisibleFiles](https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/util/atomic_writer.go#L415:24) gets a linkname that still includes the subpath, and the symlink creation then fails. This is because "/" is used as the separator in the pod spec (e.g. `"new/path/data-1"`) while "\" is used on Windows to derive the linkname.
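The separator mismatch can be shown with plain string handling. This is a sketch, not the actual fix: `firstElement` is a hypothetical helper, and the separators are passed explicitly so the example behaves the same on any OS.

```go
package main

import (
	"fmt"
	"strings"
)

// firstElement returns the first component of path when split on sep.
// It is a hypothetical helper used only for this illustration.
func firstElement(path, sep string) string {
	return strings.Split(path, sep)[0]
}

func main() {
	specPath := "new/path/data-1" // pod spec paths use "/"

	// Splitting on the Windows separator finds nothing to split, so the
	// "first element" is still the whole path, subpath included.
	fmt.Println(firstElement(specPath, `\`)) // new/path/data-1

	// Converting to the native separator first yields the expected result.
	native := strings.ReplaceAll(specPath, "/", `\`)
	fmt.Println(firstElement(native, `\`)) // new
}
```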
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62338
**Special notes for your reviewer**:
Should also be cherry-picked to old releases.
**Release note**:
```release-note
Fix user visible files creation for windows
```
Automatic merge from submit-queue (batch tested with PRs 62650, 62303, 62545, 62375). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Move podsecuritypolicy registry to policy package
**What this PR does / why we need it:**
This is part of the PSP migration from the extensions API group to the policy API group. This PR moves the registry to the policy package and changes the preferred storage format to policy/v1beta1.
**Which issue(s) this PR fixes:**
Addresses https://github.com/kubernetes/features/issues/5
Automatic merge from submit-queue (batch tested with PRs 58784, 62057, 62621, 62652, 62656). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
[GCE] Remove parallel
**What this PR does / why we need it**:
Removes parallel execution from the Loadbalancer tests. It looks like one mock method modifies a singleton variable, so the tests currently cannot be run in parallel.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62601
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 58784, 62057, 62621, 62652, 62656). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Show deprecated kube-apiserver flags
**What this PR does / why we need it**:
This PR unhides deprecated kube-apiserver flags, so that the deprecation notice is clearly visible in --help.
Fixes #62617
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 58784, 62057, 62621, 62652, 62656). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
remove deprecated initresource admission plugin
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
xref https://github.com/kubernetes/kubernetes/pull/55375#issuecomment-360329586
**Special notes for your reviewer**:
/assign @piosz @deads2k
**Release note**:
```release-note
remove deprecated initresource admission plugin
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add andyzhangx as reviewer of Windows-related code in pkg/util/mount
**What this PR does / why we need it**:
I recently found that some features do not work with Windows storage, e.g. local and hostpath volumes. So I would like to be a reviewer for the Windows-related volume storage code. The Windows code under https://github.com/kubernetes/kubernetes/tree/master/pkg/util/mount was mostly implemented by me, and I am quite familiar with this component. Just let me know if it's ok, thanks.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```
none
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix machineID getting for vmss nodes when using instance metadata
**What this PR does / why we need it**:
When instance metadata is used by kubelet on master nodes, kubelet is not able to register itself and fails with errors:
```sh
Unable to construct v1.Node object for kubelet: failed to get external ID from cloud provider: not a vmss instance
```
This PR fixes this issue by composing a standard instance ID for such nodes.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62610
**Special notes for your reviewer**:
Need cherry pick to 1.10.
**Release note**:
```release-note
Fix machineID getting for vmss nodes when using instance metadata
```
/assign @andyzhangx
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix volume node affinity to OR node selector terms
**What this PR does / why we need it**:
Fixes node selector terms to be ORed, to be consistent with documentation and Pod.NodeAffinity. Also handles the "node selector term nil or empty matches nothing" behavior.
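The intended semantics can be sketched as follows. This is a simplified model, not the real API types: `Requirement`, `Term`, `termMatches`, and `matchesAny` are hypothetical names, and only an "In"-style requirement is modeled.

```go
package main

import "fmt"

// Requirement is a simplified "key In values" match expression.
type Requirement struct {
	Key    string
	Values []string
}

// Term is a simplified node selector term; its requirements are ANDed.
type Term struct {
	Requirements []Requirement
}

func contains(values []string, v string) bool {
	for _, x := range values {
		if x == v {
			return true
		}
	}
	return false
}

// termMatches reports whether every requirement of the term is satisfied.
// An empty term matches nothing.
func termMatches(t Term, labels map[string]string) bool {
	if len(t.Requirements) == 0 {
		return false
	}
	for _, r := range t.Requirements {
		v, ok := labels[r.Key]
		if !ok || !contains(r.Values, v) {
			return false
		}
	}
	return true
}

// matchesAny ORs the terms: one matching term is enough.
// A nil or empty term list matches nothing.
func matchesAny(terms []Term, labels map[string]string) bool {
	for _, t := range terms {
		if termMatches(t, labels) {
			return true
		}
	}
	return false
}

func main() {
	node := map[string]string{"zone": "a"}
	terms := []Term{
		{Requirements: []Requirement{{Key: "zone", Values: []string{"b"}}}},
		{Requirements: []Requirement{{Key: "zone", Values: []string{"a"}}}},
	}
	fmt.Println(matchesAny(terms, node)) // true: the second term matches
	fmt.Println(matchesAny(nil, node))   // false: no terms match nothing
}
```

With ANDed terms the first call would be false, since the node cannot be in zone "a" and zone "b" at once.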
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62551
**Special notes for your reviewer**:
**Release note**:
```release-note
Fixes issue where PersistentVolume.NodeAffinity.NodeSelectorTerms were ANDed instead of ORed.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Prevent virtual infinite loop in volume controller
**What this PR does / why we need it**:
In WatchPod(), if one of the two channels being watched (pod updates and events) is closed, the for/select loop turns into a tight infinite loop because the select immediately falls through due to the channel being closed.
This PR changes WatchPod() to watch the two channels independently instead.
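A receive from a closed channel returns immediately, so a `select` inside a `for` loop keeps picking the closed case. One way to watch two channels independently is sketched below; `watchBoth` is a hypothetical function and not the actual WatchPod() code.

```go
package main

import (
	"fmt"
	"sync"
)

// watchBoth drains each channel in its own goroutine. A range loop over a
// channel ends when that channel is closed, so a closed channel cannot
// cause a busy loop. It returns the number of items handled.
func watchBoth(pods, events <-chan string, handle func(kind, item string)) int {
	var (
		mu    sync.Mutex
		count int
		wg    sync.WaitGroup
	)
	drain := func(kind string, ch <-chan string) {
		defer wg.Done()
		for item := range ch {
			mu.Lock() // serialize handler calls and the counter
			handle(kind, item)
			count++
			mu.Unlock()
		}
	}
	wg.Add(2)
	go drain("pod", pods)
	go drain("event", events)
	wg.Wait()
	return count
}

func main() {
	pods := make(chan string, 1)
	events := make(chan string, 2)
	pods <- "pod-updated"
	close(pods) // closing one channel early does not spin the other watcher
	events <- "event-1"
	events <- "event-2"
	close(events)

	n := watchBoth(pods, events, func(kind, item string) { fmt.Println(kind, item) })
	fmt.Println("handled", n)
}
```

An alternative with the same effect is to keep the single `select` and set a channel variable to nil once it is closed, since a nil channel is never selected.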
**Which issue(s) this PR fixes**:
Fixes #62571
**Release note**:
```release-note
Fix potential infinite loop that can occur when NFS PVs are recycled.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix Forward chain default reject policy for IPVS proxier
**What this PR does / why we need it**:
Testing with the IPVS mode proxier on a host with iptables FORWARD policy = DROP, as configured by docker in recent versions, I found that traffic to NodePorts failed when the NodePort forwarded the traffic to another node.
I saw the iptables FORWARD=DROP counter increasing with each packet.
IPVS mode should whitelist such traffic in a similar way to the iptables mode:
PR implementing the fix for iptables mode: #52569
**Which issue(s) this PR fixes**:
Fixes #59656
**Special notes for your reviewer**:
**Release note**:
```release-note
Fix Forward chain default reject policy for IPVS proxier
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
use glog.Infof instead of glog.Info in volume_host
**What this PR does / why we need it**:
use glog.Infof instead of glog.Info
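glog documents `Info` as handling its arguments in the manner of `fmt.Print` and `Infof` in the manner of `fmt.Printf`. The difference can be shown with the `fmt` equivalents; `infoStyle` and `infofStyle` are hypothetical stand-ins so the sketch runs without glog, and the message text is made up.

```go
package main

import "fmt"

// infoStyle formats its arguments the way fmt.Sprint does: format verbs
// in the first argument are not interpreted.
func infoStyle(args ...interface{}) string {
	return fmt.Sprint(args...)
}

// infofStyle formats its arguments the way fmt.Sprintf does.
func infofStyle(format string, args ...interface{}) string {
	return fmt.Sprintf(format, args...)
}

func main() {
	template := "volume %s mounted" // illustrative message only

	fmt.Println(infoStyle(template, "pvc-1"))  // volume %s mountedpvc-1
	fmt.Println(infofStyle(template, "pvc-1")) // volume pvc-1 mounted
}
```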
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
[GCE] Loadbalancer Tests
**What this PR does / why we need it**:
* Refactors existing Loadbalancer tests
  - Create a new v1.Service per test, instead of a global one
  - Encapsulate checking resource creation/deletion for internal and external loadbalancers in functions
* Adds tests for `gce_loadbalancer.go`, bringing coverage from 10.3% to 65.4%
**Release note**:
```release-note
NONE
```
In WatchPod(), if one of the two channels being watched (pod updates and
events) is closed, the for/select loop turns into a tight infinite loop because
the select immediately falls through due to the channel being closed. Watch
them independently instead.
Automatic merge from submit-queue (batch tested with PRs 62486, 62471, 62183). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
sarapprover: remove self node cert
The functionality to bootstrap node certificates is ready but is blocked by a separable issue discussed in: https://github.com/kubernetes/community/pull/1982. The functionality could be useful for power users who want to write their own approvers if the feature could be promoted to beta. In its current state this feature doesn't help anybody.
I propose that we remove automated approval of node serving certificates for now and work towards getting the node functionality to beta.
cc @awly @kubernetes/sig-auth-pr-reviews
```release-note
Remove alpha functionality that allowed the controller manager to approve kubelet server certificates.
```
Automatic merge from submit-queue (batch tested with PRs 62486, 62471, 62183). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
CSI - Update to apply fsGroup volume ownership
**What this PR does / why we need it**:
This PR fixes the CSI internal driver to correctly apply the fsGroup volume ownership value during mount.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62413
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 61306, 60270, 62496, 62181, 62234). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
split up the huge set of flags into smaller option structs
**What this PR does / why we need it**:
To make this generic, we do the following work:
1. Splitting `KubeControllerManagerConfiguration` in kube-controller-manager and cloud-controller-manager into several smaller option structs grouped by controller, and modifying the related flags. Also part of #59483.
2. Splitting `componentconfig` in controller-manager into several smaller configs, also grouped by controller.
All of this work follows #59582, using the `option+config` logic.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
The functionality to bootstrap node certificates is ready but is blocked
by a separable issue discussed in:
https://github.com/kubernetes/community/pull/1982. The functionality
could be useful for power users who want to write their own approvers if
the feature could be promoted to beta. In its current state this
feature doesn't help anybody.
I propose that we remove automated approval of node serving certificates
for now and work towards getting the node functionality to beta.
Automatic merge from submit-queue (batch tested with PRs 60476, 62462, 61391, 62535, 62394). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Revert "git: Use VolumeHost.GetExec() to execute stuff in volume plugins"
This reverts commit c578542ad7 (PR #51098). The PR added support for containerized git, but it required git 1.8.5 or newer. This breaks git volumes on older distros (CentOS 7, Ubuntu 14.04) that ship an older git.
Git volumes are getting deprecated (https://github.com/kubernetes/kubernetes/issues/60999) so we should restore it to the last working state and not touch it any longer.
**Release note**:
```release-note
gitRepo volumes in pods no longer require git 1.8.5 or newer; older git versions are now supported too.
```
I'd like to cherry-pick it into 1.10.
/sig storage
Automatic merge from submit-queue (batch tested with PRs 60476, 62462, 61391, 62535, 62394). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Use aws.Int64Value in place of the deprecated function orZero
**What this PR does / why we need it**:
```
// orZero returns the value, or 0 if the pointer is nil
// Deprecated: prefer aws.Int64Value
func orZero(v *int64) int64 {
	return aws.Int64Value(v)
}
```
Use aws.Int64Value in place of the deprecated function orZero, and remove the now-unused orZero.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60476, 62462, 61391, 62535, 62394). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Private mount propagation
This PR changes the default mount propagation from "rslave" (newly added in 1.10) to "private" (default in 1.9 and before). "rslave" as default causes regressions, see below.
Value `"None"` has to be added to `MountPropagationMode` enum in API ("I don't want any propagation at all"), which translates to "private" on Linux. [We did not have use cases for it](https://github.com/kubernetes/community/pull/659#discussion_r131454319), but we have them now.
**Which issue(s) this PR fixes**
Fixes #62397, fixes #62396
**Special notes for your reviewer**:
CRI already has an option for private mount propagation in volumes, however it's called "PRIVATE", while Kubernetes API value is "None". I did not change PRIVATE to NONE to keep the interface stable. See `kubelet_pods.go`.
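The mapping described above can be sketched as a small translation function. The constant names mirror the API values mentioned in this PR, but `linuxPropagation` is a hypothetical helper, not the code in `kubelet_pods.go`; the "rshared" mapping for `Bidirectional` is the usual Linux term and is not stated in this description.

```go
package main

import "fmt"

// MountPropagationMode mirrors the API enum values discussed in this PR.
type MountPropagationMode string

const (
	MountPropagationNone            MountPropagationMode = "None"
	MountPropagationHostToContainer MountPropagationMode = "HostToContainer"
	MountPropagationBidirectional   MountPropagationMode = "Bidirectional"
)

// linuxPropagation is a hypothetical helper that translates an API value
// into Linux mount propagation terminology. A nil mode means "unset" and
// yields the default, which this PR changes back to private.
func linuxPropagation(mode *MountPropagationMode) (string, error) {
	if mode == nil {
		return "private", nil
	}
	switch *mode {
	case MountPropagationNone:
		return "private", nil
	case MountPropagationHostToContainer:
		return "rslave", nil
	case MountPropagationBidirectional:
		return "rshared", nil
	default:
		return "", fmt.Errorf("invalid mount propagation mode %q", *mode)
	}
}

func main() {
	h2c := MountPropagationHostToContainer
	for _, m := range []*MountPropagationMode{nil, &h2c} {
		p, err := linuxPropagation(m)
		fmt.Println(p, err)
	}
}
```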
**Release note**:
```release-note
Default mount propagation has changed from "HostToContainer" ("rslave" in Linux terminology) to "None" ("private") to match the behavior in 1.9 and earlier releases. "HostToContainer" as a default caused regressions in some pods.
```
/sig storage
/sig node
Automatic merge from submit-queue (batch tested with PRs 62467, 62482, 62211). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Improve performance of affinity/anti-affinity predicate by 20x in large clusters
**What this PR does / why we need it**:
Improves performance of the affinity/anti-affinity predicate by over 20x in large clusters. The improvement is smaller in small clusters, but is still significant at about 4x. Also, before this PR, the predicate's run time grew quadratically with the number of nodes and pods. As the results show, the slowdown is now linear in larger clusters.
Previously, the affinity/anti-affinity predicate checked all pods of the cluster for each node in the cluster to determine the feasibility of the affinity/anti-affinity terms of the pod being scheduled. This optimization first finds, once, all the pods in the cluster that match the affinity/anti-affinity terms of the pod being scheduled and stores the metadata. It then only checks the topology of the matching pods for each node in the cluster.
This results in a major reduction of the search space per node and improves performance significantly.
The results below were obtained by running the scheduler benchmarks:
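The shape of the optimization can be sketched for a single anti-affinity term. This is a heavily simplified model, not the scheduler's code: `Pod`, `Node`, `occupiedTopologies`, and `violatesAntiAffinity` are hypothetical, the selector is a plain label map, and the precomputed metadata is reduced to a set of topology values.

```go
package main

import "fmt"

// Pod and Node are simplified stand-ins for the real API objects.
type Pod struct {
	Name   string
	Labels map[string]string
	Node   string
}

type Node struct {
	Name   string
	Labels map[string]string
}

// selectorMatches reports whether labels contain every key/value in selector.
func selectorMatches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// occupiedTopologies scans all pods once and records the topology values
// (for topologyKey) of nodes that run a pod matching the selector.
func occupiedTopologies(pods []Pod, nodes map[string]Node, selector map[string]string, topologyKey string) map[string]bool {
	occupied := make(map[string]bool)
	for _, p := range pods {
		if !selectorMatches(selector, p.Labels) {
			continue
		}
		if v, ok := nodes[p.Node].Labels[topologyKey]; ok {
			occupied[v] = true
		}
	}
	return occupied
}

// violatesAntiAffinity is the per-node check: a single map lookup instead
// of another pass over every pod in the cluster.
func violatesAntiAffinity(n Node, occupied map[string]bool, topologyKey string) bool {
	v, ok := n.Labels[topologyKey]
	return ok && occupied[v]
}

func main() {
	nodes := map[string]Node{
		"n1": {Name: "n1", Labels: map[string]string{"zone": "a"}},
		"n2": {Name: "n2", Labels: map[string]string{"zone": "a"}},
		"n3": {Name: "n3", Labels: map[string]string{"zone": "b"}},
	}
	pods := []Pod{{Name: "web-1", Labels: map[string]string{"app": "web"}, Node: "n1"}}

	occupied := occupiedTopologies(pods, nodes, map[string]string{"app": "web"}, "zone")
	for _, name := range []string{"n1", "n2", "n3"} {
		fmt.Println(name, violatesAntiAffinity(nodes[name], occupied, "zone"))
	}
}
```

The cost becomes one pass over the pods plus one lookup per node, rather than a pass over all pods for every node.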
```
make test-integration WHAT=./test/integration/scheduler_perf KUBE_TEST_ARGS="-run=xxx -bench=.*BenchmarkSchedulingAntiAffinity"
```
```
AntiAffinity Topology: Hostname
before: BenchmarkSchedulingAntiAffinity/500Nodes/250Pods-12 37031638 ns/op
after: BenchmarkSchedulingAntiAffinity/500Nodes/250Pods-12 10373222 ns/op
before: BenchmarkSchedulingAntiAffinity/500Nodes/5000Pods-12 134205302 ns/op
after: BenchmarkSchedulingAntiAffinity/500Nodes/5000Pods-12 12000580 ns/op
before: BenchmarkSchedulingAntiAffinity/1000Nodes/10000Pods-12 498439953 ns/op
after: BenchmarkSchedulingAntiAffinity/1000Nodes/10000Pods-12 24692552 ns/op
AntiAffinity Topology: Region
before: BenchmarkSchedulingAntiAffinity/500Nodes/250Pods-12 60003672 ns/op
after: BenchmarkSchedulingAntiAffinity/500Nodes/250Pods-12 13346400 ns/op
before: BenchmarkSchedulingAntiAffinity/1000Nodes/10000Pods-12 600085491 ns/op
after: BenchmarkSchedulingAntiAffinity/1000Nodes/10000Pods-12 27783333 ns/op
```
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
ref/ #56032 #47318 #25319
**Release note**:
```release-note
improve performance of affinity/anti-affinity predicate of default scheduler significantly.
```
/sig scheduling
Automatic merge from submit-queue (batch tested with PRs 62467, 62482, 62211). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Ensure resources created by `run --rm` are cleaned up
**Release note**:
```release-note
NONE
```
Resources created by `kubectl run --rm ...` are now cleaned up, even in the event of an error.
Relevant downstream issue: https://github.com/openshift/origin/issues/13276
cc @soltysh
Automatic merge from submit-queue (batch tested with PRs 62467, 62482, 62211). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix nsenter GetFileType issue in containerized kubelet
**What this PR does / why we need it**:
UnmountDevice fails in a containerized kubelet in v1.9.x, and as a result a pod with a volume mount cannot be rescheduled from one node to another since the original volume is never released. This PR fixes this issue.
In [nsenter_mount.GetFileType func](https://github.com/kubernetes/kubernetes/blob/master/pkg/util/mount/nsenter_mount.go#L238), the error returned by the following code differs from that of [mount_linux.GetFileType func](https://github.com/kubernetes/kubernetes/blob/master/pkg/util/mount/mount.go#L347):
```
outputBytes, err := mounter.ne.Exec("stat", []string{"-L", `--printf "%F"`, pathname}).CombinedOutput()
```
The returned error and output in nsenter_mount.GetFileType func look like the following:
error: `exit status 1`
output: `/usr/bin/stat: cannot stat '2': No such file or directory`
This PR makes the returned error consistent with mount_linux.GetFileType func, which in turn makes [isDeviceOpened func](https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/util/operationexecutor/operation_generator.go#L1340) work.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62282
**Special notes for your reviewer**:
/assign @dixudx
**Release note**:
```
fix nsenter GetFileType issue in containerized kubelet
```
/sig node
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add msau42 to approvers for volume scheduling
**What this PR does / why we need it**:
Add me as an approver for the volume scheduling code
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
The err has checked in TearDownAt func/kind bug
**What this PR does / why we need it**:
The err has already been checked in the TearDownAt func.
/kind bug
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```