Automatic merge from submit-queue (batch tested with PRs 65582, 65480, 65310, 65644, 65645). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix test failure of truncated time
**What this PR does / why we need it**:
The test of `TestFsStoreAssignedModified` in `pkg/kubelet/kubeletconfig/checkpoint/store` fails in my environment like below.
```
$ make test WHAT=./pkg/kubelet/kubeletconfig/checkpoint/store/
Running tests for APIVersion: v1,admissionregistration.k8s.io/v1alpha1,admissionregistration.k8s.io/v1beta1,admission.k8s.io/v1beta1,apps/v1beta1,apps/v1beta2,apps/v1,authentication.k8s.io/v1,authentication.k8s.io/v1beta1,authorization.k8s.io/v1,authorization.k8s.io/v1beta1,autoscaling/v1,autoscaling/v2beta1,batch/v1,batch/v1beta1,batch/v2alpha1,certificates.k8s.io/v1beta1,coordination.k8s.io/v1beta1,extensions/v1beta1,events.k8s.io/v1beta1,imagepolicy.k8s.io/v1alpha1,networking.k8s.io/v1,policy/v1beta1,rbac.authorization.k8s.io/v1,rbac.authorization.k8s.io/v1beta1,rbac.authorization.k8s.io/v1alpha1,scheduling.k8s.io/v1alpha1,scheduling.k8s.io/v1beta1,settings.k8s.io/v1alpha1,storage.k8s.io/v1beta1,storage.k8s.io/v1,storage.k8s.io/v1alpha1,
+++ [0628 22:53:39] Running tests without code coverage
--- FAIL: TestFsStoreAssignedModified (0.00s)
fsstore_test.go:316: expect "2018-06-28T22:53:43+09:00" but got "2018-06-28T22:53:43+09:00"
FAIL
FAIL k8s.io/kubernetes/pkg/kubelet/kubeletconfig/checkpoint/store 0.236s
make: *** [test] Error 1
```
My environment is
OS: macOS Sierra Version 10.12.6
File System: Journaled HFS+
The error message confused me because the comparing times looked the same in the error log. If we know certain systems truncate times, I think we can just compare less precise times to avoid confusions in tests.
**Special notes for your reviewer**:
N/A
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60150, 65467, 65487, 65595, 65374). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubelet: feature gate LSI capacity calculation
Currently if `cm.cadvisorInterface.RootFsInfo()` fails, the whole kubelet bails. If `/var/lib/kubelet` is on a tmpfs or bindmount, this can happen (this is the case for some of our CI envs https://github.com/openshift/origin/issues/19948).
We would be able to workaround this, in the short term, by disabling the LSI feature gate if the capacity calculate was protected by the gate, but currently it isn't.
This PR adds the gate check around setting the ephemeral storage capacity.
@liggitt @derekwaynecarr @dashpole
It might be a different discussion about whether or not this should be fatal. If it isn't fatal, seems that it would just prevent pods that had a ephemeral storage request from being scheduled.
/sig node
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Revert "certs: only append locally discovered addresses when we got none from the cloudprovider"
This reverts commit 7354bbe5ac.
https://github.com/kubernetes/kubernetes/pull/61869 caused a mismatch between the requested CSR and the addresses in node status.
Instead of computing addresses in two places, the cert manager should derive its CSR request from the addresses in node status. This would enable the kubelet to react to address changes, as well as be driven by an external cloud provider.
/cc @mikedanese
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add support for plugin directory hierarchy
**What this PR does / why we need it**:
Add hierarchy support for plugin directory, it traverses and
watch plugin directory and its sub directory recursively.
plugin socket file only need be unique within one directory,
```
plugin socket directory
|
---->sub directory 1
| |
| -----> socket1, socket2 ...
----->sub directory 2
|
------> socket1, socket2 ...
```
the design itself allow sub directory be anything,
but in practical, each plugin type could just use one sub directory.
**Which issue(s) this PR fixes**:
Fixes#64003
**Special notes for your reviewer**:
twos bonus changes added as below
1) propose to let pluginWatcher bookkeeping registered plugins,
to make sure plugin name is unique within one plugin type.
arguably, we could let each handler do the same work, but it requires
every handler repeat the same thing.
2) extract example handler out from test, it is easier to read the code with the
seperation.
**Release note**:
```release-note
N/A
```
/sig node
/cc @vikaschoudhary16 @jiayingz @RenaudWasTaken @vishh @derekwaynecarr @saad-ali @vladimirvivien @dchen1107 @yujuhong @tallclair @Random-Liu @anfernee @akutz
Automatic merge from submit-queue (batch tested with PRs 65453, 65523, 65513, 65560). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Cleanup verbose cAdvisor mocking in Kubelet unit tests
These tests had a lot of duplicate code to set up the cAdvisor mock, but weren't really depending on the mock functionality. By moving the tests to use the fake cAdvisor, most of the setup can be cleaned up.
/kind cleanup
/sig node
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59214, 65330). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Migrate cpumanager to use checkpointing manager
**What this PR does / why we need it**:
This PR migrates `cpumanager` to use new kubelet level node checkpointing feature (#56040) to decrease code redundancy and improve consistency.
**Which issue(s) this PR fixes**:
Fixes#58339
**Notes**:
At point of submitting PR the most straightforward approach was used - `state_checkpoint` implementation of `State` interface was added. However, with checkpointing implementation there might be no point to keep `State` interface and just use single implementation with checkpoint backend and in case of different backend than filestore needed just supply `cpumanager` with custom `CheckpointManager` implementation.
/kind feature
/sig node
cc @flyingcougar @ConnorDoyle
it traverses and watch plugin directory and its sub directory recursively,
plugin socket file only need be unique within one directory,
- plugin socket directory
- |
- ---->sub directory 1
- | |
- | -----> socket1, socket2 ...
- ----->sub directory 2
- |
- ------> socket1, socket2 ...
the design itself allow sub directory be anything,
but in practical, each plugin type could just use one sub directory.
four bonus changes added as below
1. extract example handler out from test, it is easier to read the code
with the seperation.
2. there are two variables here: "Watcher" and "watcher".
"Watcher" is the plugin watcher, and "watcher" is the fsnotify watcher.
so rename the "watcher" to "fsWatcher" to make code easier to
understand.
3. change RegisterCallbackFn() return value order, it is
conventional to return error last, after this change,
the pkg/volume/csi is compliance with golint, so remove it
from hack/.golint_failures
4. refactor errors handling at invokeRegistrationCallbackAtHandler()
to make error message more clear.
Automatic merge from submit-queue (batch tested with PRs 61330, 64793, 64675, 65059, 65368). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixes data races for pkg/kubelet/config/file_linux_test.go
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#64655
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 65290, 65326, 65289, 65334, 64860). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
checkLimitsForResolvConf for the pod create and update events instead of checking period
**What this PR does / why we need it**:
- Check for the same at pod create and update events instead of checking continuously for every 30 seconds.
- Increase the logging level to 4 or higher since the event is not catastrophic to cluster health .
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#64849
**Special notes for your reviewer**:
@ravisantoshgudimetla
**Release note**:
```release-note
checkLimitsForResolvConf for the pod create and update events instead of checking period
```
Automatic merge from submit-queue (batch tested with PRs 65187, 65206, 65223, 64752, 65238). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Kubelet watches necessary secrets/configmaps instead of periodic polling
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
move oldNodeUnschedulable pkg var to kubelet struct
**What this PR does / why we need it**:
move oldNodeUnschedulable pkg var to kubelet struct
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 58690, 64773, 64880, 64915, 64831). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
ignore not found file error when watching manifests
**What this PR does / why we need it**:
An alternative of #63910.
When using vim to create a new file in manifest folder, a temporary file, with an arbitrary number (like 4913) as its name, will be created to check if a directory is writable and see the resulting ACL.
These temporary files will be deleted later, which should by ignored when watching the manifest folder.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#55928, #59009, #48219
**Special notes for your reviewer**:
/cc dims luxas yujuhong liggitt tallclair
**Release note**:
```release-note
ignore not found file error when watching manifests
```
Automatic merge from submit-queue (batch tested with PRs 64688, 64451, 64504, 64506, 56358). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
cleanup some dead kubelet code
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 65032, 63471, 64104, 64672, 64427). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
pkg: kubelet: remote: increase grpc client default size to 16MiB
**What this PR does / why we need it**:
Increase the gRPC max message size to 16MB in the remote container runtime. I've seen sizes over 8MB in clusters with big (256GB RAM) nodes.
**Release note**:
```release-note
Increase the gRPC max message size to 16MB in the remote container runtime.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
bind alpha feature network plugin flags correctly
**What this PR does / why we need it**:
When working #63542, I found the flags, like `--cni-conf-dir` and `cni-bin-dir`, were not correctly bound.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
/cc kubernetes/sig-node-pr-reviews
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 64142, 64426, 62910, 63942, 64548). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Clean up fake mounters.
**What this PR does / why we need it**:
Fixes https://github.com/kubernetes/kubernetes/issues/61502
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
list of fake mounters:
- (keep) pkg/util/mount.FakeMounter
- (removed) pkg/kubelet/cm.fakeMountInterface:
- (inherit from mount.FakeMounter) pkg/util/mount.fakeMounter
- (inherit from mount.FakeMounter) pkg/util/removeall.fakeMounter
- (removed) pkg/volume/host_path.fakeFileTypeChecker
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 65230, 57355, 59174, 63698, 63659). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
TODO has already been implemented
**What this PR does / why we need it**:
TODO has already been implemented, remove the TODO tag.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```NONE
If we create a new key on each CSR, if CSR fails the next attempt will
create a new one instead of reusing previous CSR.
If approver/signer don't handle CSRs as quickly as new nodes come up,
they can pile up and approver would keep handling old abandoned CSRs and
Nodes would keep timing out on startup.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
dockershim/network: add dcbw to OWNERS as an approver
I've been involved with the kubelet network code, including most
of this code, for a couple years and contributed a good number
of PRs for these directories. I've also been a SIG Network
co-lead for couple years.
I've also been on the CNI maintainers team for a couple years.
```release-note
NONE
```
@freehan @thockin @kubernetes/sig-network-pr-reviews
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Revert #64189: Fix Windows CNI for the sandbox case
**What this PR does / why we need it**:
This reverts PR #64189, which breaks DNS for Windows containers.
Refer https://github.com/kubernetes/kubernetes/pull/64189#issuecomment-395248704
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#64861
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
cc @madhanrm @PatrickLang @alinbalutoiu @dineshgovindasamy
Automatic merge from submit-queue (batch tested with PRs 64503, 64903, 64643, 64987). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Use unix.EpollWait to determine when memcg events are available to be Read
**What this PR does / why we need it**:
This fixes a file descriptor leak introduced in https://github.com/kubernetes/kubernetes/pull/60531 when the `--experimental-kernel-memcg-notification` kubelet flag is enabled. The root of the issue is that `unix.Read` blocks indefinitely when reading from an event file descriptor and there is nothing to read. Since we refresh the memcg notifications, these reads accumulate until the memcg threshold is crossed, at which time all reads complete. However, if the node never comes under memory pressure, the node can run out of file descriptors.
This PR changes the eviction manager to use `unix.EpollWait` to wait, with a 10 second timeout, for events to be available on the eventfd. We only read from the eventfd when there is an event available to be read, preventing an accumulation of `unix.Read` threads, and allowing the event file descriptors to be reclaimed by the kernel.
This PR also breaks the creation, and updating of the memcg threshold into separate portions, and performs creation before starting the periodic synchronize calls. It also moves the logic of configuring memory thresholds into memory_threshold_notifier into a separate file.
This also reverts https://github.com/kubernetes/kubernetes/pull/64582, as the underlying leak that caused us to disable it for testing is fixed here.
Fixes#62808
**Release note**:
```release-note
NONE
```
/sig node
/kind bug
/priority critical-urgent
I've been involved with the kubelet network code, including most
of this code, for a couple years and contributed a good number
of PRs for these directories. I've also been a SIG Network
co-lead for couple years.
I've also been on the CNI maintainers team for a couple years.
Automatic merge from submit-queue (batch tested with PRs 63905, 64855). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Setup dns servers and search domains for Windows Pods
**What this PR does / why we need it**:
Kubelet is depending on docker container's ResolvConfPath (e.g. /var/lib/docker/containers/439efe31d70fc17485fb6810730679404bb5a6d721b10035c3784157966c7e17/resolv.conf) to setup dns servers and search domains. While this is ok for Linux containers, ResolvConfPath is always an empty string for windows containers. So that the DNS setting for windows containers is always not set.
This PR setups DNS for Windows sandboxes. In this way, Windows Pods could also use kubernetes dns policies.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#61579
**Special notes for your reviewer**:
Requires Docker EE version >= 17.10.0.
**Release note**:
```release-note
Setup dns servers and search domains for Windows Pods in dockershim. Docker EE version >= 17.10.0 is required for propagating DNS to containers.
```
/cc @PatrickLang @taylorb-microsoft @michmike @JiangtianLi