Automatic merge from submit-queue
Prevent kube-proxy from panicing when sysfs is mounted as read-only.
Fixes https://github.com/kubernetes/kubernetes/issues/25543.
This PR:
* Checks the permission of sysfs before setting conntrack hashsize, and returns an error "readOnlySysFSError" if sysfs is readonly. As I know, this is the only place we need write permission to sysfs, CMIIW.
* Update a new node condition 'RuntimeUnhealthy' with specific reason, message and hit to the administrator about the remediation.
I think this should be an acceptable fix for now.
Node problem detector is designed to integrate with different problem daemons, but **the main logic is in the problem detection phase**. After the problem is detected, what node problem detector does is also simply updating a node condition.
If we let kube-proxy pass the problem to node problem detector and let node problem detector update the node condition. It looks like an unnecessary hop. The logic in kube-proxy won't be different from this PR, but node problem detector will have to open an unsafe door to other pods because the lack of authentication mechanism.
It is a bit hard to test this PR, because we don't really have a bad docker in hand. I can only manually test it:
* If I manually change the code to let it return `"readOnlySysFSError`, the node condition will be updated:
```
NetworkUnavailable False Mon, 01 Jan 0001 00:00:00 +0000 Fri, 08 Jul 2016 01:36:41 -0700 RouteCreated RouteController created a route
OutOfDisk False Fri, 08 Jul 2016 01:37:36 -0700 Fri, 08 Jul 2016 01:34:49 -0700 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Fri, 08 Jul 2016 01:37:36 -0700 Fri, 08 Jul 2016 01:34:49 -0700 KubeletHasSufficientMemory kubelet has sufficient memory available
Ready True Fri, 08 Jul 2016 01:37:36 -0700 Fri, 08 Jul 2016 01:35:26 -0700 KubeletReady kubelet is posting ready status. WARNING: CPU hardcapping unsupported
RuntimeUnhealthy True Fri, 08 Jul 2016 01:35:31 -0700 Fri, 08 Jul 2016 01:35:31 -0700 ReadOnlySysFS Docker unexpectedly mounts sysfs as read-only for privileged container (docker issue #24000). This causes the critical system components of Kubernetes not properly working. To remedy this please restart the docker daemon.
KernelDeadlock False Fri, 08 Jul 2016 01:37:39 -0700 Fri, 08 Jul 2016 01:35:34 -0700 KernelHasNoDeadlock kernel has no deadlock
Addresses: 10.240.0.3,104.155.176.101
```
* If not, the node condition `RuntimeUnhealthy` won't appear.
* If I run the permission checking code in a unprivileged container, it did return `readOnlySysFSError`.
I'm not sure whether we want to mark the node as `Unscheduable` when this happened, which only needs few lines change. I can do that if we think we should.
I'll add some unit test if we think this fix is acceptable.
/cc @bprashanth @dchen1107 @matchstick @thockin @alex-mohr
Mark P1 to match the original issue.
[]()