Modify kubelet plugin watcher to support older CSI drivers that use an
the old plugins directory for socket registration.
Also modify CSI plugin registration to support multiple versions of CSI
registering with the same name.
- Move from the old github.com/golang/glog to k8s.io/klog
- klog as explicit InitFlags() so we add them as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
* github.com/kubernetes/repo-infra
* k8s.io/gengo/
* k8s.io/kube-openapi/
* github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods
Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
When node lease feature is enabled, kubelet reports node status to api server
only if there is some change or it didn't report over last report interval.
Individual implementations are not yet being moved.
Fixed all dependencies which call the interface.
Fixed golint exceptions to reflect the move.
Added project info as per @dims and
https://github.com/kubernetes/kubernetes-template-project.
Added dims to the security contacts.
Fixed minor issues.
Added missing template files.
Copied ControllerClientBuilder interface to cp.
This allows us to break the only dependency on K8s/K8s.
Added TODO to ControllerClientBuilder.
Fixed GoDeps.
Factored in feedback from JustinSB.
Automatic merge from submit-queue (batch tested with PRs 65251, 67255, 67224, 67297, 68105). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Promote mount propagation to GA
**What this PR does / why we need it**:
This PR promotes mount propagation to GA.
Website PR: https://github.com/kubernetes/website/pull/9823
**Release note**:
```release-note
Mount propagation has promoted to GA. The `MountPropagation` feature gate is deprecated and will be removed in 1.13.
```
Automatic merge from submit-queue (batch tested with PRs 64283, 67910, 67803, 68100). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
CSI Cluster Registry and Node Info CRDs
**What this PR does / why we need it**:
Introduces the new `CSIDriver` and `CSINodeInfo` API Object as proposed in https://github.com/kubernetes/community/pull/2514 and https://github.com/kubernetes/community/pull/2034
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/features/issues/594
**Special notes for your reviewer**:
Per the discussion in https://groups.google.com/d/msg/kubernetes-sig-storage-wg-csi/x5CchIP9qiI/D_TyOrn2CwAJ the API is being added to the staging directory of the `kubernetes/kubernetes` repo because the consumers will be attach/detach controller and possibly kubelet, but it will be installed as a CRD (because we want to move in the direction where the API server is Kubernetes agnostic, and all Kubernetes specific types are installed).
**Release note**:
```release-note
Introduce CSI Cluster Registration mechanism to ease CSI plugin discovery and allow CSI drivers to customize Kubernetes' interaction with them.
```
CC @jsafrane
Automatic merge from submit-queue (batch tested with PRs 64283, 67910, 67803, 68100). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Add a ProcMount option to the SecurityContext & AllowedProcMountTypes to PodSecurityPolicy
So there is a bit of a chicken and egg problem here in that the CRI runtimes will need to implement this for there to be any sort of e2e testing.
**What this PR does / why we need it**: This PR implements design proposal https://github.com/kubernetes/community/pull/1934. This adds a ProcMount option to the SecurityContext and AllowedProcMountTypes to PodSecurityPolicy
Relies on https://github.com/google/cadvisor/pull/1967
**Release note**:
```release-note
ProcMount added to SecurityContext and AllowedProcMounts added to PodSecurityPolicy to allow paths in the container's /proc to not be masked.
```
cc @Random-Liu @mrunalp
Automatic merge from submit-queue (batch tested with PRs 67739, 65222). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Honor --hostname-override, report compatible hostname addresses with cloud provider
xref #677147828e5d made cloud providers authoritative for the addresses reported on Node objects, so that the addresses used by the node (and requested as SANs in serving certs) could be verified via cloud provider metadata.
This had the effect of no longer reporting addresses of type Hostname for Node objects for some cloud providers. Cloud providers that have the instance hostname available in metadata should add a `type: Hostname` address to node status. This is being tracked in #67714
This PR does a couple other things to ease the transition to authoritative cloud providers:
* if `--hostname-override` is set on the kubelet, make the kubelet report that `Hostname` address. if it can't be verified via cloud-provider metadata (for cert approval, etc), the kubelet deployer is responsible for fixing the situation by adjusting the kubelet configuration (as they were in 1.11 and previously)
* if `--hostname-override` is not set, *and* the cloud provider didn't report a Hostname address, *and* the auto-detected hostname matches one of the addresses the cloud provider *did* report, make the kubelet report that as a Hostname address. That lets the addresses remain verifiable via cloud provider metadata, while still including a `Hostname` address whenever possible.
/sig node
/sig cloud-provider
/cc @mikedanese
fyi @hh
```release-note
NONE
```
This extends the Kubelet to create and periodically update leases in a
new kube-node-lease namespace. Based on [KEP-0009](https://github.com/kubernetes/community/blob/master/keps/sig-node/0009-node-heartbeat.md),
these leases can be used as a node health signal, and will allow us to
reduce the load caused by over-frequent node status reporting.
- add NodeLease feature gate
- add kube-node-lease system namespace for node leases
- add Kubelet option for lease duration
- add Kubelet-internal lease controller to create and update lease
- add e2e test for NodeLease feature
- modify node authorizer and node restriction admission controller
to allow Kubelets access to corresponding leases
Automatic merge from submit-queue (batch tested with PRs 65247, 63633, 67425). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix kubelet iptclient in ipv6 cluster
**What this PR does / why we need it**:
Kubelet uses "iptables" instead of "ip6tables" in an ipv6-only cluster. This causes failed traffic for type: LoadBalancer services (and probably a lot of other problems).
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#67398
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 66177, 66185, 67136, 67157, 65065). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubelet: reduce logging for backoff situations
xref https://bugzilla.redhat.com/show_bug.cgi?id=1555057#c6
Pods that are in `ImagePullBackOff` or `CrashLoopBackOff` currently generate a lot of logging at the `glog.Info()` level. This PR moves some of that logging to `V(3)` and avoids logging in situations where the `SyncPod` only fails because pod are in a BackOff error condition.
@derekwaynecarr @liggitt
Automatic merge from submit-queue (batch tested with PRs 67177, 53042). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Adding unit tests to methods of pod's format
What this PR does / why we need it:
Add unit test cases, thank you!
Automatic merge from submit-queue (batch tested with PRs 58755, 66414). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Use probe based plugin watcher mechanism in Device Manager
**What this PR does / why we need it**:
Uses this probe based utility in the device plugin manager.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#56944
**Notes For Reviewers**:
Changes are backward compatible and existing device plugins will continue to work. At the same time, any new plugins that has required support for probing model (Identity service implementation), will also work.
**Release note**
```release-note
Add support kubelet plugin watcher in device manager.
```
/sig node
/area hw-accelerators
/cc /cc @jiayingz @RenaudWasTaken @vishh @ScorpioCPH @sjenning @derekwaynecarr @jeremyeder @lichuqiang @tengqm @saad-ali @chakri-nelluri @ConnorDoyle
The --docker-disable-shared-pid flag has been deprecated since 1.10 and
has been superceded by ShareProcessNamespace in the pod API, which is
scheduled for beta in 1.12.
Automatic merge from submit-queue (batch tested with PRs 66152, 66406, 66218, 66278, 65660). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Refactor kubelet node status setters, add test coverage
This internal refactor moves the node status setters to a new package, explicitly injects dependencies to facilitate unit testing, and adds individual unit tests for the setters.
I gave each setter a distinct commit to facilitate review.
Non-goals:
- I intentionally excluded the class of setters that return a "modified" boolean, as I want to think more carefully about how to cleanly handle the behavior, and this PR is already rather large.
- I would like to clean up the status update control loops as well, but that belongs in a separate PR.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 65052, 65594). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Derive kubelet serving certificate CSR template from node status addresses
xref https://github.com/kubernetes/features/issues/267fixes#55633
Builds on https://github.com/kubernetes/kubernetes/pull/65587
* Makes the cloud provider authoritative when recording node status addresses
* Makes the node status addresses authoritative for the kube-apiserver determining how to speak to a kubelet (stops paying attention to the hostname label when determining how to reach a kubelet, which was only done to support kubelets < 1.5)
* Updates kubelet certificate rotation to be driven from node status
* Avoids needing to compute node addresses a second time, and differently, in order to request serving certificates.
* Allows the kubelet to react to changes in its status addresses by updating its serving certificate
* Allows the kubelet to be driven by external cloud providers recording node addresses on the node status
test procedure:
```sh
# setup
export FEATURE_GATES=RotateKubeletServerCertificate=true
export KUBELET_FLAGS="--rotate-server-certificates=true --cloud-provider=external"
# cleanup from previous runs
sudo rm -fr /var/lib/kubelet/pki/
# startup
hack/local-up-cluster.sh
# wait for a node to register, verify it didn't set addresses
kubectl get nodes
kubectl get node/127.0.0.1 -o jsonpath={.status.addresses}
# verify the kubelet server isn't available, and that it didn't populate a serving certificate
curl --cacert _output/certs/server-ca.crt -v https://localhost:10250/pods
ls -la /var/lib/kubelet/pki
# set an address on the node
curl -X PATCH http://localhost:8080/api/v1/nodes/127.0.0.1/status \
-H "Content-Type: application/merge-patch+json" \
--data '{"status":{"addresses":[{"type":"Hostname","address":"localhost"}]}}'
# verify a csr was submitted with the right SAN, and approve it
kubectl describe csr
kubectl certificate approve csr-...
# verify the kubelet connection uses a cert that is properly signed and valid for the specified hostname, but NOT the IP
curl --cacert _output/certs/server-ca.crt -v https://localhost:10250/pods
curl --cacert _output/certs/server-ca.crt -v https://127.0.0.1:10250/pods
ls -la /var/lib/kubelet/pki
# set an hostname and IP address on the node
curl -X PATCH http://localhost:8080/api/v1/nodes/127.0.0.1/status \
-H "Content-Type: application/merge-patch+json" \
--data '{"status":{"addresses":[{"type":"Hostname","address":"localhost"},{"type":"InternalIP","address":"127.0.0.1"}]}}'
# verify a csr was submitted with the right SAN, and approve it
kubectl describe csr
kubectl certificate approve csr-...
# verify the kubelet connection uses a cert that is properly signed and valid for the specified hostname AND IP
curl --cacert _output/certs/server-ca.crt -v https://localhost:10250/pods
curl --cacert _output/certs/server-ca.crt -v https://127.0.0.1:10250/pods
ls -la /var/lib/kubelet/pki
```
```release-note
* kubelets that specify `--cloud-provider` now only report addresses in Node status as determined by the cloud provider
* kubelet serving certificate rotation now reacts to changes in reported node addresses, and will request certificates for addresses set by an external cloud provider
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Store the latest cloud provider node addresses
**What this PR does / why we need it**:
Buffer the recently retrieved node address so they can be used as soon as the next node status update is run.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#65814
**Special notes for your reviewer**:
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 65290, 65326, 65289, 65334, 64860). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
checkLimitsForResolvConf for the pod create and update events instead of checking period
**What this PR does / why we need it**:
- Check for the same at pod create and update events instead of checking continuously for every 30 seconds.
- Increase the logging level to 4 or higher since the event is not catastrophic to cluster health .
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#64849
**Special notes for your reviewer**:
@ravisantoshgudimetla
**Release note**:
```release-note
checkLimitsForResolvConf for the pod create and update events instead of checking period
```
Automatic merge from submit-queue (batch tested with PRs 65187, 65206, 65223, 64752, 65238). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Kubelet watches necessary secrets/configmaps instead of periodic polling
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
move oldNodeUnschedulable pkg var to kubelet struct
**What this PR does / why we need it**:
move oldNodeUnschedulable pkg var to kubelet struct
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add proxy for container streaming in kubelet for streaming auth.
For https://github.com/kubernetes/kubernetes/issues/36666, option 2 of https://github.com/kubernetes/kubernetes/issues/36666#issuecomment-378440458.
This PR:
1. Removed the `DirectStreamingRuntime`, and changed `IndirectStreamingRuntime` to `StreamingRuntime`. All `DirectStreamingRuntime`s, `dockertools` and `rkt`, were removed.
2. Proxy container streaming in kubelet instead of returning redirect to apiserver. This solves the container runtime authentication issue, which is what we agreed on in https://github.com/kubernetes/kubernetes/issues/36666.
Please note that, this PR replaced the redirect with proxy directly instead of adding a knob to switch between the 2 behaviors. For existing CRI runtimes like containerd and cri-o, they should change to serve container streaming on localhost, so as to make the whole container streaming connection secure.
If a general authentication mechanism proposed in https://github.com/kubernetes/kubernetes/issues/62747 is ready, we can switch back to redirect, and all code can be found in github history.
Please also note that this added some overhead in kubelet when there are container streaming connections. However, the actual bottleneck is in the apiserver anyway, because it does proxy for all container streaming happens in the cluster. So it seems fine to get security and simplicity with this overhead. @derekwaynecarr @mrunalp Are you ok with this? Or do you prefer a knob?
@yujuhong @timstclair @dchen1107 @mikebrow @feiskyer
/cc @kubernetes/sig-node-pr-reviews
**Release note**:
```release-note
Kubelet now proxies container streaming between apiserver and container runtime. The connection between kubelet and apiserver is authenticated. Container runtime should change streaming server to serve on localhost, to make the connection between kubelet and container runtime local.
In this way, the whole container streaming connection is secure. To switch back to the old behavior, set `--redirect-container-streaming=true` flag.
```
While I normally try to avoid adding flags, this is a short term
scalability fix for v1.11, and there are other long-term solutions in
the works, so we shouldn't commit to this in the v1beta1 Kubelet config.
Flags are our escape hatch.
The goal of this change is to remove the registration of signal
handling from pkg/kubelet. We now pass in a stop channel.
If you register a signal handler in `main()` to aid in a controlled
and deliberate exit then the handler registered in `pkg/kubelet` often
wins and the process exits immediately. This means all other signal
handler registrations are currently racy if `DockerServer.Start()` is
directly or indirectly invoked.
This change also removes another signal handler registration from
`NewAPIServerCommand()`; a stop channel is now passed to this
function.
Automatic merge from submit-queue (batch tested with PRs 63314, 63884, 63799, 63521, 62242). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add memcg notifications for allocatable cgroup
**What this PR does / why we need it**:
Use memory cgroup notifications to trigger the eviction manager when the allocatable eviction threshold is crossed. This allows the eviction manager to respond more quickly when the allocatable cgroup's available memory becomes low. Evictions are preferable to OOMs in the cgroup since the kubelet can enforce its priorities on which pod is killed.
**Which issue(s) this PR fixes**:
Fixes https://github.com/kubernetes/kubernetes/issues/57901
**Special notes for your reviewer**:
This adds the alloctable cgroup from the container manager to the eviction config.
**Release note**:
```release-note
NONE
```
/sig node
/priority important-soon
/kind feature
I would like this to be included in the 1.11 release.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
track/close kubelet->API connections on heartbeat failure
xref #48638
xref https://github.com/kubernetes-incubator/kube-aws/issues/598
we're already typically tracking kubelet -> API connections and have the ability to force close them as part of client cert rotation. if we do that tracking unconditionally, we gain the ability to also force close connections on heartbeat failure as well. it's a big hammer (means reestablishing pod watches, etc), but so is having all your pods evicted because you didn't heartbeat.
this intentionally does minimal refactoring/extraction of the cert connection tracking transport in case we want to backport this
* first commit unconditionally sets up the connection-tracking dialer, and moves all the cert management logic inside an if-block that gets skipped if no certificate manager is provided (view with whitespace ignored to see what actually changed)
* second commit plumbs the connection-closing function to the heartbeat loop and calls it on repeated failures
follow-ups:
* consider backporting this to 1.10, 1.9, 1.8
* refactor the connection managing dialer to not be so tightly bound to the client certificate management
/sig node
/sig api-machinery
```release-note
kubelet: fix hangs in updating Node status after network interruptions/changes between the kubelet and API server
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Report node DNS info with --node-ip
**What this PR does / why we need it**:
This PR adds `ExternalDNS`, `InternalDNS`, and `ExternalIP` info for kubelets with the `--nodeip` flag enabled.
**Which issue(s) this PR fixes**
Fixes#63158
**Special notes for your reviewer**:
I added a field to the Kubelet to make IP validation more testable (`validateNodeIP` relies on the `net` package and the IP address of the host that is executing the test.) I also converted the test to use a table so new cases could be added more easily.
**Release Notes**
```release-note
Report node DNS info with --node-ip flag
```
@andrewsykim
@nckturner
/sig node
/sig network
Automatic merge from submit-queue (batch tested with PRs 62448, 59317, 59947, 62418, 62352). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubelet: add configuration to optionally enable server tls bootstrap
right now if the RotateKubeletServerCertificate feature is enabled,
kubelet will bootstrap server tls. this is undesirable if the deployment
is not or cannot run an approver to handle these certificate signing
requests.
Fixes https://github.com/kubernetes/kubernetes/issues/62077
```release-note
NONE
```
right now if the RotateKubeletServerCertificate feature is enabled,
kubelet will bootstrap server tls. this is undesirable if the deployment
is not or cannot run an approver to handle these certificate signing
requests.
The code was added to support rktnetes and non-CRI docker integrations.
These legacy integrations have already been removed from the codebase.
This change removes the compatibility code existing soley for the
legacy integrations.
Automatic merge from submit-queue (batch tested with PRs 57658, 61304, 61560, 61859, 61870). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
certs: exclude more nonsensical addresses from SANs
I noticed this when I saw 169.254.* SANs using server TLS bootstrap.
This change excludes more nonsensical addresses from being requested as
SANs in that flow.
I noticed this when I saw 169.254.* SANs using server TLS bootstrap.
This change excludes more nonsensical addresses from being requested as
SANs in that flow.
rktnetes is scheduled to be deprecated in 1.10 (#53601). According to
the deprecation policy for beta CLI and flags, we can remove the feature
in 1.11.
Fixes#58721
Automatic merge from submit-queue (batch tested with PRs 61487, 58353, 61078, 61219, 60792). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
remove dead code in kubelet
clean up dead code
/kind cleanup
/sig node
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59740, 59728, 60080, 60086, 58714). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubelet: make --cni-bin-dir accept a comma-separated list of CNI plugin directories
Allow CNI-related network plugin drivers (kubenet, cni) to search a list of
directories for plugin binaries instead of just one. This allows using an
administrator-provided path and fallbacks to others (like the previous default
of /opt/cni/bin) for backwards compatibility.
```release-note
kubelet's --cni-bin-dir option now accepts multiple comma-separated CNI binary directory paths, which are search for CNI plugins in the given order.
```
@kubernetes/rh-networking @kubernetes/sig-network-misc @freehan @pecameron @rajatchopra
Automatic merge from submit-queue (batch tested with PRs 51423, 53880). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Disable ImageGC when high threshold is set to 100
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
fixes#51268
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
The LocalStorageCapacityIsolation feature added a new resource type
ResourceEphemeralStorage "ephemeral-storage" so that this resource can
be allocated, limited, and consumed as the same way as CPU/memory. All
the features related to resource management (resource request/limit, quota, limitrange) are avaiable for local ephemeral storage.
This local ephemeral storage represents the storage for root file system, which will be consumed by containers' writtable layer and logs. Some volumes such as emptyDir might also consume this storage.
Allow CNI-related network plugin drivers (kubenet, cni) to search a list of
directories for plugin binaries instead of just one. This allows using an
administrator-provided path and fallbacks to others (like the previous default
of /opt/cni/bin) for backwards compatibility.
Automatic merge from submit-queue (batch tested with PRs 60157, 60337, 60246, 59714, 60467). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
backoff runtime errors in kubelet sync loop
The runtime health check can race with PLEG's first relist, and this
often results in an unnecessary 5 second wait during Kubelet bootstrap.
This change aims to improve the performance.
```release-note
NONE
```
The word 'manifest' technically refers to a container-group specification
that predated the Pod abstraction. We should avoid using this legacy
terminology where possible. Fortunately, the Kubelet's config API will
be beta in 1.10 for the first time, so we still had the chance to make
this change.
I left the flags alone, since they're deprecated anyway.
I changed a few var names in files I touched too, but this PR is the
just the first shot, not the whole campaign
(`git grep -i manifest | wc -l -> 1248`).
The runtime health check can race with PLEG's first relist, and this
often results in an unnecessary 5 second wait during Kubelet bootstrap.
This change aims to improve the performance.
Automatic merge from submit-queue (batch tested with PRs 54191, 59374, 59824, 55032, 59906). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Adding per container stats for CRI runtimes
**What this PR does / why we need it**
This commit aims to collect per container log stats. The change was proposed as a part of #55905. The change includes change the log path from /var/pod/<pod uid>/containername_attempt.log to /var/pod/<pod uid>/containername/containername_attempt.log. The logs are collected by reusing volume package to collect metrics from the log path.
Fixes#55905
**Special notes for your reviewer:**
cc @Random-Liu
**Release note:**
```
Adding container log stats for CRI runtimes.
```
This commit aims to collect per container log stats. The
change was proposed as a part of #55905. The change includes
change of the log path from /var/pod/<pod uid>/containername_attempt.log
to /var/pod/<pod uid>/containername/containername_attempt.log.
The logs are collected by reusing volume package to collect
metrics from the log path.
Signed-off-by: abhi <abhi@docker.com>
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Collect ephemeral storage capacity on initialization
**What this PR does / why we need it**:
We have had some node e2e flakes where a pod can be rejected if it requests ephemeral storage. This is because we don't set capacity and allocatable for ephemeral storage on initialization.
This PR causes cAdvisor to do one round of stats collection during initialization, which will allow it to get the disk capacity when it first sets the node status.
It also sets the node to NotReady if capacities have not been initialized yet.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/assign @jingxu97 @Random-Liu
/sig node
/kind bug
/priority important-soon
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix kubelet PVC stale metrics
**What this PR does / why we need it**:
Volumes on each node changes, we should not only add PVC metrics into
gauge vector. It's better use a collector to collector metrics from internal
stats.
Currently, if a PV (bound to a PVC `testpv`) is attached and used by node A, then migrated to node B or just deleted from node A later. `testpvc` metrics will not disappear from kubelet on node A. After a long running time, `kubelet` process will keep a lot of stale volume metrics in memory.
For these dynamic metrics, it's better to use a collector to collect metrics from a data source (`StatsProvider` here), like [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics) scraping metrics from kube-apiserver.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/57686
**Special notes for your reviewer**:
**Release note**:
```release-note
Fix kubelet PVC stale metrics
```
Automatic merge from submit-queue (batch tested with PRs 59276, 51042, 58973, 59377, 59472). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubelet: only register api source when connecting
**What this PR does / why we need it**:
before this change, an api source was always registered, even when there
was no kubeclient. this lead to some operations blocking waiting for
podConfig.SeenAllSources to pass, which it never would.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#59275
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
This adds context to all the relevant cloud provider interface signatures.
Callers of those APIs are currently satisfied using context.TODO().
There will be follow on PRs to push the context through the stack.
For an idea of the full scope of this change please look at PR #58532.
before this change, an api source was always registered, even when there
was no kubeclient. this lead to some operations blocking waiting for
podConfig.SeenAllSources to pass, which it never would.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Only rotate certificates in the background
Change the Kubelet to not block until the first certs have rotated (we didn't act on it anyway) and fall back to the bootstrap cert if the most recent rotated cert is expired on startup.
The certificate manager originally had a "block on startup" rotation behavior to ensure at least one rotation happened on startup. However, since rotation may not succeed within the first time window the code was changed to simply print the error rather than return it. This meant that the blocking rotation has no purpose - it cannot cause the kubelet to fail, and it *does* block the kubelet from starting static pods before the api server becomes available.
The current block behavior causes a bootstrapped kubelet that is also set to run static pods to wait several minutes before actually launching the static pods, which means self-hosted masters using static pods have a pointless delay on startup.
Since blocking rotation has no benefit and can't actually fail startup, this commit removes the blocking behavior and simplifies the code at the same time. The goroutine for rotation now completely owns the deadline, the shouldRotate() method is removed, and the method that sets rotationDeadline now returns it. We also explicitly guard against a negative sleep interval and omit the message.
Should have no impact on bootstrapping except the removal of a long delay on startup before static pods start.
The other change is that an expired certificate from the cert manager is *not* considered a valid cert, which triggers an immediate rotation. This causes the cert manager to fall back to the original bootstrap certificate until a new certificate is issued. This allows the bootstrap certificate on masters to be "higher powered" and allow the node to function prior to initial approval, which means someone configuring the masters with a pre-generated client cert can be guaranteed that the kubelet will be able to communicate to report self-hosted static pod status, even if the first client rotation hasn't happened. This makes master self-hosting more predictable for static configuration environments.
```release-note
When using client or server certificate rotation, the Kubelet will no longer wait until the initial rotation succeeds or fails before starting static pods. This makes running self-hosted masters with rotation more predictable.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove setInitError.
**What this PR does / why we need it**:
Removes setInitError, it's not sure it was ever really used, and it causes the kubelet to hang and get wedged.
**Which issue(s) this PR fixes**
Fixes#46086
**Special notes for your reviewer**:
If `initializeModules()` in `kubelet.go` encounters an error, it calls `runtimeState.setInitError(...)`
47d61ef472/pkg/kubelet/kubelet.go (L1339)
The trouble with this is that `initError` is never cleared, which means that `runtimeState.runtimeErrors()` always returns this `initError`, and thus pods never start sync-ing.
In normal operation, this is expected and desired because eventually the runtime is expected to become healthy, but in this case, `initError` is never updated, and so the system just gets wedged.
47d61ef472/pkg/kubelet/kubelet.go (L1751)
We could add some retry to `initializeModules()` but that seems unnecessary, as eventually we'd want to just die anyway. Instead, just log fatal and die, a supervisor will restart us.
Note, I'm happy to add some retry here too, if that makes reviewers happier.
**Release note**:
```release-note
Prevent kubelet from getting wedged if initialization of modules returns an error.
```
@feiskyer @dchen1107 @janetkuo
@kubernetes/sig-node-bugs
The certificate manager originally had a "block on startup" rotation
behavior to ensure at least one rotation happened on startup. However,
since rotation may not succeed within the first time window the code was
changed to simply print the error rather than return it. This meant that
the blocking rotation has no purpose - it cannot cause the kubelet to
fail, and it *does* block the kubelet from starting static pods before
the api server becomes available.
The current block behavior causes a bootstrapped kubelet that is also
set to run static pods to wait several minutes before actually launching
the static pods, which means self-hosted masters using static pods have
a pointless delay on startup.
Since blocking rotation has no benefit and can't actually fail startup,
this commit removes the blocking behavior and simplifies the code at the
same time. The goroutine for rotation now completely owns the deadline,
the shouldRotate() method is removed, and the method that sets
rotationDeadline now returns it. We also explicitly guard against a
negative sleep interval and omit the message.
Should have no impact on bootstrapping except the removal of a long
delay on startup before static pods start.
Also add a guard condition where if the current cert in the store is
expired, we fall back to the bootstrap cert initially (we use the
bootstrap cert to communicate with the server). This is consistent with
when we don't have a cert yet.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add deprecation warnings for rktnetes flags
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#53601
**Special notes for your reviewer**:
**Release note**:
```release-note
rktnetes has been deprecated in favor of rktlet. Please see https://github.com/kubernetes-incubator/rktlet for more information.
```
This moves plugin/pkg/scheduler to pkg/scheduler and
plugin/cmd/kube-scheduler to cmd/kube-scheduler.
Bulk of the work was done with gomvpkg, except for kube-scheduler main
package.
Automatic merge from submit-queue (batch tested with PRs 57127, 57011, 56754, 56601, 56483). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove hacks added for mesos
**What this PR does / why we need it**:
Since Mesos is no longer in your main repository and since we have
things like dynamic kubelet configuration in progress, we should
drop these undocumented, untested, private hooks.
cmd/kubelet/app/server.go::CreateAPIServerClientConfig
CreateAPIServerClientConfig::getRuntime
pkg/kubelet/kubelet_pods.go::getPhase
Also remove stuff from Dependencies struct that were specific to
the Mesos integration (ContainerRuntimeOptions and Options)
Also remove stale references in test/e2e and and test owners file
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Drop hacks used for Mesos integration that was already removed from main kubernetes repository
```
Automatic merge from submit-queue (batch tested with PRs 57172, 55382, 56147, 56146, 56158). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
delete useless params containerized
Signed-off-by: zhangjie <zhangjie0619@yeah.net>
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
delete useless params containerized
```
Since Mesos is no longer in your main repository and since we have
things like dynamic kubelet configuration in progress, we should
drop these undocumented, untested, private hooks.
cmd/kubelet/app/server.go::CreateAPIServerClientConfig
CreateAPIServerClientConfig::getRuntime
pkg/kubelet/kubelet_pods.go::getPhase
Also remove stuff from Dependencies struct that were specific to
the Mesos integration (ContainerRuntimeOptions and Options)
Also remove stale references in test/e2e and and test owners file
Automatic merge from submit-queue (batch tested with PRs 55812, 55752, 55447, 55848, 50984). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Initial basic bootstrap-checkpoint support
**What this PR does / why we need it**:
Adds initial support for Pod checkpointing to allow for controlled recovery of the control plane during self host failure conditions.
fixes#49236
xref https://github.com/kubernetes/features/issues/378
**Special notes for your reviewer**:
Proposal is here: https://docs.google.com/document/d/1hhrCa_nv0Sg4O_zJYOnelE8a5ClieyewEsQM6c7-5-o/edit?ts=5988fba8#
1. Controlled tests work, but I have not tested the self hosted api-server recovery, that requires validation and logs. /cc @luxas
2. In adding hooks for checkpoint manager much of the tests around basicpodmanager appears to be stub'd. This has become an anti-pattern in the code and should be avoided.
3. I need a node-e2e to ensure consistency of behavior.
**Release note**:
```
Add basic bootstrap checkpointing support to the kubelet for control plane recovery
```
/cc @kubernetes/sig-cluster-lifecycle-misc @kubernetes/sig-node-pr-reviews
- Instead of using cm.capacity field to communicate device plugin resource
capacity, this PR changes to use an explicit cm.GetDevicePluginResourceCapacity()
function that returns device plugin resource capacity as well as any inactive
device plugin resource. Kubelet syncNodeStatus call this function during its
periodic run to update node status capacity and allocatable. After this call,
device plugin can remove the inactive device plugin resource from its allDevices
field as the update is already pushed to API server.
- Extends device plugin checkpoint data to record registered resources
so that we can finish resource removing even upon kubelet restarts.
- Passes sourcesReady from kubelet to device plugin to avoid removing
inactive pods during grace period of kubelet restart.