Currently kubelet syncs all pods every 10s. This is not preferred because
* Some pods may have been sync'd recently.
* This may cause all the pods to be sync'd at once, causing undesirable
CPU spikes.
This PR replaces the global syncs with independent, periodic pod syncs. At the
end of syncing, each pod worker will enqueue itslef with a future timestamp (
current time + sync interval), when it will be due for another sync.
* If the pod worker encoutners an sync error, it may requeue with a different
timestamp to retry sooner.
* If a sync is triggered by the update channel (events or spec changes), the
pod worker would enqueue a new sync time.
This change is necessary for moving to long or no periodic sync period once pod
lifecycle event generator is completed. We will still rely on the mechanism to
requeue the pod on sync error.
This change also makes sure that if a sync does not succeed (either due to
real error or the per-container backoff mechanism), an error would be propagated
back to the pod worker, which is responsible for requeuing.
Relaxes the very very ancient restriction we put in place to keep the
original API surface area PR small. Better to be consistent with actual
expected use of tail.
This commit builds on previous work and creates an independent
worker for every liveness probe. Liveness probes behave largely the same
as readiness probes, so much of the code is shared by introducing a
probeType paramater to distinguish the type when it matters. The
circular dependency between the runtime and the prober is broken by
exposing a shared liveness ResultsManager, owned by the
kubelet. Finally, an Updates channel is introduced to the ResultsManager
so the kubelet can react to unhealthy containers immediately.
Change all references to the container ID in pkg/kubelet/... to the
strong type defined in pkg/kubelet/container: ContainerID
The motivation for this change is to make the format of the ID
unambiguous, specifically whether or not it includes the runtime
prefix (e.g. "docker://").
Each container with a readiness has an individual go-routine which
handles periodic probing for that container. The results are cached, and
written to the status.Manager in the pod sync path.
Correct port-forward data copying logic so that the server closes its
half of the data stream when socat exits, and the client closes its half
of the data stream when it finishes writing.
Modify the client to wait for both copies (client->server,
server->client) to finish before it unblocks.
Fix race condition in the Kubelet's handling of incoming port forward
streams. Have the client generate a connectionID header to be used to
associate the error and data streams for a single connection, instead of
assuming that streams n and n+1 go together. Attempt to generate a
pseudo connectionID in the server in the event the connectionID header
isn't present (older clients); this is a best-effort approach that only
really works with 1 connection at a time, whereas multiple concurrent
connections will only work reliably with a newer client that is
generating connectionID.
Add an experimental network plugin implementation named "cni" that
uses the Container Networking Interface (CNI) specification for
configuring networking for pods.
https://github.com/appc/cni/blob/master/SPEC.md
Increase the supported controls on pod logging. Add validaiton to pod
log options. Ensure the Kubelet is using a consistent, structured way to
process pod log arguments.
Add ?sinceSeconds=<durationInSeconds>, &sinceTime=<RFC3339>, ?timestamps=<bool>,
?tailLines=<number>, and ?limitBytes=<number>
Add a HostIPC field to the Pod Spec to create containers sharing
the same ipc of the host.
This feature must be explicitly enabled in apiserver using the
option host-ipc-sources.
Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
1. Make reason field of StatusReport objects in kubelet in CamelCase format.
2. Add Message field for ContainerStateWaiting to describe detail about Reason.
3. Make reason field of Events in kubelet in CamelCase format.
4. Update swagger,deep-copy and so on.
A lot of packages use StringSet, but they don't use anything else from
the util package. Moving StringSet into another package will shrink
their dependency trees significantly.
The sync loop should check for terminated pods that are no longer
running and clear them. The status loop should never write status
if the pod UID changes. Mirror pods should be deleted immediately
rather than gracefully.
Avoid TTL by deleting pods immediately when they aren't
scheduled, and letting the Kubelet delete them otherwise.
Ensure the Kubelet uses pod.Spec.TerminationGracePeriodSeconds
when no pod.DeletionGracePeriodSeconds is available.
Since it takes a while (1-2mins) for kubelet to pulling a big image
(>500MB). Just showing "Pending" for pod status is not very helpful.
This commit introduces a "pulling" event, and inserts it before the
kubelet starts to pull an image.
If a pod was deleted and the associated volumes/directory were removed, there
could be a window where the pod worker is still active. If the pod worker tries
to inspect the logs, such an error would be logged. Since the pod has been
deleted, such error messages are meaningless.
This change stops logging this error, but stores the error string in the pod
status. The pod status will be updated for pods that are still alive, and will
be discarded eventually for deleted pods.
This commit wires together the graceful delete option for pods
on the Kubelet. When a pod is deleted on the API server, a
grace period is calculated that is based on the
Pod.Spec.TerminationGracePeriodInSeconds, the user's provided grace
period, or a default. The grace period can only shrink once set.
The value provided by the user (or the default) is set onto metadata
as DeletionGracePeriod.
When the Kubelet sees a pod with DeletionTimestamp set, it uses the
value of ObjectMeta.GracePeriodSeconds as the grace period
sent to Docker. When updating status, if the pod has DeletionTimestamp
set and all containers are terminated, the Kubelet will update the
status one last time and then invoke Delete(pod, grace: 0) to
clean up the pod immediately.