test/e2e images have lost parity with the e2e suite image
versions; this commit brings them back in parity.
Signed-off-by: Humble Chirammal <humble.devassy@gmail.com>
The plugins get called by scheduler goroutines. At least the polling seems to
be done concurrently and thus needs locking.
Locking the PreBindPlugin state is less obvious. It might be that the scheduler
is really done with the test pod, but that ordering doesn't seem to be enough
for the race detector. It's simpler to add mutex locking.
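A hedged sketch of the pattern (type and field names are illustrative, not the actual test plugin):

    import "sync"

    // The scheduler calls PreBind from its own goroutines while the
    // test polls the counter concurrently, so all access is guarded.
    type PreBindPlugin struct {
        mutex            sync.Mutex
        numPreBindCalled int
    }

    func (pp *PreBindPlugin) preBindCalled() {
        pp.mutex.Lock()
        defer pp.mutex.Unlock()
        pp.numPreBindCalled++
    }

    // calls is what the polling goroutine in the test reads.
    func (pp *PreBindPlugin) calls() int {
        pp.mutex.Lock()
        defer pp.mutex.Unlock()
        return pp.numPreBindCalled
    }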
v1.17.0 is the latest version built at present; this commit makes
use of that latest available version/tag for the build.
Signed-off-by: Humble Chirammal <humble.devassy@gmail.com>
a) add namespacing to metrics: fixes interference between the `should scale up when one metric is missing (Pod and External metrics)` and `should not scale down when one metric is missing (Container Resource and External Metrics)` specs, a cause of flakiness.
b) replace deployments containing unused exporters (metrics ignored) with deployments without any exporters: a potential fix for often hitting a rate limit on creating metric descriptors (429 errors); also adds clarity.
c) fix metric types: some external metrics tests used a non-average type while expecting the value to be constant regardless of the number of pods. However, queries resulting from metric specs don't filter by pods, so the fetched metric value is a sum of metrics for all the pods (https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/#autoscaling-on-metrics-not-related-to-kubernetes-objects). Adding back averaging by the number of pods fixes a couple of specs where the tests were passing for the wrong reason (they wanted different test conditions).
The nightly containerd binary no longer works in the current kind base images:
May 15 16:32:31 kind-worker containerd[222]: /usr/local/bin/containerd:
/lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by
/usr/local/bin/containerd)
kind now builds containerd directly with the base images. The official base
images still use containerd 1.6, so we have to use a special base image that
was prepared for this purpose.
Because the containerd config can be patched through kind, we don't need to
modify the generated node image anymore.
The goal is to only label workloads as "performance" which actually run long
enough to provide useful metrics. The throughput collector samples once per
second, so a workload should run for at least 5, preferably 10, seconds to get
a minimal number of samples for the percentile calculation.
For benchstat analysis of runs with sufficient repetitions to get statistically
meaningful results, each workload shouldn't run more than one minute, otherwise
before/after analysis becomes too slow.
The labels were chosen based on benchmark runs on a reasonably fast desktop. To
know how long each workload takes, a new "runtime_seconds" benchmark result
gets added.
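To make the 5-10 second lower bound concrete, here is a minimal sketch of a percentile over per-second samples (the function is illustrative, not the collector's actual code):

    import (
        "math"
        "sort"
    )

    // percentile of per-second throughput samples: a 5s workload yields
    // only 5 data points, a 10s one 10.
    func percentile(samples []float64, p float64) float64 {
        sorted := append([]float64(nil), samples...)
        sort.Float64s(sorted)
        idx := int(math.Ceil(p/100*float64(len(sorted)))) - 1
        if idx < 0 {
            idx = 0
        }
        return sorted[idx]
    }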
This PR updates references to the legacy
release bucket, excluding CHANGELOG updates.
Signed-off-by: Ricky Sadowski <richard.j.sadowski@gmail.com>
When certain status conditions are not expected, we need to see
the nested objects, but %#v doesn't handle pointers well. Output
them as plain encoded JSON instead.
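A minimal sketch of the approach (helper name is illustrative):

    import (
        "encoding/json"
        "fmt"
    )

    // formatConditions renders nested objects as JSON; %#v would print
    // only the addresses of pointer fields.
    func formatConditions(obj interface{}) string {
        data, err := json.Marshal(obj)
        if err != nil {
            return fmt.Sprintf("<failed to marshal: %v>", err)
        }
        return string(data)
    }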
Add two new metrics to monitor the client-go logic that
generates http.Transports for the clients (sketched below):
- rest_client_transport_cache_entries is a gauge metric
with the number of existing entries in the internal cache
- rest_client_transport_create_calls_total is a counter
that increments each time a new transport is created, storing
the result of the operation needed to generate it: hit, miss
or uncacheable
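A hedged sketch of the metric shapes in plain client_golang terms (the actual registration goes through client-go's metrics plumbing):

    import "github.com/prometheus/client_golang/prometheus"

    var (
        cacheEntries = prometheus.NewGauge(prometheus.GaugeOpts{
            Name: "rest_client_transport_cache_entries",
            Help: "Number of transports currently cached.",
        })
        createCalls = prometheus.NewCounterVec(prometheus.CounterOpts{
            Name: "rest_client_transport_create_calls_total",
            Help: "Transport creations, partitioned by result.",
        }, []string{"result"}) // hit, miss or uncacheable
    )

    // observeCreate is illustrative: record one creation plus the
    // resulting cache size.
    func observeCreate(result string, cacheSize int) {
        createCalls.WithLabelValues(result).Inc()
        cacheEntries.Set(float64(cacheSize))
    }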
Change-Id: I2d8bde25281153d8f8e8faa249385edde3c1cb39
* update serial number to a valid non-zero number in the CA certificate
* fix the existing problem (0 SerialNumber in all certificates) as part of this PR in a separate commit
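A minimal sketch of generating a valid non-zero serial for crypto/x509-style certificate creation (function name is illustrative):

    import (
        "crypto/rand"
        "math/big"
    )

    // newSerialNumber returns a random serial in [1, 2^128]; RFC 5280
    // requires a positive, non-zero serial number.
    func newSerialNumber() (*big.Int, error) {
        limit := new(big.Int).Lsh(big.NewInt(1), 128)
        serial, err := rand.Int(rand.Reader, limit) // [0, 2^128)
        if err != nil {
            return nil, err
        }
        return serial.Add(serial, big.NewInt(1)), nil // shift to [1, 2^128]
    }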
There are some e2e tests on networking services that depend on the
healthcheck nodeport being available. However, the healthcheck
nodeport becomes available asynchronously, so the tests should wait
until it is available rather than failing hard when it is not.
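A hedged sketch of the wait (helper name and timeouts are illustrative):

    import (
        "net"
        "strconv"
        "time"

        "k8s.io/apimachinery/pkg/util/wait"
    )

    // waitForHealthCheckNodePort polls instead of failing hard on the
    // first connection attempt.
    func waitForHealthCheckNodePort(nodeIP string, port int) error {
        return wait.PollImmediate(2*time.Second, 30*time.Second, func() (bool, error) {
            addr := net.JoinHostPort(nodeIP, strconv.Itoa(port))
            conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
            if err != nil {
                return false, nil // not available yet, keep polling
            }
            conn.Close()
            return true, nil
        })
    }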
Change-Id: I595402c070c263f0e7855ee8d5662ae975dbd1d3
The existing termination period of 600 seconds for pods in the
e2e test causes those pods to keep running after the
test has finished. 100 seconds is a good compromise: it avoids
leaving pods lingering and is more than enough for the test to finish.
Change-Id: I993162a77125345df1829044dc2514e03b13a407
The default scheduler configuration must be based on the v1 API where the
plugin is enabled by default. Then if (and only if) the
DynamicResourceAllocation feature gate for a test is set, the corresponding
API group also gets enabled.
The normal dynamic resource claim controller is started if needed to create
ResourceClaims from ResourceClaimTemplates.
Without the upcoming optimizations in the scheduler, scheduling with dynamic
resources is fairly slow. The new test cases take around 15 minutes wall clock
time on my desktop.
Add new functionality to the agnhost image: a server that
closes received connections by sending a RST.
If a TCP server closes the connection before all data on the socket
has been read, it sends a RST. This implementation reads only one
byte from the connection and then closes it, which means the client
has to send at least 2 bytes of data for this to work.
Using a simple curl is enough to trigger a RST:
curl http://127.0.0.1:8080
curl: (56) Recv failure: Connection reset by peer
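For reference, a minimal Go sketch of such a server (address and error handling are illustrative). Closing the connection while a received byte is still unread is what makes the kernel answer with a RST instead of a normal FIN close:

    package main

    import (
        "log"
        "net"
    )

    func main() {
        ln, err := net.Listen("tcp", "127.0.0.1:8080")
        if err != nil {
            log.Fatal(err)
        }
        for {
            conn, err := ln.Accept()
            if err != nil {
                continue
            }
            go func(c net.Conn) {
                buf := make([]byte, 1)
                c.Read(buf) // read exactly one byte
                c.Close()   // unread data remains => RST, not FIN
            }(conn)
        }
    }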
Change-Id: I238fba0f790f2c92b37c732f51910a8b125f65db
image_list.go is one of the files included in the non-test variant Go build list, but its getSampleDevicePluginPod function references the readDaemonSetV1OrDie function defined in device_plugin_test.go, which is included only in the test variant Go build list (the file name is *_test.go).
As a result, "go build" fails with an undefined reference error.
In practice, that may not be an issue, since k8s project contributors aren't meant to run go build on this package. However, tools that depend on go build to operate - e.g., gopls or govulncheck ./... - will report this as an error.
Fix this error and make the test/e2e package pass go build by making this file test-only source code as well.
Sometimes, the kube-proxy endpoints take longer than 10 seconds to
be ready, so the initial connection check fails (via `gomega.Eventually`).
This patch uses a separate longer timeout for `gomega.Eventually`.
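A hedged sketch (the helper and the durations are illustrative):

    // inside the test body: give the initial connectivity check its
    // own, longer timeout instead of gomega's default.
    gomega.Eventually(func() error {
        return checkEndpointConnectivity() // hypothetical helper
    }).WithTimeout(3 * time.Minute).WithPolling(5 * time.Second).Should(gomega.Succeed())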
Signed-off-by: Ionut Balutoiu <ibalutoiu@cloudbasesolutions.com>
* Fix flaky HPA e2e tests by not failing on context cancelled
Consume requests are sent during test execution in a loop in a separate goroutine. Once the test completes, it is expected that a consumption request may be pending. Cancelling the request during cleanup should not cause test failures.
Tests started being flaky after #112923 introduced passing a test context that gets cancelled during cleanup.
* Use PollUntilContextTimeout and restructure error ignoring logic
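A hedged sketch of the consume loop's error handling (helper names are illustrative):

    // in the resource-consumer goroutine: a cancelled context means the
    // test is cleaning up, not that something went wrong.
    go func() {
        for {
            if err := sendConsumeRequest(ctx); err != nil {
                if errors.Is(err, context.Canceled) || ctx.Err() != nil {
                    return // cleanup in progress; not a test failure
                }
                framework.Failf("consume request failed: %v", err)
            }
            select {
            case <-ctx.Done():
                return
            case <-time.After(sleepInterval):
            }
        }
    }()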
Additional test cases added:
Keeps device plugin assignments across pod and kubelet restarts (no device plugin re-registration)
Keeps device plugin assignments after the device plugin has re-registered (no kubelet or pod restart)
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Add a test suite to simulate a node reboot (achieved by removing pods
using the CRI API before the kubelet is restarted).
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
This test captures the scenario where, after a kubelet restart,
an application pod comes up while the device plugin pod hasn't
re-registered itself, and the pod fails with an admission error.
It is worth noting that once the device plugin pod has registered
itself, another application pod requesting devices ends up running
successfully.
For the test case where the kubelet is restarted and the device
plugin has re-registered without a pod restart, since the pod
ends up with an admission error after the kubelet restart,
we cannot be certain which device the second pod (pod2) would
get. As long as it gets a device, we consider the test to pass.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
A breakdown of the steps implemented as part of this e2e test is as follows:
1. Create a file `registration` at path `/var/lib/kubelet/device-plugins/sample/`.
2. Create the sample device plugin with an environment variable
`REGISTER_CONTROL_FILE=/var/lib/kubelet/device-plugins/sample/registration` so that it
waits for a client to delete the control file (sketched after this list).
3. Trigger plugin registration by deleting the above-mentioned control file.
4. Create a test pod requesting devices exposed by the device plugin.
5. Stop kubelet.
6. Remove pods using CRI to ensure new pods are created after kubelet restart.
7. Restart kubelet.
8. Wait for the sample device plugin pod to be running. In this case,
the registration is not triggered.
9. Ensure that resource capacity/allocatable exported by the device plugin is zero.
10. The test pod should fail with `UnexpectedAdmissionError`.
11. Delete the test pod.
12. Delete the sample device plugin pod.
13. Remove `/var/lib/kubelet/device-plugins/sample/` and its content, the directory
created to control registration.
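A minimal sketch of the control-file gate from step 2 (the real sample plugin may differ, e.g. by using file watches instead of polling):

    import (
        "os"
        "time"
    )

    // waitForControlFileRemoval blocks registration until a client
    // deletes the file named by REGISTER_CONTROL_FILE.
    func waitForControlFileRemoval() {
        path := os.Getenv("REGISTER_CONTROL_FILE")
        if path == "" {
            return // no gating requested
        }
        for {
            if _, err := os.Stat(path); os.IsNotExist(err) {
                return // file deleted: proceed with registration
            }
            time.Sleep(time.Second)
        }
    }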
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
This commit reuses e2e tests implemented as part of https://github.com/kubernetes/kubernetes/pull/110729.
The commit is borrowed from the aforementioned PR as-is to preserve
authorship. A subsequent commit will update the end-to-end test to
simulate the problem this PR is trying to solve by reproducing
issue 109595.
Co-authored-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
This will change when adding dynamic resource allocation test cases. Instead of
changing mustSetupScheduler and StartScheduler for that, let's return the
informer factory and create informers as needed in the test.
Entire test cases and workloads can have labels attached to them. The union of
these must match the label filter which works as in GitHub. The benchmark by
default runs the tests that are labeled "performance", which is the same as
before.
Explicitly capture a test case pertaining to kubelet restart
with device plugin re-registration but no pod restart.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Based on whether the test case requires a pod restart or not, the sleep
interval needs to differ, so we define constants to represent the two
sleep intervals used in the corresponding test cases.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Co-authored-by: Francesco Romani <fromani@redhat.com>
Explicitly state that the test involves kubelet restart and device plugin
re-registration (no pod restart)
We remove the part of the code where we wait for the pod to restart as this
test case should no longer involve pod restart.
In addition to that, we use `waitForNodeReady` instead of `WaitForAllNodesSchedulable`
to ensure that the node is ready for pods to be scheduled on it.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Co-authored-by: Francesco Romani <fromani@redhat.com>
Rather than testing both the pod restart and kubelet restart scenarios
at once, we change the tests to handle just the pod restart scenario.
Clarify the test purpose and add an extra check to tighten the test.
We will add tests covering kubelet restart scenarios
in subsequent commits.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
With this change, the error messages are more helpful and easier
to troubleshoot in case of test failures.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
We rename it to make the intent more explicit, and we make it global
so the value can be reused across the module
(e.g. to check node readiness) later on.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Co-authored-by: Francesco Romani <fromani@redhat.com>
Rather than only returning a string, which forced us to log failures
with `framework.Fail`, we return a string and an error so error cases
can be handled more conventionally. This enables us to use the `parseLog`
function inside `Eventually` and `Consistently` blocks, and in general
to delegate error processing and enable better composability.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Co-authored-by: Francesco Romani <fromani@redhat.com>
The previous solution had some shortcomings:
- It was based on the assumption that the goroutine gets woken up at regular
intervals. This is not actually guaranteed. Now the code keeps track of the
actual start and end of an interval and verifies that assumption.
- If no pod was scheduled (unlikely, but could happen), then
"0 pods/s" got recorded. In such a case, the metric was always either
zero or >= 1. A better solution is to extend the interval
until some pod gets scheduled. With the larger time interval
it is then possible to also track, for example, 0.5 pods/s.
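A hedged sketch of the corrected loop (helper names are illustrative):

    // inside collector.run: compute the rate from the measured interval
    // and extend it until at least one pod was scheduled, so rates
    // below 1 pod/s are representable.
    lastCount, lastTime := scheduledPods(), time.Now()
    for {
        time.Sleep(time.Second) // wakeup time is not guaranteed
        count := scheduledPods()
        if count == lastCount {
            continue // nothing happened; extend the interval
        }
        now := time.Now()
        recordSample(float64(count-lastCount) / now.Sub(lastTime).Seconds())
        lastCount, lastTime = count, now
    }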
This allows us to return with a timeout error as soon as the
context is canceled. Previously, in cases where the mount would
never succeed, pods could get stuck deleting for 2 minutes.
In the Sync*Pod methods that call VolumeManager.WaitFor*, we
must filter out wait.Interrupted errors from being logged as
they are part of control flow, not runtime problems. Any
early interruption should result in exiting the Sync*Pod method
as quickly as possible without logging intermediate errors.
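A hedged sketch of the filtering (method names approximate the kubelet code):

    // in a Sync*Pod method: an interrupted wait is control flow, not a
    // runtime problem, so exit promptly without logging it as an error.
    if err := kl.volumeManager.WaitForAttachAndMount(ctx, pod); err != nil {
        if !wait.Interrupted(err) {
            klog.ErrorS(err, "Unable to attach or mount volumes for pod", "pod", klog.KObj(pod))
        }
        return err
    }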
Copied and modified the RemoveString function from
k/k/pkg/util/slice/slice.go to e2e/framework/pod/pod_client.go.
This was the last dependency from the e2e framework on k/k/pkg/util.
Copied and modified the pod format function from
k/k/pkg/kubelet/util/format/pod.go to e2e/framework/pod/pod_client.go.
This was the last dependency from the e2e framework on k/k/pkg/kubelet.
It's conceptually wrong for the e2e framework code to depend on
k/k/pkg. Such dependencies should be moved to the corresponding
packages, in this particular case to test/e2e_node.
* Convert the file but warn the user about impossible conversions
* Only continue for NotRegisteredErrors. Use iostreams for warning the user instead of stderr
* Formatting; correct tests to use valid DNS-1035 names.
By generating the unique name in advance, the label also can be set to a
matching value directly in the Create request. This makes test startup in
test/integration/scheduler_perf a bit faster because the extra patching can be
avoided.
It also leads to a better label because previously, the unique label value
didn't match the node name. This is required for simulating dynamic resource
allocation, which relies on the label to track where an allocated claim is
available.
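A hedged sketch of the idea (the label key, name scheme, and helpers are illustrative; utilrand is k8s.io/apimachinery/pkg/util/rand):

    // compute the unique name client-side so the matching label can be
    // included in the Create call itself; no follow-up patch needed.
    name := fmt.Sprintf("perf-node-%s", utilrand.String(5))
    node := &v1.Node{
        ObjectMeta: metav1.ObjectMeta{
            Name:   name,
            Labels: map[string]string{"node-instance": name}, // matches the node name
        },
    }
    if _, err := client.CoreV1().Nodes().Create(ctx, node, metav1.CreateOptions{}); err != nil {
        return err
    }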
Add a regression test for https://issues.k8s.io/116925. The test
exercises the following:
1) Start a restart never pod which will exit with
`v1.PodSucceeded` phase.
2) Start a graceful deletion of the pod (set a deletion timestamp)
3) Restart the kubelet as soon as the kubelet reports the pod is
terminal (but before the pod is deleted).
4) Verify that after kubelet restart, the pod is deleted.
As of v1.27, there is a delay between the pod being marked with a
terminal phase and the status manager deleting the pod. If the kubelet is
restarted in the middle, then after starting up again, the kubelet needs to
ensure the pod will be deleted on the API server.
Signed-off-by: David Porter <david@porter.me>
It makes sense to define a test where, depending on the parameters, some
operation creates zero pods, namespaces, or nodes. The validation previously
didn't allow that because of how it was implemented, although the underlying
code works fine with zero as the count.
collector.collect got called without ensuring that collector.run had
terminated, so collector.run could add another sample
while collector.collect was reading them.
test-e2e-node for AWS is out-of-tree so that we won't need to vendor
in AWS-related packages. For this to work, some of the scripts/golang
code needs to know where the k8s tree has been git cloned.
So let's add an option to look up an env var, so that we can then
change to the specified directory to run some make commands.
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
grpclog.SetLoggerV2 is not thread-safe and may only be called before code starts
using gRPC. Calling RunCustomEtcd multiple times, for example in
k8s.io/kubernetes/test/integration/apiserver.TestWatchCacheUpdatedByEtcd,
causes a data race:
WARNING: DATA RACE
Read at 0x00000c8e8d20 by goroutine 135612:
k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog.V()
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog/grpclog.go:41 +0x30
k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog.(*componentData).V()
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog/component.go:103 +0x4e
k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport.(*loopyWriter).run.func1()
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport/controlbuf.go:528 +0xf1
runtime.deferreturn()
/home/prow/go/src/k8s.io/kubernetes/_output/local/.gimme/versions/go1.20.2.linux.amd64/src/runtime/panic.go:476 +0x32
k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport.newHTTP2Client.func6()
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport/http2_client.go:442 +0x112
Previous write at 0x00000c8e8d20 by goroutine 140228:
k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog.SetLoggerV2()
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog/loggerv2.go:76 +0xc6a
k8s.io/kubernetes/test/integration/framework.RunCustomEtcd()
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/integration/framework/etcd.go:153 +0xb89
k8s.io/kubernetes/test/integration/apiserver.multiEtcdSetup()
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/integration/apiserver/watchcache_test.go:40 +0xac
k8s.io/kubernetes/test/integration/apiserver.TestWatchCacheUpdatedByEtcd()
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/integration/apiserver/watchcache_test.go:88 +0x4a
testing.tRunner()
/home/prow/go/src/k8s.io/kubernetes/_output/local/.gimme/versions/go1.20.2.linux.amd64/src/testing/testing.go:1576 +0x216
testing.(*T).Run.func1()
/home/prow/go/src/k8s.io/kubernetes/_output/local/.gimme/versions/go1.20.2.linux.amd64/src/testing/testing.go:1629 +0x47
Users can pass resources to the `kubectl events` command via the `--for` flag
if they want to get events only for the resource they specify.
However, `kubectl events` currently does not support passing fully qualified
names (e.g. `replicasets.apps`, `cronjobs.v1.batch`, etc.). This PR adds support
for this.
The newly added `MirrorPodWithGracePeriod when create a mirror pod and
the container runtime is temporarily down during pod termination` test
is currently flaking because in some cases when it runs there are
other pods from other tests that are still in the process of being
terminated. This results in the test failing because it asserts metrics
that assume there is only one pod running on the node.
To fix the flake, prior to starting the test, verify that no pods
other than the newly created mirror pod exist on the API server.
Signed-off-by: David Porter <david@porter.me>
This check needs to go after any mutations. After the mutating admission chain, rest.BeforeUpdate (which is responsible for reverting updates to immutable timestamp fields, among other things) is called in the store.Update function. Without moving this check, it would be possible for an object to be written to etcd with only a change to its managed fields timestamp.
When adding a DisruptionTarget condition to a pod that will be deleted:
- handle ResourceVersion and Preconditions correctly
- handle the DryRun option correctly
Co-authored-by: Jordan Liggitt <jordan@liggitt.net>
* Slightly changed pod spec to repro issue #116262
* Refactor test to ensure that the static pod is deleted even if the
test fails
Signed-off-by: David Porter <david@porter.me>
We now get structured output using jsonpath for the
name and version fields of the node object and then
compare the outputs.
Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
* api changes adding match conditions
* feature gate and registry strategy to drop fields
* matchConditions logic for admission webhooks
* feedback
* update test
* import order
* bears.com
* update fail policy ignore behavior
* update docs and matcher to hold fail policy as non-pointer
* update matcher error aggregation, fix early fail failpolicy ignore, update docs
* final cleanup
* openapi gen
This PR makes the NodePrepareResources() and NodeUnprepareResource()
calls of the kubeletplugin API for DynamicResourceAllocation
symmetrical. It wasn't clear how one would use the set of CDIDevices
passed back in the NodeUnprepareResource() of the v1alpha1 API, and the
new API now passes back the full ResourceHandle that was originally
passed to the Prepare() call. Passing the ResourceHandle is strictly
more informative, and a plugin can always (re)derive the set of
CDIDevices from it.
This is a breaking change, but this release is scheduled to break
multiple APIs for DynamicResourceAllocation, so it makes sense to do
this now instead of later.
Signed-off-by: Kevin Klues <kklues@nvidia.com>
They contain some nice-to-have improvements (for example, better printing of
errors with gomega/format.Object) but nothing that is critical right now.
"go mod tidy" was run manually in
staging/src/k8s.io/kms/internal/plugins/mock (https://github.com/kubernetes/kubernetes/pull/116613
not merged yet).
This change updates KMS v2 to not create a new DEK for every
encryption. Instead, we re-use the DEK while the key ID is stable.
Specifically:
We no longer use a random 12 byte nonce per encryption. Instead, we
use both a random 4 byte nonce and an 8 byte nonce set via an atomic
counter. Since each DEK is randomly generated and never re-used,
the combination of DEK and counter is always unique. Thus there
can never be a nonce collision. AES GCM strongly encourages the use
of a 12 byte nonce, hence the additional 4 byte random nonce. We
could leave those 4 bytes set to all zeros, but there is no harm in
setting them to random data (it may help in some edge cases such as
live VM migration).
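A minimal sketch of the nonce construction (names are illustrative):

    import (
        "crypto/rand"
        "encoding/binary"
        "sync/atomic"
    )

    var nonceCounter atomic.Uint64

    // newNonce builds the 12-byte AES-GCM nonce from 4 random bytes and
    // an 8-byte big-endian atomic counter. Since the DEK is random and
    // never re-used, each (DEK, counter) pair is unique.
    func newNonce() ([]byte, error) {
        nonce := make([]byte, 12)
        if _, err := rand.Read(nonce[:4]); err != nil {
            return nil, err
        }
        binary.BigEndian.PutUint64(nonce[4:], nonceCounter.Add(1))
        return nonce, nil
    }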
If the plugin is not healthy, the last DEK will be used for
encryption for up to three minutes (there is no difference on the
behavior of reads which have always used the DEK cache). This will
reduce the impact of a short plugin outage while making it easy to
perform storage migration after a key ID change (i.e. simply wait
ten minutes after the key ID change before starting the migration).
The DEK rotation cycle is performed in sync with the KMS v2 status
poll thus we always have the correct information to determine if a
read is stale in regards to storage migration.
Signed-off-by: Monis Khan <mok@microsoft.com>
The name "PodScheduling" was unusual because in contrast to most other names,
it was impossible to put an article in front of it. Now PodSchedulingContext is
used instead.
Annotation
As part of this change, kube-proxy accepts any value for either
annotation that is not "disabled".
Change-Id: Idfc26eb4cc97ff062649dc52ed29823a64fc59a4
The kube-apiserver validation expects the Count of an EventSeries to be
at least 2, otherwise it rejects the Event. There was a discrepancy
between the client and the server since the client was initializing an
EventSeries with a count of 1.
According to the original KEP, the first event emitted should have an
EventSeries set to nil and the second isomorphic event should have an
EventSeries with a count of 2. Thus, we should match the behavior
defined by the KEP and update the client.
Also, in an effort to keep old clients compatible with the servers,
we should allow Events with an EventSeries count of 1 to prevent any
unexpected rejections.
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
Implement DoS prevention by wiring a global rate limit for the podresources
API. The goal here is not to introduce a general rate-limiting solution
for the kubelet (we need more research and discussion to get there),
but rather to prevent misuse of the API.
Known limitations (see the sketch after this list):
- the rate limit values (QPS, burst tokens) are hardcoded to
"high enough" values.
Enabling user configuration would require more discussion
and sweeping changes to the other kubelet endpoints, so it
is postponed for now.
- the rate limiting is global. Malicious clients can starve other
clients by consuming the QPS quota.
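A hedged sketch of the wiring (the QPS and burst values here are illustrative, not the hardcoded ones):

    import (
        "context"

        "golang.org/x/time/rate"
        "google.golang.org/grpc"
        "google.golang.org/grpc/codes"
        "google.golang.org/grpc/status"
    )

    // one limiter shared by all clients of the podresources API.
    var limiter = rate.NewLimiter(rate.Limit(100), 10)

    func rateLimitInterceptor(ctx context.Context, req interface{}, info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (interface{}, error) {
        if !limiter.Allow() {
            return nil, status.Error(codes.ResourceExhausted, "rate limit exceeded")
        }
        return handler(ctx, req)
    }

    // server := grpc.NewServer(grpc.UnaryInterceptor(rateLimitInterceptor))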
Add an e2e test to exercise the flow, because the wiring itself
is mostly boilerplate and API adaptation.
We have quite a few podresources e2e tests and, as the feature
progresses to GA, we should consider moving them to NodeConformance.
Unfortunately, most of them require Linux-specific features, not in the
tests themselves but in the test prelude (fixture), to check or create the
node conditions (e.g. presence or absence of devices, online CPUs...) to be
verified in the test proper.
For this reason we promote only a single test for starters.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Since we can't rely on the test runner and hosts under test to
be on the same machine, we write to the terminate log from each
container and concatenate the results.
If a CRI error occurs during the terminating phase after a pod is
force deleted (API or static) then the housekeeping loop will not
deliver updates to the pod worker which prevents the pod's state
machine from progressing. The pod will remain in the terminating
phase but no further attempts to terminate or cleanup will occur
until the kubelet is restarted.
The pod worker now maintains a store of the pods state that it is
attempting to reconcile and uses that to resync unknown pods when
SyncKnownPods() is invoked, so that failures in sync methods for
unknown pods no longer hang forever.
The pod worker's store tracks desired updates and the last update
applied on podSyncStatuses. Each goroutine now synchronizes to
acquire the next work item, context, and whether the pod can start.
This synchronization moves the pending update to the stored last
update, which will ensure third parties accessing pod worker state
don't see updates before the pod worker begins synchronizing them.
As a consequence, the update channel becomes a simple notifier
(struct{}) so that SyncKnownPods can coordinate with the pod worker
to create a synthetic pending update for unknown pods (i.e. no one
besides the pod worker has data about those pods). Otherwise the
pending update info would be hidden inside the channel.
In order to properly track pending updates, we have to be very
careful not to mix RunningPods (which are calculated from the
container runtime and are missing all spec info) and config-
sourced pods. Update the pod worker to avoid using ToAPIPod()
and instead require the pod worker to directly use
update.Options.Pod or update.Options.RunningPod for the
correct methods. Add a new SyncTerminatingRuntimePod to prevent
accidental invocations of runtime only pod data.
Finally, fix SyncKnownPods to replay the last valid update for
undesired pods which drives the pod state machine towards
termination, and alter HandlePodCleanups to:
- terminate runtime pods that aren't known to the pod worker
- launch admitted pods that aren't known to the pod worker
Any started pods receive a replay until they reach the finished
state, and then are removed from the pod worker. When a desired
pod is detected as not being in the worker, the usual cause is
that the pod was deleted and recreated with the same UID (almost
always a static pod since API UID reuse is statistically
unlikely). This simplifies the previous restartable pod support.
We are careful to filter for active pods (those not already
terminal or those which have been previously rejected by
admission). We also force a refresh of the runtime cache to
ensure we don't see an older version of the state.
Future changes will allow other components that need to view the
pod worker's actual state (not the desired state the podManager
represents) to retrieve that info from the pod worker.
Several bugs in pod lifecycle have been undetectable at runtime
because the kubelet does not clearly describe the number of pods
in use. To better report, add the following metrics:
kubelet_desired_pods: Pods the pod manager sees
kubelet_active_pods: "Admitted" pods that gate new pods
kubelet_mirror_pods: Mirror pods the kubelet is tracking
kubelet_working_pods: Breakdown of pods from the last sync in
each phase, orphaned state, and static or not
kubelet_restarted_pods_total: A counter for pods that saw a
CREATE before the previous pod with the same UID was finished
kubelet_orphaned_runtime_pods_total: A counter for pods detected
at runtime that were not known to the kubelet. Will be
populated at Kubelet startup and should never be incremented
after.
Add a metric check to our e2e tests that verifies the values are
captured correctly during a serial test, and then verify them in
detail in unit tests.
Adds 23 series to the kubelet /metrics endpoint.
Add some additional init container tests that work via monitoring
container lifetime based on logs written to a common file. This allows
more easily writing assertions about the container lifetimes with
respect to one another.
6f2cd1b5bd swapped the order of cancel() and
closeFn() so that closeFn got called first when the test was done. This caused
it to block while waiting for goroutines which themselves were waiting for
the context cancellation. The test still shut down, it just took ~86s instead
of ~30s.
The fix is to register the cancel twice: once as soon as the context is
created (to clean up in case of an unexpected panic) and once after
closeFn (because then it'll get called first, as before).
The test creates a Service exposing two protocols on the same port
(sketched below) and a backend that replies on both protocols.
1. Test that the Service works for both protocols
2. Update the Service to expose only the TCP port
3. Verify that TCP works and UDP does not work
4. Update the Service to expose only the UDP port
5. Verify that TCP does not work and UDP does work
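A hedged sketch of such a Service (names are illustrative):

    import (
        v1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // inside the test: one Service, same port, two protocols.
    svc := &v1.Service{
        ObjectMeta: metav1.ObjectMeta{Name: "multiprotocol"},
        Spec: v1.ServiceSpec{
            Selector: map[string]string{"app": "backend"},
            Ports: []v1.ServicePort{
                {Name: "tcp", Protocol: v1.ProtocolTCP, Port: 80},
                {Name: "udp", Protocol: v1.ProtocolUDP, Port: 80},
            },
        },
    }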
Change-Id: Ic4f3a6509e332aa5694d20dfc3b223d7063a7871
Test 2 scenarios:
- a pod can connect to a terminating pod
- a terminating pod can connect to other pods
Change-Id: Ia5dc4e7370cc055df452bf7cbaddd9901b4d229d