Deprecated metrics are removed; use instead the Histogram metrics
obtained from the scheduler extension points.
Signed-off-by: Dave Chen <dave.chen@arm.com>
Co-authored-by: wawa0210 <xiaozhang0210@hotmail.com>
The PodTopologySpread plugin only counts an existing pod when that
pod's labels match `constraint.Selector`, which means all pods could
be scheduled to a single topology zone when the constraint does not
define any selector.
Signed-off-by: Dave Chen <dave.chen@arm.com>
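A minimal sketch of the matching behaviour described above, using the label selector helpers from k8s.io/apimachinery (the selector and pod labels are made up for illustration):

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/labels"
)

func main() {
	// constraint.Selector from a topology spread constraint
	// (a hypothetical selector, for illustration only).
	selector := labels.SelectorFromSet(labels.Set{"app": "web"})

	// Labels of an existing pod in a candidate topology domain.
	podLabels := labels.Set{"app": "web", "tier": "frontend"}

	// Only pods whose labels match the constraint's selector are
	// counted when computing the spread skew for a zone.
	if selector.Matches(podLabels) {
		fmt.Println("pod counted toward the skew for its zone")
	} else {
		fmt.Println("pod not counted")
	}
}
```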
- as soon as a request is received by the apiserver, determine the
timeout of the request and set a new request context with that deadline.
- the timeout filter that times out non-long-running requests should
use the request context instead of the fixed 60s wait used today.
- the admission and storage layers use the same request context with
the specified deadline (see the sketch below).
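A rough standard-library sketch of the idea: determine the timeout once, attach a context with that deadline to the request, and let downstream code honour it (the `timeout` query parameter and the 60s default are illustrative, not the actual apiserver wiring):

```go
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"time"
)

// withRequestDeadline determines the request timeout up front and stores
// a context with that deadline on the request, so that later filters,
// admission, and the storage layer all observe the same deadline.
func withRequestDeadline(next http.Handler, defaultTimeout time.Duration) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		timeout := defaultTimeout
		if t := r.URL.Query().Get("timeout"); t != "" {
			if d, err := time.ParseDuration(t); err == nil {
				timeout = d
			}
		}
		ctx, cancel := context.WithTimeout(r.Context(), timeout)
		defer cancel()
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

func main() {
	handler := withRequestDeadline(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Downstream code uses the request context rather than a fixed wait.
		select {
		case <-time.After(2 * time.Second):
			fmt.Fprintln(w, "done")
		case <-r.Context().Done():
			http.Error(w, "request deadline exceeded", http.StatusGatewayTimeout)
		}
	}), 60*time.Second)

	log.Fatal(http.ListenAndServe(":8080", handler))
}
```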
Instead of hardcoding fedora:latest, use one of our e2e images as
source inside the created pods. This will allow users who test with
this data outside of integration environments to reference a real
image and avoid spurious errors.
To make sure that the storage version filter can block certain requests until
the storage version updates are completed, and that the apiserver works
properly after the storage version updates are done.
* Rename const for topology.../zone
* Rename const for topology.../region
* Rename const for failure-domain.../zone
* Rename const for failure-domain.../region
* Restore old names for compat
add integration tests to verify the behaviour of the endpoints
and endpointslices controller with dual stack services.
Since services can be single or dual stack, endpoints should be
generated for each IP family by the endpoint slice controller.
The legacy endpoint controller will only generate endpoints
for the first IP family configured in the service.
integration fix
* api: structure change
* api: defaulting, conversion, and validation
* [FIX] validation: auto remove second ip/family when service changes to SingleStack
* [FIX] api: defaulting, conversion, and validation
* api-server: clusterIPs alloc, printers, storage and strategy
* [FIX] clusterIPs default on read
* alloc: auto remove second ip/family when service changes to SingleStack
* api-server: repair loop handling for clusterIPs
* api-server: force kubernetes default service into single stack
* api-server: tie dualstack feature flag with endpoint feature flag
* controller-manager: feature flag, endpoint, and endpointSlice controllers handling multi family service
* [FIX] controller-manager: feature flag, endpoint, and endpointSlice controllers handling multi family service
* kube-proxy: feature-flag, utils, proxier, and meta proxier
* [FIX] kubeproxy: call both proxiers at the same time
* kubenet: remove forced pod IP sorting
* kubectl: modify describe to include ClusterIPs, IPFamilies, and IPFamilyPolicy
* e2e: fix tests that depend on the IPFamily field AND add dual stack tests
* e2e: fix expected error message for ClusterIP immutability
* add integration tests for dualstack
The third phase of dual stack is a very complex change in the API;
basically, it introduces dual-stack Services. The main changes are:
- It pluralizes the Service IPFamily field to IPFamilies,
and removes the singular field.
- It introduces a new field IPFamilyPolicyType that can take
3 values to express the "dual-stack(mad)ness" of the cluster:
SingleStack, PreferDualStack and RequireDualStack
- It pluralizes ClusterIP to ClusterIPs.
The goal is to add coverage of the Services API operations,
taking into account the 6 different modes a cluster can have:
- single stack: IPv4 or IPv6 (as of today)
- dual stack: IPv4 only, IPv6 only, IPv4 - IPv6, IPv6 - IPv4
* [FIX] add integration tests for dualstack
* generated data
* generated files
Co-authored-by: Antonio Ojea <aojea@redhat.com>
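For illustration, a minimal Go sketch of the pluralized fields, assuming the field and constant names as they ended up in k8s.io/api/core/v1 (the concrete values are made up; ClusterIPs is normally allocated by the apiserver and is set here only to show the field shape):

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// SingleStack, PreferDualStack or RequireDualStack.
	policy := v1.IPFamilyPolicyPreferDualStack

	svc := v1.Service{
		ObjectMeta: metav1.ObjectMeta{Name: "demo"},
		Spec: v1.ServiceSpec{
			// The singular IPFamily field is gone; families are now a list.
			IPFamilies:     []v1.IPFamily{v1.IPv4Protocol, v1.IPv6Protocol},
			IPFamilyPolicy: &policy,
			// ClusterIPs holds one address per requested family; normally
			// the apiserver allocates these.
			ClusterIPs: []string{"10.0.0.10", "fd00::10"},
			Ports:      []v1.ServicePort{{Port: 80}},
		},
	}
	fmt.Printf("%s: families=%v policy=%v ips=%v\n",
		svc.Name, svc.Spec.IPFamilies, *svc.Spec.IPFamilyPolicy, svc.Spec.ClusterIPs)
}
```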
Simulating a cluster with 500 nodes in 3 zones, deploying 3, 12 and 27 Pods belonging to the same service.
Change-Id: I16425594012ea7bd24b888acedb12958360bff97
The integration test for pods produces a warning caused by using deprecated
default cluster IPs.
$ make test-integration WHAT=./test/integration/pods GOFLAGS="-v"
W1007 17:25:28.217410 100721 services.go:37] No CIDR for service cluster IPs specified. Default value which was 10.0.0.0/24 is deprecated and will be removed in future releases. Please specify it using --service-cluster-ip-range on kube-apiserver.
This warning appears 36 times after running all tests. This patch removes all
the warnings.
Signed-off-by: Masashi Honma <masashi.honma@gmail.com>
PATCH verb is used when creating a namespace using server-side apply,
while POST verb is used when creating a namespace using client-side
apply.
The difference in code path between the two ways to create a namespace led
to an inconsistency when calling webhooks. When server-side apply is used,
the request sent to webhooks has the field "namespace" populated with
the name of namespace being created. On the other hand, when using
client-side apply the "namespace" field is omitted.
This commit aims to make the behaviour consistent and populates the
"namespace" field when creating a namespace using the POST verb (i.e.
client-side apply).
The provided DialContext wraps existing clients' DialContext in an attempt to
preserve any existing timeout configuration. In some cases, we may replace
infinite timeouts with golang defaults.
- scaleio: tcp connect/keepalive values changed from 0/15 to 30/30
- storageos: no change
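A rough sketch of the wrapping idea, substituting Go defaults only when the configured timeouts are infinite (zero); the helper name is illustrative and not the actual volume plugin code:

```go
package main

import (
	"context"
	"net"
	"time"
)

type dialFunc func(ctx context.Context, network, addr string) (net.Conn, error)

// wrapDialContext preserves the client's existing dial behaviour but
// substitutes sane defaults when the configured timeouts are infinite.
func wrapDialContext(existing *net.Dialer) dialFunc {
	d := *existing // copy so the caller's dialer is untouched
	if d.Timeout == 0 {
		d.Timeout = 30 * time.Second // infinite connect timeout -> default
	}
	if d.KeepAlive == 0 {
		d.KeepAlive = 30 * time.Second // no keepalive configured -> default
	}
	return d.DialContext
}

func main() {
	dial := wrapDialContext(&net.Dialer{}) // client previously used zero values
	if conn, err := dial(context.Background(), "tcp", "example.com:443"); err == nil {
		conn.Close()
	}
}
```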
When updating ephemeral containers, convert Pod to EphemeralContainers
in storage validation. This resolves a bug where admission webhook
validation fails for ephemeral container updates because the webhook
client cannot perform the conversion.
Also enable the EphemeralContainers feature gate for the admission
control integration test, which would have caught this bug.
The endpoints API handler was using the Canonicalize() method to
reorder the endpoints; however, due to differences with the
endpoint controller's RepackSubsets(), the controller considered
the endpoints different even though they were not, generating
unnecessary updates every resync period.
This test was trying to create an Endpoints resource that the Endpoints
controller would also attempt to create. This could result in a failure
if the Endpoints controller created the resource before the test did.
- Test that client-side apply users don't encounter a conflict with
server-side apply for objects that previously didn't track managedFields
- Test that we stop tracking managed fields with `managedFields: []`
- Test that we stop tracking managed fields when the feature is disabled
- Revert "Fix integration test flake on TestFilter and TestPostFilter"
This reverts commit 94fc18c2dc.
- Relax checking logic on expected Filter/PostFilter counters.
- Move "ForgetPod" after "RunReservePluginsUnreserve", so that the cache would hold the pod to
avoid it's being retried simutaneously until Unreserve is completed.
- Move "assume" ahead of "RunReservePluginsReserve". This is based on the fact that "ForgetPod" is
the last step of failure path, so "assume" should be reversly treated as the first step. The
current failure path is like this:
assume -> reserve -> unreserve -> forgetPod -> recordingFailure
- Make subtests of TestReservePluginUnreserve stateless
The IPAM and scheduler performance tests are currently causing
integration-master job to fail because of timeouts. They were not
previously running as part of integration-master, so we can disable them
without loss of test coverage. They should be re-enabled as part of fix
for #93112.
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
- Allow client-side to server-side apply upgrade.
Ensure that a user can change management of an object from client-side apply to
server-side apply without conflicts.
- Allow server-side apply to client-side downgrade.
For an object managed with client-side apply, a user may upgrade to
managing the object with server-side apply, then decide to downgrade.
We can support this downgrade by keeping the last-applied-configuration
annotation for client-side apply updated with server-side apply.
Using NodeWrapper in the integration tests gives more flexibility when
creating nodes. For instance, tests can create nodes with labels or
with a specific set of resources.
Also, NodeWrapper initialises a node with a capacity of 32 pods, which
can be overridden by the caller. This makes sure that a node is usable
as soon as it is created.
If a reserve plugin's Reserve method returns an error, there could be
previously allocated resources from successfully completed reserve
plugins that must be unallocated by the corresponding Unreserve
operation. Since Unreserve operations are idempotent, this patch runs
the Unreserve operation of ALL reserve plugins when a Reserve operation
fails.
Previously, separate interfaces were defined for Reserve and Unreserve
plugins. However, in nearly all cases, a plugin that allocates a
resource using Reserve will likely want to register itself for Unreserve
as well in order to free the allocated resource at the end of a failed
scheduling/binding cycle. Having separate plugins for Reserve and
Unreserve also adds unnecessary config toil. To that end, this patch
aims to merge the two plugins into a single interface called a
ReservePlugin that requires implementing both the Reserve and Unreserve
methods.
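A condensed, illustrative sketch of the merged interface and the failure path described in the two messages above (names such as runReservePlugins and the plugin types are stand-ins, not the real framework code):

```go
package main

import (
	"context"
	"fmt"
)

// ReservePlugin is the merged interface: any plugin that reserves a
// resource must also implement Unreserve to free it on failure.
type ReservePlugin interface {
	Name() string
	Reserve(ctx context.Context, pod, node string) error
	Unreserve(ctx context.Context, pod, node string)
}

// runReservePlugins runs Reserve on every plugin; if any of them fails,
// Unreserve is run on ALL plugins (Unreserve is expected to be idempotent).
func runReservePlugins(ctx context.Context, plugins []ReservePlugin, pod, node string) error {
	for _, p := range plugins {
		if err := p.Reserve(ctx, pod, node); err != nil {
			for _, q := range plugins {
				q.Unreserve(ctx, pod, node)
			}
			return fmt.Errorf("reserve plugin %q failed: %w", p.Name(), err)
		}
	}
	return nil
}

// noopPlugin is a trivial plugin used only to exercise the helper.
type noopPlugin struct{ name string }

func (p noopPlugin) Name() string                                        { return p.name }
func (p noopPlugin) Reserve(ctx context.Context, pod, node string) error { return nil }
func (p noopPlugin) Unreserve(ctx context.Context, pod, node string)     {}

func main() {
	plugins := []ReservePlugin{noopPlugin{"volumebinding"}, noopPlugin{"example"}}
	if err := runReservePlugins(context.Background(), plugins, "pod-a", "node-1"); err != nil {
		fmt.Println(err)
	}
}
```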
ingress: use new serviceBackend split
ingress: remove all v1beta1 restrictions on creation
This change removes the creation and update restrictions enforced by
k8s 1.18 that disallowed resource backends.
Paths are no longer required to be valid regexes, and a PathType is now
user-specified and no longer defaulted.
Also remove all TODOs in staging/net/v1 types
Signed-off-by: Christopher M. Luciano <cmluciano@us.ibm.com>
- use the in-cache Pod instead of the real-time Pod (fetched by calling the API server) to mark it as unschedulable
in the internal schedulingQ
- remove the backoff logic as we no longer call the API server
- the whole logic is changed to a synchronous call
We have little coverage around node addition and removal. Since distinct event handlers interact, it is important to cover this in integration tests.
Signed-off-by: Aldo Culquicondor <acondor@google.com>
In case two or more controllers share the informers created through InitTestScheduler,
it's not safe to start the informers until all controllers set their informer
indexers. Otherwise, some controllers might fail to register their indexers
in time. Thus, it's the responsibility of each consumer to make sure all informers
are started only after all controllers have had time to get initialized.
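A small client-go sketch of the ordering this implies: register indexers first, start the shared informers only after every controller has been initialized (the clientset construction and the index name are illustrative):

```go
package main

import (
	"context"
	"time"

	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	config, err := rest.InClusterConfig() // illustrative; integration tests use a test apiserver config
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)
	factory := informers.NewSharedInformerFactory(client, 10*time.Minute)

	podInformer := factory.Core().V1().Pods().Informer()

	// Indexers must be registered BEFORE the factory is started;
	// adding indexers to an already-started informer is rejected.
	if err := podInformer.AddIndexers(cache.Indexers{
		"byNode": func(obj interface{}) ([]string, error) {
			pod, ok := obj.(*v1.Pod)
			if !ok {
				return nil, nil
			}
			return []string{pod.Spec.NodeName}, nil
		},
	}); err != nil {
		panic(err)
	}

	// Only once every controller has had a chance to register its
	// indexers and event handlers do we start the shared informers.
	ctx := context.Background()
	factory.Start(ctx.Done())
	factory.WaitForCacheSync(ctx.Done())
}
```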
this is mainly to ensure integration tests (which all end in _test)
are properly bossed around for their imports
I had to adjust some of the _test files to adhere to existing
reverse_rules specified elsewhere
yaml has comments, so we can explain why we have certain rules or
certain prefixes
for those files that weren't already commented yaml, I converted them to
yaml and took a best guess at comments based on the PRs that introduced
or updated them
The service allocator is used to allocate IP addresses for the
Service IP allocator and NodePorts for the Service NodePort
allocator. It uses a bitmap backed by etcd to store the allocations
and tries to allocate the resources directly from local memory
instead of from etcd, which can cause issues in environments with
high concurrency.
It may happen, in deployments with multiple apiservers, that the
resource allocation information is out of sync; this is more
noticeable with NodePorts, for example:
1. apiserver A creates a service with NodePort X
2. apiserver B deletes the service
3. apiserver A creates the service again
If the allocation data of apiserver A wasn't refreshed after the
deletion done by apiserver B, apiserver A fails the allocation because
its data is out of sync. The Repair loops solve the problem later,
but some use cases require improving the concurrency
of the allocation logic.
Instead of performing the Allocate and Release operations only on local data,
we can check whether the local data is up to date with etcd,
and operate over the most recent version of the data.
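A loose sketch of that idea: re-read the stored snapshot, apply the operation to the freshest data, and write it back with a compare-and-swap on the version, retrying on conflicts (the store interface and all names here are hypothetical, not the actual registry code):

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
)

// snapshot mirrors what the allocator keeps in etcd: a bitmap of
// allocated offsets plus the version it was read at.
type snapshot struct {
	bitmap          *big.Int
	resourceVersion int64
}

// store is a hypothetical stand-in for the etcd-backed allocation object.
type store interface {
	Read() (snapshot, error)
	// CompareAndWrite fails if the object changed since snapshot.resourceVersion.
	CompareAndWrite(s snapshot) error
}

var errConflict = errors.New("allocation data changed, retry")

// allocate operates on the most recent stored data instead of trusting a
// possibly stale local copy, retrying on write conflicts.
func allocate(st store, offset int) error {
	for attempt := 0; attempt < 3; attempt++ {
		snap, err := st.Read()
		if err != nil {
			return err
		}
		if snap.bitmap.Bit(offset) == 1 {
			return fmt.Errorf("offset %d already allocated", offset)
		}
		snap.bitmap = new(big.Int).SetBit(snap.bitmap, offset, 1)
		if err := st.CompareAndWrite(snap); err == nil {
			return nil
		}
		// Another apiserver won the race; loop to re-read and retry.
	}
	return errConflict
}

// memStore is a toy in-memory implementation used only to exercise allocate.
type memStore struct{ snap snapshot }

func (m *memStore) Read() (snapshot, error) { return m.snap, nil }

func (m *memStore) CompareAndWrite(s snapshot) error {
	if s.resourceVersion != m.snap.resourceVersion {
		return errConflict
	}
	s.resourceVersion++
	m.snap = s
	return nil
}

func main() {
	st := &memStore{snap: snapshot{bitmap: big.NewInt(0)}}
	if err := allocate(st, 80); err != nil { // e.g. NodePort 30080 in a 30000-32767 range
		fmt.Println(err)
	}
}
```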
Integration tests imported e2e test code, and that dependency had two drawbacks:
- Hard to move test/e2e/framework into staging (#74352)
- Integration tests always need to run even if a PR only changes e2e test code
This enables the import-boss check to block such a dependency.
- Move utilities or constants out so that both of them are able
to run independently.
- Rename the legacy test so that it can eventually be deleted when the
perf dash changes are done
So that multiple instances of kube-apiserver can bind to the same address and
port, to provide seamless upgrades.
Signed-off-by: Mateusz Gozdek <mateusz@kinvolk.io>
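Sharing an address and port between processes typically relies on SO_REUSEPORT; the Linux-only sketch below shows how such a listener can be built in Go, and is an assumption about the mechanism rather than the actual kube-apiserver code:

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"syscall"

	"golang.org/x/sys/unix"
)

// reusePortListener opens a TCP listener with SO_REUSEPORT set, so that
// another process (e.g. a second server instance during an upgrade) can
// bind the same address and port at the same time.
func reusePortListener(addr string) (net.Listener, error) {
	lc := net.ListenConfig{
		Control: func(network, address string, c syscall.RawConn) error {
			var sockErr error
			if err := c.Control(func(fd uintptr) {
				// Allow several processes to bind the same address:port.
				sockErr = unix.SetsockoptInt(int(fd), unix.SOL_SOCKET, unix.SO_REUSEPORT, 1)
			}); err != nil {
				return err
			}
			return sockErr
		},
	}
	return lc.Listen(context.Background(), "tcp", addr)
}

func main() {
	ln, err := reusePortListener(":6443")
	if err != nil {
		fmt.Println(err)
		return
	}
	// Old and new instances can serve on the same port while the kernel
	// distributes incoming connections between them.
	http.Serve(ln, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
}
```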
Most of these could have been refactored automatically, but it would
have been uglier. The unsophisticated tooling left lots of unnecessary
struct -> pointer -> struct transitions.
This is gross but because NewDeleteOptions is used by various parts of
storage that still pass around pointers, the return type can't be
changed without significant refactoring within the apiserver. I think
this would be good to clean up, but I want to minimize apiserver-side
changes as much as possible in the client signature refactor.
The configuration file was designed as a YAML file on purpose,
to allow the test cases to be easily extended without a need to modify
the testing binary. Also, it's possible to extend the configuration
itself to enrich individual test cases.
After moving Permit() to the scheduling cycle, the test PermitPlugin should
no longer wait inside Permit() for another pod to enter Permit() and become the waiting pod.
In the past this was a way to make the test work regardless of the order in
which pods enter Permit(), but now only one Permit() can be executed at
any given moment, and waiting for another pod to enter Permit() inside
Permit() leads to timeouts.
With this change, the waitAndRejectPermit and waitAndAllowPermit flags make the first
pod to enter Permit() a waiting pod, and the second pod to enter Permit()
either rejects or allows the waiting pod.
Mentioned in #88469
1. move the integration test of TaintBasedEvictions to test/integration/node
2. move the e2e test of TaintBasedEvictions to test/e2e/node
3. modify the conformance file to adapt to the TaintBasedEvictions test
Close outbound connections when using a cert callback and certificates rotate. This means that we won't get into a situation where we have open TLS connections using expired certs, which would get unauthorized errors at the apiserver.
Attempt to retrieve a new certificate if open connections are near expiry, to prevent the case where the cert expires but we haven't yet opened a new TLS connection and so GetClientCertificate hasn't been called.
Move certificate rotation logic to a separate function
Rely on generic transport approach to handle closing TLS client connections in exec plugin; no need to use a custom dialer as this is now the default behaviour of the transport when faced with a cert callback. As a result of handling this case, it is now safe to apply the transport approach even in cases where there is a custom Dialer (this will not affect kubelet connrotation behaviour, because that uses a custom transport, not just a dialer).
Check expiry of the full TLS certificate chain that will be presented, not only the leaf. Only do this check when the certificate actually rotates. Start the certificate as a zero value, not nil, so that we don't see a rotation when there is in fact no client certificate
Drain the timer when we first initialize it, to prevent immediate rotation. Additionally, calling Stop() on the timer isn't necessary.
Don't close connections on the first 'rotation'
Remove RotateCertFromDisk and RotateClientCertFromDisk flags.
Instead simply default to rotating certificates from disk whenever files are exclusively provided.
Add integration test for client certificate rotation
Simplify logic; rotate every 5 mins
Instead of trying to be clever and checking for rotation just before an
expiry, let's match the new apiserver cert rotation logic
as much as possible. We write a controller that checks for rotation
every 5 mins. We also check on every new connection.
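A simplified sketch of that controller: re-read the certificate and key files every 5 minutes and, when the certificate actually changes, close existing connections so new ones pick up the fresh credentials (the closeConns callback stands in for the real transport plumbing):

```go
package main

import (
	"bytes"
	"crypto/tls"
	"log"
	"os"
	"time"
)

// rotateClientCert reloads the certificate and key from disk every
// interval and invokes closeConns when the certificate actually changes,
// so that long-lived TLS connections don't keep using an expired cert.
func rotateClientCert(certFile, keyFile string, interval time.Duration, closeConns func(), stop <-chan struct{}) {
	var lastCert []byte

	load := func() {
		certPEM, err := os.ReadFile(certFile)
		if err != nil {
			log.Printf("reading cert: %v", err)
			return
		}
		if _, err := tls.LoadX509KeyPair(certFile, keyFile); err != nil {
			log.Printf("invalid cert/key pair: %v", err)
			return
		}
		// Only treat a change as a rotation; the initial load is not one.
		if lastCert != nil && !bytes.Equal(lastCert, certPEM) {
			closeConns()
		}
		lastCert = certPEM
	}

	load() // initial read; no connection closing on the first "rotation"
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			load()
		case <-stop:
			return
		}
	}
}

func main() {
	stop := make(chan struct{})
	go rotateClientCert("client.crt", "client.key", 5*time.Minute, func() {
		log.Println("certificate rotated: closing existing TLS connections")
	}, stop)
	select {} // block forever; real callers tie this to their lifecycle
}
```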
Respond to review
Fix kubelet certificate rotation logic
The kubelet rotation logic seems to be broken because it expects its
cert files to end up as cert data whereas in fact they end up as a
callback. We should just call the tlsConfig GetCertificate callback
as this obtains a current cert even in cases where a static cert is
provided, and check that for validity.
Later on we can refactor all of the kubelet logic so that all it does is
write files to disk, and the cert rotation work does the rest.
Only read certificates once a second at most
Respond to review
1) Don't blat the cert file names
2) Make it more obvious where we have a neverstop
3) Naming
4) Verbosity
Avoid cache busting
Use filenames as cache keys when rotation is enabled, and add the
rotation later in the creation of the transport.
Caller should start the rotating dialer
Add continuous request rotation test
Rebase: use context in List/Watch
Swap goroutine around
Retry GETs on net.IsProbableEOF
Refactor certRotatingDialer
For simplicity, don't affect cert callbacks
To reduce change surface, let's not try to handle the case of a changing
GetCert callback in this PR. Reverting this commit should be sufficient
to handle that case in a later PR.
This PR will focus only on rotating certificate and key files.
Therefore, we don't need to modify the exec auth plugin.
Fix copyright year