- Allow client-side to server-side apply upgrade.
Ensure that a user can change management of an object from client-side apply to
server-side apply without conflicts.
- Allow server-side apply to client-side downgrade.
For an object managed with client-side apply, a user may upgrade to
managing the object with server-side apply, then decide to downgrade.
We can support this downgrade by keeping the last-applied-configuration
annotation for client-side apply updated with server-side apply.
Using NodeWrapper in the integration tests gives more flexibility when
creating nodes. For instance, tests can create nodes with labels or
with a specific sets of resources.
Also, NodeWrapper initialises a node with a capacity of 32 pods, which
can be overridden by the caller. This makes sure that a node is usable
as soon as it is created.
If a reserve plugin's Reserve method returns an error, there could be
previously allocated resources from successfully completed reserve
plugins that must be unallocated by the corresponding Unreserve
operation. Since Unreserve operations are idempotent, this patch runs
the Unreserve operation of ALL reserve plugins when a Reserve operation
fails.
Previously, separate interfaces were defined for Reserve and Unreserve
plugins. However, in nearly all cases, a plugin that allocates a
resource using Reserve will likely want to register itself for Unreserve
as well in order to free the allocated resource at the end of a failed
scheduling/binding cycle. Having separate plugins for Reserve and
Unreserve also adds unnecessary config toil. To that end, this patch
aims to merge the two plugins into a single interface called a
ReservePlugin that requires implementing both the Reserve and Unreserve
methods.
ingress: use new serviceBackend split
ingress: remove all v1beta1 restrictions on creation
This change removes creation and update restrictions enforced by
k8s 1.18 for not allowing resource backends.
Paths are no longer
required to be valid regex and a PathType is now user-specified
and no longer defaulted.
Also remove all TODOs in staging/net/v1 types
Signed-off-by: Christopher M. Luciano <cmluciano@us.ibm.com>
- use in-cache Pod instead of real-time Pod (by calling API server) to mark it as unschedulable
in internal schedulingQ
- remove the backoff logic as now we don't call API server
- the whole logic is changed to a synchronous call
We have little coverage around node addition and removal. Since distinct event handlers interact, it is important to cover this in integration tests.
Signed-off-by: Aldo Culquicondor <acondor@google.com>
In case two or more controllers share the informers created through InitTestScheduler,
it's not safe to start the informers until all controllers set their informer
indexers. Otherwise, some controller might fail to register their indexers
in time. Thus, it's responsibility of each consumer to make sure all informers
are started after all controllers had time to get initiliazed.
this is mainly to ensure integration tests (which all end in _test)
are properly bossed around for their imports
I had to adjust some of the _test files to adhere to existing
reverse_rules specified elsewhere
yaml has comments, so we can explain why we have certain rules or
certain prefixes
for those files that weren't already commented yaml, I converted them to
yaml and took a best guess at comments based on the PRs that introduced
or updated them
The service allocator is used to allocate ip addresses for the
Service IP allocator and NodePorts for the Service NodePort
allocator. It uses a bitmap backed by etcd to store the allocation
and tries to allocate the resources directly from the local memory
instead from etcd, that can cause issues in environment with
high concurrency.
It may happen, in deployments with multiple apiservers, that the
resource allocation information is out of sync, this is more
sensible with NodePorts, per example:
1. apiserver A create a service with NodePort X
2. apiserver B deletes the service
3. apiserver A creates the service again
If the allocation data of apiserver A wasn't refreshed with the
deletion of apiserver B, apiserver A fails the allocation because
the data is out of sync. The Repair loops solve the problem later,
but there are some use cases that require to improve the concurrency
in the allocation logic.
We can try to not do the Allocation and Release operations locally,
and try instead to check if the local data is up to date with etcd,
and operate over the most recent version of the data.
Integration tests imported e2e test code and the dependency made two drawbacks:
- Hard to move test/e2e/framework into staging (#74352)
- Need to run integration tests always even if PRs are just changing e2e test code
This enables import-boss check for blocking such dependency.
- Move utilities or constants out so that both of them should be able
to run independently.
- Rename the legacy test so that it can eventually be deleted when the
perf dash changes is done
So multiple instances of kube-apiserver can bind on the same address and
port, to provide seamless upgrades.
Signed-off-by: Mateusz Gozdek <mateusz@kinvolk.io>
Most of these could have been refactored automatically but it wouldn't
have been uglier. The unsophisticated tooling left lots of unnecessary
struct -> pointer -> struct transitions.
This is gross but because NewDeleteOptions is used by various parts of
storage that still pass around pointers, the return type can't be
changed without significant refactoring within the apiserver. I think
this would be good to cleanup, but I want to minimize apiserver side
changes as much as possible in the client signature refactor.
The configuration file was design as a yaml file on purpose.
To easily extend the test cases without a need to modify
the testing binary. Also, it's possible to extend the configuration
itself to enrich individual test cases.
After moving Permit() to the scheduling cycle test PermitPlugin should
no longer wait inside Permit() for another pod to enter Permit() and become waiting pod.
In the past this was a way to make test work regardless of order in
which pods enter Permit(), but now only one Permit() can be executed at
any given moment and waiting for another pod to enter Permit() inside
Permit() leads to timeouts.
In this change waitAndRejectPermit and waitAndAllowPermit flags make first
pod to enter Permit() a waiting pod and second pod to enter Permit()
either rejecting or allowing pod.
Mentioned in #88469
1. move the integration test of TaintBasedEvictions to test/integration/node
2. move the e2e test of TaintBasedEvictions e2e test/e2e/node
3. modify the conformance file to adapt the TaintBasedEviction test
Close outbound connections when using a cert callback and certificates rotate. This means that we won't get into a situation where we have open TLS connections using expires certs, which would get unauthorized errors at the apiserver
Attempt to retrieve a new certificate if open connections near expiry, to prevent the case where the cert expires but we haven't yet opened a new TLS connection and so GetClientCertificate hasn't been called.
Move certificate rotation logic to a separate function
Rely on generic transport approach to handle closing TLS client connections in exec plugin; no need to use a custom dialer as this is now the default behaviour of the transport when faced with a cert callback. As a result of handling this case, it is now safe to apply the transport approach even in cases where there is a custom Dialer (this will not affect kubelet connrotation behaviour, because that uses a custom transport, not just a dialer).
Check expiry of the full TLS certificate chain that will be presented, not only the leaf. Only do this check when the certificate actually rotates. Start the certificate as a zero value, not nil, so that we don't see a rotation when there is in fact no client certificate
Drain the timer when we first initialize it, to prevent immediate rotation. Additionally, calling Stop() on the timer isn't necessary.
Don't close connections on the first 'rotation'
Remove RotateCertFromDisk and RotateClientCertFromDisk flags.
Instead simply default to rotating certificates from disk whenever files are exclusively provided.
Add integration test for client certificate rotation
Simplify logic; rotate every 5 mins
Instead of trying to be clever and checking for rotation just before an
expiry, let's match the logic of the new apiserver cert rotation logic
as much as possible. We write a controller that checks for rotation
every 5 mins. We also check on every new connection.
Respond to review
Fix kubelet certificate rotation logic
The kubelet rotation logic seems to be broken because it expects its
cert files to end up as cert data whereas in fact they end up as a
callback. We should just call the tlsConfig GetCertificate callback
as this obtains a current cert even in cases where a static cert is
provided, and check that for validity.
Later on we can refactor all of the kubelet logic so that all it does is
write files to disk, and the cert rotation work does the rest.
Only read certificates once a second at most
Respond to review
1) Don't blat the cert file names
2) Make it more obvious where we have a neverstop
3) Naming
4) Verbosity
Avoid cache busting
Use filenames as cache keys when rotation is enabled, and add the
rotation later in the creation of the transport.
Caller should start the rotating dialer
Add continuous request rotation test
Rebase: use context in List/Watch
Swap goroutine around
Retry GETs on net.IsProbableEOF
Refactor certRotatingDialer
For simplicity, don't affect cert callbacks
To reduce change surface, lets not try to handle the case of a changing
GetCert callback in this PR. Reverting this commit should be sufficient
to handle that case in a later PR.
This PR will focus only on rotating certificate and key files.
Therefore, we don't need to modify the exec auth plugin.
Fix copyright year