The functionality used to exist entirely in the NC which would
previously clean up pods and nodes together. Now, we simply
wait for the PodGC to see that the node is now deleted and clean up the
pods. This may take a while and hence we set a 1 minute timeout.
Automatic merge from submit-queue
Node controller to not force delete pods
Fixes https://github.com/kubernetes/kubernetes/issues/35145
- [x] e2e tests to test Petset, RC, Job.
- [x] Remove and cover other locations where we force-delete pods within the NodeController.
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
``` release-note
Node controller no longer force-deletes pods from the api-server.
* For StatefulSet (previously PetSet), this change means creation of replacement pods is blocked until old pods are definitely not running (indicated either by the kubelet returning from partitioned state, or deletion of the Node object, or deletion of the instance in the cloud provider, or force deletion of the pod from the api-server). This has the desirable outcome of "fencing" to prevent "split brain" scenarios.
* For all other existing controllers except StatefulSet , this has no effect on the ability of the controller to replace pods because the controllers do not reuse pod names (they use generate-name).
* User-written controllers that reuse names of pod objects should evaluate this change.
```
This changes framework.GetReadySchedulableNodesOrDie and
framework.GetMasterAndWorkerNodesOrDie so that nodes that can't take a
generic fake pod due to a taint/toleration mismatch aren't returned.
This is a rehash of #35210, but pulls in the scheduler code.
Automatic merge from submit-queue
remove the non-generated client
Removes the non-generated client from kube. The package has a few methods left, but nothing that needs updating when adding new groups.
@ingvagabund
Automatic merge from submit-queue
Set done to true & return error if RestartPolicy not Always in test framework
Found a small issue with https://github.com/kubernetes/kubernetes/pull/34632, it returns an error if the RestartPolicy is not Always, but the user will never see it because done isn't set to true & they will timeout instead.
@Random-Liu because you wrote that PR
Automatic merge from submit-queue
Adding cascading deletion support to federated namespaces
Ref https://github.com/kubernetes/kubernetes/issues/33612
With this change, whenever a federated namespace is deleted with `DeleteOptions.OrphanDependents = false`, then federation namespace controller first deletes the corresponding namespaces from all underlying clusters before deleting the federated namespace.
cc @kubernetes/sig-cluster-federation @caesarxuchao
```release-note
Adding support for DeleteOptions.OrphanDependents for federated namespaces. Setting it to false while deleting a federated namespace also deletes the corresponding namespace from all registered clusters.
```
Automatic merge from submit-queue
Make E2E tests easier to debug
This PR removes deferred deletion calls in E2E tests in order to make test easier to debug. Some of these tests predate namespace finalization; we should just rely on that mechanism to ensure that framework test artifacts are deleted.
Automatic merge from submit-queue
controller: set minReadySeconds in deployment's replica sets
* Estimate available pods for a deployment by using minReadySeconds on
the replica set.
* Stop requeueing deployments on pod events, superseded by following the
replica set status.
* Cleanup redundant deployment utilities
Fixes https://github.com/kubernetes/kubernetes/issues/26079
@kubernetes/deployment ptal
Automatic merge from submit-queue
Move RunRC-like functions to test/utils
Ref. #34336
cc @timothysc - the "move" part of the small refactoring. @jayunit100
* Estimate available pods for a deployment by using minReadySeconds on
the replica set.
* Stop requeueing deployments on pod events, superseded by following the
replica set status.
* Cleanup redundant deployment utilities
Automatic merge from submit-queue
Allow garbage collection to work against different API prefixes
The GC needs to build clients based only on Resource or Kind. Hoist the
restmapper out of the controller and the clientpool, support a new
ClientForGroupVersionKind and ClientForGroupVersionResource, and use the
appropriate one in both places.
Allows OpenShift to use the GC
The GC needs to build clients based only on Resource or Kind. Hoist the
restmapper out of the controller and the clientpool, support a new
ClientForGroupVersionKind and ClientForGroupVersionResource, and use the
appropriate one in both places.
Automatic merge from submit-queue
Allow to use GetSigner with vagrant provider
In order to run tests that require ssh access to a node on vagrant
we need to provide path to private ssh key.
Now it will be possible to do using VAGRANT_SSH_KEY environment variable
Automatic merge from submit-queue
Add test for --quiet flag for kubectl run
This adds a test for the changes introduced in #30247 and #28801.
Ref #28695
In order to run tests that require ssh access to a node on vagrant
we need to provide path to private ssh key.
Now it will be possible to do using VAGRANT_SSH_KEY environment variable
Change-Id: Ic5fe0037edd46d0db3b8036ad7fc03cf1ea07574
Automatic merge from submit-queue
Provide an e2e skip helper checking for available resource
@janetkuo @dims this is the promised util function, but unfortunately I just learned that dynamic client suffers from the problem I've fixed in the manually written one (https://github.com/kubernetes/kubernetes/pull/29187) I need to look into the dynamic client in that case :/
Automatic merge from submit-queue
update taints e2e, restrict taints operation with key, effect
Since taints are now unique by key, effect on a node, this PR is to restrict existing taints adding/removing/updating operations in taints e2e.
Also fixes https://github.com/kubernetes/kubernetes/issues/31066#issuecomment-242870101
Related prior Issue/PR #29362 and #30590
Automatic merge from submit-queue
Log pressure condition, memory usage, events in memory eviction test
I want to log this to help us debug some of the latest memory eviction test flakes, where we are seeing burstable "fail" before the besteffort. I saw (in the logs) attempts by the eviction manager to evict besteffort a while before burstable phase changed to "Failed", but the besteffort's phase appeared to remain "Running". I want to see the pressure condition interleaved with the pod phases to get a sense of the eviction manager's knowledge vs. pod phase.
Automatic merge from submit-queue
Return detailed error message for better debugging.
Try to provide more details error message for debugging when this flake #31561 happens again.
@pwittrock
Automatic merge from submit-queue
e2e: log wget output on CheckConnectivityToHost error
Log output might help to diagnose e2e flakes, whether they are caused by dns issues or connection timeouts.
Might help with flake https://github.com/kubernetes/kubernetes/issues/28188.
Automatic merge from submit-queue
add retries for add/update/remove taints on node in taints e2e
fixes taint update conflict in taints e2e by adding retries for add/update/remove taints on node.
ref #27655 and #31066