Automatic merge from submit-queue (batch tested with PRs 35408, 41915, 41992, 41964, 41925)
e2e/upgrade: add sysctls
Add sysctl upgrade tests.
How can these be effectively tested?
Automatic merge from submit-queue (batch tested with PRs 41954, 40528, 41875, 41165, 41877)
Updating apiserver to return 202 when resource is being deleted asynchronously via cascading deletion
As per https://github.com/kubernetes/kubernetes/issues/33196#issuecomment-278440622.
cc @kubernetes/sig-api-machinery-pr-reviews @smarterclayton @caesarxuchao @bgrant0607 @kubernetes/api-reviewers
```release-note
Updating apiserver to return http status code 202 for a delete request when the resource is not immediately deleted because of user requesting cascading deletion using DeleteOptions.OrphanDependents=false.
```
Automatic merge from submit-queue (batch tested with PRs 41701, 41818, 41897, 41119, 41562)
Optimization of on-wire information sent to scheduler extender interfaces that are stateful
The changes are to address the issue described in https://github.com/kubernetes/kubernetes/issues/40991
```release-note
Added support to minimize sending verbose node information to scheduler extender by sending only node names and expecting extenders to cache the rest of the node information
```
cc @ravigadde
Automatic merge from submit-queue (batch tested with PRs 41994, 41969, 41997, 40952, 40576)
Guaranteed admission for Critical Pods
This is the first step in implementing node-level preemption for critical pods.
It defines the AdmissionFailureHandler interface, which allows callers, like the kubelet, to define how failed predicates are handled, and take steps to correct failures if necessary.
In the kubelet's implementation, it triggers preemption if the pod being admitted is critical, and if the only failed predicates are InsufficientResourceErrors, then it prempts (not yet implemented) other other pods to allow admission of the critical pod.
cc: @vishh
Automatic merge from submit-queue (batch tested with PRs 41994, 41969, 41997, 40952, 40576)
Inode Eviction test Tests that Innocent pods are not evicted throughout test.
The current version of the inode eviction test only makes sure that the two abusive pods are evicted before the innocent pod. This change adds a check during the post-eviction monitoring period to ensure that the innocent pod is not evicted, even after the abusive pods.
This will likely result in a higher failure rate for this test, but since it is in the flaky test suite, this is not an issue.
This change should help give us a better perspective on eviction issues.
cc @Random-Liu @vishh @derekwaynecarr
Automatic merge from submit-queue (batch tested with PRs 40932, 41896, 41815, 41309, 41628)
Modify CronJob API to add job history limits, cleanup jobs in controller
**What this PR does / why we need it**:
As discussed in #34710: this adds two limits to `CronJobSpec`, to limit the number of finished jobs created by a CronJob to keep.
**Which issue this PR fixes**: fixes#34710
**Special notes for your reviewer**:
cc @soltysh, please have a look and let me know what you think -- I'll then add end to end testing and update the doc in a separate commit. What is the timeline to get this into 1.6?
The plan:
- [x] API changes
- [x] Changing versioned APIs
- [x] `types.go`
- [x] `defaults.go` (nothing to do)
- [x] `conversion.go` (nothing to do?)
- [x] `conversion_test.go` (nothing to do?)
- [x] Changing the internal structure
- [x] `types.go`
- [x] `validation.go`
- [x] `validation_test.go`
- [x] Edit version conversions
- [x] Edit (nothing to do?)
- [x] Run `hack/update-codegen.sh`
- [x] Generate protobuf objects
- [x] Run `hack/update-generated-protobuf.sh`
- [x] Generate json (un)marshaling code
- [x] Run `hack/update-codecgen.sh`
- [x] Update fuzzer
- [x] Actual logic
- [x] Unit tests
- [x] End to end tests
- [x] Documentation changes and API specs update in separate commit
**Release note**:
```release-note
Add configurable limits to CronJob resource to specify how many successful and failed jobs are preserved.
```
Automatic merge from submit-queue (batch tested with PRs 42106, 42094, 42069, 42098, 41852)
Pod deletion observation is flaking, increase timeout and debug more
We can afford to wait longer than 30 seconds, and we should be printing
more error and output information about the cause of the failure.
Fixes / triages #41902
Automatic merge from submit-queue (batch tested with PRs 42106, 42094, 42069, 42098, 41852)
Add ncdc to test/OWNERS
**What this PR does / why we need it**: add me to test/OWNERS
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
cc @kubernetes/sig-testing-pr-reviews @smarterclayton @bgrant0607
Automatic merge from submit-queue (batch tested with PRs 41854, 41801, 40088, 41590, 41911)
Add storage.k8s.io/v1 API
v1 API is direct copy of v1beta1 API. This v1 API gets installed and exposed in this PR, I tested that kubectl can create both v1beta1 and v1 StorageClass.
~~Rest of Kubernetes (controllers, examples,. tests, ...) still use v1beta1 API, I will update it when this PR gets merged as these changes would get lost among generated code.~~ Most parts use v1 API now, it would not compile / run tests without it.
**Release note**:
```
Kubernetes API storage.k8s.io for storage objects is now fully supported and is available as storage.k8s.io/v1. Beta version of the API storage.k8s.io/v1beta1 is still available in this release, however it will be removed in a future Kubernetes release.
Together with the API endpoint, StorageClass annotation "storageclass.beta.kubernetes.io/is-default-class" is deprecated and "storageclass.kubernetes.io/is-default-class" should be used instead to mark a default storage class. The beta annotation is still working in this release, however it won't be supported in the next one.
```
@kubernetes/sig-storage-misc
Automatic merge from submit-queue
Supports 'ensure exist' class addon in Addon-manager
Fixes#39561, fixes#37047 and fixes#36411. Depends on #40057.
This PR splits cluster addons into two categories:
- Reconcile: Addons that need to be reconciled (`kube-dns` for instance).
- EnsureExists: Addons that need to be exist but changeable (`default-storage-class`).
The behavior for the 'EnsureExists' class addon would be:
- Create it if not exist.
- Users could do any modification they want, addon-manager will not reconcile it.
- If it is deleted, addon-manager will recreate it with the given template.
- It will not be updated/clobbered during upgrade.
As Brian pointed out in [#37048/comment](https://github.com/kubernetes/kubernetes/issues/37048#issuecomment-272510835), this may not be the best solution for addon-manager. Though #39561 needs to be fixed in 1.6 and we might not have enough bandwidth to do a big surgery.
@mikedanese @thockin
cc @kubernetes/sig-cluster-lifecycle-misc
---
Tasks for this PR:
- [x] Supports 'ensure exist' class addon and switch to use new labels in addon-manager.
- [x] Updates READMEs regarding the new behavior of addon-manager.
- [x] Updated `test/e2e/addon_update.go` to match the new behavior.
- [x] Go through all current addons and apply the new labels on them regarding what they need.
- [x] Bump addon-manager and update its template files.
Automatic merge from submit-queue (batch tested with PRs 41714, 41510, 42052, 41918, 31515)
Switch scheduler to use generated listers/informers
Where possible, switch the scheduler to use generated listers and
informers. There are still some places where it probably makes more
sense to use one-off reflectors/informers (listing/watching just a
single node, listing/watching scheduled & unscheduled pods using a field
selector).
I think this can wait until master is open for 1.7 pulls, given that we're close to the 1.6 freeze.
After this and #41482 go in, the only code left that references legacylisters will be federation, and 1 bit in a stateful set unit test (which I'll clean up in a follow-up).
@resouer I imagine this will conflict with your equivalence class work, so one of us will be doing some rebasing 😄
cc @wojtek-t @gmarek @timothysc @jayunit100 @smarterclayton @deads2k @liggitt @sttts @derekwaynecarr @kubernetes/sig-scheduling-pr-reviews @kubernetes/sig-scalability-pr-reviews
This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
Fix bash command's execution in MakePod().
Add isPriviledged as a parameter to MakePod().
Move PD utils to pv_util.go
Ran all the tests in pd.go, persistent_volumes.go,
persistent_volumes-disruptive.go.
These changes are needed for the PV upgrade test I am working on.
Automatic merge from submit-queue (batch tested with PRs 41667, 41820, 40910, 41645, 41361)
Switch admission to use shared informers
Originally part of #40097
cc @smarterclayton @derekwaynecarr @deads2k @liggitt @sttts @gmarek @wojtek-t @timothysc @lavalamp @kubernetes/sig-scalability-pr-reviews @kubernetes/sig-api-machinery-pr-reviews
node cache don't need to get all the information about every
candidate node. For huge clusters, sending full node information
of all nodes in the cluster on the wire every time a pod is scheduled
is expensive. If the scheduler is capable of caching node information
along with its capabilities, sending node name alone is sufficient.
These changes provide that optimization in a backward compatible way
- removed the inadvertent signature change of Prioritize() function
- added defensive checks as suggested
- added annotation specific test case
- updated the comments in the scheduler types
- got rid of apiVersion thats unused
- using *v1.NodeList as suggested
- backing out pod annotation update related changes made in the
1st commit
- Adjusted the comments in types.go and v1/types.go as suggested
in the code review