In 1.7, we add controller history to avoid unnecessary DaemonSet pod
restarts during pod adoption. We will not restart pods with matching
templateGeneration for backward compatibility, and will not restart pods
when template hash label matches current DaemonSet history, regardless
of templateGeneration.
Automatic merge from submit-queue (batch tested with PRs 47084, 46016, 46372)
Update adoption/release of DaemonSet controller history, and wait for history store sync
**What this PR does / why we need it**:
~Depends on #47075, so that DaemonSet controller can update history's controller ref. Ignore that commit when reviewing.~ (merged)
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: #46981
**Special notes for your reviewer**: @kubernetes/sig-apps-bugs
**Release note**:
```release-note
NONE
```
That installer decouples itself from COS image version (as long as the
image version is newer than cos-stable-59-9460-60-0).
A separate commit in the test-infra repo will update the cos version
used for this test to cos-stable-59-9460-60-0.
Automatic merge from submit-queue
Change what is stored in DaemonSet history `.data`
**What this PR does / why we need it**:
In DaemonSet history `.data`, store a strategic merge patch that can be applied to restore a DaemonSet. Only PodSpecTemplate is saved.
This will become consistent with the data stored in StatefulSet history.
Before this fix, a serialized pod template is stored in `.data`; however, seriazlized pod template isn't a `runtime.RawExtension`, and caused problems when controllers try to patch the history's controller ref.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47008
**Special notes for your reviewer**: @kubernetes/sig-apps-bugs @erictune @kow3ns @kargakis @lukaszo @mengqiy
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47113, 46665, 47189)
Improve the e2e node restart test
This commit includes the following two changes:
* Move pre-test checks (pods/nodes ready) to BeforeEach() so that it's
clear whether the test has run or not.
* Dumping logs for unready pods.
Automatic merge from submit-queue (batch tested with PRs 46835, 46856)
Made tests that create Horizontal Pod Autoscaler delete it after they are done.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#46847
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46835, 46856)
Made WaitForReplicas and EnsureDesiredReplicas use PollImmediate and improved logging.
**What this PR does / why we need it**: Most importantly, this results in better logging: timeout is logged at the level of the caller, not the helper function, helping debugging.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Allow pods to opt out of PodPreset mutation via an annotation on the pod
An annotation in the pod spec of the form:
podpreset.admission.kubernetes.io/PodPresetOptOut: "true"
Will cause the admission controller to skip manipulating the pod spec,
no matter the labelling.
This is an alternative implementation to pull #44163.
```release-note
Allow pods to opt out of PodPreset mutation via an annotation on the pod.
```
Automatic merge from submit-queue (batch tested with PRs 46885, 47197)
Fix e2e ns deletion message for flake analysis
**What this PR does / why we need it**:
Let's us know when pods have a missing deletion timestamp.
**Special notes for your reviewer**:
helps https://github.com/kubernetes/kubernetes/issues/47135
Automatic merge from submit-queue (batch tested with PRs 47065, 47157, 47143)
Use actual hostname when creating network e2e test pod
**What this PR does / why we need it**:
This changes a e2e framework network test Pod use the actual hostname value to match the `kubernetes.io/hostname` label in it's `NodeSelector`. Currently it assumes the Node name will match that hostname label which is not true in all environments.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Fixescoreos/tectonic-installer#1018
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47065, 47157, 47143)
Removed a race condition from ResourceConsumer
**What this PR does / why we need it**: Without this PR there is a race condition in ResourceConsumer that sometimes results in communication to pods that might not exist anymore.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47127
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Bump external provisioner image to smaller version
The image is roughly half as big so this should improve speed/flakiness maybe
-->
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46979, 47078, 47138, 46916)
DeleteCollection should include uninitialized resources
Users who delete a collection expect all resources to be deleted, and
users can also delete an uninitialized resource. To preserve this
expectation, DeleteCollection selects all resources regardless of
initialization.
The namespace controller should list uninitialized resources in order to
gate cleanup of a namespace.
Fixes#47137
Automatic merge from submit-queue (batch tested with PRs 46979, 47078, 47138, 46916)
[federation][e2e] Fix cleanupServiceShardLoadBalancer
**What this PR does / why we need it**:
Fixes the issue mentioned in #46976
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#46976
**Special notes for your reviewer**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45877, 46846, 46630, 46087, 47003)
update NetworkPolicy e2e test for v1 semantics
This makes the NetworkPolicy test at least correct for v1, although ideally we'll eventually add a few more tests... (So this covers about half of #46625.)
I've tested that this compiles, but not that it passes, since I don't have a v1-compatible NetworkPolicy implementation yet...
@caseydavenport @ozdanborne, maybe you're closer to having a testable plugin than I am?
**Release note**:
```release-note
NONE
```
Users who delete a collection expect all resources to be deleted, and
users can also delete an uninitialized resource. To preserve this
expectation, DeleteCollection selects all resources regardless of
initialization.
The namespace controller should list uninitialized resources in order to
gate cleanup of a namespace.
Automatic merge from submit-queue (batch tested with PRs 43005, 46660, 46385, 46991, 47103)
[gke-slow always fails] Defer DeleteGCEStaticIP before asserting error
From https://github.com/kubernetes/kubernetes/issues/46918.
I'm getting close to the root cause: During tests, CreateGCEStaticIP() in fact successfully created static IP, but the parser we wrote in test mistakenly think we failed, probably because the gcloud output format was changed recently (or not). I'm still looking into fixing that.
This PR defer the delete function before asserting the error so that we can stop consistently leaking static IP in every run.
/assign @krzyzacy @dchen1107
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46977, 47005, 47018, 47061, 46809)
Fix for cluster-autoscaler e2e failures
This may help with cluster-autoscaler e2e failing in setup if the tests are run before all machines in mig get fully ready.
Automatic merge from submit-queue (batch tested with PRs 46235, 44786, 46833, 46756, 46669)
implements StatefulSet update
**What this PR does / why we need it**:
1. Implements rolling update for StatefulSets
2. Implements controller history for StatefulSets.
3. Makes StatefulSet status reporting consistent with DaemonSet and ReplicaSet.
https://github.com/kubernetes/features/issues/188
**Special notes for your reviewer**:
**Release note**:
```release-note
Implements rolling update for StatefulSets. Updates can be performed using the RollingUpdate, Paritioned, or OnDelete strategies. OnDelete implements the manual behavior from 1.6. status now tracks
replicas, readyReplicas, currentReplicas, and updatedReplicas. The semantics of replicas is now consistent with DaemonSet and ReplicaSet, and readyReplicas has the semantics that replicas did prior to this release.
```
Automatic merge from submit-queue (batch tested with PRs 46235, 44786, 46833, 46756, 46669)
Fixed ResourceConsumer.CleanUp to properly clean up non-replication-controller resources and pods
**What this PR does / why we need it**: Without this fix CleanUp does not remove non-replication-controller resources and pods. This leads to pollution that in some cases inadvertently affects what is happening in AfterEachs before the namespace gets deleted.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46997, 47021)
Don't parse human-readable output from gcloud in tests
This is the reason `[k8s.io] Services should be able to change the type and ports of a service [Slow]` is currently failing on GKE e2e tests. For GKE jobs we run a prerelease version of gcloud, in which the default command output was changed.
gcloud's default output for commands is human readable, and is subject to change. Anything scripting against gcloud should always pass `--format=json|yaml|value(...)` so you get standardized output.
fixes: #46918
Automatic merge from submit-queue (batch tested with PRs 47083, 44115, 46881, 47082, 46577)
Add an e2e test for server side get
Print a better error from the response. Performs validation to ensure it
does not regress in alpha state.
This is tests and bug fixes for https://github.com/kubernetes/community/pull/363
@kubernetes/sig-api-machinery-pr-reviews
Implements history utilities for ControllerRevision in the controller/history package
StatefulSetStatus now has additional fields for consistency with DaemonSet and Deployment
StatefulSetStatus.Replicas now represents the current number of createdPods and StatefulSetStatus.ReadyReplicas is the current number of ready Pods
This commit includes the following two changes:
* Move pre-test checks (pods/nodes ready) to BeforeEach() so that it's
clear whether the test has run or not.
* Dumping logs for unready pods.
Automatic merge from submit-queue (batch tested with PRs 46972, 42829, 46799, 46802, 46844)
Multizone static pv test
**What this PR does / why we need it**:
Adds an e2e test for checking that pods get scheduled to the same zone as statically created PVs. This tests the PersistentVolumeLabel admission controller, which adds zone and region labels when PVs are created. As part of this, I also had to make changes to volume test utility code to pass in a zone parameter for creating PDs, and also had to add an argument to the e2e test program to accept a list of zones.
Fixes#46995
**Special notes for your reviewer**:
It's probably easier to review each commit separately.
**Release note**:
NONE
Automatic merge from submit-queue (batch tested with PRs 46550, 46663, 46816, 46820, 46460)
[GCE] Support internal load balancers
**What this PR does / why we need it**:
Allows users to expose K8s services externally of the K8s cluster but within their GCP network.
Fixes#33483
**Important User Notes:**
- This is a beta feature. ILB could be enabled differently in the future.
- Requires nodes having version 1.7.0+ (ILB requires health checking and a health check endpoint on kube-proxy has just been exposed)
- This cannot be used for intra-cluster communication. Do not call the load balancer IP from a K8s node/pod.
- There is no reservation system for private IPs. You can specify a RFC 1918 address in `loadBalancerIP` field, but it could be lost to another VM or LB if service settings are modified.
- If you're running an ingress, your existing loadbalancer backend service must be using BalancingMode type `RATE` - not `UTILIZATION`.
- Option 1: With a 1.5.8+ or 1.6.4+ version master, delete all your ingresses, and re-create them.
- Option 2: Migrate to a new cluster running 1.7.0. Considering ILB requires nodes with 1.7.0, this isn't a bad idea.
- Option 3: Possible migration opportunity, but use at your own risk. More to come later.
**Reviewer Notes**:
Several files were renamed, so github thinks ~2k lines have changed. Review commits one-by-one to see the actual changes.
**Release note**:
```release-note
Support creation of GCP Internal Load Balancers from Service objects
```
Handle failure cases on startup gracefully to avoid causing cascading
errors and poor initialization in other components. Initial errors from
config load cause the initializer to pause and hold requests. Return
typed errors to better communicate failures to clients.
Add code to handle two specific cases - admin wants to bypass
initialization defaulting, and mirror pods (which want to bypass
initialization because the kubelet owns their lifecycle).
An annotation in the pod spec of the form:
podpreset.admission.kubernetes.io/exclude: "true"
Will cause the admission controller to skip manipulating the pod spec,
no matter the labelling.
The annotation for a podpreset acting on a pod has also been slightly
modified to contain a podpreset prefix:
podpreset.admission.kubernetes.io/podpreset-{name} = resource version
Fixes#44161
Automatic merge from submit-queue (batch tested with PRs 45871, 46498, 46729, 46144, 46804)
PD e2e test: Ready node check now uses the most up-to-date node count.
Follow-up to PR #46746
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
Automatic merge from submit-queue
Respect PDBs during node upgrades and add test coverage to the ServiceTest upgrade test.
This is still a WIP... needs to be squashed at least, and I don't think it's currently passing until I increase the scale of the RC, but please have a look at the general outline. Thanks!
Fixes#38336
@kow3ns @bdbauer @krousey @erictune @maisem @davidopp
```
On GCE, node upgrades will now respect PodDisruptionBudgets, if present.
```
Automatic merge from submit-queue
Implement Daemonset history
~Depends on #45867 (the 1st commit, ignore it when reviewing)~ (already merged)
Ref https://github.com/kubernetes/community/pull/527/ and https://github.com/kubernetes/community/pull/594
@kubernetes/sig-apps-api-reviews @kubernetes/sig-apps-pr-reviews @erictune @kow3ns @lukaszo @kargakis
---
TODOs:
- [x] API changes
- [x] (maybe) Remove rollback subresource if we decide to do client-side rollback
- [x] deployment controller
- [x] controller revision
- [x] owner ref (claim & adoption)
- [x] history reconstruct (put revision number, hash collision avoidance)
- [x] de-dup history and relabel pods
- [x] compare ds template with history
- [x] hash labels (put it in controller revision, pods, and maybe deployment)
- [x] clean up old history
- [x] Rename status.uniquifier when we reach consensus in #44774
- [x] e2e tests
- [x] unit tests
- [x] daemoncontroller_test.go
- [x] update_test.go
- [x] ~(maybe) storage_test.go // if we do server side rollback~
kubectl part is in #46144
---
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 46620, 46732, 46773, 46772, 46725)
Fix AppArmor test for docker 1.13
... & better debugging.
The issue is that we run the pod containers in a shared PID namespace with docker 1.13, so PID 1 is no longer the container's root process. Since it's messy to get the container's root process, I switched to using `/proc/self` to read the apparmor profile. While this wouldn't catch a regression that caused only the init process to run with the wrong profile, I think it's a good approximation.
/cc @aulanov @Amey-D
Automatic merge from submit-queue (batch tested with PRs 46620, 46732, 46773, 46772, 46725)
Added missing documentation to NodeInstanceGroup.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue
Add initializer support to admission and uninitialized filtering to rest storage
Initializers are the opposite of finalizers - they allow API clients to react to object creation and populate fields prior to other clients seeing them.
High level description:
1. Add `metadata.initializers` field to all objects
2. By default, filter objects with > 0 initializers from LIST and WATCH to preserve legacy client behavior (known as partially-initialized objects)
3. Add an admission controller that populates .initializer values per type, and denies mutation of initializers except by certain privilege levels (you must have the `initialize` verb on a resource)
4. Allow partially-initialized objects to be viewed via LIST and WATCH for initializer types
5. When creating objects, the object is "held" by the server until the initializers list is empty
6. Allow some creators to bypass initialization (set initializers to `[]`), or to have the result returned immediately when the object is created.
The code here should be backwards compatible for all clients because they do not see partially initialized objects unless they GET the resource directly. The watch cache makes checking for partially initialized objects cheap. Some reflectors may need to change to ask for partially-initialized objects.
```release-note
Kubernetes resources, when the `Initializers` admission controller is enabled, can be initialized (defaulting or other additive functions) by other agents in the system prior to those resources being visible to other clients. An initialized resource is not visible to clients unless they request (for get, list, or watch) to see uninitialized resources with the `?includeUninitialized=true` query parameter. Once the initializers have completed the resource is then visible. Clients must have the the ability to perform the `initialize` action on a resource in order to modify it prior to initialization being completed.
```
Automatic merge from submit-queue (batch tested with PRs 46239, 46627, 46346, 46388, 46524)
move labels to components which own the APIs
During the apimachinery split in 1.6, we accidentally moved several label APIs into apimachinery. They don't belong there, since the individual APIs are not general machinery concerns, but instead are the concern of particular components: most commonly the kubelet. This pull moves the labels into their owning components and out of API machinery.
@kubernetes/sig-api-machinery-misc @kubernetes/api-reviewers @kubernetes/api-approvers
@derekwaynecarr since most of these are related to the kubelet
Add support for creating resources that are not immediately visible to
naive clients, but must first be initialized by one or more privileged
cluster agents. These controllers can mark the object as initialized,
allowing others to see them.
Permission to override initialization defaults or modify an initializing
object is limited per resource to a virtual subresource "RESOURCE/initialize"
via RBAC.
Initialization is currently alpha.
Automatic merge from submit-queue (batch tested with PRs 46648, 46500, 46238, 46668, 46557)
Add an e2e test for AdvancedAuditing
Enable a simple "advanced auditing" setup for e2e tests running on GCE, and add an e2e test that creates & deletes a pod, a secret, and verifies that they're audited.
Includes https://github.com/kubernetes/kubernetes/pull/46548
For https://github.com/kubernetes/features/issues/22
/cc @ericchiang @sttts @soltysh @ihmccreery
Respect PDBs during node upgrades and add test coverage to the
ServiceTest upgrade test. Modified that test so that we include pod
anti-affinity constraints and a PDB.
Automatic merge from submit-queue (batch tested with PRs 46076, 43879, 44897, 46556, 46654)
Local storage plugin
**What this PR does / why we need it**:
Volume plugin implementation for local persistent volumes. Scheduler predicate will direct already-bound PVCs to the node that the local PV is at. PVC binding still happens independently.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Part of #43640
**Release note**:
```
Alpha feature: Local volume plugin allows local directories to be created and consumed as a Persistent Volume. These volumes have node affinity and pods will only be scheduled to the node that the volume is at.
```
Automatic merge from submit-queue
Support grabbing test suite metrics
**What this PR does / why we need it**:
Add support for grabbing metrics that cover the entire test suite's execution.
Update the "interesting" controller-manager metrics to match the
current names for the garbage collector, and add namespace controller
metrics to the list.
If you enable `--gather-suite-metrics-at-teardown`, the metrics file is written to a file with a name such as `MetricsForE2ESuite_2017-05-25T20:25:57Z.json` in the `--report-dir`. If you don't specify `--report-dir`, the metrics are written to the test log output.
I'd like to enable this for some of the `pull-*` CI jobs, which will require a separate PR to test-infra.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
@kubernetes/sig-testing-pr-reviews @smarterclayton @wojtek-t @gmarek @derekwaynecarr @timothysc
Automatic merge from submit-queue (batch tested with PRs 45534, 37212, 46613, 46350)
[e2e]Fix define redundant parameter
When timeout to reach HTTP service, redundant parameter make the
error is nil.
Automatic merge from submit-queue
no-snat test
This test checks that Pods can communicate with each other in the same cluster without SNAT.
I intend to create a job that runs this in small clusters (\~3 nodes) at a low frequency (\~once per day) so that we have a signal as we work on allowing multiple non-masquerade CIDRs to be configured (see [kubernetes-incubator/ip-masq-agent](https://github.com/kubernetes-incubator/ip-masq-agent), for example).
/cc @dnardo