Commit Graph

1625 Commits

Author SHA1 Message Date
deads2k
aa5cbb704f convert deployment controller to shared informers 2016-10-07 15:06:57 -04:00
Michail Kargakis
52822d7d6b Add a test case and consolidate deployment constructors 2016-10-06 10:37:35 +02:00
Michail Kargakis
ed8b77087a controller: scale proportionally before rolling out new templates 2016-10-06 10:36:39 +02:00
Kubernetes Submit Queue
42e5f95a6b Merge pull request #34024 from deads2k/controller-06-deployment-controller
Automatic merge from submit-queue

update deployment and replicaset listers

Updates the deployment lister to avoid copies and updates the deployment controller to use shared informers.

Pushing WIP to see which tests are broken.
2016-10-06 00:02:34 -07:00
deads2k
c30b2efc46 update replicaset lister 2016-10-05 15:20:27 -04:00
deads2k
358a57d74a update deployment lister 2016-10-05 13:27:35 -04:00
deads2k
8ea2acc6a3 use service accounts as clients for controllers 2016-10-05 13:15:16 -04:00
Kubernetes Submit Queue
f79a53a734 Merge pull request #31777 from dshulyak/evict_pet
Automatic merge from submit-queue

Delete evicted pet

If pet was evicted by kubelet - it will stuck in this state forever.
By analogy to regular pod we need to re-create pet so that it will
be re-scheduled to another node, so in order to re-create pet
and preserve consitent naming we will delete it in petset controller
and create after that.

fixes: https://github.com/kubernetes/kubernetes/issues/31098
2016-10-04 01:32:02 -07:00
Kubernetes Submit Queue
1dc8277507 Merge pull request #33796 from jingxu97/quickfix-aws-9-28
Automatic merge from submit-queue

Fix issue in updating device path when volume is attached multiple times

When volume is attached, it is possible that the actual state
already has this volume object (e.g., the volume is attached to multiple
nodes, or volume was detached and attached again). We need to update the
device path in such situation, otherwise, the device path would be stale
information and cause kubelet mount to the wrong device.

This PR partially fixes issue #29324
2016-10-03 23:01:08 -07:00
Jing Xu
9e8edf6baf Fix issue in updating device path when volume is attached multiple times
When volume is attached, it is possible that the actual state
already has this volume object (e.g., the volume is attached to multiple
nodes, or volume was detached and attached again). We need to update the
device path in such situation, otherwise, the device path would be stale
information and cause kubelet mount to the wrong device.

This PR partially fixes issue #29324
2016-10-03 17:14:23 -07:00
Lucas Käldström
0bba65ca1a Remove old references to contrib/mesos 2016-10-01 16:46:48 +03:00
Kubernetes Submit Queue
5cfed5ff22 Merge pull request #33374 from deads2k/controller-05-more-informers
Automatic merge from submit-queue

switch node controller to shared informers

Switches the node controller to re-use existing watches and caches.
2016-10-01 03:39:47 -07:00
Kubernetes Submit Queue
a2cd107e14 Merge pull request #32373 from nebril/petset-count-test-master
Automatic merge from submit-queue

PetSet replica count status test

**What this PR does / why we need it**: It adds a test for PetSet status replica count. It should fail now, but will pass when https://github.com/kubernetes/kubernetes/pull/32117 is merged.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #31965

**Special notes for your reviewer**: It will need to be rebased after #32117 is merged in, don't need detailed review before that.

**Release note**:
```release-note
NONE
```

Added fakeKubeClient and other fake types needed to test what is sent to
API when replica count is updated. These fakes can be extended for
other tests.
2016-09-29 23:37:18 -07:00
Kubernetes Submit Queue
10239c983d Merge pull request #32850 from m1093782566/m109-disruption
Automatic merge from submit-queue

fix disruption controller hotloop

<!--  Thanks for sending a pull request!  Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->


Fix disruption controller hotloop on unexpected API server rejections.

**Which issue this PR fixes** 

Related issue is #30629

**Special notes for your reviewer**:

@deads2k @derekwaynecarr PTAL.
2016-09-29 07:10:15 -07:00
deads2k
0961784a9b switch node controller to shared informers 2016-09-29 09:16:41 -04:00
Jerzy Szczepkowski
0f0a9b6d61 Fixes in HPA: consider only running pods; proper denominator in avg calculations.
Fixes in HPA: consider only running pods; proper denominator in avg calculations.
2016-09-29 11:20:53 +02:00
Maciej Kwiek
caf2411f88 PetSet replica count status test 2016-09-29 10:07:14 +02:00
Kubernetes Submit Queue
33d29b5d6b Merge pull request #33235 from caesarxuchao/fix-TestCreateWithNonExistentOwner
Automatic merge from submit-queue

Fix TestCreateWithNonExistentOwner

Fix #30228
As https://github.com/kubernetes/kubernetes/issues/30228#issuecomment-248779567 described, the GC did delete the garbage, it's the test logic failed. 
The test used to rely on `gc.QueuesDrained()`, which could return before the GC finished processing. It seems to be the only possible reason of the test failure. Hence, this PR changed the test to poll for the deletion of garbage.
2016-09-28 07:33:45 -07:00
Kubernetes Submit Queue
96a7b0920a Merge pull request #32495 from gmarek/podgc
Automatic merge from submit-queue

Move orphaned Pod deletion logic to PodGC

cc @mwielgus @mikedanese @davidopp
2016-09-28 06:55:46 -07:00
Kubernetes Submit Queue
5af1b25235 Merge pull request #32771 from kargakis/minReadySecondsForRS
Automatic merge from submit-queue

MinReadySeconds / AvailableReplicas for ReplicaSets

This PR adds minReadySeconds and availableReplicas for replica sets / replication controllers

Partially addresses https://github.com/kubernetes/kubernetes/issues/28381

cc: @mfojtik 

@bgrant0607 for the api changes, @janetkuo for controller changes
2016-09-28 06:17:54 -07:00
gmarek
cb0a13c1e5 Move orphaned Pod deletion logic to PodGC 2016-09-28 13:58:31 +02:00
Kubernetes Submit Queue
43758c8f17 Merge pull request #32117 from nebril/petset-count
Automatic merge from submit-queue

PetSet returns valid replica count in status

**What this PR does / why we need it**: It prevents the PetSet replica count to be invalid regardless of pods not being created due to 

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #31965

**Special notes for your reviewer**:

**Release note**:
```release-note
```
2016-09-28 02:24:18 -07:00
Michail Kargakis
f7c232b8c6 extensions: add minReadySeconds/availableReplicas to replica sets 2016-09-28 11:06:40 +02:00
Kubernetes Submit Queue
1854bdcb0c Merge pull request #29048 from justinsb/volumes_nodename_not_hostname
Automatic merge from submit-queue

Use strongly-typed types.NodeName for a node name

We had another bug where we confused the hostname with the NodeName.

Also, if we want to use different values for the Node.Name (which is
an important step for making installation easier), we need to keep
better control over this.

A tedious but mechanical commit therefore, to change all uses of the
node name to use types.NodeName
2016-09-27 17:58:41 -07:00
Kubernetes Submit Queue
8d72f66e47 Merge pull request #32129 from jsafrane/refactor-controller-startup
Automatic merge from submit-queue

Refactor volume controller parameters into a structure

`persistentvolumecontroller.NewPersistentVolumeController` has 11 arguments now,
put them into a structure.

Also, rename `NewPersistentVolumeController` to `NewController`, `persistentvolume`
is already name of the package.

Fixes #30219
2016-09-27 08:09:39 -07:00
Justin Santa Barbara
54195d590f Use strongly-typed types.NodeName for a node name
We had another bug where we confused the hostname with the NodeName.

To avoid this happening again, and to make the code more
self-documenting, we use types.NodeName (a typedef alias for string)
whenever we are referring to the Node.Name.

A tedious but mechanical commit therefore, to change all uses of the
node name to use types.NodeName

Also clean up some of the (many) places where the NodeName is referred
to as a hostname (not true on AWS), or an instanceID (not true on GCE),
etc.
2016-09-27 10:47:31 -04:00
Maciej Kwiek
0bec588202 PetSet returns valid replica count in status
If the first pod is not healthy and next pods are not yet created, do
not provide the status with incorrect replica count
2016-09-27 10:58:26 +02:00
Kubernetes Submit Queue
7309d34873 Merge pull request #33492 from kargakis/stop-retrying-selector-overlaps
Automatic merge from submit-queue

controller: don't retry deployments with overlapping selectors

Returning an error will cause the deployment to be requeued. We should
just emit an event for deployments with overlapping selectors and silently
drop then out of the queue. This should be transitioned to a Condition
once we have them.

@kubernetes/deployment ptal
2016-09-26 23:50:40 -07:00
Kubernetes Submit Queue
6c1c0b9842 Merge pull request #32027 from m1093782566/m109-petset-fix-test-err
Automatic merge from submit-queue

[BUG FIX] Fix bug of UT in Pet Set

<!--  Thanks for sending a pull request!  Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->

**What this PR does / why we need it**:

Fix bug of UT in Pet Set.

[1] https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/petset/pet_set_test.go#L74-L75, 

I think` len(pl)` is not equal to `len(fc.pets)`, see [here](https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/petset/fakes.go#L229-L233)

[2] https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/petset/fakes.go#L249

I think should change to 

```
if len(f.pets) <= index {
```

because when `len(f.pets)==index`, then [here](https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/petset/fakes.go#L252-L254) will cause `index out of range` panic!

[3] https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/petset/fakes.go#L271

same reason with [2]

[4] https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/petset/pet_set_test.go#L79

which doesn't make use of the error returned by [setHealthy](https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/petset/fakes.go#L248) and has a risk of letting the error out.

Should we catch the error and use `t.Errorf()` to stop the test?
2016-09-26 21:13:14 -07:00
Chao Xu
7249c9bd8a fix TestCreateWithNonExistentOwner
remove the use of gc.QueuesDrained
2016-09-26 16:51:56 -07:00
Ivan Shvedunov
5651f822fd Fix DaemonSet namespace handling for predicates
In order to determine whether a node should run its daemon pod,
DaemonController creates a dummy pod based on DaemonSet's template and
then uses scheduler predicates (currently GeneralPredicates) to test
whether such pod can be run by the node. The problem was that
DaemonController was not setting Namespace for the dummy pod. This was
not affecting currently used GeneralPredicates but this problem could
bite later when some namespace-dependent predicates are added to
GeneralPredicates or directly to DaemonController's node checks
(e.g. pod affinity).

Stumbled upon it while working on e2e test for #31136
2016-09-26 22:14:28 +03:00
Michail Kargakis
0a843a50ba controller: don't retry deployments with overlapping selectors
Returning an error will cause the deployment to be requeued. We should
just emit an event for deployments with overlapping selectors and silently
drop then out of the queue. This should be transitioned to a Condition
once we have them.
2016-09-26 17:59:51 +02:00
Kubernetes Submit Queue
eed1e02346 Merge pull request #33012 from wojtek-t/informer_in_route_controller
Automatic merge from submit-queue

Use Informer framework in route controller
2016-09-26 06:56:06 -07:00
Jan Safranek
a54c9e2887 Refactor volume controller parameters into a structure
persistentvolumecontroller.NewPersistentVolumeController has 11 arguments now,
put them into a structure.

Also, rename NewPersistentVolumeController to NewController, persistentvolume
is already name of the package.

Fixes #30219
2016-09-26 14:15:25 +02:00
Jan Safranek
5ff1597cf9 Rename controller*.go to pv_controller*.go
To make log filtering easier. controller.go is used by several controllers and
matching logs for "pv_controller.*" is much better.
2016-09-26 12:26:58 +02:00
Kubernetes Submit Queue
4785f6f517 Merge pull request #31978 from jsafrane/detach-before-delete
Automatic merge from submit-queue

Do not report error when deleting an attached volume

Persistent volume controller should not send warning events to a PV and mark the PV as failed when the volume is still attached.

This happens when a user quickly deletes a pod and associated PVC - PV is slowly detaching, while the PVC is already deleted and the PV enters Failed phase.

`Deleter.Deleter` can now return `tryAgainError`, which is sent as INFO to the PV to let the user know we did not forget to delete the PV, however the PV stays in Released state. The controller tries again in the next sync (15 seconds by default).

Fixes #31511
2016-09-25 18:55:32 -07:00
Kubernetes Submit Queue
64777d37b6 Merge pull request #33268 from deads2k/client-14-rc-svc-lister
Automatic merge from submit-queue

simplify RC listers

Make the RC and SVC listers use the common list functions that more closely match client APIs, are consistent with other listers, and avoid unnecessary copies.
2016-09-23 23:37:15 -07:00
Kubernetes Submit Queue
071927a59d Merge pull request #32549 from smarterclayton/gc_non_kube_legacy
Automatic merge from submit-queue

Allow garbage collection to work against different API prefixes

The GC needs to build clients based only on Resource or Kind. Hoist the
restmapper out of the controller and the clientpool, support a new
ClientForGroupVersionKind and ClientForGroupVersionResource, and use the
appropriate one in both places.

Allows OpenShift to use the GC
2016-09-23 14:06:35 -07:00
Kubernetes Submit Queue
8152cfb9c3 Merge pull request #32670 from soltysh/cron_update
Automatic merge from submit-queue

Remove hacks from ScheduledJobs cron spec parsing 

Previusly `github.com/robfig/cron` library did not allow passing cron spec without seconds. First commit updates the library, which has additional method ParseStandard which follows the standard cron spec, iow. minute, hour, day of month, month, day of week.

@janetkuo @erictune as promised in #30227 I've updated the library and now I'm updating it in k8s
2016-09-23 13:27:16 -07:00
Kubernetes Submit Queue
331eb83585 Merge pull request #33376 from luxas/fix_arm_atomics_2
Automatic merge from submit-queue

Move HighWaterMark to the top of the struct in order to fix arm, second time

ref: #33117

Sorry for not fixing everyone at once, but I seriously wasn't prepared for that quick LGTM 😄, so here's the other half.

@lavalamp 

> lgtm, but seriously, this is terrible, we probably have this bug all over. And what if someone embeds the etcdWatcher struct in something else not at the top? We need the compiler to enforce things like this, it just can't be done manually. Can you file or link a golang issue for this?

I totally agree! There isn't currently a way of programmatically detecting this unfortunately.
I guess @davecheney or @minux can explain better to you why it's so hard.

This is noted in https://github.com/kubernetes/kubernetes/blob/master/docs/proposals/multi-platform.md as a corner case indeed.

@pwittrock This should be cherrypicked toghether with #33117
2016-09-23 12:05:09 -07:00
Kubernetes Submit Queue
0a4316f11e Merge pull request #32807 from jingxu97/stateupdateNeeded-9-15
Automatic merge from submit-queue

Fix race condition in setting node statusUpdateNeeded flag

This PR fixes the race condition in setting node statusUpdateNeeded flag
in master's attachdetach controller. This flag is used to indicate
whether a node status has been updated by the node_status_updater or
not. When updater finishes update a node status, it is set to false.
When the node status is changed such as volume is detached or new volume
is attached to the node, the flag is set to true so that updater can
update the status again. The previous workflow has a race condition as
follows
1. updater gets the currently attached volume list from the node which needs to be
updated.
2. A new volume A is attached to the same node right after 1 and set the
flag to TRUE
3. updater updates the node attached volume list (which does not include volume A) and then set the flag to FALSE.
The result is that volume A will be never added to the attached volume
list so at node side, this volume is never attached.

So in this PR, the flag is set to FALSE when updater tries to get the
attached volume list (as in an atomic operation). So in the above
example, after step 2, the flag will be TRUE again, in step 3, updater
does not set the flag if updates is sucessful. So after that, flag is
still TRUE and in next round of update, the node status will be updated.
2016-09-23 11:25:16 -07:00
Lucas Käldström
06917531b3 Move HighWaterMark to the top of the struct in order to fix arm, second time 2016-09-23 20:58:28 +03:00
deads2k
500959b70c fix RC lister 2016-09-23 08:12:03 -04:00
Kubernetes Submit Queue
b2aed32578 Merge pull request #33269 from deads2k/client-15-svc-lister
Automatic merge from submit-queue

simplify svc lister

trying to track down what killed the e2e tests.
2016-09-23 03:10:57 -07:00
Jing Xu
14cad206f5 Fix race conditino in setting node statusUpdateNeeded flag
This PR fixes the race condition in setting node statusUpdateNeeded flag
in master's attachdetach controller. This flag is used to indicate
whether a node status has been updated by the node_status_updater or
not. When updater finishes update a node status, it is set to false.
When the node status is changed such as volume is detached or new volume
is attached to the node, the flag is set to true so that updater can
update the status again. The previous workflow has a race condition as
follows
1. updater gets the currently attached volume list from the node which needs to be
updated.
2. A new volume A is attached to the same node right after 1 and set the
flag to TRUE
3. updater updates the node attached volume list (which does not include volume A) and then set the flag to FALSE.
The result is that volume A will be never added to the attached volume
list so at node side, this volume is never attached.

So in this PR, the flag is set to FALSE when updater tries to get the
attached volume list (as in an atomic operation). So in the above
example, after step 2, the flag will be TRUE again, in step 3, updater
does not set the flag if updates is sucessful. So after that, flag is
still TRUE and in next round of update, the node status will be updated.

This PR also changes a unit test due to the workflow changes
2016-09-22 14:02:30 -07:00
Kubernetes Submit Queue
e9f4db2748 Merge pull request #27714 from jsafrane/event-recycle
Automatic merge from submit-queue

Send recycle events from pod to pv.

This allows users to diagnose what's wrong with recycler. Recycler pods are started automatically with a cryptic name and they are deleted immediately when they finish.

e.g, `kubectl describe pv` could show that NFS cannot be mounted (and how many pods have tried it):

```
  FirstSeen     LastSeen        Count   From                            SubobjectPath   Type            Reason          Message
  ---------     --------        -----   ----                            -------------   --------        ------          -------
  59m           59m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(5421800e-347b-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  53m           53m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(3c9809e5-347c-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  46m           46m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(250dd2a2-347d-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  40m           40m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(0d84ea33-347e-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  33m           33m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(f5fb63bf-347e-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  27m           27m             1       {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(de7128fd-347f-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  1h            3m              75      {persistentvolume-controller }                  Normal          RecyclerPod     Recycler pod: Successfully assigned recycler-for-nfs to 127.0.0.1
  1h            3m              76      {persistentvolume-controller }                  Normal          RecyclerPod     Recycler pod: Pod was active on the node longer than specified deadline
  1h            1m              12      {persistentvolume-controller }                  Warning         RecyclerPod     Recycler pod: Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
  20m           1m              4       {persistentvolume-controller }                  Warning         RecyclerPod     (events with common reason combined)
```

These steps were necessary:

- added event watcher to volume.RecycleVolumeByWatchingPodUntilCompletion
- pass all these events through volume plugins to volume controller
- rework volume.RecycleVolumeByWatchingPodUntilCompletion unit tests to a table (too much copy-paste)
- fix all unit tests along the way
2016-09-22 12:18:53 -07:00
Clayton Coleman
97c35fcc67
Allow garbage collection to work against different API prefixes
The GC needs to build clients based only on Resource or Kind. Hoist the
restmapper out of the controller and the clientpool, support a new
ClientForGroupVersionKind and ClientForGroupVersionResource, and use the
appropriate one in both places.
2016-09-22 15:00:58 -04:00
Kubernetes Submit Queue
4ab5a76338 Merge pull request #33103 from deads2k/controller-03-kill-non-generatedclient
Automatic merge from submit-queue

switch controller manager to generated clients

Switches the controller manager to generated clients.

@ncdc ptal
2016-09-22 11:37:01 -07:00
Kubernetes Submit Queue
6e25117891 Merge pull request #32655 from dshulyak/fix_node_fake_update
Automatic merge from submit-queue

Fix FakeNodeHandler Update behaviour

Two problems:
1. Get is always using Existing nodes slice, and you will for sure miss any updated data
2. Each Update adds a duplicate node entry to UpdatedNodes slice

For the 1st, we will try to find a node in UpdatedNodes slice (same as for the List).
2nd - append only if there is no node with same name as updated, if there is we will replace object in UpdatedNodes slice.
2016-09-22 07:43:18 -07:00
deads2k
7ee5b26ad1 incorrect key determination 2016-09-22 09:55:24 -04:00