kubernetes

Author	SHA1	Message	Date
gmarek	7cac170214	AllocateOrOccupyCIDR returns quickly	2016-05-31 09:11:42 +02:00
k8s-merge-robot	d1277e34fd	Merge pull request #25913 from pweil-/ds-tombstone Automatic merge from submit-queue daemonset handle DeletedFinalStateUnknown During an e2e run in OpenShift we ran into the DS controller panic when handling `DeletedFinalStateUnknown`. This PR checks for `DeletedFinalStateUnknown` and queues the embedded object if it is a `DaemonSet`. @mikedanese - would you mind taking a look? @deads2k ``` panic: interface conversion: interface is cache.DeletedFinalStateUnknown, not extensions.DaemonSet goroutine 4369 [running]: k8s.io/kubernetes/pkg/controller/daemon.func·005(0x2f8a0c0, 0xc20b559680) /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/daemon/controller.go:160 +0x50 k8s.io/kubernetes/pkg/controller/framework.ResourceEventHandlerFuncs.OnDelete(0xc20a0ae090, 0xc20a0ae0a0, 0xc20a0ae0b0, 0x2f8a0c0, 0xc20b559680) /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/framework/controller.go:178 +0x41 k8s.io/kubernetes/pkg/controller/framework.(ResourceEventHandlerFuncs).OnDelete(0xc20b8ebf20, 0x2f8a0c0, 0xc20b559680) <autogenerated>:25 +0xb5 k8s.io/kubernetes/pkg/controller/framework.func·001(0x2f8a280, 0xc20b5522e0, 0x0, 0x0) /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/framework/controller.go:248 +0x4be k8s.io/kubernetes/pkg/controller/framework.(Controller).processLoop(0xc20bb727e0) /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/framework/controller.go:122 +0x6f k8s.io/kubernetes/pkg/controller/framework.Controller.(k8s.io/kubernetes/pkg/controller/framework.processLoop)·fm() /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/framework/controller.go:97 +0x27 k8s.io/kubernetes/pkg/util/wait.func·001() /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/util/wait/wait.go:66 +0x61 
k8s.io/kubernetes/pkg/util/wait.JitterUntil(0xc209f8cfb8, 0x3b9aca00, 0x0, 0xc2080543c0) /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/util/wait/wait.go:67 +0x8f k8s.io/kubernetes/pkg/util/wait.Until(0xc209f8cfb8, 0x3b9aca00, 0xc2080543c0) /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/util/wait/wait.go:47 +0x4a k8s.io/kubernetes/pkg/controller/framework.(Controller).Run(0xc20bb727e0, 0xc2080543c0) /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/framework/controller.go:97 +0x1fb created by k8s.io/kubernetes/pkg/controller/daemon.(DaemonSetsController).Run /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/daemon/controller.go:212 +0xae ``` https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_origin_check/1002/artifact/origin/artifacts/test-cmd/logs/openshift.log	2016-05-30 17:54:17 -07:00
k8s-merge-robot	9aeeef1d81	Merge pull request #26414 from jsafrane/reduce-sync-period Automatic merge from submit-queue Reduce volume controller sync period fixes #24236 and most probably also fixes #25294. Needs #25881! With the cache, binder is not affected by sync period. Without the cache, binding of 1000 PVCs takes more than 5 minutes (instead of ~70 seconds). 15 seconds were chosen by fair 2d10 roll :-)	2016-05-30 05:54:51 -07:00
k8s-merge-robot	5643b7498f	Merge pull request #25881 from jsafrane/devel/pv-add-cache Automatic merge from submit-queue volume controller: Add cache with the latest version of PVs and PVCs When the controller binds a PV to PVC, it saves both objects to etcd. However, there is still an old version of these objects in the controller Informer cache. So, when a new PVC comes, the PV is still seen as available and may get bound to the new PVC. This will be blocked by etcd, still, it creates unnecessary traffic that slows everything down. To make everything worse, when periodic sync with the old PVC is performed, this PVC is seen by the controller as Pending (while it's already Bound on etcd) and will be bound to a different PV. Writing to this PV won't be blocked by etcd, only subsequent write of the PVC fails. So, the controller will need to roll back the PV in another transaction(s). The controller can keep itself pretty busy this way. Also, we save bound PVs (and PVCs) as two transactions - we save say PV.Spec first and then .Status. The controller gets "PV.Spec updated" event from etcd and tries to fix the Status, as it seems to the controller it's outdated. This write again fails - there already is a correct version in etcd. As we can't influence the Informer cache, it is read-only to the controller, this patch introduces a second cache in the controller, which holds the latest and greatest version of PVs and PVCs to prevent these useless writes to etcd. It gets updated with events from etcd and after etcd confirms successful save of PV/PVC modified by the controller. The cache stores only pointers to PVs/PVCs, so in ideal case it shares the actual object data with the informer cache. They will diverge only for a short time when the controller modifies something and the informer cache did not get update events yet. @kubernetes/sig-storage	2016-05-30 04:13:18 -07:00
Jan Safranek	2aa9f1dd8f	Reduce volume controller sync period	2016-05-30 09:59:31 +02:00
k8s-merge-robot	577cdf937d	Merge pull request #26415 from wojtek-t/network_not_ready Automatic merge from submit-queue Add a NodeCondition "NetworkUnavailable" to prevent scheduling onto a node until the routes have been created This is a new version of #26267 (based on top of that one). The new workflow is: - we have a "NetworkNotReady" condition - when Kubelet creates a node, it sets the condition to "true" - RouteController will set it to "false" when the route is created - Scheduler schedules only on nodes that don't have a "NetworkNotReady == true" condition @gmarek @bgrant0607 @zmerlynn @cjcullen @derekwaynecarr @danwinship @dcbw @lavalamp @vishh	2016-05-29 03:06:59 -07:00
k8s-merge-robot	a550cf16b9	Merge pull request #25826 from freehan/svcsourcerange Automatic merge from submit-queue promote sourceRange into service spec @thockin one more for your pile I will add docs at `http://releases.k8s.io/HEAD/docs/user-guide/services-firewalls.md` cc: @justinsb Fixes: #20392	2016-05-28 02:20:13 -07:00
k8s-merge-robot	7fae9c14e2	Merge pull request #25662 from deads2k/prevent-hotloop Automatic merge from submit-queue prevent namespace cleanup hotloop Found chasing a sentry report. Looks like we hot-loop on namespace deletion failures. @derekwaynecarr ptal	2016-05-28 01:30:51 -07:00
Alex Robinson	d577550dd0	Merge pull request #26054 from gmarek/flags Make service-range flag in controller-manager optional	2016-05-27 14:26:15 -07:00
Wojciech Tyczynski	be1b57100d	Change to NotReadyNetworking and use in scheduler	2016-05-27 19:32:49 +02:00
gmarek	7bdf480340	Node is NotReady until the Route is created	2016-05-27 19:29:51 +02:00
Alex Robinson	7522389d8d	Merge pull request #26207 from zmerlynn/fix-unneeded-updated nodecontroller: Fix log message on successful update	2016-05-27 09:56:28 -07:00
saadali	3c345abafd	Fix DATA RACE in unit tests: reconciler_test.go	2016-05-27 01:19:25 -07:00
Alex Mohr	9803393a67	Merge pull request #25960 from jsafrane/do-not-sort-bind volume controller: Speed up binding by not sorting volumes	2016-05-26 15:47:14 -07:00
Alex Mohr	edda837142	Merge pull request #25599 from caesarxuchao/orphaning-finalizer Add orphaning finalizer logic to GC	2016-05-26 13:19:19 -07:00
Minhan Xia	a1bd33f510	promote sourceRange into service spec	2016-05-26 10:42:30 -07:00
Wojciech Tyczynski	aa65a7974a	Spread creating routes over time and retry on failures	2016-05-26 13:00:53 +02:00
k8s-merge-robot	98766f4548	Merge pull request #26301 from zmerlynn/wait_proper Automatic merge from submit-queue routecontroller: Add wait.NonSlidingUntil, use it Make sure the reconciliation loop kicks in again immediately if it takes a loooooong time.	2016-05-26 03:29:21 -07:00
k8s-merge-robot	bda0dc88aa	Merge pull request #25457 from saad-ali/expectedStateOfWorldDataStructure Automatic merge from submit-queue Attach Detach Controller Business Logic This PR adds the meat of the attach/detach controller proposed in #20262. The PR splits the in-memory cache into a desired and actual state of the world.	2016-05-26 00:41:54 -07:00
k8s-merge-robot	da7d3c189a	Merge pull request #25869 from jsafrane/devel/operation-logs Automatic merge from submit-queue volume controller: use better operation names Using volume/claim.UID in the operation name is not really useful, as UIDs are not logged by rest of the controller. On the other hand, volume.Name and claim.Namespace/Name is logged pretty often and it would help to log these also in operation name. Still, I'd prefer to have the operation name really unique to be protected from users deleting a volume and quickly creating another one with the same name, so UID is still part of the operation name. This has been already proven to be very useful in controller debugging.	2016-05-25 17:58:07 -07:00
Zach Loafman	cb69960742	nodecontroller: Fix log message on successful update	2016-05-25 14:44:15 -07:00
k8s-merge-robot	70a71990d4	Merge pull request #26123 from brendandburns/flaker Automatic merge from submit-queue Add some extra checking in the tests to prevent flakes. Attempts to fix https://github.com/kubernetes/kubernetes/issues/25967 The hypothesis is that somehow waitTest() catches an idle that occurs before all changes have been applied. This will block until the expected number of changes have arrived.	2016-05-25 14:29:48 -07:00
Zach Loafman	3ec25c5425	routecontroller: Add wait.NonSlidingUntil, use it Make sure the reconciliation loop kicks in again immediately if it takes a loooooong time.	2016-05-25 13:58:35 -07:00
k8s-merge-robot	4e8e4a574c	Merge pull request #25636 from zhouhaibing089/delnode-fix Automatic merge from submit-queue use monotonic now in TestDelNode Fixes https://github.com/kubernetes/kubernetes/issues/24971. Briefly, the rate_limited_queue uses a `container/heap` to store values, and uses this data structure to ensure we can always fetch the value with the minimum `processAt`. However, under some extreme conditions, consecutive calls to `time.Now()` return the same value, which causes an unpredictable order in the queue. This fix uses a monotonic `now()` to avoid that. @smarterclayton please take a look.	2016-05-25 13:33:31 -07:00
saadali	92500a20d7	Attach detach controller business logic added Split controller cache into actual and desired state of world. Controller will only operate on volumes scheduled to nodes that have the "volumes.kubernetes.io/controller-managed-attach" annotation.	2016-05-24 23:01:16 -07:00
Chao Xu	1665546d2d	add finalizer logics to the API server and the garbage collector; handling DeleteOptions.OrphanDependents in the API server	2016-05-24 13:07:28 -07:00
Brendan Burns	88663fc58b	Add some extra checking in the tests to prevent flakes.	2016-05-23 16:25:02 -07:00
gmarek	08385b2c5f	Make service-range flag in controller-manager optional	2016-05-23 09:37:53 +02:00
gmarek	1d89d2f2d2	Add few log lines to NodeController	2016-05-23 08:49:11 +02:00
k8s-merge-robot	b84730ba16	Merge pull request #25748 from derekwaynecarr/hotloop_quota Automatic merge from submit-queue ResourceQuota controller uses rate limiter to prevent hot-loops in error situations Have resource quota controller use a rate limited queue to prevent hot-looping in error situations.	2016-05-22 15:45:03 -07:00
k8s-merge-robot	1b78799b3b	Merge pull request #25768 from piosz/metrics-api-hpa Automatic merge from submit-queue Use Metrics API in HPA	2016-05-22 13:58:07 -07:00
k8s-merge-robot	62a8394eb4	Merge pull request #25263 from jsafrane/devel/adopt-recycle-pod Automatic merge from submit-queue volume recycler: Don't start a new recycler pod if one already exists. Recycling is a long duration process and when the recycler controller is restarted in the meantime, it should not start a new recycler pod if there is one already running. This means that the recycler pod must have deterministic name based on name of the recycled PV, we then get name conflicts when creating the pod. Two things need to be changed: - recycler controller and recycler plugins must pass the PV.Name to the place where the pod is created. This is most of the patch and it should be pretty straightforward. - create recycler pod with deterministic name and check "already exists" error. When at it, remove useless 'resourceVersion' argument and make log messages start with lowercase. There is a unit test to check the behavior + there is an e2e test that checks that regular recycling is not broken (it does not try to run two recycler pods in parallel as the recycler is single-threaded now).	2016-05-21 02:28:26 -07:00
Piotr Szczesniak	26ad827893	Use Metrics API in HPA	2016-05-20 19:50:56 +02:00
mqliang	17d5a302bb	make podcidr mask size configurable	2016-05-20 20:44:40 +08:00
mqliang	cf7a3475f3	Don't allow node controller to allocate into service CIDR range	2016-05-20 20:44:40 +08:00
mqliang	69b8453fa0	cidr allocator	2016-05-20 20:44:40 +08:00
k8s-merge-robot	3b0a6dac1f	Merge pull request #25571 from gmarek/nodecontroller Automatic merge from submit-queue NodeController doesn't evict Pods if no Nodes are Ready Fix #13412 #24597 When NodeController doesn't see any Ready Node, it goes into "network segmentation mode". In this mode it cancels all evictions and doesn't evict any Pods. It leaves network segmentation mode when it sees at least one Ready Node. When leaving, it resets all timers, so each Node has a full grace period to reconnect to the cluster. cc @lavalamp @davidopp @mml @wojtek-t @fgrzadkowski	2016-05-20 05:31:34 -07:00
k8s-merge-robot	bd8033e2b0	Merge pull request #25864 from jsafrane/devel/pv-fix-log Automatic merge from submit-queue volume controller: Fix method name in a log message It's deleteVolume, not deleteClaim. @kubernetes/sig-storage	2016-05-20 03:53:22 -07:00
Jan Safranek	c7da3abd5b	volume controller: Speed up binding by not sorting volumes The binder sorts all available volumes first, then it filters out volumes that cannot be bound by processing each volume in a loop and then finds the smallest matching volume by binary search. So, if we process every available volume in a loop, we can also remember the smallest matching one and save us potentially long sorting (and quick binary search).	2016-05-20 12:26:39 +02:00
Daniel Smith	5448400b1c	Merge pull request #25243 from smarterclayton/explore_quantity Provide an int64 version of Quantity that is much faster	2016-05-19 16:56:48 -07:00
Paul Weil	4d6fee74d0	daemonset handle DeletedFinalStateUnknown	2016-05-19 17:16:34 -04:00
Jan Safranek	0279232360	volume controller: Add cache with the latest version of PVs and PVCs When the controller binds a PV to PVC, it saves both objects to etcd. However, there is still an old version of these objects in the controller Informer cache. So, when a new PVC comes, the PV is still seen as available and may get bound to the new PVC. This will be blocked by etcd, still, it creates unnecessary traffic that slows everything down. Also, we save bound PV/PVC as two transactions - we save PV/PVC.Spec first and then .Status. The controller gets "PV/PVC.Spec updated" event from etcd and tries to fix the Status, as it seems to the controller it's outdated. This write again fails - there already is a correct version in etcd. We can't influence the Informer cache, it is read-only to the controller. To prevent these useless writes to etcd, this patch introduces a second cache in the controller, which holds the latest and greatest version of PVs and PVCs. It gets updated with events from etcd and after etcd confirms successful save of PV/PVC modified by the controller. The cache stores only pointers to PVs/PVCs, so in ideal case it shares the actual object data with the informer cache. They will diverge only when the controller modifies something and the informer cache did not get update events yet.	2016-05-19 16:09:06 +02:00
Clayton Coleman	5e4308f91d	Update use of Quantity in other classes	2016-05-19 08:41:43 -04:00
Jan Safranek	e9a6ec29a0	volume controller: use better operation names Using volume/claim.UID in the operation name is not really useful, as UIDs are not logged by rest of the controller. On the other hand, volume.Name and claim.Namespace/Name is logged pretty often and it would help to log these also in operation name. This has been already proven to be very useful in controller debugging.	2016-05-19 14:19:33 +02:00
Robert Rati	e388c137bb	Separate sync and list functionality in the reflector. #23394	2016-05-19 07:41:24 -04:00
Jan Safranek	0ee9160f88	volume recycler: Don't start a new recycler pod if one already exists. Recycling is a long duration process and when the recycler controller is restarted in the meantime, it should not start a new recycler pod if there is one already running. This means that the recycler pod must have deterministic name based on name of the recycled PV, we then get name conflicts when creating the pod. Two things need to be changed: - recycler controller and recycler plugins must pass the PV.Name to the place where the pod is created. - create recycler pod with deterministic name and check "already exists" error. When at it, remove useless 'resourceVersion' argument and make log messages start with lowercase.	2016-05-19 12:58:25 +02:00
Jan Safranek	61d630ddf7	volume controller: Fix method name in a log message It's deleteVolume, not deleteClaim.	2016-05-19 12:54:17 +02:00
k8s-merge-robot	c63ac4e664	Merge pull request #24331 from jsafrane/devel/refactor-binder Automatic merge from submit-queue Refactor persistent volume controller Here is complete persistent controller as designed in https://github.com/pmorie/pv-haxxz/blob/master/controller.go It's feature complete and compatible with current binder/recycler/provisioner. No new features, it should be much more stable and predictable. Testing -- The unit test framework is quite complicated, still it was necessary to reach reasonable coverage (78% in `persistentvolume_controller.go`). The untested part are error cases, which are quite hard to test in reasonable way - sure, I can inject a VersionConflictError on any object update and check the error bubbles up to appropriate places, but the real test would be to run `syncClaim`/`syncVolume` again and check it recovers appropriately from the error in the next periodic sync. That's the hard part. Organization --- The PR starts with `rm -rf kubernetes/pkg/controller/persistentvolume`. I find it easier to read when I see only the new controller without old pieces scattered around. [`types.go` from the old controller is reused to speed up matching a bit, the code looks solid and has 95% unit test coverage]. I tried to split the PR into smaller patches, let me know what you think. ~~TODO~~ -- * ~~Missing: provisioning, recycling~~. * ~~Fix integration tests~~ * ~~Fix e2e tests~~ @kubernetes/sig-storage Fixes #15632	2016-05-19 03:06:46 -07:00
k8s-merge-robot	4f09f51486	Merge pull request #24800 from thockin/validation_pt8-3 Automatic merge from submit-queue Make name validators return string slices Part of the larger validation PR, broken out for easier review and merge. Builds on previous PRs in the series.	2016-05-19 02:15:27 -07:00
Daniel Smith	6dc1437015	Merge pull request #25671 from deads2k/fix-add-indexer make addIndexers safe for sharedInformer	2016-05-18 14:48:43 -07:00
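
Commit c7da3abd5b in the log above describes replacing "sort, filter, binary search" with a single loop that remembers the smallest matching volume. The following is only a minimal sketch of that idea, not the controller's actual code: the `volume` type, its fields, and `findSmallestMatch` are hypothetical names introduced for illustration.

```go
package main

// volume is a hypothetical stand-in for a PersistentVolume; only the
// fields needed for this illustration are modelled.
type volume struct {
	Name     string
	Capacity int64 // size in bytes
	Bound    bool  // true if the volume is already bound to a claim
}

// findSmallestMatch scans the volumes once and returns the smallest
// unbound volume whose capacity is at least the requested size.
// The second return value is false when no volume qualifies.
func findSmallestMatch(volumes []volume, requested int64) (volume, bool) {
	var best volume
	found := false
	for _, v := range volumes {
		// Filter step: skip volumes that cannot satisfy the claim.
		if v.Bound || v.Capacity < requested {
			continue
		}
		// Remember the smallest candidate seen so far, so no sort is needed.
		if !found || v.Capacity < best.Capacity {
			best = v
			found = true
		}
	}
	return best, found
}
```

Because every volume has to be visited for filtering anyway, tracking the minimum in the same pass costs O(n) instead of the O(n log n) spent on sorting first.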

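Pull request #25636 in the log above mentions using a monotonic `now()` so that heap entries never share a `processAt` timestamp. How the actual fix implements this is not shown in the log; the sketch below is one possible approach under that assumption, and `monotonicClock` is a hypothetical name.

```go
package main

import "time"

// monotonicClock is a hypothetical helper that hands out strictly
// increasing timestamps, even if time.Now() returns the same value twice.
type monotonicClock struct {
	last time.Time // most recent timestamp handed out
}

// Now returns the current time, nudged forward by one nanosecond past
// the previously returned value whenever the wall clock has not advanced.
func (c *monotonicClock) Now() time.Time {
	t := time.Now()
	if !t.After(c.last) {
		t = c.last.Add(time.Nanosecond)
	}
	c.last = t
	return t
}
```

The helper is not safe for concurrent use as written; a test that drives the queue from a single goroutine would not need locking, but shared use would.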

1095 Commits
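
The panic quoted in pull request #25913 above comes from a delete handler that type-asserts directly to a DaemonSet and receives a `cache.DeletedFinalStateUnknown` tombstone instead. The sketch below illustrates the unwrapping pattern the PR describes. To stay self-contained it uses local stand-in types rather than the real Kubernetes packages: `daemonSet`, `deletedFinalStateUnknown`, and `daemonSetFromDelete` are all hypothetical names, and the stand-in only mirrors the `Key` and `Obj` fields of the real tombstone type.

```go
package main

// daemonSet is a hypothetical stand-in for extensions.DaemonSet.
type daemonSet struct {
	Name string
}

// deletedFinalStateUnknown is a local stand-in for
// cache.DeletedFinalStateUnknown: a tombstone that wraps the last known
// state of an object whose delete event was missed.
type deletedFinalStateUnknown struct {
	Key string
	Obj interface{}
}

// daemonSetFromDelete extracts a DaemonSet from the argument of a delete
// handler. It accepts either the object itself or a tombstone wrapping it,
// and reports false instead of panicking on anything else.
func daemonSetFromDelete(obj interface{}) (*daemonSet, bool) {
	switch t := obj.(type) {
	case *daemonSet:
		return t, true
	case deletedFinalStateUnknown:
		// Unwrap the tombstone and check the embedded object's type.
		ds, ok := t.Obj.(*daemonSet)
		return ds, ok
	default:
		return nil, false
	}
}
```

A handler written this way can queue the embedded object when the lookup succeeds and log and return otherwise, which avoids the interface-conversion panic shown in the trace.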