Commit Graph

2022 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
c40a668ae4 Merge pull request #40081 from kargakis/cleanup-policy-fix
Automatic merge from submit-queue

controller: decouple cleanup policy from deployment strategies

Deployments get cleaned up only when they are paused, they get scaled up/down,
or when the strategy that drives rollouts completes. This means that stuck
deployments that fall into none of the above categories will not get cleaned
up. Since cleanup is already safe by itself (we only delete old replica sets
that are synced by the replica set controller and have no replicas) we can
execute it for every deployment when there is no intention to rollback.

Fixes https://github.com/kubernetes/kubernetes/issues/40068
2017-01-19 04:35:39 -08:00
Michail Kargakis
d5227e364d controller: decouple cleanup policy from deployment strategies
Deployments get cleaned up only when they are paused, they get scaled up/down,
or when the strategy that drives rollouts completes. This means that stuck
deployments that fall into none of the above categories will not get cleaned
up. Since cleanup is already safe by itself (we only delete old replica sets
that are synced by the replica set controller and have no replicas) we can
execute it for every deployment when there is no intention to rollback.
2017-01-19 10:33:24 +01:00
Wojciech Tyczynski
d08abdb187 Allow for returning map[string]interface{} from patch. 2017-01-18 11:53:30 +01:00
Clayton Coleman
bcde05753b
Correct import statements 2017-01-17 16:18:18 -05:00
Clayton Coleman
660095776a
generated: staging 2017-01-17 16:17:20 -05:00
Clayton Coleman
9a2a50cda7
refactor: use metav1.ObjectMeta in other types 2017-01-17 16:17:19 -05:00
Clayton Coleman
36acd90aba
Move APIs and core code to use metav1.ObjectMeta 2017-01-17 16:17:18 -05:00
Kubernetes Submit Queue
c0a1fa73f5 Merge pull request #39939 from resouer/statefulset
Automatic merge from submit-queue (batch tested with PRs 34763, 38706, 39939, 40020)

Use Statefulset instead in e2e and controller

Quick fix ref: #35534

We should finish the issue to meet v1.6 milestone.
2017-01-17 09:14:51 -08:00
deads2k
f31ecdd0f7 generated changes 2017-01-17 08:32:05 -05:00
deads2k
26c46971f2 move PatchType to apimachinery 2017-01-17 08:32:05 -05:00
Kubernetes Submit Queue
f0b0cd0399 Merge pull request #39945 from sttts/sttts-cutoff-pkg-serviceaccount-dep
Automatic merge from submit-queue

genericapiserver: cut off pkg/serviceaccount dependency

**Blocked** by pkg/api/validation/genericvalidation to be split up and moved into apimachinery.
2017-01-17 03:09:21 -08:00
Harry Zhang
a88cbdc52d Update bazel 2017-01-17 16:55:06 +08:00
Kubernetes Submit Queue
9d2fce7c22 Merge pull request #39608 from peay/cronjob-too-many-times-to-list
Automatic merge from submit-queue

Do not list CronJob unmet starting times beyond deadline

**What this PR does / why we need it**:

See #36311. `getRecentUnmetScheduleTimes` gives up after 100 unmet times to avoid wasting too much CPU or memory generating all the times, as it generates them sequentially.

When concurrency is forbidden, this is conceptually un-necessary: we only need the last unmet start time. This suggests that when concurrency is forbidden, we could generate times by going backward in time from now. This is not very practical as CronJob currently relies on a package that only provides `Next` and no `Prev`. Hand-cooking a `Prev` does not seem like a good idea. I could submit a PR to the cron library to add a `Prev` method, and use that when concurrency is forbidden through something like `getLastUnmetScheduleTime`. This would be `O(1)` and there would be no limit involved.

(edit: actually, even for the other concurrency settings, we only start the last unmet start times -- there is a `TODO` in the controller to actually start all of them, but that is not implemented at the moment. This means the solution would apply, at least temporarily, to all concurrency settings).

cc @soltysh what do you think?

In the meantime, I would suggest to do something simple. Currently, the user has no way to configure anything to ensure that his CronJob will not get stuck if one job takes more that 100 unmet times.

 `getRecentUnmetScheduleTimes` starts with an initial time corresponding to the last start (or to the creation of the CronJob, if nothing has started yet). However, when `StartingDeadlineSeconds` is set, the controller will not start anything that is older than the deadline, so if the last start is way beyond the deadline, we are generating potentially lots of unmet start times that will not be considered by the scheduler for scheduling anyway.

Consider a job running every minute, where the last instance has taken 120 minutes. This means there are more than 100 unmet times when we start counting from the last start time.

**The PR makes `getRecentUnmetScheduleTimes` only consider times that do not fall beyond the deadline.** Here, the CronJob can be configured with a `StartingDeadlineSeconds` of, say, 10 minutes. After the 120min job has run, `getRecentUnmetScheduleTimes` will only consider the times in the last 10 minutes from now, and will not get stuck.

As a side note on the max. number of unmet times to use as limits in terms of CPU used by the controller: I have run a quick benchmark on my i7 mac. Schedules corresponding to "once a week" tend to be more expensive to generate unmet times for. Just FYI.

```
+--------------+---------------+--------------+
|   SCHEDULE   | MISSED STARTS |    TIMING    |
+--------------+---------------+--------------+
| */1 * * * ?  |           100 | 383.645µs    |
| */30 * * * ? |           100 | 354.765µs    |
| 30 1 * * ?   |           100 | 1.065124ms   |
| 30 1 * * 0   |           100 | 1.80034ms    |
| */1 * * * ?  |           500 | 1.341365ms   |
| */30 * * * ? |           500 | 1.814441ms   |
| 30 1 * * ?   |           500 | 8.475012ms   |
| 30 1 * * 0   |           500 | 10.020613ms  |
| */1 * * * ?  |          1000 | 2.551697ms   |
| */30 * * * ? |          1000 | 4.075813ms   |
| 30 1 * * ?   |          1000 | 17.674945ms  |
| 30 1 * * 0   |          1000 | 19.149324ms  |
| */1 * * * ?  |         10000 | 25.725531ms  |
| */30 * * * ? |         10000 | 87.520022ms  |
| 30 1 * * ?   |         10000 | 174.29216ms  |
| 30 1 * * 0   |         10000 | 196.565748ms |
+--------------+---------------+--------------+
```

using

```.go
package main

import (
    "fmt"
    "time"
    "os"
    "strconv"

    "github.com/robfig/cron"
    "github.com/olekukonko/tablewriter"
)

func timeSchedule(schedule string, iterations int) (time.Duration) {
    sched, err := cron.ParseStandard(schedule)

    if err != nil {
        panic(fmt.Sprintf("Unparseable schedule: %s", err))
    }

    start := time.Now()
    t := time.Now()

    for i := 1; i <= iterations; i++ {
        t = sched.Next(t)
    }

    return time.Since(start)
}

func main() {
    table := tablewriter.NewWriter(os.Stdout)
    table.SetHeader([]string{"Schedule", "Missed starts", "Timing"})

    schedules := []string{"*/1 * * * ?", "*/30 * * * ?", "30 1 * * ?", "30 1 * * 0"}
    iteration_nums := []int{100, 500, 1000, 10000}

    for _, iterations := range iteration_nums {
        for _, schedule := range schedules {
            table.Append([]string{schedule,
                                  strconv.Itoa(iterations),
                                  timeSchedule(schedule, iterations).String()})
        }
    }
    table.Render()
}
```

**Which issue this PR fixes**: fixes #36311

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-01-17 00:41:45 -08:00
Dr. Stefan Schimanski
bf307d9948 genericapiserver: cut off pkg/serviceaccount dependency 2017-01-17 09:36:10 +01:00
Harry Zhang
b8678ad130 Use statefulset instead in controller
Rename e2e folder to statefulset
2017-01-17 10:36:37 +08:00
Kubernetes Submit Queue
f74b4bbbad Merge pull request #38094 from yarntime/fix_update_typo
Automatic merge from submit-queue

fix typos

fix typos.
2017-01-16 18:22:33 -08:00
deads2k
8686d67c80 move pkg/util/rand 2017-01-16 16:04:03 -05:00
Kubernetes Submit Queue
6defc30337 Merge pull request #39882 from deads2k/api-59-errors
Automatic merge from submit-queue (batch tested with PRs 38592, 39949, 39946, 39882)

move api/errors to apimachinery

`pkg/api/errors` is a set of helpers around `meta/v1.Status` that help to create and interpret various apiserver errors.  Things like `.NewNotFound` and `IsNotFound` pairings.  This pull moves it into apimachinery for use by the clients and servers.

@smarterclayton @lavalamp First commit is the move plus minor fitting.  Second commit is straight replace and generation.
2017-01-16 10:37:42 -08:00
deads2k
77b4d55982 mechanical 2017-01-16 09:35:12 -05:00
Dr. Stefan Schimanski
918868b115 genericapiserver: cut off certificates api dependency 2017-01-16 14:10:59 +01:00
Kubernetes Submit Queue
eb9f953496 Merge pull request #39876 from deads2k/generic-20-deps-03
Automatic merge from submit-queue

move more things to apiserver

```
pkg/genericapiserver/api/handlers/negotiation/ -> apiserver/pkg/handlers/negotiation
pkg/genericapiserver/api/metrics -> apiserver/pkg/metrics
pkg/genericapiserver/api/request -> apiserver/pkg/request
pkg/util/wsstream -> apiserver/pkg/util/wsstream
plugin/pkg/auth/authenticator/request/headerrequest -> apiserver/pkg/authentication/request/headerrequest
plugin/pkg/webhook -> apiserver/pkg/webhook
```

and mechanicals.

`k8s.io/kubernetes/pkg/genericapiserver/routes/data/swagger` needs to be sorted out.
2017-01-16 04:14:37 -08:00
peay
d141a43d86 Do not list CronJob unmet starting times beyond deadline 2017-01-15 12:29:20 -05:00
Kubernetes Submit Queue
0ca72d110d Merge pull request #39655 from xychu/typo-in-quota-ctr
Automatic merge from submit-queue

Fix typo in resource quota controller comments
2017-01-15 01:14:38 -08:00
Kubernetes Submit Queue
a9f5065833 Merge pull request #39794 from kargakis/updated-unit-tests
Automatic merge from submit-queue

Updated unit tests

@janetkuo updated the flaky unit test to have the same structure with regard to uncasting as the rest of the tests. ptal
2017-01-13 18:39:55 -08:00
Kubernetes Submit Queue
5723979b60 Merge pull request #39525 from kargakis/update-equality-helper
Automatic merge from submit-queue (batch tested with PRs 39807, 37505, 39844, 39525, 39109)

Update deployment equality helper

@mfojtik @janetkuo this is split out of https://github.com/kubernetes/kubernetes/pull/38714 to reduce the size of that PR, ptal
2017-01-13 13:40:45 -08:00
Kubernetes Submit Queue
6b5d82b512 Merge pull request #37505 from k82cn/use_controller_inf
Automatic merge from submit-queue (batch tested with PRs 39807, 37505, 39844, 39525, 39109)

Made cache.Controller to be interface.

**What this PR does / why we need it**:

#37504
2017-01-13 13:40:41 -08:00
deads2k
31b6ba4e94 mechanicals 2017-01-13 16:33:09 -05:00
Kubernetes Submit Queue
a6fa5c2bfd Merge pull request #39814 from deads2k/api-58-multi-register
Automatic merge from submit-queue

replace global registry in apimachinery with global registry in k8s.io/kubernetes

We'd like to remove all globals, but our immediate problem is that a shared registry between k8s.io/kubernetes and k8s.io/client-go doesn't work.  Since client-go makes a copy, we can actually keep a global registry with other globals in pkg/api for now.

@kubernetes/sig-api-machinery-misc @lavalamp @smarterclayton @sttts
2017-01-13 12:37:02 -08:00
deads2k
f1176d9c5c mechanical repercussions 2017-01-13 08:27:14 -05:00
Michail Kargakis
9c4195c50b Fix and tests for SelectorUpdatedBefore 2017-01-13 10:23:08 +01:00
Michail Kargakis
e2695d9d05 controller: unit tests for overlapping and recreate deployments 2017-01-13 10:21:51 +01:00
Klaus Ma
25fe1e0d82 Made cache.Controller to be interface. 2017-01-13 13:33:23 +08:00
Kubernetes Submit Queue
27500e135b Merge pull request #39468 from NickrenREN/node-status-update
Automatic merge from submit-queue (batch tested with PRs 39661, 39740, 39801, 39468, 39743)

fix nodeStatusUpdateRetry count exceeding condition judgement

When tryUpdateNodeStatus() return err,err!=nil,  but nc.kubeClient.Core().Nodes().Get() return no err, err==nil,
And we run nodeStatusUpdateRetry times, when for loop ends, err == nil, we can not print error info and run continue, so maybe the condition judgement is not right
Maybe caused #38671
2017-01-12 13:58:29 -08:00
Timothy St. Clair
fbc5323dad Refactor registry to use store vs. etcd 2017-01-12 09:23:38 -06:00
NickrenREN
0b94834b17 fix nodeStatusUpdateRetry count exceeding condition judgement
When tryUpdateNodeStatus() return err,err!=nil,  but nc.kubeClient.Core().Nodes().Get() return no err, err==nil,
And we run nodeStatusUpdateRetry times, when for loop ends, err == nil, we can not print error info and run continue, so the condition judgement is wrong.
2017-01-12 22:00:30 +08:00
Kubernetes Submit Queue
e73d66ce44 Merge pull request #37557 from sttts/sttts-update-ugorji
Automatic merge from submit-queue

Update ugorji/go/codec godep

In order to pick-up https://github.com/ugorji/go/issues/119 and to get rid of the workaround at https://github.com/kubernetes/kubernetes/pull/36909/files#diff-a09eb061a0fb0ef3c9ef9d696f1ad0b4R426.
2017-01-12 02:36:16 -08:00
Dawn Chen
3648eaae04 Revert "controller: unit tests for overlapping and recreate deployments" 2017-01-11 17:33:46 -08:00
Kubernetes Submit Queue
1747db8c11 Merge pull request #38787 from mikedanese/ds-fix2
Automatic merge from submit-queue (batch tested with PRs 39483, 39088, 38787)

daemonset: differentiate between cases in nodeShouldRun

specifically we need to differentiate between wanting to run,
should run and should continue running. This is required to
support all taint effects and will improve reporting and end
user debuggability.

fixes https://github.com/kubernetes/kubernetes/issues/28839 among other things
2017-01-11 15:35:48 -08:00
Kubernetes Submit Queue
9eb7060892 Merge pull request #39088 from kargakis/unit-tests-for-the-d-controller
Automatic merge from submit-queue (batch tested with PRs 39483, 39088, 38787)

controller: unit tests for overlapping and recreate deployments

Belated unit tests for https://github.com/kubernetes/kubernetes/pull/38080 and https://github.com/kubernetes/kubernetes/pull/36748.

@kubernetes/sig-apps-misc
2017-01-11 15:35:46 -08:00
Mike Danese
df0f4bd41e add table test for should run predicates 2017-01-11 13:37:48 -08:00
Mike Danese
c518e89042 daemonset: differentiate between cases in nodeShouldRun
secifically we need to differentiate between wanting to run,
should run and should continue running. This is required to
support all taint effects and will improve reporting and end
user debuggability.
2017-01-11 13:37:47 -08:00
Dr. Stefan Schimanski
2741eb7fdb Update generated files 2017-01-11 21:54:07 +01:00
Michail Kargakis
6013186ac3 Update deployment equality helper 2017-01-11 18:34:12 +01:00
deads2k
6a4d5cd7cc start the apimachinery repo 2017-01-11 09:09:48 -05:00
yarntime@163.com
f7c737e8a9 fix typos 2017-01-11 16:08:20 +08:00
Kubernetes Submit Queue
1fbb22e115 Merge pull request #39702 from mikedanese/kubelet-csr
Automatic merge from submit-queue (batch tested with PRs 39684, 39577, 38989, 39534, 39702)

kubelet: request client auth certificates from certificate API.

This fixes kubeadm and --experiment-kubelet-bootstrap.

cc @liggitt
2017-01-10 22:24:17 -08:00
Kubernetes Submit Queue
bb7d07a33d Merge pull request #39694 from DirectXMan12/bug/hpa-panic
Automatic merge from submit-queue (batch tested with PRs 39694, 39383, 39651, 39691, 39497)

HPA Controller: Check for 0-sum request value

In certain conditions in which the set of metrics returned by Heapster
is completely disjoint from the set of pods returned by the API server,
we can have a request sum of zero, which can cause a panic (due to
division by zero).  This checks for that condition.

Fixes #39680

**Release note**:

```release-note
Fixes an HPA-related panic due to division-by-zero.
```
2017-01-10 21:25:10 -08:00
Mike Danese
d2032fd83c kubelet: request client auth certificates from certificate API. 2017-01-10 17:57:39 -08:00
Kubernetes Submit Queue
41689b15bd Merge pull request #34488 from mikedanese/signing-profile
Automatic merge from submit-queue

certificates: add a signing profile to the internal types

Here is a strawman of a CertificateSigningProfile type which would be used by the certificates controller when configuring cfssl. Side question: what magnitude of change warrants a design proposal?

@liggitt @gtank
2017-01-10 15:33:59 -08:00
Solly Ross
c830d94dc4 HPA Controller: Check for 0-sum request value
In certain conditions in which the set of metrics returned by Heapster
is completely disjoint from the set of pods returned by the API server,
we can have a request sum of zero, which can cause a panic (due to
division by zero).  This checks for that condition.

Fixes #39680
2017-01-10 17:26:13 -05:00