Automatic merge from submit-queue
Limit concurrent route creations
Ref #26119
This is supposed to improve 2 things:
- retry creating route in routecontroller in case of failure
- limit number of concurrent CreateRoute calls in flight.
We need something like this because GCE limits the number of concurrent in-flight CreateRoute requests.
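For illustration, a minimal sketch of the kind of in-flight limit meant here, using a buffered channel as a counting semaphore (the `createRoute` callback and the limit of 10 are assumptions, not the actual routecontroller code):
```
package main

import (
	"fmt"
	"sync"
)

// maxInFlight caps how many CreateRoute calls run concurrently; the value is
// illustrative, not the limit GCE actually enforces.
const maxInFlight = 10

// createRoutes fans out route creations but never has more than maxInFlight
// calls outstanding at once.
func createRoutes(routes []string, createRoute func(string) error) {
	sem := make(chan struct{}, maxInFlight)
	var wg sync.WaitGroup
	for _, r := range routes {
		wg.Add(1)
		go func(route string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot (blocks when maxInFlight calls are running)
			defer func() { <-sem }() // release the slot
			if err := createRoute(route); err != nil {
				fmt.Printf("error creating route %s (would be retried): %v\n", route, err)
			}
		}(r)
	}
	wg.Wait()
}

func main() {
	createRoutes([]string{"node-a", "node-b"}, func(route string) error {
		fmt.Println("creating route for", route)
		return nil
	})
}
```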
@gmarek @cjcullen
Automatic merge from submit-queue
Add release_1_3 clientset in update-codegen
Add release_1_3 clientset in update-codegen to keep it up to date; update the generated clientset.
Automatic merge from submit-queue
Increase goroutinemap unit test timeouts.
50ms is flaky in Jenkins. This makes the test take at least 0.5s to check that things that should block and wait for something really do block and wait (was 100ms before).
Fixes #25825
Automatic merge from submit-queue
Attach Detach Controller Business Logic
This PR adds the meat of the attach/detach controller proposed in #20262.
The PR splits the in-memory cache into a desired and actual state of the world.
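A rough sketch of the shape of that split; the type and method names below are assumptions based on this description, not the controller's actual API:
```
package main

import "fmt"

// DesiredStateOfWorld: which volumes should be attached to which nodes,
// according to the pod specs the controller has observed.
type DesiredStateOfWorld struct {
	attachments map[string]map[string]bool // nodeName -> volumeName -> true
}

// ActualStateOfWorld: which volumes the controller believes are actually
// attached, based on completed attach/detach operations and node status.
type ActualStateOfWorld struct {
	attachments map[string]map[string]bool
}

func (d *DesiredStateOfWorld) AddVolume(node, volume string) {
	if d.attachments[node] == nil {
		d.attachments[node] = map[string]bool{}
	}
	d.attachments[node][volume] = true
}

// reconcile compares the two caches and reports which attach operations are
// still needed; the real controller also computes detaches and triggers the
// operations asynchronously.
func reconcile(desired *DesiredStateOfWorld, actual *ActualStateOfWorld) []string {
	var toAttach []string
	for node, volumes := range desired.attachments {
		for volume := range volumes {
			if !actual.attachments[node][volume] {
				toAttach = append(toAttach, fmt.Sprintf("attach %s to %s", volume, node))
			}
		}
	}
	return toAttach
}

func main() {
	desired := &DesiredStateOfWorld{attachments: map[string]map[string]bool{}}
	actual := &ActualStateOfWorld{attachments: map[string]map[string]bool{}}
	desired.AddVolume("node-1", "pd-1")
	fmt.Println(reconcile(desired, actual)) // [attach pd-1 to node-1]
}
```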
Automatic merge from submit-queue
Refactor storePodsNamespacer.List()
This fixes a bug in the previous version where, when we fell back on a
brute force approach, we were still returning an error.
It also clarifies the flow control into 3 distinct cases. The cases
don't share variables any more, which makes mistakes like the one
mentioned above harder.
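A hedged sketch of the three-case shape described above; the store/indexer plumbing is reduced to plain maps and the helper names are placeholders, not the real cache package code:
```
package main

import (
	"fmt"
	"strings"
)

// list sketches the three distinct cases: no namespace filter, an indexed
// per-namespace lookup, and the brute-force fallback when the index is
// unavailable. Each case returns explicitly instead of sharing variables.
func list(namespace string, byNamespace map[string][]string, indexReady bool, all []string) ([]string, error) {
	// Case 1: no namespace filter, return everything.
	if namespace == "" {
		return all, nil
	}
	// Case 2: the namespace index is available, use it.
	if indexReady {
		return byNamespace[namespace], nil
	}
	// Case 3: brute force over all items; note the fallback succeeds
	// instead of returning an error (the bug fixed by this PR).
	var out []string
	for _, item := range all {
		if strings.HasPrefix(item, namespace+"/") {
			out = append(out, item)
		}
	}
	return out, nil
}

func main() {
	all := []string{"default/pod-a", "kube-system/pod-b"}
	byNS := map[string][]string{"default": {"default/pod-a"}}
	items, _ := list("default", byNS, false, all)
	fmt.Println(items) // [default/pod-a] via the brute-force path
}
```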
Automatic merge from submit-queue
volume controller: use better operation names
Using volume/claim.UID in the operation name is not really useful, as UIDs are not logged by the rest of the controller. On the other hand, volume.Name and claim.Namespace/Name are logged pretty often, so it helps to include them in the operation name as well. Still, I'd prefer the operation name to be truly unique, to be protected from users deleting a volume and quickly creating another one with the same name, so the UID is still part of the operation name.
This has already proven to be very useful in controller debugging.
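For illustration only, an operation name built along those lines; the exact format string is an assumption, only the idea of pairing the readable namespace/name with the UID comes from the description above:
```
package main

import "fmt"

// claimOperationName combines the human-readable namespace/name, which matches
// what the rest of the controller logs, with the UID, which keeps the operation
// unique if a claim is deleted and recreated under the same name.
func claimOperationName(namespace, name, uid string) string {
	return fmt.Sprintf("provision-%s/%s[%s]", namespace, name, uid)
}

func main() {
	fmt.Println(claimOperationName("default", "my-claim", "0f4c2a9e"))
	// provision-default/my-claim[0f4c2a9e]
}
```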
Automatic merge from submit-queue
Use read-only root filesystem capabilities of rkt
Propagates `api.Container.SecurityContext.ReadOnlyRootFileSystem` flag to rkt container runtime.
cc @yifan-gu
Fixes #23837
Automatic merge from submit-queue
Use docker containerInfo.LogPath and not manually constructed path
Since the containerInfo has the LogPath in it, let's use that and
not manually construct the path ourselves. This also makes the code
less prone to breaking if Docker changes this path.
Fixes #23695
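For reference, a sketch of reading the log path from the inspect result. This uses today's Docker Go client (`github.com/docker/docker/client`) purely for illustration, not the engine-api code the kubelet used at the time:
```
package main

import (
	"context"
	"fmt"

	"github.com/docker/docker/client"
)

// containerLogPath asks Docker for the container's own idea of where its log
// file lives instead of assembling /var/lib/docker/containers/<id>/... by hand.
func containerLogPath(ctx context.Context, cli *client.Client, containerID string) (string, error) {
	info, err := cli.ContainerInspect(ctx, containerID)
	if err != nil {
		return "", err
	}
	return info.LogPath, nil
}

func main() {
	cli, err := client.NewClientWithOpts(client.FromEnv)
	if err != nil {
		panic(err)
	}
	path, err := containerLogPath(context.Background(), cli, "some-container-id")
	if err != nil {
		panic(err)
	}
	fmt.Println("container log path:", path)
}
```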
Refactor storePodsNamespacer.List() and
storeReplicationControllersNamespacer.List(). They are the same
function, just with different signatures.
This fixes a bug where, when we fell back on a brute force approach, we
were still returning an error.
Also change to explicit return without named return values.
Automatic merge from submit-queue
Ensure that init containers are preserved during pruning
Pods with multiple init containers were getting the wrong containers
pruned. Fix an error message and add a test.
Fixes #26131
Automatic merge from submit-queue
Add some extra checking in the tests to prevent flakes.
Attempts to fix https://github.com/kubernetes/kubernetes/issues/25967
The hypothesis is that waitTest() sometimes catches an idle period that occurs before all changes have been applied. This change blocks until the expected number of changes has arrived.
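A minimal sketch of that idea, with a plain channel standing in for whatever the test actually observes:
```
package main

import (
	"fmt"
	"time"
)

// waitForChanges blocks until `expected` change notifications have arrived on
// ch, or fails after the timeout, so an early idle period can't be mistaken
// for "all changes applied".
func waitForChanges(ch <-chan struct{}, expected int, timeout time.Duration) error {
	deadline := time.After(timeout)
	for seen := 0; seen < expected; {
		select {
		case <-ch:
			seen++
		case <-deadline:
			return fmt.Errorf("saw only %d of %d expected changes", seen, expected)
		}
	}
	return nil
}

func main() {
	ch := make(chan struct{}, 3)
	for i := 0; i < 3; i++ {
		ch <- struct{}{}
	}
	fmt.Println(waitForChanges(ch, 3, time.Second)) // <nil>
}
```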
Automatic merge from submit-queue
use monotonic now in TestDelNode
Fixes https://github.com/kubernetes/kubernetes/issues/24971.
Briefly, the rate_limited_queue uses a `container/heap` to store values and relies on this data structure to always fetch the value with the minimum `processAt`. However, in some extreme conditions, consecutive calls to `time.Now()` can return the same value, which causes an unpredictable order in the queue. This fix uses a monotonic `now()` to avoid that.
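A hedged sketch of such a monotonic `now()` (not the exact helper from the PR): each call returns a time strictly later than the previous one, so the heap order stays deterministic even when `time.Now()` repeats.
```
package main

import (
	"fmt"
	"time"
)

// monotonicNow returns a now() that never repeats: if the wall clock hasn't
// advanced since the last call, it nudges the returned time forward by a
// nanosecond so consecutive values are strictly increasing.
// Not goroutine-safe; fine for a single-threaded test.
func monotonicNow() func() time.Time {
	var last time.Time
	return func() time.Time {
		t := time.Now()
		if !t.After(last) {
			t = last.Add(time.Nanosecond)
		}
		last = t
		return t
	}
}

func main() {
	now := monotonicNow()
	a, b := now(), now()
	fmt.Println(b.After(a)) // always true, even if time.Now() returned the same value twice
}
```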
@smarterclayton please take a look.
Automatic merge from submit-queue
rkt: Support alternate stage1's via annotation
This provides a basic implementation for setting a stage1 on a per-pod
basis via an annotation. See discussion here for how this approach was arrived at: https://github.com/kubernetes/kubernetes/issues/23944#issuecomment-212653776
It's possible this feature should be gated behind additional knobs, such
as a kubelet flag to filter allowed stage1s, or a check akin to what
privileged gets in the apiserver.
Currently, it checks `AllowPrivileged` as a means to let people disable
this feature, though overloading one flag to cover both stage1 and
privileged isn't ideal.
Fixes #23944
Testing done (note, unfortunately done with some additional ./cluster changes merged in):
```
$ cat examples/stage1-fly/fly-me-to-the-moon.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    name: exit
  name: exit-fast
  annotations: {"rkt.alpha.kubernetes.io/stage1-name-override": "coreos.com/rkt/stage1-fly:1.3.0"}
spec:
  restartPolicy: Never
  containers:
  - name: exit
    image: busybox
    command: ["sh", "-c", "ps aux"]
$ kubectl create -f examples/stage1-fly
$ ssh core@minion systemctl status -l --no-pager k8s_2f169b2e-c32a-49e9-a5fb-29ae1f6b4783.service
...
failed
...
May 04 23:33:03 minion rkt[2525]: stage0: error writing /etc/rkt-resolv.conf: open /var/lib/rkt/pods/run/2f169b2e-c32a-49e9-a5fb-29ae1f6b4783/stage1/rootfs/etc/rkt-resolv.conf: no such file or directory
...
# Restart kubelet with allow-privileged=false
$ kubectl create -f examples/stage1-fly
$ kubectl describe exit-fast
...
1m 19s 5 {kubelet euank-e2e-test-minion-dv3u} spec.containers{exit} Warning Failed Failed to create rkt container with error: cannot make "exit-fast_default(17050ce9-1252-11e6-a52a-42010af00002)": running a custom stage1 requires a privileged security context
....
```
Note as well that the "success" here is rkt spitting out an [error message](https://github.com/coreos/rkt/issues/2141), which at least indicates that the right stage1 was being used.
cc @yifan-gu @aaronlevy
Automatic merge from submit-queue
reduce conflict retries
Eliminates quota admission conflicts due to latent caches on the same API server.
@derekwaynecarr
Automatic merge from submit-queue
Downward API implementation for resources limits and requests
This is an implementation of the Downward API for resource limits and requests, and it works with both environment variables and the volume plugin.
This is based on proposal https://github.com/kubernetes/kubernetes/pull/24051. The implementation follows the API-with-magic-keys approach discussed in the proposal.
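To make the magic keys concrete, here is a hedged sketch of a container consuming its own CPU limit through an environment variable, written against the current `k8s.io/api/core/v1` types (which may differ slightly from the PR-era `pkg/api` code):
```
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

func main() {
	// "limits.cpu" is one of the magic keys; others include "requests.cpu",
	// "limits.memory" and "requests.memory".
	env := v1.EnvVar{
		Name: "MY_CPU_LIMIT",
		ValueFrom: &v1.EnvVarSource{
			ResourceFieldRef: &v1.ResourceFieldSelector{
				ContainerName: "app", // optional; defaults to the current container
				Resource:      "limits.cpu",
			},
		},
	}
	fmt.Printf("%+v\n", env)
}
```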
@kubernetes/rh-cluster-infra
Split controller cache into actual and desired state of world.
Controller will only operate on volumes scheduled to nodes that
have the "volumes.kubernetes.io/controller-managed-attach" annotation.
Automatic merge from submit-queue
Add a 'kubectl clusterinfo dump' option
Ref: #3500
@bgrant0607 @smarterclayton @jszczepkowski
Usage:
```
# Dump current cluster state to stdout
kubectl clusterinfo dump
# Dump current cluster state to /tmp
kubectl clusterinfo dump --output-directory=/tmp
# Dump all namespaces to stdout
kubectl clusterinfo dump --all-namespaces
# Dump a set of namespaces to /tmp
kubectl clusterinfo dump --namespaces default,kube-system --output-directory=/tmp
```
Automatic merge from submit-queue
plumb Update resthandler to allow old/new comparisons in admission
Rework how updated objects are passed to rest storage Update methods (first pass at https://github.com/kubernetes/kubernetes/pull/23928#discussion_r61444342)
* allows centralizing precondition checks (uid and resourceVersion)
* allows admission to have the old and new objects on patch/update operations (sets us up for field level authorization, differential quota updates, etc)
* allows patch operations to avoid double-GETting the object to apply the patch
Overview of important changes:
* pkg/api/rest/rest.go
* changes the `rest.Update` interface to give rest storage an `UpdatedObjectInfo` interface instead of the object directly. To get the updated object, the storage must call `UpdatedObject()`, passing in the current object (a sketch of the interface follows this list)
* pkg/api/rest/update.go
* provides a default `UpdatedObjectInfo` impl
* passes a copy of the updated object through any provided transforming functions and returns it when asked
* builds UID preconditions from the updated object if they can be extracted
* pkg/apiserver/resthandler.go
* Reworks update and patch operations to give old objects to admission
* pkg/registry/generic/registry/store.go
* Calls `UpdatedObject()` inside `GuaranteedUpdate` so it can provide the old object
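A hedged sketch of the shape of `UpdatedObjectInfo`, with placeholder `Context`/`Object` types standing in for the real `api.Context` and `runtime.Object`; the method set is inferred from this description rather than copied from the code:
```
package main

import "fmt"

// Placeholder types; the real interface uses api.Context and runtime.Object.
type Context interface{}
type Object interface{}
type Preconditions struct{ UID *string }

// UpdatedObjectInfo is handed to rest storage instead of the updated object
// itself; the storage fetches the current object and asks for the new one.
type UpdatedObjectInfo interface {
	// Preconditions built from the updated object (e.g. an expected UID), if any.
	Preconditions() *Preconditions
	// UpdatedObject returns the new object, given the existing one. Transforming
	// functions (admission, patch application) run inside this call, so they see
	// both old and new objects. Typically invoked inside GuaranteedUpdate.
	UpdatedObject(ctx Context, oldObj Object) (newObj Object, err error)
}

// defaultUpdatedObjectInfo is a minimal implementation that applies a chain of
// transforming functions to a fixed updated object.
type defaultUpdatedObjectInfo struct {
	obj        Object
	transforms []func(ctx Context, newObj, oldObj Object) (Object, error)
}

func (i *defaultUpdatedObjectInfo) Preconditions() *Preconditions { return nil }

func (i *defaultUpdatedObjectInfo) UpdatedObject(ctx Context, oldObj Object) (Object, error) {
	newObj := i.obj
	var err error
	for _, t := range i.transforms {
		if newObj, err = t(ctx, newObj, oldObj); err != nil {
			return nil, err
		}
	}
	return newObj, nil
}

func main() {
	info := &defaultUpdatedObjectInfo{obj: "new"}
	out, _ := info.UpdatedObject(nil, "old")
	fmt.Println(out) // new
}
```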
Todo:
- [x] Update rest.Update interface:
* Given the name of the object being updated
* To get the updated object data, the rest storage must pass the current object (fetched using the name) to an `UpdatedObject(ctx, oldObject) (newObject, error)` func. This is typically done inside a `GuaranteedUpdate` call.
- [x] Add old object to admission attributes interface
- [x] Update resthandler Update to move admission into the UpdatedObject() call
- [x] Update resthandler Patch to move the patch application and admission into the UpdatedObject() call
- [x] Add resttest tests to make sure oldObj is correctly passed to UpdatedObject(), and errors propagate back up
Follow-up:
* populate oldObject in admission for delete operations?
* update quota plugin to use `GetOldObject()` in admission attributes
* admission plugin to gate ownerReference modification on delete permission
* Decide how to handle preconditions (does that belong in the storage layer or in the resthandler layer?)
Instead of applying rate limits only to operation polling, send all API calls
through a rate-limited RoundTripper.
This isn't a perfect solution, since the QPS is obviously getting
split between different controllers, etc., but it's also spread across
different APIs, which, in practice, rate limit differently.
Fixes #26119 (hopefully)
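A hedged sketch of such a transport using `golang.org/x/time/rate`; the QPS and burst numbers are illustrative, and the real change wraps the cloud provider's client rather than a bare `http.Client`:
```
package main

import (
	"fmt"
	"net/http"

	"golang.org/x/time/rate"
)

// rateLimitedRoundTripper blocks before every request until the shared limiter
// grants a token, so all API calls sent through this transport share one QPS budget.
type rateLimitedRoundTripper struct {
	delegate http.RoundTripper
	limiter  *rate.Limiter
}

func (r *rateLimitedRoundTripper) RoundTrip(req *http.Request) (*http.Response, error) {
	if err := r.limiter.Wait(req.Context()); err != nil {
		return nil, err
	}
	return r.delegate.RoundTrip(req)
}

func main() {
	client := &http.Client{
		Transport: &rateLimitedRoundTripper{
			delegate: http.DefaultTransport,
			limiter:  rate.NewLimiter(rate.Limit(10), 5), // ~10 QPS, burst of 5 (illustrative)
		},
	}
	fmt.Println(client != nil)
}
```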
This provides a basic implementation for setting a stage1 on a per-pod
basis via an annotation.
It's possible this feature should be gated behind additional knobs, such
as a kubelet flag to filter allowed stage1s, or a check akin to what
privileged gets in the apiserver.
Currently, it checks `AllowPrivileged` as a means to let people disable
this feature, though overloading one flag to cover both stage1 and
privileged isn't ideal.
Automatic merge from submit-queue
Handle federated service name lookups in kube-dns.
For domain name queries that fail to match any records in the local
kube-dns cache, we check whether the queried name matches the federation
pattern. If it does, we send a CNAME response pointing to the federated name.
For more details, look at the comments in the code.
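A hedged sketch of that fallthrough using the `github.com/miekg/dns` library; the pattern matching and helper are simplified placeholders, not the actual kube-dns code:
```
package main

import (
	"fmt"
	"strings"

	"github.com/miekg/dns"
)

// federationCNAME sketches the fallthrough: if a lookup missed the local cache
// and the queried name matches the federation pattern
// (<svc>.<ns>.<federation>.svc.<zone>), answer with a CNAME to the federated name.
func federationCNAME(qname, federation, zone, federationDomain string) *dns.CNAME {
	if !strings.HasSuffix(qname, "."+federation+".svc."+zone+".") {
		return nil // not a federated query
	}
	// Swap the cluster zone for the federation domain to build the target.
	target := strings.TrimSuffix(qname, zone+".") + federationDomain + "."
	return &dns.CNAME{
		Hdr:    dns.RR_Header{Name: qname, Rrtype: dns.TypeCNAME, Class: dns.ClassINET, Ttl: 30},
		Target: target,
	}
}

func main() {
	rr := federationCNAME("mysvc.myns.myfed.svc.cluster.local.", "myfed", "cluster.local", "example.com")
	if rr != nil {
		fmt.Println(rr.String())
	}
}
```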
Tests are coming ...
Also, this PR is based on @ArtfulCoder's PR #23930. So please review only the last commit here.
PTAL @ArtfulCoder @thockin @quinton-hoole @nikhiljindal