Automatic merge from submit-queue
Support terminal resizing for exec/attach/run
```release-note
Add support for terminal resizing for exec, attach, and run. Note that for Docker, exec sessions
inherit the environment from the primary process, so if the container was created with tty=false,
that means the exec session's TERM variable will default to "dumb". Users can override this by
setting TERM=xterm (or whatever is appropriate) to get the correct "smart" terminal behavior.
```
Fixes#13585
Automatic merge from submit-queue
genconversion=false should skip fields during conversion generation
Currently it only skips if the fields don't match, but that leaves no
way for callers to say "no really, ignore this field".
@wojtek-t @thockin *must* be able to ignore a non-convertible field (if the types are different but we still want the autogeneration of everything else)
Automatic merge from submit-queue
Don't panic if we hit a dangling symlink in mungedocs
I hit this because I have a dangling symlink, which would cause a panic.
Add support for terminal resizing for exec, attach, and run. Note that for Docker, exec sessions
inherit the environment from the primary process, so if the container was created with tty=false,
that means the exec session's TERM variable will default to "dumb". Users can override this by
setting TERM=xterm (or whatever is appropriate) to get the correct "smart" terminal behavior.
Automatic merge from submit-queue
controller-manager support number of garbage collector workers to be configurable
The number of garbage collector workers of controller-manager is a fixed value 5 now, make it configurable should more properly
Automatic merge from submit-queue
Unify logging in generators and avoid annoying logs.
@thockin regarding our discussing in the morning
@lavalamp - FYI
This mostly takes the previously checked in files and removes them, and moves
the generation to be on-demand instead of manual. Manually verified no change
in generated output.
Automatic merge from submit-queue
Deepcopy: avoid struct copies and reflection Call
- make signature of generated deepcopy methods symmetric with `in *type, out *type`, avoiding copies of big structs on the stack
- switch to `in interface{}, out interface{}` which allows us to call them with without `reflect.Call`
The first change reduces runtime of BenchmarkPodCopy-4 from `> 3500ns` to around `2300ns`.
The second change reduces runtime to around `1900ns`.
When we introduce a new field for backwards compatibility, we may want
to specify a different protobuf field name (one that matches JSON) than
the automatic transformation applied to the struct field. This allows an
API field to define the name of its protobuf tag.
Automatic merge from submit-queue
Prevent kube-proxy from panicing when sysfs is mounted as read-only.
Fixes https://github.com/kubernetes/kubernetes/issues/25543.
This PR:
* Checks the permission of sysfs before setting conntrack hashsize, and returns an error "readOnlySysFSError" if sysfs is readonly. As I know, this is the only place we need write permission to sysfs, CMIIW.
* Update a new node condition 'RuntimeUnhealthy' with specific reason, message and hit to the administrator about the remediation.
I think this should be an acceptable fix for now.
Node problem detector is designed to integrate with different problem daemons, but **the main logic is in the problem detection phase**. After the problem is detected, what node problem detector does is also simply updating a node condition.
If we let kube-proxy pass the problem to node problem detector and let node problem detector update the node condition. It looks like an unnecessary hop. The logic in kube-proxy won't be different from this PR, but node problem detector will have to open an unsafe door to other pods because the lack of authentication mechanism.
It is a bit hard to test this PR, because we don't really have a bad docker in hand. I can only manually test it:
* If I manually change the code to let it return `"readOnlySysFSError`, the node condition will be updated:
```
NetworkUnavailable False Mon, 01 Jan 0001 00:00:00 +0000 Fri, 08 Jul 2016 01:36:41 -0700 RouteCreated RouteController created a route
OutOfDisk False Fri, 08 Jul 2016 01:37:36 -0700 Fri, 08 Jul 2016 01:34:49 -0700 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Fri, 08 Jul 2016 01:37:36 -0700 Fri, 08 Jul 2016 01:34:49 -0700 KubeletHasSufficientMemory kubelet has sufficient memory available
Ready True Fri, 08 Jul 2016 01:37:36 -0700 Fri, 08 Jul 2016 01:35:26 -0700 KubeletReady kubelet is posting ready status. WARNING: CPU hardcapping unsupported
RuntimeUnhealthy True Fri, 08 Jul 2016 01:35:31 -0700 Fri, 08 Jul 2016 01:35:31 -0700 ReadOnlySysFS Docker unexpectedly mounts sysfs as read-only for privileged container (docker issue #24000). This causes the critical system components of Kubernetes not properly working. To remedy this please restart the docker daemon.
KernelDeadlock False Fri, 08 Jul 2016 01:37:39 -0700 Fri, 08 Jul 2016 01:35:34 -0700 KernelHasNoDeadlock kernel has no deadlock
Addresses: 10.240.0.3,104.155.176.101
```
* If not, the node condition `RuntimeUnhealthy` won't appear.
* If I run the permission checking code in a unprivileged container, it did return `readOnlySysFSError`.
I'm not sure whether we want to mark the node as `Unscheduable` when this happened, which only needs few lines change. I can do that if we think we should.
I'll add some unit test if we think this fix is acceptable.
/cc @bprashanth @dchen1107 @matchstick @thockin @alex-mohr
Mark P1 to match the original issue.
[]()
Automatic merge from submit-queue
Enable memory eviction by default
```release-note
Enable memory based pod evictions by default on the kubelet.
Trigger pod eviction when available memory falls below 100Mi.
```
See: https://github.com/kubernetes/kubernetes/issues/28552
/cc @kubernetes/rh-cluster-infra @kubernetes/sig-node
This fixes PodSpec to generate cleanly. No other types only half-generate (so
now we Fatalf), though several fail to generate at all (only Errorf for now).
There are ample opportunities to optimize and streamline here. For example,
there's no reason to have a function to convert IntStr to IntStr. Removing the
function does generate the right assignment, but it is unclear whether the
registered function is needed or not. I opted to leave it alone for now.
Another example is Convert_Slice_byte_To_Slice_byte, which just seems silly.
This drives conversion generation from file tags like:
// +conversion-gen=k8s.io/my/internal/version
.. rather than hardcoded lists of packages.
The only net change in generated code can be explained as correct. Previously
it didn't know that conversion was available.
This is used subsequently to simplify the conversion generation, so each
package can declare what peer-packages it uses, and have those imported
dynamically, rather than having one mega list of packages to import and not
really being clear why, for any given list item.
Automatic merge from submit-queue
Prep for not checking in generated, part 1/2
This PR is extracted from #25978 - it is just the deep-copy related parts. All the Makefile and conversion stuff is excluded.
@wojtek-t this is literally branched, a bunch of commits deleted, and a very small number of manual fixups applied. If you think this is easier to review (and if it passes CI) you can feel free to go over it again. I will follow this with a conversion-related PR to build on this.
Or if you prefer, just close this and let the mega-PR ride.
@lavalamp
This is the last piece of Clayton's #26179 to be implemented with file tags.
All diffs are accounted for. Followup will use this to streamline some
packages.
Also add some V(5) debugging - it was helpful in diagnosing various issues, it
may be helpful again.
This drives most of the logic of deep-copy generation from tags like:
// +deepcopy-gen=package
..rather than hardcoded lists of packages. This will make it possible to
subsequently generate code ONLY for packages that need it *right now*, rather
than all of them always.
Also remove pkgs that really do not need deep-copies (no symbols used
anywhere).
Previously we just tracked comments on the 'package' declaration. Treat all
file comments as one comment-block, for simplicity. Can be revisited if
needed.
This means that tags like:
// +foo=bar
// +foo=bat
..will produce []string{"bar", "bat"}. This is needed for later commits which
will want to use this to make code generation more self contained.
In bringing back Clayton's PR piece-by-piece this was almost as easy to
implement as his version, and is much more like what I think we should be
doing.
Specifically, any time which defines a .DeepCopy() method will have that method
called preferentially. Otherwise we generate our own functions for
deep-copying. This affected exactly one type - resource.Quantity. In applying
this heuristic, several places in the generated code were simplified.
To achieve this I had to convert types.Type.Methods from a slice to a map,
which seems correct anyway (to do by-name lookups).
This re-institutes some of the rolled-back logic from previous commits. It
bounds the scope of what the deepcopy generator is willing to do with regards
to generating and calling generated functions.
His PR cam during the middle of this development cycle, and it was easier to
burn it down and recreate it than try to patch it into an existing series and
re-test every assumption. This behavior will be re-introduced in subsequent
commits.
Automatic merge from submit-queue
Cleanup third party (pt 2)
Move forked-and-hacked golang code to the forked/ directory. Remove ast/build/parse code that is now in stdlib. Remove unused shell2junit
Automatic merge from submit-queue
WIP - Handle map[]struct{} in DeepCopy
Deep copy was not properly handling the empty struct case we use for Sets.
@lavalamp I need your expertise when you have some time - the go2idl parser is turning sets.String into the following tree:
type: sets.String kind: Alias
underlying: map[string]sets.Empty kind: Map
key: string kind: Builtin
elem: set.Empty kind: Struct
^
should be Alias
Looking at tc.Named, I'm not sure what the expected outcome would be and why you flatten there.
While testing this fix in OpenShift it was discovered that the
PackageConstraint was overly aggressive - types that declare a public
copy function should always return true. PackageConstraint is intended
to limit packages where we might generate deep copy function, rather
than to prevent external packages from being consumed.
Downstream generators that want to reuse the upstream generated types
need to be able to define a different ignore tag (so that they can see
the already generated types).
Automatic merge from submit-queue
[kubelet] Allow opting out of automatic cloud provider detection in kubelet. By default kubelet will auto-detect cloud providers
fixes#28231
Automatic merge from submit-queue
scheduler: change phantom pod test from integration into unit test
This is an effort for #24440.
Why this PR?
- Integration test is hard to debug. We could model the test as a unit test similar to [TestSchedulerForgetAssumedPodAfterDelete()](132ebb091a/plugin/pkg/scheduler/scheduler_test.go (L173)). Currently the test is testing expiring case, we can change that to delete.
- Add a test similar to TestSchedulerForgetAssumedPodAfterDelete() to test phantom pod.
- refactor scheduler tests to share the code between TestSchedulerNoPhantomPodAfterExpire() and TestSchedulerNoPhantomPodAfterDelete()
- Decouple scheduler tests from scheduler events: not to use events
Automatic merge from submit-queue
Track object modifications in fake clientset
Fake clientset is used by unit tests extensively but it has some
shortcomings:
- no filtering on namespace and name: tests that want to test objects in
multiple namespaces end up getting all objects from this clientset,
as it doesn't perform any filtering based on name and namespace;
- updates and deletes don't modify the clientset state, so some tests
can get unexpected results if they modify/delete objects using the
clientset;
- it's possible to insert multiple objects with the same
kind/name/namespace, this leads to confusing behavior, as retrieval is
based on the insertion order, but anchors on the last added object as
long as no more objects are added.
This change changes core.ObjectRetriever implementation to track object
adds, updates and deletes.
Some unit tests were depending on the previous (and somewhat incorrect)
behavior. These are fixed in the following few commits.
Automatic merge from submit-queue
TLS bootstrap API group (alpha)
This PR only covers the new types and related client/storage code- the vast majority of the line count is codegen. The implementation differs slightly from the current proposal document based on discussions in design thread (#20439). The controller logic and kubelet support mentioned in the proposal are forthcoming in separate requests.
I submit that #18762 ("Creating a new API group is really hard") is, if anything, understating it. I've tried to structure the commits to illustrate the process.
@mikedanese @erictune @smarterclayton @deads2k
```release-note-experimental
An alpha implementation of the the TLS bootstrap API described in docs/proposals/kubelet-tls-bootstrap.md.
```
[]()
Fake clientset is used by unit tests extensively but it has some
shortcomings:
- no filtering on namespace and name: tests that want to test objects in
multiple namespaces end up getting all objects from this clientset,
as it doesn't perform any filtering based on name and namespace;
- updates and deletes don't modify the clientset state, so some tests
can get unexpected results if they modify/delete objects using the
clientset;
- it's possible to insert multiple objects with the same
kind/name/namespace, this leads to confusing behavior, as retrieval is
based on the insertion order, but anchors on the last added object as
long as no more objects are added.
This change changes core.ObjectRetriever implementation to track object
adds, updates and deletes.
Some unit tests were depending on the previous (and somewhat incorrect)
behavior. These are fixed in the following few commits.
Specifying // +protobuf.nullable=true on a Go type that is an alias of a
map or slice will generate a synthetic protobuf message with the type
name that will serialize to the wire in a way that allows the difference
between empty and nil to be recorded.
For instance:
// +protobuf.nullable=true
types OptionalMap map[string]string
will create the following message:
message OptionalMap {
map<string, string> Items = 1
}
and generate marshallers that use the presence of OptionalMap to
determine whether the map is nil (rather than Items, which protobuf
provides no way to delineate between empty and nil).
Automatic merge from submit-queue
[client-gen]Add Patch to clientset
* add the Patch() method to the clientset.
* I have to rename the existing Patch() method of `Event` to PatchWithEventNamespace() to avoid overriding.
* some minor changes to the fake Patch action.
cc @Random-Liu since he asked for the method
@kubernetes/sig-api-machinery
ref #26580
```release-note
Add the Patch method to the generated clientset.
```
Automatic merge from submit-queue
return nil from NewClientConfig instead of empty struct
This is a go convention and fixes an nil pointer in kubelet when passing in bad command line options:
```
I0624 04:12:33.333246 25404 plugins.go:141] Loaded network plugin "kubenet"
E0624 04:12:33.333390 25404 runtime.go:58] Recovered from panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/runtime/runtime.go:52
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/runtime/runtime.go:40
/usr/local/go/src/runtime/asm_amd64.s:472
/usr/local/go/src/runtime/panic.go:443
/usr/local/go/src/runtime/panic.go:62
/usr/local/go/src/runtime/sigpanic_unix.go:24
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/client/clientset_generated/internalclientset/typed/core/unversioned/service.go:132
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet.go:254
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/client/cache/listwatch.go:80
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/client/cache/reflector.go:262
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/client/cache/reflector.go:204
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/wait/wait.go:86
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/wait/wait.go:87
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/wait/wait.go:49
```
cc @caesarxuchao @lavalamp
Before this, I was stumped with
invalid argument "federation=kubernetes-federation.test." for --federations=federation=kubernetes-federation.test.: federation not a valid federation name
but now
invalid argument "federation=kubernetes-federation.test." for --federations=federation=kubernetes-federation.test.: "kubernetes-federation.test." not a valid domain name: ["must match the regex [a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)* (e.g. 'example.com')"]
Automatic merge from submit-queue
Migrate most of remaining tests from cmd/integration to test/integration to use framework
Ref #25940
Built on top of https://github.com/kubernetes/kubernetes/pull/27182 - only the last commit is unique
This PR contains Kubelet changes to enable attach/detach controller control.
* It introduces a new "enable-controller-attach-detach" kubelet flag to
enable control by controller. Default enabled.
* It removes all references "SafeToDetach" annoation from controller.
* It adds the new VolumesInUse field to the Node Status API object.
* It modifies the controller to use VolumesInUse instead of SafeToDetach
annotation to gate detachment.
* There is a bug in node-problem-detector that causes VolumesInUse to
get reset every 30 seconds. Issue https://github.com/kubernetes/node-problem-detector/issues/9
opened to fix that.
Automatic merge from submit-queue
Reduce volume controller sync period
fixes#24236 and most probably also fixes#25294.
Needs #25881! With the cache, binder is not affected by sync period. Without the cache, binding of 1000 PVCs takes more than 5 minutes (instead of ~70 seconds).
15 seconds were chosen by fair 2d10 roll :-)
Automatic merge from submit-queue
Setting TLS1.2 minimum because TLS1.0 and TLS1.1 are vulnerable
TLS1.0 is known as vulnerable since it can be downgraded to SSL
https://blog.varonis.com/ssl-and-tls-1-0-no-longer-acceptable-for-pci-compliance/
TLS1.1 can be vulnerable if cipher RC4-SHA is used, and in Kubernetes it is, you can check it with
`
openssl s_client -cipher RC4-SHA -connect apiserver.k8s.example.com:443
`
https://www.globalsign.com/en/blog/poodle-vulnerability-expands-beyond-sslv3-to-tls/
Test suites like Qualys are reporting this Kubernetes issue as a level 3 vulnerability, they recommend to upgrade to TLS1.2 that is not affected, quoting Qualys:
`
RC4 should not be used where possible. One reason that RC4 was still being used was BEAST and Lucky13 attacks against CBC mode ciphers in
SSL and
TLS. However, TLSv 1.2 or later address these issues.
`
Automatic merge from submit-queue
Use pause image depending on the server's platform when testing
Removed all pause image constant strings, now the pause image is chosen by arch. Part of the effort of making e2e arch-agnostic.
The pause image name and version is also now only in two places, and it's documented to bump both
Also removed "amd64" constants in the code. Such constants should be replaced by `runtime.GOARCH` or by looking up the server platform
Fixes: #22876 and #15140
Makes it easier for: #25730
Related: #17981
This is for `v1.3`
@ixdy @thockin @vishh @kubernetes/sig-testing @andyzheng0831 @pensu
Automatic merge from submit-queue
kube-controller-manager: Add configure-cloud-routes option
This allows kube-controller-manager to allocate CIDRs to nodes (with
allocate-node-cidrs=true), but will not try to configure them on the
cloud provider, even if the cloud provider supports Routes.
The default is configure-cloud-routes=true, and it will only try to
configure routes if allocate-node-cidrs is also configured, so the
default behaviour is unchanged.
This is useful because on AWS the cloud provider configures routes by
setting up VPC routing table entries, but there is a limit of 50
entries. So setting configure-cloud-routes on AWS would allow us to
continue to allocate node CIDRs as today, but replace the VPC
route-table mechanism with something not limited to 50 nodes.
We can't just turn off the cloud-provider entirely because it also
controls other things - node discovery, load balancer creation etc.
Fix#25602
This allows kube-controller-manager to allocate CIDRs to nodes (with
allocate-node-cidrs=true), but will not try to configure them on the
cloud provider, even if the cloud provider supports Routes.
The default is configure-cloud-routes=true, and it will only try to
configure routes if allocate-node-cidrs is also configured, so the
default behaviour is unchanged.
This is useful because on AWS the cloud provider configures routes by
setting up VPC routing table entries, but there is a limit of 50
entries. So setting configure-cloud-routes on AWS would allow us to
continue to allocate node CIDRs as today, but replace the VPC
route-table mechanism with something not limited to 50 nodes.
We can't just turn off the cloud-provider entirely because it also
controls other things - node discovery, load balancer creation etc.
Fix#25602
Automatic merge from submit-queue
Attempt 2: Bump GCE containerVM to container-v1-3-v20160517 (Docker 1.11.1) again.
Workaround the issue of small root_maxkeys on the debian based container-vm image, and bump our image to the new alpha version for docker 1.11.1 validation.
ref: #23397#25893
cc/ @vishh @timstclair
Automatic merge from submit-queue
Attach Detach Controller Business Logic
This PR adds the meat of the attach/detach controller proposed in #20262.
The PR splits the in-memory cache into a desired and actual state of the world.
Split controller cache into actual and desired state of world.
Controller will only operate on volumes scheduled to nodes that
have the "volumes.kubernetes.io/controller-managed-attach" annotation.
For the domain name queries that fail to match any records in the local
kube-dns cache, we check if the queried name matches the federation
pattern. If it does, we send a CNAME response to the federated name.
For more details look at the comments in the code.
Automatic merge from submit-queue
vSphere Volume Plugin Implementation
This PR implements vSphere Volume plugin support in Kubernetes (ref. issue #23932).
Automatic merge from submit-queue
Fix hyperkube flag parsing
Hyperkube flag parsing was not playing nicely with kubectl command and sub-commands. This PR addresses that problem, and adds some tests which exercise hyperkube dispatching to nested cobra commands.
\cc @aaronlevy @kbrwn @mumoshu
fixes#24088
[]()
Automatic merge from submit-queue
kubelet/cadvisor: Refactor cadvisor disk stat/usage interfaces.
basically
1) cadvisor struct will know what runtime the kubelet is, passed in via additional argument to New()
2) rename cadvisor wrapper function to DockerImagesFsInfo() to ImagesFsInfo() and have linux implementation choose a label based on the runtime inside the cadvisor struct
2a) mock/fake/unsupported modified to take the same additional argument in New()
3) kubelet's wrapper for the cadvisor wrapper is renamed in parallel
4) make all tests use new interface
Automatic merge from submit-queue
Add support for limiting grace period during soft eviction
Adds eviction manager support in kubelet for max pod graceful termination period when a soft eviction is met.
```release-note
Kubelet evicts pods when available memory falls below configured eviction thresholds
```
/cc @vishh
Automatic merge from submit-queue
Use protobufs by default to communicate with apiserver (still store JSONs in etcd)
@lavalamp @kubernetes/sig-api-machinery
Automatic merge from submit-queue
Cache Webhook Authentication responses
Add a simple LRU cache w/ 2 minute TTL to the webhook authenticator.
Kubectl is a little spammy, w/ >= 4 API requests per command. This also prevents a single unauthenticated user from being able to DOS the remote authenticator.
Automatic merge from submit-queue
Add support for PersistentVolumeClaim in Attacher/Detacher interface
The attach detach interface does not support volumes which are referenced through PVCs. This PR adds that support
Automatic merge from submit-queue
kube-apiserver options should be decoupled from impls
A few months ago we refactored options to keep it independent of the
implementations, so that it could be used in CLI tools to validate
config or to generate config, without pulling in the full dependency
tree of the master. This change restores that by separating
server_run_options.go back to its own package.
Also, options structs should never contain non-serializable types, which
storagebackend.Config was doing with runtime.Codec. Split the codec out.
Fix a typo on the name of the etcd2.go storage backend.
Finally, move DefaultStorageMediaType to server_run_options.
@nikhiljindal as per my comment in #24454, @liggitt because you and I
discussed this last time
Automatic merge from submit-queue
Refactor persistent volume controller
Here is complete persistent controller as designed in https://github.com/pmorie/pv-haxxz/blob/master/controller.go
It's feature complete and compatible with current binder/recycler/provisioner. No new features, it *should* be much more stable and predictable.
Testing
--
The unit test framework is quite complicated, still it was necessary to reach reasonable coverage (78% in `persistentvolume_controller.go`). The untested part are error cases, which are quite hard to test in reasonable way - sure, I can inject a VersionConflictError on any object update and check the error bubbles up to appropriate places, but the real test would be to run `syncClaim`/`syncVolume` again and check it recovers appropriately from the error in the next periodic sync. That's the hard part.
Organization
---
The PR starts with `rm -rf kubernetes/pkg/controller/persistentvolume`. I find it easier to read when I see only the new controller without old pieces scattered around.
[`types.go` from the old controller is reused to speed up matching a bit, the code looks solid and has 95% unit test coverage].
I tried to split the PR into smaller patches, let me know what you think.
~~TODO~~
--
* ~~Missing: provisioning, recycling~~.
* ~~Fix integration tests~~
* ~~Fix e2e tests~~
@kubernetes/sig-storage
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/24331)
<!-- Reviewable:end -->
Fixes#15632
This patch adds the --exit-on-lock-contention flag, which must be used
in conjunction with the --lock-file flag. When provided, it causes the
kubelet to wait for inotify events for that lock file. When an 'open'
event is received, the kubelet will exit.
A few months ago we refactored options to keep it independent of the
implementations, so that it could be used in CLI tools to validate
config or to generate config, without pulling in the full dependency
tree of the master. This change restores that by separating
server_run_options.go back to its own package.
Also, options structs should never contain non-serializable types, which
storagebackend.Config was doing with runtime.Codec. Split the codec out.
Fix a typo on the name of the etcd2.go storage backend.
Finally, move DefaultStorageMediaType to server_run_options.
Automatic merge from submit-queue
Conversions have kube-isms and are not portable for downstream
Some minor fixes to enable generators for OpenShift and others who need
to generate conversions on Kube API groups outside the core.
@deads2k
Automatic merge from submit-queue
rkt: Refactor GarbageCollect to enforce GCPolicy.
Previously, we uses `rkt gc` to garbage collect dead pods, which is very coarse, and can cause the dead pods to be removed too aggressively.
This PR improves the garbage collection, now after one GC iteration:
- The deleted pods will be removed.
- If the number of containers exceeds gcPolicy.MaxContainers,
then containers whose ages are older than gcPolicy.minAge will be removed.
cc @kubernetes/sig-node @euank @sjpotter
Pending on #23887 for the Godep updates.
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/24647)
<!-- Reviewable:end -->
Automatic merge from submit-queue
Moving federation-apiserver to use genericapiserver.ServerRunOptions and deleting federation-apiserver options
The remaining params were related to authz and authn and one parameter for WatchCacheSize.
Have moved them to genericapiserver.ServerRunOptions and now federation-apiserver can just use genericapiserver.ServerRunOptions()
cc @jianhuiz @kubernetes/sig-cluster-federation
Automatic merge from submit-queue
Adding Services to federation clientset
Commits:
1. Regenerate the client without any changes to client-gen
2. Update clientgen to add a parameter to specify generating client only for Services v1 object.
3. Regenerate federation_internalclientset
4. Regenerate federation_release_1_3
Second commit is the most important one. Other 3 commits are auto generated by running client-gen.
I have added a command line argument to client-gen that takes in a list of group/version/resource. If a group version is part of this list, then only the resources in this list are included in the client. For other group versions, the existing check of genclient=true in types.go is used.
Other alternatives considered were:
* Update genclient in types.go to mention the clientset name in which it should be included instead of just saying genclient=true (so Services will say genclient=core,federation while all other v1 resources will say genclient=core). This requires a code change in types.go to change a client set.
* Create another types.go which will only include Services and use that to generate federation clientset. This will lead to duplicate Service definition.
cc @caesarxuchao @lavalamp @jianhuiz @mfanjie @kubernetes/sig-cluster-federation
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/25443)
<!-- Reviewable:end -->
Automatic merge from submit-queue
WIP v0 NVIDIA GPU support
```release-note
* Alpha support for scheduling pods on machines with NVIDIA GPUs whose kubelets use the `--experimental-nvidia-gpus` flag, using the alpha.kubernetes.io/nvidia-gpu resource
```
Implements part of #24071 for #23587
I am not familiar with the scheduler enough to know what to do with the scores. Mostly punting for now.
Missing items from the implementation plan: limitranger, rkt support, kubectl
support and docs
cc @erictune @davidopp @dchen1107 @vishh @Hui-Zhi @gopinatht
Automatic merge from submit-queue
pkg/apis/rbac: Add Openshift authorization API types
This PR updates #23396 by adding the Openshift RBAC types to a new API group.
Changes from Openshift:
* Omission of [ResourceGroups](4589987883/pkg/authorization/api/types.go (L32-L104)) as most of these were Openshift specific. Would like to add the concept back in for a later release of the API.
* Omission of IsPersonalSubjectAccessReview as its implementation relied on Openshift capability.
* Omission of SubjectAccessReview and ResourceAccessReview types. These are defined in `authorization.k8s.io`
~~API group is named `rbac.authorization.openshift.com` as we omitted the AccessReview stuff and that seemed to be the lest controversial based on conversations in #23396. Would be happy to change it if there's a dislike for the name.~~ Edit: API groups is named `rbac`, sorry misread the original thread.
As discussed in #18762, creating a new API group is kind difficult right now and the documentation is very out of date. Got a little help from @soltysh but I'm sure I'm missing some things. Also still need to add validation and a RESTStorage registry interface. Hence "WIP".
Any initial comments welcome.
cc @erictune @deads2k @sym3tri @philips
Automatic merge from submit-queue
Webhook Token Authenticator
Add a webhook token authenticator plugin to allow a remote service to make authentication decisions.
Automatic merge from submit-queue
PSP admission
```release-note
Update PodSecurityPolicy types and add admission controller that could enforce them
```
Still working on removing the non-relevant parts of the tests but I wanted to get this open to start soliciting feedback.
- [x] bring PSP up to date with any new features we've added to SCC for discussion
- [x] create admission controller that is a pared down version of SCC (no ns based strategies, no user/groups/service account permissioning)
- [x] fix tests
@liggitt @pmorie - this is the simple implementation requested that assumes all PSPs should be checked for each requests. It is a slimmed down version of our SCC admission controller
@erictune @smarterclayton
Automatic merge from submit-queue
Move internal types of hpa from pkg/apis/extensions to pkg/apis/autoscaling
ref #21577
@lavalamp could you please review or delegate to someone from CSI team?
@janetkuo could you please take a look into the kubelet changes?
cc @fgrzadkowski @jszczepkowski @mwielgus @kubernetes/autoscaling
Automatic merge from submit-queue
Introduce skeleton of new attach/detach controller
This PR introduces the skeleton of the new attach/detach controller for #20262
Implements part of #24071
I am not familiar with the scheduler enough to know what to do with the scores. Punting for now.
Missing items from the implementation plan: limitranger, rkt support, kubectl
support and user docs
Automatic merge from submit-queue
Kubelet eviction flag parsers and tests
The first two commits are from https://github.com/kubernetes/kubernetes/pull/24559 that have achieved LGTM.
The last commit is only part that is interesting, it adds the parsing logic to handle the flags, and reserves `pkg/kubelet/eviction` for eviction manager logic.
Automatic merge from submit-queue
Move godeps to vendor/
This is a first-step towards glide support, maybe we don't want or need to take this, but it was easy to try.
This fails to compile, not sure why:
```
# k8s.io/kubernetes/pkg/apis/extensions/v1beta1
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2703: undefined: extensions.ClusterAutoscaler
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2703: undefined: ClusterAutoscaler
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2719: undefined: extensions.ClusterAutoscaler
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2719: undefined: ClusterAutoscaler
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2723: undefined: extensions.ClusterAutoscalerList
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2723: undefined: ClusterAutoscalerList
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:3468: Convert_extensions_JobSpec_To_v1beta1_JobSpec redeclared in this block
previous declaration at _output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion.go:328
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:3845: Convert_extensions_ScaleStatus_To_v1beta1_ScaleStatus redeclared in this block
previous declaration at _output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion.go:98
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:4737: Convert_v1beta1_JobSpec_To_extensions_JobSpec redeclared in this block
previous declaration at _output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion.go:380
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:5186: Convert_v1beta1_ScaleStatus_To_extensions_ScaleStatus redeclared in this block
previous declaration at _output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion.go:120
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2723: too many errors
!!! Error in /home/thockin/tmp/godep-vendor/src/k8s.io/kubernetes/hack/lib/golang.sh:417
```
Automatic merge from submit-queue
cluster/images/hyperkube: create symlink for each server
Add a kubelet symlink so that the hyperkube image can appear as a kubelet image. https://github.com/kubernetes/kubernetes/issues/24510
Automatic merge from submit-queue
add namespace index for cache
@wojtek-t
Implement in this approach make the change of lister.go small, but we should replace all `NewInformer()` to `NewIndexInformer()`, even when someone not want to filter by namespace(eg. gc_controller and scheduler). Any suggestion?
Automatic merge from submit-queue
Reimplement 'pause' in C - smaller footprint all around
Statically links against musl. Size of amd64 binary is 3560 bytes.
I couldn't test the arm binary since I have no hardware to test it on, though I assume we want it to work on a raspberry pi.
This PR also adds the gcc5/musl cross compiling image used to build the binaries.
@thockin
Automatic merge from submit-queue
API changes for Cascading deletion
This PR includes the necessary API changes to implement cascading deletion with finalizers as proposed is in #23656. Comments are welcome.
@lavalamp @derekwaynecarr @bgrant0607 @rata @hongchaodeng
The codec factory should support two distinct interfaces - negotiating
for a serializer with a client, vs reading or writing data to a storage
form (etcd, disk, etc). Make the EncodeForVersion and DecodeToVersion
methods only take Encoder and Decoder, and slight refactoring elsewhere.
In the storage factory, use a content type to control what serializer to
pick, and use the universal deserializer. This ensures that storage can
read JSON (which might be from older objects) while only writing
protobuf. Add exceptions for those resources that may not be able to
write to protobuf (specifically third party resources, but potentially
others in the future).
Automatic merge from submit-queue
Petset controller
Took longer than I expected. Main parts of this pr are:
1. Identity generation based on petset spec (volumes are mapped per discussion in #18016)
2. Ensure that we create/delete pets in sequence
3. Ensuring that we create, wait for healthy, create; or delete, wait for terminationGrace, delete
4. Controller that watches apiserver and drives actual -> desired
PVCs are not deleted, yet.
Automatic merge from submit-queue
Deleting duplicate code from federated-apiserver.Run()
This removes most of duplicate code from federated-apiserver.Run().
The code remaining is related to storage or authz and authn.
https://github.com/kubernetes/kubernetes/pull/24787 refactors the storage related code.
I am still figuring out authz and authn.
cc @jianhuiz
Automatic merge from submit-queue
Provide flags to use etcd3 backed storage
ref: #24405
What's in this PR?
- Add a new flag "storage-backend" to choose "etcd2" or "etcd3". By default (i.e. empty), it's "etcd2".
- Take out etcd config code into a standalone package and let it create etcd2 or etcd3 storage backend given user input.
Automatic merge from submit-queue
Generated clients can return their RESTClients, RESTClient can return its RateLimiter
cc @lavalamp @krousey @wojtek-t @smarterclayton @timothysc
Ref. #22421
Automatic merge from submit-queue
Add kubelet flags for eviction threshold configuration
This PR just adds the flags for kubelet eviction and the associated generated code.
I am happy to tweak text, but we can also do that later at this point in the release.
Since this causes codegen, I wanted to stage this first.
/cc @vishh @kubernetes/sig-node
Automatic merge from submit-queue
Move internal types of job from pkg/apis/extensions to pkg/apis/batch
This addressed the job part of #23216, this is still WIP. Will notify once finished. I'd like to have it in before starting working on ScheduledJob.
@lavalamp @erictune fyi
Automatic merge from submit-queue
update controllers watching all pods to share an informer
This plumbs the shared pod informer through the various controllers to avoid duplicated watches.
Automatic merge from submit-queue
All clients under ClientSet share one RateLimiter.
Currently we create a rate limiter for each client in client set. It makes the reasoning about rate limiting behavior much harder. This PR changes this behavior and now all clients in the set share single rate limiter. Ref. #24157
cc @lavalamp @wojtek-t
Automatic merge from submit-queue
Make fake client actions use fully qualified resource
The output of a versioned clientset is version object. The fake client used to assume only internal objects will be returned. This PR removes this assumption by making fake actions initialized with a fully qualified resource instead of a resource string.
We have to regenerate fake clients in release_1_2 clientset to let it compile. For the test fakes, we are breaking the backwards compatibility promise.
Part of #24155.
Automatic merge from submit-queue
client-gen: use serializer instead of codec for versioned client
For a versioned client, because the output of every client method is a versioned object, so it should use a serializer instead of a codec that does conversion.
@lavalamp @krousey
Automatic merge from submit-queue
genericapiserver: Moving more flags to ServerRunOptions
Moving more apiserver flags to generic api server.
With this change, most of the Config is created in genericapiserver.NewConfig(). The plan is to move everything there.
I didnt touch Storage related params as they are being changed in https://github.com/kubernetes/kubernetes/pull/23208.
And I will handle authz and authn in another PR (need to figure out a few things).
cc @lavalamp @kubernetes/sig-api-machinery
Automatic merge from submit-queue
Remove disabled mirror pod test in cmd/integration
The new tests for mirror pods are being added in the node e2e suite.
This partially addresses #24440
Automatic merge from submit-queue
Client-gen: Add fake clients to testoutput; fix the imports in fake clients
Start generating fake client in the testoutput, so that we can check the fake client is generated correctly under extreme conditions.
Fix the problem reported https://github.com/kubernetes/kubernetes/pull/20573#issuecomment-210556618, where the import names contain dots when the group name contains dots.
cc @deads2k
Automatic merge from submit-queue
Use correct defaults when binding apiserver flags
defaults should be set in the struct-creating function, then the current struct field value used as the default when binding the flag
Automatic merge from submit-queue
Make etcd cache size configurable
Instead of the prior 50K limit, allow users to specify a more sensible size for their cluster.
I'm not sure what a sensible default is here. I'm still experimenting on my own clusters. 50 gives me a 270MB max footprint. 50K caused my apiserver to run out of memory as it exceeded >2GB. I believe that number is far too large for most people's use cases.
There are some other fundamental issues that I'm not addressing here:
- Old etcd items are cached and potentially never removed (it stores using modifiedIndex, and doesn't remove the old object when it gets updated)
- Cache isn't LRU, so there's no guarantee the cache remains hot. This makes its performance difficult to predict. More of an issue with a smaller cache size.
- 1.2 etcd entries seem to have a larger memory footprint (I never had an issue in 1.1, even though this cache existed there). I suspect that's due to image lists on the node status.
This is provided as a fix for #23323
Automatic merge from submit-queue
Client-gen: handle dotted group name, e.g., "authentication.k8s.io"
The client-gen used to assume the group name doesn't include dot, but it's not true, e.g., we have group `authentication.k8s.io`.
With this PR, Client-gen will use the full group name when creating directory (e.g., the client for the authentication group will be generated at pkg/client/clientset_generated/release_1_3/typed/`authentication.k8s.io`/v1/). However, because golang doesn't allow dot in variable/function names, so when the group name is used as part of variable/function name, client-gen extracts the part before the first dot (e.g., authentication).
This PR also changes the group name of the test group from `testgroup` to `testgroup.k8s.io` to verify if client-gen generates sane code.
cc @deads2k for #20573
Automatic merge from submit-queue
Make kubectl bash-completion namespace and resource alias aware

- filter resource listing by `--namespace` flag given before in the command line
```bash
$ kubectl get pod --namespace=kube-system <tab><tab>
kube-dns-v9-2wuzj kube-dns-v9-llqxa
```
- add completion of `--namespace`
```bash
$ kubectl get pod --namespace=<tab><tab>
[*] default ingress kube-system
```
- add support for plural nouns and aliases like `rc`
Automatic merge from submit-queue
Unit test for negative value of allocatable resources
Introduce unit test for checking resource quantities for kubelet's allocatable resources.
Covered values:
* negative quantity value: error expected
* invalid quantity unit: error expected
* valid quantity: error not expected
Running go test with -v, returned error are logged as well for more information:
```shell
=== RUN TestValueOfAllocatableResources
--- PASS: TestValueOfAllocatableResources (0.00s)
server_test.go:47: Returned err: "resource quantity for \"memory\" cannot be negative: -150G"
server_test.go:47: Returned err: "unable to parse quantity's suffix"
PASS
ok k8s.io/kubernetes/cmd/kubelet/app 0.020s
```
Automatic merge from submit-queue
Implement a streaming serializer for watch
Changeover watch to use streaming serialization. Properly version the
watch objects. Implement simple framing for JSON and Protobuf (but not
YAML).
@wojtek-t @lavalamp
Automatic merge from submit-queue
Additional go vet fixes
Mostly:
- pass lock by value
- bad syntax for struct tag value
- example functions not formatted properly
In order to allow a Go type to have a different internal representation,
provide the comment `// +protobuf.as=MESSAGENAME` which substitutes the
current message's members with the named message's members. Used by the
unversioned.Time type to have the same fields as unversioned.Timestamp
and to have an alternate serialization.
Automatic merge from submit-queue
Move typed clients into clientset folder
Move typed clients from `pkg/client/typed/` to `pkg/client/clientset_generated/${clientset_name}/typed`.
The first commit changes the client-gen, the last commit updates the doc, other commits are just moving things around.
@lavalamp @krousey
Automatic merge from submit-queue
Migrate to the new conversion generator - part1
This PR contains two commits:
- few more fixes to the generator
- migration of the pkg/api/v1 to use the new generator
The second commit is big, but I reviewed the changes and they contain:
- conversions between types that we didn't even generating conversion between
- changes in how we handle maps/pointers/slices - previously we were explicitly referencing fields, now we are using "shadowing in, out" to make the code more generic
- lack of auto-generated method for ReplicationControllerSpec (because these types are different (*int vs int for Replicas) and a preexisting conversion already exists
Most of issues in the first commit (e.g. adding references to "in" and "out" for slices/maps/points) were discovered by our tests. So I'm pretty confident that this change is correct now.
Covered values:
- negative quantity value: error expected
- invalid quantity unit: error expected
- valid quantity: error not expected
Running go test with -v, returned error are logged as well for more information:
=== RUN TestValueOfAllocatableResources
--- PASS: TestValueOfAllocatableResources (0.00s)
server_test.go:47: negative quantity value: resource quantity for \"memory\" cannot be negative: -150G
server_test.go:47: invalid quantity unit: unable to parse quantity's suffix
PASS
ok k8s.io/kubernetes/cmd/kubelet/app 0.020s
Automatic merge from submit-queue
rkt: bump rkt version to 1.2.1
Upon bumping the rkt version, `--hostname` is supported. Also we now gets the configs from the rkt api service, so `stage1-image` is deprecated.
cc @yujuhong @Random-Liu
Automatic merge from submit-queue
Make kubelet use an arch-specific pause image depending on GOARCH
Related to: #22876, #22683 and #15140
@ixdy @pwittrock @brendandburns @mikedanese @yujuhong @thockin @zmerlynn
When setting kube/system-resources for a node, negative quantities can result in
node's allocatable being higher then node's capacity.
Let's check the quantity and return error if it is negative.
Volume names have now format <cluster-name>-dynamic-<pv-name>.
pv-name is guaranteed to be unique in Kubernetes cluster, adding
<cluster-name> ensures we don't conflict with any running cluster
in the cloud project (kube-controller-manager --cluster-name=XXX).
'kubernetes' is the default cluster name.
`--kubelet-cgroups` and `--system-cgroups` respectively.
Updated `--runtime-container` to `--runtime-cgroups`.
Cleaned up most of the kubelet code that consumes these flags to match
the flag name changes.
Signed-off-by: Vishnu kannan <vishnuk@google.com>
I can't revert with github which says "Sorry, this pull request couldn’t be
reverted automatically. It may have already been reverted, or the content may
have changed since it was merged."
Reverts commit: 0c191e787b
* Metrics will not be expose until they are hooked up to a handler
* Metrics are not cached and expose a dos vector, this must be fixed before release or the stats should not be exposed through an api endpoint
Recycle controller tries to recycle or delete a PV several times.
It stores count of failed attempts and timestamp of the last attempt in
annotations of the PV.
By default, the controller tries to recycle/delete a PV 3 times in
10 minutes interval. These values are configurable by
kube-controller-manager --pv-recycler-maximum-retry=X --pvclaimbinder-sync-period=Y
arguments.
Combine the fields that will be used for content transformation
(content-type, codec, and group version) into a single struct in client,
and then pass that struct into the rest client and request. Set the
content-type when sending requests to the server, and accept the content
type as primary.
Will form the foundation for content-negotiation via the client.
The protobuf tags contain the assigned tag id, which then ensures
subsequent generation is consistently tagging (tags don't change across
generations unless someone deletes the protobuf tag).
In addition, generate final proto IDL that is free of gogoproto
extensions for ease of generation into other languages.
Add a flag --keep-gogoproto which preserves the gogoproto extensions in
the final IDL.
Pass down into the server initialization the necessary interface for
handling client/server content type negotiation. Add integration tests
for the negotiation.
Remove Codec from versionInterfaces in meta (RESTMapper is now agnostic
to codec and serialization). Register api/latest.Codecs as the codec
factory and use latest.Codecs.LegacyCodec(version) as an equvialent to
the previous codec.
A NegotiatedSerializer is passed into the API installer (and
ParameterCodec, which abstracts conversion of query params) that can be
used to negotiate client/server request/response serialization. All
error paths are now negotiation aware, and are at least minimally
version aware.
Watch is specially coded to only allow application/json - a follow up
change will convert it to use negotiation.
Ensure the swagger scheme will include supported serializations - this
now includes application/yaml as a negotiated option.
Break Codec into two general purpose interfaces, Encoder and Decoder,
and move parameter codec responsibilities to ParameterCodec.
Make unversioned types explicit when registering - these types go
through conversion without modification.
Switch to use "__internal" instead of "" to represent the internal
version. Future commits will also add group defaulting (so that "" is
expanded internally into a known group version, and only cleared during
set).
For embedded types like runtime.Object -> runtime.RawExtension, put the
responsibility on the caller of Decode/Encode to handle transformation
into destination serialization. Future commits will expand RawExtension
and Unknown to accept a content encoding as well as bytes.
Make Unknown a bit more powerful and use it to carry unrecognized types.
This is part of migrating kubelet configuration to the componentconfig api
group and is preliminary to retrofitting client configuration and
implementing full fledged API group mechinary.
Signed-off-by: Mike Danese <mikedanese@google.com>
Add `kube-reserved` and `system-reserved` flags for configuration
reserved resources for usage outside of kubernetes pods. Allocatable is
provided by the Kubelet according to the formula:
```
Allocatable = Capacity - KubeReserved - SystemReserved
```
Also provides a method for estimating a reasonable default for
`KubeReserved`, but the current implementation probably is low and needs
more tuning.
If preformat lines are unbalanced, do not delete all the lines from there to
end of file. That might be the only copy of edits made by the user.
Do not print entire file contents for munge errors affecting entire file.
Before this change the link checker would ignore errors in a file if
the last link in a file was correct. The last link would wipe out the
error variable and set it to nil. Furthermore, it replaced errored
links with the empty string.
If we find an error that we can't correct, append the error message to
an an errs slice and leave the string as is.
Signed-off-by: Mike Danese <mikedanese@google.com>
Implement a flag that defines the frequency at which a node's out of
disk condition can change its status. Use this flag to suspend out of
disk status changes in the time period specified by the flag, after
the status is changed once.
Set the flag to 0 in e2e tests so that we can predictably test out of
disk node condition.
Also, use util.Clock interface for all time related functionality in
the kubelet. Calling time functions in unversioned package or time
package such as unversioned.Now() or time.Now() makes it really hard
to test such code. It also makes the tests flaky and sometimes
unnecessarily slow due to time.Sleep() calls used to simulate the
time elapsed. So use util.Clock interface instead which can be faked
in the tests.
For AWS EBS, a volume can only be attached to a node in the same AZ.
The scheduler must therefore detect if a volume is being attached to a
pod, and ensure that the pod is scheduled on a node in the same AZ as
the volume.
So that the scheduler need not query the cloud provider every time, and
to support decoupled operation (e.g. bare metal) we tag the volume with
our placement labels. This is done automatically by means of an
admission controller on AWS when a PersistentVolume is created backed by
an EBS volume.
Support for tagging GCE PVs will follow.
Pods that specify a volume directly (i.e. without using a
PersistentVolumeClaim) will not currently be scheduled correctly (i.e.
they will be scheduled without zone-awareness).
Add flags to control max connections (set to 256k vs 64k default) and TCP
established timeout (set to 1 day vs 5 day default). Flags can be set to 0 to
mean "don't change it".
This is only set at startup, and not wrapped in a rectifier loop.
Tested manually.
Replace many of the remaining s.Convert() invocations with direct
execution, and make generated methods public. Removes 10% of the
allocations during decode of a pod and ~20-40% of the total CPU time.
Public utility methods and JWT parsing, and controller specific logic.
Also remove the coupling between ServiceAccountTokenGetter and the
authenticator class.
1. Name default scheduler with name `kube-scheduler`
2. The default scheduler only schedules the pods meeting the following condition:
- the pod has no annotation "scheduler.alpha.kubernetes.io/name: <scheduler-name>"
- the pod has annotation "scheduler.alpha.kubernetes.io/name: kube-scheduler"
update gofmt
update according to @david's review
run hack/test-integration.sh, hack/test-go.sh and local e2e.test
Refactor Kubelet's server functionality into a server package. Most
notably, move pkg/kubelet/server.go into
pkg/kubelet/server/server.go. This will lead to better separation of
concerns and a more readable code hierarchy.
The HPA controller had previously used a single Client
object to act as three different Namespacers. To improve
ease of extensibility and to make it clearer what the HPA
controller actually needs to use from the client, it should
use separate Namespacers for each of its needs (Scales, HPAs,
and Events).
Remove some indentation from file generation for specific languages,
allow SnippetWriter's io.Writer to be accessed, and add Path to
types.Name for languages where Package and Path can be disjoint.
Set resyncInterval to one minute now that we rely on the generic pleg to trigger
pod syncs on container events. When there is an error during syncing, pod
workers need to wake up sooner to retry. Set the sync error backoff period to
10 second in this case.
This commit makes the HPA metrics client configurable in where
it looks for heapster instead of hard coding it to
"kube-system/heapster". The values of "kube-system/heapster"
are still recorded as constants in the metrics client package
for use as default values.
* Add support for pointers, map keys, more builtins, Alias types
* Support for packages & imports
* Add helper functions to types
* change generation interface
* fix naming/ordering
* SnippetWriter for templates
* Naming systems
* Use the standard packages for import resolution
* Documentation
Default to hardcodes for components that had them, and 5.0 qps, 10 burst
for those that relied on client defaults
Unclear if maybe it'd be better to just assume these are set as part of
the incoming kubeconfig. For now just exposing them as flags since it's
easier for me to manually tweak.
Flocker [1] is an open-source container data volume manager for
Dockerized applications.
This PR adds a volume plugin for Flocker.
The plugin interfaces the Flocker Control Service REST API [2] to
attachment attach the volume to the pod.
Each kubelet host should run Flocker agents (Container Agent and Dataset
Agent).
The kubelet will also require environment variables that contain the
host and port of the Flocker Control Service. (see Flocker architecture
[3] for more).
- `FLOCKER_CONTROL_SERVICE_HOST`
- `FLOCKER_CONTROL_SERVICE_PORT`
The contribution introduces a new 'flocker' volume type to the API with
fields:
- `datasetName`: which indicates the name of the dataset in Flocker
added to metadata;
- `size`: a human-readable number that indicates the maximum size of the
requested dataset.
Full documentation can be found docs/user-guide/volumes.md and examples
can be found at the examples/ folder
[1] https://clusterhq.com/flocker/introduction/
[2] https://docs.clusterhq.com/en/1.3.1/reference/api.html
[3] https://docs.clusterhq.com/en/1.3.1/concepts/architecture.html
Add an experimental network plugin implementation named "cni" that
uses the Container Networking Interface (CNI) specification for
configuring networking for pods.
https://github.com/appc/cni/blob/master/SPEC.md
This changes the --legacy-userspace-proxy flag to be a string flag
--proxy-mode. If specified, the flag will be respected ('userspace' and
'iptables' being valid values). If left blank (default) we will choose the
"best". best means userspace for now UNLESS the user adds an annotation
(net.experimental.kubernetes.io/proxy-mode) to their node, in which case we
will try to use that.
This allows people to try it on a single machine without fear of global failure
and without it getting rolled back on reboots. It is a poor-man's config blob.
Add a HostIPC field to the Pod Spec to create containers sharing
the same ipc of the host.
This feature must be explicitly enabled in apiserver using the
option host-ipc-sources.
Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
A lot of packages use StringSet, but they don't use anything else from
the util package. Moving StringSet into another package will shrink
their dependency trees significantly.
1. Add EvnetRecordQps and EventBurst parameter in kubelet.
2. If EvnetRecordQps and EventBurst was set, rate limit events in kubelet
with a independent ratelimiter as setted.
With some regularity, if the root certificate file needs to be generated
the apiserver could come up on the non-secure port before the cert
was generated.
`hack/local-up-cluster.sh` requires that apiserver.crt exists
before the replication controller starts. Otherwise service accounts
and secrets don't work.
This change just takes the certificate handling code out of the `go`.
Allow the user to specify the resolver configuration file that is used
to determine the default DNS parameters. This defaults to the system's
/etc/resolv.conf.
Currently, whenever there is any update, kubelet would force all pod workers to
sync again, causing resource contention and hence performance degradation.
This commit flips kubelet to use incremental updates (as opposed to snapshots).
This allows us to know what pods have changed and send updates to those pod
workers only. The `SyncPods` function has been replaced with individual
handlers, each handling an operation (ADD, REMOVE, UPDATE). Pod workers are
still triggered periodically, and kubelet performs periodic cleanup as well.
This commit also spawns a new goroutine solely responsible for killing pods.
This is necessary because pod killing could hold up the sync loop for
indefinitely long amount of time now user can define the graceful termination
period in the container spec.
Also add related new flags to apiserver:
"--oidc-issuer-url", "--oidc-client-id", "--oidc-ca-file", "--oidc-username-claim",
to enable OpenID Connect authentication.
Avoid TTL by deleting pods immediately when they aren't
scheduled, and letting the Kubelet delete them otherwise.
Ensure the Kubelet uses pod.Spec.TerminationGracePeriodSeconds
when no pod.DeletionGracePeriodSeconds is available.
Check to make sure there is not an alphanumeric character immeditely
before or after the 'flag'. It there is an alphanumeric character then
this is obviously not actually the flag we care about. For example if
the project declares a flag "valid-name" but the regex finds something
like "invalid_name" we should not match. Clearly this "invalid_name" is
not actually a wrong usage of the "valid-name" flag.
1. Add HostnameOverride parameter for kube-proxy as kubelet did.
2. Add Birthcry event for kube-proxy.
3. Because record event need apiserver client, adjust order of code partly.
Since pflag can handle net.IPNet arguements use that code. This means
that our code no longer has casts back and forth and just natively uses
net.IPNet.
pflag can handle IP addresses so use the pflag code instead of doing it
ourselves. This means our code just uses net.IP and we don't have all of
the useless casting back and forth!
Moves the userspace code in proxy to a sub-package and adds the
ProxyProvider interface.
This is in preparation for landing an implementation of
https://github.com/GoogleCloudPlatform/kubernetes/issues/3760, which
will mostly be in another sub package for iptables.
First is initializing a KubeletConfig that starts no background
processes. Second is running the config. Provide a legacy path
that won't impact older callers while making it easier to customize
the interfaces passed to the Kubelet.
Used by OpenShift to inject some custom interfaces to the Kubelet
for config management.
The minion server will
- launch the proxy and executor
- relaunch them when they terminate uncleanly
- logrotate their logs.
It is a replacement for a full-blown init process like s6 which is not necessary
in this case.
The basic idea is that in the main mungedocs we run the entirefile and
create an annotated set of lines about that file. All mungers then act
on a struct mungeLines instead of on a bytes array. Making use of the
metadata where appropriete. Helper functions exist to make updating a
'macro block' extremely easy.
OpenShift uses multiple API packages (types are split) which
Kube will also eventually have as we introduce more plugins.
These changes make the generators able to handle importing different
API object packages into a single generator function.
Currently setting the `--allocate-node-cidrs` flag to true with an
empty cloud provider causes the kube-controller-manager to crash during
startup.
Fix the issue by checking for an empty cloud provider before setting
up route management on the cloud provider. This change introduces a
change in behavior. The kube-controller-manager now supports allocating
pod CIDRs without a cloud provider. This means users must manage routes
through some other mechanism.
The controller manager logs a warning if `--allocate-node-cidrs` is set,
but not a cloud provider:
```
I0725 17:10:41.587888 43185 plugins.go:70] No cloud provider specified.
I0725 17:10:41.588036 43185 nodecontroller.go:114] Sending events to api server.
E0725 17:10:41.588122 43185 controllermanager.go:201] Failed to start service controller: ServiceController should not be run without a cloudprovider.
W0725 17:10:41.588136 43185 controllermanager.go:213] allocate-node-cidrs is set, but no cloud provider specified. Will not manage routes.
E0725 17:10:41.589703 43185 nodecontroller.go:187] Error monitoring node status: Get http://127.0.0.1:8080/api/v1/nodes: dial tcp 127.0.0.1
```
Fixes#11866
This should ensure all load balancers get deleted even if a reordering of
watch events causes us to strand one after its service has been deleted,
because the sync will notice that the service controller's cache has a
service in it that no longer exists in the apiserver.
It could still leak in the case that the controller manager is killed
between when it leaks something and the sync runs, but this should
improve things.
* Add analytics munger w/ munge heading
* More link autofixes
* Allow running a subset of munges
* Fix repo root detection
* Only process non-preformatted blocks
* Gendocs no longer adds the analytics link; mungedocs does that in a
second pass.
All munges now start with `<!-- BEGIN MUNGE:` and end with `<!-- END MUNGE:`.
This lets me (in a followup) filter them better to normalize contents during
verification of generated docs.
Adds cmd/mungedocs which is framework for processing
all files under docs/ and either verifying that no changes needed or
making in-place changes.
Did not reuse kube::util::gen-docs because that seemed to be
centered around handling added files, and this pass does not
add files.
Planned uses:
- table of contents automatic updating
- linkification
- internal link checker
- link-path-relativizer or absolutizer
- example file syncer
- header inserter.
Just table-of-contents updating in this PR.
Added Table of Contents to docs/networking.md.
Demonstrates use of new TOC generator presubmit.
Other docs will be added in future PRs.
Additional development will be needed to handle some
of the more complex cases.
PR #10643 Started adding the dns names for the kubernetes master to self
sign certs which were created. The kubelet uses this same code, and thus
the kubelet cert started saying it was valid for these name as well.
While hardless, the kubelet cert shouldn't claim to be these things. So
make the caller explicitly list both their ip and dns subject alt names.
A cert from GCE shows:
- IP Address:23.236.49.122
- IP Address:10.0.0.1
- DNS:kubernetes,
- DNS:kubernetes.default
- DNS:kubernetes.default.svc
- DNS:kubernetes.default.svc.cluster.local
- DNS:e2e-test-zml-master
A similarly configured self signed cert shows:
- IP Address:23.236.49.122
- IP Address:10.0.0.1
- DNS:kubernetes
- DNS:kubernetes.default
- DNS:kubernetes.default.svc
So we are missing the fqdn kubernetes.default.svc.cluster.local. The
apiserver does not even know the fqdn! it's defined entirely by the
kubelet! We also do not have the cluster name certificate. This may be
--cluster-name= argument to the apiserver but will take a bit more
research.
Refactor GetNodeHostIP into pkg/util/node (instead of pkg/util to break import cycle).
Include internalIP in gce NodeAddresses. Remove NodeLegacyHostIP
Add support for pluggable Docker exec handlers. The default handler is
now Docker's native exec API call. The previous default, nsenter, can be
selected by passing --docker-exec-handler=nsenter when starting the
kubelet.
pkg/service:
There were a couple of references here just as a reminder to change the
behavior of findPort. As of v1beta3, TargetPort was always defaulted, so
we could remove findDefaultPort and related tests.
pkg/apiserver:
The tests were using versioned API codecs for some of their encoding
tests. Necessary API types had to be written and registered with the
fake versioned codecs.
pkg/kubectl:
Some tests were converted to current versions where it made sense.
Use the systemd $NOTIFY_SOCKET convention for kube-apiserver
startup. This allows it to be part of dependency trees and for
consumers to wait until it is listening on its ports.
The $NOTIFY_SOCKET protocol is described here:
http://www.freedesktop.org/software/systemd/man/sd_notify.html
Currently this is limited to the kube-apiserver process. Other
kube processes are internal kubernetes moving points. The API
server is the entry point relied on by callers.
100% stolen from Stef Walter from:
https://github.com/GoogleCloudPlatform/kubernetes/pull/8316
--node-milli-cpu
--node-memory
--machines
--minion-regexp
--sync-nodes
Remove the following flags from the standalon kubernetes binary:
--node-milli-cpu
--node-memory
- Delete nodes when they are no longer ready and don't exist in the
cloud provider.
- Label each node with it's hostname.
- Add flag to skip node registration.
- Add a test for registering an existing node.
This commit deletes cmd/e2e and updates hack/ginkgo-e2e.sh to use the
'ginkgo' command instead. All logic from cmd/e2e/e2e.go and
test/e2e/driver.go have been combined into the new file
test/e2e/e2e_test.go.
The test tarball now includes a built version of the test/e2e test
binary, which includes all tests under test/e2e. This was accomplished
by updating the build scripts to use 'go test -c' when a target name
ended with '.test', and adding a dependency on test/e2e/e2e.test.
This prebuilt test binary is passed to the Ginkgo runner in
hack/ginkgo-e2e.sh. In a future change, we can add support to run
Ginkgo against the source tree if it is available.
This change is generally intended to have no externally visible changes,
aside from the following caveats:
- The -t/--tests flag has been removed
- Calling cmd/e2e/e2e directly obviously won't work, but that was never
intended to be supported anyway
- If the GINKGO_PARALLEL environment variable is set to y, then ginkgo
will run test specs in parallel. (Currently defaults to n, since some
tests are broken in this mode.)
Additionally, several tests which made poor assumptions about cwd or
used testContext before it had been set have been fixed.
Kubelet will stop accepting new pods if it detects low disk space on root fs or fs holding docker images.
Running pods are not affected. low-diskspace-threshold-mb is used to configure the low diskspace threshold.
The test has been flaky for a while due to the potential watch performance
problem. Temporarily disable this test until we resolve#6651.
Note that there is extensive coverage of mirror pod creation/deletion via unit
tests in kubelet_test.go.
* Add an allocator which saves state in etcd
* Perform PortalIP allocation check on startup and periodically afterwards
Also expose methods in master for downstream components to handle IP allocation
/ master registration themselves.
This commit deletes cmd/e2e and updates hack/ginkgo-e2e.sh to use the
'ginkgo' command instead. All logic from cmd/e2e/e2e.go and
test/e2e/driver.go have been combined into the new file
test/e2e/e2e_test.go.
Additionally, several tests which made poor assumptions about cwd or
used testContext before it was set have been fixed.
This change is generally intended to have no externally visible changes,
aside from the following caveats:
- The -t/--tests flag has been removed
- Calling cmd/e2e/e2e directly obviously won't work, but that was never
supported anyway
- If the GINKGO_PARALLEL environment variable is set to y, then ginkgo
will run test specs in parallel. (Currently defaults to n, since some
tests are broken in this mode.)
--master option still supported.
--kubeconfig option added to kube-proxy,
kube-scheduler, and kube-controller-manager
binaries.
Kube-proxy now always makes some kind of API
source, since that is its only kind of config.
Warn if it is using a default client, which probably won't work.
Uses the clientcmd builder.
--master flag is still supported for distros that need it.
But now, --kubeconfig flag can be used instead, or in addition,
to specify the auth info, and/or the location of the master.
A subsequent PR will change salt to generate a kubeconfig,
and to make kube-proxy use it, for salt-based clouds.
external load balancers up-to-date based on the service's specs, using
the new DeltaFIFO watch queue class. Remove the old registry REST
handler code for creating/updating/deleting load balancers.
Also clean up a bunch of the GCE cloudprovider code related to load balancers.
I had the kublet die on startup and the only error was "0x401da0" Which
I assume is an address of the err.Error function. The other way to fix
this, I think, would be to use err.Error(), however that could cause
fmt.Fprintf() problems, debuging on the error message people used.
Now I get a nice clean error I can understand:
"cAdvisor.New() err = mountpoint for cpu not found"
We were specifying a region, but naming it as a zone in util.sh
The zone matters just as much as the region, e.g. for EBS volumes.
We also change the config to require a Zone, not a Region.
But we fallback to get the information from the metadata service.
Integration test often time out because the machine is loaded. Instead of
increasing timeout, this change hopes to address the issue by limiting the
number of tests running simultaneously.
Add a new flag in integration.go to specify the maximum number of concurrent
tests. Set the default in travis and shippable configurations to be 4.
port for monitoring by monit. This is in preparation for the standard
kubelet port to switch to SSL only (and eventually to only accepting
connections on the SSL port that present a proper client SSL cert).
Also standardize the formatting of the monit config files a bit.
Instead of endpoints being a flat list, it is now a list of "subsets"
where each is a struct of {Addresses, Ports}. To generate the list of
endpoints you need to take union of the Cartesian products of the
subsets. This is compact in the vast majority of cases, yet still
represents named ports and corner cases (e.g. each pod has a different
port number).
This also stores subsets in a deterministic order (sorted by hash) to
avoid spurious updates and comparison problems.
This is a fully compatible change - old objects and clients will
keepworking as long as they don't need the new functionality.
This is the prep for multi-port Services, which will add API to produce
endpoints in this new structure.
It currently is impossible to use two healthz handlers on different
ports in the same process. This removes the global variables in favor
of requiring the consumer to specify all health checks up front.
* If you want to test this out when an actual NFS export a good place
to start is by running the NFS server in a container:
docker run -d --name nfs --privileged cpuguy83/nfs-server /tmp
More detail can be found here:
https://github.com/cpuguy83/docker-nfs-server
Integrated the imageManager into the Kubelet and applies the garbage
collection policy every 5 minutes. The default policy allows up to 90%
disk usage, after which images are garbage collected to bring limit back
down to 80%.
Fixes#157.
Currently, API server is not aware of the static pods (manifests from
sources other than the API server, e.g. file and http) at all. This is
inconvenient since users cannot check the static pods through kubectl.
It is also sub-optimal because scheduler is unaware of the resource
consumption by these static pods on the node.
This change syncs the information back to the API server by creating a
mirror pod via API server for each static pod.
- Kubelet creates containers for the static pod, as it would do
normally.
- If a mirror pod gets deleted, Kubelet will re-create one. The
containers are sync'd to the static pods, so they will not be
affected.
- If a static pod gets removed from the source (e.g. manifest file
removed from the directory), the orphaned mirror pod will be deleted.
Note that because events are associated with UID, and the mirror pod has
a different UID than the original static pod, the events will not be
shown for the mirror pod when running `kubectl describe pod
<mirror_pod>`.
A conformance test is a test you run against a cluster that is already
set up. We would use it to test a hosted kubernetes service to make
sure that it meets a bar for quality. Also, a getting-started-guide
author, who has not implemented a complete set of cluster/...
scripts (that is, the getting-started-guide has some non-automated steps)
can use this to see which e2e tests pass on a cluster.
To be done in future PRs:
- disable tests which can't possibly run in a conformance test
because they require things like cluster ssh.
- document that when we accept a getting-started-guide, that
people should run the conformance test against their cluster
(unless they already have cluster/... scripts.
I ran this against a GCE cluster and 22 tests passed.
The integration test will fail if I check in my pending PR
to remove boundPods. Kubernetes eventually does the right
thing, but getting the integration test to check expectations
is hard because the scheduling behavior is unpredictable.
The boundPods removal is needed to fix P1 bug and speeds up
the scheduler considerably.
All the distros that use this have been updated,
or have PRs out to update them, or owners
have been asked to fix RPMs.
Removing this prevents further use of this model.
Remove now dead code: EtcdClientOrDie
Remove now dead pkg/proxy/config/etcd.go.
Remove unused imports.
This will make it easier to start running the real cAdvisor alongside
Kubelet. This change is primarily no-op refactoring. The main behavioral
change is that we always create a cAdvisor interface and expect it to
always be available. When we make a request, if cAdvisor is not
connected the request fails with a connection error. This failure is
handled today as well.
e2e auth breakage it caused. The fix is to not set project/zone/kube_master
to the empty string partway through the script, which I should have
realized was a bad idea in the first place.
Revise our code to only call Request.Namespace() if a namespace
*should* be present. For root scoped resources, namespace should
be ignored. For namespaced resources, it is an error to have
Namespace=="".
This does away with the giant dump from cobra for kubectl and instead
generates md files which contain similar information, but one per verb.
This might work well as part of the cobra project, instead of doing it
in kube, but this gets us nice, linked, documentation right now. If
people like it, I will try to get something similar into cobra.
generate man pages for kubectl using the cobra.Command information.
This will directly create files in (by default) docs/man/man1/ called
kubectl*.1. Each child verb/cobra command will gets its own man page.
--sync_nodes=false gives user flexibility to add/remove nodes in the
cluster using REST api/kubectl cli and at the same time can use
cloud provider for other resources like persistent disks, etc.
Currently, the validation logic validates fields in an object and supply default
values wherever applies. This change factors out defaulting to a set of
defaulting callback functions for decoding (see #1502 for more discussion).
* This change is based on pull request 2587.
* Most defaulting has been migrated to defaults.go where the defaulting
functions are added.
* validation_test.go and converter_test.go have been adapted to not testing the
default values.
* Fixed all tests with that create invalid objects with the absence of
defaulting logic.
This leaves `pkg/kubelet/server/server.go` looking a little ugly as there is an extra layer of "config" structs that isn't needed. This is left as a TODO for now.
Allow the master to have pod/node cache timeouts controlled via a config
flag for integration tests.
Move integration test to '127.0.0.1' so that it correctly returns a health
check, and enable health check testing on the integration test.
Part of #108.
Also:
* Added hyperkube cmd (not built by default yet).
* Added version support to hyperkube
* Remove health_check_minions flag from apiserver as it is no longer used with #3733
Use ginkgo's native support for JUnit in order to generate the XML file.
This is a first step in better integration of our e2e tests with
Jenkins. In order to improve the logged information, we will probably
need to have more native ginkgo tests but this step allows us to see
what Jenkins can already do with this information and what we need to
tweak to improve it.
Tested by running the full e2e tests and inspecting the contents of
junit.xml on the top of the tree.
Textual output is still generated on the console to keep the current
goe2e.sh logs available until the full conversion of our Jenkins
instance to use the JUnit XML is completed.
Break up the monolithic volumes code in kubelet into very small individual
modules with a well-defined interface. Move them all into their own packages
and beef up testing along the way.
* Add --orderseed, shuffle order every time, report order for repeatability
* Add --times, acts like a multi-deck shoe
* Remove fixed numbering in TAP output (this is actually not needed;
TAP output is just done by outputting what assertion count you're on.)
This is essentially just a port of f3a992aa and 369064c6 (minus
reporting, which can be handled later when we make TAP, etc, better).
This syntax is akin to what Python unittest uses for running a subset of the tests.
If a test gets skipped, log it. If an invalid test test is passed to --test, warn about it.
Added a kubelet config source for watching pods on apiserver.
The pods are converted to boundpods for merging with other
config sources.
The preferred way to create a kubelet is now to pass an apiserver
client but not an etcd client. Changed cmd/integration to use
apiserver to talk to kubelets. And cmd/kubernetes.
Unit, integration, and e2e tests pass, except for a failure of the pd
e2e test which was unrelated.
Make all kubelet config sources ensure that UID and Namespace are defaulted, if
need be.
We can *almost* disable the "if blank" logic for UID, except for tests that
call APIs that do not run through SyncPods. We really ought to be enforcing
invariants better.
This exposes the proper v1beta3 API endpoint when the user specifies
the --runtime_config=api/v1beta3 argument to the apiserver. v1beta3
is still considered experimental and subject to change.
--runtime_config is a map of string keys and values, that can be
specified by providing
--runtime_config=a=b,b=c,d,e
Only the key must be specified, the value can be omitted.
Enables v1beta3 in hack/local-up-cluster.sh and hack/test-cmd.sh
Fixes issue #3202.
* Validates SyncFrequency and minimum GC age and propagates error to RunKubelet
* Defaults for Kubelet config and minor cleanup
* cmd Kubelet MinimumGCAge to 1m instead of 0
make etcd registry pass test
fix kubelet config for quantity
fix openstack for quantity
fix controller for quantity
fix last tests for quantity
wire into binaries
fix controller manager
fix build for 32 bit systems
This adds --cluster_dns and --cluster_domain flags to kubelet. If
non-empty, kubelet will set docker --dns and --dns-search flags based on
these. It uses the cluster DNS and appends the hosts's DNS servers.
Likewise for DNS search domains.
This also adds API support to bypass cluster DNS entirely, needed to
bootstrap DNS.
Add test artifacts to the build. This lets you do:
tar -xzf kubernetes.tar.gz
tar -xzf kubernetes-test.tar.gz
cd kubernetes
go run ./hack/e2e.go -up -test -down
without having a git checkout.
OpenShift would like to also enable swagger, but we need to register our
services as swagger services prior to the SwaggerAPI being started. I've
added a bool (default false) to master.Config to enable swagger, and split
the method in master out so that a downstream consumer can call it.
RegisterHandlers was called after the listening for events had already begun.
So, there was a race where sometimes the first update would, with the
initial state, would notify an empty list of listeners.
This showed up in services.sh e2e test as empty service and endpoint maps
after the test step which restarts the kube-proxy.
Perhaps due to timing, this doesn't show up with etcd source, but does
show up with apiserver as a source. A separate PR makes APIserver
the source as a default, and depends on this.
This took me several days to debug.
Configure apiserver to serve Securely on port 6443.
Generate token for kubelets during master VM startup.
Put token into file apiserver can get and another file the kubelets can get.
Added e2e test.
This change refactors the way Kubelet's DockerPuller handles the docker config credentials to utilize a new credentialprovider library.
The credentialprovider library is based on several of the files from the Kubelet's dockertools directory, but supports a new pluggable model for retrieving a .dockercfg-compatible JSON blob with credentials.
With this change, the Kubelet will lazily ask for the docker config from a set of DockerConfigProvider extensions each time it needs a credential.
This change provides common implementations of DockerConfigProvider for:
- "Default": load .dockercfg from disk
- "Caching": wraps another provider in a cache that expires after a pre-specified lifetime.
GCP-only:
- "google-dockercfg": reads a .dockercfg from a GCE instance's metadata
- "google-dockercfg-url": reads a .dockercfg from a URL specified in a GCE instance's metadata.
- "google-container-registry": reads an access token from GCE metadata into a password field.
apiserver becomes kube-apiserver
controller-manager -> kube-controller-manager
scheduler and proxy similarly.
Only thing I promise is that right now hack/build-go.sh and
build/release.sh exit with 0. That's it. Who knows if any of this
actually works....
Added function to read basic ACL from a CSV file.
Added implementation of Authorize based on that file's policies.
Added docs on authentication and authorization.
Added example file and tested it.
Added basic interface for authorizer implementations.
Added default "authorize everything" and "authorize nothing
implementations.
Added authorization check immediately after authentication check.
Added an integration test of authorization at the HTTP level of
abstraction.
Callsites no longer allocate a mux.
Master now exposes method to install handlers
which use the master's auth code. Not used
but forks (openshift) are expected to use these
methods. These methods will later be a point
for additional plug-in functionality.
Integration tests now use the master-provided
handler which has auth, rather than using the mux,
which didn't. Fix TestWhoAmI now that /_whoami
sits behind auth.
Moved code from cmd/apiserver to pkg/master.
test/integration/client_test made to use a master object,
instead of an apiserver.Handle.
Subsequent PRs will move more handler-installation into
pkg/master, with the goal that every http.Handler of a
standalone apiserver process can also be tested
in a "testing"-style go test.
In particular, a subsequent PR will test
authorization.
Prepares for the meta object to front multiple underlying types
when TypeMeta and ObjectMeta is split in internal and v1beta3, but
combined in v1beta1 and v1beta2
Allows us to define different watch versioning regimes in the future
as well as to encode information with the resource version.
This changes /watch/resources?resourceVersion=3 to start the watch at
4 instead of 3, which means clients can read a resource version and
then send it back to the server. Clients should no longer do math on
resource versions.
* Default file based implementation
* Define some simple interfaces
* Add -token_auth_file to apiserver that will start the apiserver
with a request filter for tokens
Public access to the DockerHub is not guaranteed in all environments,
add a flag to the kubelet that allows it to use a different image (like
one on a private registry) as well as only pull the first time the
image is needed.
Fixes#1545
* Allows consumers to provide their own transports for common cases.
* Supports KUBE_API_VERSION on test cases for controlling which
api version they test against
* Provides a common flag registration method for CLIs that need
to connect to an API server (to avoid duplicating flags)
* Ensures errors are properly returned by the server
* Add a Context field to client.Config
* Defaults to v1beta1
* apiserver takes -storage_version which controls etcd storage version
and the version of the client used to connect to other apiservers
* Changed signature of client.New to add version parameter
* All controller code and component code prefers the oldest (most common)
server version
* Make Codec separate from Scheme
* Move EncodeOrDie off Scheme to take a Codec
* Make Copy work without a Codec
* Create a "latest" package that imports all versions and
sets global defaults for "most recent encoding"
* v1beta1 is the current "latest", v1beta2 exists
* Kill DefaultCodec, replace it with "latest.Codec"
* This updates the client and etcd to store the latest known version
* EmbeddedObject is per schema and per package now
* Move runtime.DefaultScheme to api.Scheme
* Split out WatchEvent since it's not an API object today, treat it
like a special object in api
* Kill DefaultResourceVersioner, instead place it on "latest" (as the
package that understands all packages)
* Move objDiff to runtime.ObjectDiff
Currently the apiserver will not start unless a machine list or a
valid cloud provider is specified. This prevents a workflow that
manages machines solely through the minions API.
Fix the issue by changing the apiserver to only log a message that
the apiserver is being started with an empty machine list.
This patch results in a change in behavior. The apiserver will no
longer exit non-zero if a cloud provider or machine list is not
configured.
In particular, add support for -server_version=raw and use matching
format for the output of -version and -server_version.
The "normal" format is essentially defined by (version.Info) String()
method, so future updates to that method will be reflected on both.
Full version information is still available by using the "raw" flag.
Tested:
- Used cluster/kubecfg.sh to query local build and the server.
$ cluster/kubecfg.sh -version
Kubernetes version 0.2+, build 9316edfc0d2b28923fbb6eafa38458350859f926
$ cluster/kubecfg.sh -server_version
Server: Kubernetes version 0.2, build a0abb38157
$ cluster/kubecfg.sh -version=raw
version.Info{Major:"0", Minor:"2+", GitVersion:"v0.2-25-g9316edfc0d2b28", GitCommit:"9316edfc0d2b28923fbb6eafa38458350859f926", GitTreeState:"clean"}
$ cluster/kubecfg.sh -server_version=raw
version.Info{Major:"0", Minor:"2", GitVersion:"v0.2", GitCommit:"a0abb3815755d6a77eed2d07bb0aa7d255e4e769", GitTreeState:"clean"}
Fixes: #1092
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Cloud providers may need specific configurations to run properly (e.g.
authentication parameters, uri, etc.).
This patch adds the simplest implementation for passing configurations
to cloudproviders: a new apiserver -cloud_config flag to specify the
path to an arbitrary configuration file.
Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
Use images and some better formatting. Also add scripts to help prevent typos.
This based on an improved version done by Julia Ferraioli. She came up with the cool images.
This avoids some conflict with the built-in `flag` module in Go. The
module was already being renamed to `verflag` on import anyways, so we
might as well just call it that.
Tested:
- hack/build-go.sh and ran the resulting binaries with -version args.
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Convert host:port and URLs passed to client.New() into the proper
values, and return an error if the value is invalid. Change CLI
to return an error if -master is invalid. Remove Client.rawRequest
which was not in use, and fix the involved tests. Add NewOrDie
Preserves the behavior of the client to not auth when a non-https
URL is passed (although in the future this should be corrected).
Also rename some to other names that make better reading. There are still a
bunch of "make" functions but they do things like assemble a string from parts
or build an array of things. It seemed that "make" there seemed fine. "New"
is for "constructors".
We have a few long running operations today (sync=true, watch) that
exceed the original default http.Server timeout. We should set the
timeout to a high enough limit that more granular controls can be
implemented.
Prepare for running multiple API versions on the same HTTP server
by decoupling some of the mechanics of apiserver. Define a new
APIGroup object which represents a version of the API.
The sync frequency should be part of the syncLoop and resync no
less often than every X seconds. The current implementation runs
even if a config update was delivered less than X seconds ago.
because it causes a runtime panic if a binary which has its own implementation
of "-version" flag tries to reuse a package library which indirectly depend on
"pkg/version".
e.g. If such an user-defined binary tires to link "pkg/api" or "pkg/client",
the binary fails with a runtime panic "flag redefined: version".
To make sure the etcd watcher works, I changed the replication
controller to use watch.Interface. I made apiserver support watches on
controllers, so replicationController can be run only off of the
apiserver. I made sure all the etcd watch testing that used to be in
replicationController is now tested on the new etcd watcher in
pkg/tools/.
Tested: Passed -version argument to kubelet (and all other binaries):
$ output/go/bin/kubecfg -version
Kubernetes version 0.1, build 6454a541fd56
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
This change adds additional test coverage for the kubecfg
command. There is now a test for the case when the auth info
file does not exist. LoadAuthInfo tests have been refactored
to use table testing.
Setting up a new master.Master instance requires passing
around too many arguments.
Add a master.Config type and group related master configs.
Refactor all commands to instantiate new masters using a
master.Config struct.
The current integration tests do not return after delegating
HTTP requests, as a result an extra call to response.WriteHeader
is made for every request.
Fix the issue by returning after delegating HTTP requests.
Also make the output and validation of input better for kubecfg api calls.
Kubecfg will now display a usage argument if the URL is incorrect or an
unrecognized storage type is passed.
Check for 1 path segment on create/list, 2 on update/delete, and
allow any number of path segments on get (for now).
Also pretty prints the list of actual types that are supported for
create/update, which today corresponds to the list of types that
are supported period.
Also transfer the Kubelet from using ContainerManifest.ID to source specific
identifiers with namespacing. Move goroutine behavior out of kubelet/ and
into integration.go and cmd/kubelet/kubelet.go for better isolation.
The -etcd_servers flag is used inconsistently by the Kubernetes commands,
both externally and internally.
This patch fixes the issue by using the same type to represent a list of
etcd servers internally, and declares the -etcd_servers flag consistently
across all commands.
This patch should be 100% backwards compatible with no changes in behavior.
Splits endpoint and service configuration into their own objects. Also makes
the endpoint and service configuration tests correct - there was a race condition
previously that meant tests were passing but not checking correct code.
Improve apiserver/logger.go's interface (it's pretty cool now).
Improve apiserver's error reporting to clients.
Improve client's handling of errors from apiserver.
Make failed PUTs return 409 (conflict)-- http status codes are amazingly
well defined for what we're doing!
1) imported glog to third_party (previous commit)
2) add support for third_party/update.sh to update just one pkg
3) search-and-replace:
s/log.Printf/glog.Infof/
s/log.Print/glog.Info/
s/log.Fatalf/glog.Fatalf/
s/log.Fatal/glog.Fatal/
4) convert glog.Info.*, err into glog.Error*
Adds some util interfaces to logging and calls them from each cmd, which
will set the default log output to write to glog. Pass glog-wrapped
Loggers to etcd for logging.
Log files will go to /tmp - we should probably follow this up with a
default log dir for each cmd.
The glog lib is sort of weak in that it only flushes every 30 seconds, so
we spin up our own flushing goroutine.