
Automatic merge from submit-queue
Add e2e tests that check for wrapped volume race
This PR adds two new e2e tests that reproduce the race condition fixed in #29641 (see e.g. #29297)
In order to observe the race, you need to revert the PR that fixes it, via e.g.
```
git revert -n df1e925143
```
or
```
curl -sL https://github.com/kubernetes/kubernetes/pull/29641.patch | patch -p1 -R
```
The tests are `[Slow]` because they need to run several passes that involve creating pods with many volumes. They also are `[Serial]` because the load on the cluster may affect reproducibility of the race. They take about ~450s each when they fail on standard GCE cluster created by `go run hack/e2e.go -v --up`. `git_repo` test takes about 66s to run when it succeeds (fix PR not reverted) and `configmap` test takes about 546s in this case because configmap mounting is slower and still requires 3 passes x 5 pods x 50 configmap volumes to fail constantly with fix PR reverted. Probably these times can be reduced but frankly I've already spent quite a bit of time on tuning the numbers to find a balance between reproducibility and speed.
Managed to reproduce the problem in more or less reliable way for `configMap` and `gitRepo` volumes. Tried to reproduce it for `secret` volumes too but without success so far because they use tmpfs-based `emptyDir` variety. For `downwardAPI` volumes I expect the same problems with race reproducibility as with `secret` volumes, although I think some e2e races were caused by the bug, e.g. #29633.
The tests operate by creating several pods (via an RC) with many volumes and waiting for them to become Running. It sets node affinity for pods so that they all get created on a single node (the first one in the node list). The race condition leads to volume mount failures with slow retries, thus causing the test to time out.
The test failures look like this:
configmap:
```
• Failure [435.547 seconds]
[k8s.io] Wrapped EmptyDir volumes
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:709
should not cause race condition when used for configmaps [Serial] [Slow] [It]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/wrapped_empty_dir.go:170
Failed waiting for pod wrapped-volume-race-8c097734-6376-11e6-9ffa-5254003793ad-acbtt to enter running state
Expected error:
<*errors.errorString | 0xc8201758d0>: {
s: "timed out waiting for the condition",
}
timed out waiting for the condition
not to have occurred
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/wrapped_empty_dir.go:395
```
You'll see errors like this in kubelet log on the first node in the cluster:
```
E0816 00:27:23.319431 3510 configmap.go:174] Error creating atomic writer: stat /var/lib/kubelet/pods/e5986355-6347-11e6-a5d7-42010af00002/volumes/kubernetes.io~configmap/racey-configmap-14: no such file or directory
E0816 00:27:23.319478 3510 nestedpendingoperations.go:232] Operation for "\"kubernetes.io/configmap/e5986355-6347-11e6-a5d7-42010af00002-racey-configmap-14\" (\"e5986355-6347-11e6-a5d7-42010af00002\")" failed. No retries permitted until 2016-08-16 00:28:27.319450118 +0000 UTC (durationBeforeRetry 1m4s). Error: MountVolume.SetUp failed for volume "kubernetes.io/configmap/e5986355-6347-11e6-a5d7-42010af00002-racey-configmap-14" (spec.Name: "racey-configmap-14") pod "e5986355-6347-11e6-a5d7-42010af00002" (UID: "e5986355-6347-11e6-a5d7-42010af00002") with: stat /var/lib/kubelet/pods/e5986355-6347-11e6-a5d7-42010af00002/volumes/kubernetes.io~configmap/racey-configmap-14: no such file or directory
```
git_repo:
```
• Failure [455.035 seconds] [0/1882]
[k8s.io] Wrapped EmptyDir volumes
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:709
should not cause race condition when used for git_repo [Serial] [Slow] [It]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/wrapped_empty_dir.go:179
Failed waiting for pod wrapped-volume-race-71b12b3d-6375-11e6-9ffa-5254003793ad-b0slz to enter running state
Expected error:
<*errors.errorString | 0xc8201758d0>: {
s: "timed out waiting for the condition",
}
timed out waiting for the condition
not to have occurred
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/wrapped_empty_dir.go:395
```
Errors in kubelet log:
```
E0815 23:41:08.670203 3510 nestedpendingoperations.go:232] Operation for "\"kubernetes.io/git-repo/97636bd8-6341-11e6-a5d7-42010af00002-racey-git-repo-8\" (\"97636bd8-6341-11e6-a5d7-42010af00002\")" failed. No retries permitted until 2016-08-15 23:42:12.670181604 +0000 UTC (durationBeforeRetry 1m4s). Error: MountVolume.SetUp failed for volume "kubernetes.io/git-repo/97636bd8-6341-11e6-a5d7-42010af00002-racey-git-repo-8" (spec.Name: "racey-git-repo-8") pod "97636bd8-6341-11e6-a5d7-42010af00002" (UID: "97636bd8-6341-11e6-a5d7-42010af00002") with: failed to exec 'git clone http://10.0.68.35:2345 test': : chdir /var/lib/kubelet/pods/97636bd8-6341-11e6-a5d7-42010af00002/volumes/kubernetes.io~git-repo/racey-git-repo-8: no such file or directory
```
Generally, the races cause unexpected "no such directory" errors in kubelet logs with subsequent volume mount failures.
I've added race tests to e2e test `empty_dir_wrapper.go` ("EmptyDir wrapper volumes"). This test was added in #18445, the same PR that introduced the race bug. The original purpose of the test was making sure that no conflicts occur between different wrapped emptyDir volumes, so I've replaced "should becomes" with "should not conflict" in the first `It(...)`.
Kubernetes
Are you ...
- Interested in learning more about using Kubernetes? Please see our user-facing documentation on kubernetes.io
- Interested in hacking on the core Kubernetes code base? Keep reading!
Kubernetes is an open source system for managing containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications.
Kubernetes is:
- lean: lightweight, simple, accessible
- portable: public, private, hybrid, multi cloud
- extensible: modular, pluggable, hookable, composable
- self-healing: auto-placement, auto-restart, auto-replication
Kubernetes builds upon a decade and a half of experience at Google running production workloads at scale, combined with best-of-breed ideas and practices from the community.
Kubernetes is ready for Production!
With the 1.0.1 release Kubernetes is ready to serve your production workloads.
Kubernetes can run anywhere!
You can run Kubernetes on your local workstation under Vagrant, cloud providers (e.g. GCE, AWS, Azure), and physical hardware. Essentially, anywhere Linux runs you can run Kubernetes. Checkout the Getting Started Guides for details.
Concepts
Kubernetes works with the following concepts:
- Cluster
- A cluster is a set of physical or virtual machines and other infrastructure resources used by Kubernetes to run your applications. Kubernetes can run anywhere! See the Getting Started Guides for instructions for a variety of services.
- Node
- A node is a physical or virtual machine running Kubernetes, onto which pods can be scheduled.
- Pod
- Pods are a colocated group of application containers with shared volumes. They're the smallest deployable units that can be created, scheduled, and managed with Kubernetes. Pods can be created individually, but it's recommended that you use a replication controller even if creating a single pod.
- Replication controller
- Replication controllers manage the lifecycle of pods. They ensure that a specified number of pods are running at any given time, by creating or killing pods as required.
- Service
- Services provide a single, stable name and address for a set of pods. They act as basic load balancers.
- Label
- Labels are used to organize and select groups of objects based on key:value pairs.
Documentation
Kubernetes documentation is organized into several categories.
- Getting started guides
- for people who want to create a Kubernetes cluster
- for people who want to port Kubernetes to a new environment
- User documentation
- for people who want to run programs on an existing Kubernetes cluster
- in the Kubernetes User Guide: Managing Applications Tip: You can also view help documentation out on http://kubernetes.io/docs/.
- the Kubectl Command Line Interface is a detailed reference on
the
kubectl
CLI - User FAQ
- Cluster administrator documentation
- for people who want to create a Kubernetes cluster and administer it
- in the Kubernetes Cluster Admin Guide
- Developer and API documentation
- for people who want to write programs that access the Kubernetes API, write plugins or extensions, or modify the core Kubernetes code
- in the Kubernetes Developer Guide
- see also notes on the API
- see also the API object documentation, a detailed description of all fields found in the core API objects
- Walkthroughs and examples
- hands-on introduction and example config files
- in the user guide
- in the docs/examples directory
- Contributions from the Kubernetes community
- in the docs/contrib directory
- Design documentation and design proposals
- for people who want to understand the design of Kubernetes, and feature proposals
- design docs in the Kubernetes Design Overview and the docs/design directory
- proposals in the docs/proposals directory
- Wiki/FAQ
- in the wiki
- troubleshooting information in the troubleshooting guide
Community, discussion, contribution, and support
See which companies are committed to driving quality in Kubernetes on our community page.
Do you want to help "shape the evolution of technologies that are container packaged, dynamically scheduled and microservices oriented?"
You should consider joining the Cloud Native Computing Foundation. For details about who's involved and how Kubernetes plays a role, read their announcement.
Code of conduct
Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.
Are you ready to add to the discussion?
We have presence on:
You can also view recordings of past events and presentations on our Media page.
For Q&A, our threads are at:
Want to contribute to Kubernetes?
If you're interested in being a contributor and want to get involved in developing Kubernetes, start in the Kubernetes Developer Guide and also review the contributor guidelines.
Or, if you just have an idea for a new feature, see the Kubernetes Features repository for details on how to propose it.
Support
While there are many different channels that you can use to get ahold of us, you can help make sure that we are efficient in getting you the help that you need.
If you need support, start with the troubleshooting guide and work your way through the process that we've outlined.
That said, if you have questions, reach out to us one way or another. We don't bite!
Community resources
- Awesome-kubernetes - http://ramitsurana.github.io/awesome-kubernetes
You can find more projects, tools and articles related to Kubernetes on the awesome-kubernetes list. Add your project there and help us make it better.
- CoreKube - https://corekube.com:
Instructive & educational resources for the Kubernetes community. By the community.
- Community Documentation
Here you can learn more about the current happenings in the kubernetes community.