Tests "Multi-AZ Cluster Volumes" should consider only nodes that are
schedulable and *untainted* when computing AZ where to run the tests.
GetReadySchedulableNodes() already filters schedulable + untainted nodes,
no need to do it again in GetSchedulableClusterZones().
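A minimal sketch of the resulting flow, assuming the e2e framework helpers
named above and the standard topology.kubernetes.io/zone label (the label key
actually in use at the time may differ):

    import (
        "k8s.io/apimachinery/pkg/util/sets"
        clientset "k8s.io/client-go/kubernetes"
        e2enode "k8s.io/kubernetes/test/e2e/framework/node"
    )

    // schedulableClusterZones returns the zones that contain at least one
    // ready, schedulable, untainted node. GetReadySchedulableNodes already
    // applies that filtering, so no extra checks are needed here.
    func schedulableClusterZones(c clientset.Interface) (sets.String, error) {
        nodes, err := e2enode.GetReadySchedulableNodes(c)
        if err != nil {
            return nil, err
        }
        zones := sets.NewString()
        for _, node := range nodes.Items {
            if zone, ok := node.Labels["topology.kubernetes.io/zone"]; ok {
                zones.Insert(zone)
            }
        }
        return zones, nil
    }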
For some reason, when we send the kubelet logs to journald, many log lines
are consistently dropped as soon as the PLEG is started.
If we log directly to a file, we don't have this problem. As a bonus, if
the tests crash, the kubelet logs will always be available, since they
were already written; otherwise we normally wait until the end of the
test run to collect them from journald, which means we often end up
with empty logs.
Removes any reference to the registry gcr.io/kubernetes-e2e-test-images in
kubernetes/kubernetes, replacing it with k8s.gcr.io/kubernetes-e2e-test-images.
In some cases the images had to be updated, since a few things have changed
since their original implementation; most notably, some of the images have
been centralized into the agnhost image.
Co-Authored-By: Claudiu Belu <cbelu@cloudbasesolutions.com>
- recover to last-known-good ConfigMap.KubeletConfigKey
  ~12m to run in CI, ~13m locally
- non-nil last-known-good to a new non-nil last-known-good
  ~24m to run in CI
- recover to last-known-good ConfigMap
  ~12m to run in CI
- state transitions
  ~8m to run in CI
Including a skip method as the first line of a test does not prevent the
test from failing in the BeforeEach function. If the test is instead skipped
based on a tag in its name, we can prevent this odd behavior.
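A minimal Ginkgo sketch of why this happens; the names here are illustrative,
and e2eskipper refers to the e2e framework's skipper package:

    import (
        "github.com/onsi/ginkgo"
        e2eskipper "k8s.io/kubernetes/test/e2e/framework/skipper"
    )

    var _ = ginkgo.Describe("[sig-example] feature", func() {
        ginkgo.BeforeEach(func() {
            // Runs before the It body below; if this fails, the Skip
            // inside the test never gets a chance to run.
        })
        ginkgo.It("should work [Feature:Example]", func() {
            e2eskipper.Skipf("environment does not support this feature")
            // ...actual test...
        })
    })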
We now use a host-local exec instead of SSH commands, which simplifies the
test and makes the result more robust.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
This assumes that SSH via bastion works if the `KUBE_SSH_BASTION`
environment variable is set, which is the case for
`pull-kubernetes-e2e-gce-correctness`.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
1. Fix an empty-command issue in some Windows storage tests
2. Enable more Windows storage tests by adding an NTFS test pattern
Change-Id: Ic33be282d669a23107474a14d4368bbf95c9b459
This adds e2e tests for HPA ContainerResource metrics, covering two
scenarios (see the sketch below):
1. Scale up on a busy application with an idle sidecar container.
2. Do not scale up on a busy sidecar with an idle application.
Signed-off-by: Vivek Singh <svivekkumar@vmware.com>
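For illustration, a hedged sketch of the kind of metric spec these scenarios
exercise, using the autoscaling/v2beta2 types; the container name
"application" is made up:

    import (
        autoscalingv2 "k8s.io/api/autoscaling/v2beta2"
        v1 "k8s.io/api/core/v1"
    )

    // Scale on the CPU utilization of the "application" container only,
    // so a busy sidecar alone does not trigger a scale-up.
    utilization := int32(50)
    metric := autoscalingv2.MetricSpec{
        Type: autoscalingv2.ContainerResourceMetricSourceType,
        ContainerResource: &autoscalingv2.ContainerResourceMetricSource{
            Name:      v1.ResourceCPU,
            Container: "application",
            Target: autoscalingv2.MetricTarget{
                Type:               autoscalingv2.UtilizationMetricType,
                AverageUtilization: &utilization,
            },
        },
    }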
* Squashed commit of the following:
commit 7f774dcb54b511a3956aed0fac5c803f145e383a
Author: Jay Vyas (jayunit100) <jvyas@vmware.com>
Date: Fri Jun 18 10:58:16 2021 +0000
fix commit message
commit 0ac09650742f02004dbb227310057ea3760c4da9
Author: jay vyas <jvyas@vmware.com>
Date: Thu Jun 17 07:50:33 2021 -0400
Update test/e2e/network/netpol/kubemanager.go
Co-authored-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
commit 6a8bf0a6a2690dac56fec2bdcdce929311c513ca
Author: jay vyas <jvyas@vmware.com>
Date: Sun Jun 13 08:17:25 2021 -0400
Implement Service polling for network policy suite to remove reliance on CoreDNS when verifying network policies
Update test/e2e/network/netpol/probe.go
Co-authored-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
Add defaultNS to use the service probe
commit b9c17a48327aab35a855540c2294a51137aa4a48
Author: Matthew Fenwick <mfenwick100@gmail.com>
Date: Thu May 27 07:30:59 2021 -0400
address code review comments for networkpolicy decoupling from dns
commit e23ef6ff0d189cf2ed80dbafed9881d68402cb56
Author: jay vyas <jvyas@vmware.com>
Date: Wed May 26 13:30:21 2021 -0400
NetworkPolicy decoupling from DNS
gofmt
remove old function
* model refactor
* minor
* dropped getK8sModel func
* dropped modelMap, added global model in BeforeEach and subsequent changes
Co-authored-by: Rajas Kakodkar <rajaskakodkar16@gmail.com>
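A hypothetical sketch of the idea behind the service probe: build the agnhost
connect command against the Service's ClusterIP fetched from the API, so the
in-pod probe needs no DNS lookup. The helper name below is made up and the
suite's real helpers differ in detail:

    import (
        "context"
        "net"
        "strconv"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        clientset "k8s.io/client-go/kubernetes"
    )

    // serviceProbeCommand returns an agnhost command that dials the
    // service address directly, so the client pod never resolves a
    // cluster DNS name.
    func serviceProbeCommand(ctx context.Context, c clientset.Interface, ns, svcName string, port int) ([]string, error) {
        svc, err := c.CoreV1().Services(ns).Get(ctx, svcName, metav1.GetOptions{})
        if err != nil {
            return nil, err
        }
        addr := net.JoinHostPort(svc.Spec.ClusterIP, strconv.Itoa(port))
        return []string{"/agnhost", "connect", addr, "--timeout=1s", "--protocol=tcp"}, nil
    }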
Prevent the kubelet from incorrectly interpreting "not yet started" pods as "ready to terminate" pods by unifying responsibility for the pod lifecycle into the pod worker.
Add e2e tests to cover the basic flows for the `full-pcpus-only` option:
a negative flow to ensure rejection with a proper error message, and a
positive flow to verify the actual CPU allocation (a configuration sketch
follows below).
Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
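For reference, a hedged sketch of where this option lives, assuming the
KubeletConfiguration fields for CPU manager policy options:

    import (
        kubeletconfig "k8s.io/kubernetes/pkg/kubelet/apis/config"
    )

    cfg := &kubeletconfig.KubeletConfiguration{
        CPUManagerPolicy: "static",
        // Reject pod admission unless the container can be granted
        // exclusive full physical cores (no sibling-thread sharing).
        CPUManagerPolicyOptions: map[string]string{
            "full-pcpus-only": "true",
        },
    }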
Through Job.status.uncountedPodUIDs and a Pod finalizer.
An annotation marks whether a job should be tracked with the new behavior.
A separate work queue is used to remove finalizers from orphan pods.
Change-Id: I1862e930257a9d1f7f1b2b0a526ed15bc8c248ad
As of now, we allow PDBs to be applied to pods via selectors, so there
can be unmanaged pods (pods without backing controllers) that still
have PDBs associated. Such pods are now logged instead of immediately
causing a sync error. This ensures the disruption controller does not
frequently update the status subresource, preventing excessive and
expensive writes to etcd.
A number of race conditions exist when pods are terminated early in
their lifecycle, because components in the kubelet need to know "no
running containers" or "containers can't be started from now on" but
relied on outdated state.
Only the pod worker knows whether containers are being started for
a given pod, which is required to know when a pod is "terminated"
(no running containers, none coming). Move that responsibility and
the podKiller function into the pod workers, and have everything that
was killing the pod go through the UpdatePod loop. Split syncPod into
three phases - setup, terminate containers, and clean up the pod - and
make the transitions between those methods visible to other
components. After this change, to kill a pod you tell the pod worker
to UpdatePod({UpdateType: SyncPodKill, Pod: pod}).
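A minimal sketch of that call; PodWorkers and UpdatePodOptions refer to the
kubelet-internal types described above and are illustrative here:

    import (
        v1 "k8s.io/api/core/v1"
        kubetypes "k8s.io/kubernetes/pkg/kubelet/types"
    )

    // killPod no longer stops containers itself; it hands the pod to the
    // single component that owns lifecycle state, the pod worker.
    func killPod(podWorkers PodWorkers, pod *v1.Pod) {
        podWorkers.UpdatePod(UpdatePodOptions{
            UpdateType: kubetypes.SyncPodKill,
            Pod:        pod,
        })
    }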
Several places in the kubelet were incorrect about whether they
were handling terminating (should stop running, might still have
containers) or terminated (no running containers) pods. The pod worker
now exposes methods that allow other loops to know when to set up or
tear down resources based on the state of the pod. These methods remove
the possibility of race conditions by ensuring a single component is
responsible for knowing each pod's allowed state, while other components
simply check, by UID, whether they are inside that window.
Removing containers no longer blocks final pod deletion in the API
server; container removal is handled as background cleanup. Node
shutdown no longer marks pods as failed, since they can be restarted
in the next step.
See https://docs.google.com/document/d/1Pic5TPntdJnYfIpBeZndDelM-AbS4FN9H2GTLFhoJ04/edit# for details
UserInfo contains a uid field alongside groups, username, and extra.
This change makes it possible to pass a UID through as an impersonation
header, just as you can with Impersonate-Group, Impersonate-User, and
Impersonate-Extra.
This PR contains:
* Changes to impersonation.go to parse the Impersonate-Uid header and authorize uid impersonation
* Unit tests for allowed and disallowed impersonation cases
* An integration test that creates a CertificateSigningRequest using impersonation,
and ensures that the API server populates the correct impersonated spec.uid upon creation.
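A short sketch of what the new header looks like on the wire alongside the
existing ones; the server URL, token, and UID values are made up:

    import (
        "log"
        "net/http"
    )

    req, err := http.NewRequest(http.MethodGet,
        server+"/apis/certificates.k8s.io/v1/certificatesigningrequests", nil)
    if err != nil {
        log.Fatal(err)
    }
    // The caller must be authorized to impersonate users and uids.
    req.Header.Set("Authorization", "Bearer "+adminToken)
    req.Header.Set("Impersonate-User", "jane")
    req.Header.Set("Impersonate-Group", "developers")
    req.Header.Set("Impersonate-Uid", "1f8d1c36-0a5e-4f28-8d3a-0123456789ab")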
1. Add AllocateLoadBalancerNodePorts fields in specs for validation test cases
2. Update the fuzzer
3. In the resource quota e2e test, allocate a node port for a LoadBalancer-type
   service and exceed the node port quota (see the Service sketch below)
Signed-off-by: Hanlin Shi <shihanlin9@gmail.com>
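For context, a hedged sketch of the Service field involved; pointer.BoolPtr
is from k8s.io/utils/pointer:

    import (
        v1 "k8s.io/api/core/v1"
        "k8s.io/utils/pointer"
    )

    svc := v1.Service{
        Spec: v1.ServiceSpec{
            Type: v1.ServiceTypeLoadBalancer,
            // When true, each port of the LoadBalancer Service also
            // allocates a node port, which counts against the
            // services.nodeports resource quota.
            AllocateLoadBalancerNodePorts: pointer.BoolPtr(true),
        },
    }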
This change updates the CSR API to add a new, optional field called
expirationSeconds. This field is a request to the signer for the
maximum duration the client wishes the cert to have. The signer is
free to ignore this request based on its own internal policy. The
signers built into KCM will honor this field if it is not set to a
value greater than --cluster-signing-duration. The minimum allowed
value for this field is 600 seconds (ten minutes).
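A hedged sketch of a client requesting a bounded lifetime via this field;
pemEncodedCSR and the object name are placeholders:

    import (
        certificatesv1 "k8s.io/api/certificates/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    expiration := int32(3600) // one hour; must be at least 600 seconds
    csr := &certificatesv1.CertificateSigningRequest{
        ObjectMeta: metav1.ObjectMeta{Name: "example-client-cert"},
        Spec: certificatesv1.CertificateSigningRequestSpec{
            SignerName: "kubernetes.io/kube-apiserver-client",
            Request:    pemEncodedCSR, // PEM-encoded PKCS#10 request
            Usages:     []certificatesv1.KeyUsage{certificatesv1.UsageClientAuth},
            // The signer is free to ignore this request and apply its
            // own policy when choosing the issued duration.
            ExpirationSeconds: &expiration,
        },
    }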
This change will help enforce safer durations for certificates in
the Kube ecosystem and will help related projects such as
cert-manager with their migration to the Kube CSR API.
Future enhancements may update the Kubelet to take advantage of this
field when it is configured in a way that can tolerate shorter
certificate lifespans with regular rotation.
Signed-off-by: Monis Khan <mok@vmware.com>