61 Commits

Author SHA1 Message Date
Jefftree
97eed70d74 Add konnectivity log files 2020-02-03 13:21:42 -08:00
Jacek Kaniuk
2dc3684cf7 Fix waiting for logexporter log fetching processes
Fix bug found by shellcheck in logexporter log fetching
where last wait was not working properly.
Fix DumpClusterLogs hanging in 5k nodes clusters:
https://github.com/kubernetes/kubernetes/issues/85753

Change-Id: Id02bf9048b19e790940c7eac6d45d7fa7a3dfb2b
2019-12-04 18:13:09 +01:00
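The failure mode named in the commit above, a final `wait` that does not cover every background fetch, can be illustrated with a minimal bash sketch. This is not the code from `log-dump.sh`: the `wait_for_all` helper and the stand-in background jobs are hypothetical. The sketch records each background PID and waits on every one explicitly, counting failures.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from log-dump.sh.
# wait_for_all PID...: wait on every PID and return the number that failed.
wait_for_all() {
  local failures=0
  local pid
  for pid in "$@"; do
    # 'wait PID' returns the exit status of that specific background job.
    if ! wait "${pid}"; then
      failures=$((failures + 1))
    fi
  done
  return "${failures}"
}

pids=()
# Stand-ins for per-node log fetching processes.
(exit 0) & pids+=("$!")
(exit 3) & pids+=("$!")
(exit 0) & pids+=("$!")

failed=0
wait_for_all "${pids[@]}" || failed=$?
echo "fetchers failed: ${failed}"
```

Waiting on each recorded PID, rather than relying on a single trailing `wait`, also makes each job's exit status available to the caller.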
Jacek Kaniuk
7eb6182a63 Revert "Fix shellcheck failure in log-dump/log-dump.sh"
This reverts commit e99a325d4e.
2019-12-03 18:17:54 +01:00
toyoda
e99a325d4e Fix shellcheck failure in log-dump/log-dump.sh 2019-11-12 16:51:09 +09:00
Yang Lu
c4aed0d485 Dump GKE windows test logs via diagnostics tool 2019-10-30 16:53:21 -07:00
Janek Łukaszewicz
72683a0252 log-dump: make logging clearer 2019-09-09 13:08:24 +02:00
Angela Li
fa90cb9e3d Avoid truncating long log messages 2019-07-11 10:50:11 -07:00
Angela Li
c0c29586a9 Add EntryType 2019-07-10 14:09:44 -07:00
Angela Li
a97d544475 Changed to use select-object to filter the log properties 2019-07-10 10:25:38 -07:00
Angela Li
ed43a6c039 Add timestamp to the docker test logs 2019-07-09 17:31:24 -07:00
Haosdent Huang
7ce6e71891 Fix typos. 2019-06-11 01:52:14 +08:00
SataQiu
bc279da872 fix some shellcheck failures of cluster/*.sh 2019-04-04 23:20:52 +08:00
Peter Hornyack
0bb25290c8 Update log-dump.sh for Windows nodes.
Tested:
```
$ PROJECT=${CLOUDSDK_CORE_PROJECT} KUBERNETES_SKIP_CONFIRM=y NUM_NODES=2 \
  NUM_WINDOWS_NODES=2 KUBE_GCE_ENABLE_IP_ALIASES=true go run \
  ./hack/e2e.go -- --up
$ cluster/log-dump/log-dump.sh
$ ls _artifacts
```

And with: NUM_NODES=2 NUM_WINDOWS_NODES=0; NUM_NODES=0 NUM_WINDOWS_NODES=2
2019-02-26 12:10:19 -08:00
Krzysztof Siedlecki
bc42602024 adding handling for use_custom_instance_list in dump_nodes_with_logexporter 2019-02-08 14:02:06 +01:00
Matt Matejczyk
5e6171790b Propagate dump_systemd_journal to logexporter job.
Log exporter changes have been made in
https://github.com/kubernetes/test-infra/pull/11121 and new version has
been pushed in https://github.com/kubernetes/test-infra/pull/11149
2019-02-06 15:49:29 +01:00
Matt Matejczyk
35543f8989 Allow dumping full systemd journal in log-dump.sh.
The feature is gated behind a newly introduced 'dump-systemd-journal' flag.
We want to dump the full systemd journal in our scalability performance tests.
2019-02-03 21:28:37 +01:00
Maciej Borsz
2aee491bf8 Fix detect_node_failures for gke 2018-12-19 08:14:22 +01:00
Maciej Borsz
325511d0ab Check if INSTANCE_GROUPS is empty in detect_node_failures. 2018-12-18 11:59:11 +01:00
Maciej Borsz
8e879db938 Revert "Revert "Check for hostError and automaticRestart when test finishes.""
This reverts commit 047aa25484.
2018-12-18 11:57:03 +01:00
Maciej Borsz
047aa25484 Revert "Check for hostError and automaticRestart when test finishes." 2018-11-30 17:55:27 +01:00
Maciej Borsz
0514aa17a6 Check for hostError and automaticRestart when test finishes. 2018-11-27 15:13:56 +01:00
Katharine Berry
3578696846 DRY 2018-09-06 16:54:13 -07:00
Katharine Berry
ed0f3f5d3c Don't bother dumping coverage info if it won't exist. 2018-09-06 16:24:32 -07:00
Katharine Berry
e17499c8e6 Include coverage information when dumping logs. 2018-09-06 16:24:32 -07:00
Kubernetes Submit Queue
d67a03183a Merge pull request #67687 from Lion-Wei/remote-reschrduler
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

remove rescheduler since scheduling DS pods by default scheduler is moving to beta

**What this PR does / why we need it**:

remove rescheduler since scheduling DS pods by default scheduler is moving to beta

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64725

**Special notes for your reviewer**:

**Release note**:
```release-note
Remove rescheduler since scheduling DS pods by default scheduler is moving to beta.
```
2018-08-23 12:32:17 -07:00
liangwei
5ea138f4e9 remove rescheduler 2018-08-22 11:49:14 +08:00
Maciej Borsz
598be75757 Store logs from 'logexporter' to allow debugging it. 2018-08-14 15:43:32 +02:00
wojtekt
0316faba9d Fix dumping logs with logexporter 2018-07-02 15:24:25 +02:00
Kubernetes Submit Queue
624dec20c0 Merge pull request #65139 from wojtek-t/fix_logexporter
Automatic merge from submit-queue (batch tested with PRs 65123, 65176, 65139, 65084, 65056). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Periodically fetch logexported nodes instead of sleeping
2018-06-21 16:56:13 -07:00
wojtekt
43d217f904 Periodically fetch logexported nodes instead of sleeping 2018-06-18 14:29:14 +02:00
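As an illustration of polling for completion instead of sleeping for a fixed time, here is a minimal bash sketch. The `poll_until` helper and the `ready_after_three` check are hypothetical and only stand in for whatever readiness check the script actually performs.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from log-dump.sh.
# poll_until TIMEOUT_SECONDS INTERVAL_SECONDS CMD...:
# re-run CMD until it succeeds or the timeout expires.
poll_until() {
  local timeout="$1"
  local interval="$2"
  shift 2
  # SECONDS is a bash builtin counting seconds since the shell started.
  local deadline=$((SECONDS + timeout))
  while ! "$@"; do
    if (( SECONDS >= deadline )); then
      return 1
    fi
    sleep "${interval}"
  done
}

# Stand-in check that succeeds on the third attempt.
attempts=0
ready_after_three() {
  attempts=$((attempts + 1))
  (( attempts >= 3 ))
}

if poll_until 5 0 ready_after_three; then
  echo "ready after ${attempts} attempts"
else
  echo "timed out after ${attempts} attempts"
fi
```

Compared with one long `sleep`, the loop returns as soon as the check passes and still gives up at a bounded deadline.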
Shyam Jeedigunta
87225c0b9a Increase logexporter timeout and add debug logs 2018-06-12 16:30:04 +02:00
RaviSantosh Gudimetla
872addf9e3 Revert "Remove rescheduler and corresponding tests from master" 2018-05-31 22:18:49 -04:00
ravisantoshgudimetla
aeccffc339 Phase out rescheduler in favor of priority and preemption 2018-05-29 19:52:06 -04:00
Kubernetes Submit Queue
7b8bb6e7d3 Merge pull request #63357 from Random-Liu/install-and-use-crictl
Automatic merge from submit-queue (batch tested with PRs 63167, 63357). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Install and use crictl in gce kube-up.sh

Download and use crictl in gce kube-up.sh.

This PR:
1. Downloads crictl `v1.0.0-beta.0` onto the node, which supports CRI v1alpha2. We'll upgrade it to `v1.0.0-beta.1` soon after the release is cut.
2. Change `kube-docker-monitor` to `kube-container-runtime-monitor`, and let it use `crictl` to do health monitoring.
3. Change `e2e-image-puller` to use `crictl`. Because of https://github.com/kubernetes/kubernetes/issues/63355, it doesn't work now. But in `crictl v1.0.0-beta.1`, we are going to statically link it, and the `e2e-image-puller` should work again.
4. Use `systemctl kill --kill-who=main` instead of `pkill`, for two reasons:
  a. `pkill docker` sends `SIGTERM` to all matching processes, including `dockerd`, `docker-containerd`, and `docker-containerd-shim`. This is not a problem for Docker 17.03 CE, because `containerd-shim` in containerd 0.2.x doesn't exit on SIGTERM (see [code](https://github.com/containerd/containerd/blob/v0.2.x/containerd-shim/main.go#L123)). However, `containerd-shim` in containerd 1.0+ does exit on SIGTERM (see [code](https://github.com/containerd/containerd/blob/master/cmd/containerd-shim/main_unix.go#L200)). This means that `pkill docker` and `pkill containerd` will kill all shim processes for Docker 17.11+ and containerd 1.0+.
  b. We could use `pkill -x` instead. However, the docker systemd service name is `docker`, while the daemon process name is `dockerd`, so we would have to introduce another environment variable to specify the daemon process name. Given that, it seems easier to use `systemctl kill`, which only requires the systemd service name. `systemctl kill --kill-who=main` makes sure only the main process receives SIGTERM.

Signed-off-by: Lantao Liu <lantaol@google.com>

/cc @filbranden @yujuhong @feiskyer @mrunalp @kubernetes/sig-node-pr-reviews @kubernetes/sig-cluster-lifecycle-pr-reviews 

**Release note**:

```release-note
Kubernetes clusters on GCE now have crictl installed. Users can use it to help debug their nodes. The crictl documentation can be found at https://github.com/kubernetes-incubator/cri-tools/blob/master/docs/crictl.md.
```
2018-05-15 21:18:12 -07:00
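The `systemctl kill --kill-who=main` approach described in the PR above can be sketched as a small bash wrapper. The `kill_main_process` function and the `DRY_RUN` variable are hypothetical and are not part of the kube-up scripts; dry-run mode only prints the command so the sketch can be tried on a machine without systemd.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper, not taken from the kube-up scripts.
# kill_main_process SERVICE: send the default signal (SIGTERM) only to the
# main process of a systemd unit, leaving its other processes untouched.
kill_main_process() {
  local service="$1"
  local cmd=(systemctl kill --kill-who=main "${service}")
  if [[ "${DRY_RUN:-0}" == "1" ]]; then
    # Print the command instead of running it.
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

DRY_RUN=1 kill_main_process docker
```

Because the unit name is the only argument, the caller does not need to know that the `docker` service runs a process named `dockerd`.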
Lantao Liu
884e08e33c Collect logs for health monitor services.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-05-03 17:18:00 -07:00
Matthias Bertschy
9b15af19b2 Update all scripts to use /usr/bin/env bash in shebang 2018-04-19 13:20:13 +02:00
Kubernetes Submit Queue
ebd3d68039 Merge pull request #55831 from Random-Liu/rename-log-dump-env
Automatic merge from submit-queue (batch tested with PRs 55392, 55491, 51914, 55831, 55836). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Rename log-dump env to `LOG_DUMP_SYSTEMD_SERVICES`.

For https://github.com/kubernetes/features/issues/286.

Rename `SYSTEMD_SERVICES` to `LOG_DUMP_SYSTEMD_SERVICES`. test-infra disables log dump in our e2e framework, and uses a different log dump logic https://github.com/kubernetes/test-infra/blob/master/kubetest/e2e.go#L480-L497. So the flags we added in https://github.com/kubernetes/kubernetes/pull/55288 will not work in test-infra.

Fortunately, test-infra uses the same script, `cluster/log-dump/log-dump.sh`, so we can still configure systemd services by setting the environment variable globally.

The original environment variable name is too general to set globally, so change it to a more specific name.

**Release note**:

```release-note
none
```
2017-11-17 00:18:25 -08:00
Lantao Liu
0085e2208d Rename log-dump env to LOG_DUMP_SYSTEMD_SERVICES. 2017-11-16 00:41:27 +00:00
Marcin Owsiany
310ab8c3c4 Do not crash on empty NODE_NAMES array. 2017-11-14 14:43:30 +01:00
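A crash on an empty array is a common bash pitfall: on bash versions older than 4.4, expanding an empty array under `set -o nounset` is reported as an unbound variable. The sketch below shows one widely used guard. It is not the actual change from this commit, and the `dump_nodes` function is a hypothetical stand-in.

```shell
#!/usr/bin/env bash
set -o nounset

# dump_nodes NAME...: hypothetical stand-in that prints one line per node.
dump_nodes() {
  # "$#" is always defined, so this check is safe under nounset.
  if [[ "$#" -eq 0 ]]; then
    echo "no nodes to dump"
    return 0
  fi
  local node
  for node in "$@"; do
    echo "dumping logs from ${node}"
  done
}

NODE_NAMES=()
# ${arr[@]+"${arr[@]}"} expands to nothing when the array is empty,
# instead of tripping nounset on older bash versions.
dump_nodes ${NODE_NAMES[@]+"${NODE_NAMES[@]}"}

NODE_NAMES=("node-1" "node-2")
dump_nodes ${NODE_NAMES[@]+"${NODE_NAMES[@]}"}
```

The guarded expansion keeps each element quoted, so node names containing spaces would still be passed as single arguments.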
Lantao Liu
32c4295bcf Support collecting log for alternative container runtime in e2e test. 2017-11-10 18:46:48 +00:00
Davanum Srinivas
9a217217c1 Fix log collection for kubeadm-gce tests
Separate out the kubernetes-anywhere provider under cluster/ but
delegate all the functionality to the "gce" one, since the code
would be the same. The only difference is the node name: the
NODE_INSTANCE_PREFIX will be different, so account for that.
2017-10-26 07:57:42 -04:00
Jordan Liggitt
d7699028f6 Include audit log in master log capture 2017-09-24 19:59:53 -04:00
Shyam Jeedigunta
6ae0eb8806 Fix bug with gke in logdump 2017-09-13 14:03:03 +02:00
Shyam Jeedigunta
05fcefc0df Make log-dump use 'gcloud ssh' for GKE also 2017-09-13 00:14:57 +02:00
Shyam Jeedigunta
c483c13aee Correct logdump logic for kubemark master 2017-09-04 12:59:36 +02:00
Shyam Jeedigunta
a31703631f Make logdump work for GKE with 'use_custom_instance_list' defined 2017-09-02 00:29:16 +02:00
Shyam Jeedigunta
aac1837218 Make logdump for kubemark logs independent of KUBERNETES_PROVIDER 2017-09-01 23:56:00 +02:00
Kubernetes Submit Queue
f2335d33d6 Merge pull request #50713 from MrHohn/dump-master-log-fix
Automatic merge from submit-queue (batch tested with PRs 50713, 47660, 51198, 51159, 51195)

Dump installation and configuration logs for master

**What this PR does / why we need it**:
We are dumping out empty configuration and installation logs on master, see `kube-node-configuration.log` and `kube-node-installation.log` on http://gcsweb.k8s.io/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce/12818/artifacts/bootstrap-e2e-master/.

I guess it is just because [we name the services on master differently](https://github.com/kubernetes/kubernetes/blob/v1.7.3/cluster/gce/gci/master.yaml#L4-L40)?

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #NONE

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-08-24 11:17:01 -07:00
Shyam Jeedigunta
d2b6705dc8 Add some debug statements to logdump script 2017-08-23 11:51:58 +02:00
Zihong Zheng
7654e6a9d6 Dump installation and configuration logs for master 2017-08-15 13:50:02 -07:00