Automatic merge from submit-queue (batch tested with PRs 41364, 40317, 41326, 41783, 41782)
Add ability to enable cache mutation detector in GCE
Add the ability to enable the cache mutation detector in GCE. The current default behavior (disabled) is retained.
When paired with https://github.com/kubernetes/test-infra/pull/1901, we'll be able to detect shared informer cache mutations in gce e2e PR jobs.
Automatic merge from submit-queue
Add standalone npd on GCI.
This PR added standalone NPD in GCE GCI cluster. I already verified the PR, and it should work.
/cc @dchen1107 @fabioy @andyxning @kubernetes/sig-node-misc
Automatic merge from submit-queue
Fix the output of health-mointor.sh
The script show prints the errors/response of the health check, but not
show the progress of `curl`.
Allow cache mutation detector enablement by PRs in an attempt to find
mutations before they're merged in to the code base. It's just for the
apiserver and controller-manager for now. If/when the other components
start using a SharedInformerFactory, we should set them up just like
this as well.
Automatic merge from submit-queue
Added configurable etcd initial-cluster-state to kube-up script.
Added configurable etcd initial-cluster-state to kube-up script. This
allows creation of multi-master cluster from scratch. This is a
cherry-pick of #41320 from 1.5 branch.
```release-note
Added configurable etcd initial-cluster-state to kube-up script.
```
Automatic merge from submit-queue (batch tested with PRs 40297, 41285, 41211, 41243, 39735)
cluster/gce: Add env var to enable apiserver basic audit log.
For now, this is focused on a fixed set of flags that makes the audit
log show up under /var/log/kube-apiserver-audit.log and behave similarly
to /var/log/kube-apiserver.log. Allowing other customization would
require significantly more complex changes.
Audit log rotation is handled the same as for `kube-apiserver.log`.
**What this PR does / why we need it**:
Add a knob to enable [basic audit logging](https://kubernetes.io/docs/admin/audit/) in GCE.
**Which issue this PR fixes**:
**Special notes for your reviewer**:
We would like to cherrypick/port this to release-1.5 also.
**Release note**:
```release-note
The kube-apiserver [basic audit log](https://kubernetes.io/docs/admin/audit/) can be enabled in GCE by exporting the environment variable `ENABLE_APISERVER_BASIC_AUDIT=true` before running `cluster/kube-up.sh`. This will log to `/var/log/kube-apiserver-audit.log` and use the same `logrotate` settings as `/var/log/kube-apiserver.log`.
```
For now, this is focused on a fixed set of flags that makes the audit
log show up under /var/log/kube-apiserver-audit.log and behave similarly
to /var/log/kube-apiserver.log. Allowing other customization would
require significantly more complex changes.
Audit log rotation is handled externally by the wildcard /var/log/*.log
already configured in configure-helper.sh.
Added configurable etcd initial-cluster-state to kube-up script. This
allows creation of multi-master cluster from scratch. This is a
cherry-pick of #41320 from 1.5 branch.
Automatic merge from submit-queue (batch tested with PRs 41121, 40048, 40502, 41136, 40759)
Remove deprecated kubelet flags that look safe to remove
Removes:
```
--config
--auth-path
--resource-container
--system-container
```
which have all been marked deprecated since at least 1.4 and look safe to remove.
```release-note
The deprecated flags --config, --auth-path, --resource-container, and --system-container were removed.
```
Automatic merge from submit-queue (batch tested with PRs 38739, 40480, 40495, 40172, 40393)
Use existing ABAC policy file when upgrading GCE cluster
When upgrading, continue loading an existing ABAC policy file so that existing system components continue working as-is
```
When upgrading an existing 1.5 GCE cluster using `cluster/gce/upgrade.sh`, an existing ABAC policy file located at /etc/srv/kubernetes/abac-authz-policy.jsonl (the default location in 1.5) will enable the ABAC authorizer in addition to the RBAC authorizer. To switch an upgraded 1.5 cluster completely to RBAC, ensure the control plane components and your superuser have been granted sufficient RBAC permissions, move the legacy ABAC policy file to a backup location, and restart the apiserver.
```
Automatic merge from submit-queue (batch tested with PRs 39260, 40216, 40213, 40325, 40333)
Fixed propagation of kube master certs during master replication.
Fixed propagation of kube-master-certs during master replication.
Automatic merge from submit-queue
Enable lazy initialization of ext3/ext4 filesystems
**What this PR does / why we need it**: It enables lazy inode table and journal initialization in ext3 and ext4.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#30752, fixes#30240
**Release note**:
```release-note
Enable lazy inode table and journal initialization for ext3 and ext4
```
**Special notes for your reviewer**:
This PR removes the extended options to mkfs.ext3/mkfs.ext4, so that the defaults (enabled) for lazy initialization are used.
These extended options come from a script that was historically located at */usr/share/google/safe_format_and_mount* and later ported to GO so this dependency to the script could be removed. After some search, I found the original script here: https://github.com/GoogleCloudPlatform/compute-image-packages/blob/legacy/google-startup-scripts/usr/share/google/safe_format_and_mount
Checking the history of this script, I found the commit [Disable lazy init of inode table and journal.](4d7346f7f5). This one introduces the extended flags with this description:
```
Now that discard with guaranteed zeroing is supported by PD,
initializing them is really fast and prevents perf from being affected
when the filesystem is first mounted.
```
The problem is, that this is not true for all cloud providers and all disk types, e.g. Azure and AWS. I only tested with magnetic disks on Azure and AWS, so maybe it's different for SSDs on these cloud providers. The result is that this performance optimization dramatically increases the time needed to format a disk in such cases.
When mkfs.ext4 is told to not lazily initialize the inode tables and the check for guaranteed zeroing on discard fails, it falls back to a very naive implementation that simply loops and writes zeroed buffers to the disk. Performance on this highly depends on free memory and also uses up all this free memory for write caching, reducing performance of everything else in the system.
As of https://github.com/kubernetes/kubernetes/issues/30752, there is also something inside kubelet that somehow degrades performance of all this. It's however not exactly known what it is but I'd assume it has something to do with cgroups throttling IO or memory.
I checked the kernel code for lazy inode table initialization. The nice thing is, that the kernel also does the guaranteed zeroing on discard check. If it is guaranteed, the kernel uses discard for the lazy initialization, which should finish in a just few seconds. If it is not guaranteed, it falls back to using *bio*s, which does not require the use of the write cache. The result is, that free memory is not required and not touched, thus performance is maxed and the system does not suffer.
As the original reason for disabling lazy init was a performance optimization and the kernel already does this optimization by default (and in a much better way), I'd suggest to completely remove these flags and rely on the kernel to do it in the best way.
Automatic merge from submit-queue (batch tested with PRs 39694, 39383, 39651, 39691, 39497)
Bump container-linux and gci timeout for docker health check
The command `docker ps` can take longer time to respond under heavy load or
when encountering some known issues. In these cases, the containers are running
fine, so aggressive health check could cause serious disruption. Bump the
timeout to 60s to be consistent with the debian-based containerVM.
This addresses #38588
The command `docker ps` can take longer time to respond under heavy load or
when encountering some known issues. In these cases, the containers are running
fine, so aggressive health check could cause serious disruption. Bump the
timeout to 60s to be consistent with the debian-based containerVM.
Automatic merge from submit-queue
Remove all MAINTAINER statements in the codebase as they are deprecated
**What this PR does / why we need it**:
ref: https://github.com/docker/docker/pull/25466
**Release note**:
```release-note
Remove all MAINTAINER statements in Dockerfiles in the codebase as they are deprecated by docker
```
@ixdy @thockin (who else should be notified?)
Automatic merge from submit-queue (batch tested with PRs 38294, 37009, 36778, 38130, 37835)
Only configure basic auth on gci if KUBE_USER and KUBE_PASSWORD are specified.
This should not change the existing flow when KUBE_USER/KUBE_PASSWORD are specified.
It makes not specifying those a valid option that means "don't turn on basic auth".
I only did it for cluster/gce/gci for now, but others should be somewhat similar.
Automatic merge from submit-queue
add a configuration for kubelet to register as a node with taints
and deprecate --register-schedulable
ref #28687#29178
cc @dchen1107 @davidopp @roberthbailey
Automatic merge from submit-queue (batch tested with PRs 37692, 37785, 37647, 37941, 37856)
Use unified gcp fluentd image for gci and cvm
Follow-up of https://github.com/kubernetes/kubernetes/pull/37681
Actually unify the pod specs for CVM and GCI, to simplify the configuration
CC @piosz