Register the kubelet on the master node with an apiserver. This option is
separated from the apiserver running locally on the master node so that it can be optionally enabled or disabled as needed. Also, fix the healthchecking configuration for the master components, which was previously only working by coincidence: If a kubelet doesn't register with a master, it never bothers to figure out what its local address is. In which case it ends up constructing a URL like http://:8080/healthz for the http probe. This happens to work on the master because all of the pods are using host networking and explicitly binding to 127.0.0.1. Once the kubelet is registered with the master and it determines the local node address, it tries to healthcheck on an address where the pod isn't listening and the kubelet periodically restarts each master component when the liveness probe fails.
This commit is contained in:
@@ -27,6 +27,10 @@ source "${KUBE_ROOT}/cluster/kube-util.sh"
|
||||
MINIONS_FILE=/tmp/minions-$$
|
||||
trap 'rm -rf "${MINIONS_FILE}"' EXIT
|
||||
|
||||
EXPECTED_NUM_NODES="${NUM_MINIONS}"
|
||||
if [[ "${REGISTER_MASTER_KUBELET:-}" == "true" ]]; then
|
||||
EXPECTED_NUM_NODES=$((EXPECTED_NUM_NODES+1))
|
||||
fi
|
||||
# Make several attempts to deal with slow cluster birth.
|
||||
attempt=0
|
||||
while true; do
|
||||
@@ -38,20 +42,24 @@ while true; do
|
||||
# Echo the output, strip the first line, then gather 2 counts:
|
||||
# - Total number of nodes.
|
||||
# - Number of "ready" nodes.
|
||||
"${KUBE_ROOT}/cluster/kubectl.sh" get nodes > "${MINIONS_FILE}" || true
|
||||
#
|
||||
# Suppress errors from kubectl output because during cluster bootstrapping
|
||||
# for clusters where the master node is registered, the apiserver will become
|
||||
# available and then get restarted as the kubelet configures the docker bridge.
|
||||
"${KUBE_ROOT}/cluster/kubectl.sh" get nodes > "${MINIONS_FILE}" 2> /dev/null || true
|
||||
found=$(cat "${MINIONS_FILE}" | sed '1d' | grep -c .) || true
|
||||
ready=$(cat "${MINIONS_FILE}" | sed '1d' | awk '{print $NF}' | grep -c '^Ready') || true
|
||||
|
||||
if (( ${found} == "${NUM_MINIONS}" )) && (( ${ready} == "${NUM_MINIONS}")); then
|
||||
if (( "${found}" == "${EXPECTED_NUM_NODES}" )) && (( "${ready}" == "${EXPECTED_NUM_NODES}")); then
|
||||
break
|
||||
else
|
||||
# Set the timeout to ~10minutes (40 x 15 second) to avoid timeouts for 100-node clusters.
|
||||
if (( attempt > 40 )); then
|
||||
echo -e "${color_red}Detected ${ready} ready nodes, found ${found} nodes out of expected ${NUM_MINIONS}. Your cluster may not be working.${color_norm}"
|
||||
echo -e "${color_red}Detected ${ready} ready nodes, found ${found} nodes out of expected ${EXPECTED_NUM_NODES}. Your cluster may not be working.${color_norm}"
|
||||
cat -n "${MINIONS_FILE}"
|
||||
exit 2
|
||||
else
|
||||
echo -e "${color_yellow}Waiting for ${NUM_MINIONS} ready nodes. ${ready} ready nodes, ${found} registered. Retrying.${color_norm}"
|
||||
echo -e "${color_yellow}Waiting for ${EXPECTED_NUM_NODES} ready nodes. ${ready} ready nodes, ${found} registered. Retrying.${color_norm}"
|
||||
fi
|
||||
attempt=$((attempt+1))
|
||||
sleep 15
|
||||
|
Reference in New Issue
Block a user