To enable rate limiting, needed for GA graduation,
we need to pass more parameters to the already crowded
`ListenAndServePodresources` function.
To tidy up a bit, pack the parameters in a helper struct,
with no intended changes in behavior.
Signed-off-by: Francesco Romani <fromani@redhat.com>
* Update pod_container_manager_linux.go
This is a simple optimization to reduce repeated invoking of the GetPodContainerName function.
* Update pod_container_manager_linux.go
将podContainerName, _ := m.GetPodContainerName(pod)更靠近使用podcontainerName变量的位置
Now that the endpoint update fields have names that make it clear that
they only contain UDP objects, it's obvious that the "protocol == UDP"
checks in the iptables and ipvs proxiers were no-ops, so remove them.
Rather than calling fp.deleteEndpointConnection() directly, set up the
proxy to have syncProxyRules() call it, so that we are testing it in
the way that it actually gets called.
Squash the IPv4 and IPv6 unit tests together so we don't need to
duplicate all that code. Fix a tiny bug in NewFakeProxier() found
while doing this...
The APIs talked about "stale services" and "stale endpoints", but the
thing that is actually "stale" is the conntrack entries, not the
services/endpoints. Fix the names to indicate what they actual keep
track of.
Also, all three fields (2 in the endpoints update object and 1 in the
service update object) are currently UDP-specific, but only the
service one made that clear. Fix that too.
This commit did not actually work; in between when it was first
written and tested, and when it merged, the code in
pkg/proxy/endpoints.go was changed to only add UDP endpoints to the
"stale endpoints"/"stale services" lists, and so checking for "either
UDP or SCTP" rather than just UDP when processing those lists had no
effect.
This reverts most of commit aa8521df66
(but leaves the changes related to
ipvs.IsRsGracefulTerminationNeeded() since that actually did have the
effect it meant to have).
Provide an administrator a streaming view of journal logs on Linux
systems using journalctl, and event logs on Windows systems using the
Get-WinEvent PowerShell cmdlet without them having to implement a client
side reader.
Only available to cluster admins.
The implementation for journald on Linux was originally done by Clayton
Coleman.
Introduce a heuristics approach to query logs
The logs query for node objects will follow a heuristics approach
when asked to query for logs from a service. If asked to get the
logs from a service foobar, it will first check if foobar logs to the
native OS service log provider. If unable to get logs from these, it
will attempt to get logs from /var/foobar, /var/log/foobar.log or
/var/log/foobar/foobar.log in that order.
The logs sub-command can also directly serve a file if the query looks
like a file.
Co-authored-by: Clayton Coleman <ccoleman@redhat.com>
Co-authored-by: Christian Glombek <cglombek@redhat.com>
Added EnableNodeLogQuery field to kubelet/apis/config/types.go and
staging/src/k8s.io/kubelet/config/v1beta1/types.go, then executed.
`hack/update-codegen.sh`.
This new field will default to off and will need to be explicitly
enabled in addition to the NodeLogQuery gate to use the feature.
Kubelet in standalone mode won't have kubeclient, it cannot get node.status
and get devices from it. Such a kubelet cannot mount attachable volumes
anyway.
The generated ResourceClaim name and the names of the ResourceClaimTemplate and
ResourceClaim referenced by a pod must be valid according to the resource API,
otherwise the pod cannot start.
Checking this was removed from the original implementation out of concerns
about validating fields in core against limitations imposed by a separate,
alpha API. But as this was pointed out again in
https://github.com/kubernetes/kubernetes/pull/116254#discussion_r1134010324
it gets added back.
The same strings that worked before still work now. In particular, the
constraints for a spec.resourceClaim.name are still the same (DNS label).
The name "PodScheduling" was unusual because in contrast to most other names,
it was impossible to put an article in front of it. Now PodSchedulingContext is
used instead.
Annotation
As part of this change, kube-proxy accepts any value for either
annotation that is not "disabled".
Change-Id: Idfc26eb4cc97ff062649dc52ed29823a64fc59a4
SyncKnownPods began triggering UpdatePod() for pods that have been
orphaned by desired config to ensure pods run to termination. This
test reads a mutex protected value while pod workers are running
in the background and as a consequence triggers a data race.
Wait for the workers to stabilize before reading the value. Other
tests validate that the correct sync events are triggered (see
kubelet_pods_test.go#TestKubelet_HandlePodCleanups for full
verification of this behavior).
It is slightly concerning that I was unable to recreate the race
locally even under stress testing, but I cannot identify why.
To that end, we need to add one kubelet getter listPodsFromDisk(). Other
than that, it is a pretty trivial move.
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Latest changes to KEP-127 removed that phase, so let's stop reserving
those IDs for that.
While we are there, we replace 0 for 0*65536 as before we had a bug that
we were not multiplying the index, to avoid bugs in the future.
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Now KEP-127 relies on idmap mounts to do the ID translation and we won't
do any chowns in the kubelet.
This patch just removes the usage of GetHostIDsForPod() in
operationexecutor to do the chown, and also removes the
GetHostIDsForPod() method from the kubelet volume interface.
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Right now, the v1alpha1 API only passes enough information for one plugin to
process a claim, but the v1alpha2 API will allow for multiple plugins to
process a claim. This commit prepares the code for this upcoming change.
Signed-off-by: Kevin Klues <kklues@nvidia.com>
* add timeouts for communication with dra plugin
* move timeout constant to k8s.io/kubernetes/pkg/kubelet/cm/util
* move settings of timeout to pkg/kubelet/plugin/dra/plugin/client.go
* remove timeout constant
Implement DOS prevention wiring a global rate limit for podresources
API. The goal here is not to introduce a general ratelimiting solution
for the kubelet (we need more research and discussion to get there),
but rather to prevent misuse of the API.
Known limitations:
- the rate limits value (QPS, BurstTokens) are hardcoded to
"high enough" values.
Enabling user-configuration would require more discussion
and sweeping changes to the other kubelet endpoints, so it
is postponed for now.
- the rate limiting is global. Malicious clients can starve other
clients consuming the QPS quota.
Add e2e test to exercise the flow, because the wiring itself
is mostly boilerplate and API adaptation.
The pod_logs subsystem was inadvertently made redundant in the following
kube-apiserver metrics:
- kube_apiserver_pod_logs_pods_logs_backend_tls_failure_total
- kube_apiserver_pod_logs_pods_logs_insecure_backend_total
To safely rename them, it is required to deprecate them in 1.27 whilst
introducing the new metrics replacing them.
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
* scheduler(NodeResourcesFit): calculatePodResourceRequest in PreScore phase
* scheduler(NodeResourcesFit and NodeResourcesBalancedAllocation): calculatePodResourceRequest in PreScore phase
* modify the comments and tests.
* revert the tests.
* don't need consider nodes.
* use list instead of map.
* add comment for podRequests.
* avoid using negative wording in variable names.
DesiredStateOfWorld must remember both
- the effective SELinux label to apply as a mount option (non-empty for
RWOP volumes, empty otherwise)
- and the label that _would_ be used if the mount option would be used by
all access modes.
Mismatch warning metrics must be generated from the second label.
The checkpointing mechanism will repopulate DRA Manager in-memory cache on kubelet restart.
This will ensure that the information needed by the PodResources API is available across
a kubelet restart.
The ClaimInfoState struct represent the DRA Manager in-memory cache state in checkpoint.
It is embedd in the ClaimInfo which also include the annotation field. The separation between
the in-memory cache and the cache state in the checkpoint is so we won't be tied to the in-memory
cache struct which may change in the future. In the ClaimInfoState we save the minimal required fields
to restore the in-memory cache.
Signed-off-by: Moshe Levi <moshele@nvidia.com>
To enable rate limiting, needed for GA graduation,
we need to pass more parameters to the already crowded
`ListenAndServePodresources` function.
To tidy up a bit, pack the parameters in a helper struct,
with no intended changes in behavior.
Signed-off-by: Francesco Romani <fromani@redhat.com>