CSI volume plugin creates number of files/directories when processing block
volumes. These files must be cleaned when the plugin is done with the
volume, i.e. at the end on TearDownDevice().
This combines container names into a single list because separating them
into a long, variable length string isn't particularly useful in the
context of an streaming error message.
- Add handlers for service account issuer metadata.
- Add option to manually override JWKS URI.
- Add unit and integration tests.
- Add a separate ServiceAccountIssuerDiscovery feature gate.
Additional notes:
- If not explicitly overridden, the JWKS URI will be based on
the API server's external address and port.
- The metadata server is configured with the validating key set rather
than the signing key set. This allows for key rotation because tokens
can still be validated by the keys exposed in the JWKs URL, even if the
signing key has been rotated (note this may still be a short window if
tokens have short lifetimes).
- The trust model of OIDC discovery requires that the relying party
fetch the issuer metadata via HTTPS; the trust of the issuer metadata
comes from the server presenting a TLS certificate with a trust chain
back to the from the relying party's root(s) of trust. For tests, we use
a local issuer (https://kubernetes.default.svc) for the certificate
so that workloads within the cluster can authenticate it when fetching
OIDC metadata. An API server cannot validly claim https://kubernetes.io,
but within the cluster, it is the authority for kubernetes.default.svc,
according to the in-cluster config.
Co-authored-by: Michael Taufen <mtaufen@google.com>
This allows the proxier to cache local addresses instead of fetching all
local addresses every time in IsLocalIP.
Signed-off-by: Andrew Sy Kim <kiman@vmware.com>
This avoids fetching all local network interfaces everytime we sync an
external IP. For clusters with many external IPs this gets really
expensive. This change caches all local addresses once per sync.
Signed-off-by: Andrew Sy Kim <kiman@vmware.com>
This avoids fetching all local network interfaces everytime we sync an
external IP. For clusters with many external IPs this gets really
expensive. This change caches all local addresses once per sync.
Signed-off-by: Andrew Sy Kim <kiman@vmware.com>
kube-proxy, if is configured with an IP family, filters out the
incorrect IP version of the services.
This commit fix a bug caused by not filtering out the IPs in the
LoadBalancer Status Ingress field.
During EndpointSlice reconcilation, EndpointSliceTracker is supposed to
track expected EndpointSlice resource versions so that external changes
to them can be detected. But it actually tracked the stale resource
version and resulted in every Service was handled twice as it always
received an EndpointSlice update with a different resource version but
was actually created/updated by itself during the first processing.
This change will not work on its own. Higher level code needs to make
sure and call Allocate() before AddContainer is called. This is already
being done in cases when the TopologyManager feature gate is enabled (in
the PodAdmitHandler of the TopologyManager). However, we need to make
sure we add proper logic to call it in cases when the TopologyManager
feature gate is disabled.
This change will not work on its own. Higher level code needs to make
sure and call Allocate() before AddContainer is called. This is already
being done in cases when the TopologyManager feature gate is enabled (in
the PodAdmitHandler of the TopologyManager). However, we need to make
sure we add proper logic to call it in cases when the TopologyManager
feature gate is disabled.
Having this interface allows us to perform a tight loop of:
for each container {
containerHints = {}
for each provider {
containerHints[provider] = provider.GatherHints(container)
}
containerHints.MergeAndPublish()
for each provider {
provider.Allocate(container)
}
}
With this in place we can now be sure that the hints gathered in one
iteration of the loop always consider the allocations made in the
previous.
Instead of having a single call for Allocate(), we now split this into two
functions Allocate() and UpdatePluginResources().
The semantics split across them:
// Allocate configures and assigns devices to a pod. From the requested
// device resources, Allocate will communicate with the owning device
// plugin to allow setup procedures to take place, and for the device
// plugin to provide runtime settings to use the device (environment
// variables, mount points and device files).
Allocate(pod *v1.Pod) error
// UpdatePluginResources updates node resources based on devices already
// allocated to pods. The node object is provided for the device manager to
// update the node capacity to reflect the currently available devices.
UpdatePluginResources(
node *schedulernodeinfo.NodeInfo,
attrs *lifecycle.PodAdmitAttributes) error
As we move to a model in which the TopologyManager is able to ensure
aligned allocations from the CPUManager, devicemanger, and any
other TopologManager HintProviders in the same synchronous loop, we will
need to be able to call Allocate() independently from an
UpdatePluginResources(). This commit makes that possible.