45875 Commits

Author SHA1 Message Date
Patrick Ohly
d2ff210c20 scheduler: add dynamic resource allocation plugin
The plugin handles the interaction with ResourceClaims that are referenced by a
Pod.
2022-11-11 21:58:03 +01:00
Patrick Ohly
0133df3929 kube-controller-manager: add ResourceClaim controller
The controller uses the exact same logic as the generic ephemeral inline volume
controller, just for inline ResourceClaimTemplate -> ResourceClaim.

In addition, it supports removal of pods from the ReservedFor field when those
pods are known to not need the claim anymore. At the moment, only this special
case is supported. Removal of arbitrary objects would imply granting full read
access to all types to determine whether a) an object is gone and b) if the
current incarnation is the one which is listed in ReservedFor. This may get
added later.
2022-11-10 20:23:50 +01:00
Patrick Ohly
b87530af4f kube-controller-manager: clone resource controller from volume/ephemeral 2022-11-10 20:23:50 +01:00
Patrick Ohly
8018ab7cd9 api: fully validate PotentialNodes and SuitableNodes
This is in response to review feedback. Checking for valid node names and the
set property catches programming mistakes in the components that have write
permission.
2022-11-10 20:23:50 +01:00
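
A minimal sketch of the kind of check commit 8018ab7cd9 describes: every entry must be a valid node name, and the list must behave as a set (no duplicates). It is written in Python for brevity, while the actual change is Go code. The function name `validate_node_names` and the error strings are hypothetical, and the sketch assumes node names follow the DNS-1123 subdomain rules.

```python
import re

# Assumption: node names are DNS-1123 subdomains (lowercase alphanumerics,
# '-' and '.', at most 253 characters).
_LABEL = r"[a-z0-9]([-a-z0-9]*[a-z0-9])?"
_SUBDOMAIN = re.compile(rf"{_LABEL}(\.{_LABEL})*")
_MAX_LEN = 253


def validate_node_names(names):
    """Return a list of error strings; an empty list means the input is valid."""
    errors = []
    seen = set()
    for i, name in enumerate(names):
        # Check the name itself.
        if len(name) > _MAX_LEN or not _SUBDOMAIN.fullmatch(name):
            errors.append(f"[{i}]: invalid node name {name!r}")
        # Check the set property: each name may appear only once.
        if name in seen:
            errors.append(f"[{i}]: duplicate node name {name!r}")
        seen.add(name)
    return errors
```

Reporting every problem with its index, rather than stopping at the first one, makes a programming mistake in a writing component easier to spot.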
Patrick Ohly
5c5e060fb8 api: implement printers for dynamic resource allocation
This is needed for "kubectl get". It depends on the generated swagger docs.
2022-11-10 20:22:47 +01:00
Patrick Ohly
9683c60c05 api: generated files 2022-11-10 20:22:42 +01:00
Patrick Ohly
5cca60f0b8 api: dynamic resource allocation API
This adds a new resource.k8s.io API group with v1alpha1 as version. It contains
four new types: resource.ResourceClaim, resource.ResourceClass, resource.ResourceClaimTemplate, and
resource.PodScheduling.
2022-11-10 20:08:24 +01:00
Patrick Ohly
7d11b422e3 api: add resource claims to core API
The resource.k8s.io/ClaimTemplate only gets referenced by name, therefore the
changes to the core API are limited.
2022-11-10 20:08:24 +01:00
Patrick Ohly
155d49813f kube features: add DynamicResourceAllocation 2022-11-10 20:08:24 +01:00
Kubernetes Prow Robot
d94261e904 Merge pull request #113186 from ttakahashi21/KEP-3294
Introduce APIs to support CrossNamespaceSourceProvisioning
2022-11-10 08:06:54 -08:00
Cici Huang
2973712486 Rename FG to ValidatingAdmissionPolicy 2022-11-10 03:37:35 +00:00
Cici Huang
40c21dafcd Rename admission cel package to validatingadmissionpolicy 2022-11-10 03:37:30 +00:00
Kubernetes Prow Robot
2c1b7f5759 Merge pull request #112618 from jingyuanliang/fastStatusUpdateOnce
kubelet: Keep trying fast status update at startup until node is ready
2022-11-09 13:30:53 -08:00
Takafumi Takahashi
cb12a2bc51 Generate code 2022-11-09 21:21:52 +00:00
Takafumi Takahashi
87c1ca88d4 Add API and validation for CrossNamespaceVolumeDataSource 2022-11-09 20:58:25 +00:00
Kubernetes Prow Robot
623376bc82 Merge pull request #113788 from PiotrProkop/fix-discovering-numa-distance
Fix discovering numa distance when node ids are not starting from 0 or their ids are not sequential
2022-11-09 12:22:53 -08:00
Kubernetes Prow Robot
ff19efdf9b Merge pull request #112744 from pwschuurman/statefulset-slice-impl
Add implementation of KEP-3335, StatefulSetSlice
2022-11-09 11:12:28 -08:00
Kubernetes Prow Robot
c84e920a48 Merge pull request #113786 from sanposhiho/revert-prefilter-skip
Revert "feature(scheduler): won't run Filter if PreFilter returned a Skip status"
2022-11-09 10:08:13 -08:00
PiotrProkop
540b5bd308 [topologymanager] rely on Cadvisor to calculate NUMA distance
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2022-11-09 17:52:14 +01:00
PiotrProkop
315f0dc6f1 Fix discovering numa distance when node ids are not starting from 0 or their ids are not sequential
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2022-11-09 17:52:08 +01:00
Kubernetes Prow Robot
e9ef6ee8b3 Merge pull request #113754 from logicalhan/kubelet-metrics
fix credential provider metric names
2022-11-09 08:31:00 -08:00
Kubernetes Prow Robot
7e0e0c8ec3 Merge pull request #113360 from mimowo/handling-pod-failures-beta-enable
Enable the "Retriable and non-retriable pod failures for jobs" feature into beta
2022-11-09 08:30:24 -08:00
Kubernetes Prow Robot
a7117b716b Merge pull request #112344 from zlabjp/fix-invalid-attach-limit
Fix incorrect "Invalid attach limit" error when maxAttachLimit is 0
2022-11-09 08:30:13 -08:00
Jingyuan Liang
9f5c5b82a9 kubelet: Keep trying fast status update at startup until node is ready 2022-11-09 15:55:20 +00:00
Jingyuan Liang
4a50fc4b8c kubelet: Refactor tryUpdateNodeStatus() into smaller functions 2022-11-09 15:52:04 +00:00
Kensei Nakada
f3868abfed Revert "feature(scheduler): won't run Filter if PreFilter returned a Skip status"
This reverts commit 786be73b4b.
2022-11-09 11:55:33 +00:00
Kubernetes Prow Robot
70263d55b2 Merge pull request #113501 from pacoxu/fix-startReflector
kubelet: fix nil pointer in startReflector for standalone mode
2022-11-09 03:50:12 -08:00
Kubernetes Prow Robot
1193a9abcb Merge pull request #113485 from MikeSpreitzer/apf-borrowing
Add borrowing between priority levels in APF
2022-11-09 01:40:12 -08:00
Michal Wozniak
b3e9d8ef4c Cleanup the default_preemption_test by indexing the potential victim pods 2022-11-09 10:26:08 +01:00
Michal Wozniak
c803892bd8 Enable the feature into beta 2022-11-09 09:02:40 +01:00
Mike Spreitzer
feb4227788 apiserver: finish implementation of borrowing in APF
Also make some design changes exposed in testing and review.

Do not remove the ambiguous old metric
`apiserver_flowcontrol_request_concurrency_limit` because reviewers
thought it is too early.  This creates a problem: that metric can not
keep both of its old meanings.  I chose the configured concurrency
limit.

Testing has revealed a design flaw, which concerns the initialization
of the seat demand state tracking.  The current design in the KEP is
as follows.

> Adjustment is also done on configuration change … For a newly
> introduced priority level, we set HighSeatDemand, AvgSeatDemand, and
> SmoothSeatDemand to NominalCL-LendableSD/2 and StDevSeatDemand to
> zero.

But this does not work out well at server startup.  As part of its
construction, the APF controller does a configuration change with zero
objects read, to initialize its request-handling state.  As always,
the two mandatory priority levels are implicitly added whenever they
are not read.  So this initial reconfig has one non-exempt priority
level, the mandatory one called catch-all --- and it gets its
SmoothSeatDemand initialized to the whole server concurrency limit.
From there it decays slowly, as per the regular design.  So for a
fairly long time, it appears to have a high demand and competes
strongly with the other priority levels.  Its Target is higher than
all the others, once they start to show up.  It properly gets a low
NominalCL once other levels show up, which actually makes it compete
harder for borrowing: it has an exceptionally high Target and a rather
low NominalCL.

I have considered the following fix.  The idea is that the designed
initialization is not appropriate before all the default objects are
read.  So the fix is to have a mode bit in the controller.  In the
initial state, those seat demand tracking variables are set to zero.
Once the config-producing controller detects that all the default
objects are pre-existing, it flips the mode bit.  In the later mode,
the seat demand tracking variables are initialized as originally
designed.

However, that still gives preferential treatment to the default
PriorityLevelConfiguration objects, over any that may be added later.

So I have made a universal and simpler fix: always initialize those
seat demand tracking variables to zero.  Even if a lot of load shows
up quickly, remember that adjustments are frequent (every 10 sec) and
the very next one will fully respond to that load.

Also: revise logging logic, to log at numerically lower V level when
there is a change.

Also: bug fix in float64close.

Also: separate imports in some file.

Co-authored-by: Han Kang <hankang@google.com>
2022-11-08 21:51:44 -08:00
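
The initialization problem described in commit feb4227788 can be illustrated with a small simulation. This is only a sketch in Python, not the controller's code. It assumes a smoother that jumps up to the current demand envelope immediately but decays toward it exponentially. The names `next_smooth` and `simulate` and the coefficient `a=0.9` are hypothetical choices for illustration.

```python
def next_smooth(smooth, envelope, a=0.9):
    """One periodic adjustment of a smoothed seat demand.

    Assumed behavior: rise to the envelope at once, decay toward it slowly.
    """
    return max(envelope, a * smooth + (1 - a) * envelope)


def simulate(initial, envelopes, a=0.9):
    """Apply successive adjustments, returning the smoothed value after each."""
    smooth = initial
    history = []
    for envelope in envelopes:
        smooth = next_smooth(smooth, envelope, a)
        history.append(smooth)
    return history


# Initialized high with no real demand: the value lingers for many periods.
high_start = simulate(100.0, [0.0, 0.0, 0.0])

# Initialized to zero with sudden load: the very next adjustment catches up.
zero_start = simulate(0.0, [50.0])
```

Under these assumptions, a level that starts at the whole concurrency limit keeps a large smoothed demand long after startup, while a level that starts at zero reaches the observed demand after a single adjustment.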
Kubernetes Prow Robot
e62cfabf93 Merge pull request #112050 from nilekhc/kms-hot-reload
Implements hot reload of the KMS `EncryptionConfiguration`
2022-11-08 17:24:12 -08:00
Peter Schuurman
c492a6d69b Undo unintentional documentation comment change 2022-11-08 16:53:48 -08:00
Paco Xu
1b71dc77f2 linux: fix kubelet start unit test 2022-11-09 07:17:05 +08:00
Kubernetes Prow Robot
b4040b3b86 Merge pull request #113609 from haircommander/sandbox-metrics
kubelet: add support for broadcasting metrics from CRI
2022-11-08 15:08:26 -08:00
Kubernetes Prow Robot
d619f60e0f Merge pull request #113442 from Huang-Wei/kep-3521-C
[KEP-3521] Part 3: Bug fixes, integration & E2E Test
2022-11-08 15:08:15 -08:00
Kubernetes Prow Robot
d93ea2557f Merge pull request #111908 from dengyufeng2206/new-test0818
spelling fix
2022-11-08 13:51:20 -08:00
Kubernetes Prow Robot
694698ca38 Merge pull request #110485 from Octopusjust/k8s-pr
cidr_set.go: fix several typos
2022-11-08 13:51:00 -08:00
Nilekh Chaudhari
761b7822fc feat: implements kms encryption config hot reload
This change enables hot reload of the encryption config file when the
API server flag --encryption-provider-config-automatic-reload is set to
true. This allows the user to change the encryption config file without
restarting kube-apiserver. The change is detected by polling the file
and is done by using an fsnotify watcher. When the file is updated, it
is processed to generate a new set of transformers and close the old
ones.

Signed-off-by: Nilekh Chaudhari <1626598+nilekhc@users.noreply.github.com>
2022-11-08 21:47:59 +00:00
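
The reload flow described in commit 761b7822fc (detect a changed file, build new transformers, close the old ones) can be sketched as follows. This is a Python illustration only, not the kube-apiserver implementation. `Transformers`, `ConfigReloader`, and `check_once` are hypothetical names, and change detection by content hash is an assumption made for this sketch.

```python
import hashlib


class Transformers:
    """Hypothetical stand-in for the transformers built from one config."""

    def __init__(self, config_bytes):
        self.config = config_bytes
        self.closed = False

    def close(self):
        self.closed = True


class ConfigReloader:
    """Rebuilds transformers whenever the config content changes."""

    def __init__(self, read_config):
        # read_config is a callable returning the current config as bytes,
        # for example a function that reads the config file from disk.
        self.read_config = read_config
        self.digest = None
        self.current = None
        self.check_once()

    def check_once(self):
        """Run one polling step; return True if a reload happened."""
        data = self.read_config()
        digest = hashlib.sha256(data).hexdigest()
        if digest == self.digest:
            return False
        # Build the new set first, so a failure here keeps the old set active.
        new = Transformers(data)
        old, self.current = self.current, new
        self.digest = digest
        if old is not None:
            old.close()
        return True
```

A caller would invoke `check_once()` periodically or from a file-watch callback. Swapping in the new set before closing the old one means there is never a moment without an active set of transformers.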
Abu Kashem
424b23bb15 apiserver: fix defaulting for apf bootstrap configuration 2022-11-08 13:23:09 -08:00
Abu Kashem
c5520d6ba2 apiserver: validate borrowing for flowcontrol API 2022-11-08 13:23:07 -08:00
Abu Kashem
ca949d5188 apiserver: set borrowing defaults for flowcontrol API 2022-11-08 13:22:59 -08:00
Abu Kashem
a76223f8da apiserver: add generated files for borrowing in flowcontrol 2022-11-08 13:16:44 -08:00
Abu Kashem
a7e84a4537 apiserver: add fields for borrowing in apf flowcontrol 2022-11-08 13:16:44 -08:00
Han Kang
a09c6f6ca9 fix credential provider metric names
Change-Id: Idccdf419d53b04f1d8a1968f554a0b6ef32ab992
2022-11-08 12:59:53 -08:00
Kubernetes Prow Robot
3a99a5954d Merge pull request #113629 from andrewsykim/apiserver-identity-beta
Promote APIServerIdentity to Beta
2022-11-08 12:43:10 -08:00
Kubernetes Prow Robot
da735b5415 Merge pull request #113596 from jsafrane/selinux-reconstruction
Reconstruct SELinux mount label
2022-11-08 12:43:03 -08:00
Kubernetes Prow Robot
b3082c5e5b Merge pull request #113582 from wzshiming/fix/grpc-probe-log
Fix grpc probe log
2022-11-08 12:42:56 -08:00
Peter Hunt
95489a26d6 kubelet: add cri metrics to server
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2022-11-08 14:47:08 -05:00
Peter Hunt
1a7388c2ef kubelet/metrics: add cri_metrics
that pulls metrics from the CRI

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2022-11-08 14:47:08 -05:00