Commit Graph

44631 Commits

Author SHA1 Message Date
Kevin Klues
155562dd2e Fix bug in TopologyManager with merging hints when NUM_NUMA > 2
Before this fix, hint permutations such as:

	permutation: [{11 true} {0101 true}]

Could result in merged hints of:

	mergedHint: {01 true}

This was possible because both hints in the permutation contain a "preferred"
allocation (i.e. the full set of NUMA nodes set in the affinity bitmask are
*required* to satisfy the allocation). With this in place, the simplified logic
we had simply kept the merged hint as preferred as well.

However, what we really want is to ensure that the merged hint is only
preferred if *true* alignment of all resources is possible (i.e. if all hints
in the permutation are preferred AND their affinities are exactly equal).

The only exception to this is if *no* topology information is provided by a
given hint provider. In this case, we assume alignment doesn't matter and only
consider the resources that actually have hints provided for them.

This changes the semantics of permutations of the form:

	permutation: [{111 true} {011 true}]

To now result in the merged hint of:

	mergedHint: {011 false}

Instead of:

	mergedHint: {011 true}

This is arguably how it should always have been though (because a hint should
not be preferred if true alignment isn't possible), and two tests have had to
change to accommodate these new semantics.

This commit changes the merge function to implement the updated logic, adds a
test to verify it is functioning correctly, and updates the two tests mentioned
above to adjust to the new semantics.
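
The merge rule described above can be sketched roughly as follows. This is a minimal illustration, not the TopologyManager code itself: the `hint` type, the `hasAffinity` flag (standing in for "no topology information provided"), the `mergeHints` function, and the plain `uint64` bitmask are all hypothetical names chosen for the example. It assumes the merged affinity is the bitwise AND of the provided affinities.

```go
package main

import "fmt"

// hint is a simplified, hypothetical stand-in for a topology hint.
// affinity is a bitmask of NUMA nodes; hasAffinity is false when the
// hint provider supplied no topology information.
type hint struct {
	affinity    uint64
	hasAffinity bool
	preferred   bool
}

// mergeHints ANDs together the affinities of all hints that carry topology
// information. The result is preferred only if every such hint is preferred
// AND all of their affinities are exactly equal.
func mergeHints(defaultAffinity uint64, permutation []hint) hint {
	merged := hint{affinity: defaultAffinity, hasAffinity: true, preferred: true}
	seenFirst := false
	var first uint64
	for _, h := range permutation {
		if !h.hasAffinity {
			// Assumption: hints without topology information are ignored.
			continue
		}
		merged.affinity &= h.affinity
		if !h.preferred {
			merged.preferred = false
		}
		if !seenFirst {
			first, seenFirst = h.affinity, true
		} else if h.affinity != first {
			// Affinities differ, so true alignment is not possible.
			merged.preferred = false
		}
	}
	return merged
}

func main() {
	a := mergeHints(0b1111, []hint{{0b0011, true, true}, {0b0101, true, true}})
	fmt.Printf("{%04b %t}\n", a.affinity, a.preferred) // {0001 false}

	b := mergeHints(0b111, []hint{{0b111, true, true}, {0b011, true, true}})
	fmt.Printf("{%03b %t}\n", b.affinity, b.preferred) // {011 false}
}
```

In this sketch, both permutations from the message above merge to a non-preferred hint, because the individual affinities are not identical even though each hint is preferred on its own.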

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2022-02-10 22:07:51 +00:00
Wojciech Tyczyński
7314286efd Fix validation of event updates 2022-02-10 20:01:45 +01:00
Kubernetes Prow Robot
542a979c03 Merge pull request #108029 from deads2k/just-runtimeconfig
update the --runtime-config handling to ensure that user preferences always take priority over hardcoded preferences
2022-02-10 10:15:57 -08:00
Kubernetes Prow Robot
973e77ceb1 Merge pull request #102330 from tnqn/replicaset-optimization
Add controllerUID index to improve ReplicaSetController performance
2022-02-10 10:15:46 -08:00
Sascha Grunert
effbcd3a0a Add support for CRI verbose fields
The remote runtime implementation now supports the `verbose` fields,
which are required for consumers like cri-tools to enable multi CRI
version support.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2022-02-10 17:12:26 +01:00
David Eads
41b2662bac update resourceconfig to have per-resource preferences take priority 2022-02-10 10:53:16 -05:00
Kubernetes Prow Robot
dd3a30fdbb Merge pull request #106398 from shawnhanx/controller_utils
should omit comparison to bool constant in pkg/controller/controller_utils.go
2022-02-10 07:17:46 -08:00
Ciprian Hacman
0819451ea6 Clean up logic for deprecated flag --container-runtime in kubelet
Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>
2022-02-10 13:26:59 +02:00
Kubernetes Prow Robot
3b4a9cdfff Merge pull request #108007 from endocrimes/dani/cm-remove-docker
cm: Remove legacy docker references
2022-02-10 03:23:47 -08:00
Gunju Kim
eb4cd9ab4e Check taint/toleration before accepting pods, except for static pods 2022-02-10 19:39:26 +09:00
Kubernetes Prow Robot
518a3c2f70 Merge pull request #107108 from linxiulei/fix_pid
Read number of running processes from /proc/loadavg.
2022-02-10 01:15:47 -08:00
Mengjiao Liu
b3444cd7e8 Remove feature gate SetHostnameAsFQDN 2022-02-10 15:16:01 +08:00
Kubernetes Prow Robot
40c2d04946 Merge pull request #107112 from linxiulei/fix_pidmax
Consider threads-max when deciding MaxPID.
2022-02-09 20:49:45 -08:00
Kante
62eb70c1b3 reuse InformerFactory in scheduler tests (#107835)
* reuse informer in scheduler tests

Signed-off-by: kerthcet <kerthcet@gmail.com>

* reduce construct two informers

Signed-off-by: kerthcet <kerthcet@gmail.com>

* instantiate informerfactory error

Signed-off-by: kerthcet <kerthcet@gmail.com>
2022-02-09 16:53:58 -08:00
Kubernetes Prow Robot
0dcd6eaa0d Merge pull request #103934 from boenn/tainttoleration
De-duplicate predicate (known as filter now) logic shared in kubelet and scheduler
2022-02-09 16:53:46 -08:00
Kubernetes Prow Robot
cfb2219ded Merge pull request #107175 from roycaihw/doc/webhook-rule-validation
Fix examples of admission registration rules that contain wildcards
2022-02-09 15:35:44 -08:00
Kubernetes Prow Robot
e74c42aaf2 Merge pull request #107880 from liggitt/kubectl-auth-token
Add command to request a bound service account token
2022-02-09 14:10:01 -08:00
Jordan Liggitt
19d71bb5d5 Validate and populate metadata fields in token request 2022-02-09 14:05:53 -05:00
haoyun
81b357295c feat: add container name when violate quota constraints
Signed-off-by: haoyun <hy352144278@gmail.com>
Co-authored-by: chenwen  <wen.chen@daocloud.io>
2022-02-09 19:17:44 +08:00
He Xiaoxi
ae9f117387 Remove tolerate-unready-endpoints annotation
Remove the `tolerate-unready-endpoints` annotation in Service, deprecated
since 1.11; use `Service.spec.publishNotReadyAddresses` instead.

Signed-off-by: He Xiaoxi <tossmilestone@gmail.com>
2022-02-09 17:17:29 +08:00
Kubernetes Prow Robot
97e20b4ecb Merge pull request #108001 from denkensk/check-activeq-len
check activeQ.Len() before Pop()
2022-02-08 16:06:28 -08:00
Kubernetes Prow Robot
0e31414f3e Merge pull request #107974 from sanposhiho/log-extender-error
Add log for the error extender returns
2022-02-08 16:06:16 -08:00
Kubernetes Prow Robot
8d01b02c60 Merge pull request #107096 from hakman/remove_non-masquerade-cidr
Remove deprecated flag --non-masquerade-cidr in kubelet
2022-02-08 12:42:50 -08:00
Kubernetes Prow Robot
24e5d1fdb7 Merge pull request #107432 from denkensk/graduate-nonpreemptingpriority-to-ga
Graduate NonPreemptingPriority to GA
2022-02-08 11:05:03 -08:00
Kubernetes Prow Robot
6a5b3da1d8 Merge pull request #107236 from cyclinder/fix_bug_WaitForAttach
GCEPD: fix incorrect return value in WaitForAttach
2022-02-08 09:59:02 -08:00
Danielle Lancashire
3630328fd9 eviction: Deflake TestStart
TestStart was previously flaky. In approx 100_000 local runs, it failed
about 70% of the time, and has been mentioned as a flaky unit test in
the past.

This flake was due to a race condition between the logic as written and the
Go scheduler. UpdateThreshold calls `notifier.Start(events)` in a new
goroutine, which is not guaranteed to run immediately.

This meant that if `m.Start()` was called before `notifier.Start()`, the
test would fail, as the notifier would not have been started before the
4 events were processed and lock released.

Here, we update the test to more closely match the intended application
behaviour, and have events passed to the channel when `Start` is called
on the notifier.

This ensures that `Start` gets called and additionally validates
that the correct channel is provided to the notifier.

Stop was never called previously, as it only gets called on a subsequent
call to UpdateThreshold. `AnyTimes()` hid that this did not occur.
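
The approach can be illustrated with a small, self-contained sketch. It is not the kubelet test itself: `fakeNotifier` and `runManager` are hypothetical names, and the real test uses a generated mock rather than a hand-written fake. The point is only that when the events are emitted from inside `Start`, they cannot be produced before the notifier has actually been started, regardless of when the goroutine is scheduled.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeNotifier is a hypothetical stand-in for the mocked notifier.
type fakeNotifier struct {
	n int // number of events to emit once Start is called
}

// Start emits the events itself, so no event can exist before the
// notifier has been started.
func (f *fakeNotifier) Start(events chan<- struct{}) {
	for i := 0; i < f.n; i++ {
		events <- struct{}{}
	}
}

// runManager starts the notifier in a new goroutine (which may be scheduled
// at any later time) and then processes exactly want events.
func runManager(n *fakeNotifier, want int) int {
	events := make(chan struct{})
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		n.Start(events)
	}()

	processed := 0
	for processed < want {
		<-events // blocks until Start has run and sent an event
		processed++
	}
	wg.Wait()
	return processed
}

func main() {
	fmt.Println(runManager(&fakeNotifier{n: 4}, 4)) // 4
}
```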
2022-02-08 17:03:44 +01:00
Danielle Lancashire
c198062da4 cm: Remove legacy docker references
Dockershim and built-in Docker support are gone. Cleans up dead code
references to them.
2022-02-08 16:25:04 +01:00
Jorik Jonker
27b8f13763 kubelet: expose OOM metrics
cAdvisor has code to expose OOM metrics since 0.40.0, but this was not
included in Kubelet so far. This commit enables it.

Signed-off-by: Jorik Jonker <jorik.jonker@eu.equinix.com>
2022-02-08 12:24:25 +01:00
Alex Wang
541907334e graduate nonpreemptingpriority to ga 2022-02-08 18:11:23 +08:00
Alex Wang
ca50e459b0 check activeQ len before pop 2022-02-08 18:05:05 +08:00
Kensei Nakada
1ac9444c00 Change the wordings and set the log level
Co-authored-by: Alex Wang <453102040@qq.com>
2022-02-08 11:16:42 +09:00
Patrick Ohly
1f341ee7b5 kube-scheduler: downgrade namespace log message from "error" to "info"
GetNamespaceLabelsSnapshot has a fallback when it gets errors when looking up a
namespace, therefore reporting the error is more informational than a real
error. In particular, not finding the namespace is normal when running
test/integration/scheduler_perf and happens so frequently that there is a lot
of output on stderr:

E0120 12:19:09.204768   95305 plugin.go:138] "getting namespace, assuming empty set of namespace labels" err="namespace \"namespace-1\" not found" namespace="namespace-1"
2022-02-07 08:59:19 +01:00
sanposhiho
ed23e2162a Add log to see the extender's error 2022-02-07 02:50:08 +09:00
Jordan Liggitt
3a132bd206 Fix kubelet cri round trip test 2022-02-05 17:59:29 -05:00
Kubernetes Prow Robot
9b09612d1b Merge pull request #107656 from dims/add-labels-when-there-are-sig-aliases-used-in-approvers-reviewers
Add labels when there are sig aliases used in approvers/reviewers
2022-02-05 02:20:50 -08:00
Kubernetes Prow Robot
6410ddaba9 Merge pull request #107623 from bbarnes52/podtopologyoptimization
Optimize pod topology spread performance
2022-02-04 15:58:58 -08:00
Kubernetes Prow Robot
469c4c4a30 Merge pull request #106715 from aojea/dual_hostnet_pods
set secondary address on host-network pods
2022-02-04 12:17:30 -08:00
atiratree
2fc401d0a2 add gc metrics and collect sync errors 2022-02-04 20:21:30 +01:00
Brian Barnes
4222d3a48e optimize pod topology spread 2022-02-04 10:27:58 -08:00
Kubernetes Prow Robot
c1190f5aa2 Merge pull request #107935 from ravisantoshgudimetla/wire-contexts-disruption
Wire contexts to Disruption controllers
2022-02-04 10:08:13 -08:00
Kubernetes Prow Robot
8f5a12d701 Merge pull request #107924 from gnufied/fix-reconciler-flake
fix flake in detach tests
2022-02-04 10:08:01 -08:00
Antonio Ojea
bc8e7ac1a0 ignore CRI PodSandboxNetworkStatus for host network pods 2022-02-04 18:41:57 +01:00
ravisantoshgudimetla
65ff81757d Wire contexts to Disruption controllers 2022-02-04 10:32:04 -05:00
Kubernetes Prow Robot
8e5089ad17 Merge pull request #107666 from aojea/dualga
dual-stack feature gate ga
2022-02-03 21:25:59 -08:00
Sebastian Sterk
efdfaab301 Update pkg/apis/coordination/validation/validation.go
Co-authored-by: Katrina Verey <kn.verey@gmail.com>
2022-02-03 18:25:43 +01:00
Kubernetes Prow Robot
baad1caee9 Merge pull request #107900 from smarterclayton/pr-107854
kubelet: Pods that have terminated before starting should not block startup
2022-02-02 15:51:45 -08:00
Hemant Kumar
4b589dfd7a fix flake in detach tests 2022-02-02 17:28:13 -05:00
Gunju Kim
3ce5c944a8 kubelet: Clean up a static pod that has been terminated before starting
- Allow a podWorker to start if it is blocked by a pod that has been
  terminated before starting
- When a pod can't start AND has already been terminated, exit cleanly
- Add a unit test that exercises race conditions in pod workers
2022-02-02 16:05:32 -05:00
jlsong01
d66b3edd65 allocate a unique scheme for each test to fix concurrent usage issue 2022-02-02 15:22:59 +08:00
Sebastian Sterk
8e9ea7b481 simple grammar fix 2022-02-02 00:04:35 +01:00