73 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
184daed0db Merge pull request #107559 from liggitt/invalid-selectors
Handle invalid selectors properly
2022-01-19 14:49:31 -08:00
Jordan Liggitt
c0af728f43 Handle invalid selectors properly 2022-01-14 12:11:02 -05:00
Wojciech Tyczyński
551790729f Remove selflink references in different testing-related files 2022-01-14 12:58:05 +01:00
Davanum Srinivas
7fd97433f0 Next step in CSI migration for openstack
delete/modify tests that use intree cinder as well.

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2022-01-10 22:07:44 -05:00
Davanum Srinivas
497e9c1971 Cleanup OWNERS files (No Activity in the last year)
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-15 10:34:02 -05:00
Davanum Srinivas
9405e9b55e Check in OWNERS modified by update-yamlfmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-09 21:31:26 -05:00
Oksana Naumov
3af11fc12d Add support for Portworx to csi-translation lib
Signed-off-by: Oksana Naumov <trierra.dev@gmail.com>
2021-11-16 13:26:09 -08:00
Humble Chirammal
7c40eb9ae0 Add support for rbd plugin to csi-translation-lib
In support of csi-migration proposal here:
https://github.com/kubernetes/community/blob/master/contributors/design-proposals/storage/csi-migration.md

Will help with migration of in-tree RBD plugin ( kubernetes.io/rbd)
to RBD CSI driver ( rbd.csi.ceph.com ).

Fixes https://github.com/kubernetes/enhancements/issues/2923

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2021-11-15 23:46:29 +05:30
Kubernetes Prow Robot
a41d166a90 Merge pull request #105904 from mengjiao-liu/structured_logging_scheduler
Migrate assume_cache.go to structured logging
2021-11-09 11:28:00 -08:00
Kubernetes Prow Robot
53addf3ba3 Merge pull request #105858 from jyz0309/migrate-log
Migrated scheduler files binder.go binder_test.go to structured logging
2021-11-02 19:01:09 -07:00
Mengjiao Liu
e262e3ad2f Migrate assume_cache.go to structured logging 2021-11-01 17:43:39 +08:00
jyz0309
07bf08690c migrate log to structured log
Signed-off-by: jyz0309 <45495947@qq.com>

add klog.KObj

Signed-off-by: jyz0309 <45495947@qq.com>

use KObj

Signed-off-by: jyz0309 <45495947@qq.com>

address comment

Signed-off-by: jyz0309 <45495947@qq.com>

remove useless var

Signed-off-by: jyz0309 <45495947@qq.com>

format code

Signed-off-by: jyz0309 <45495947@qq.com>

address comment

Signed-off-by: jyz0309 <45495947@qq.com>

use err key

Signed-off-by: jyz0309 <45495947@qq.com>

use PVC

Signed-off-by: jyz0309 <45495947@qq.com>

improve log message

Signed-off-by: jyz0309 <45495947@qq.com>

address comment

Signed-off-by: jyz0309 <45495947@qq.com>

use pod instead of podName

Signed-off-by: jyz0309 <45495947@qq.com>
2021-10-31 21:11:26 +08:00
Patrick Ohly
a8c930ef46 generic ephemeral volume: graduation to GA
The feature gate gets locked to "true", with the goal to remove it in two
releases.

All code can now assume that the feature is enabled. Tests for "feature
disabled" are no longer needed and are removed.

Some code wasn't using the new helper functions yet. That gets changed while
touching those lines.
2021-10-11 20:54:20 +02:00
Patrick Ohly
bc263f3ba5 scheduler: use generic ephemeral volume helper functions
The name concatenation and ownership check were originally considered small
enough to not warrant dedicated functions, but the intent of the code is more
readable with them.
2021-10-11 17:33:57 +02:00
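The two helpers mentioned in this commit can be pictured with a minimal sketch. It assumes the claim name is the pod name and the volume name joined with a hyphen, and that ownership is decided by comparing the claim's controlling owner UID with the pod UID. The `Pod` and `Claim` types and the function names `claimName` and `checkOwnedBy` are hypothetical stand-ins, not the actual Kubernetes API.

```go
package main

import "fmt"

// Pod and Claim are simplified, hypothetical stand-ins for the real API objects.
type Pod struct {
	Name string
	UID  string
}

type Claim struct {
	Name     string
	OwnerUID string // UID of the controlling owner; empty if there is none
}

// claimName builds the PVC name for a generic ephemeral volume.
// Assumption: the name is "<pod name>-<volume name>".
func claimName(pod Pod, volumeName string) string {
	return pod.Name + "-" + volumeName
}

// checkOwnedBy returns an error when the claim was not created for the pod,
// for example when an unrelated PVC happens to have the same name.
func checkOwnedBy(claim Claim, pod Pod) error {
	if claim.OwnerUID != pod.UID {
		return fmt.Errorf("PVC %q was not created for pod %q", claim.Name, pod.Name)
	}
	return nil
}

func main() {
	pod := Pod{Name: "my-csi-app-inline-volume", UID: "uid-1"}
	name := claimName(pod, "my-csi-volume")
	fmt.Println(name)

	// A claim with the expected name but a different owner is rejected.
	stray := Claim{Name: name, OwnerUID: "uid-2"}
	fmt.Println(checkOwnedBy(stray, pod))
}
```

Naming the two steps makes the call sites read as intent ("get the claim name", "verify ownership") rather than as string concatenation and UID comparison.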
Wei Huang
b7d90ca991 sched: adjust events to register for VolumeBinding plugin 2021-10-07 08:51:04 -07:00
Kubernetes Prow Robot
6c292ce270 Merge pull request #105245 from yibozhuang/lost-pvc-prefilter-optimization
Scheduler volumebinding plugin - handle Lost PVC as UnschedulableAndUnresolvable
2021-10-05 08:23:09 -07:00
Yibo Zhuang
b8fe514232 Scheduler volumebinding plugin - handle Lost PVC as
UnschedulableAndUnresolvable

This change adds an additional check in the volumebinding scheduler
plugin to handle PVC with phase ClaimLost which will allow the
scheduler to return UnschedulableAndUnresolvable during the PreFilter
stage and skip the rest of the node evaluation since the PVC is
bound to a PV that does not exist.

Without this change, the FailedScheduling error message would look like:

0/10 nodes are available: 2 node(s) had taint {node/test: true},
that the pod didn't tolerate, 6 node(s) had taint {node/unhealthy: true},
that the pod didn't tolerate, 2 pvc(s) bound to non-existent pv(s)

This means the scheduler still evaluates every single node only to determine
that the pod cannot be scheduled because the PVC is bound to a non-existent PV.

With this change, the FailedScheduling error message would look like:

0/10 nodes are available: 1 persistentvolumeclaim "foo" bound
to non-existent persistentvolume "bar"

Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
2021-09-30 21:49:46 -07:00
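The early exit described in this commit can be sketched as follows. This is an illustration under simplified assumptions, not the plugin's code: the `Status`, `Code`, `Claim`, and `preFilter` names are hypothetical, and the real PreFilter stage works on the scheduler framework's own types.

```go
package main

import "fmt"

// Code is a simplified, hypothetical stand-in for the scheduler status code.
type Code int

const (
	Success Code = iota
	UnschedulableAndUnresolvable
)

// Status carries the outcome of the check and a human-readable reason.
type Status struct {
	Code   Code
	Reason string
}

type ClaimPhase string

const (
	ClaimBound ClaimPhase = "Bound"
	ClaimLost  ClaimPhase = "Lost"
)

// Claim is a simplified PVC: its name, the PV it points at, and its phase.
type Claim struct {
	Name       string
	VolumeName string
	Phase      ClaimPhase
}

// preFilter rejects the pod once, before any node is evaluated, if one of
// its claims is in the Lost phase (bound to a PV that no longer exists).
func preFilter(claims []Claim) Status {
	for _, c := range claims {
		if c.Phase == ClaimLost {
			return Status{
				Code: UnschedulableAndUnresolvable,
				Reason: fmt.Sprintf(
					"persistentvolumeclaim %q bound to non-existent persistentvolume %q",
					c.Name, c.VolumeName),
			}
		}
	}
	return Status{Code: Success}
}

func main() {
	status := preFilter([]Claim{{Name: "foo", VolumeName: "bar", Phase: ClaimLost}})
	fmt.Println(status.Reason)
}
```

Because the status is returned before the per-node filtering loop, the failure is reported once for the pod instead of once per node.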
Yibo Zhuang
603a4e1931 Enhance ErrReasonPVNotExist in volumebinding scheduler plugin
This change makes the message clearer when PVC(s) are bound to
PV(s) that no longer exist and the scheduler does not select a
node because of it.

Previous error message would look like:
0/2 nodes are available: 2 pvc(s) bound to non-existent pv(s)

Updated message looks like:
0/2 nodes are available: 2 node(s) unavailable due to one or more
pvc(s) bound to non-existent pv(s)

In larger clusters where nodes are unavailable for many different
reasons, the current message can mislead users into thinking that
many PVCs were lost because their PVs were deleted, when in fact a
single PVC may be affected and many nodes were rejected by the
scheduler because of it.

Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
2021-09-27 15:02:35 -07:00
Yecheng Fu
82b50dcb7b scheduler/volumebinding: migrate to use pkg/scheduler/framework/plugins/feature 2021-09-11 10:17:28 +08:00
Patrick Ohly
89cb4d0ee9 scheduler: better reason for delay with generic ephemeral volumes
These events are currently emitted for a pod using a generic ephemeral volume:

  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  3s    default-scheduler  0/1 nodes are available: 1 persistentvolumeclaim "my-csi-app-inline-volume-my-csi-volume" not found.
  Warning  FailedScheduling  2s    default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.

The one about "persistentvolumeclaim not found" is potentially confusing. It
occurs because the scheduler typically checks the pod before the ephemeral
volume controller has had a chance to create the PVC.

This is a bit easier to understand:

  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  4s    default-scheduler  0/1 nodes are available: 1 waiting for ephemeral volume controller to create the persistentvolumeclaim "my-csi-app-inline-volume-my-csi-volume".
  Warning  FailedScheduling  2s    default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
2021-08-30 10:06:59 +02:00
dntosas
cd795fa2eb [scheduler] Remove deprecated volumeSchedulingLatency metric
As part of https://github.com/kubernetes/kubernetes/pull/100720 we
backported the fix to existing releases; in this commit we completely
remove the deprecated metric from the master branch.

Signed-off-by: dntosas <ntosas@gmail.com>
2021-08-23 15:18:16 +03:00
dntosas
7cbac6bde0 [volumeScheduling/metrics] Fix buckets initialization
This metric is measured in seconds, so it makes no sense to start from
1000 as the initial value. This also breaks the scheduler e2e metric, leaving
users unable to compute, for example, their SLO for the scheduler.
Even though this metric is deprecated, it should behave correctly until it is
completely removed, to avoid user confusion.

For example, for each volume created, the minimum value exposed
as a metric is 16.6 min (1000 sec / 60), which is obviously wrong.

In this commit, we change bucket creation to start from reasonable
numbers, copying the increments from the conventions that the
scheduler itself follows.

Signed-off-by: dntosas <ntosas@gmail.com>
2021-08-17 12:49:40 +03:00
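The effect of the starting value can be checked with a small sketch. The `exponentialBuckets` helper below is a local, hypothetical reimplementation of the usual "start, factor, count" bucket scheme, and the factor of 2, the count of 15, and the new start of 0.001 are assumptions chosen for illustration; the commit message only states that the old buckets started at 1000.

```go
package main

import "fmt"

// exponentialBuckets returns count upper bounds: start, start*factor,
// start*factor^2, and so on. Local helper, not a library call.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, 0, count)
	v := start
	for i := 0; i < count; i++ {
		buckets = append(buckets, v)
		v *= factor
	}
	return buckets
}

func main() {
	// Assumed parameters, see the note above.
	old := exponentialBuckets(1000, 2, 15)
	fixed := exponentialBuckets(0.001, 2, 15)

	// With a metric in seconds, the smallest old bucket already covers
	// everything up to 1000 s, so sub-second latencies are indistinguishable.
	fmt.Printf("old smallest bucket: %.0f s (%.1f min)\n", old[0], old[0]/60)
	fmt.Printf("new smallest bucket: %.3f s\n", fixed[0])
}
```

Any observation lands in the first bucket whose upper bound is at least the observed value, which is why a first bound of 1000 seconds hides all realistic scheduling latencies.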
Konstantin Misyutin
29bd66d018 Remove "pkg/controller/volume/scheduling" dependency from "pkg/scheduler/framework/plugins"
All dependencies of VolumeBinding plugin from
"k8s.io/kubernetes/pkg/controller/volume/scheduling" package moved to
"k8s.io/kubernetes/pkg/scheduler/framework/plugins/volumebinding" package:

- whole file pkg/controller/volume/scheduling/scheduler_assume_cache.go
- whole file pkg/controller/volume/scheduling/scheduler_assume_cache_test.go
- whole file pkg/controller/volume/scheduling/scheduler_binder.go
- whole file pkg/controller/volume/scheduling/scheduler_binder_fake.go
- whole file pkg/controller/volume/scheduling/scheduler_binder_test.go

Package "k8s.io/kubernetes/pkg/controller/volume/scheduling/metrics" moved
to "k8s.io/kubernetes/pkg/scheduler/framework/plugins/volumebinding/metrics"
because it is only used in the VolumeBinding plugin and (e2e) tests.

More details are in issue #89930 and PR #102953.

Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
2021-08-13 19:08:45 +08:00
Yecheng Fu
83ee392ed4 implement EnqueueExtensions interface in volumebinding 2021-07-03 08:25:06 +08:00
Yecheng Fu
b522e95aae Prioritizing nodes based on volume capacity: API changes 2021-07-01 10:00:59 +08:00
Abdullah Gharaibeh
46f3e4dfdd Define in-tree scheduler plugin names in separate pkg to break a cyclic dependency when moving plugin defaulting to CC 2021-06-09 15:36:09 -04:00
Dave Chen
c6e65079c7 Validate plugin config for KubeSchedulerConfiguration
Signed-off-by: Dave Chen <dave.chen@arm.com>
2021-04-14 09:30:20 +08:00
Patrick Ohly
5ca0814165 CSIStorageCapacity: use beta API 2021-03-08 20:52:50 +01:00
Kubernetes Prow Robot
027d9e6c25 Merge pull request #99835 from chendave/args
Move VolumeBinding plugin args validation to apis/config/validation
2021-03-08 09:31:42 -08:00
Kubernetes Prow Robot
699b38669f Merge pull request #99731 from AliceZhang2016/newFramework-accept-KubeSchedulerProfile
Make runtime.NewFramework accept KubeSchedulerProfile
2021-03-06 12:49:48 -08:00
Dave Chen
b8394c4700 Move VolumeBinding plugin args validation to apis/config/validation
This PR also loosens the check to allow zero, since the API doc
explains that a value of zero indicates no waiting.

Signed-off-by: Dave Chen <dave.chen@arm.com>
2021-03-06 11:06:39 +08:00
Mengxue Zhang
b38caa91cc make runtime.NewFramework accept KubeSchedulerProfile 2021-03-05 18:30:21 +00:00
Yecheng Fu
d791f7feef Prioritizing nodes based on volume capacity: unit tests 2021-03-05 23:59:25 +08:00
Yecheng Fu
21a43586e7 Prioritizing nodes based on volume capacity 2021-03-05 23:59:25 +08:00
Benjamin Elder
56e092e382 hack/update-bazel.sh 2021-02-28 15:17:29 -08:00
chymy
57fc5f67e7 migrate pkg/scheduler/framework/plugins/volume to structured logs
Signed-off-by: chymy <chang.min1@zte.com.cn>
2021-02-20 08:42:31 +08:00
Jie Shen
f82e3c430c Wrap all errors in pkg/scheduler 2021-01-28 09:13:40 +08:00
rootlh
42c00bc523 fix bug: concurrent map writes error 2020-11-22 01:40:51 +08:00
Yecheng Fu
0961891a7a report UnschedulableAndUnresolvable status instead of an error when PVCs can't find bound
persistent volumes

This is a user error. We shouldn't report an error.
2020-11-05 10:28:40 +08:00
Ali
09b2e8f638 Move scheduler interface to pkg/scheduler/framework 2020-10-13 13:13:27 +11:00
Kubernetes Prow Robot
965137a992 Merge pull request #94692 from alculquicondor/wrap_errors_min
Wrap errors from VolumeBinding and DefaultBinder plugins
2020-09-15 18:27:34 -07:00
Wei Huang
185ba08fcd Move podPassesBasicChecks() to VolumeBinding plugin 2020-09-11 13:54:02 -07:00
Aldo Culquicondor
7fb40fc03c Wrap errors on VolumeBinding plugin
Signed-off-by: Aldo Culquicondor <acondor@google.com>
Change-Id: I23053528ac6857124fddd7f9fa26e122202ff4bd
2020-09-10 16:22:16 -04:00
Yecheng Fu
96d0408a89 fix TestVolumeBinding unit test 2020-08-03 07:06:06 +08:00
Patrick Ohly
ff3e5e06a7 GenericEphemeralVolume: initial implementation
The implementation consists of
- identifying all places where VolumeSource.PersistentVolumeClaim has
  a special meaning and then ensuring that the same code path is taken
  for an ephemeral volume, with the ownership check
- adding a controller that produces the PVCs for each embedded
  VolumeSource.EphemeralVolume
- relaxing the PVC protection controller such that it removes
  the finalizer already before the pod is deleted (only
  if the GenericEphemeralVolume feature is enabled): this is
  needed to break a cycle where foreground deletion of the pod
  blocks on removing the PVC, which waits for deletion of the pod

The controller was derived from the endpointslices controller.
2020-07-09 23:29:24 +02:00
Kubernetes Prow Robot
55d77ade67 Merge pull request #92489 from alculquicondor/sig-storage-ownership
Add SIG storage owner aliases
2020-07-09 00:05:20 -07:00
Aldo Culquicondor
27ec356d76 Add SIG storage owner aliases
And give ownership to pkg/scheduler/framework/plugins/volumebinding

Signed-off-by: Aldo Culquicondor <acondor@google.com>
Change-Id: I4bd89b1745a2be0e458601056ab905bdd6692195
2020-07-07 10:26:16 -04:00
Patrick Ohly
0efbbe8555 CSIStorageCapacity: check for sufficient storage in volume binder
This uses the information provided by a CSI driver deployment for
checking whether a node has access to enough storage to create the
currently unbound volumes, if the CSI driver opts into that checking
with CSIDriver.Spec.VolumeCapacity != false.

This resolves a TODO from commit 95b530366a.
2020-07-06 19:20:10 +02:00
Adhityaa Chandrasekar
ec83143342 scheduler: merge Reserve and Unreserve plugins
Previously, separate interfaces were defined for Reserve and Unreserve
plugins. However, in nearly all cases, a plugin that allocates a
resource using Reserve will likely want to register itself for Unreserve
as well in order to free the allocated resource at the end of a failed
scheduling/binding cycle. Having separate plugins for Reserve and
Unreserve also adds unnecessary config toil. To that end, this patch
aims to merge the two plugins into a single interface called a
ReservePlugin that requires implementing both the Reserve and Unreserve
methods.
2020-06-24 21:10:35 +00:00
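The merged interface can be sketched like this. The signatures are deliberately simplified and hypothetical (plain strings for pod and node); the real scheduler framework methods take additional arguments. The `volumeReserver` type is an invented example plugin, not part of Kubernetes.

```go
package main

import (
	"errors"
	"fmt"
)

// ReservePlugin is a simplified sketch of the merged interface: a plugin
// that reserves a resource must also be able to release it.
type ReservePlugin interface {
	Name() string
	Reserve(pod, node string) error
	Unreserve(pod, node string)
}

// volumeReserver is a hypothetical plugin that tracks free volume slots.
type volumeReserver struct {
	capacity map[string]int    // free slots per node
	held     map[string]string // pod -> node holding its reservation
}

func (v *volumeReserver) Name() string { return "ExampleVolumeReserver" }

// Reserve takes one slot on the node, or fails if none is left.
func (v *volumeReserver) Reserve(pod, node string) error {
	if v.capacity[node] == 0 {
		return errors.New("no free volume slot on node " + node)
	}
	v.capacity[node]--
	v.held[pod] = node
	return nil
}

// Unreserve releases the slot again. It is idempotent, so calling it for a
// pod without a reservation on that node is a no-op.
func (v *volumeReserver) Unreserve(pod, node string) {
	if n, ok := v.held[pod]; !ok || n != node {
		return
	}
	v.capacity[node]++
	delete(v.held, pod)
}

func main() {
	var p ReservePlugin = &volumeReserver{
		capacity: map[string]int{"node-a": 1},
		held:     map[string]string{},
	}
	if err := p.Reserve("pod-1", "node-a"); err != nil {
		fmt.Println("reserve failed:", err)
		return
	}
	// Suppose binding fails later in the cycle: the slot is freed again.
	p.Unreserve("pod-1", "node-a")
	fmt.Println("second reserve error:", p.Reserve("pod-2", "node-a"))
}
```

Putting both methods on one interface means a plugin cannot be registered for Reserve without also providing the cleanup path, which is the config simplification the commit message describes.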
Yecheng Fu
f899976b41 fixup 2020-06-24 14:14:03 +08:00