The types were different, so the diff output is not useful; both
should be pointers:
```
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]: I0205 19:44:40.222259 2737 status_manager.go:642] Pod status is inconsistent with cached status for pod "prometheus-k8s-1_openshift-monitoring(0e9137b8-3bd2-4353-b7f5-672749106dc1)", a reconciliation should be triggered:
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]: interface{}(
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]: - s"&PodStatus{Phase:Running,Conditions:[]PodCondition{PodCondition{Type:Initialized,Status:True,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2020-02-05 19:13:30 +0000 UTC,Reason:,Message:,},PodCondit>
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]: + v1.PodStatus{
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]: + Phase: "Running",
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]: + Conditions: []v1.PodCondition{
```
Previously, the various Merge() policies of the TopologyManager each
returned their own lifecycle.PodAdmitResult result. However, for
consistency in any failed admits, this is better handled in the
top-level TopologyManager, with each policy only returning a boolean
indicating whether or not it would admit the pod. This commit changes
the semantics to match this logic.
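A minimal sketch of the shape this implies, using placeholder types rather
than the real topologymanager and lifecycle APIs; the reason string is
illustrative only:
```go
package sketch

import "fmt"

// Placeholder types; the real TopologyManager uses richer TopologyHint
// and lifecycle.PodAdmitResult types than these.
type TopologyHint struct {
	Preferred bool
}

type PodAdmitResult struct {
	Admit   bool
	Reason  string
	Message string
}

// Each policy now only merges hints and answers "would I admit this pod?".
type Policy interface {
	Merge(providersHints [][]TopologyHint) (TopologyHint, bool)
}

// The top-level manager owns the translation of that boolean into a
// single, consistently formatted admit result.
func admitPod(p Policy, hints [][]TopologyHint) PodAdmitResult {
	bestHint, admit := p.Merge(hints)
	if !admit {
		return PodAdmitResult{
			Admit:   false,
			Reason:  "TopologyAffinityError", // illustrative reason string
			Message: fmt.Sprintf("rejected by topology policy, best hint: %+v", bestHint),
		}
	}
	return PodAdmitResult{Admit: true}
}
```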
Previously, we unconditionally removed *all* topology hints for a pod
whenever just one container was being removed. This commit makes it so
we only remove the hints for the single container being removed, and
then conditionally remove the pod from podTopologyHints[podUID] once
no containers are left in it.
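A minimal sketch of the intended bookkeeping, using a simplified map type
rather than the real TopologyManager state:
```go
package sketch

// Simplified stand-in; the real code stores per-resource TopologyHints
// rather than empty structs.
type TopologyHint struct{}

type manager struct {
	// podUID -> containerName -> hint
	podTopologyHints map[string]map[string]TopologyHint
}

// removeContainer drops only the hints for the container being removed,
// and deletes the pod entry once no containers are left under it.
func (m *manager) removeContainer(podUID, containerName string) {
	delete(m.podTopologyHints[podUID], containerName)
	if len(m.podTopologyHints[podUID]) == 0 {
		delete(m.podTopologyHints, podUID)
	}
}
```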
Add a unit test for updating the container hugepage limit.
Add a warning message about ignoring case.
Update error handling for hugepage size requirements.
Signed-off-by: sewon.oh <sewon.oh@samsung.com>
As of https://github.com/kubernetes/kubernetes/pull/72831, the minimum
docker version is 1.13.1 (and the minimum API version is 1.26). The
only time the `RuntimeAdmitHandler` returns anything other than accept
is when the Docker API version < 1.24. In other words, we can be
confident that Docker will always support sysctl.
As a result, we can delete this unnecessary and docker-specific code.
- Move the remaining logic from mergeProvidersHints into a generic, top-level
mergeFilteredHints function.
- Add numaNodes as a parameter in order to make the function generic.
- Move the single-NUMA-node-specific check into the single-numa-node Merge
function.
- Move the initial 'filtering' functionality into a generic
filterProvidersHints function in the top-level policy.go.
- Call the new function from the top-level Merge function.
- Rename some variables/parameters to reflect these changes.
A recent change made it so that the CPUManager receives a list of
initial containers that exist on the system at startup. This list can be
non-empty, for example, after a kubelet restart.
This commit ensures that the CPUManager's containerMap structure is
initialized with the containers from this list.
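A minimal sketch of that initialization, with a simplified stand-in for the
containerMap type (not the real containermap package):
```go
package sketch

// Simplified stand-in for the CPUManager's containerMap bookkeeping:
// containerID -> (podUID, containerName).
type containerMap map[string]struct{ podUID, containerName string }

type manager struct {
	containerMap containerMap
}

// On startup, seed the manager's containerMap with the containers that
// already exist on the node (e.g. after a kubelet restart), so later
// lookups by containerID resolve to (podUID, containerName).
func (m *manager) start(initialContainers containerMap) {
	if m.containerMap == nil {
		m.containerMap = containerMap{}
	}
	for id, entry := range initialContainers {
		m.containerMap[id] = entry
	}
}
```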
Previously, `pkg/kubelet/qos` contained two different docker-specific
OOM constants. One set the oom adj for sandbox docker containers and the
other set the oom adj for the docker daemon. Move both to be closer to
their actual usages in `dockershim`.
This change addresses a TODO and leads us towards the overall goal of
making `pkg/kubelet` runtime agnostic except for `dockershim`.
The logic has been updated to match the logic of the best-effort policy
except in two places:
1) The hint filtering function has been updated to allow "don't care"
hints encoded with a `nil` affinity mask to pass through the filter, in
addition to hints that have just a single NUMA bit set.
2) After calculating the `bestHint` we transform "don't care" affinities
encoded as having all NUMA bits set in their affinity masks into "don't
care" affinities encoded as `nil`.
- Initialize bestHint to TopologyHint{}
- Update checks.
- Move the generic unit test case into policy-specific tests and update the
expected outcome to reflect these changes.
- Restructure function
- Remove bug fix for catching {nil true} - To be fixed in later commit
- Restore unit tests to original state for testing filterHints
This is to keep consistency with the other policies.
This change may be made across all policies in a future PR, but it is out of
scope for this PR for now.
- Best Effort Policy: Return a hint with nil affinity, as opposed to
defaultAffinity, when the provider has no preference for NUMA affinity or no
possible NUMA affinities.
- Single NUMA Node Policy: Remove defaultHint from mergeProvidersHints.
Instead return appropriate TopologyHint where required.
- Update unit tests to reflect changes. Some test cases moved into
individual policy test functions due to differing returned affinities
per policy.
- Remove getHintMatch method.
- Replace with simplified versions of mergePermutation and
iterateAllProviderTopologyHints methods - as used in best-effort.
- Remove getHintMatch unit tests.
- Update filterHints test to reflect changes in previous commit.
- Some common test cases achieve differing expected results based on
policy due to independent merge strategies. These cases are moved into
individual policy-based test functions.
- Only append valid preferred-true hints to filtered
- Return true if allResourceHints only consist of
nil-affinity/preferred-true hints: {nil true}, update defaultHint
preference accordingly.
Explanation taken from original commit:
- Change the current method of finding the best hint.
Instead of going over all permutations, sort the hints and find
the narrowest hint common to all resources.
- Break out early when merging to a preferred hint is not possible
- Remove need to pass policy and numaNodes as arguments
- Remove PolicySingleNUMANode special case check in policy_best_effort
- Add mergeProviderHints base to policy_single_numa_node for upcoming
commit
This check is redundant since we protect this call with a call to
`m.sourcesReady.AllReady()` earlier on. Moreover, having this check in
place means that we will leave some stale state around in cases where
there are actually no active pods in the system and this loop hasn't
cleaned them up yet. This can happen, for example, if a pod exits while
the kubelet is down for some reason. We see this exact case being
triggered in our e2e tests, where a test has been failing since October
when this change was first introduced.
Add unit tests for OomWatcher that actually test the logic defined in
the `Start` method. As a result of an earlier refactor, it's now trivial
to mock the OOMInstance events which the `oom_watcher` is supposed to be
watching.
Clean up code in PLEG which was only necessary for the `rkt` runtime.
Rkt is no longer a built-in runtime and docker(shim) uses the CRI, so
it's safe to remove this code entirely.
This diff removes the last mentions of `rkt` in the kubelet.
This diff contains a strict refactor; there are no behavioral changes.
Address a long-standing TODO in `oom_watcher_linux_test.go` around test
coverage. We refactor our `oom.Watcher` so it takes in a struct
fulfilling the `streamer` interface (i.e. one that defines a `StreamOoms` method).
In production, we will continue to use the `oomparser` from `cadvisor`.
However, for testing purposes, we can now create our own `fakeStreamer`,
and control how it streams `oomparser.OomInstance`. With this fake, we
can implement richer unit testing for the `oom.Watcher` itself.
Actually adding the additional unit tests will come in a later commit.
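A minimal sketch of the fake, with simplified stand-ins for cadvisor's
oomparser types:
```go
package sketch

// Simplified stand-in for cadvisor's oomparser.OomInstance.
type OomInstance struct {
	Pid           int
	ProcessName   string
	ContainerName string
}

// streamer is the narrow interface the watcher depends on, so tests can
// substitute a fake for the real oomparser.
type streamer interface {
	StreamOoms(chan *OomInstance)
}

// fakeStreamer replays a canned list of OOM events into the channel,
// mimicking a live parser.
type fakeStreamer struct {
	oomInstancesToStream []*OomInstance
}

var _ streamer = &fakeStreamer{}

func (f *fakeStreamer) StreamOoms(out chan *OomInstance) {
	for _, inst := range f.oomInstancesToStream {
		out <- inst
	}
}
```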
Part of efforts to clean up mentions of rkt in kubelet.
rkt was removed entirely in 1.11, in favor of using `rktlet` and CRI
instead. It should no longer be listed at all as a runtime.
This commit is part of a larger effort to clean up references to `rkt`
in the kubelet.
Previously, this comment hard-coded which integrations required
the cadvisor stats provider. The comment has grown stale
(i.e. referenced rkt and did not reference cri-o).
Update the comment to instead point to the code which determines which
integrations need the cadvisor stats provider.
The `FakeDockerClient` had a number of methods defined on it which were
not being called anywhere. The majority were of the form `Assert...`.
In the spirit of removing dead code, remove the methods which aren't
being called.
Expose the measurement that kubelet uses to judge that "PLEG is
unhealthy". If we can observe the measurement growing then we can
alert before the node goes unhealthy.
Note that the existing metrics PLEGRelistInterval and
PLEGRelistDuration are poor for this, because when relist() gets
stuck they are never updated.
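One plausible shape for such a measurement, using the plain Prometheus client
rather than the kubelet's metrics wrappers; the metric name here is
illustrative, not the one the kubelet registers:
```go
package sketch

import "github.com/prometheus/client_golang/prometheus"

// Illustrative gauge; the actual metric name and registration in the
// kubelet differ.
var plegLastSeen = prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "pleg_last_seen_seconds",
	Help: "Timestamp of the last time PLEG was observed to be active.",
})

func init() {
	prometheus.MustRegister(plegLastSeen)
}

// Updating the gauge on every relist means it keeps advancing only while
// relist is healthy; if relist() gets stuck, the gauge stops moving and
// alerting can fire on its age before the node is marked unhealthy.
func onRelist() {
	plegLastSeen.SetToCurrentTime()
}
```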
Signed-off-by: Bryan Boreham <bryan@weave.works>
The `recorder.PastEventf` method wasn't actually working as advertised.
It was supposed to accept a timestamp, which would be used when
generating the event. However, as the
[source code](https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/tools/record/event.go#L316)
shows, this `timestamp` was never actually used.
In other words, `PastEventf` is identical to `Eventf`.
We have two options: one would be to fix `PastEventf` so that it works
as advertised. The other would be to delete `PastEventf` and only
support `Eventf`.
Ultimately, I could only find one use of `PastEventf` in the code base,
so I propose we just delete `PastEventf` and convert all uses to
`Eventf`.
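A sketch of what such a conversion looks like; the object, reason, and message
here are placeholders, not the actual call site:
```go
package sketch

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/tools/record"
)

// emitEvent shows the shape of the conversion. The event reason and
// message are placeholders.
func emitEvent(recorder record.EventRecorder, obj runtime.Object, err error) {
	// Before (the metav1.Time argument was silently ignored):
	//   recorder.PastEventf(obj, metav1.Now(), v1.EventTypeWarning, "SomeReason", "operation failed: %v", err)
	// After: identical behavior, without the misleading timestamp parameter.
	recorder.Eventf(obj, v1.EventTypeWarning, "SomeReason", "operation failed: %v", err)
}
```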
This change is to prevent problems when we remove the V1->V2 migration
code in the future. Without this, the checksums of all checkpoints would
be hashed with the name CPUManagerCheckpointV2 embedded inside of them,
which is undesirable. We want the checkpoints to be hashed with the name
CPUManagerCheckpoint instead.
The updated CPUManager from PR #84462 implements logic to migrate the
CPUManager checkpoint file from an old format to a new one. To do so, it
defines the following types:
```
type CPUManagerCheckpoint = CPUManagerCheckpointV2
type CPUManagerCheckpointV1 struct { ... }
type CPUManagerCheckpointV2 struct { ... }
```
This replaces the old definition of just:
```
type CPUManagerCheckpoint struct { ... }
```
Code was put in place to ensure proper migration from checkpoints in V1
format to checkpoints in V2 format. However (and this is a big however),
all of the unit tests were performed on V1 checkpoints that were
generated using the type name `CPUManagerCheckpointV1` and not the
original type name of `CPUManagerCheckpoint`. As such, the checksum in
the checkpoint file uses the `CPUManagerCheckpointV1` type to calculate
its checksum and not the original type name of `CPUManagerCheckpoint`.
This causes problems in the real world since all pre-1.18 checkpoint
files will have been generated with the original type name of
`CPUManagerCheckpoint`. When verifying the checksum of the checkpoint
file across an upgrade to 1.18, the checksum is calculated assuming
a type name of `CPUManagerCheckpointV1` (which is incorrect) and the
file is seen to be corrupt.
This patch ensures that all V1 checksums are verified against a type
name of `CPUManagerCheckpoint` instead of `CPUManagerCheckpointV1`.
It also locks the algorithm used to calculate the checksum in place,
since it will never change in the future (for pre-1.18 checkpoint
files at least).
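A simplified, self-contained illustration of why the type name matters: the
real code uses its own checksum package, but the mechanism is similar in that
the checksum is computed over a dump of the object that embeds its Go type
name.
```go
package main

import (
	"fmt"
	"hash/fnv"
)

type CPUManagerCheckpoint struct {
	PolicyName string
	Entries    map[string]string
}

// Same fields, different type name.
type CPUManagerCheckpointV1 struct {
	PolicyName string
	Entries    map[string]string
}

// checksum is a simplified stand-in for the checkpoint checksum: hash a
// printed dump of the object, which embeds the Go type name.
func checksum(obj interface{}) uint32 {
	h := fnv.New32a()
	fmt.Fprintf(h, "%#v", obj)
	return h.Sum32()
}

func main() {
	old := CPUManagerCheckpoint{PolicyName: "static"}
	renamed := CPUManagerCheckpointV1{PolicyName: "static"}
	// The two checksums differ purely because of the type name embedded
	// in the dump, even though the field values are identical.
	fmt.Println(checksum(old), checksum(renamed))
}
```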
The information associated with these containers is used to migrate the
CPUManager state from its old format to its new one (i.e. keyed off of
podUID and containerName instead of containerID).
For now, we just pass 'nil' as the set of 'initialContainers' for
migrating from the old state semantics to the new ones. In a subsequent
commit we will pull this information from higher layers so that we can
pass it down at this stage properly.
Previously, the state was keyed off of containerID instead of podUID and
containerName. Unfortunately, this is no longer possible as we move to a
model where we allocate CPUs to containers at pod admission time rather
than at container start time.
This patch is the first step towards full migration to the new
semantics. Only the unit tests in cpumanager/state are passing. In
subsequent commits we will update the CPUManager itself to use these new
semantics.
This patch also includes code to do migration from the old checkpoint format
to the new one, assuming the existence of a ContainerMap with the proper
mapping of (containerID)->(podUID, containerName). A subsequent commit
will update code in higher layers to make sure that this ContainerMap is
made available to this state logic.
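A minimal sketch of that migration, using simplified stand-ins for the
ContainerMap and the two state formats:
```go
package sketch

// Simplified stand-ins: the real code uses the containermap package and
// the CPUManager's checkpoint types.
type containerMap map[string]struct{ podUID, containerName string }

// Old state: containerID -> assigned CPU set (represented as a string here).
type stateV1 map[string]string

// New state: podUID -> containerName -> assigned CPU set.
type stateV2 map[string]map[string]string

// migrate rekeys the old state using the (containerID) -> (podUID,
// containerName) mapping supplied by higher layers.
func migrate(old stateV1, cm containerMap) stateV2 {
	next := stateV2{}
	for containerID, cpus := range old {
		entry, ok := cm[containerID]
		if !ok {
			// Without a mapping we cannot rekey this entry; the real
			// code surfaces this as a migration error.
			continue
		}
		if next[entry.podUID] == nil {
			next[entry.podUID] = map[string]string{}
		}
		next[entry.podUID][entry.containerName] = cpus
	}
	return next
}
```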
The trailing for loop removed in this PR could include both active and
inactive references. This modification guarantees that at most one
reference per container is returned.
For Windows, the CPU requests (Shares, Count, and Maximum) are mutually
exclusive; however, Kubernetes sends them all anyway in the pod spec.
When using dockershim this is not an issue, as Docker checks for this specific
situation here: 1bd184a4c2/daemon/daemon_windows.go (L87-L106)
However, when using CRI-Containerd these pods fail to spawn with an error from
hcsshim. This PR intends to filter these values before they are sent to the
CRI rather than relying on the runtime to do it.
Related to: https://github.com/kubernetes/kubernetes/issues/84804
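A hedged sketch of the kind of filtering described, with a struct that mirrors
the mutually exclusive Windows CPU fields; the precedence order chosen here is
illustrative, not necessarily what the PR implements:
```go
package sketch

// Mirrors the mutually exclusive Windows CPU fields in the CRI's
// WindowsContainerResources message (simplified).
type windowsCPUResources struct {
	CPUShares  int64
	CPUCount   int64
	CPUMaximum int64
}

// filterCPUResources keeps only one of the mutually exclusive CPU
// settings before handing the config to the runtime. The precedence used
// here is illustrative; the important part is that at most one field
// survives.
func filterCPUResources(r windowsCPUResources) windowsCPUResources {
	switch {
	case r.CPUCount > 0:
		return windowsCPUResources{CPUCount: r.CPUCount}
	case r.CPUMaximum > 0:
		return windowsCPUResources{CPUMaximum: r.CPUMaximum}
	default:
		return windowsCPUResources{CPUShares: r.CPUShares}
	}
}
```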