Commit Graph

7826 Commits

Author SHA1 Message Date
André Martins
a5365d5be1 dockershim/network: fix panic for cni plugins in IPv4/IPv6 dual-stack mode
```
 k8s.io/kubernetes/pkg/kubelet/dockershim/network/cni.(*cniNetworkPlugin).GetPodNetworkStatus(0xc000a04370, 0xc000b89a62, 0xb, 0xc000b89a49, 0x18, 0x42edffb, 0x6, 0xc000cfa340, 0x40, 0xc000ced7d0, ...)
         /workspace/anago-v1.16.0-beta.1.787+48ca054daba9e6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/dockershim/network/cni/cni_others.go:78 +0x420
 k8s.io/kubernetes/pkg/kubelet/dockershim/network.(*PluginManager).GetPodNetworkStatus(0xc000a51880, 0xc000b89a62, 0xb, 0xc000b89a49, 0x18, 0x42edffb, 0x6, 0xc000cfa340, 0x40, 0x0, ...)
         /workspace/anago-v1.16.0-beta.1.787+48ca054daba9e6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/dockershim/network/plugins.go:391 +0x1f9
 k8s.io/kubernetes/pkg/kubelet/dockershim.(*dockerService).getIPsFromPlugin(0xc00029b600, 0xc000c25cb0, 0x40, 0x78c0000, 0x7982100, 0x0, 0x0)
         /workspace/anago-v1.16.0-beta.1.787+48ca054daba9e6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/dockershim/docker_sandbox.go:335 +0x1c3
 k8s.io/kubernetes/pkg/kubelet/dockershim.(*dockerService).getIPs(0xc00029b600, 0xc000b66cc0, 0x40, 0xc000c25cb0, 0x30bd171a, 0xed508364b, 0x0)
         /workspace/anago-v1.16.0-beta.1.787+48ca054daba9e6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/dockershim/docker_sandbox.go:373 +0xe3
 k8s.io/kubernetes/pkg/kubelet/dockershim.(*dockerService).PodSandboxStatus(0xc00029b600, 0x4ad8b20, 0xc000c25c80, 0xc000cde1c0, 0xc00029b600, 0xc000c25c80, 0xc0005f5bd0)
         /workspace/anago-v1.16.0-beta.1.787+48ca054daba9e6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/dockershim/docker_sandbox.go:439 +0x133
 k8s.io/kubernetes/vendor/k8s.io/cri-api/pkg/apis/runtime/v1alpha2._RuntimeService_PodSandboxStatus_Handler(0x42c4e00, 0xc00029b600, 0x4ad8b20, 0xc000c25c80, 0xc000c126c0, 0x0, 0x4ad8b20, 0xc000c25c80, 0xc000cb2d20, 0x42)
         /workspace/anago-v1.16.0-beta.1.787+48ca054daba9e6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.pb.go:7663 +0x23e
 k8s.io/kubernetes/vendor/google.golang.org/grpc.(*Server).processUnaryRPC(0xc000a4f760, 0x4b45280, 0xc000b02d80, 0xc000847c00, 0xc000a61b00, 0x78c97c0, 0x0, 0x0, 0x0)
         /workspace/anago-v1.16.0-beta.1.787+48ca054daba9e6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/server.go:995 +0x466
 k8s.io/kubernetes/vendor/google.golang.org/grpc.(*Server).handleStream(0xc000a4f760, 0x4b45280, 0xc000b02d80, 0xc000847c00, 0x0)
         /workspace/anago-v1.16.0-beta.1.787+48ca054daba9e6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/server.go:1275 +0xda6
 k8s.io/kubernetes/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc000a8e9c0, 0xc000a4f760, 0x4b45280, 0xc000b02d80, 0xc000847c00)
         /workspace/anago-v1.16.0-beta.1.787+48ca054daba9e6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/server.go:710 +0x9f
 created by k8s.io/kubernetes/vendor/google.golang.org/grpc.(*Server).serveStreams.func1
         /workspace/anago-v1.16.0-beta.1.787+48ca054daba9e6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/server.go:708 +0xa1
```

Fixes: dba434c4ba ("kubenet for ipv6 dualstack")
Signed-off-by: André Martins <aanm90@gmail.com>
2019-09-10 21:06:19 +02:00
zhuangqh
057caf7fcf kubelet: refactor server containerLogs test to table driven test
Signed-off-by: zhuangqh <zhuangqhc@gmail.com>
2019-09-09 10:04:45 +08:00
Ted Yu
253797acab Avoid conflicting log message when AddPodToVolume encounters error 2019-09-05 09:38:56 +08:00
Paul Fisher
d32aa6af1d Add comment for testing 100+ CPU usage 2019-09-04 11:49:15 -07:00
Carlos de Paula
8cd98fbd60 Bump gonvml module and remove CGO dependency.
Signed-off-by: Carlos de Paula <me@carlosedp.com>
2019-09-04 15:27:57 -03:00
Bruce Ma
f9169d29cb skip recording inputs & outputs in fake script plugin when CNI_COMMAND=VERSION
Signed-off-by: Bruce Ma <brucema19901024@gmail.com>
2019-09-04 22:50:13 +08:00
Paul Fisher
f798cee51e pkg/kubelet: fix uint64 overflow when elapsed UsageCoreNanoSeconds exceeds 18446744073 2019-09-03 15:54:26 -07:00
Kubernetes Prow Robot
542f3c65a0 Merge pull request #78547 from MikeSpreitzer/fix-76699
Make iptables and ipvs modes of kube-proxy MASQUERADE --random-fully if possible
2019-09-03 14:34:58 -07:00
SataQiu
6d6b0be36b fix golint failures of pkg/kubelet 2019-09-02 17:47:08 +08:00
Mike Spreitzer
d86d1defa1 Made IPVS and iptables modes of kube-proxy fully randomize masquerading if possible
Work around Linux kernel bug that sometimes causes multiple flows to
get mapped to the same IP:PORT and consequently some suffer packet
drops.

Also made the same update in kubelet.

Also added cross-pointers between the two bodies of code, in comments.

Some day we should eliminate the duplicate code.  But today is not
that day.
2019-09-01 22:07:30 -04:00
Kubernetes Prow Robot
7d40536c81 Merge pull request #82024 from codenrhoden/mv-hostutil
Move HostUtil to pkg/volume/util/hostutil
2019-08-30 19:21:49 -07:00
Kubernetes Prow Robot
c86da8e2c1 Merge pull request #82048 from cheftako/kas-np4
Add support for konnectivity service to the etcd3 client.
2019-08-30 16:15:28 -07:00
Kubernetes Prow Robot
887edd2273 Merge pull request #82099 from lmdaly/single-numa-node-policy
Topology Manager Policy: single-numa-node
2019-08-30 11:21:26 -07:00
Walter Fender
edbb0fa2fe Add support for konnectivity service to the etcd3 client.
If konnectivity service is enabled, the etcd client will now use it.
This did require moving a few methods to break circular dependencies.

Factored in feedback from lavalamp and wenjiaswe.
2019-08-30 10:31:53 -07:00
Travis Rhoden
935c23f2ad Move HostUtil to pkg/volume/util/hostutil
This patch moves the HostUtil functionality from the util/mount package
to the volume/util/hostutil package.

All `*NewHostUtil*` calls are changed to return concrete types instead
of interfaces.

All callers are changed to use the `*NewHostUtil*` methods instead of
directly instantiating the concrete types.
2019-08-30 10:14:42 -06:00
Kubernetes Prow Robot
9165f7bf56 Merge pull request #82104 from klueska/upstream-fix-cpu-manager-topology-bug
Fix bug in CPUManager with setting topology for policies
2019-08-30 08:00:44 -07:00
Kubernetes Prow Robot
f442b6ef32 Merge pull request #82090 from liggitt/webhook-http2
Use http/1.1 for apiserver->webhook clients
2019-08-30 06:26:54 -07:00
yuxiaobo
065343933d delete extra comma 2019-08-30 16:03:33 +08:00
Louise Daly
8ad1b5ba3b Single-numa-node Topology Manager bug fix
Added one off fix for single-numa-node policy to correctly
reject pod admission on a resource allocation that spans
NUMA nodes

Co-authored-by: Kevin Klues <kklues@nvidia.com>
2019-08-30 07:17:56 +01:00
Louise Daly
f6c085f60e Added Single NUMA Node Policy which ensure resource are
aligned on a single NUMA node

Co-authored-by: Kevin Klues <kklues@nvidia.com>
2019-08-30 07:17:17 +01:00
Kevin Klues
5ed80dadcf Update CanAdmitPodResult() in TopologyManager to take a TopologyHint
Previously it only took a bool, which limited the logic it could perform
to determine if a pod should be admitted or not based on the merged hint
from the policy.
2019-08-30 07:17:17 +01:00
Kubernetes Prow Robot
7d6f8d8f69 Merge pull request #80570 from klueska/upstream-add-topology-manager-to-devicemanager
Add support for Topology Manager to Device Manager
2019-08-29 21:21:44 -07:00
Kubernetes Prow Robot
3ebe6a6a5f Merge pull request #77807 from matthyx/startupProbe
Add startupProbe to health checks
2019-08-29 21:21:30 -07:00
Kubernetes Prow Robot
7da563f0f8 Merge pull request #81573 from irajdeep/irajdeep/change_runningPod_runningContainer_metrics
Convert kubelet metrics(running_pod_count and running_container_count) from non-standard prometheus collectors to standard gauges
2019-08-29 18:08:42 -07:00
Matthias Bertschy
a042a4b0ee startupProbe: make update 2019-08-30 00:42:43 +02:00
Matthias Bertschy
1a08ea5984 startupProbe: Test changes 2019-08-30 00:40:26 +02:00
Matthias Bertschy
323f99ea8c startupProbe: Kubelet changes 2019-08-30 00:40:26 +02:00
Kubernetes Prow Robot
a9e5c4d6e4 Merge pull request #81968 from mtaufen/node-csr-hash
derive node CSR hashes from public keys
2019-08-29 13:31:41 -07:00
Kubernetes Prow Robot
da986c56ab Merge pull request #73944 from xiaoanyunfei/cleanup/rm_unuse_judge
rm unnecessary judgement
2019-08-29 13:30:57 -07:00
Kevin Klues
eb0216e54e Update semantics to set Preferred field in TopologyHint generation
We now only set Preferred to true if resources can be allocated with a
size equal to the minimimum _possible_ mask when all resources are
available.
2019-08-29 14:32:10 -05:00
Kevin Klues
e0e8b3e4fd Update CPUManager topology helpers to accept multiple ids 2019-08-29 13:22:54 -05:00
Rajdeep Das
c02d49d775 Update running_pod_count and running_container_count metric
As already mentioned in this issue https://github.com/kubernetes/kubernetes/issues/79286, some metrics like
"running_pod_count" and "running_container_count" uses non-standard prometheus metrics, this change converts them to be
standard prometheus gauges

Minor refactor in kubelet/pleg/generic.go and added some test for ruuning container and running pod metrics

Fixed issues related to github CI pipeline failure

* Updated bazel for new deps
* Add comment for exported metrics variables,RuuningContainerCount and RunningPodCount
* Specify keys explicitly in Guage metric instantation

Fix go lint errors

Replace "+=1" with "++", as reported by go lint

Set container state as a label for the metrics "running_container_count"

As per the metrics name "running_container_count" it should "ideally" be showing
the number of containers in "running" state , but it was showing all the container count, irrespective of the state it is in.
This commit adds a new label "container_running_state" to the metrics "running_container_count", which doesn't change the base metrics but adds the
option to query the metrics with "container_state" such as "running"/"unknown/...

remove unused methods reported by staticcheck

Remove variables while instantiating gauge(vec) which are default set to nil

Convert kubelet metrics(running_pod_count and running_container_count) to standard gauges and added label to running_container_count metrics.

Currently kubelet metrics(running_pod_count and running_container_count) use non-standard prometheus collectors , this change
converts them to standard prometheus gauges. Also this adds a new label(container_state) to running_container_count which does a breakdown of
containers tracked by kubelet based on the containers' state(running/unknown/created/exited).

Set statbility explicitly for running_pod_count and running_container_count and reformat test

register metrics explicitly in test , so that they don't become no-op
2019-08-29 17:23:04 +02:00
Kevin Klues
dcc9f66311 Add devicemanager tests for TopologyHint consumption 2019-08-29 08:22:50 -05:00
Kevin Klues
cc567afaf0 Consume TopologyHints in the devicemanager 2019-08-29 08:22:50 -05:00
Kevin Klues
a3320f80d9 Add devicemanager tests for TopologyHint generation 2019-08-29 07:45:43 -05:00
Kevin Klues
d3d7a8f5d4 Generate TopologyHints from the devicemanager 2019-08-29 07:45:43 -05:00
Louise Daly
9a118ceac4 Added stub support for Topology Manager to Device Manager
Co-authored-by: Conor Nolan <conor.nolan@intel.com>
Co-authored-by: Sreemanti Ghosh <sreemanti.ghosh@intel.com>
Co-authored-by: Kevin Klues <kklues@nvidia.com>
2019-08-29 07:45:43 -05:00
Kevin Klues
1c1f19c61c Change Topology.NUMANode in device plugin interface to a repeated field 2019-08-29 07:45:43 -05:00
Kubernetes Prow Robot
7d4d17583b Merge pull request #81722 from klueska/upstream-add-socket-awreness-to-topologymanager
Add NUMA Node awareness to the TopologyManager
2019-08-29 05:30:58 -07:00
sunxiaofei03
45d41ed9e5 replace iteration with hashmap in *state_of_world 2019-08-29 19:22:25 +08:00
KEBE
8dc401d141 Fix sync pod log format and a func typo. 2019-08-29 14:39:43 +08:00
Kubernetes Prow Robot
ca5babc1da Merge pull request #81534 from logicalhan/kubelet-migration
migrate kubelet's metrics/probes & metrics endpoint to metrics stability framework
2019-08-28 18:26:45 -07:00
Kevin Klues
ddfd9ac0ca Fix bug in CPUManager with setting topology for policies
Also add a check in the unit tests to avoid regressions
2019-08-28 17:32:25 -05:00
Jordan Liggitt
aef05c8dca Plumb NextProtos to TLS client config, honor http/2 client preference 2019-08-28 16:51:56 -04:00
Kubernetes Prow Robot
c6a506bb8c Merge pull request #78174 from gaorong/oom-event
enrich kubelet system oom event message info
2019-08-28 12:01:13 -07:00
Han Kang
3a50917795 migrate kubelet's metrics/probes & metrics endpoint to metrics stability framework 2019-08-28 11:16:38 -07:00
Kevin Klues
df1b54fc09 Fail fast with TopologyManager on machines with more than 8 NUMA Nodes 2019-08-28 11:04:52 -05:00
Kevin Klues
5660cd3cfb Add NUMA Node awareness to the TopologyManager 2019-08-28 11:04:52 -05:00
Kubernetes Prow Robot
35867b160a Merge pull request #81951 from klueska/upstream-update-cpu-amanger-numa-mapping
Update the CPUManager to include NUMANodeID in its topology information
2019-08-28 08:55:40 -07:00
Kubernetes Prow Robot
879418a714 Merge pull request #81828 from mars1024/bugfix/delete_lo_network
delete lo network when TearDownPod to avoid CNI cache leak
2019-08-28 03:09:11 -07:00