Common Kubernetes Errors (ongoing updates)

1. The k8s node1 node is in NotReady state

Oct 29 17:56:17 k8s-node1 kubelet[48455]: E1029 17:56:17.983157 48455 kubelet_node_status.go:94] Unable to register node "k8s-node3" with API server: nodes "k8s-node3" is forbidden: node "k8s-node1" is not allowed to modify node "k8s-node3"

Oct 29 17:56:18 k8s-node1 kubelet[48455]: E1029 17:56:18.307200 48455 reflector.go:123] object-"default"/"default-token-8njxz": Failed to list *v1.Secret: secrets "default-token-8njxz" is forbidden: User "system:node:k8s-node1" cannot list resource "secrets" in API group "" in the namespace "default": no relationship found between node "k8s-node1" and this object

Judging by the log, node1's kubelet is trying to register as "k8s-node3", which usually means the kubelet configuration was copied from another node without being updated. Redeploy the node1 node; after redeploying:

On the master1 node, test by reapplying the network configuration (or you can first try just reapplying the network, without redeploying node1):

kubectl apply -f kube-flannel.yaml

kubectl apply -f apiserver-to-kubelet-rbac.yaml
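Afterwards, you can verify on master1 that the node registers under its own name and becomes Ready (a quick check; the flannel label and namespace below depend on your kube-flannel.yaml and are an assumption here):

kubectl get nodes -o wide
kubectl -n kube-system get pods -l app=flannel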

 

2. kubectl and helm command auto-completion

kubectl auto-completion:

yum install -y bash-completion

locate bash_completion

Add the following to /etc/profile:

source /usr/share/bash-completion/bash_completion

source <(kubectl completion bash)

helm auto-completion:

source <(helm completion bash)
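To make both completions persist for every login shell, a minimal sketch appending them to /etc/profile:

cat >> /etc/profile <<'EOF'
source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)
source <(helm completion bash)
EOF
source /etc/profile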

 

3. Adding a new Node to the cluster

Deploy kubelet and kube-proxy: copy the config files from another Node (modifying them for the new node), copy the systemd unit files, and copy the SSL certificates (only ca.pem, kube-proxy-key.pem, and kube-proxy.pem are needed); see the sketch after the directory setup below.

Create the flannel working directory and the config file directory:

[root@node1 ~]# mkdir /opt/cni/bin /etc/cni/net.d -p

# /opt/cni/bin holds the CNI plugin binaries

# /etc/cni/net.d/XX.conf holds the configuration file for a given network
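A rough sketch of those copy steps, assuming the same layout as the existing nodes (/opt/kubernetes for binaries and configs, /opt/kubernetes/ssl for certificates; node3 and the paths are placeholders for your environment):

# On an existing worker: copy binaries, configs, and unit files to the new node
scp -r /opt/kubernetes root@node3:/opt/
scp /usr/lib/systemd/system/{kubelet,kube-proxy}.service root@node3:/usr/lib/systemd/system/
# Only these three certs are needed on a worker
scp /opt/kubernetes/ssl/{ca.pem,kube-proxy.pem,kube-proxy-key.pem} root@node3:/opt/kubernetes/ssl/

# On the new node: edit the configs to use its own hostname and IP,
# remove any copied kubelet certs so it requests fresh ones, then start
rm -f /opt/kubernetes/ssl/kubelet*
systemctl daemon-reload
systemctl enable --now kubelet kube-proxy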

 

4. An etcd node fails to start

Check the etcd cluster status:

[root@master1 cfg]# /opt/etcd/bin/etcdctl --ca-file=/opt/etcd/ssl/ca.pem --cert-file=/opt/etcd/ssl/server.pem --key-file=/opt/etcd/ssl/server-key.pem --endpoints="https://192.168.31.63:2379,https://192.168.31.65:2379,https://192.168.31.66:2379" cluster-health

failed to check the health of member 72130f86e474b7bb on https://192.168.31.66:2379: Get https://192.168.31.66:2379/health: dial tcp 192.168.31.66:2379: connect: connection refused

member 72130f86e474b7bb is unreachable: [https://192.168.31.66:2379] are all unreachable

member b46624837acedac9 is healthy: got healthy result from https://192.168.31.63:2379

member fd9073b56d4868cb is healthy: got healthy result from https://192.168.31.65:2379

cluster is degraded

On the affected node, look at the specific etcd error:

-- Subject: Unit etcd.service has begun start-up

-- Defined-By: systemd

-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- Unit etcd.service has begun starting up.

Jul 31 22:01:43 node2 etcd[39920]: recognized environment variable ETCD_NAME, but unused: shadowed by corresponding flag

Jul 31 22:01:43 node2 etcd[39920]: recognized environment variable ETCD_DATA_DIR, but unused: shadowed by corresponding flag

Jul 31 22:01:43 node2 etcd[39920]: recognized environment variable ETCD_LISTEN_PEER_URLS, but unused: shadowed by corresponding flag

Jul 31 22:01:43 node2 etcd[39920]: recognized environment variable ETCD_LISTEN_CLIENT_URLS, but unused: shadowed by corresponding flag

Jul 31 22:01:43 node2 etcd[39920]: recognized environment variable ETCD_INITIAL_ADVERTISE_PEER_URLS, but unused: shadowed by corresponding flag

Jul 31 22:01:43 node2 etcd[39920]: recognized environment variable ETCD_ADVERTISE_CLIENT_URLS, but unused: shadowed by corresponding flag

Jul 31 22:01:43 node2 etcd[39920]: recognized environment variable ETCD_INITIAL_CLUSTER, but unused: shadowed by corresponding flag

Jul 31 22:01:43 node2 etcd[39920]: recognized environment variable ETCD_INITIAL_CLUSTER_TOKEN, but unused: shadowed by corresponding flag

Jul 31 22:01:43 node2 etcd[39920]: recognized environment variable ETCD_INITIAL_CLUSTER_STATE, but unused: shadowed by corresponding flag

Jul 31 22:01:43 node2 etcd[39920]: etcd Version: 3.3.13

Jul 31 22:01:43 node2 etcd[39920]: Git SHA: 98d3084

Jul 31 22:01:43 node2 etcd[39920]: Go Version: go1.10.8

Jul 31 22:01:43 node2 etcd[39920]: Go OS/Arch: linux/amd64

Jul 31 22:01:43 node2 etcd[39920]: setting maximum number of CPUs to 4, total number of available CPUs is 4

Jul 31 22:01:43 node2 etcd[39920]: the server is already initialized as member before, starting as etcd member…

Jul 31 22:01:43 node2 etcd[39920]: peerTLS: cert = /opt/etcd/ssl/server.pem, key = /opt/etcd/ssl/server-key.pem, ca = , trusted-ca = /opt/etcd/ssl/ca.pem, client-cert-auth = false, crl-file =

Jul 31 22:01:43 node2 etcd[39920]: listening for peers on https://192.168.31.66:2380

Jul 31 22:01:43 node2 etcd[39920]: The scheme of client url http://127.0.0.1:2379 is HTTP while peer key/cert files are presented. Ignored key/cert files.

Jul 31 22:01:43 node2 etcd[39920]: listening for client requests on 127.0.0.1:2379

Jul 31 22:01:43 node2 etcd[39920]: listening for client requests on 192.168.31.66:2379

Jul 31 22:01:43 node2 etcd[39920]: recovered store from snapshot at index 1000012

Jul 31 22:01:43 node2 etcd[39920]: restore compact to 916650

Jul 31 22:01:43 node2 etcd[39920]: name = etcd-3

Jul 31 22:01:43 node2 etcd[39920]: data dir = /var/lib/etcd/default.etcd

Jul 31 22:01:43 node2 etcd[39920]: member dir = /var/lib/etcd/default.etcd/member

Jul 31 22:01:43 node2 etcd[39920]: heartbeat = 100ms

Jul 31 22:01:43 node2 etcd[39920]: election = 1000ms

Jul 31 22:01:43 node2 etcd[39920]: snapshot count = 100000

Jul 31 22:01:43 node2 etcd[39920]: advertise client URLs = https://192.168.31.66:2379

Jul 31 22:01:43 node2 etcd[39920]: read wal error (walpb: crc mismatch) and cannot be repaired

Jul 31 22:01:43 node2 systemd[1]: etcd.service: main process exited, code=exited, status=1/FAILURE

Jul 31 22:01:43 node2 systemd[1]: Failed to start Etcd Server.

-- Subject: Unit etcd.service has failed

-- Defined-By: systemd

-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- Unit etcd.service has failed.

-- The result is failed.

Next, I deleted the etcd data on that node:

rm -rf /var/lib/etcd/default.etcd/member/snap/*

rm -rf /var/lib/etcd/default.etcd/member/wal/*

Restarted that etcd node; it still failed:

[root@node2 wal]# journalctl -xe

Jul 31 22:05:30 node2 etcd[42059]: recognized environment variable ETCD_DATA_DIR, but unused: shadowed by corresponding flag

Jul 31 22:05:30 node2 etcd[42059]: recognized environment variable ETCD_LISTEN_PEER_URLS, but unused: shadowed by corresponding flag

Jul 31 22:05:30 node2 etcd[42059]: recognized environment variable ETCD_LISTEN_CLIENT_URLS, but unused: shadowed by corresponding flag

Jul 31 22:05:30 node2 etcd[42059]: recognized environment variable ETCD_INITIAL_ADVERTISE_PEER_URLS, but unused: shadowed by corresponding flag

Jul 31 22:05:30 node2 etcd[42059]: recognized environment variable ETCD_ADVERTISE_CLIENT_URLS, but unused: shadowed by corresponding flag

Jul 31 22:05:30 node2 etcd[42059]: recognized environment variable ETCD_INITIAL_CLUSTER, but unused: shadowed by corresponding flag

Jul 31 22:05:30 node2 etcd[42059]: recognized environment variable ETCD_INITIAL_CLUSTER_TOKEN, but unused: shadowed by corresponding flag

Jul 31 22:05:30 node2 etcd[42059]: recognized environment variable ETCD_INITIAL_CLUSTER_STATE, but unused: shadowed by corresponding flag

Jul 31 22:05:30 node2 etcd[42059]: etcd Version: 3.3.13

Jul 31 22:05:30 node2 etcd[42059]: Git SHA: 98d3084

Jul 31 22:05:30 node2 etcd[42059]: Go Version: go1.10.8

Jul 31 22:05:30 node2 etcd[42059]: Go OS/Arch: linux/amd64

Jul 31 22:05:30 node2 etcd[42059]: setting maximum number of CPUs to 4, total number of available CPUs is 4

Jul 31 22:05:30 node2 etcd[42059]: the server is already initialized as member before, starting as etcd member…

Jul 31 22:05:30 node2 etcd[42059]: peerTLS: cert = /opt/etcd/ssl/server.pem, key = /opt/etcd/ssl/server-key.pem, ca = , trusted-ca = /opt/etcd/ssl/ca.pem, client-cert-auth = false, crl-file =

Jul 31 22:05:30 node2 etcd[42059]: listening for peers on https://192.168.31.66:2380

Jul 31 22:05:30 node2 etcd[42059]: The scheme of client url http://127.0.0.1:2379 is HTTP while peer key/cert files are presented. Ignored key/cert files.

Jul 31 22:05:30 node2 etcd[42059]: listening for client requests on 127.0.0.1:2379

Jul 31 22:05:30 node2 etcd[42059]: listening for client requests on 192.168.31.66:2379

Jul 31 22:05:30 node2 etcd[42059]: member 72130f86e474b7bb has already been bootstrapped

Jul 31 22:05:30 node2 systemd[1]: etcd.service: main process exited, code=exited, status=1/FAILURE

Jul 31 22:05:30 node2 systemd[1]: Failed to start Etcd Server.

-- Subject: Unit etcd.service has failed

-- Defined-By: systemd

-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- Unit etcd.service has failed.

-- The result is failed.

After searching around online:

The service still failed to start, and the telling message is: member 72130f86e474b7bb has already been bootstrapped

The explanation I found:

One of the member was bootstrapped via discovery service. You must remove the previous data-dir to clean up the member information. Or the member will ignore the new configuration and start with the old configuration. That is why you see the mismatch.

In other words: a member bootstrapped via the discovery service keeps its identity in the data-dir; unless the previous data-dir is removed completely, the member ignores the new configuration and starts with the old one, which is why the mismatch appears.

Seeing this, the problem is clear: startup fails because the information recorded in the data-dir (/var/lib/etcd/default.etcd) does not match what the etcd startup flags declare.

Resolution

Option 1:

We can resolve this class of error by adjusting the startup flags. Since the information is already recorded in the data-dir, there is no need to add redundant configuration to the startup options.

Specifically, change the --initial-cluster-state flag:

[root@node2 member]# vim /usr/lib/systemd/system/etcd.service

Change --initial-cluster-state=new to --initial-cluster-state=existing, then restart and it will come up fine.

After the restart succeeds, change it back.
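Concretely, the steps might look like this (paths follow this post's layout; check the exact flag line in your own unit file before editing):

sed -i 's/--initial-cluster-state=new/--initial-cluster-state=existing/' /usr/lib/systemd/system/etcd.service
systemctl daemon-reload
systemctl restart etcd
# once the member has rejoined, change the flag back and daemon-reload again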

Option 2:

 

Copy the contents of another node's data-dir, force-start a single-member cluster from it with --force-new-cluster, and then restore the cluster by adding the other nodes back as new members (a rough sketch below).
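A heavily simplified sketch of that recovery path (the member name, IPs, and paths are placeholders; for a real cluster, follow the etcd disaster-recovery documentation closely):

# On a healthy member: seed the broken node with a copy of the data dir
rsync -a /var/lib/etcd/default.etcd/ root@192.168.31.66:/var/lib/etcd/default.etcd/

# On the seeded node: force a new single-member cluster from that data
/opt/etcd/bin/etcd --name etcd-3 --data-dir /var/lib/etcd/default.etcd --force-new-cluster

# Then add the remaining members back one at a time
/opt/etcd/bin/etcdctl member add etcd-1 https://192.168.31.63:2380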

 

5. A YAML file fails to create Pods

Error:

[root@k8sm90 demo]# kubectl create -f tomcat-deployment.yaml

error: unable to recognize "tomcat-deployment.yaml": no matches for kind "Deployment" in version "extensions/v1beta1"

Fix:

Edit the YAML file and change the apiVersion field to apps/v1.

For example:

[root@k8sm90 demo]# cat tomcat-deployment.yaml

apiVersion: apps/v1

kind: Deployment

This is because my k8s version is v1.18.5; by this version Deployment is no longer served from extensions/v1beta1 and has moved to apps/v1.

DaemonSet, Deployment, StatefulSet, and ReplicaSet resources will no longer be served from extensions/v1beta1, apps/v1beta1, or apps/v1beta2 by default in v1.16.

If it still errors after the change:

[root@host131 prometheus]# kubectl create -f prometheus-deployment.yaml

error: error validating "prometheus-deployment.yaml": error validating data: ValidationError(Deployment.spec): missing required field "selector" in io.k8s.api.apps.v1.DeploymentSpec; if you choose to ignore these errors, turn validation off with --validate=false

Fix: the cause is equally clear: Deployment.spec is missing the required selector field. Add one:

selector:
  matchLabels:
    k8s-app: influxdb

# Note: the labels in Deployment.spec.selector must match the labels in the template. It is set to k8s-app: influxdb here precisely to stay consistent with the label already present in the template; with that in place the Deployment is created successfully. A full minimal example follows.
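Putting it together, a minimal apps/v1 Deployment with matching labels might look like this (the name and image are illustrative):

kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: influxdb
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: influxdb   # must match the template labels below
  template:
    metadata:
      labels:
        k8s-app: influxdb
    spec:
      containers:
      - name: influxdb
        image: influxdb:1.8
EOF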


6. Problems signing k8s certificates

Error:

{"code":5100,"message":"Invalid policy: no key usage available"}

1. It may be because a single CA certificate is being reused. Issuing certificates for etcd, the apiserver, and the kubelet requires three separately created CA certificates; they must not be reused!

2. It may be because the "CN": "XX" name in the CA does not match the "CN": "XX" name in the certificate signing request file. The names must be kept consistent!


3. It may be because the profile name under "profiles" in the CA's ca-config.json does not match the profile name passed to cfssl when signing the certificate. The names must be kept consistent!

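For reference, a minimal sketch of how the names must line up when signing with cfssl (the "kubernetes" profile name and file names are illustrative):

# ca-config.json must define the profile, including key usages, e.g.:
# "profiles": { "kubernetes": { "usages": ["signing", "key encipherment", "server auth", "client auth"], "expiry": "87600h" } }

# the -profile flag must use exactly that profile name
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes server-csr.json | cfssljson -bare server

An empty or missing "usages" list in the matched profile is one way to trigger the "no key usage available" error.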

7. Running kubectl run nginx --image=nginx --replicas=2 --port=80 returns:

Flag --replicas has been deprecated, has no effect and will be removed in the future.

and only one nginx container instance is created, with no replicas.

Cause:

Since K8s v1.18.0, --replicas is deprecated for kubectl run; creating pods through a Deployment is recommended instead (see the sketch below).
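A replacement that avoids the deprecated flag is to create a Deployment explicitly, then scale and expose it (a sketch):

kubectl create deployment nginx --image=nginx
kubectl scale deployment nginx --replicas=2
kubectl expose deployment nginx --port=80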

 

8. Creating an Ingress rule fails with: no matches for kind "Ingress" in version "networking.k8s.io/v1"

Cause:

networking.k8s.io/v1beta1 = Kubernetes 1.14 to 1.18

networking.k8s.io/v1 = Kubernetes 1.19+


Since my k8s cluster's API version is 1.18.6, Ingress rule YAML files can only use apiVersion: networking.k8s.io/v1beta1.

To write apiVersion: networking.k8s.io/v1, the API version must be 1.19 or later.
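To check which Ingress API versions your cluster actually serves before writing the YAML:

kubectl api-versions | grep networking.k8s.io
# only networking.k8s.io/v1beta1 listed -> use the first manifest below
# networking.k8s.io/v1 listed          -> the second manifest works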

 

Below are two Ingress rule YAML files from the official Kubernetes docs:

 

For API versions 1.14 to 1.18:

[root@k8sm90 demo]#cat ingress-rule.yaml

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: test-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - http:
      paths:
      - path: /testpath
        pathType: Prefix
        backend:
          serviceName: test
          servicePort: 80

 

For API versions 1.19+:

[root@k8sm90 demo]#cat ingress-rule.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: minimal-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - http:
      paths:
      - path: /testpath
        pathType: Prefix
        backend:
          service:
            name: test
            port:
              number: 80

# You can also list every API served by the current k8s cluster with kubectl get apiservices

[root@master-1 templates]# kubectl get apiservices
NAME                                   SERVICE   AVAILABLE   AGE
v1.                                    Local     True        6d23h
v1.admissionregistration.k8s.io        Local     True        6d23h
v1.apiextensions.k8s.io                Local     True        6d23h
v1.apps                                Local     True        6d23h
v1.authentication.k8s.io               Local     True        6d23h
v1.authorization.k8s.io                Local     True        6d23h
v1.autoscaling                         Local     True        6d23h
v1.batch                               Local     True        6d23h
v1.coordination.k8s.io                 Local     True        6d23h
v1.networking.k8s.io                   Local     True        6d23h
v1.rbac.authorization.k8s.io           Local     True        6d23h
v1.scheduling.k8s.io                   Local     True        6d23h
v1.storage.k8s.io                      Local     True        6d23h
v1beta1.admissionregistration.k8s.io   Local     True        6d23h
v1beta1.apiextensions.k8s.io           Local     True        6d23h
v1beta1.authentication.k8s.io          Local     True        6d23h
v1beta1.authorization.k8s.io           Local     True        6d23h
v1beta1.batch                          Local     True        6d23h
v1beta1.certificates.k8s.io            Local     True        6d23h
v1beta1.coordination.k8s.io            Local     True        6d23h
v1beta1.discovery.k8s.io               Local     True        6d23h
v1beta1.events.k8s.io                  Local     True        6d23h
v1beta1.extensions                     Local     True        6d23h
v1beta1.networking.k8s.io              Local     True        6d23h
v1beta1.node.k8s.io                    Local     True        6d23h
v1beta1.policy                         Local     True        6d23h
v1beta1.rbac.authorization.k8s.io      Local     True        6d23h
v1beta1.scheduling.k8s.io              Local     True        6d23h
v1beta1.storage.k8s.io                 Local     True        6d23h
v2beta1.autoscaling                    Local     True        6d23h
v2beta2.autoscaling                    Local     True        6d23h

 

