请教下没有启用 HPA， deployment 什么情况下会自动将 replica 设置为 0

rabbitz 2022-05-09 15:55:46 +08:00 1933 次点击

这是一个创建于 1252 天前的主题，其中的信息可能已经有所发展或是发生改变。

最近碰到个问题，使用 deployment 运行 nexus3 跑一段时间后 replicas 会被设置为 0 ，也没启用 HPA ，restartPolicy 配置的是 Always ，查了 kube-apiserver 的 audit 日志也不是人为操作的，查看 nexus3 的日志也没有报错。

deployment yaml:

kind: Deployment apiVersion: apps/v1 metadata: name: service-nexus3-deployment namespace: service annotations: deployment.kubernetes.io/revision: '6' spec: replicas: 1 selector: matchLabels: app: service-nexus3 envronment: test template: metadata: creationTimestamp: null labels: app: service-nexus3 envronment: test annotations: kubesphere.io/restartedAt: '2022-02-16T01:11:44.479Z' spec: volumes: - name: service-nexus3-volume persistentVolumeClaim: claimName: service-nexus3-pvc - name: docker-proxy configMap: name: docker-proxy defaultMode: 493 containers: - name: nexus3 # 用的阿里的镜像仓库，删了仓库名 image: 'registry.cn-hangzhou.aliyuncs.com/nexus3-latest' ports: - name: tcp8081 containerPort: 8081 protocol: TCP resources: limits: cpu: '4' memory: 8Gi requests: cpu: 500m memory: 1Gi volumeMounts: - name: service-nexus3-volume mountPath: /data/server/nexus3/ terminationMessagePath: /dev/termination-log terminationMessagePolicy: File imagePullPolicy: Always - name: docker-proxy # 用的阿里的镜像仓库，删了仓库名 image: 'registry.cn-hangzhou.aliyuncs.com/nginx-latest' ports: - name: tcp80 containerPort: 80 protocol: TCP resources: limits: cpu: '2' memory: 4Gi requests: cpu: 500m memory: 1Gi volumeMounts: - name: docker-proxy mountPath: /usr/local/nginx/conf/vhosts/ terminationMessagePath: /dev/termination-log terminationMessagePolicy: File imagePullPolicy: Always restartPolicy: Always terminationGracePeriodSeconds: 30 dnsPolicy: ClusterFirst nodeSelector: disktype: raid1 securityContext: {} imagePullSecrets: - name: registrysecret schedulerName: default-scheduler strategy: type: RollingUpdate rollingUpdate: maxUnavailable: 0 maxSurge: 1 revisionHistoryLimit: 10 progressDeadlineSeconds: 600

HPA:

# kubectl get hpa -A No resources found

deployment describe：

... ... Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal ScalingReplicaSet 34m (x2 over 38h) deployment-controller Scaled down replica set service-nexus3-deployment-57995fcd76 to 0

kube controller 日志：

# kubectl logs kube-controller-manager-k8s-130 -n kube-system|grep nexus I0509 10:49:11.687356 1 event.go:281] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"service", Name:"service-nexus3-deployment", UID:"e0c4abba-bbe5-4c19-9853-de63ee571124", APIVersion:"apps/v1", ResourceVersion:"126342143", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled down replica set service-nexus3-deployment-57995fcd76 to 0 I0509 10:49:11.701642 1 event.go:281] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"service", Name:"service-nexus3-deployment-57995fcd76", UID:"9f96fdf1-1e20-4c83-ad18-1b3640d52493", APIVersion:"apps/v1", ResourceVersion:"126342151", FieldPath:""}): type: 'Normal' reason: 'SuccessfulDelete' Deleted pod: service-nexus3-deployment-57995fcd76-t6bhx

kube-apiserver audit 相关日志: