草庐IT

pod lifecycle

程序员札记 2023-03-28

The span from a Pod object's creation to its end is generally referred to as the Pod lifecycle. It mainly covers the following stages:

  • Pod creation
  • Running the init containers
  • Running the main containers
    • Post-start hook (post start) and pre-stop hook (pre stop)
    • Liveness probing (liveness probe) and readiness probing (readiness probe)
  • Pod termination

Over its lifecycle, a Pod can be in one of five phases:

  • Pending: the API Server has created the Pod object, but the Pod has not finished scheduling or is still pulling images
  • Running: the Pod has been scheduled to a node, and all of its containers have been created by the kubelet
  • Succeeded: all containers in the Pod have terminated successfully and will not be restarted
  • Failed: all containers have terminated, and at least one terminated in failure, i.e. returned a non-zero exit status
  • Unknown: the API Server cannot obtain the Pod's state, usually because of a network communication failure
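As a rough mental model of the five phases (a simplification we made up, not the kubelet's actual logic), the phase can be thought of as a function of the containers' states:

```python
# Hypothetical sketch of how a Pod phase could be derived from container
# states. The function name and inputs are our own invention.

def pod_phase(scheduled, container_states, exit_codes):
    """container_states: list of 'waiting' | 'running' | 'terminated'."""
    if not scheduled or 'waiting' in container_states:
        return 'Pending'            # not scheduled yet, or still creating/pulling
    if 'running' in container_states:
        return 'Running'
    # all containers have terminated: success only if every exit code is 0
    if all(code == 0 for code in exit_codes):
        return 'Succeeded'
    return 'Failed'

print(pod_phase(False, [], []))                               # Pending
print(pod_phase(True, ['running', 'running'], []))            # Running
print(pod_phase(True, ['terminated', 'terminated'], [0, 1]))  # Failed
```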

Pod creation and termination

The Pod creation process

  1. A user submits the Pod to be created to the apiserver via kubectl or another API client
  2. The apiserver generates the Pod object, persists it to etcd, and returns a confirmation to the client
  3. The apiserver reflects the changes to the Pod object in etcd; other components use the watch mechanism to track changes on the apiserver
  4. The scheduler notices that a new Pod needs to be created, assigns it a node, and updates the result on the apiserver
  5. The kubelet on that node sees a Pod scheduled to it, calls the container runtime (e.g. Docker) to start the containers, and reports the result back to the apiserver
  6. The apiserver stores the received Pod status in etcd
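The six steps above are all driven by the same list-watch pattern: components never call each other directly, they observe the apiserver. A toy simulation (all names invented, the "apiserver" is just a dict plus an event list) makes the flow concrete:

```python
# Toy simulation of the creation flow: client -> apiserver/etcd ->
# scheduler -> kubelet, all communicating only through shared state.

store = {}    # stands in for etcd
events = []   # stands in for the apiserver watch stream

def create_pod(name):
    """Steps 1-2: client submits, apiserver persists and acknowledges."""
    store[name] = {'node': None, 'status': 'Pending'}
    events.append(('ADDED', name))

def scheduler_loop():
    """Step 4: bind every unscheduled pod to a node."""
    for kind, name in list(events):
        if kind == 'ADDED' and store[name]['node'] is None:
            store[name]['node'] = 'node1'
            events.append(('BOUND', name))

def kubelet_loop():
    """Steps 5-6: start containers for bound pods, report status."""
    for kind, name in list(events):
        if kind == 'BOUND':
            store[name]['status'] = 'Running'

create_pod('mypod')
scheduler_loop()
kubelet_loop()
print(store['mypod'])  # {'node': 'node1', 'status': 'Running'}
```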

The Pod termination process

  1. A user sends a command to the apiserver to delete the Pod object
  2. The Pod object in the apiserver is updated with a deadline beyond which it is considered dead: the grace period, 30s by default
  3. The Pod is marked as being in the terminating state
  4. As soon as the kubelet observes the Pod turning terminating, it starts the pod shutdown process
  5. When the endpoints controller observes the Pod shutting down, it removes the Pod from the endpoints lists of every matching Service resource
  6. If the Pod defines a preStop hook, it is executed synchronously once the Pod is marked terminating
  7. The container processes in the Pod receive the stop signal
  8. When the grace period expires, any processes still running in the Pod receive an immediate kill signal
  9. The kubelet asks the apiserver to set the Pod's grace period to 0, which completes the deletion; at this point the Pod is no longer visible to the user
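Steps 7 and 8 amount to a simple decision: SIGTERM always goes out, and SIGKILL follows only if the process outlives the grace period. A purely illustrative helper (ours, not kubelet code):

```python
# Hedged sketch of the grace-period decision: which signals a container
# process ends up receiving, given how long it needs to shut down.

def signals_sent(exit_after, grace_period=30):
    """exit_after: seconds the process needs to exit after SIGTERM."""
    if exit_after <= grace_period:
        return ['SIGTERM']              # clean shutdown within the window
    return ['SIGTERM', 'SIGKILL']       # forced kill when the window closes

print(signals_sent(5))     # ['SIGTERM']
print(signals_sent(120))   # ['SIGTERM', 'SIGKILL']
```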

Init containers

Init containers run before a Pod's main containers start, mainly to do setup work for them. They have two defining characteristics:

  1. Init containers must run to completion; if one fails, Kubernetes restarts it until it completes successfully
  2. Init containers must run in the order they are defined; each one may start only after the previous one has succeeded
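The two rules above can be sketched as a small simulation (names and structure invented for illustration): containers run strictly in order, and a failing one is retried until it succeeds before the next may start.

```python
# Illustrative simulation of init-container semantics: ordered execution
# with retry-until-success. Not real kubelet code.

def run_init_containers(containers):
    """containers: list of (name, attempts_until_success) tuples.
    Returns the attempt log as (name, attempt_number) pairs."""
    log = []
    for name, needed in containers:
        attempt = 0
        while True:
            attempt += 1
            log.append((name, attempt))
            if attempt >= needed:   # this attempt succeeded
                break               # only now may the next container start
    return log

log = run_init_containers([('test-mysql', 3), ('test-redis', 1)])
print(log)
# [('test-mysql', 1), ('test-mysql', 2), ('test-mysql', 3), ('test-redis', 1)]
```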

Init containers have many use cases; the most common are listed below:

  • Providing utilities or custom code that the main container's image does not include
  • Because init containers run serially to completion before the application containers start, they can be used to delay application startup until its dependencies are satisfied

Next, let's work through a case that simulates the following requirement:

Suppose nginx runs as the main container, but before starting nginx we must be able to reach the servers hosting MySQL and Redis.

To simplify the test, the MySQL and Redis server addresses are fixed in advance.

  1. Create pod-initcontainer.yaml with the following content:
apiVersion: v1
kind: Pod
metadata:
  name: pod-initcontainer
  namespace: dev
  labels:
    user: ayanami
spec:
  containers:
  - name: main-container
    image: nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
  initContainers:
  - name: test-mysql
    image: busybox:1.30
    command: ['sh','-c','until ping 192.168.145.231 -c 1; do echo waiting for mysql...; sleep 2; done']
  - name: test-redis
    image: busybox:1.30
    command: ['sh','-c','until ping 192.168.145.232 -c 1; do echo waiting for redis...; sleep 2; done']

  2. Apply the configuration file
[root@master ~]# vim pod-initcontainer.yaml
[root@master ~]# kubectl create -f pod-initcontainer.yaml 
pod/pod-initcontainer created
[root@master ~]# kubectl get pod pod-initcontainer -n dev
NAME                READY   STATUS     RESTARTS   AGE
pod-initcontainer   0/1     Init:0/2   0          20s

The container stays stuck in the init state.

Now add the IP addresses (note: ens32 is the NIC name on this machine; it may differ on yours, and you can check with ifconfig):

[root@master ~]# ifconfig ens32:1 192.168.145.231 netmask 255.255.255.0 up
[root@master ~]# ifconfig ens32:2 192.168.145.232 netmask 255.255.255.0 up
[root@master ~]# kubectl get pod pod-initcontainer -n dev
NAME                READY   STATUS    RESTARTS   AGE
pod-initcontainer   1/1     Running   0          19m

The Pod is now running.

Hook functions

Hooks let a container react to events in its own lifecycle, running user-specified code when the corresponding moment arrives.

Kubernetes provides two hooks, one after the main container starts and one before it stops:

  • post start: runs right after the container is created; if it fails, the container is restarted
  • pre stop: runs before the container terminates; the container terminates successfully only after the hook completes, and the operation deleting the container is blocked until then

A hook handler can define its action in any of three ways:

  • Exec: run a command inside the container
......
  lifecycle:
    postStart:
      exec:
        command:
        - cat
        - /tmp/healthy
......

  • TCPSocket: attempt to open a socket against the current container
......
  lifecycle:
    postStart:
      tcpSocket:
        port: 8080
......
  • HTTPGet: send an HTTP request to a URL from the current container
......
  lifecycle:
    postStart:
      httpGet:
        path: #URI path
        port:
        host:
        scheme: HTTP  #protocol, HTTP or HTTPS
......

The following demonstrates using hook functions.

Create pod-hook-exec.yaml with the following content:

apiVersion: v1
kind: Pod
metadata:
  name: pod-hook-exec
  namespace: dev
spec:
  containers:
  - name: main-container
    image: nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
    lifecycle:
      postStart:
        exec: #on container start, replace the default nginx index page
          command: ["/bin/sh","-c","echo postStart... > /usr/share/nginx/html/index.html"]
      preStop: #stop the nginx service before the container stops
        exec:
          command: ["/usr/sbin/nginx","-s","quit"]

Apply the configuration file

[root@master ~]# vim pod-hook-exec.yaml
[root@master ~]# kubectl create -f pod-hook-exec.yaml 
pod/pod-hook-exec created
[root@master ~]# kubectl get pod pod-hook-exec -n dev -o wide
NAME            READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
pod-hook-exec   1/1     Running   0          43s   10.244.2.22   node1   <none>           <none>
[root@master ~]# curl 10.244.2.22:80
postStart...

Container probes

Container probes check whether the application instance inside a container is working properly; they are a classic mechanism for keeping a service available. If probing shows that an instance is not in the expected state, Kubernetes "removes" the problem instance so it no longer carries traffic. Kubernetes provides two probes for this:

  • liveness probes: check whether the application instance is currently running normally; if not, Kubernetes restarts the container
  • readiness probes: check whether the application instance can currently accept requests; if not, Kubernetes does not forward traffic to it

In short, the livenessProbe decides whether to restart a container, while the readinessProbe decides whether to forward requests to it.
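That distinction can be pinned down in a few lines (an illustrative mapping, not Kubernetes code):

```python
# Sketch of the consequence of each probe failing: liveness failures
# trigger a restart, readiness failures only affect traffic routing.

def on_probe_failure(probe):
    actions = {
        'liveness': 'restart container',
        'readiness': 'remove pod from service endpoints',
    }
    return actions[probe]

print(on_probe_failure('liveness'))   # restart container
print(on_probe_failure('readiness'))  # remove pod from service endpoints
```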

Both probes currently support three probing methods:

  • Exec: run a command inside the container; an exit code of 0 means the program is healthy, anything else means it is not
......
  livenessProbe:
    exec:
      command:
      - cat
      - /tmp/healthy
......
  • TCPSocket: try to connect to a port of the container; if the connection can be established, the program is considered healthy, otherwise not
......
  livenessProbe:
    tcpSocket:
      port: 8080
......
  • HTTPGet: call a URL of the web application inside the container; a status code between 200 and 399 means the program is healthy, anything else means it is not
......
  livenessProbe:
    httpGet:
      path: #URI path
      port:
      host:
      scheme: HTTP  #protocol, HTTP or HTTPS
......

Below are a few demonstrations using liveness probes as the example:

Method 1: Exec

Create pod-liveness-exec.yaml

apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-exec
  namespace: dev
spec:
  containers:
  - name: main-container
    image: nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      exec:
        command: ["/bin/cat","/tmp/hello.txt"] #run a command that reads a file

Apply the configuration file

[root@master ~]# vim pod-liveness-exec.yaml
[root@master ~]# kubectl create -f pod-liveness-exec.yaml 
pod/pod-liveness-exec created
[root@master ~]# kubectl get pod pod-liveness-exec -n dev 
NAME                READY   STATUS    RESTARTS   AGE
pod-liveness-exec   1/1     Running   1          102s

The Pod has restarted once; check the error details:

[root@master ~]# kubectl describe pod pod-liveness-exec -n dev
  Type     Reason     Age                  From               Message ----     ------     ----                 ----               ------- Normal   Scheduled <unknown>            default-scheduler  Successfully assigned dev/pod-liveness-exec to node1
  Normal   Pulled     49s (x4 over 2m20s)  kubelet, node1     Container image "nginx:1.17.1" already present on machine
  Normal   Created    49s (x4 over 2m20s)  kubelet, node1     Created container main-container
  Normal   Started    49s (x4 over 2m20s)  kubelet, node1     Started container main-container
  Normal   Killing    49s (x3 over 109s)   kubelet, node1     Container main-container failed liveness probe, will be restarted
  Warning  Unhealthy  39s (x10 over 2m9s)  kubelet, node1     Liveness probe failed: /bin/cat: /tmp/hello.txt: No such file or directory

Fix the file contents

apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-exec
  namespace: dev
spec:
  containers:
  - name: main-container
    image: nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      exec:
        command: ["/bin/ls","/tmp/"] #run a command that lists a directory

Apply the configuration file

[root@master ~]# vim pod-liveness-exec.yaml 
[root@master ~]# kubectl delete -f pod-liveness-exec.yaml 
[root@master ~]# kubectl create -f pod-liveness-exec.yaml 
pod/pod-liveness-exec created
[root@master ~]# kubectl get pod pod-liveness-exec -n dev
NAME                READY   STATUS    RESTARTS   AGE
pod-liveness-exec   1/1     Running   0          84s

No restarts this time.

Method 2: TCPSocket

Create pod-liveness-tcpsocket.yaml

apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-tcpsocket
  namespace: dev
spec:
  containers:
  - name: main-container
    image: nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      tcpSocket:
        port: 8080  #try to connect to port 8080

Apply the configuration file

[root@master ~]# vim pod-liveness-tcpsocket.yaml
[root@master ~]# kubectl create -f pod-liveness-tcpsocket.yaml 
pod/pod-liveness-tcpsocket created
[root@master ~]# kubectl get pod pod-liveness-tcpsocket -n dev
NAME                     READY   STATUS    RESTARTS   AGE
pod-liveness-tcpsocket   1/1     Running   1          29s
[root@master ~]# kubectl describe pod pod-liveness-tcpsocket -n dev
Events:
Type     Reason     Age                  From               Message
----     ------     ----                 ----               -------
Normal   Scheduled  <unknown>            default-scheduler  Successfully assigned dev/pod-liveness-tcpsocket to node1
Normal   Pulled     43s (x4 over 2m10s)  kubelet, node1     Container image "nginx:1.17.1" already present on machine
Normal   Created    43s (x4 over 2m10s)  kubelet, node1     Created container main-container
Normal   Started    43s (x4 over 2m10s)  kubelet, node1     Started container main-container
Normal   Killing    43s (x3 over 103s)   kubelet, node1     Container main-container failed liveness probe, will be restarted
Warning  Unhealthy  33s (x10 over 2m3s)  kubelet, node1     Liveness probe failed: dial tcp 10.244.2.25:8080: connect: connection refused

Change the file contents

apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-tcpsocket
  namespace: dev
spec:
  containers:
  - name: main-container
    image: nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      tcpSocket:
        port: 80  #try to connect to port 80

Reapply the configuration file

[root@master ~]# vim pod-liveness-tcpsocket.yaml 
[root@master ~]# kubectl delete -f pod-liveness-tcpsocket.yaml 
pod "pod-liveness-tcpsocket" deleted
[root@master ~]# kubectl create -f pod-liveness-tcpsocket.yaml 
pod/pod-liveness-tcpsocket created
[root@master ~]# kubectl get pod pod-liveness-tcpsocket -n dev
NAME                     READY   STATUS    RESTARTS   AGE
pod-liveness-tcpsocket   1/1     Running   0          18s

This shows everything is fine.

Method 3: HTTPGet

Create pod-liveness-httpget.yaml

apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-httpget
  namespace: dev
spec:
  containers:
  - name: main-container
    image: nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      httpGet: #equivalent to requesting http://127.0.0.1:80/hello
        scheme: HTTP #protocol, HTTP or HTTPS
        port: 80
        path: /hello #URI path

Apply the configuration file

[root@master ~]# vim pod-liveness-httpget.yaml
[root@master ~]# kubectl create -f pod-liveness-httpget.yaml 
pod/pod-liveness-httpget created
[root@master ~]# kubectl get pod pod-liveness-httpget -n dev
NAME                   READY   STATUS    RESTARTS   AGE
pod-liveness-httpget   1/1     Running   1          75s
[root@master ~]# kubectl describe pod pod-liveness-httpget -n dev
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  <unknown>          default-scheduler  Successfully assigned dev/pod-liveness-httpget to node2
  Normal   Pulled     18s (x3 over 74s)  kubelet, node2     Container image "nginx:1.17.1" already present on machine
  Normal   Created    18s (x3 over 74s)  kubelet, node2     Created container main-container
  Normal   Killing    18s (x2 over 48s)  kubelet, node2     Container main-container failed liveness probe, will be restarted
  Normal   Started    17s (x3 over 73s)  kubelet, node2     Started container main-container
  Warning  Unhealthy  8s (x7 over 68s)   kubelet, node2     Liveness probe failed: HTTP probe failed with statuscode: 404

The Pod keeps restarting, and the events show why: the URL was not found (404).

Fix the configuration file

apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-httpget
  namespace: dev
spec:
  containers:
  - name: main-container
    image: nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      httpGet: #equivalent to requesting http://127.0.0.1:80/
        scheme: HTTP #protocol, HTTP or HTTPS
        port: 80
        path: / #URI path

Reapply the configuration file

[root@master ~]# kubectl delete -f pod-liveness-httpget.yaml 
pod "pod-liveness-httpget" deleted
[root@master ~]# kubectl create -f pod-liveness-httpget.yaml 
pod/pod-liveness-httpget created
[root@master ~]# kubectl get pod pod-liveness-httpget -n dev
NAME                   READY   STATUS    RESTARTS   AGE
pod-liveness-httpget   1/1     Running   0          24s
[root@master ~]# kubectl describe pod pod-liveness-httpget -n dev
Events:
  Type    Reason     Age        From               Message
  ----    ------     ----       ----               -------
  Normal  Scheduled  <unknown>  default-scheduler  Successfully assigned dev/pod-liveness-httpget to node2
  Normal  Pulled     27s        kubelet, node2     Container image "nginx:1.17.1" already present on machine
  Normal  Created    27s        kubelet, node2     Created container main-container
  Normal  Started    27s        kubelet, node2     Started container main-container

This shows the configuration is fine.

Restart policy

Once a container probe detects a problem, Kubernetes restarts the container's Pod; this behavior is governed by the Pod's restart policy, of which there are three:

  • Always: restart the container automatically whenever it fails; this is the default
  • OnFailure: restart only when the container terminates with a non-zero exit code
  • Never: never restart the container, regardless of its state

The restart policy is set at the Pod level. The first time a container needs a restart it is restarted immediately; subsequent restarts are delayed by the kubelet, and on repeated failures the delays grow through 10s, 20s, 40s, 80s, 160s and 300s, with 300s being the maximum delay.
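The delay progression above is just doubling from 10s with a 300s cap; a small helper (ours, not kubelet code) makes it explicit:

```python
# Sketch of the restart back-off quoted above: delays double from a base
# of 10s and are capped at 300s.

def restart_delay(n, base=10, cap=300):
    """Delay in seconds before the n-th delayed restart (n starts at 1)."""
    return min(base * 2 ** (n - 1), cap)

print([restart_delay(n) for n in range(1, 8)])
# [10, 20, 40, 80, 160, 300, 300]
```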

Create pod-lifecycle.yaml with the following content:

[root@k8s-master ~]# cat pod-lifecycle.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: pod-lifecycle
  namespace: dev
  labels:
    user: bulut
spec:
  containers:
    - name: nginx-container
      image: nginx:latest
  restartPolicy: Never
[root@k8s-master ~]#
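The effect of each policy on a container exit can be sketched as a tiny decision function (illustrative only; exit code 0 means success):

```python
# Sketch of how the three restart policies react to a container exiting.
# Mirrors the bullet list above, not actual kubelet logic.

def should_restart(policy, exit_code):
    if policy == 'Always':
        return True                 # restart on any failure or exit
    if policy == 'OnFailure':
        return exit_code != 0       # restart only on non-zero exit
    return False                    # Never: leave the container down

print(should_restart('Always', 0))      # True
print(should_restart('OnFailure', 0))   # False
print(should_restart('OnFailure', 1))   # True
print(should_restart('Never', 1))       # False
```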

Other settings

So far we have demonstrated all three probing methods with liveness probes, but a look at the sub-fields of livenessProbe shows a few other settings beyond these three; they are explained here together:

[root@master ~]# kubectl explain pod.spec.containers.livenessProbe
KIND:     Pod
VERSION:  v1

RESOURCE: livenessProbe <Object>

DESCRIPTION:
     Periodic probe of container liveness. Container will be restarted if the
     probe fails. Cannot be updated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

     Probe describes a health check to be performed against a container to
     determine whether it is alive or ready to receive traffic.

FIELDS:
   exec <Object>
     One and only one of the following should be specified. Exec specifies the
     action to take.

   failureThreshold <integer>
     Minimum consecutive failures for the probe to be considered failed after
     having succeeded. Defaults to 3. Minimum value is 1.

   httpGet <Object>
     HTTPGet specifies the http request to perform.

   initialDelaySeconds <integer>
     Number of seconds after the container has started before liveness probes
     are initiated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

   periodSeconds <integer>
     How often (in seconds) to perform the probe. Default to 10 seconds. Minimum
     value is 1.

   successThreshold <integer>
     Minimum consecutive successes for the probe to be considered successful
     after having failed. Defaults to 1. Must be 1 for liveness and startup.
     Minimum value is 1.

   tcpSocket <Object>
     TCPSocket specifies an action involving a TCP port. TCP hooks not yet
     supported

   timeoutSeconds <integer>
     Number of seconds after which the probe times out. Defaults to 1 second.
     Minimum value is 1. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
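Putting these fields together, a liveness probe with the timing knobs spelled out might look like this (the values are illustrative, not recommendations):

```yaml
livenessProbe:
  httpGet:
    path: /
    port: 80
    scheme: HTTP
  initialDelaySeconds: 30   # wait 30s after container start before probing
  periodSeconds: 10         # probe every 10s (the default)
  timeoutSeconds: 1         # each probe attempt times out after 1s
  failureThreshold: 3       # 3 consecutive failures -> container is restarted
  successThreshold: 1       # must be 1 for liveness probes
```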
