k8s-nfs

Setting up the NFS server

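yum install rpcbind nfs-utils # install on every machine: volumes are mounted by the host, so each node needs the NFS client
systemctl start rpcbind # start the service
systemctl start nfs-server # start the service
systemctl enable rpcbind # start at boot
systemctl enable nfs-server # start at boot
mkdir -p /share
vim /etc/exports
# Add the line below. rw grants read-write access; no_root_squash keeps the client's root user as root on the share instead of mapping it to the anonymous user.
# The wildcard form 192.168.4.* is not supported and leads to "access denied" errors; to allow a whole subnet, use CIDR notation such as 192.168.4.0/24.
/share 192.168.4.1(rw,no_root_squash)
# Reload the export configuration
exportfs -rv
# Test the mount from a client (the mount point must exist)
mkdir -p /root/testshare
mount -t nfs 192.168.4.2:/share /root/testshare
# Unmount
umount /root/testshare
# Test from a Mac: in Finder press Command+K and enter the following address
nfs://192.168.4.2/share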
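
The comment on the exports line warns that 192.168.4.* gets rejected. As a rough sketch, a small check like the one below could flag that kind of client spec before it goes into /etc/exports. The function name check_export_client is made up for this example, and the rule it applies (accept hostnames, CIDR networks and plain IPs, reject IP patterns with * or ?) is my assumption based on the behaviour described above.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the client spec looks usable in /etc/exports,
# 1 if it is an IP-style pattern containing * or ? (the case reported as "access denied").
check_export_client() {
    case "$1" in
        *[!0-9.*?]*) return 0 ;;  # has letters, '/', etc.: hostname pattern or CIDR network
        *[*?]*)      return 1 ;;  # only digits, dots and wildcards: IP wildcard
        *)           return 0 ;;  # plain IP address
    esac
}

for client in '192.168.4.*' '192.168.4.0/24' '192.168.4.1'; do
    if check_export_client "$client"; then
        echo "ok:    $client"
    else
        echo "avoid: $client"
    fi
done
```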

Troubleshooting commands

showmount -e localhost # list the directories exported by this host
showmount -a localhost # list the clients connected to this host's exports
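
If you want to script the first check, something like the following could work. It is only a sketch: is_exported is a hypothetical name, and it assumes showmount -e prints one header line followed by one "directory clients" line per export.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if directory $1 is listed in `showmount -e` output read from stdin.
is_exported() {
    # Skip the "Export list for ..." header, then compare the first column.
    awk -v dir="$1" 'NR > 1 && $1 == dir { found = 1 } END { exit !found }'
}

# Intended use on the NFS server (not run here):
#   showmount -e localhost | is_exported /share && echo "/share is exported"
```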

Configuring the NFS storage class in k8s with nfs-client-provisioner

Old version (does not support Kubernetes 1.20 and later)

helm repo add stable http://mirror.azure.cn/kubernetes/charts
helm install my-release --set nfs.server=192.168.4.2 --set nfs.path=/share stable/nfs-client-provisioner
helm delete my-release # uninstall

New version

helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
--set nfs.server=192.168.4.2 \
--set nfs.path=/share
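
Which of the two charts to install depends on the cluster version (1.20 is the cut-off mentioned above). A minimal sketch of that decision, assuming you pass the server version in by hand as MAJOR.MINOR; pick_provisioner_chart is a hypothetical name, not part of helm or kubectl.

```shell
#!/bin/sh
# Hypothetical helper: print the chart to install for a Kubernetes version such as "1.19" or "v1.20.4".
pick_provisioner_chart() {
    version=${1#v}              # tolerate a leading "v"
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%[!0-9]*}      # drop suffixes such as ".4" or "+"
    if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 20 ]; }; then
        echo "nfs-subdir-external-provisioner/nfs-subdir-external-provisioner"
    else
        echo "stable/nfs-client-provisioner"
    fi
}

pick_provisioner_chart 1.19
pick_provisioner_chart v1.20.4
```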

Usage

Create a PVC. Create a file named nginx-pvc-nfs.yaml with the following content:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-pvc
spec:
  storageClassName: nfs-client
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi

Run kubectl apply -f nginx-pvc-nfs.yaml, then check that the PVC was created successfully (for example with kubectl get pvc nginx-pvc).
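
Checking the PVC by hand works, but in a script you probably want to wait until it is actually bound. The following is a sketch of a generic polling loop; wait_for_output is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary values.

```shell
#!/bin/sh
# Hypothetical helper: rerun a command until its output equals the expected value.
# Usage: wait_for_output EXPECTED ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for_output() {
    expected=$1
    attempts=$2
    delay=$3
    shift 3
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if [ "$("$@" 2>/dev/null)" = "$expected" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Intended use against a cluster (not run here):
#   wait_for_output Bound 30 2 kubectl get pvc nginx-pvc -o jsonpath='{.status.phase}'
```

The command is passed as separate arguments rather than as a string, so no eval is needed and quoting stays intact.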

Create the deployment. Create a file named nginx-deployment.yaml with the following content:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      volumes:
        - name: nginx-data
          persistentVolumeClaim:
            claimName: nginx-pvc
      containers:
        - name: nginx
          image: nginx:alpine
          ports:
            - containerPort: 80
          volumeMounts:
            - mountPath: "/usr/share/nginx/html"
              name: nginx-data

Run kubectl apply -f nginx-deployment.yaml

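Summary

The nfs-client-provisioner container is only involved when a PVC is created: it connects to the NFS server and sets up the volume. Once the PVC has been created, the volume no longer depends on nfs-client-provisioner, whether or not it is still running.

The NFS mount is not performed inside the container. It is done by the host, so every host needs the mount dependencies installed (nfs-utils).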
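
Since the mount happens on the host, a quick preflight check on each node can save some debugging later. A minimal sketch follows; has_nfs_helper is a hypothetical name, and the default path /sbin/mount.nfs is an assumption that may differ on your distribution.

```shell
#!/bin/sh
# Hypothetical preflight check: is the mount.nfs helper present and executable on this host?
# The default path is an assumption; pass another path as $1 if your system differs.
has_nfs_helper() {
    [ -x "${1:-/sbin/mount.nfs}" ]
}

has_nfs_helper || echo "mount.nfs not found on $(uname -n): install nfs-utils"
```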

Checking the NFS mounts

Run df -h on the node where the pod is running; the output looks similar to the following:

10.25.207.176:/mnt/data/kubesphere-loki-system-loki-storage-pvc-dc2431fb-5352-4265-8a9d-0a11b69d8588  9.8G  1.8G  7.5G   19% /var/lib/kubelet/pods/4d276910-7529-4355-8d92-6cfa1da68825/volumes/kubernetes.io~nfs/pvc-dc2431fb-5352-4265-8a9d-0a11b69d8588
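
On a busy node the df -h output is long, so it can help to filter for NFS entries only. The sketch below reads /proc/mounts-style lines (device, mount point, filesystem type, ...) from stdin; list_nfs_mounts is a hypothetical name, and matching on the types nfs and nfs4 is my assumption about what the kernel reports.

```shell
#!/bin/sh
# Hypothetical helper: print "source mountpoint" for every NFS entry in /proc/mounts-style input.
list_nfs_mounts() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $1, $2 }'
}

# Intended use on a Linux node (not run here):
#   list_nfs_mounts < /proc/mounts
```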

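Deletion

eip-nfs-nfs: lives in kube-system. It is a resource created by a Kubernetes StorageClass and is used to manage NFS storage.

local-path-provisioner: lives in kube-system. It is a lightweight storage provisioner that uses the nodes' local storage in Kubernetes, typically in small or single-node clusters.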

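Problems

  1. The nfs-client-provisioner container reports: unexpected error getting claim reference: selfLink was empty, can't make reference

    Cause: Kubernetes stopped populating selfLink by default in version 1.20 (see the Kubernetes change "Deprecate and remove SelfLink").

    Fix: use the new version of the provisioner (nfs-subdir-external-provisioner).

  2. Image pull error: Back-off pulling image "k8s.gcr.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2"

    Fix: pull the image through a Docker proxy mirror; an online tool can generate the mirror address.

  3. When nginx is deployed with the mounted volume on multiple nodes, one of the nodes reports the following error: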

    Warning FailedMount 2 minutes ago
    (2 times in the last 4 minutes) kubelet Unable to attach or mount volumes: unmounted volumes=[nginx-data], unattached volumes=[nginx-data kube-api-access-vq8pp]: timed out waiting for the condition
    Warning FailedMount 1 second ago
    (11 times in the last 7 minutes) kubelet MountVolume.SetUp failed for volume "pvc-bb951005-5152-4ab8-ba6c-251d11af5c7a" : mount failed: exit status 32 Mounting command: mount Mounting arguments: -t nfs 192.168.4.2:/share/default-nginx-pvc-pvc-bb951005-5152-4ab8-ba6c-251d11af5c7a /var/lib/kubelet/pods/f274b225-5cec-432b-8b84-35f9355b0486/volumes/kubernetes.io~nfs/pvc-bb951005-5152-4ab8-ba6c-251d11af5c7a Output: mount.nfs: access denied by server while mounting 192.168.4.2:/share/default-nginx-pvc-pvc-bb951005-5152-4ab8-ba6c-251d11af5c7a

    Fix: edit /etc/exports on the NFS server so that the failing node is allowed to access the share, then reload the configuration with exportfs -rv.

  4. When testing the connection from a client:

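    [root@xxx mnt]# mount -t nfs 10.255.7.6:/mnt/data /mnt/test
    mount: wrong fs type, bad option, bad superblock on 10.255.7.6:/mnt/data,
    missing codepage or helper program, or other error
    (for several filesystems (e.g. nfs, cifs) you might need
    a /sbin/mount.<type> helper program)

    In some cases useful info is found in syslog - try
    dmesg | tail or so.

    Fix: install the NFS client with yum install -y nfs-utils, then start it with systemctl start nfs-utils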

  5. The error log is as follows:

    MountVolume.SetUp failed for volume "pvc-dc2431fb-5352-4265-8a9d-0a11b69d8588" : mount failed: exit status 32 Mounting command: systemd-run Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/bb3fdc10-38de-4437-b8df-81b207e57f1d/volumes/kubernetes.io~nfs/pvc-dc2431fb-5352-4265-8a9d-0a11b69d8588 --scope -- mount -t nfs 10.255.247.176:/mnt/data/kubesphere-loki-system-loki-storage-pvc-dc2431fb-5352-4265-8a9d-0a11b69d8588 /var/lib/kubelet/pods/bb3fdc10-38de-4437-b8df-81b207e57f1d/volumes/kubernetes.io~nfs/pvc-dc2431fb-5352-4265-8a9d-0a11b69d8588 Output: Running scope as unit run-62186.scope. mount: wrong fs type, bad option, bad superblock on 10.255.247.176:/mnt/data/kubesphere-loki-system-loki-storage-pvc-dc2431fb-5352-4265-8a9d-0a11b69d8588, missing codepage or helper program, or other error (for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program) In some cases useful info is found in syslog - try dmesg | tail or so.

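    Fix: in short, this is a mount failure with exit status 32. It usually means that the mount is busy or that the host does not have nfs-utils installed. It is best to run yum install -y nfs-utils on every node in the cluster; otherwise the error appears whenever a pod happens to be scheduled onto a node without it.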
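
The problems above mostly come down to recognising a few message fragments. As a rough sketch, they can be mapped to the fixes from this list with a small helper. diagnose_mount_error is a hypothetical name, and the substrings it matches are taken from the messages quoted above, so other versions or locales may word them differently.

```shell
#!/bin/sh
# Hypothetical helper: read an error message on stdin and print the fix suggested in the list above.
diagnose_mount_error() {
    msg=$(cat)
    case "$msg" in
        *"selfLink was empty"*)
            echo "use the new provisioner (nfs-subdir-external-provisioner)" ;;
        *"access denied by server"*)
            echo "allow the node in /etc/exports, then run exportfs -rv" ;;
        *"wrong fs type"*|*"helper program"*)
            echo "install nfs-utils on the node" ;;
        *)
            echo "unknown" ;;
    esac
}

# Example: feed it the text of a kubelet event.
echo 'mount.nfs: access denied by server while mounting 192.168.4.2:/share' | diagnose_mount_error
```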