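k3s node design
Note: when creating the virtual machine for the CUDA node, set the CPU type to host. The host type performs better and exposes AVX, which the default CPU type does not. To check whether AVX is available, run: cat /proc/cpuinfo | grep avx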
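The AVX check above can be wrapped in a small helper so it yields a yes/no answer instead of a wall of flags. This is only a sketch: the function name has_avx and its optional file argument are illustrative and not part of the original setup.

```shell
#!/bin/sh
# has_avx [FILE]: succeed if a "flags" line in FILE lists the avx feature.
# FILE defaults to /proc/cpuinfo; the argument exists only to make the check testable.
has_avx() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # -w matches avx as a whole word, so avx2 or avx512f alone do not count.
    grep -E '^flags' "$cpuinfo" 2>/dev/null | grep -qw 'avx'
}

if has_avx; then
    echo "AVX supported"
else
    echo "AVX not reported: check that the VM CPU type is set to host"
fi
```

The whole-word match is deliberate: a plain grep avx would also match avx2 or avx512 entries.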
Installing k3s on each node
Node node1

    core@node1:~$ sudo hostnamectl set-hostname node1
    core@node1:~$ sudo reboot
    core@node1:~$ sudo yum install -y curl
    core@node1:~$ curl -sfL https://get.k3s.io | K3S_TOKEN=EXXK_SECRET sh -s - server --cluster-init
    core@node1:~$ sudo reboot
    core@node1:~$ sudo kubectl get nodes
    NAME    STATUS   ROLES                       AGE    VERSION
    node1   Ready    control-plane,etcd,master   119s   v1.30.6+k3s1

    core@node1:~$ sudo rpm-ostree install nfs-utils

    core@node1:~$ sudo vi /etc/rancher/k3s/registries.yaml
    core@node1:~$ sudo systemctl restart k3s

    core@node1:~$ nmcli connection show
    NAME                UUID                                  TYPE      DEVICE
    Wired connection 1  805c16a1-faea-3964-8df9-daa42b0323f4  ethernet  ens18
    cni0                41cec1dd-2f39-4b9a-900e-aece2676a95a  bridge    cni0
    flannel.1           613f55c0-62db-4cc5-9889-a1b49269a515  vxlan     flannel.1
    lo                  596dce45-2e76-429e-afeb-4d2fd8433fa2  loopback  lo
    core@node1:~$ sudo nmcli con mod "Wired connection 1" ipv4.method manual ipv4.addresses 172.16.80.163/24 ipv4.gateway 172.16.80.1 ipv4.dns "172.16.10.63 192.168.100.254"

    core@node1:~$ sudo nmcli con up "Wired connection 1"

    core@node1:~$ nmcli connection show "Wired connection 1" | grep 'ipv4.method'
    ipv4.method:                            manual

Node node2

    core@node2:~$ sudo hostnamectl set-hostname node2
    core@node2:~$ sudo reboot
    core@node2:~$ sudo yum install -y curl
    core@node2:~$ curl -sfL https://get.k3s.io | K3S_TOKEN=EXXK_SECRET sh -s - server --server https://172.16.80.163:6443
    core@node2:~$ sudo reboot
    core@node2:~$ sudo kubectl get nodes
    NAME    STATUS   ROLES                       AGE   VERSION
    node1   Ready    control-plane,etcd,master   17m   v1.30.6+k3s1
    node2   Ready    control-plane,etcd,master   34s   v1.30.6+k3s1

    core@node2:~$ sudo rpm-ostree install nfs-utils

    core@node2:~$ sudo vi /etc/rancher/k3s/registries.yaml
    core@node2:~$ sudo systemctl restart k3s

    core@node2:~$ nmcli connection show
    NAME                UUID                                  TYPE      DEVICE
    Wired connection 1  805c16a1-faea-3964-8df9-daa42b0323f4  ethernet  ens18
    cni0                41cec1dd-2f39-4b9a-900e-aece2676a95a  bridge    cni0
    flannel.1           613f55c0-62db-4cc5-9889-a1b49269a515  vxlan     flannel.1
    lo                  596dce45-2e76-429e-afeb-4d2fd8433fa2  loopback  lo
    core@node2:~$ sudo nmcli con mod "Wired connection 1" ipv4.method manual ipv4.addresses 172.16.80.179/24 ipv4.gateway 172.16.80.1 ipv4.dns "172.16.10.63 192.168.100.254"

    core@node2:~$ sudo nmcli con up "Wired connection 1"

    core@node2:~$ nmcli connection show "Wired connection 1" | grep 'ipv4.method'
    ipv4.method:                            manual

Node exxk

    exxk@exxk:~$ curl -sfL https://get.k3s.io | K3S_TOKEN=EXXK_SECRET sh -s - server --server https://172.16.80.163:6443
    exxk@exxk:~$ sudo reboot
    exxk@exxk:~$ sudo kubectl get nodes
    [sudo] password for exxk:
    NAME    STATUS   ROLES                       AGE     VERSION
    exxk    Ready    control-plane,etcd,master   3m4s    v1.30.6+k3s1
    node1   Ready    control-plane,etcd,master   25m     v1.30.6+k3s1
    node2   Ready    control-plane,etcd,master   8m19s   v1.30.6+k3s1

    exxk@exxk:~$ sudo apt install nfs-common

    exxk@exxk:~$ sudo add-apt-repository ppa:graphics-drivers/ppa
    exxk@exxk:~$ sudo apt update
    exxk@exxk:~$ sudo apt install -y nvidia-driver-560 --no-install-recommends
    exxk@exxk:~$ sudo reboot
    exxk@exxk:~$ nvidia-smi

    exxk@exxk:~$ sudo nvidia-ctk runtime configure --runtime=containerd
    exxk@exxk:~$ sudo reboot

    exxk@exxk:~$ sudo ctr run --rm --gpus 0 docker.io/nvidia/cuda:11.8.0-base-ubuntu20.04 bash
    nvidia-smi
    Fri Dec 13 10:05:41 2024
    +-----------------------------------------------------------------------------------------+
    | NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
    |-----------------------------------------+------------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
    |                                         |                        |               MIG M. |
    |=========================================+========================+======================|
    |   0  NVIDIA GeForce RTX 3050 ...    Off |   00000000:00:10.0 Off |                  N/A |
    | N/A   45C    P8              3W /   60W |       2MiB /   4096MiB |      0%      Default |
    |                                         |                        |                  N/A |
    +-----------------------------------------------------------------------------------------+

    +-----------------------------------------------------------------------------------------+
    | Processes:                                                                              |
    |  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
    |        ID   ID                                                               Usage      |
    |=========================================================================================|
    |  No running processes found                                                             |
    +-----------------------------------------------------------------------------------------+
    exit
    .....

    exxk@exxk:~$ sudo vim /etc/rancher/k3s/registries.yaml
    exxk@exxk:~$ sudo systemctl restart k3s

    exxk@exxk:~$ sudo nano /etc/netplan/00-installer-config.yaml
    network:
      ethernets:
        ens18:
          dhcp4: false
          addresses:
            - 172.16.80.80/24
          gateway4: 172.16.80.1
          nameservers:
            addresses:
              - 172.16.10.63
              - 192.168.100.254
      version: 2
    exxk@exxk:~$ sudo netplan apply

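Once the third server has joined, the kubectl get nodes output can be checked mechanically rather than by eye. The sketch below counts nodes whose STATUS column is exactly Ready; the function name count_ready is illustrative, and the sample input is the output shown in this post.

```shell
#!/bin/sh
# count_ready: read `kubectl get nodes` output on stdin and print how many
# nodes have STATUS exactly "Ready" (the header row is skipped).
count_ready() {
    awk 'NR > 1 && $2 == "Ready" { n++ } END { print n + 0 }'
}

# Sample input copied from the node listing in this post; prints 3.
count_ready <<'EOF'
NAME    STATUS   ROLES                       AGE     VERSION
exxk    Ready    control-plane,etcd,master   3m4s    v1.30.6+k3s1
node1   Ready    control-plane,etcd,master   25m     v1.30.6+k3s1
node2   Ready    control-plane,etcd,master   8m19s   v1.30.6+k3s1
EOF
```

On a server node it would be used as sudo kubectl get nodes | count_ready. A node shown as NotReady or Ready,SchedulingDisabled is not counted, which is usually what you want when verifying a fresh cluster.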
NFS storage

    nfs:~# apk add nfs-utils
    nfs:~# mkdir /nfsdata
    nfs:/nfs# vi /etc/exports
    /nfsdata 172.16.80.0/24(rw,nohide,no_subtree_check,no_root_squash)
    nfs:~# rc-update add nfs
    nfs:~# rc-service nfs start

    nfs:~# vi /etc/network/interfaces
    auto lo
    iface lo inet loopback

    auto eth0
    iface eth0 inet static
        address 172.16.80.196
        netmask 255.255.255.0
        gateway 172.16.80.1
    nfs:~# /etc/init.d/networking restart

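The post does not show how the cluster consumes this export. One possible way, sketched here under that assumption, is a statically provisioned PersistentVolume that points at the server address and path configured above. The function name, the volume name and the size are hypothetical; only 172.16.80.196 and /nfsdata come from this setup.

```shell
#!/bin/sh
# nfs_pv_manifest NAME SIZE: print a PersistentVolume manifest backed by the
# NFS export from this post. NAME and SIZE are illustrative values.
nfs_pv_manifest() {
    pv_name="$1"
    pv_size="$2"
    cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ${pv_name}
spec:
  capacity:
    storage: ${pv_size}
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 172.16.80.196
    path: /nfsdata
EOF
}

# Print a manifest for a hypothetical 10Gi volume.
nfs_pv_manifest nfs-pv-demo 10Gi
```

The output could be piped into sudo kubectl apply -f - on a server node. Mounting it requires the NFS client packages on every node, which is why nfs-utils and nfs-common were installed earlier.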
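Client usage
To get the kubeconfig, run sudo cat /etc/rancher/k3s/k3s.yaml on any of the server nodes.
Copy or download that file, change the server IP in it to the node's externally reachable IP, and save it.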
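The server IP edit can also be scripted. The sketch below rewrites the host part of every server: https://HOST:PORT line in a copied kubeconfig and keeps the port; the function name and the example path are illustrative, and it assumes the file uses that https://HOST:PORT form.

```shell
#!/bin/sh
# set_kubeconfig_server FILE HOST: replace the host in each
# "server: https://HOST:PORT" line of FILE, leaving the port unchanged.
set_kubeconfig_server() {
    kc_file="$1"
    kc_host="$2"
    kc_out=$(mktemp) || return 1
    # Write to a temp file first so a failed sed never truncates the original.
    sed -E "s#(server: https://)[^:/]+(:[0-9]+)#\1${kc_host}\2#" "$kc_file" > "$kc_out" \
        && mv "$kc_out" "$kc_file"
}

# Example (hypothetical local path; the IP is node1's address from this post):
#   set_kubeconfig_server ~/.kube/k3s.yaml 172.16.80.163
```

Run it against the local copy of the file, not against /etc/rancher/k3s/k3s.yaml on the node itself.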
Lens
- Download the Mac version.
- Open the Lens client, click the + above Local KubeConfigs in the left-hand menu, and import the kubeconfig.
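kuboard
- I reused the Kuboard instance from another cluster, so I did not install one here; the cluster can be imported directly.
- Log in, go to Home Page->Add kubernetes, and paste in the kubeconfig.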
File appendix
/etc/rancher/k3s/registries.yaml, with the following contents:

    mirrors:
      "harbor.hcytech.dev":
        endpoint:
          - "http://harbor.exxktech.dev"
    configs:
      "harbor.exxktech.dev":
        tls:
          insecure_skip_verify: true
