Install NVIDIA device plugin for Kubernetes
Configure Docker on each NVIDIA GPU node
- Add "default-runtime": "nvidia" to /etc/docker/daemon.json
$ sudo vim /etc/docker/daemon.json
{
  "registry-mirrors": ["https://75oltije.mirror.aliyuncs.com"],
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
- Restart the Docker service
sudo systemctl restart docker
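The edit to daemon.json is easy to get wrong by hand, so a quick check may help. A minimal sketch, assuming a POSIX shell with grep; `has_nvidia_default_runtime` is a hypothetical helper name, not part of Docker or the NVIDIA tooling, and it only matches the setting textually without validating the JSON.

```shell
# Hypothetical helper: succeeds only if the given daemon.json
# contains "default-runtime": "nvidia" (textual match, not a JSON parse).
has_nvidia_default_runtime() {
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$1"
}

# Usage on a GPU node:
#   has_nvidia_default_runtime /etc/docker/daemon.json && echo "config ok"
# After the restart, Docker can report the runtime it actually uses:
#   docker info --format '{{.DefaultRuntime}}'
```

If the `docker info` command prints something other than nvidia, the daemon most likely did not pick up the edited file.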
Set taints on each node
GPU nodes
kubectl taint node gpu1 nvidia.com/gpu:NoSchedule
kubectl taint node gpu2 nvidia.com/gpu:NoSchedule
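With more than a couple of GPU nodes, the same taint can be applied in a loop. A minimal sketch, assuming bash or a POSIX sh; `taint_gpu_nodes` and the `KUBECTL` variable are hypothetical names introduced here for illustration, and `--overwrite` is added so the command can be re-run on a node that already carries the taint.

```shell
# Hypothetical helper: apply the nvidia.com/gpu:NoSchedule taint to every
# node named in the arguments. KUBECTL defaults to kubectl; setting
# KUBECTL="echo kubectl" prints the commands instead of running them.
taint_gpu_nodes() {
    for node in "$@"; do
        ${KUBECTL:-kubectl} taint node "$node" nvidia.com/gpu:NoSchedule --overwrite || return 1
    done
}

# Usage:
#   taint_gpu_nodes gpu1 gpu2
# Check the result on one node:
#   kubectl describe node gpu1 | grep Taints
```

Once the taint is in place, only pods that declare a toleration for the nvidia.com/gpu key with effect NoSchedule can be scheduled on these nodes.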
CPU nodes
kubectl taint node ln2 node-type=production:NoSchedule
kubectl ta