22 篇文章带有标签 “Ubuntu”

加速 Docker 构建镜像

  • Debian
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Ubuntu PRETTY_NAME="Ubuntu Jammy Jellyfish (development branch)" NAME="Ubuntu" VERSION_ID="22.04" VERSION="22.04 (Jammy Jellyfish)" VERSION_CODENAME=jammy ID=ubuntu ID_LIKE=debian HOME_URL="https://www.ubuntu.

Ubuntu 上将 NVIDIA GPU 切换为 Intel 集成显卡 IGD

IGD(Integrated Graphics Device)

操作系统:Ubuntu 18.04,主机有一张 NVIDIA 的独立显卡 GP106 [GeForce GTX 1060 6GB],还有 Intel 酷睿处理器 i5 8500 自带的集成显卡(Intel UHD Graphics 630)。为了更充分的使用独立显卡用于深度学习计算,需要把集成显卡用于显示。在这个过程中遇到了各种各样的问题:

  • 鼠标和键盘失灵。
  • 登录 X Window 时,输入正确的密码不能登录。

选择 IGD,保存退出。

Install Python3.9 in Ubuntu20.04

在看到这个错误,首先想到的就是安装包 python3.9-distutils,但是在安装的时候看到它依赖的安装包都是 python3.8 的。没有找到更好的解决方法,这里找到一个解释

I believe this is a bug in Debian's Python package. Their modifications to Python have been a source of a long standing debate: https://gist.github.com/tiran/2dec9e03c6f901814f6d1e8dad09528e has a lot of discussion.
I think that is the issue, Ubuntu packages are inconsistent, i.e. python3.8 and python3.9 bring different set of modules, as well as have different set of decencies, while all that should really differ is just Python version. Same applies for python3.[89]-minimal. However, all 4 are consentient in one thing - not having sys.prefix/lib/pythonX.Y/site-packages in sys.path
To have that (and distutils) sorted, one needs to install a collection of python3 packages, but those are Python3.8 (and bring tons of semi-random libraries and so).

All Ubuntu provided Python3.[89] 'installations' have ensurepip removed. I think, the 'logic' is to force people to use python3-pip that (unless --no-install-recommends is used) brings whole tone of things including make, cpp and perl(!)

I strongly recommend filing an issue with Debian and Ubuntu for this -- while pip can do things to paper over the issue, the fundamental problem is that the Python installation is not proper. Part of the problem is that Debian users don't ask Debian's maintainers to make fixes for the things they break.
I will consider that.

manually installing distutils
How did you do this? If you've used what is available in CPython's source tree, then you've installed an incompatible distutils for the Python interpreter -- Debian relies on patches they make to distutils to keep things working.
Installing might be a bit of overstatement, since I am working with Docker container I am doing:

COPY --from=python:3.9-slim /usr/local/lib/python3.9/distutils /usr/lib/python3.9/distutils
I would do FROM python:3.9-slim but I have to rely on specific 'base' container.
python:3.9-slim has FROM debian:bullseye-slim

Install NVIDIA device plugin for Kubernetes

  1. 重启服务
sudo systemctl restart docker
  1. 使用Helm安装
helm install --generate-name nvdp/nvidia-device-plugin

失败(gpu2节点的Docker没有配置好) $ kubectl logs -n kube-system nvidia-device-plugin-1614240442-wfh6c 2021/02/26 07:03:48 Loading NVML 2021/02/26 07:03:48 Failed to initialize NVML: could not load NVML library. 2021/02/26 07:03:48 If this is a GPU node, did you set the docker default runtime to nvidia? 2021/02/26 07:03:48 You can check the prerequisites at: https://github.com/NVIDIA/k8s-device-plugin#prerequisites 2021/02/26 07:03:48 You can learn how to set the runtime at: https://github.

Dockerfile OpenCV4 Ubuntu20.04

  1. ImportError: libgthread-2.0.so.0: cannot open shared object file: No such file or directory
>>> import cv2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/cv2/__init__.py", line 5, in <module>
    from .cv2 import *
ImportError: libgthread-2.0.so.0: cannot open shared object file: No such file or directory

apt install -y libglib2.0-dev

需要交互,不能自动化安装。 apt install libglib2.0-dev Setting up tzdata (2020f-0ubuntu0.20.04.1) ...

Building ONNX Runtime

  • 拉取容器(编译环境)
docker pull nvidia/cuda:11.1-cudnn8-devel-ubuntu20.04
  • 运行容器
docker run -it --name build-onnxruntime-gpu --runtime nvidia \
    -v $(pwd)/onnxruntime:/onnxruntime -w /onnxruntime \
    nvidia/cuda:11.1-cudnn8-devel-ubuntu20.04
  • 更新apt镜像源
sed -i 's/archive.ubuntu.com/mirrors.aliyun.com/g' /etc/apt/sources.list
apt-get update
  • 安装依赖包
apt-get install language-pack-en git cmake python3 python3-pip -y
  • 修改语言环境
locale-gen en_US.UTF-8
update-locale LANG=en_US.UTF-8
  • 更新pip镜像源
pip3 config set global.index-url https://mirrors.aliyun.com/pypi/simple/
  • 安装numpy
pip3 install numpy

编译 ./build.

在Ubuntu上下载docker和nvidia-docker2离线安装包

以下是容器内操作

分析要下载的依赖安装包 $ apt-get install -s nvidia-docker2 Reading package lists... Done Building dependency tree Reading state information... Done The following additional packages will be installed: libcap2 libnvidia-container-tools libnvidia-container1 nvidia-container-runtime nvidia-container-toolkit The following NEW packages will be installed: libcap2 libnvidia-container-tools libnvidia-container1 nvidia-container-runtime nvidia-container-toolkit nvidia-docker2 0 upgraded, 6 newly installed, 0 to remove and 0 not upgraded. Inst libcap2 (1:2.32-1 Ubuntu:20.

在Ubuntu上安装NVIDIA GPU驱动

在一台新安装的 Ubuntu20.04 系统上安装 NVIDIA GPU 驱动。

  1. 更新 initramfs
$ sudo update-initramfs -u
  1. 重启系统
$ sudo reboot
  1. 验证 nouveau 是否禁用成功(当什么也不显示出来时代表成功)
$ lsmod | grep nouveau
  1. 到[NVIDIA 驱动程序下载]页面下载对应型号的驱动
$ wget https://cn.download.nvidia.com/tesla/450.80.02/NVIDIA-Linux-x86_64-450.80.02.run
  1. 安装驱动
$ sudo sh NVIDIA-Linux-x86_64-450.80.02.run

在Ubuntu上配置apt镜像源

在国内使用官方的镜像源安装 Ubuntu 应用非常慢,通常配置国内的镜像源来加快速度,如阿里云。

sed -i 's/archive.ubuntu.com/mirrors.aliyun.com/g' /etc/apt/sources.list

Linux系统DNS设置

之前在文件/etc/resolv.conf中设置了,过段时间总是自动恢复默认值。(注释中写的很详细,是不可编辑的由系统自动生成的文件)

Linux系统上修改用户名

今天同事安装了一台新的服务器Ubuntu20.04,但用户名和主机名不是我想要的,这里尝试了直接修改Linux文件的方式。

  • 方法一
$ uname -n

安装Harbor

  1. 生成CA证书
openssl req -x509 -new -nodes -sha512 -days 3650 \
 -subj "/C=CN/ST=Shandong/L=Jinan/O=LNSoft/OU=AI/CN=lnsoft.com" \
 -key ca.key \
 -out ca.crt
  1. 生成私钥
openssl genrsa -out lnsoft.com.key 4096
  1. 生成证书签名请求(CSR)

调整-subj选项中的值以反映您的组织。 如果使用FQDN连接Harbor主机,则必须将其指定为公用名(CN)属性,并在密钥和CSR文件名中使用它。

openssl req -sha512 -new \
    -subj "/C=CN/ST=Shandong/L=Jinan/O=LNSoft/OU=AI/CN=lnsoft.com" \
    -key lnsoft.com.key \
    -out lnsoft.com.csr

生成一个x509 v3扩展文件 cat > v3.

NFS配置

  • 查看服务状态
sudo systemctl status nfs-server
  • 查看开启的NFS协议
sudo cat /proc/fs/nfsd/versions
-2 +3 +4 +4.1 +4.2
  • 配置访问的路径
sudo nano /etc/exports
/data/nfs        172.16.33.0/24(rw,sync,fsid=0,crossmnt,no_subtree_check)
  • 应用配置
sudo exportfs -ra
  • 查看当前应用
sudo exportfs -v
  • 重启服务
sudo systemctl restart nfs-server
  • 查看NFS服务器导出列表
showmount -e 172.16.33.157
Export list for 172.16.33.157:
/data/nfs 172.16.33.0/24,172.16.128.164
  • 挂载NFS
sudo mount -t nfs 172.16.33.157:/ $(pwd)/nfs
  • 移除挂载
sudo umount $(pwd)/nfs