
Build whisper.cpp

Clone the repository

git clone https://github.com/ggml-org/whisper.cpp.git
cd whisper.cpp

Build

cmake -B build -DGGML_CUDA=1 -DCMAKE_CUDA_ARCHITECTURES="110"
cmake --build build -j --config Release
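
A quick sanity check once the build finishes (a sketch; paths assume the default CMake layout used above):

# list the built binaries; whisper-cli and whisper-server should both be present
ls ./build/bin/
# print the CLI usage to confirm the binary runs
./build/bin/whisper-cli --help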

Download models

sh ./models/download-ggml-model.sh small
sh ./models/download-ggml-model.sh large-v3-turbo

Available models include:
  • tiny.en
  • tiny
  • base.en
  • base
  • small.en
  • small
  • medium.en
  • medium
  • large-v1
  • large-v2
  • large-v3
  • large-v3-turbo
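
The download script stores each model as ggml-<name>.bin under models/; a quick check that the files are in place (assuming the script's default output directory):

ls -lh ./models/ggml-small.bin ./models/ggml-large-v3-turbo.bin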

Run whisper-cli

./build/bin/whisper-cli -f samples/jfk.wav
./build/bin/whisper-cli -m /models/whisper.cpp/models/ggml-large-v3-turbo.bin -f samples/jfk.wav
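
whisper-cli can also be told the language explicitly and asked to write the transcript to a file. A sketch of common options (flag names taken from the upstream help output; verify with --help):

# -l selects the language, -otxt writes the transcript to <input>.txt
./build/bin/whisper-cli \
    -m /models/whisper.cpp/models/ggml-large-v3-turbo.bin \
    -l en \
    -otxt \
    -f samples/jfk.wav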

whisper-server

Run whisper-server

./build/bin/whisper-server \
    --model /models/whisper.cpp/models/ggml-large-v3-turbo.bin \
    --host 0.0.0.0 \
    --port 8080 \
    --flash-attn \
    --language auto
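
To keep the server alive after the terminal closes, one simple option is nohup with a log file (just a sketch; a systemd unit is the more robust choice for long-running use):

nohup ./build/bin/whisper-server \
    --model /models/whisper.cpp/models/ggml-large-v3-turbo.bin \
    --host 0.0.0.0 --port 8080 \
    --flash-attn --language auto \
    > whisper-server.log 2>&1 &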

Test

curl 127.0.0.1:8080/inference \
    -H "Content-Type: multipart/form-data" \
    -F file="@samples/jfk.wav" \
    -F temperature="0.0" \
    -F temperature_inc="0.2" \
    -F response_format="json"
{
  "text": " And so, my fellow Americans, ask not what your country can do for you, ask what you can do for your country.\n"
}
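
For scripting, the text field can be pulled straight out of the JSON response, for example with jq (assuming jq is installed):

curl -s 127.0.0.1:8080/inference \
    -H "Content-Type: multipart/form-data" \
    -F file="@samples/jfk.wav" \
    -F response_format="json" \
    | jq -r '.text'
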
curl 127.0.0.1:8080/inference \
    -H "Content-Type: multipart/form-data" \
    -F file="@/models/iic/SenseVoiceSmall/example/zh.mp3" \
    -F response_format="json"
{"text":"開放時間早上9點至下午5點。\n"}

⚠️ With automatic language detection, the Chinese audio is transcribed as Traditional Chinese; Simplified Chinese has to be specified manually.

Transcribing Simplified Chinese

When running whisper-server, pass the --language zh option to specify Simplified Chinese.

./build/bin/whisper-server \
    --model /models/whisper.cpp/models/ggml-large-v3-turbo.bin \
    --host 0.0.0.0 \
    --port 8080 \
    --flash-attn \
    --language zh
curl 127.0.0.1:8080/inference \
    -H "Content-Type: multipart/form-data" \
    -F file="@/models/iic/SenseVoiceSmall/example/zh.mp3" \
    -F response_format="json"
{"text":"开放时间早上九点至下午五点\n"}
