3 posts tagged “TTS”

Sunday, October 19, 2025

whisper.cpp Hands-On Guide (Jetson Thor Platform)

git clone https://github.com/ggml-org/whisper.cpp.git
cd whisper.cpp

cmake -B build -DGGML_CUDA=1 -DCMAKE_CUDA_ARCHITECTURES="110"
cmake --build build -j --config Release

sh ./models/download-ggml-model.sh small
sh ./models/download-ggml-model.sh large-v3-turbo

tiny.en
tiny
base.en
base
small.en
small
medium.en
medium
large-v1
large-v2
large-v3
large-v3-turbo

./build/bin/whisper-cli -f samples/jfk.wav
./build/bin/whisper-cli -m /models/whisper.cpp/models/ggml-large-v3-turbo.bin -f samples/jfk.wav
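The two invocations above can also be scripted. What follows is a minimal Python sketch, assuming the binary sits at `./build/bin/whisper-cli` as built above and that downloaded models are named `ggml-<name>.bin` (the pattern visible in the path above). The helper names `model_path`, `build_command`, and `transcribe` are hypothetical and not part of whisper.cpp.

```python
import subprocess
from pathlib import Path

# Model names listed above, as passed to download-ggml-model.sh.
MODELS = [
    "tiny.en", "tiny", "base.en", "base", "small.en", "small",
    "medium.en", "medium", "large-v1", "large-v2", "large-v3",
    "large-v3-turbo",
]


def model_path(name, models_dir="models"):
    """Return the expected model file path (assumed naming: ggml-<name>.bin)."""
    if name not in MODELS:
        raise ValueError(f"unknown model: {name}")
    return Path(models_dir) / f"ggml-{name}.bin"


def build_command(audio, model=None, cli="./build/bin/whisper-cli", models_dir="models"):
    """Build the whisper-cli argument list.

    When no model is given, -m is omitted and the CLI falls back to its default.
    """
    cmd = [cli]
    if model is not None:
        cmd += ["-m", str(model_path(model, models_dir))]
    cmd += ["-f", audio]
    return cmd


def transcribe(audio, model=None):
    """Run whisper-cli and return its standard output (needs the build step above)."""
    result = subprocess.run(
        build_command(audio, model), capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without a build.
    print(" ".join(build_command("samples/jfk.wav", "large-v3-turbo")))
```

Building the argument list separately from running it keeps the command easy to inspect before any GPU work starts.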

October 19, 2025 · 1 min · 102 words

Saturday, March 16, 2024

Open Source Models with Hugging Face

Install dependencies

pip install transformers

from transformers.utils import logging
logging.set_verbosity_error()

from transformers import pipeline
chatbot = pipeline(task="conversational", model="facebook/blenderbot-400M-distill")

from transformers import Conversation
user_message = "What are some fun activities I can do in the winter?"
conversation = Conversation(user_message)
conversation = chatbot(conversation)
print(conversation)

# Continue the conversation: to include the earlier exchange in the LLM context, add a "message" that carries the previous chat history.
conversation.add_message(
    {
// ...

March 16, 2024 · 3 min · 778 words

DeepLearningAI HuggingFace Gradio NLP ASR TTS

Saturday, December 9, 2023

SeamlessM4T: Massively Multilingual & Multimodal Machine Translation

Seamless Communication

ASR: Automatic speech recognition for 96 languages.
S2ST: Speech-to-Speech translation from 100 source speech languages into 35 target speech languages.
S2TT: Speech-to-text translation from 100 source speech languages into 95 target text languages.
T2ST: Text-to-Speech translation from 95 source text languages into 35 target speech languages.
T2TT: Text-to-text translation (MT) from 95 source text languages into 95 target text languages.
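The task list above maps naturally onto a small lookup table. Here is a minimal Python sketch that uses only the language counts quoted above. The `Task` dataclass and the `tasks_for` helper are hypothetical names for illustration and are not part of the SeamlessM4T code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    code: str          # task abbreviation from the list above
    source: str        # input modality: "speech" or "text"
    target: str        # output modality: "speech" or "text"
    source_langs: int  # number of supported source languages
    target_langs: int  # number of supported target languages


# ASR transcribes within one language, so both counts are taken as 96 here
# (an assumption of this sketch; the list above gives a single figure).
TASKS = {
    t.code: t
    for t in [
        Task("ASR", "speech", "text", 96, 96),
        Task("S2ST", "speech", "speech", 100, 35),
        Task("S2TT", "speech", "text", 100, 95),
        Task("T2ST", "text", "speech", 95, 35),
        Task("T2TT", "text", "text", 95, 95),
    ]
}


def tasks_for(source, target):
    """Return the task codes that match the given input and output modalities."""
    return [t.code for t in TASKS.values() if t.source == source and t.target == target]


if __name__ == "__main__":
    # ASR and S2TT both take speech in and produce text.
    print(tasks_for("speech", "text"))
```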

conda create -n seamless-m4t python==3.10.9 -y
conda activate seamless-m4t

cli/m4t/predict/predict.py

December 9, 2023 · 5 min · 1,012 words

SeamlessM4T ASR TTS conda MacBookProM2Max
