23 篇文章带有标签 “apple-silicon”

2025年8月21日星期四

PyTorch 神经网络实战：从训练到推理的完整指南

该文本提供了一个关于PyTorch二分类神经网络的实现与性能分析的全面概述。首先，它通过具体代码示例展示了如何构建、训练、评估和保存一个基础的神经网络模型，并演示了如何加载模型进行推理。其次，文章深入探讨了不同模型参数规模下，Apple的MPS（Metal Performance Shaders）框架与CPU在训练时间上的性能对比，通过表格数据清晰地呈现了MPS在处理大型模型时相较于CPU的显著优势，并指出了性能的“转折点”。

我的电脑是 Apple MacBook Pro M2 Max 16寸 64G内存

PyTorch 二分类神经网络实现与训练示例 import torch import torch.nn.functional as F from torch.utils.data import Dataset from torch.utils.data import DataLoader # 模型网络 class NeuralNetwork(torch.nn.Module): def init(self, num_inputs, num_outputs): super().init() self.layers = torch.nn.Sequential( torch.nn.Linear(num_inputs, 30), torch.nn.ReLU(), torch.

2025-08-21 08:00

2024年5月8日星期三

Xorbits Inference: 模型服务变得更容易

macOS 上安装（M2）

conda create -n xinference python=3.10.9
conda activate xinference
pip install -U pip
pip install xinference

# GGML
CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python

安装
- GGML 引擎

使用

运行 Xinference

2024-05-08 08:00

xinference xorbits-inference model-serving llm macos apple-silicon chatbox deployment python

2024年3月14日星期四

MLX: An array framework for Apple silicon

MLX 介绍

MLX 是一个为 Apple Silicon 芯片上的机器学习研究设计的 array 框架，由 Apple 机器学习研究团队提供。

熟悉的 API：MLX 拥有一个与 NumPy 紧密对应的 Python API。MLX 还拥有功能齐全的 C++、C 和 Swift API，这些 API 也紧密地反映了 Python API。MLX 拥有更高级别的包，如 mlx.nn 和 mlx.optimizers，它们的 API 紧密跟随 PyTorch，以简化构建更复杂模型的过程。
统一内存：MLX 与其他框架的一个显著区别在于其统一内存模型。MLX 中的数组存在于共享内存中。可以在任何支持的设备类型上执行 MLX 数组的操作，无需数据传输。
MLX 的设计受到了像 NumPy、PyTorch、Jax 和 ArrayFire 这样的框架的启发。

安装

pip install mlx
pip install mlx-lm

conda

conda install -c conda-forge mlx
conda install -c conda-forge mlx-lm

2024-03-14 08:00

mlx llm mistral qwen quantization lora qlora fine-tuning apple-silicon inference

2024年1月31日星期三

在 MLX 上使用 LoRA / QLoRA 微调 Text2SQL（八）：使用 LoRA 基于 TinyLlama 微调

TinyLlama

TinyLlama/TinyLlama-1.1B-Chat-v1.0

输入

<|system|>
You are a chatbot who can help code!</s>
<|user|>
Write me a function to calculate the first 10 digits of the fibonacci sequence in Python and print it out to the CLI.</s>
<|assistant|>

输出

2024-01-31 08:00

mlx lora tinyllama text2sql wikisql sql-generation chat-template fine-tuning apple-silicon

2024年1月28日星期日

在 MLX 上使用 LoRA / QLoRA 微调 Text2SQL（七）：MLX 微调的模型转换为 GGUF 模型

将 MLX 微调的模型转换为 GGUF 模型最大的意义是可以融入 GGUF 的生态系统，可以在更多的平台上使用。

LoRA 微调

大模型 Mistral-7B-v0.1

mistralai/Mistral-7B-v0.1

数据集 WikiSQL

修改脚本 mlx-examples/lora/data/wikisql.py

if __name__ == "__main__":
    # ......
    for dataset, name, size in datasets:
        with open(f"data/{name}.jsonl", "w") as fid:
            for e, t in zip(range(size), dataset):
                t = t[3:]
                json.dump({"text": t}, fid)
                fid.write("\n")

执行脚本 data/wikisql.py 生成数据集。

data/wikisql.py

安装 mlx-lm

pip install mlx-lm

微调

2024-01-28 08:00

mlx lora mistral-7b text2sql wikisql gguf llama-cpp model-conversion quantization apple-silicon

2024年1月27日星期六

在 MLX 上使用 LoRA / QLoRA 微调 Text2SQL（六）：使用 LoRA 基于 Deepseek-Coder-7B 微调

大模型 Deepseek-Coder-7B

数据集 WikiSQL

修改脚本 mlx-examples/lora/data/wikisql.py if name == "main": # ...... for dataset, name, size in datasets: with open(f"data/{name}.jsonl", "w") as fid: for e, t in zip(range(size), dataset): # deepseek-ai/deepseek-coder-7b-instruct-v1.5 # 去掉开头的 <｜begin▁of▁sentence｜>，因为 tokenizer 会自动添加 <｜begin▁of▁sentence｜> t = t[3:-4] + "<｜end▁of▁sentence｜>" json.dump({"text": t}, fid) fid.

2024-01-27 08:00

mlx lora deepseek-coder-7b text2sql wikisql sql-generation prompt-engineering fine-tuning apple-silicon

2024年1月26日星期五

在 MLX 上使用 LoRA / QLoRA 微调 Text2SQL（五）：对比使用 LoRA 和 QLoRA 基于 Mistral-7B 微调的效果

使用 LoRA 和 QLoRA 基于 Mistral-7B 微调的实验

LoRA 和 QLoRA 对比

微调

Iteration	LoRA Train Loss	LoRA Val Loss	LoRA Tokens/sec	QLoRA Train Loss	QLoRA Val Loss	QLoRA Tokens/sec
1		2.343			2.420
100	1.204		221.348	1.216		166.377
200	1.091	1.111	207.353	1.095	1.130	187.795
300	0.818		234.182	1.065		194.826
400	0.837	1.076	207.763	0.998	1.006	170.072
500	0.774		223.036	0.726		189.288
600	0.609	1.001	218.118	0.607	1.015	186.397

微调的参数量 LoRA 微调万分之 2.35 （1.704M / 7243.436M * 10000）的模型参数。 QLoRA 微调万分之 13.

2024-01-26 08:00

mlx lora qlora mistral-7b text2sql wikisql quantization sql-generation apple-silicon benchmark

2024年1月25日星期四

在 MLX 上使用 LoRA / QLoRA 微调 Text2SQL（四）：使用 QLoRA 基于 Mistral-7B 微调

预训练模型 mistralai/Mistral-7B-v0.1

量化

QLoRA 微调需要量化，生成 4 位量化的 Mistral 7B 并默认将其存储在 mlx_model 目录中

python convert.py --hf-path mistralai/Mistral-7B-v0.1 -q

mlx_model 目录结构如下：

mlx_model
├── config.json
├── model.safetensors
├── special_tokens_map.json
├── tokenizer.json
├── tokenizer.model
├── tokenizer_config.json
└── weights.00.safetensors

量化后的模型 8.0G

微调

QLoRA 微调

2024-01-25 08:00

mlx qlora mistral-7b text2sql wikisql quantization sql-generation fine-tuning apple-silicon

2024年1月24日星期三

在 MLX 上使用 LoRA / QLoRA 微调 Text2SQL（三）：分享微调后的模型到 HuggingFace Hub

mlx-community/Mistral-7B-v0.1-LoRA-Text2SQL

安装 mlx-lm

pip install mlx-lm

生成 SQL

python -m mlx_lm.generate --model mlx-community/Mistral-7B-v0.1-LoRA-Text2SQL \
                          --max-tokens 50 \
                          --prompt "table: students
columns: Name, Age, School, Grade, Height, Weight
Q: Which school did Wang Junjian come from?
A: "

SELECT School FROM Students WHERE Name = 'Wang Junjian'

上传模型到 HuggingFace Hub

加入 MLX Community 组织

在 MLX Community 组织中创建一个新的模型 mlx-community/Mistral-7B-v0.1-LoRA-Text2SQL

克隆仓库 mlx-community/Mistral-7B-v0.1-LoRA-Text2SQL

2024-01-24 12:00

mlx lora mistral-7b text2sql huggingface huggingface-hub model-sharing mlx-community apple-silicon

在 MLX 上使用 LoRA / QLoRA 微调 Text2SQL（二）：使用 LoRA 基于 Mistral-7B 微调

mlx-community/Mistral-7B-v0.1-LoRA-Text2SQL

本次微调的模型我已经上传到了 HuggingFace Hub 上，大家可以进行尝试。

安装 mlx-lm

pip install mlx-lm

生成 SQL

python -m mlx_lm.generate --model mlx-community/Mistral-7B-v0.1-LoRA-Text2SQL \
                          --max-tokens 50 \
                          --prompt "table: students
columns: Name, Age, School, Grade, Height, Weight
Q: Which school did Wang Junjian come from?
A: "

SELECT School FROM Students WHERE Name = 'Wang Junjian'

在 MLX 上使用 LoRA / QLoRA 微调 Text2SQL（一）：使用 LoRA 基于 Mistral-7B 微调

📌 没有使用模型的标注格式生成数据集，导致不能结束，直到生成最大的 Tokens 数量。

这次我们来解决这个问题。

数据集 WikiSQL

修改脚本 mlx-examples/lora/data/w

2024-01-24 08:00

mlx lora mistral-7b text2sql wikisql sql-generation fine-tuning apple-silicon huggingface

2024年1月23日星期二

在 MLX 上使用 LoRA / QLoRA 微调 Text2SQL（一）：使用 LoRA 基于 Mistral-7B 微调

安装

git clone https://github.com/ml-explore/mlx-examples.git
cd mlx-examples/lora

pip install -r requirements.txt

下载模型

mistralai/Mistral-7B-v0.1

pip install huggingface_hub hf_transfer

export HF_HUB_ENABLE_HF_TRANSFER=1
huggingface-cli download \
    --local-dir-use-symlinks False \
    --local-dir mistralai/Mistral-7B-v0.1 \
    mistralai/Mistral-7B-v0.1

huggingface_hub Environment variables

数据集 WikiSQL

样本格式

2024-01-23 08:00

mlx lora mistral-7b text2sql wikisql sql-generation fine-tuning apple-silicon

2023年12月26日星期二

whisper.cpp

NEON & MPS 🆚 CoreML

下载模型（large-v3）

models/download-ggml-model.sh large-v3

NEON & MPS

编译

make clean
make -j

main 帮助 ./main --help usage: ./main [options] file0.wav file1.wav ...

2023-12-26 08:00

whisper whisper-cpp speech-to-text apple-silicon metal coreml neon quantization macos macbookpro

2023年12月24日星期日

MLX LLMS Examples

MLX Examples

克隆代码

git clone https://github.com/ml-explore/mlx-examples
cd mlx-examples

创建虚拟环境

python -m venv env
source env/bin/activate

pip install -r llms/phi2/requirements.txt
pip install -r llms/qwen/requirements.txt

创建大模型链接 mkdir llms/phi2/microsoft ln -s /Users/junjian/HuggingFace/microsoft/phi-2 llms/phi2/microsoft/phi-2 mkdir llms/qwen/Qwen ln -s /Users/junjian/HuggingFace/Qwen/Qwen-14B-Chat llms/qwen/Qwen/Qwen-14B-Chat ln -s /Users/junjian/HuggingFace/Qwen/Qwen-1_8B llms/qwen/Qwen/Qwen-1_8B ln -s /Users/junjian/HuggingFace/Qwen/Qwen-1_8B-Chat llms/qwen/Qwen/Qwen-1_8

2023-12-24 08:00

mlx phi-2 qwen apple-silicon macbookpro local-llms inference llm-performance machine-learning

2023年12月21日星期四

MLX: An array framework for Apple silicon

MLX

统一内存：与 MLX 和其他框架的显着区别是统一内存模型。 MLX 中的数组位于共享内存中。 MLX 阵列上的操作可以在任何支持的设备类型上执行，而无需传输数据。

MLX Documentation

创建虚拟环境

mkdir ml-explore && cd ml-explore
git clone https://github.com/ml-explore/mlx
git clone https://github.com/ml-explore/mlx-examples

python -m venv env
source env/bin/activate

Phi-2

安装依赖包

cd llms/phi2
pip install -r requirements.txt

模型下载和转换

使用已经下载的模型

mkdir microsoft
ln -s /Users/junjian/HuggingFace/microsoft/phi-2 microsoft/phi-2

转换模型

python convert.py

这将生成 MLX 可以读取的 weights.npz 文件。

-rw-r--r--  1 junjian  staff   5.2G 12 20 20:36 weights.npz

运行

2023-12-21 08:00

mlx phi-2 qwen stable-diffusion t5 whisper apple-silicon machine-learning python

2023年12月20日星期三

Microsoft Phi-2

Phi-2: The surprising power of small language models

microsoft/phi-2

创建虚拟环境

conda create -n huggingface python==3.10.9
conda activate huggingface

安装依赖包

conda install pytorch torchvision -c pytorch
pip install transformers
pip install einops

下载模型

huggingface-cli download microsoft/phi-2 --local-dir microsoft/phi-2 --local-dir-use-symlinks False

代码 import torch from transformers import AutoModelForCausalLM, AutoTokenizer torch.set_default_device("mps") model = AutoModelForCausalLM.

2023-12-20 10:00

phi-2 llm hugging-face pytorch transformers microsoft apple-silicon mps small-language-models

SDXL Turbo

下载代码

git clone https://github.com/Stability-AI/generative-models.git Stability-AI/generative-models
cd Stability-AI/generative-models/

创建虚拟环境

python -m venv env
source env/bin/activate
pip install -r requirements/pt2.txt
pip install .

Apple Silicon 上没有安装成功，安装包 triton 不支持

下载模型

pip install "huggingface_hub[cli]"

SDXL-Turbo

huggingface-cli download stabilityai/sdxl-turbo --local-dir checkpoints --local-dir-use-symlinks False

CLIP huggingface-cli download openai/clip-vit-large-patch14 --lo

2023-12-20 08:00

sdxl-turbo stable-diffusion text-to-image diffusion-models hugging-face stability-ai clip image-generation apple-silicon

2023年7月18日星期二

在 MacBook Pro M2 Max 上测试 ChatGLM2-6B

ChatGLM2-6B

ChatGLM2-6B 是开源中英双语对话模型 ChatGLM-6B 的第二代版本，在保留了初代模型对话流畅、部署门槛较低等众多优秀特性的基础之上，ChatGLM2-6B 引入了如下新特性：

更强大的性能：基于 ChatGLM 初代模型的开发经验，我们全面升级了 ChatGLM2-6B 的基座模型。ChatGLM2-6B 使用了 GLM 的混合目标函数，经过了 1.4T 中英标识符的预训练与人类偏好对齐训练，评测结果显示，相比于初代模型，ChatGLM2-6B 在 MMLU（+23%）、CEval（+33%）、GSM8K（+571%）、BBH（+60%）等数据集上的性能取得了大幅度的提升，在同尺寸开源模型中具有较强的竞争力。
更长的上下文：基于 FlashAttention 技术，我们将基座模型的上下文长度（Context Length）由 ChatGLM-6B 的 2K 扩展到了 32K，并在对话阶段使用 8K 的上下文长度训练，允许更多轮次的对话。但当前版本的 ChatGLM2-6B 对单轮超长文档的理解能力有限，我们会在后续迭代升级中着重进行优化。
更高效的推理：基于 Multi-Query Attention 技术，ChatGLM2-6B 有更高效的推理速度和更低的显存占用：在官方的模型实现下，推理速度相比初代提升了 42%，INT4 量化下，6G 显存支持的对话长度由 1K 提升到了 8K。

2023-07-18 08:00

chatglm glm macos macbookpro apple hugging-face transformers pytorch apple-silicon quantization

2023年3月15日星期三

在 MacBook Pro M2 Max 上测试 LLaMA

LLaMA

LLaMA-13B 在大多数基准上的表现优于 GPT-3（175B），LLaMA-65B 与最好的型号 Chinchilla-70B 和 PaLM-540B 具有竞争力。

LLaMA: Open and Efficient Foundation Language Models

克隆

git clone https://github.com/facebookresearch/llama
cd llama

下载模型

修改 download.sh，配置下载模型的 地址(PRESIGNED_URL) 和 下载目录(TARGET_FOLDER)。

vim download.sh

PRESIGNED_URL="https://agi.gpt4.org/llama/LLaMA/*"             # replace with presigned url from email
TARGET_FOLDER="./"             # where all files should end up

bash download.sh

llama.cpp

构建

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

拷贝 LLaMA 模型到当前目录 ls .

2023-03-15 08:00

llama llama-cpp gguf hugging-face macos macbookpro apple-silicon local-llms quantization python

2023年3月11日星期六

在 MacBook Pro M2 Max 上测试 Whisper

准备音频文件

macOS 上打开 QuickTimePlayer

[文件] -> [新建音频录制]
录制
朗读：荷兰发布了一份主题为“宣布即将对先进半导体制造设备采取的出口管制措施”的公告表示，鉴于技术的发展和地缘政治的背景，政府已经得出结论，有必要扩大现有的特定半导体制造设备的出口管制。
停止
保存(test.m4a)

荷兰限制部分型号光刻机出口中国市场影响几何

m4a 转换 wav

ffmpeg -i test.m4a -ar 16000 -ac 1 -c:a pcm_s16le test.wav

OpenAI Whisper

创建虚拟环境

conda create --name whisper python
conda activate whisper

安装

pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git

wget https://raw.githubusercontent.com/openai/whisper/main/requirements.txt
pip install -r requirements.txt

测试模型默认保存在 ~/.cache/whisper ls ~/.cache/whisper base.pt large-v2.

2023-03-11 08:00

whisper openai speech-to-text macos macbookpro apple-silicon ffmpeg python conda whisper-cpp

2023年3月4日星期六

在 MacBook Pro M2 Max 上使用 FFmpeg

Apple 芯片上进行硬件加速的框架

Video Toolbox

VideoToolbox 是一个低级框架，可提供对硬件编码器和解码器的直接访问。它提供视频压缩和解压缩服务，以及存储在 CoreVideo 像素缓冲区中的光栅图像格式之间的转换。这些服务以会话对象（压缩、解压缩和像素传输）的形式提供。

VideoToolbox还包括一些命令行工具，例如vttool、vtenc、vtdecode等，可以在终端中使用。这些工具可以用来检查视频的属性、转码视频、将视频转换为图像序列等任务。

Audio Toolbox

AudioToolbox 是一个音频处理框架，支持音频处理的硬件加速，它提供了一系列用于音频编码、解码、转换和处理的API接口。

安装 FFmpeg

static FFmpeg binaries for macOS 64-bit

创建目录

mkdir /opt/ffmpeg && cd /opt/ffmpeg

方法一：使用 curl

curl https://evermeet.cx/ffmpeg/ffmpeg-6.0.7z | tar -xz
curl https://evermeet.cx/ffmpeg/ffprobe-6.0.7z | tar -xz
curl https://evermeet.cx/ffmpeg/ffplay-6.0.7z | tar -xz

2023-03-04 08:00

ffmpeg macos macbookpro apple-silicon videotoolbox audiotoolbox hardware-acceleration video audio

23 篇文章带有标签 “apple-silicon”

2025年8月21日 星期四

2024年5月8日 星期三

2024年3月14日 星期四

2024年1月31日 星期三

2024年1月28日 星期日

2024年1月27日 星期六

2024年1月26日 星期五

2024年1月25日 星期四

2024年1月24日 星期三

2024年1月23日 星期二

2023年12月26日 星期二

2023年12月24日 星期日

2023年12月21日 星期四

2023年12月20日 星期三

2023年7月18日 星期二

2023年3月15日 星期三

2023年3月11日 星期六

2023年3月4日 星期六

2025年8月21日星期四

2024年5月8日星期三

2024年3月14日星期四

2024年1月31日星期三

2024年1月28日星期日

2024年1月27日星期六

2024年1月26日星期五

2024年1月25日星期四

2024年1月24日星期三

2024年1月23日星期二

2023年12月26日星期二

2023年12月24日星期日

2023年12月21日星期四

2023年12月20日星期三

2023年7月18日星期二

2023年3月15日星期三

2023年3月11日星期六

2023年3月4日星期六