Training Large Language Models on Multiple GPUs

April 4, 2025

Category: Train LLM

Tags: Train LLM DeepSpeed ZeRO FlashAttention Quantization 李宏毅 2025

Table of Contents

References

References