---
type: article
title: "Reasoning LLM Technical Insider - DeepSeek-R1/o1"
date:   2025-03-08 10:00:00 +0800
tags: [deepseek-r1, openai-o1, reasoning-model, chain-of-thought, test-time-compute, reinforcement-learning, llm]
---

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.001.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.002.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.003.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.004.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.005.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.006.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.007.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.008.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.009.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.010.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.011.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.012.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.013.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.014.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.015.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.016.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.017.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.018.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.019.jpeg)

![](/images/2025/ReasoningLLMTechnicalInsider/推理LLM技术内幕.020.jpeg)


- [Understanding Reasoning LLMs](https://sebastianraschka.com/blog/2025/understanding-reasoning-llms.html)
- [Sebastian Raschka: A Few Thoughts on DeepSeek R1 and Reasoning Models (in Chinese)](https://www.jiqizhixin.com/articles/2025-02-09-4)
- [Large Language Models are Zero-Shot Reasoners](https://arxiv.org/abs/2205.11916)
- [Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters](https://arxiv.org/abs/2408.03314)
- [Paper 04: Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (in Chinese)](https://zhuanlan.zhihu.com/p/18709529446)
- [Paper notes: Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling (in Chinese)](https://zhuanlan.zhihu.com/p/8147928532)
- [DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning](https://arxiv.org/abs/2501.12948)
