军舰的日志

Inside Reasoning LLMs - DeepSeek-R1/o1

Understanding Reasoning LLMs
Sebastian Raschka: Some Thoughts on DeepSeek R1 and Reasoning Models
Large Language Models are Zero-Shot Reasoners
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
04 Paper: Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Paper notes: Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Tags

deepseek-r1 openai-o1 reasoning-model chain-of-thought test-time-compute reinforcement-learning llm 推理模型 (reasoning models)

Info

2025-03-08 10:00
