| Name | DL | Torrents | Total Size |
| --- | --- | --- | --- |
| | | | 6.58GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 5.23GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 3.14GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 5.23GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.37GB |
| | | | 4.31GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 1.32GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
| | | | 4.30GB |
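As a quick sanity check on disk requirements, the per-file sizes listed in the table above can be summed (the values below are copied from the table; the grouping into "seven odd-sized files plus 42 files of 4.30GB" is just a compact way to write the same column):

```python
# Sizes (in GB) copied from the table above: seven files with
# non-standard sizes, plus 42 shards of 4.30GB each.
sizes_gb = (
    [6.58, 5.23, 5.23, 3.14, 4.37, 4.31, 1.32]  # the seven odd-sized files
    + [4.30] * 42                               # remaining 4.30GB shards
)

total_gb = sum(sizes_gb)
print(f"{len(sizes_gb)} files, ~{total_gb:.2f}GB total")
```

This puts the full download at roughly 210GB, so plan storage accordingly before grabbing all torrents.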
Type: Dataset
Tags: weights, LLM
Bibtex:
@article{,
  title= {DeepSeek-R1 model weights},
  keywords= {LLM, weights},
  author= {},
  abstract= {Weights for DeepSeek-R1 from Huggingface.

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning performance. Through RL, DeepSeek-R1-Zero naturally emerges with numerous powerful and interesting reasoning behaviors. However, it encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.

https://i.imgur.com/q6NKD6T.png

## License

This code repository and the model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs. Please note that:

- DeepSeek-R1-Distill-Qwen-1.5B, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-14B, and DeepSeek-R1-Distill-Qwen-32B are derived from the Qwen-2.5 series, which is originally licensed under the Apache 2.0 License, and are fine-tuned with 800k samples curated with DeepSeek-R1.
- DeepSeek-R1-Distill-Llama-8B is derived from Llama3.1-8B-Base and is originally licensed under the llama3.1 license.
- DeepSeek-R1-Distill-Llama-70B is derived from Llama3.3-70B-Instruct and is originally licensed under the llama3.3 license.

```
@misc{deepseekai2025deepseekr1incentivizingreasoningcapability,
  title={DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning},
  author={DeepSeek-AI},
  year={2025},
  eprint={2501.12948},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2501.12948},
}
```
},
  terms= {},
  license= {https://github.com/deepseek-ai/DeepSeek-R1/blob/main/LICENSE},
  superseded= {},
  url= {https://huggingface.co/deepseek-ai/DeepSeek-R1}
}