DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Citation & Link
Guo, Daya, et al. “DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning.” arXiv preprint arXiv:2501.12948 (2025).
PDF Link

It feels a bit late for this, but even so, not writing up a summary ...