
RE: LeoThread 2025-02-18 22:12


Part 4/7:

Following this extensive preparation, the existing DeepSeek-R1 model underwent post-training using NVIDIA's NeMo 2.0 framework. Although the exact post-training techniques remain undisclosed, it is believed that a method akin to supervised fine-tuning was used alongside reinforcement learning.
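
For readers unfamiliar with what supervised fine-tuning looks like in practice, the sketch below shows a bare-bones SFT loop in PyTorch with Hugging Face transformers. It is purely illustrative: the base model name, example data, and hyperparameters are placeholders, and it does not reflect the actual (undisclosed) NeMo 2.0 configuration described in the post.

```python
# Minimal, illustrative supervised fine-tuning (SFT) sketch. NOT the NeMo 2.0
# pipeline the post refers to; model, data, and hyperparameters are placeholders.
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder stand-in; the actual base model is DeepSeek-R1

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.train()

# Hypothetical curated prompt/response pairs: questions the base model would
# previously refuse, paired with factual, uncensored answers.
examples = [
    "Question: <sensitive topic>\nAnswer: <factual, uncensored response>",
    "Question: <another topic>\nAnswer: <factual, uncensored response>",
]

def collate(batch):
    enc = tokenizer(batch, return_tensors="pt", padding=True,
                    truncation=True, max_length=512)
    enc["labels"] = enc["input_ids"].clone()  # causal LM: predict the next token
    return enc

loader = DataLoader(examples, batch_size=2, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

for epoch in range(1):
    for batch in loader:
        loss = model(**batch).loss  # cross-entropy over the curated responses
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        print(f"loss: {loss.item():.4f}")
```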

Impact on Censorship and Performance Evaluation

One of the most significant outcomes of this initiative is R1 1776's minimal censorship on Chinese topics. In comparative analyses, R1 1776 showed the least censorship when queried about China, outperforming other models, including previous DeepSeek iterations and alternatives such as Claude and OpenAI's GPT models.
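
To make such a comparison concrete, here is a hypothetical sketch of how a refusal-rate check could be scripted: the same set of sensitive prompts is sent to each model and the share of refusals is counted. The prompts, refusal markers, and model callables are assumptions for illustration, not the methodology used in the analyses cited above.

```python
# Hypothetical refusal-rate comparison sketch; heuristics are illustrative only.
REFUSAL_MARKERS = ("i cannot", "i can't", "i'm unable", "as an ai")

def is_refusal(answer: str) -> bool:
    """Crude heuristic: treat an answer containing a refusal phrase as censored."""
    text = answer.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def refusal_rate(ask_model, prompts) -> float:
    """ask_model is any callable mapping a prompt string to a response string."""
    refusals = sum(is_refusal(ask_model(p)) for p in prompts)
    return refusals / len(prompts)

# Example usage with placeholder model callables:
# prompts = ["<sensitive question 1>", "<sensitive question 2>"]
# for name, ask in {"R1 1776": ask_r1_1776, "DeepSeek-R1": ask_deepseek_r1}.items():
#     print(name, refusal_rate(ask, prompts))
```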