Part 2/7:
Understanding the Distillation Process
At its core, distillation refers to training one AI model on highly curated text produced by another, so that the student model learns to reproduce the quality of the source. Typically, after a model is pretrained on general text from diverse sources, researchers conduct a phase called supervised fine-tuning, in which the model learns to imitate high-quality completions drawn either from a more sophisticated model or from selected excerpts of a broad corpus.
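To make the mechanics concrete, here is a minimal sketch of that pipeline: a stronger "teacher" model generates completions for a set of prompts, and a smaller "student" model is fine-tuned to imitate them with the ordinary next-token objective. The model names, prompts, and hyperparameters are placeholders for illustration, not a description of any particular lab's setup.

```python
# Sketch of distillation via supervised fine-tuning (SFT):
# a student model is trained to imitate completions produced by a
# stronger teacher model. Names and data below are stand-ins.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

teacher_name = "gpt2-large"  # stand-in for a more capable model
student_name = "gpt2"        # smaller model being fine-tuned

tokenizer = AutoTokenizer.from_pretrained(student_name)
tokenizer.pad_token = tokenizer.eos_token
teacher = AutoModelForCausalLM.from_pretrained(teacher_name).eval()
student = AutoModelForCausalLM.from_pretrained(student_name)

prompts = [
    "Explain photosynthesis in one sentence.",
    "What is a binary search tree?",
]

# Step 1: collect teacher completions for each prompt.
records = []
with torch.no_grad():
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt")
        output_ids = teacher.generate(
            **inputs, max_new_tokens=40, do_sample=False
        )
        # Decoded text includes the prompt followed by the completion.
        records.append(tokenizer.decode(output_ids[0], skip_special_tokens=True))

# Step 2: fine-tune the student on the prompt-plus-completion text
# with standard next-token cross-entropy.
optimizer = torch.optim.AdamW(student.parameters(), lr=5e-5)
student.train()
for epoch in range(3):
    for text in records:
        batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
        # labels = input_ids: the student is supervised to reproduce
        # the teacher's tokens at every position.
        loss = student(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```

In practice the prompt set is far larger and the teacher's completions are filtered for quality before training, but the core loop is this simple: generate, curate, imitate.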
While some organizations scrupulously respect terms of service, others exploit loopholes to further their objectives. The debate over the legality and ethics of using one company's models to develop competitive AI raises hard questions about what counts as a "competitor" and what constitutes acceptable use.