3. Hallucinations and Factual Accuracy
- Harder-to-detect hallucinations: More capable synthetic data generators (such as OpenAI's o1) may produce subtler hallucinations and inaccuracies that human reviewers and automated filters are less likely to catch.
- Traceability issues: As synthetic data passes through multiple generation and filtering stages, it becomes increasingly difficult to trace a specific error or hallucination back to its source; recording provenance metadata at generation time can help (see the sketch after this list).
- Compounding effect: Models trained on synthetic data that contains hallucinations may produce even more error-prone outputs, which then feed back into later training sets, creating a problematic feedback loop (a toy simulation of this drift follows below).
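
One way to make errors easier to trace is to attach provenance metadata to every synthetic example at the moment it is generated. The sketch below is a minimal illustration under assumed conventions, not a standard API: the field names (`generator_model`, `prompt_id`, `parent_ids`) and the `tag_synthetic_example` helper are hypothetical.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field, asdict


@dataclass
class SyntheticRecord:
    """A synthetic training example plus the provenance needed to trace errors back."""
    text: str
    generator_model: str   # which model produced this example (hypothetical field)
    prompt_id: str         # identifier of the prompt or seed that produced it
    parent_ids: list = field(default_factory=list)  # upstream synthetic records, if any
    created_at: float = field(default_factory=time.time)
    record_id: str = ""

    def __post_init__(self):
        # Content-addressed ID so a flagged hallucination can be traced to this exact record.
        digest = hashlib.sha256(self.text.encode("utf-8")).hexdigest()[:16]
        self.record_id = f"{self.generator_model}-{digest}"


def tag_synthetic_example(text: str, generator_model: str, prompt_id: str,
                          parent_ids=None) -> dict:
    """Wrap generated text with provenance metadata before it enters a training set."""
    record = SyntheticRecord(text=text, generator_model=generator_model,
                             prompt_id=prompt_id, parent_ids=parent_ids or [])
    return asdict(record)


if __name__ == "__main__":
    example = tag_synthetic_example(
        text="The Eiffel Tower was completed in 1889.",
        generator_model="reasoning-model-v1",  # placeholder name, not a real model ID
        prompt_id="landmarks-007",
    )
    print(json.dumps(example, indent=2))
```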
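
The compounding effect can be made concrete with a toy model: if each training generation inherits most of the previous generation's errors and adds a few of its own, the overall error rate drifts upward. The parameters below (inherited fraction, per-generation new-error rate) are illustrative assumptions, not measured values.

```python
def simulate_error_drift(generations: int = 5,
                         initial_error_rate: float = 0.02,
                         inherited_fraction: float = 0.9,
                         new_error_rate: float = 0.01) -> list:
    """Toy model of error compounding across generations of synthetic-data training.

    Each generation keeps `inherited_fraction` of the previous generation's errors
    (filtering removes the rest) and introduces `new_error_rate` fresh hallucinations.
    All parameter values are illustrative assumptions.
    """
    rates = [initial_error_rate]
    for _ in range(generations):
        previous = rates[-1]
        rates.append(previous * inherited_fraction + new_error_rate)
    return rates


if __name__ == "__main__":
    for gen, rate in enumerate(simulate_error_drift()):
        print(f"generation {gen}: approx. error rate {rate:.3f}")
```

Under these assumptions the error rate climbs from 2% toward roughly 10% over successive generations; the point is not the specific numbers but that imperfect filtering plus fresh hallucinations pushes quality in one direction unless new verified data breaks the loop.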