3. Hallucinations and Factual Accuracy
- Harder-to-detect hallucinations: More capable synthetic data generators (such as OpenAI's o1) may produce subtler hallucinations and inaccuracies that human reviewers and automated filters are less likely to catch.
- Traceability issues: As synthetic data passes through multiple generation and filtering stages, it becomes increasingly difficult to trace a specific error or hallucination back to its source; recording provenance metadata at generation time can help (see the sketch after this list).
- Compounding effect: Models trained on synthetic data that contains hallucinations may produce even more error-prone outputs, which then feed back into later training sets, creating a problematic feedback loop (a toy simulation of this drift follows below).
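
One way to make errors easier to trace is to attach provenance metadata to every synthetic example at the moment it is generated. The sketch below is a minimal illustration under assumed conventions, not a standard API: the field names (`generator_model`, `prompt_id`, `parent_ids`) and the `tag_synthetic_example` helper are hypothetical.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field, asdict


@dataclass
class SyntheticRecord:
    """A synthetic training example plus the provenance needed to trace errors back."""
    text: str
    generator_model: str   # which model produced this example (hypothetical field)
    prompt_id: str         # identifier of the prompt or seed that produced it
    parent_ids: list = field(default_factory=list)  # upstream synthetic records, if any
    created_at: float = field(default_factory=time.time)
    record_id: str = ""

    def __post_init__(self):
        # Content-addressed ID so a flagged hallucination can be traced to this exact record.
        digest = hashlib.sha256(self.text.encode("utf-8")).hexdigest()[:16]
        self.record_id = f"{self.generator_model}-{digest}"


def tag_synthetic_example(text: str, generator_model: str, prompt_id: str,
                          parent_ids=None) -> dict:
    """Wrap generated text with provenance metadata before it enters a training set."""
    record = SyntheticRecord(text=text, generator_model=generator_model,
                             prompt_id=prompt_id, parent_ids=parent_ids or [])
    return asdict(record)


if __name__ == "__main__":
    example = tag_synthetic_example(
        text="The Eiffel Tower was completed in 1889.",
        generator_model="reasoning-model-v1",  # placeholder name, not a real model ID
        prompt_id="landmarks-007",
    )
    print(json.dumps(example, indent=2))
```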
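
The compounding effect can be made concrete with a toy model: if each training generation inherits most of the previous generation's errors and adds a few of its own, the overall error rate drifts upward. The parameters below (inherited fraction, per-generation new-error rate) are illustrative assumptions, not measured values.

```python
def simulate_error_drift(generations: int = 5,
                         initial_error_rate: float = 0.02,
                         inherited_fraction: float = 0.9,
                         new_error_rate: float = 0.01) -> list:
    """Toy model of error compounding across generations of synthetic-data training.

    Each generation keeps `inherited_fraction` of the previous generation's errors
    (filtering removes the rest) and introduces `new_error_rate` fresh hallucinations.
    All parameter values are illustrative assumptions.
    """
    rates = [initial_error_rate]
    for _ in range(generations):
        previous = rates[-1]
        rates.append(previous * inherited_fraction + new_error_rate)
    return rates


if __name__ == "__main__":
    for gen, rate in enumerate(simulate_error_drift()):
        print(f"generation {gen}: approx. error rate {rate:.3f}")
```

Under these assumptions the error rate climbs from 2% toward roughly 10% over successive generations; the point is not the specific numbers but that imperfect filtering plus fresh hallucinations pushes quality in one direction unless new verified data breaks the loop.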