Part 6/10:
The critique intensified when independent tests conducted to replicate the performance claims reportedly failed dramatically. Initial evaluations showed Reflection 70B trailing behind established models, such as llama 3, raising serious doubts about the claims made by Schumer. Meanwhile, speculation began to circulate that the model might be a repackaged version, with insiders alleging it utilized outputs from existing models, creating a façade of innovation.
Critics underscored the ambiguity surrounding access to the model's private API, which yielded impressive results but lacked transparency, raising questions about whether users could trust the claims made. Allegations surfaced that the model could be a wrapper for existing technologies, causing further distrust.