“PlayAI uses mostly open data sets, [as well as licensed data] and proprietary data sets that are built in-house,” Syed said. “We don’t use user data from the products in training, or creators to train models. Our models are trained on millions of hours of real-life human speech, delivering voices in male and female genders across multiple languages and accents.”
Most AI models are trained on public web data, some of which may be copyrighted or under a restrictive license. Many AI vendors argue that the fair-use doctrine shields them from copyright claims. But that hasn’t stopped data owners from filing class action lawsuits alleging that vendors used their data without permission.