Apple has made a big deal out of paying for the data used to train Apple Intelligence, but one firm it relied on is accused of ripping off YouTube videos.
Apple Intelligence may have been trained less legally and ethically than Apple believed
All generative AI is built by training Large Language Models (LLMs) on enormous datasets, and very often the source of that data is controversial. So much so that Apple has repeatedly claimed that its sources are ethical: it is known to have paid millions to publishers and to have licensed images from photo library firms.
According to Wired, however, one firm whose data Apple has used appears to have been less scrupulous about its sources. EleutherAI created a dataset it calls the Pile, which Apple has reported using to train its LLMs.