Apple has made a big deal out of paying for the data used to train Apple Intelligence, but one firm it relied on is accused of ripping off YouTube videos.
Apple Intelligence may have been trained less legally and ethically than Apple believed
All generative AI is built by training Large Language Models (LLMs) on enormous datasets, and very often the source of that data is controversial. So much so that Apple has repeatedly claimed that its sources are ethical: it is known to have paid millions to publishers and to have licensed images from photo library firms.
According to Wired, however, one firm whose data Apple has used appears to have been less scrupulous about its sources. EleutherAI created a dataset it calls the Pile, which Apple has reported using to train its LLMs.