Adobe faces class-action suit on unauthorised AI data

Photo source: ITPro

Adobe faces a proposed class-action lawsuit accusing it of using unauthorised book copies to train its SlimLM language models. The suit, filed by Oregon author Elizabeth Lyon, claims her non-fiction guides appeared in training data alongside thousands of others.

SlimLM features lightweight models with 400 million parameters for mobile document tasks. Adobe states that pre-training used SlimPajama-627B, a 627-billion-token dataset released by Cerebras in June 2023.

Lyon’s case, first reported by Reuters, alleges that this dataset traces back to pirated books.

“The SlimPajama dataset was created by copying and manipulating the RedPajama dataset (including copying Books3),” the lawsuit says. “Thus, because it is a derivative copy of the RedPajama dataset, SlimPajama contains the Books3 dataset, including the copyrighted works of Plaintiff and the Class members.”

Books3, a 191,000-book trove from piracy sites, has sparked suits against Apple, Salesforce, and others over their use of RedPajama. Anthropic settled a similar claim for $1.5 billion in September. Adobe, known for ethical Firefly AI tools since 2023, has not commented.

Experts warn of rising demands for licensed data in AI training.
