More Developments in the Copyright Class-Action Conundrum
The alleged use of copyrighted datasets to train generative AI models has led to a wave of copyright litigation. Let’s look at this class action conundrum.
The debate over copyright infringement by AI art generators is ongoing (and probably always will be). U.S. District Judge William H. Orrick, of the Northern District of California, issued a decision in a copyright infringement class action lawsuit against Stability AI, Midjourney, and DeviantArt, all of which offer AI image generators. The lawsuit was filed by three artists: Sarah Andersen, Kelly McKernan, and Karla Ortiz. They claim that their work was used without their consent to train the generative AI models.
The AI companies had filed a motion to dismiss the copyright infringement case against them, which Judge Orrick largely granted, finding that the complaint was defective in numerous respects. One of the main issues was that two of the artists, McKernan and Ortiz, had not registered copyrights on their art with the U.S. Copyright Office, which is just batty in my opinion. Furthermore, Andersen had registered only 6 of the hundreds of works cited in the artists’ complaint.
The artists claim that some of their images were included in datasets from the Large-scale Artificial Intelligence Open Network (LAION). LAION is an open dataset that, according to the complaint, all three AI art generators use for training. However, Judge Orrick noted that it is not plausible that every Training Image used to train Stable Diffusion was copyrighted, or that all DeviantArt users’ Output Images rely upon copyrighted Training Images.
From Other Sources
In an article on VentureBeat, Carl Franzen writes (abridged):
“The complaint notes that even non-copyrighted works may be automatically eligible for copyright protections if they include the artists’ “distinctive mark,” such as their signature… The complaint notes that any AI companies that relied upon the widely-used LAION-400M and LAION-5B datasets — which do contain copyrighted works but only links to them and other metadata about them, and were made available for research purposes — would have had to download the actual images to train their models, thus making “unauthorized copies.”
Perhaps most damningly for the AI art companies, the complaint notes that the very architecture of diffusion models themselves — in which an AI adds visual “noise” or additional pixels to an image in multiple steps, then tries to reverse the process to get close to the resulting initial image — is itself designed to come as close to possible to replicating the initial training material.”
Carl Franzen
The complaint shows a misunderstanding of both the law and the technology.
— Emad (@EMostaque) December 2, 2023
“with the heavy disclaimer I have no training in law or legal matters beyond my research of them as a journalist”
I would recommend looking at the prior responses from the defendants counsels.
Emad Mostaque, of Stability AI, weighs in on the article above via Twitter (or X…)
More Copyright Lawsuits
These ongoing lawsuits remain a contentious issue at the intersection of AI and copyright law. Comedian Sarah Silverman filed lawsuits against OpenAI and Meta in July 2023, which further highlights the growing concern around the use of copyrighted data to train generative AI models. Her case was recently dismissed in part, and the complaint must be amended for it to go forward.
There is also the lawsuit involving GitHub Copilot. This lawsuit accuses Microsoft, GitHub, and OpenAI (I would not want to stand against their teams of lawyers) of using copyrighted code to train the tool and of reproducing it in its outputs.
Conclusion
The alleged use of copyrighted data to train generative AI models has led to a wave of litigation. How AI technologies affect copyright and intellectual property has become a far more pressing question as generative AI spreads.
As an artist or a user of AI art generators, it is important to be aware of the legal implications of using generative AI to create art. The lawsuits mentioned highlight the growing concern around the use of allegedly copyrighted data to train generative AI. It is important to ensure that the data we use to train AI is not copyrighted or is at least used with the consent of the copyright holder.
With generative AI rapidly becoming commonplace and mainstream, I fully expect more lawsuits to follow. The legal landscape and the AI landscape are both on the precipice of change. I can’t believe I am saying this, but it actually seems like an exciting time to be a copyright lawyer.