Boasting one of the fastest-growing user bases, ChatGPT employs advanced artificial intelligence (AI) technology to assist users across diverse tasks, including answering questions, composing essays, analyzing data, translating text, and even generating code. However, concerns have arisen about where ChatGPT’s information comes from and whether those sources are protected under copyright law.
On December 27, 2023, The New York Times Company (“The Times”) filed a Complaint in the U.S. District Court for the Southern District of New York against Defendants OpenAI (a web of interrelated Delaware entities) and Microsoft Corporation, alleging multiple counts of copyright infringement. This case delves into an often unpredictable defense to copyright infringement called “fair use,” which looks to four factors to determine if use of a copyrighted work is fair: 1) the purpose and character of the use; 2) the nature of the copyrighted work; 3) the amount and substantiality of the portion of the copyrighted work used; and 4) the effect of the use upon the potential market or value of the copyrighted work.
In its Complaint, The Times argues that OpenAI and Microsoft’s use of The Times’s copyrighted content is not fair use. Specifically, The Times alleges that Defendants should not be permitted to use its copyrighted content to train their generative AI (“GenAI”) models, such as ChatGPT, because the models learn the content and then compete with, closely mimic, occasionally misattribute facts to, and even exactly reproduce portions of The Times’s copyrighted works. The Times states that it has always been dedicated to “producing independent journalism” and that it owns over 3 million registered copyrighted works. Importantly, The Times also emphasizes that producing its accurate, award-winning journalism takes extensive time, money, and manpower, and that its longstanding presence has allowed it to build trust among its millions of subscribers around the world. In light of this, The Times concludes that Defendants’ use of its copyrighted content in connection with their GenAI models is not fair use because it undermines The Times’s ability to fund and conduct its business, as well as its relationship with its readers.
Interestingly, although no formal answer has yet been filed with the Court, OpenAI responded to The Times’s Complaint with a statement on its website calling The Times’s lawsuit “without merit.” OpenAI stresses that it supports news organizations, pointing to its existing partnerships with entities such as the Associated Press and the American Journalism Project. It goes on to say that The Times “is not telling the full story” about previous negotiations between the parties or about how it got OpenAI’s GenAI models to reproduce exact copyrighted works. Finally, OpenAI maintains that training AI models with publicly available internet materials is fair use, as supported by a variety of precedents in several different countries, including the U.S., because it supports innovation and advancement.
In sum, in a society that values originality and reliable information as well as creative innovation and widespread access to knowledge, defining boundaries in the use and training of AI to determine what is ultimately “fair” is crucial but exceedingly difficult. We look forward to further developments in this dispute, and to learning from the new insights into copyright law that it could provide.
You can read The Times’s Complaint here: https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf
And OpenAI’s response here: https://openai.com/blog/openai-and-journalism