Frye on Thomson Reuters v. ROSS: Brief for Amici Curiae in Support of Appellant-Defendant’s Petition for Certification Under § 1292(b)

Brian L. Frye (U Kentucky J. David Rosenberg College Law) has posted “Thomson Reuters v. ROSS: Brief for Amici Curiae in Support of Appellant-Defendant’s Petition for Certification Under § 1292(b)” on SSRN. Here is the abstract:

This is an amicus brief in support of ROSS Intelligence’s petition for interlocutory review of the district court’s order in Thomson Reuters Enter. Ctr. GmbH v. Ross Intel. Inc., No. 1:20-CV-613-SB, 2025 WL 458520 (D. Del. Feb. 11, 2025). Plaintiffs Thomson Reuters and West allege that ROSS infringed the copyright in West’s headnotes by using them to train an AI model. The district court largely granted the plaintiffs’ motions for summary judgment, finding that at least some of West’s headnotes are protected by copyright and that ROSS’s use of West’s headnotes was not protected by the fair use doctrine. This amicus brief argues that the Third Circuit should grant interlocutory review because West’s headnotes are not copyrightable subject matter. Accordingly, this case is not an appropriate vehicle for the court to determine whether the use of copyrighted works to train an AI model is infringing or a fair use.

Saw & Tan on Unpacking Copyright Infringement Issues in the GenAI Development Lifecycle and a Peek into the Future

Cheng Lim Saw (Singapore Management U Yong Pung How Law) and Bryan Zhi Yang Tan (Singapore Management U Yong Pung How Law) have posted “Unpacking Copyright Infringement Issues in the GenAI Development Lifecycle and a Peek into the Future” on SSRN. Here is the abstract:

Generative AI (“GAI”) refers to deep learning models that ingest input data and “learn” to produce output that mimics such data when duly prompted. This feature, however, has given rise to numerous claims of infringement by the owners of copyright in the training material. Relevantly, three questions have emerged for the law of copyright: (1) whether prima facie acts of infringement are disclosed at each stage of the GAI development lifecycle; (2) whether such acts fall within the scope of the text and data mining (“TDM”) exceptions; and (3) whether (and, if so, how successfully) the fair use exception may be invoked by GAI developers as a defence to infringement claims. This paper critically examines these questions in turn and considers, in particular, their interplay with the so-called “memorisation” phenomenon. It is argued that although infringing acts might occur in the process of downloading in-copyright training material and training the GAI model in question, TDM and fair use exceptions (where available) may yet exonerate developers from copyright liability under the right conditions.
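
The “memorisation” phenomenon the abstract refers to is typically demonstrated by showing that a model’s output reproduces long verbatim spans of its training material. As a purely illustrative aid, not drawn from the paper, here is a minimal Python sketch of such an overlap check; the n-gram length, the threshold implied, and the toy strings are all assumptions:

```python
# Hypothetical sketch of a verbatim-overlap "memorisation" check.
# The n-gram length (n=50 characters) is an illustrative assumption.

def ngrams(text: str, n: int) -> set[str]:
    """Return the set of character n-grams in `text`."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def memorisation_overlap(output: str, training_doc: str, n: int = 50) -> float:
    """Fraction of the output's n-grams that appear verbatim in the training document."""
    out_grams = ngrams(output, n)
    if not out_grams:
        return 0.0
    return len(out_grams & ngrams(training_doc, n)) / len(out_grams)

# Toy usage with made-up strings: a high score suggests the output
# reproduces expression from the training material rather than
# merely mimicking its style.
training_doc = "the quick brown fox jumps over the lazy dog " * 20
output = "prompted completion: the quick brown fox jumps over the lazy dog " * 3
print(f"overlap: {memorisation_overlap(output, training_doc):.2f}")
```

A high score on a check of this kind is where the infringement questions above bite hardest, since it indicates reproduction of expression rather than mere stylistic mimicry.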

Mantegna on ARTificial: Why Copyright Is Not the Right Policy Tool to Deal with Generative AI

Micaela Mantegna (Berkman Klein Center) has posted “ARTificial: Why Copyright Is Not the Right Policy Tool to Deal with Generative AI” (The Yale Law Journal Forum | April 22, 2024) on SSRN. Here is the abstract:

The rapid advancement and widespread application of Generative Artificial Intelligence (GAI) raise complex issues regarding authorship, originality, and the ethical use of copyrighted materials for AI training.

As attempts to regulate AI proliferate, this Essay proposes a taxonomy of reasons, from the perspective of creatives and society alike, that explain why copyright law is ill-equipped to handle the nuances of AI-generated content.

Originally designed to incentivize creativity, copyright doctrine has been expanded in scope to cover new technological mediums. This expansion has proven to increase the complexity and uncertainty of copyright doctrine’s application—ironically leading to the stifling of innovation. In this Essay, I warn that further attempts to expand the interpretation of copyright doctrine to accommodate the particularities of GAI might well worsen that problem, all while failing to fulfill copyright’s stated goal of protecting creators’ rights to consent, attribution, and compensation.

Moreover, I argue that, in that expansion, there is the peril of overreaching copyright laws that will negatively impact society and the development of ethical AI. This Essay explores the philosophical, legal, and practical dimensions of these challenges in four parts.

Goodyear on Artificial Infringement

Michael Goodyear (New York U Law) has posted “Artificial Infringement” (UC Law Journal, Forthcoming) on SSRN. Here is the abstract:

Generative AI is changing the way we do everything from legal research to artistic creation. This is possible through recent advances in machine learning that allow AI systems to program themselves. With greater AI capacity, however, comes increasingly unpredictable outputs. AI systems will often generate an output the user and the developer never considered. Sometimes, these unforeseen outputs can infringe others’ copyrights in creative works. In the past two years, copyright law has become one of the leading legal and policy battlegrounds for generative AI. Yet the question of who should be liable when AI systems infringe has barely been addressed.

By examining the historical and doctrinal response of copyright law to new technologies, this Article offers a new analytical framework for determining liability for what it terms artificial infringement, or infringing outputs created by generative AI systems. Time and again, new technologies have posed challenges to existing copyright law, straining its capacity to both protect authors’ rights to incentivize new creative works and provide public access to those works. Courts and Congress have been able to maintain this balance by using a variety of doctrinal tools, including fair use, compulsory licensing, and secondary liability. One undertheorized tool, however, is the refinement of the copyright infringement claim. Courts introduced the “volition or causation” requirement to balance copyright in response to the rise of complex machine-generated infringements.

This Article proposes that the AI system should be held directly liable for artificial infringement because it caused the infringing expression to occur. By making the AI system the direct infringer, courts can remove copyright law from strict liability and instead utilize and refine secondary liability doctrines to conduct a more nuanced, fault-based analysis of user and developer liability for AI-generated infringements. Together with the fair use doctrine, this conceptualization of the AI system as the direct infringer and users and developers as potentially secondarily liable provides a more comprehensive resolution to the existential infringement battles between copyright owners and AI while maintaining a balance between copyright’s competing policy goals.

Yeong on Accessing and Using Data for AI Systems

Zee Kin Yeong (Singapore Academy Law) has posted “Accessing and Using Data for AI Systems” (The 4th Judicial Roundtable, 23–26 April 2024, at Durham Law School) on SSRN. Here is the abstract:

This paper discusses the copyright and data protection issues relating to accessing and using data for AI systems. The discussion commences with the issues raised by processing data for model development: secondary use of data, data sharing with partners, and collection of publicly available data for machine learning raise issues such as text and data mining in copyright law and legitimate interests in data protection law. Different issues have to be considered for the deployment of trained models, e.g., pre-deployment tests may be required under data protection laws for compliance with data security requirements. AI systems may also rely on repositories and will require input data from users; here too there are pertinent copyright and data protection issues to consider, such as data subject rights.

Rättzén on Location Is All You Need: Copyright Extraterritoriality and Where to Train Your AI

Mattias Rättzén (Independent) has posted “Location Is All You Need: Copyright Extraterritoriality and Where to Train Your AI” (26 The Columbia Science and Technology Law Review 175-289 (2024)) on SSRN. Here is the abstract:

The development of artificial intelligence (“AI”) models requires vast quantities of data, which will often include copyrighted materials. The reproduction of copyrighted materials in the course of training AI models will infringe copyright unless applicable exceptions and limitations exempt such activities. In this regard, there is so far considerable divergence between jurisdictions, including the United States, the EU, the U.K., Japan, Singapore, Australia, India, Israel, and many others. In the absence of international harmonization, there is therefore a high likelihood that the same type of training activity would be considered copyright infringement in some countries but not in others.

The AI community is not blind to that risk. If copyright law restricts the development and deployment of AI, developers may decide to relocate their operations elsewhere, where the reproduction of training data is clearly not infringing. This Article concludes that there is a loophole in the international copyright system, as it currently stands, that would permit large-scale copying of training data in one country where this activity is not infringing. Once the training is done and the model is complete, developers could then make the model available to customers in other countries, even if the same training activities would have been infringing if they had occurred there. Because copyright laws are territorial in nature, by default they can only restrict infringing conduct occurring in their respective countries. From that point of view, for AI developers, location is indeed all you need.

The EU has become the first to respond to this problem by retroactively extending its text and data mining exception extraterritorially to training activities occurring in non-EU countries, once the completed AI model is placed on the EU market. While such an extraterritorial application benefits rightholders and closes the loophole, it makes the situation significantly more complex for developers. If other regulators decide to follow the same path as the EU, as previously happened in the data privacy context, then developers would face multiple, conflicting copyright laws targeting the same underlying activity. This could significantly complicate the development process for AI and potentially undermine the AI industry. This Article critically discusses these and related issues, and whether an extraterritorial application of copyright laws is compatible with territoriality norms that are supposed to respect foreign sovereignty. It also explores, in light of these difficulties, whether we should instead shift focus from regulating the inputs (i.e., the data used to train AI models) to regulating the outputs (i.e., the AI-generated content itself). Indeed, to the extent that the transnational data loophole cannot be closed without infringing upon foreign sovereignty, we may need to look to other regulatory means instead.

This Article urgently calls for a coordinated international effort in copyright law, which balances the interests of rightholders with the technical, regulatory, and economic realities faced by developers. How we resolve these issues could make or break the future of AI. If we cannot find a way to reconcile the interests of rightholders and AI stakeholders, the world may be left with a segregated and fragmented AI landscape, one in which there can only be losers and no winners.

Citation: Mattias Rättzén, Location Is All You Need: Copyright Extraterritoriality and Where to Train Your AI, 26 Colum. Sci. & Tech. L. Rev. 175 (2024)

Official source: https://journals.library.columbia.edu/index.php/stlr/article/view/13338/6542

Cooper et al. on Machine Unlearning Doesn’t Do What You Think: Lessons for Generative AI Policy, Research, and Practice

A. Feder Cooper (Microsoft Research) et al. have posted “Machine Unlearning Doesn’t Do What You Think: Lessons for Generative AI Policy, Research, and Practice” on SSRN. Here is the abstract:

We articulate fundamental mismatches between technical methods for machine unlearning in Generative AI and documented aspirations for broader impact that these methods could have for law and policy. These aspirations are both numerous and varied, motivated by issues that pertain to privacy, copyright, safety, and more. For example, unlearning is often invoked as a solution for removing the effects of targeted information from a generative-AI model’s parameters, e.g., a particular individual’s personal data or in-copyright expression of Spiderman that was included in the model’s training data. Unlearning is also proposed as a way to prevent a model from generating targeted types of information in its outputs, e.g., generations that closely resemble a particular individual’s data or reflect the concept of “Spiderman.” Both of these goals (the targeted removal of information from a model and the targeted suppression of information from a model’s outputs) present various technical and substantive challenges. We provide a framework for thinking rigorously about these challenges, which enables us to be clear about why unlearning is not a general-purpose solution for circumscribing generative-AI model behavior in service of broader positive impact. We aim for conceptual clarity and to encourage more thoughtful communication among machine learning (ML), law, and policy experts who seek to develop and apply technical methods for compliance with policy objectives.
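
The abstract’s distinction between the two goals can be made concrete. The sketch below, assuming PyTorch and entirely hypothetical models and data, contrasts (1) a common parameter-level unlearning heuristic (gradient ascent on the “forget” examples) with (2) output-level suppression via a post-hoc filter. As the paper stresses, neither step guarantees the targeted information is actually gone:

```python
# Hypothetical sketch: parameter-level "unlearning" vs. output suppression.
import torch

torch.manual_seed(0)
model = torch.nn.Linear(8, 1)            # toy stand-in for a generative model
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()

forget_x, forget_y = torch.randn(4, 8), torch.randn(4, 1)  # targeted examples

# (1) Parameter-level "unlearning": ascend the loss on the forget set so the
# model no longer fits those examples. This updates the weights but does not
# guarantee the information's influence is removed -- one of the mismatches
# the paper highlights.
for _ in range(10):
    opt.zero_grad()
    loss = -loss_fn(model(forget_x), forget_y)  # negated loss = gradient ascent
    loss.backward()
    opt.step()

# (2) Output-level suppression: leave the parameters alone and filter
# generations that match the targeted content.
BLOCKLIST = ["Spiderman"]

def suppress(generation: str) -> str:
    return "[withheld]" if any(t in generation for t in BLOCKLIST) else generation

print(suppress("a drawing of Spiderman"))   # -> [withheld]
print(suppress("a drawing of a spider"))    # -> passes through unchanged
```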

Hrdy on Trade Secrecy Meets Generative AI

Camilla Alexandra Hrdy (Rutgers) has posted “Trade Secrecy Meets Generative AI” (“Disrupting AI” Symposium Issue of the Chicago-Kent Law Review, Forthcoming 2025) on SSRN. Here is the abstract:

Generative AI models like ChatGPT raise novel issues for trade secret law. This Essay identifies three major developments and explains how the law will likely respond based on analogies to past technologies and past case law. 

First, widespread use of generative AI poses new risks to companies’ existing trade secrets. For example, trade secret owners’ own employees might inadvertently share trade secrets with a generative AI tool like ChatGPT, which might disseminate this information to competitors or third parties. I argue this new disclosure risk, at the margins, raises the bar for keeping trade secrets. But companies will likely adapt their risk management strategies, as they did in the face of prior information-distribution technologies, such as the internet. 

Second, generative AI will add to the universe of information that can be protected under trade secret law. Trade secret law will be available even for information that is not protected by patent and copyright law. Patent and copyright law have human creator requirements. But trade secret law has no human creator requirement. Therefore, purely AI-generated outputs that do not qualify for patent or copyright protection can be protected as trade secrets. 

Third, companies that develop valuable new generative AI tools will be able to rely on trade secrecy to protect that technology, even when other forms of IP are unavailing. Trade secret law, especially when supplemented by restrictive contractual “terms of use,” can protect various types of information related to generative AI, including information that does not qualify for copyright or patent protection. 

Even though generative AI models will initially benefit from a combination of trade secrecy and contract protection, the models are highly vulnerable to “reverse engineering.” For example, OpenAI, the maker of ChatGPT, recently accused the makers of the new AI model, “DeepSeek,” of engaging in “knowledge distillation” to develop their competing system—using the larger, more complex, and more expensive ChatGPT model to build a smaller, simpler, and cheaper one. Trade secret law, although it generally permits reverse engineering, may or may not condone this conduct. Courts might construe these activities as a violation of contract law, since knowledge distillation seems to violate OpenAI’s contractual terms of use, but courts may also view these activities as a violation of federal and state trade secret law. In software cases, courts have held that using cutting-edge techniques like data scraping to access trade secrets constitutes acquisition by “improper means,” and thus misappropriation, especially when contractual terms of use explicitly prohibit this conduct. The makers of DeepSeek claim they independently developed their model, but if this is not true, trade secret law could provide an avenue for legal liability.
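
For readers unfamiliar with the technique, “knowledge distillation” trains a small student model to match the soft outputs of a larger teacher model rather than the original data’s labels. The following Python sketch, assuming PyTorch and using toy models and random data as stand-ins, illustrates the classic distillation loss; it is a generic illustration and does not reflect how OpenAI’s or DeepSeek’s systems actually work:

```python
# Hypothetical sketch of knowledge distillation with toy models.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
teacher = torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.ReLU(),
                              torch.nn.Linear(64, 10))   # larger, more expensive model
student = torch.nn.Linear(16, 10)                        # smaller, cheaper model
opt = torch.optim.Adam(student.parameters(), lr=1e-2)
T = 2.0  # temperature: softens the teacher's output distribution

for step in range(100):
    x = torch.randn(32, 16)                  # queries sent to the teacher
    with torch.no_grad():
        teacher_logits = teacher(x)          # the teacher's "knowledge"
    student_logits = student(x)
    # KL divergence between softened distributions: the classic distillation loss
    loss = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * T * T
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final distillation loss: {loss.item():.4f}")
```

The legally salient point is visible in the code: the student never touches the teacher’s weights or training data, only its outputs, which is why courts must decide whether this is permissible reverse engineering, a contract violation, or acquisition by improper means.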

Ginsburg & Austin on Regulating Deepfakes at Home and Abroad

Jane C. Ginsburg (Columbia U Law) and Graeme W. Austin (Victoria U Wellington) have posted “Regulating Deepfakes at Home and Abroad” on SSRN. Here is the abstract:

AI technology enables the creation of “deepfakes”—known in legal documents as “digital replicas”—capable of simulating the visual and vocal appearance of real people, living or dead. AI programs can also generate musical compositions in the style of well-known composers or performers, as well as video sequences. What may be good fun in private may become pernicious, offensive, and even dangerous, if widely disseminated over social media or through commercial channels. But, at least in the U.S., legal protections for performers and ordinary individuals against digital replicas are, at best, scanty. The first part of this Essay reviews existing protections against the creation and dissemination of deepfakes under U.S. copyright and trademark laws as well as representative State right of publicity laws. Our brief survey supports the conclusion of the U.S. Copyright Office that “new federal legislation is urgently needed” because “existing laws fail to provide fully adequate protection.” These failures appear plainer still once consideration extends to the capacity of these doctrines to reach foreign violations. The second part of this Essay’s analysis will show how the currently pending U.S. legislation may, and may not, provide performers and ordinary individuals with enforceable rights against the use of their voices and visual likenesses in digital replicas. Given the few material barriers to cross-border dissemination of deepfakes, any evaluation of the strength of the protections afforded by a new U.S. intellectual property right should consider its international scope, particularly in light of recent Supreme Court caselaw restricting the territorial reach of U.S. intellectual property protections.