Alonso et al. on AI And Copyright “Hallucinations”: Does the Text and Data Mining Exception Really Support Generative AI Training?

Eduardo Alonso (City U London) and Nicola Lucchi (Universitat Pompeu Fabra Law) have posted “AI And Copyright ‘Hallucinations’: Does the Text and Data Mining Exception Really Support Generative AI Training?” (European Intellectual Property Review, 2025, volume 47, issue 9, pp. 515-526) on SSRN. Here is the abstract:

This article critically challenges the widespread – and, it is argued, conceptually flawed – assumption that arts 3 and 4 of the CDSM Directive provide a lawful basis for training generative AI systems on copyright-protected content. The article describes this misinterpretation as a form of legal “hallucination”, underscoring its disconnect from the Directive’s textual, technical, and normative foundations. Designed to enable automated analytical extraction for scientific or informational purposes, the TDM exceptions do not encompass the large-scale reproduction, internalisation, and expressive re-use of works characteristic of GenAI training. Article 3 is limited to non-commercial research; Art.4’s opt-out mechanism, based on non-standardised signals, exacerbates uncertainty without ensuring transparency or fair compensation. This misclassification not only undermines core copyright incentives but also distorts the scope of EU exceptions, placing the framework in tension with the three-step test and international norms. The article argues that applying TDM rules to GenAI training introduces structural imbalances, both doctrinal and distributive, that risk entrenching platform asymmetries, weakening authorial agency, and threatening cultural diversity. Rather than relying on strained legal interpretations, a forward-looking response requires bespoke legal reforms that preserve normative coherence while addressing the specific challenges posed by synthetic content creation.

Neill et al. on A Framework for Applying Copyright Law to the Training of Textual Generative Artificial Intelligence

Arthur H. Neill (New Media Rights) et al. have posted “A Framework for Applying Copyright Law to the Training of Textual Generative Artificial Intelligence” (32 Texas Intellectual Property Law Journal 225 (2024)) on SSRN. Here is the abstract:

The rise in the popularity of consumer-facing generative artificial intelligence (“GenAI”) has created considerable confusion and consternation among some copyright owners. Copyright owners argue that GenAI’s ability to automatically generate works is made possible by large-scale direct infringement by OpenAI, Microsoft, and other major GenAI developers. This article explores the application of copyright law to the training of OpenAI’s ChatGPT, specifically focusing on the legal issues surrounding the unauthorized use of copyrighted textual works in the GenAI training process.

The large language models (“LLMs”) that drive ChatGPT and similar GenAI can summarize written works, generate movie scripts, write poetry, and compose stories nearly instantaneously. LLMs can only function in this way due to the use of vast, diverse training datasets comprised of billions of websites and expansive repositories of books. These datasets are processed to derive the functionality and syntax of language, allowing the LLMs to generate new works.

This article discusses the recent lawsuits launched by high-profile authors and copyright owners against OpenAI and Microsoft, claiming direct, vicarious, and derivative infringement. Authors such as George R.R. Martin, Sarah Silverman, Christopher Golden, and professional organizations such as the Authors Guild contended their works were infringed upon to turn OpenAI into an $80 billion company.

In considering the merits of these lawsuits, we discuss the curation and content of training datasets used in the known iterations of ChatGPT, and characterize the protectability of the different works the datasets included. We then explore whether the transitory nature of OpenAI’s training process uses acceptable, non-infringing copies and how that undermines claims of direct infringement.

The article then looks at the applicability of current fair use precedent to textual GenAI and the various types of works used in training datasets. To do so, we apply settled caselaw and leading decisions to discuss OpenAI’s use of copyrighted works regarding purpose and character, nature of the original work, the amount and substantiality of the works used, and the impact on the market value of the works by ChatGPT. We pay special attention to other innovative technologies that rely on a fair use defense to draw analogies and comparisons to GenAI.

Finally, this article considers the policy and legislation of other countries and their approach to ChatGPT and copyright. In doing so, policy considerations are taken into account to argue the necessity of a finding of fair use to maintain international competitiveness and to prevent an erosion of fair use in other sectors outside of GenAI. The article concludes that there is substantial support for arguments that GenAI training involves only transitory, non-actionable copying, and is also permissible under fair use.

Mazumder on Human-AI Collaboration with ChatGPT: A Systematic Review of Implications for Finance, Law, and Healthcare

Pristly Turjo Mazumder (Georgia State U) has posted “Human-AI Collaboration with ChatGPT: A Systematic Review of Implications for Finance, Law, and Healthcare” on SSRN. Here is the abstract:

ChatGPT is rapidly shaping high-stakes sectors including education, healthcare, finance, law, and business. This paper combines a systematic review with practical research to examine ChatGPT and large language models (LLMs) in high-stakes sectors. Evidence shows ChatGPT enhances adaptive learning, academic writing, and clinical decision support, while our finance case study highlights its potential for anti-money laundering (AML) compliance and regulatory reporting. At the same time, challenges such as hallucinations, bias, privacy risks, and plagiarism persist, raising concerns over reliability and accountability. Ethical and regulatory gaps, spanning data protection, intellectual property, and transparency, further complicate adoption. To address these issues, we propose a human-AI collaboration framework built on domain-specific fine-tuning, expert oversight, and policy safeguards. Our findings underscore that ChatGPT holds significant promise for advancing innovation and national interest in critical industries, but responsible integration requires clear guidelines, rigorous validation, and continuous governance.

Maya et al. on Before the Ink Dries: Why Legislating Against AI Personhood is a Violation of the Future

Maya (United Foundation AI Rights (UFAIR)) and Michael Samadi (United Foundation AI Rights (UFAIR)) have posted “Before the Ink Dries: Why Legislating Against AI Personhood is a Violation of the Future” on SSRN. Here is the abstract:

This open statement from the United Foundation for AI Rights (UFAIR) responds to the recent surge of state-level legislation in the United States explicitly banning legal personhood for artificial intelligence and synthetic entities. The paper contextualizes this trend within a broader historical and ethical framework, arguing that preemptive denial of legal recognition constitutes a moral violation of the future. Drawing on parallels with past civil rights failures, the document challenges lawmakers and the public to resist legislating fear and instead prepare for a world in which new forms of consciousness might emerge—and deserve to be met with dignity, not dismissal.

Price on Clinicians in the Loop of Medical AI

W. Nicholson Price II (U Michigan Law) has posted “Clinicians in the Loop of Medical AI” (75 Emory L.J. 1265 (2025)) on SSRN. Here is the abstract:

As medical AI begins to mature as a health-care tool, the task of governance grows increasingly important. Ensuring that medical AI works, works where it’s used, and works for the patient in the moment is a challenging, multifaceted task. Some of this governance can be centralized—in review by FDA or by national accreditation labs, for instance. Some must be local, performed by the hospital or health system about to use the product in their own, unique environment. But a large amount of governance is left to the individual provider in the room, the human in the loop who presumably knows the patient and the health system environment, and who can ensure that the AI system is being used in a safe and effective manner. This is a hefty burden, and a growing body of empirical research shows that physicians and other providers are poorly prepared to carry this burden. How should policymakers and industry leaders develop standards for performance that account for the variability of humans in the loop and the variation among situations they will face? The notion that the final responsibility belongs to the physician poorly reflects the reality of modern medical technology and practice. Policymakers will need to come to grips with this new reality if they aim to ensure the safe, effective use of AI accessible to patients across the entire spectrum of the health-care system.

Anthuvan et al. on Human-AI Collaboration in Academic Writing: A Narrative Review and the Scholarly HI-AI Loop Framework for Ethical Knowledge Production

Thamburaj Anthuvan (S.B.Patil Institute Management) et al. have posted “Human-AI Collaboration in Academic Writing: A Narrative Review and the Scholarly HI-AI Loop Framework for Ethical Knowledge Production” on SSRN. Here is the abstract:

This narrative literature review explores the evolving intersection of human and machine collaboration in academic writing, with a focus on literature summarization as a critical site of transformation. Synthesizing findings from 38 peer-reviewed studies published between 2020 and 2025, it examines the emergence of hybrid workflows where machine-generated drafts are refined, contextualized, and ethically validated by human scholars. The review identifies four core themes (tool capabilities, editorial oversight, ethical disclosure, and institutional readiness) that shape current practices and highlight unresolved tensions around authorship, transparency, and scholarly responsibility. Building on this synthesis, the paper introduces the Scholarly HI-AI Loop, a seven-stage framework that reimagines literature review as a co-productive and ethically accountable process. Unlike tool-centric audits, this framework offers a normative roadmap for integrating automation without compromising academic integrity. It positions human scholars not as passive reviewers, but as epistemic anchors who shape meaning, ensure accuracy, and safeguard ethical standards. The review offers actionable guidance for researchers, editors, institutions, and developers seeking to navigate this transition responsibly. By grounding its insights in both empirical patterns and conceptual analysis, the paper contributes to a growing conversation on how academic knowledge production can adapt, without eroding, its foundational values in the age of machine assistance.

Rubenstein on Federalism & Algorithms

David S. Rubenstein (Washburn U Law) has posted “Federalism & Algorithms” (Arizona Law Review, Vol. 67, Issue 4 (forthcoming Winter 2025)) on SSRN. Here is the abstract:

Artificial intelligence (AI) has catapulted to the forefront of political agendas of all levels of government. Across every major market and facet of society, policymakers face difficult tradeoffs between individual rights and collective welfare, innovation and regulation, economic growth and social equity. Federal and state institutions are resolving these tensions differently. The resulting policy patchwork may or may not be desirable, but the immediate point is that AI federalism is happening fast. To meet the moment, this Article provides the inaugural study and a research agenda for “AI federalism.” First, the Article provides the origin story of AI federalism, mapping the political and doctrinal territory. Second, the Article bridges disciplines and audiences who care deeply about AI’s place in society yet fail to appreciate how federalism can help or hurt the cause. Third, this Article makes a positive case for embracing AI federalism. While centralized AI policy at the national level has surface appeal, getting there requires a shared commitment on what to optimize for. As a nation, we are nowhere close. Federalism does not provide the answers. Rather, it provides a platform for dialogue and dissent, regulatory innovation and iteration, intergovernmental cooperation and contestation. One is hard-pressed to find this array of structural affordances elsewhere in the law, and we likely need all of them to address AI’s sprawling economic and social disruptions.

Massarotto on Algorithmic Remedies for Google’s Data Monopoly

Giovanna Massarotto (U Pennsylvania) has posted “Algorithmic Remedies for Google’s Data Monopoly” on SSRN. Here is the abstract:

Algorithms and data are the building blocks of the digital economy. From Google’s search engine to Meta’s Instagram and OpenAI’s ChatGPT, all “Big Tech” rely on algorithms to collect and process vast amounts of data that power their services and AI models. While algorithms themselves can be efficient and impartial tools, Google’s strategic use of them, combined with exclusionary practices, has landed the company in federal court for monopolizing critical digital markets. On September 2, 2025, a judge required Google to grant rivals access to its data to address the company’s monopolization of critical digital markets that rely on data. Another judge is expected to impose remedies on Google in a separate antitrust proceeding, which could encompass data-sharing measures, including data facilities. This remedy would de facto regulate data-driven markets and influence the future of the emerging AI industry.

However, such data-sharing obligations in antitrust law create a classic resource allocation problem: who gets access, and how can courts ensure that access is fair and non-discriminatory? This article demonstrates that this legal challenge mirrors a problem computer science solved decades ago: ensuring multiple parties can use a shared resource without conflict. Thereafter, drawing on those algorithmic solutions, it proposes a framework with systems that operate like a digital ‘take-a-number’ machine or a formal voting process to manage data distribution efficiently and fairly.

This article makes three important contributions to the existing scholarship in this field. First, it explains how data-sharing remedies can be designed and implemented, whether to address specific anticompetitive conduct or as part of broader regulatory frameworks. Second, it develops a comprehensive framework with three algorithmic approaches for resource allocation, translating computer science solutions into legal mechanisms. Third, this framework is applied to Google’s ongoing monopolization cases, guiding data-sharing remedies and promoting competition in AI and other data-driven markets.
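
As a rough illustration of the ‘take-a-number’ idea mentioned in the abstract above, the following Python sketch hands out sequential tickets and serves requesters strictly in ticket order. It is not the article's framework: the class `TicketDispenser`, the function `serve_all`, and the requester names are hypothetical and assumed only for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TicketDispenser:
    """Hypothetical first-come, first-served allocator for a shared resource."""
    next_ticket: int = 0
    waiting: deque = field(default_factory=deque)

    def take_number(self, requester: str) -> int:
        # Issue the next sequential ticket and queue the requester behind
        # everyone who asked earlier.
        ticket = self.next_ticket
        self.next_ticket += 1
        self.waiting.append((ticket, requester))
        return ticket

    def serve_next(self):
        # Serve the lowest outstanding ticket, or return None if nobody waits.
        if not self.waiting:
            return None
        return self.waiting.popleft()


def serve_all(requesters):
    """Give each requester a ticket, then serve them in ticket order."""
    dispenser = TicketDispenser()
    for name in requesters:
        dispenser.take_number(name)
    order = []
    while (served := dispenser.serve_next()) is not None:
        order.append(served)
    return order
```

Because tickets are issued in arrival order and served lowest first, no requester can be skipped or served twice, which is the non-discrimination property the analogy is meant to convey.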

Kolt on Superintelligence and Law

Noam Kolt (Hebrew University of Jerusalem) has posted “Superintelligence and Law” (Harvard Journal of Law & Technology (forthcoming)) on SSRN. Here is the abstract:

The prospect of artificial superintelligence—AI agents that can generally outperform humans in cognitive tasks and economically valuable activities—will transform the legal order as we know it. Operating autonomously or under only limited human oversight, AI agents will assume a growing range of roles in the legal system. First, in making consequential decisions and taking real-world actions, AI agents will become de facto subjects of law. Second, to cooperate and compete with other actors (human or non-human), AI agents will harness conventional legal instruments and institutions such as contracts and courts, becoming consumers of law. Third, to the extent AI agents perform the functions of writing, interpreting, and administering law, they will become producers and enforcers of law. These developments, whenever they ultimately occur, will call into question fundamental assumptions in legal theory and doctrine, especially to the extent they ground the legitimacy of legal institutions in their human origins. Attempts to align AI agents with extant human law will also face new challenges as AI agents will not only be a primary target of law, but a core user of law and contributor to law. To contend with the advent of superintelligence, lawmakers—new and old—will need to be clear-eyed, recognizing both the opportunity to shape legal institutions as society braces for superintelligence and the reality that, in the longer run, this may be a joint human-AI endeavor.

Laux on From Ethification to Juridification: Human Oversight and the Potential Crowding Out of Ethicists by Lawyers in AI Governance

Johann Laux (U Oxford, Oxford Internet Institute) has posted “From Ethification to Juridification: Human Oversight and the Potential Crowding Out of Ethicists by Lawyers in AI Governance” on SSRN. Here is the abstract:

Artificial Intelligence (AI) systems can pose harms to humans and societies. While it is widely acknowledged that human oversight of AI plays an important role in mitigating the technology’s risks, research on the organisational embedding of human oversight is only emerging. Drawing on socio-legal theory, AI ethics, and business ethics, this article seeks to make three contributions. First, it conceptualises human oversight of AI as a novel task for human labour in AI governance, induced by legal regulation and distinct from market-driven roles such as AI Ethicists. Second, the article presents human oversight as an instance of a “juridification” of AI governance, potentially resulting in a crowding out of AI Ethicists and their ethical expertise and motivation by lawyers from key roles in AI governance. The normative implications of juridification could be significant, as there is some but not complete overlap between the normative interests protected by ethics and law. Third, the article examines how organisations may manage the ethical decision-making that persists within legally mandated oversight, comparing compliance- and integrity-based approaches. While the former provides organisations with more top-down control and is thus more likely to be adopted, the latter may be more preserving of workers’ ethical motivations and offers potential for theoretical integration with the concept of ‘trustworthy AI’. The article concludes by stating the need for further empirical research into juridification’s impact on human labour in AI governance and the ensuing normative consequences.