Oguine & Badillo-Urquiola on Inference Is Not Consent: Privacy Risks from Training LLMs on Social Media and Web-Scraped Data

Ozioma C. Oguine (U Notre Dame) and Karla Badillo-Urquiola (U Central Florida) have posted “Inference Is Not Consent: Privacy Risks from Training LLMs on Social Media and Web-Scraped Data” on SSRN. Here is the abstract:

Large Language Models (LLMs) are increasingly embedded in tools used for education, creativity, productivity, and personal inquiry. Trained on vast web-scraped datasets, these models do not merely reproduce public content; they infer connections, identities, and attributes that individuals may never have disclosed. This paper introduces and centers the concept of inferential privacy in LLMs, arguing that privacy harms in the LLM era stem not just from memorization or data leakage, but from the automated synthesis of plausible, sensitive, or stigmatizing information. Drawing on research in data protection, HCI, AI ethics, and law, we examine how these harms disproportionately affect marginalized communities, including youth, activists, LGBTQ+ individuals, and people with disabilities. We critique the inadequacy of current regulatory frameworks such as GDPR and CCPA, which assume static data and explicit collection, and propose an expanded approach that treats inference as a distinct site of harm. We conclude with a roadmap for action, including inferential privacy audits, participatory red-teaming, context-aware model design, and regulatory innovation. This paper advocates for a shift in how we conceptualize privacy, away from control over data points and toward protection against algorithmic misrepresentation.

Park on Private Equity and A.I. in Healthcare: A Perilous Pairing for Patient Privacy

Eunice Park (Western State College of Law) has posted “Private Equity and A.I. in Healthcare: A Perilous Pairing for Patient Privacy” (53 Hofstra L. Rev. 349 (2025)) on SSRN. Here is the abstract:

The American healthcare system faces two trends that threaten not only the quality of care but also patient privacy: private equity acquisitions in the healthcare sector and the incursion of AI-supported technology. While law enforcement efforts have focused on private equity’s anticompetitive effects in healthcare, attention has not yet turned to the privacy harms. To mitigate the harms to patient privacy, this Article proposes expanding upon already-existing pre-merger reporting requirements to include enhanced transparency of private equity’s data governance plans when utilizing AI systems.

Fan on AI-Enhanced Evidence

Mary D. Fan (U Washington Law) has posted “AI-Enhanced Evidence” on SSRN. Here is the abstract:

Technological transformations in how we live our lives through the lenses of cell phone cameras, surveillance videos, and other multimedia are producing vast volumes of evidence that can be easily digitally enhanced. Courts have long admitted technologically enhanced evidence under flexible rules on authentication that pose a low bar to getting before a jury. In an era of concern over artificial intelligence (AI), however, potential judicial resistance and reform proposals are emerging, spurred by concerns over how generative AI can create deepfakes or misleadingly alter evidence. The problem with piecemeal approaches that ratchet up barriers to enhanced evidence is that they may come at the expense of parties who are least able to bear the cost and undermine the right to present a defense. 

This article analyzes how to address the challenges of AI-enhanced evidence through the theoretical and pragmatic lenses of inequality of arms and access to justice. We ignore the impact of changes to the admissibility of key forms of proof, such as audiovisual evidence, on parties lacking resources at the peril of exacerbating long-burning challenges. The requirements to introduce AI-enhanced evidence can either alleviate or aggravate the inequality of arms between parties. The article offers proposals for improving notice, disclosure, and fair context for AI-enhanced evidence to safeguard reliability without further exacerbating the inequality of arms and access to justice. The article also turns to judicial standing orders as a strategy to enact reforms without having to wait years or even decades for changes to evidence rules.

Caputo on ‘Quiet’ Enjoyment: Uncovering the Hidden History of the Right to Attention in Private and Public

Nicholas A. Caputo (Oxford Martin) has posted “‘Quiet’ Enjoyment: Uncovering the Hidden History of the Right to Attention in Private and Public” (Stanford Technology Law Review (forthcoming 2025)) on SSRN. Here is the abstract:

Legal scholars have largely neglected attention as a subject of legal rights, even as attention has become one of the most valuable economic resources of the modern era. This Article argues that a right to attention has existed implicitly in American law since the early twentieth century, emerging in response to technological, social, and economic changes in that period that made attention both increasingly valuable and increasingly impinged upon, as America shifted toward knowledge work and leisure activities that demanded sustained focus. By examining court decisions in private law doctrines around property and public law doctrines around speech that can only be explained by reference to an implicit right to attention, this Article begins to uncover the ways in which judges and lawmakers built out a set of legal protections that enabled people to invoke the law to protect their own attention while avoiding stifling the sometimes-disruptive conduct of others. In particular, I show that in private law, courts began recognizing “attentional nuisances,” nontrespassory invasions of land that caused no physical harm but only attentional harm, thereby creating a framework for protecting a person’s attention on her own land. In public spaces, the new right to attention came into conflict with also-emerging free speech rights, which seem to require the ability to attract the attention of others in order to express oneself to them. There, the Supreme Court sought a balance through the development of frameworks like time, place, or manner doctrine, which allowed governments to try to regulate attention-grabbing stimuli without directly regulating speech, and through the uneven development of listeners’ rights. In closing, I argue that the right to attention developed in the early twentieth century provides a foundation upon which a modern right to attention addressed to the attention economy could be developed, one that is both rooted in the experience of the past and capable of meeting the novel challenges presented by digital technology and the rise of artificial intelligence, which promise another epochal technological revolution like that which gave rise to the right a century ago. Drawing out the right to attention buried in the caselaw gives scholars, lawmakers, and the public a set of tools that they can use to decide how to adapt it to the demands of the present. The future of attention relies upon the lessons of its past, and recognizing explicitly the so-far hidden right to attention provides better ways of shaping its future.

Solove on On Privacy and Technology (excerpt)

Daniel J. Solove (George Washington U Law) has posted “On Privacy and Technology (excerpt)” (Daniel J. Solove, ON PRIVACY AND TECHNOLOGY (Oxford University Press 2025)) on SSRN. Here is the abstract:

This is an excerpt from Daniel J. Solove’s book, ON PRIVACY AND TECHNOLOGY (Oxford University Press 2025). The book discusses the profound changes technology is wreaking upon privacy, why these changes matter, and what can be done about them.

Drawing from a quarter-century of thinking, teaching, and writing about privacy law, Solove explores important questions: Can privacy law keep up with rapidly changing digital technologies? Is it possible to protect privacy in the age of AI?

Solove argues that the answers to these questions are yes, but only if privacy law takes a radical new direction.  Weaving together philosophy, literature, and the humanities with concrete practical knowledge, ON PRIVACY AND TECHNOLOGY tackles issues such as control, manipulation, automation, consent, and reputation, offering a new and provocative perspective on privacy.

In this excerpt, Solove explores the meaning and importance of privacy, including the societal value of privacy. He debunks several myths that stand in the way of effective regulation, such as (1) the myth of the privacy paradox, (2) the myth of technology exceptionalism, (3) the myth that regulation stifles innovation, (4) the myth that changes in degree don’t matter, (5) the myth that the law is an interloper, and (6) the myth of technology neutrality.

Solove argues that the law often not only fails to address the technological changes affecting privacy adequately but also contributes to the problems. Many of the problems stem not directly from technology but from prevailing views and myths about technology and privacy.

Solove on Privacy in Authoritarian Times: Surveillance Capitalism and Government Surveillance

Daniel J. Solove (George Washington U Law) has posted “Privacy in Authoritarian Times: Surveillance Capitalism and Government Surveillance” on SSRN. Here is the abstract:

As the United States and much of the world face a resurgence of authoritarianism, the critical importance of privacy cannot be overstated. Privacy serves as a fundamental safeguard against the overreach of authoritarian governments.

Authoritarian power is greatly enhanced in today’s era of pervasive surveillance and relentless data collection. We are living in the age of “surveillance capitalism.” There are vast digital dossiers about every person assembled by thousands of corporations and readily available for the government to access.

In the coming years, both the federal government and some state governments may intensify surveillance and data collection efforts, targeting immigrants, punishing those involved in seeking or providing abortion services, and cracking down on gender-affirming healthcare. Personal data could also be weaponized against critics and others who resist these efforts. These campaigns may be bolstered by vigilante groups, using personal data to dox, threaten, and harm individuals they oppose—echoing historical instances where ordinary citizens actively aided totalitarian regimes in identifying and punishing dissenters or perceived “undesirables.”

In this Article, I contend that privacy protections must be significantly heightened to respond to growing threats of authoritarianism. Major regulatory interventions are necessary to prevent government surveillance from being used in inimical ways. But reforming Fourth Amendment jurisprudence and government surveillance alone will not protect against many authoritarian invasions of privacy, especially given the oligarchical character of the current strain of authoritarianism.

To adequately regulate government surveillance, it is essential to also regulate surveillance capitalism. Government surveillance and surveillance capitalism are two sides of the same coin. It is impossible to protect privacy from authoritarianism without addressing consumer privacy.

This Article proposes regulatory measures that should be taken to address government surveillance and surveillance capitalism – on both sides of the coin – to guard against authoritarianism. Federal lower court judges have some leeway to strengthen Fourth Amendment and other constitutional protections as well as consumer privacy protections. State court judges can interpret their states’ constitutions in ways that diverge from U.S. Supreme Court interpretations. State legislators can enact a wide array of measures to limit government surveillance by their states and others, as well as to rein in surveillance capitalism, minimize the data available to authoritarian regimes, regulate data brokers, incentivize the creation of less privacy-invasive surveillance technologies, and curtail the increasing government-industrial collusion. There is no silver bullet, but these measures across the entire landscape of privacy law can make a meaningful difference.

Li & Sang on Assessing the Legality of Privacy Policies in Chinese Mobile Applications: A Textual Analysis Approach

Juan Li (Central South U) and Jinze Sang (China U of Political Science and Law) have posted “Assessing the Legality of Privacy Policies in Chinese Mobile Applications: A Textual Analysis Approach” (J Law Ethics & Tech 2024). Here is the abstract:

Chinese regulators have intensified their supervision over personal information collection and utilization, following the promulgation of the Personal Information Protection Law (PIPL) and other data protection regulations in November 2021. Whilst regulators primarily rely on data flows to identify suspicious activities in data collection and transfer during and after the use of apps, there is still a lack of protection measures to prevent privacy breaches. Privacy policies are important for enhancing the prevention and protection of the privacy rights and interests of app users, as well as for fostering self-regulation among operators. However, privacy policies are often lengthy and replete with vague expressions, making them difficult for users to read and thus inadequate for safeguarding users’ information rights. It is crucial to address the power imbalance between operators and users and to assess the legality of app privacy policies formulated by operators. This paper introduces a methodology for evaluating the legality of privacy policies based on legal knowledge. First, we collected policy texts and constructed the Children’s Privacy Policy Corpus (CCPP-181), which covers a variety of privacy policies for children’s apps. Then, we proposed a legality evaluation method grounded in regulatory standards and applied it to annotate the CCPP-181 corpus. After three rounds of annotation, 20.2% of the 1160 sentences in the corpus were identified as having legality problems. Based on this legal text analysis method, the paper analyzes the legality issues in app privacy policies in order to reduce the inequality embedded in app privacy policies and protect user data security.
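
The abstract describes the legality-evaluation pipeline only at a high level: collect policy texts, annotate sentences against regulatory standards, and report the share flagged. As a loose illustration of that general shape – not the authors’ actual method – here is a minimal sketch in Python; the rule names, patterns, and sample sentences are all invented:

    import re

    # Hypothetical checklist of PIPL-style requirements. Each rule flags
    # sentences whose wording suggests a potential legality problem. The
    # rule names, patterns, and categories are invented for illustration.
    RULES = {
        "vague_purpose": re.compile(r"\bas needed\b", re.I),
        "open_ended_sharing": re.compile(r"\bshare .* with (third parties|partners)\b", re.I),
        "no_retention_limit": re.compile(r"\bretain .* indefinitely\b", re.I),
    }

    def annotate(sentences):
        """Pair each flagged sentence with the names of the rules it triggers."""
        flagged = []
        for s in sentences:
            hits = [name for name, pat in RULES.items() if pat.search(s)]
            if hits:
                flagged.append((s, hits))
        return flagged

    # Toy stand-ins for privacy-policy sentences; a real corpus like
    # CCPP-181 would hold sentences segmented from actual policies.
    corpus = [
        "We collect your child's location data as needed.",
        "We may share usage data with third parties.",
        "We retain account records indefinitely.",
        "You may request deletion of your data at any time.",
    ]

    flagged = annotate(corpus)
    for sentence, hits in flagged:
        print(", ".join(hits) + ": " + sentence)
    print(f"{len(flagged)}/{len(corpus)} sentences flagged "
          f"({100 * len(flagged) / len(corpus):.1f}%)")

A real legality assessment would, as the paper describes, rest on trained annotators applying regulatory standards across multiple rounds rather than keyword patterns; the sketch only mirrors the sentence-level flagging and percentage reporting.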

Conejos on Social Media Should be a Scrape Free Zone from AIFS

Rafael Conejos (George Washington U Law) has posted “Social Media Should be a Scrape Free Zone from AIFS” on SSRN. Here is the abstract:

Artificial Intelligence (AI) is used to screen job applicants in the United States (U.S.) every day by the thousands, yet neither the applicants nor, in some cases, the employers who use it know precisely how these systems work or why they arrive at their results. This is due to algorithms being trade secrets and the black box dilemma of AI. If rejection is based on factors which amount to unlawful discrimination, how does an applicant prove it if the process is a secret?

In a highly competitive job environment, where there are more applicants than openings, employers rely on the predictions made by an Artificial Intelligence Filtering System (AIFS) to determine, based on limited data, whether one applicant will underperform or outperform another. An AIFS achieves this through training data that instructs it to seek out desirable qualities in an applicant from the information the applicant has provided or that the AIFS finds publicly available. The more information it has about an applicant, the more accurate the prediction. Like an officer serving a general warrant in a home, an AIFS scrapes all publicly available information about the applicant, including their social media accounts, on the premise that more information, regardless of its relevance to the role, is helpful when it ‘scores’ one applicant against another.

However, not all social media accounts are intended by the applicant to be viewed through the lens of an employer’s AIFS. Often, social media, outside of LinkedIn, is a place of self-expression and public advocacy. There are many private matters in one’s life which an employer has no business using to assess one’s fitness for a role. Allowing an AIFS to strip away a person’s practical obscurity by scraping years’ worth of data from their social media accounts is not only an invasion of privacy; it also chills freedom of expression.

AI’s ability to harness and process vast amounts of data from social media allows it to predict facts an applicant may have intentionally left out, such as age, race, religion, or political affiliation. Having acquired these facts, an employer can achieve systematic job discrimination in the thousands while hiding behind the “neutrality” of AI. It can use proxy factors, framed as poor ‘cultural fit,’ as lawful excuses not to hire an applicant. Most of all, employers can evade litigation simply because of the high burden a plaintiff bears in proving that discrimination was the ‘but for’ factor that cost him the job.

When it comes to the accessibility of online data, such as public social media accounts, the U.S. legal system struggles with a binary concept of privacy, treating personal data as either private or public but never anything in between. In the European Union (EU), by contrast, the collection and processing of personal data is centered on consent and relevance to a specific purpose, not on the manner in which the data was made available.

Solove & Hartzog on Kafka in the Age of AI and the Futility of Privacy as Control

Daniel J. Solove (George Washington U Law) and Woodrow Hartzog (Boston U Law) have posted “Kafka in the Age of AI and the Futility of Privacy as Control” (104 Boston University Law Review 1021 (2024)) on SSRN. Here is the abstract:

Although writing more than a century ago, Franz Kafka captured the core problem of digital technologies – how individuals are rendered powerless and vulnerable. During the past fifty years, and especially in the 21st century, privacy laws have been sprouting up around the world. These laws are often based heavily on an Individual Control Model that aims to empower individuals with rights to help them control the collection, use, and disclosure of their data.

In this Essay, we argue that although Kafka starkly shows us the plight of the disempowered individual, his work also paradoxically suggests that empowering the individual isn’t the answer to protecting privacy, especially in the age of artificial intelligence. In Kafka’s world, characters readily submit to authority, even when they aren’t forced and even when doing so leads to injury or death. The victims are blamed, and they even blame themselves.

Although Kafka’s view of human nature is exaggerated for darkly comedic effect, it nevertheless captures many truths that privacy law must reckon with. Even if dark patterns and dirty manipulative practices are cleaned up, people will still make bad decisions about privacy. Despite warnings, people will embrace the technologies that hurt them. When given control over their data, people will give it right back. And when people’s data is used in unexpected and harmful ways, people will often blame themselves.

Kafka’s work provides key insights for regulating privacy in the age of AI. The law can’t empower individuals when it is the system that renders them powerless. Ultimately, privacy law’s primary goal should not be to give individuals control over their data. Instead, the law should focus on ensuring a societal structure that brings the collection, use, and disclosure of personal data under control.

Tokson on Government Purchases of Private Data

Matthew Tokson (Utah Law) has posted “Government Purchases of Private Data” (Wake Forest Law Review, Forthcoming) on SSRN. Here is the abstract:

The United States lacks a comprehensive data privacy statute, and most states impose only minimal legal constraints on consumer data collection. This regulatory vacuum has given rise to commercial markets in sensitive private data. In recent years, federal agencies and local police departments have begun to purchase this data from specialized brokers in order to track individuals’ activities over time. Much of this data, collected by cellphone apps and internet servers, is likely constitutionally protected. But government attorneys have mostly concluded that purchasing data is a valid way of bypassing the Constitution’s restrictions.

This Article addresses the increasingly prominent issue of government purchases of private data, and examines broader issues of privacy protection in an era of commercial markets in personal information. The Article questions the widespread assumption that the Fourth Amendment can never apply to commercial purchases. Police officers can generally purchase an item available to the public without constitutional restriction. But a closer examination of data markets demonstrates that sensitive cellphone data is not publicly available or exposed. Rather, the vendors who sell such data do so either exclusively to law enforcement agencies or in large, anonymized chunks to other marketing companies. Because sensitive cellphone data remains functionally private, a government purchase of such data violates the Fourth Amendment.

The Article then challenges the idea that consumers waive their rights in their cellphone data when they use apps or other services. The explanations customers see when an app asks for permission to access their data are often insufficient or misleading, and they typically say nothing about personal data being sold to other parties. Further, penalizing users for disclosing their data to service providers creates harmful incentives and is incompatible with meaningful Fourth Amendment protection in the digital age.

The Article sits at the intersection of consumer privacy and Fourth Amendment law, as poorly regulated markets in personal data and flawed concepts of consumer consent now threaten to erode fundamental constitutional rights. The Article draws broader lessons about the inadequacy of consumer privacy law in the United States. It examines the potential for private surveillance to become government surveillance, via technical and legal interoperability. And it assesses a variety of possible solutions through which legal actors can prevent commercial markets in private data from undermining Fourth Amendment rights.