Asst. Prof. Dr. Tuğba Güleş / İstanbul 29 Mayıs University, Faculty of Law, Department of Intellectual Property Law
To cite this blog post: Tuğba Güleş, Training Data Is the New Battleground: Why Getty v. Stability AI May Redesign Copyright’s Core Architecture, 29 Mayıs Hukuk Blogu, https://hukuk.29mayis.edu.tr/tr/blog/38799, 30 April 2026.
Introduction
If copyright law continues to treat access as an exception rather than a design principle, it risks becoming part of the innovation problem rather than its solution. Few disputes illustrate this tension more clearly than Getty Images v. Stability AI.[1] Far from being a conventional infringement case, this litigation raises a structural question: can copyright law remain conceptually stable in an era where creativity increasingly depends on machine learning infrastructures?[2]
At stake is not only whether the use of copyrighted works for training generative AI systems constitutes infringement, but whether foundational doctrines such as reproduction and derivative use remain functionally coherent.[3]
From Images to Infrastructure: The Facts Behind the Dispute
Getty Images alleges that Stability AI copied vast quantities of copyrighted images to train its generative model without authorization. The claim emphasizes large-scale dataset extraction as a form of unauthorized reproduction embedded within the technical architecture of the system.
More strikingly, it is alleged that certain outputs generated by the model contain Getty’s watermark.[4] This transforms the dispute from an abstract debate about training processes into a concrete case of potentially recognizable reproduction.
Yet focusing solely on copying obscures the deeper issue: whether training data should be conceptualized as protected content or as a functional input necessary for innovation.[5]
Reproduction Without Perception: A Doctrinal Tension
At the doctrinal level, the central question is whether machine learning training qualifies as “reproduction.” Traditional copyright theory presumes copying in a perceptible and communicative form. However, AI systems process works to extract statistical correlations rather than to reproduce expressive content.[6]
This raises a structural dilemma. If every act of data ingestion is treated as reproduction, copyright law risks expanding into a regime of data exclusivity.[7] Conversely, excluding such processes from the scope of protection may leave right holders without control over economically significant uses of their works.[8]
The concern becomes more pronounced in light of emerging empirical research indicating that machine learning models are not always purely generalizing systems. Under certain technical conditions, such as overfitting, insufficient regularization, or specific prompting strategies, these models may inadvertently retain and reproduce portions of the data on which they were trained. This phenomenon, often described as “memorization,” means that outputs may include recognizable fragments of original works. It strengthens the argument that AI systems can generate content that is not merely inspired by training data but, in some instances, directly derived from it, and it weakens the claim that training is purely transformative.
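To make the notion of memorization more concrete, the following is a minimal sketch of how one might screen a generated output for suspicious closeness to a training item. It is an illustration only, not a method drawn from the litigation or from any cited source. The function names, the toy three-dimensional vectors (standing in for image embeddings), and the 0.95 threshold are all assumptions chosen for the example, and a high similarity score is a technical signal, not a legal finding of reproduction.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def flag_possible_memorization(output_vec, training_vecs, threshold=0.95):
    """Return (flagged, index_of_closest_item, best_score).

    Hypothetical screening rule: an output is flagged when its closest
    training item is at least `threshold` similar. The threshold is an
    arbitrary illustrative value, not an established standard.
    """
    best_index, best_score = None, -1.0
    for i, vec in enumerate(training_vecs):
        score = cosine_similarity(output_vec, vec)
        if score > best_score:
            best_index, best_score = i, score
    return best_score >= threshold, best_index, best_score


# Toy data: each vector stands in for the embedding of one training image.
training = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.0]]
near_copy = [0.99, 0.01, 0.0]  # almost identical to training item 0
novel = [0.0, 0.0, 1.0]        # unlike any training item

print(flag_possible_memorization(near_copy, training))
print(flag_possible_memorization(novel, training))
```

In this toy setting the first output is flagged as a near-copy of a training item and the second is not. A real assessment would depend on how embeddings are computed and on where the threshold is set, which are precisely the kinds of choices a technical accountability standard would need to specify.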
A Regulatory Choice Disguised as Legal Interpretation
The dispute ultimately forces courts into a policy role. The question is not merely interpretative but distributive: how should the law allocate value between creators and technology developers?[9]
A strict infringement approach would require licensing at scale, potentially reinforcing market concentration by favoring large actors capable of negotiating extensive rights portfolios.[10] By contrast, a permissive approach risks eroding the economic foundations of copyright by allowing uncompensated extraction of value.[11]
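The European Union has attempted to mediate this tension through the introduction of text and data mining exceptions.[12] These provisions permit data mining for scientific research and, under the general exception in Article 4 of the Directive, for other purposes unless right holders have opted out by expressly reserving their rights, reflecting a compromise between access and control.[13]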
Implications for Turkish Law: Between Silence and Doctrinal Expansion
Turkish copyright law does not currently provide a clear framework for text and data mining. The Law on Intellectual and Artistic Works No. 5846 of Türkiye regulates reproduction and adaptation rights but does not explicitly address automated data processing.[14]
This creates interpretative pressure on courts. Expanding the scope of reproduction to include AI training could generate legal uncertainty and discourage technological development. At the same time, a restrictive interpretation may fail to protect right holders adequately.
Comparative developments suggest that legislative intervention, rather than judicial expansion, may provide a more coherent solution.[15]
Beyond Copyright: Trademark as a Functional Corrective
The presence of Getty’s watermark in generated outputs highlights an important point: copyright may not be the most effective tool for regulating AI outputs.
Trademark law, by contrast, is structurally better equipped to address issues of confusion and misappropriation. Under the Turkish Industrial Property Code, the unauthorized use of distinctive signs can trigger liability where it creates unfair advantage or consumer confusion.[16]
This suggests a broader doctrinal shift: as copyright struggles to accommodate technological complexity, trademark law may re-emerge as a more adaptable enforcement mechanism.[17]
Conclusion: Designing Copyright for an AI-Driven Economy
Getty v. Stability AI makes visible a structural shift in intellectual property law. The central question is no longer confined to the scope of exclusive rights, but extends to how legal systems organize and mediate access to the informational resources on which contemporary innovation depends. In an environment where creativity is increasingly cumulative and data-driven, copyright operates not only as a mechanism of protection but also as a framework of coordination between creators, intermediaries, and technology developers.
Approaching training data purely through the lens of proprietary control risks entrenching informational asymmetries and raising entry barriers for new innovators. At the same time, an unqualified openness model may dilute incentives for creative production by allowing large-scale value extraction without meaningful participation of right holders. The challenge, therefore, lies in moving beyond this binary and articulating a calibrated model of “structured access.”
Such a model would recognize that not all uses of protected works are normatively equivalent. A principled distinction can be drawn between (i) non-expressive, functional uses, such as large-scale data analysis and model training, and (ii) expressive or substitutive outputs that compete with or replicate protected works. While the former may justify broader access under carefully designed limitations or exceptions, the latter should remain subject to traditional infringement analysis. This differentiation allows copyright to accommodate technological processes without abandoning its core protective function.
From a policy perspective, this implies a shift toward ex ante governance mechanisms rather than purely ex post enforcement. Tools such as opt-out systems, transparency obligations regarding training datasets, and collective licensing frameworks could provide a more predictable and balanced ecosystem. Equally important is the development of technical accountability standards to distinguish genuine generalization from impermissible memorization.
For jurisdictions like Türkiye, where explicit regulatory provisions on data-driven uses remain limited, the case underscores the urgency of legislative clarification. Relying solely on the expansive interpretation of existing rights risks doctrinal instability and legal uncertainty. A forward-looking reform agenda, potentially incorporating tailored exceptions, safeguards for right holders, and innovation-friendly mechanisms, would better align the legal framework with the realities of artificial intelligence.
Ultimately, the future of copyright in the AI era will depend on its capacity to evolve from a system primarily concerned with exclusion to one capable of structuring access, allocating value, and sustaining creative ecosystems. The significance of Getty v. Stability AI lies precisely in this: it compels legal systems to confront the limits of inherited doctrines and to engage, more directly than ever before, with the design of an innovation-compatible intellectual property order.
The challenge for policymakers is therefore not to choose between protection and innovation, but to recalibrate the legal framework so that both can coexist. Whether this case becomes a turning point will depend on the willingness of legal systems to move beyond doctrinal formalism and engage with the structural realities of artificial intelligence.
References
European Commission, Impact Assessment on the Modernisation of EU Copyright Rules, SWD(2016) 301 final, 2016.
Directive (EU) 2019/790 of the European Parliament and of the Council, 17 April 2019.
FISHER, William W., The Growth of Intellectual Property: A History of the Ownership of Ideas in the United States, Harvard University Press, 2001.
GERVAIS, Daniel J., ‘The Machine as Author’, Iowa Law Review, Vol. 105, 2020.
Getty Images (US) Inc v Stability AI Ltd, High Court of Justice (UK), 2023.
GRIMMELMANN, James, ‘There’s No Such Thing as a Computer-Authored Work’, Columbia Journal of Law & the Arts, Vol. 39, 2016.
GUADAMUZ, Andres, ‘Artificial Intelligence and Copyright’, Current Legal Problems, Vol. 70, No. 4, 2017.
KUR, Annette, ‘Trademark Functions in European Union Law’, Max Planck Institute Research Paper, 2019.
LEMLEY, Mark A./CASEY, Bryan, ‘Fair Learning’, Texas Law Review, Vol. 99, 2021.
OECD, Intellectual Property and Artificial Intelligence, 2022.
ROSATI, Eleonora, Copyright in the Digital Single Market, Oxford University Press, 2021.
SAMUELSON, Pamela, ‘Copyright and Machine Learning’, Columbia Public Law Research Paper No. 14-716, 2023.
The Law on Intellectual and Artistic Works No. 5846 (FSEK).
Turkish Industrial Property Code No. 6769 (SMK).
UK Intellectual Property Office, Artificial Intelligence and Intellectual Property: Copyright and Patents, 2022.
WIPO, Revised Issues Paper on Intellectual Property Policy and Artificial Intelligence, 2020.
[1] Getty Images (US) Inc v Stability AI Ltd, High Court of Justice (UK), 2023, para. 1.
[2] UK Intellectual Property Office, Artificial Intelligence and Intellectual Property: Copyright and Patents, 2022, p. 12.
[3] Andres Guadamuz, ‘Artificial Intelligence and Copyright’, Current Legal Problems, Vol. 70, No. 4, 2017, p. 11.
[4] Getty Images (US) Inc v Stability AI Ltd, 2023, para. 45.
[5] Mark A. Lemley/Bryan Casey, ‘Fair Learning’, Texas Law Review, Vol. 99, 2021, p. 758.
[6] Pamela Samuelson, ‘Copyright and Machine Learning’, Columbia Public Law Research Paper No. 14-716, 2023, p. 7.
[7] James Grimmelmann, ‘There’s No Such Thing as a Computer-Authored Work’, Columbia Journal of Law & the Arts, Vol. 39, 2016, p. 414.
[8] OECD, Intellectual Property and Artificial Intelligence, 2022, p. 22.
[9] World Intellectual Property Organization (WIPO), Revised Issues Paper on Intellectual Property Policy and Artificial Intelligence, 2020, para. 33.
[10] Daniel J. Gervais, ‘The Machine as Author’, Iowa Law Review, Vol. 105, 2020, p. 2072.
[11] William W. Fisher III, The Growth of Intellectual Property: A History of the Ownership of Ideas in the United States, Harvard University Press, 2001, p. 6.
[12] Directive (EU) 2019/790 of the European Parliament and of the Council, 17 April 2019, Arts. 3 and 4.
[13] Eleonora Rosati, Copyright in the Digital Single Market, Oxford University Press, 2021, p. 147.
[14] The Law on Intellectual and Artistic Works No. 5846 (FSEK), Arts. 21 and 22.
[15] European Commission, Impact Assessment on the Modernisation of EU Copyright Rules, SWD(2016) 301 final, 2016, p. 104.
[16] Turkish Industrial Property Code No. 6769 (SMK), Art. 29.
[17] Annette Kur, ‘Trademark Functions in European Union Law’, Max Planck Institute Research Paper, 2019, p. 7.