Google and Its Confederate AI Platforms Want Retroactive Absolution For AI Training Wrapped in the American Flag

[This post is based on an excerpt from the Artist Rights Institute's submission to the National Science Foundation's Request for Information that I wrote. The full submission is linked below.]

It is crucial for policymakers to have a clear understanding of where we are today with respect to the collision between AI and artist rights, including copyright.  The corrosion of artist rights by the richest corporations in commercial history is not something that may happen in the future.  Massive infringement has already occurred,[1] is occurring this minute, and will continue to occur into the future at an increasing rate.  

Companies like Google would have you believe that some vague “balancing” should be adopted in the future,[2] but the reality, the truth, is that Google and its confederates have already done the balancing act and may well have both allocated markets and fixed prices (at zero in the case of works of copyright).  Google put its adjudicated monopolist’s[3] big thumb on that scale in its own favor, which is as surprising as gambling at Rick’s Café Américain.

Depending on how far back one focuses, massive infringement appears to have been part of the plan that started with Google Books now twenty years ago.  As the tech historian George Dyson observed in 2005 after a trip to the Googleplex during the Google Books digitization craze:

My visit to Google? Despite the whimsical furniture and other toys, I felt I was entering a 14th-century cathedral—not in the 14th century but in the 12th century, while it was being built. Everyone was busy carving one stone here and another stone there, with some invisible architect getting everything to fit. The mood was playful, yet there was a palpable reverence in the air.  “We are not scanning all those books to be read by people,” explained one of my hosts after my talk.  “We are scanning them to be read by an AI.”[4]

Google’s plan[5] with Google Books was likely no different than the company’s plan for AI—seek forgiveness through deception, not permission.  Or better yet, seek retroactive absolution by litigation or legislation.  

And just like the company did in 2005 with Google Books, they want you to believe that their massive infringement is not a crime, it’s about “innovation” versus “regulation,” except this time it’s all wrapped up in the American flag.  It’s not stealing, it’s the “AI gap.”  And what red-blooded American could be against stopping China in the AI race because China ignores copyright?  We should be just like them and protect the machines.

OpenAI’s filing[6] in this RFI also reveals this backwards thinking.  They tell the Foundation that the problem is not that China fails to respect human expression, it’s that China fails to protect AI training.  “Today, CCP-controlled China has a number of strategic advantages, including…[i]ts ability to benefit from copyright arbitrage being created by democratic nations that do not clearly protect AI training by statute, like the US, or that reduce the amount of training data through an opt-out regime for copyright holders, like the EU.”[7]  Yes, the “copyright arbitrage” is not that the US offers greater protection for human expression, it’s that the US fails to protect the machines enough.

And the cherry on top is that OpenAI misleads[8] the Foundation by citing to the EU’s highly controversial “opt-out” regime.[9] That regime is not long for this world and likely violates the Berne Convention’s prohibition on formalities for starters.[10]  Similar prohibitions are included in other international treaties to which the US, UK and EU are parties, which makes OpenAI’s misleading assertion even more odious.[11]  A separate opt-out regime drew thousands of comments in opposition in the UK IPO consultation on the Government’s highly controversial “Data (Use and Access) Bill”[12] and failed miserably in the House of Lords.

Both Google and OpenAI would wrap themselves in the American flag in their appeal to jingoistic imagery of Silicon Valley fighting the good fight for American innovation against the Chinese Communist Party.  Given the commercial history of Silicon Valley and the People’s Republic of China,[13] their buy-American bromides make for interesting reading if you can see past the oozing irony.  And the hypocrisy. 


What's Good for General Bullmoose is Good for the USA!

This commercial jingo is nothing new.  A highlight of the classic American musical Li’l Abner[14] is the song “[w]hat’s good for General Bullmoose is good for the USA.”[15]  The song uses the character “Bashington T. Bullmoose” to parody former General Motors president Charles Wilson, who told the Senate Armed Services Committee that “What is good for the country is good for General Motors.”[16]  The statements to the Foundation by OpenAI, Google and, we expect, many others that respecting copyright will create an “AI gap” and impede the U.S. in the “AI race” will go down in history with these risible statements by Bullmoose (and Wilson), unless the AI rewrites the history.  For now, let us savor the irony while we still can.

It must also be said that wherever the Foundation ends up on protection of artist rights for America’s AI Action Plan, the tech giants will likely view that position as a starting place for erosion, even if they get exactly what they want from the U.S. government.  This is certainly the position they have taken with other safe harbor abuse such as the “DMCA” safe harbor.[17]  As the Copyright Alliance testified to Congress in 2020:

The primary problem is that section 512 has been so misinterpreted by the courts [in litigation brought by copyright owners trying to use the DMCA for what they thought was its intended purpose] that service providers have little risk and need only do the absolute minimum required under the DMCA. All the while, copyright owners are being devastated by online infringement.[18]  

We fully expect the same treatment with AI under a new version of rules established by Mr. Schmidt’s cabal.  We are grateful to the Foundation for establishing the transparency necessary for human culture to survive.

[1] Cade Metz, Cecilia Kang, Sheera Frenkel, Stuart A. Thompson and Nico Grant, “How Tech Giants Cut Corners to Harvest Data for AI,” New York Times (April 8, 2024) available at https://www.nytimes.com/2024/04/06/technology/tech-giants-harvest-data-artificial-intelligence.html (“Google transcribed YouTube videos to harvest text for its A.I. models, five people with knowledge of the company’s practices said. That potentially violated the copyrights to the videos, which belong to their creators….Google said that its A.I. models “are trained on some YouTube content,” which was allowed under agreements with YouTube creators, and that the company did not use data from office apps outside of an experimental program.”)

[2] Google, Response to the National Science Foundation’s and Office of Science & Technology Policy’s Request for Information on the Development of an Artificial Intelligence (AI) Action Plan, Nat. Sci. Found. Docket No. NSF_FRDOC_0001 (Mar. 13, 2025) at 5, hereafter “Google filing.”

[3] United States v. Google LLC, No. 20-cv-3010, 2024 WL 3647498 (D.D.C. Aug. 5, 2024).

[4] George Dyson, Conversation: Technology (Oct. 23, 2005) available at https://www.edge.org/conversation/george_dyson-turings-cathedral 

[5] Arguably, the Google Books case (Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015)) did not address using the Google Books corpus for AI training as a permitted non-display use.  Maybe it should have.  Google Books played a significant role in Google's AI development, particularly in the early stages. By digitizing millions of books, Google created a vast dataset that was instrumental in training natural language processing (NLP) models. This initiative helped Google refine its search algorithms, improve language understanding, and develop tools like Google Translate.  Google's NLP training draws from a variety of sources, including publicly available datasets and proprietary data. For example, Google's BERT model was pre-trained using large text corpora like Wikipedia and other publicly available datasets.  Of course, Google’s use of Wikipedia in its AI likely violates the Creative Commons Attribution-ShareAlike 3.0 Unported License (CC BY-SA 3.0) used by Wikipedia. Google has contributed to Wikipedia and its related entities in various ways. For instance, Google has provided financial support to the Wikimedia Foundation, which operates Wikipedia. In 2010, Google donated $2 million to the foundation, and in 2019, it contributed an additional $3 million.  Moreover, Google and Wikimedia Enterprise began a partnership in 2021. This collaboration allows Google to access Wikimedia's content more efficiently for its services, such as search results and knowledge panel.  Wikipedia has never made a claim against Google for violating its terms of use and likely will never make such a claim.

[6] OpenAI, Response to the National Science Foundation’s and Office of Science & Technology Policy’s Request for Information on the Development of an Artificial Intelligence (AI) Action Plan, Nat. Sci. Found. Docket No. NSF_FRDOC_0001 (Mar. 13, 2025) available at https://cdn.openai.com/global-affairs/ostp-rfi/ec680b75-d539-4653-b297-8bcf6e5f7686/openai-response-ostp-nsf-rfi-notice-request-for-information-on-the-development-of-an-artificial-intelligence-ai-action-plan.pdf hereafter “OpenAI filing.”

[7] OpenAI filing at 4.

[8] Indeed, see Jennifer Rankin, EU accused of leaving ‘devastating’ copyright loophole in AI Act, The Guardian (Feb. 19, 2025) available at https://www.theguardian.com/technology/2025/feb/19/eu-accused-of-leaving-devastating-copyright-loophole-in-ai-act (“Axel Voss, a German centre-right member of the European parliament, who played a key role in writing the EU’s 2019 copyright directive, said that law was not conceived to deal with generative AI models: systems that can generate text, images or music with a simple text prompt.”) and see Paul Keller, AI and Copyrights: A Convergence of Opt-Outs, Open Future (Nov. 29, 2025) available at https://openfuture.eu/blog/ai-and-copyright-convergence-of-opt-outs/ (The article critiques the EU's opt-out regime for AI training, arguing it may hinder innovation and create practical challenges for implementation. It highlights concerns about balancing intellectual property rights with technological progress and questions the feasibility of enforcing machine-readable opt-outs effectively in a rapidly evolving AI landscape.)

[9] See, e.g., Wouter van Wengen and Radboud Ribbert, EU AI Act’s Opt-Out Trend May Limit Data Use for Training AI Models available at https://www.gtlaw.com/en/insights/2024/7/eu-ai-acts-opt-out-trend-may-limit-data-use-for-training-ai-models (The EU AI Act introduces an opt-out mechanism for copyright holders, allowing them to reserve their works from being used for AI training. This aligns with the EU's Text and Data Mining Directive, ensuring lawful access to data while balancing innovation and intellectual property rights. Full enforcement begins in 2024.); but see Voss supra n. 8 stating that the 2019 EU Copyright Directive was never intended to deal with generative AI.

[10] See Berne Convention for the Protection of Literary and Artistic Works art. 5(2), Sept. 28, 1979, S. Treaty Doc. No. 99-27. 

[11] See, e.g., Agreement on Trade-Related Aspects of Intellectual Property art. 9(1), Apr. 15, 1994, 1869 U.N.T.S. 299 [hereinafter TRIPS] (“Members shall comply with Articles 1 through 21 of the Berne Convention (1971) and the Appendix thereto.”); WIPO Copyright Treaty art. 1(4), Dec. 20, 1996, 2186 U.N.T.S. 121 (extending protection to computer programs and databases: “Contracting Parties shall comply with Articles 1 to 21 and the Appendix of the Berne Convention.”); WIPO Performances and Phonograms Treaty art. 20, Dec. 20, 1996, 2186 U.N.T.S. 203 (extending protection to sound recordings and certain performances: “The enjoyment and exercise of the rights provided for in this Treaty shall not be subject to any formality.”); see also Beijing Treaty on Audiovisual Performances art. 17, June 24, 2012, 51 I.L.M. 1214 (extending protection to audiovisual fixations of performances and certain unfixed performances: “The enjoyment and exercise of the rights provided for in this Treaty shall not be subject to any formality.”). 

[12] Available at https://bills.parliament.uk/bills/3825.

[13] See, e.g., Cheang Ming, Google is blocked in China, but that’s not stopping it from opening an A.I. center there, CNBC (Dec. 13, 2017) available at https://www.cnbc.com/2017/12/13/alphabets-google-opens-china-ai-centre.html.  Google opened an AI research center in Beijing in 2017, focusing on natural language processing and machine learning. This center aimed to tap into China's talent pool and contribute to Google’s AI advancements.  Microsoft has a significant footprint in China, including its AI and Research division. The company operates research labs in Beijing and Shanghai, which have contributed to advancements in computer vision, speech recognition, and natural language understanding. Microsoft's Azure cloud platform is also available in China through a partnership with 21Vianet, a local data center operator. This collaboration allows Microsoft to comply with Chinese regulations while providing AI and cloud services to local businesses.  IBM has been active in China for decades, with its Watson AI platform playing a key role in the company's PRC operations. NVIDIA has a strong presence in China. Its GPUs are widely used by Chinese tech companies for AI training and deployment. NVIDIA has also partnered with local firms to develop AI applications in areas such as autonomous driving and smart cities. Apple has integrated AI into its products and services, such as Siri and facial recognition technology. The company relies heavily on China for manufacturing and has invested in local R&D centers.

[14] Li’l Abner (1956), book by Norman Panama and Melvin Frank, based on the comic by Al Capp.

[15] Music by Gene de Paul and lyrics by Johnny Mercer.

[16] Charles Erwin Wilson, Confirmation of Charles Erwin Wilson as Secretary of Defense, Senate Armed Services Committee (Jan. 14, 1953).

[17] 17 U.S.C. §512. 

[18] Copyright Alliance CEO Keith Kupferschmid, Senate Judiciary Intellectual Property Subcommittee, The Role of Private Agreements and Existing Technology in Curbing Online Piracy (Dec. 15, 2020) available at https://files.constantcontact.com/d2e8d4e5501/cfec37f5-261e-4c6d-a19e-335a2d52e258.pdf

Artist Rights Institute Comment NSF RFI AI Action Plan v 2 (Download)
