TL;DR:
- The EU Artificial Intelligence Act has been modified to address the rise of generative AI tools and incorporates transparency and disclosure requirements for large language models.
- Concerns have been raised about the use of publicly available data to train generative AI, potentially infringing on the original work of artists and creators without compensation.
- American AI companies operating in the European market need to be aware of and comply with the EU AI Act’s requirements, and should explore alternative methods of data collection.
- It is challenging to identify specific data segments within large and diverse datasets, making it difficult to comply with the EU AI Act’s documentation requirements.
- Non-compliance with the EU AI Act can result in significant fines, urging AI developers to establish mechanisms for documenting training data.
- The level of detail required in the summaries of training data remains unclear, leading to potential baseless lawsuits and the need for additional specific guidelines.
- Discrepancies between EU and US copyright laws create confusion for companies trying to comply with both sets of regulations.
- EU copyright protection is automatic and starts from the moment a work is created, while the US has a formal registration process and excludes AI-generated works from copyright protection.
- Lawsuits have already been initiated against companies for copyright violations, and the EU AI Act may lead to more frequent lawsuits with higher penalties.
- American AI companies should carefully analyze the implications of the EU AI Act and consider a flexible approach to regulation for generative AI models.
- The divergences between EU and US copyright laws create challenges for firms collaborating in both regions, and the latest amendments to the AI Act worsen this situation.
- AI firms need to prepare for the EU regulations, which could significantly affect their operations and reshape how they do business.
Main AI News:
The EU Artificial Intelligence Act, a draft of which was approved by European Parliament committees on May 11, has been amended in recent weeks to address the emergence and widespread adoption of generative AI tools like ChatGPT, DALL-E, Google Bard, and Stable Diffusion. The amendments primarily focus on incorporating transparency and disclosure requirements for large language models, which fall under a category of AI algorithms known as “general-purpose AI systems” that use extensive datasets and machine learning to comprehend and generate content.
One of the key concerns that prompted these revisions is the utilization of publicly available data scraped from the internet to train generative AI technologies such as ChatGPT and DALL-E. This practice has sparked worries among artists and creators who feel that their original work is being “stolen” by these large language systems without any corresponding compensation from the companies responsible for their development.
The fundamental objective of this regulation is to safeguard the rights of creators and copyright holders. However, it inadvertently poses challenges for American AI firms operating within the European market. As a result, US-based AI companies seeking to provide services to EU citizens must be aware of these requirements and explore more deliberate and cautious methods of data collection.
According to Article 28b(4)(c) of the EU AI Act, providers of generative AI systems must “document and make publicly available a sufficiently detailed summary of the use of training data protected under copyright law” while respecting national and Union legislation on copyright. The predicament is that it is practically impossible to identify every individual text segment or image in a training set. For instance, GPT-3’s training data was drawn from a colossal 45 terabytes of raw text, making it infeasible to trace specific data segments within such vast and diverse datasets.
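One way a provider might approach such a summary is to record provenance at ingestion time and aggregate it afterwards, rather than trying to reconstruct sources from a finished dataset. The sketch below is only an illustration under stated assumptions: the `TrainingRecord` fields, the license labels, and the `summarize_training_data` helper are hypothetical, and nothing in the Act prescribes this format or this level of detail.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingRecord:
    """Hypothetical provenance entry captured when a document is ingested."""
    source_domain: str
    license: str  # assumed labels, e.g. "CC-BY-4.0" or "unknown"
    size_bytes: int


def summarize_training_data(records):
    """Total the bytes per (source_domain, license) pair, largest first."""
    totals = Counter()
    for record in records:
        totals[(record.source_domain, record.license)] += record.size_bytes
    return [
        {"source_domain": domain, "license": lic, "size_bytes": size}
        for (domain, lic), size in totals.most_common()
    ]


if __name__ == "__main__":
    sample = [
        TrainingRecord("example.org", "CC-BY-4.0", 100),
        TrainingRecord("example.com", "unknown", 500),
        TrainingRecord("example.org", "CC-BY-4.0", 50),
    ]
    for row in summarize_training_data(sample):
        print(row)
```

A summary of this kind only works if provenance is logged as data is collected; it does not solve the problem of datasets that were scraped without any such records.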
In a Forbes interview conducted in September 2022, David Holz, the founder of Midjourney, acknowledged the challenges associated with obtaining a hundred million images and determining their sources. He emphasized the absence of a centralized registry for such information. If this law comes into effect, AI companies in the United States and other countries worldwide could face significant difficulties. Non-compliance carries the risk of fines of up to 30 million euros or 6 percent of annual turnover, whichever is greater. Consequently, it is crucial for AI developers to establish mechanisms for documenting training data to ensure compliance.
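To make the “whichever is greater” rule concrete, the short sketch below computes the maximum fine from the figures quoted above. The function name and its parameters are illustrative assumptions, and the final text of the Act may set different amounts. With these figures the percentage cap overtakes the flat cap once annual turnover exceeds 500 million euros, since 6 percent of 500 million is 30 million.

```python
def max_fine_eur(annual_turnover_eur, flat_cap_eur=30_000_000, turnover_percent=6):
    """Return the larger of the flat cap and the turnover-based cap.

    The defaults mirror the figures quoted in this article; they are
    assumptions for illustration, not the final legal amounts.
    """
    turnover_cap = annual_turnover_eur * turnover_percent / 100
    return max(flat_cap_eur, turnover_cap)


if __name__ == "__main__":
    for turnover in (100_000_000, 500_000_000, 1_000_000_000):
        print(turnover, max_fine_eur(turnover))
```

For a company with 100 million euros in turnover the flat cap applies, while for one with 1 billion euros the turnover-based cap is the binding one.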
Upon initial examination of this section of the act, it appears that copyright owners and original creators may have an opportunity for fair compensation for their original works. However, the broad language regarding the obligations imposed on AI companies makes it challenging to determine the level of detail required in their summaries.
Consequently, it remains unclear how creators can ascertain whether their work is being utilized in the training dataset. This ambiguity could potentially give rise to unnecessary and baseless lawsuits, particularly if compliant companies provide an overly inclusive “detailed summary of the use of training data.” To avoid such circumstances, it is imperative to introduce additional specific guidelines to this section of the act.
In addition to the concerns surrounding the EU AI Act, another significant issue arises from the disparities between copyright laws in the European Union and the United States. This disparity is likely to result in confusion and inconsistencies among companies striving to comply with both sets of regulations.
In the EU, copyright protection is granted automatically to individuals who create literary, scientific, and artistic works. There is no formal copyright registry, and the protection begins from the moment the work is created. As a result, companies developing large language model AI systems must exercise caution when utilizing content from EU creators to avoid potential copyright infringements.
On the other hand, the United States has a formal copyright registration process, and not all works are eligible for copyright protection. In guidance published by the US Copyright Office on March 16, 2023, it was clarified that material generated by AI without human authorship is not eligible for copyright protection. This distinction highlights the significant challenge of determining which types of content are genuinely protected under copyright law.
Already, various lawsuits have been initiated against companies like OpenAI, Microsoft, and GitHub for copyright violations. AI art tools such as Stable Diffusion and Midjourney have also faced copyright lawsuits recently. If the EU AI Act is enacted, these companies are likely to face even more frequent lawsuits with substantially higher penalties.
As the EU AI Act becomes a reality, American AI companies must carefully analyze the implications for their own operations. It remains to be seen whether the transparency requirements outlined in the EU AI Act will stimulate new innovations and efforts to achieve compliance. A more flexible approach to regulating generative AI models would be preferable for US businesses and the broader Western market. The divergences between EU and US copyright laws create an unfavorable landscape for firms attempting to collaborate in both regions, and the latest amendments to the AI Act exacerbate this dynamic.
AI firms must brace themselves for these impending EU regulations, which have the potential to reshape the very foundations of their existence.
Conclusion:
The EU Artificial Intelligence Act and its amendments have significant implications for the AI market, particularly for American AI companies operating in Europe. The incorporation of transparency and disclosure requirements for large language models and the protection of copyright holders’ rights aim to safeguard original works and ensure fair compensation.
However, the challenges surrounding data documentation, the discrepancies between EU and US copyright laws, and the potential for increased lawsuits and penalties create a complex and uncertain landscape for businesses. As a result, companies must carefully analyze the implications, adapt their data collection practices, and work through the regulatory requirements to ensure compliance and mitigate potential risks. A more flexible approach to regulation and harmonization between EU and US copyright laws would benefit both AI firms and the broader market, fostering collaboration and innovation while protecting creators’ rights.