
Data is typically needed to train and fine-tune modern artificial intelligence models. AI can use data – including personal information – to recognize patterns and predict results.

Companies that use personal information to train an AI may act as either a controller or a processor, depending on the degree of discretion they exercise in deciding how the AI will function, the type of personal information that will be used to train it, how the AI will be allowed to process the training data, and the conditions under which the AI will be allowed to retain or share the training data.

If a company is considered a controller, it must satisfy the following requirements under the GDPR with respect to training data:

1. Lawful basis of processing (Art. 6). Controllers are required to identify one of six lawful bases of processing.[1] Some supervisory authorities have suggested that if a company uses publicly sourced data to train an AI (e.g., data scraped from the internet), the only plausible lawful bases would be either (1) the consent of the individuals whose personal information is being provided or (2) the legitimate interest of the controller.[2]

2. Record of processing activities (Art. 30(1)). Controllers are required to record within their records of processing activities, among other things, the type of personal information that was used to train an AI, the individuals to whom the personal information relates, the purpose for which the data was used, and any restrictions imposed upon the AI’s use or retention of such data.

3. Data minimization (Art. 5(1)(c), (e)). Controllers are required to minimize the extent to which personal information is used, and the duration for which it is kept in identifiable form. In the context of training an AI, the controller should consider how to minimize the type and amount of data provided to train the AI, as well as the length of time for which the AI will have access to such data.

4. Privacy notice (Art. 12 – 14). Controllers are required to provide individuals with information about how their personal information is processed.[3] Some supervisory authorities have specifically taken the position that companies that use personal information to train an AI must draft and publish a privacy notice that provides “data subjects whose data have been collected and processed for the purposes of training algorithms . . . with information on how the processing is carried out, the logic underlying the processing . . . , [and] the rights to which they are entitled.”[4] If a controller is using publicly sourced data (e.g., data scraped from the internet), some supervisory authorities have suggested that it may be appropriate for controllers to inform the public via mass media (e.g., radio, television, newspapers) about the scraping and how they can find the company’s privacy notice.[5]

5. Access rights (Art. 15). Controllers are required to permit individuals to access any personal information held about them. In the context of training an AI, a controller should be prepared to respond to individuals’ requests for access to the personal information about them that may have been involved in the AI training.

6. Correction rights (Art. 16). Controllers are required to permit individuals to request that inaccurate information be corrected. In the context of training an AI, some supervisory authorities have taken the position that companies that use publicly sourced data (e.g., data scraped from the internet) should create an online tool “by which to request and obtain rectification of any personal data relating to them,” both for data used to train an AI and for any data created by the AI.[6]

7. Erasure rights (Art. 17). Controllers are required to permit individuals to request that personal information about them be deleted if processing is no longer necessary in relation to the purposes for which it was collected. In the context of training an AI, if a controller receives a deletion request, it should consider whether personal information from the requester can be deleted from the training set.[7]

8. Right to withdraw consent / object (Art. 7(3), 21). If a controller has based its use of training data on the consent of individuals, the GDPR requires that it provide individuals the ability to withdraw consent. Similarly, if a controller has based its use of training data on its legitimate interest, the GDPR requires that the controller provide an ability for individuals to object to the continued use of their data.[8]

9. Data protection impact assessments (Art. 35). The GDPR requires that controllers conduct data protection impact assessments (DPIAs) if they are using new technologies that are “likely to result in a high risk” to individuals. As a result, a controller should consider whether it is appropriate to conduct a DPIA in connection with using personal information to train an AI.

10. Cross-border data transfers (Art. 44 – 50). To the extent that personal information will be sent to an AI that is hosted outside of the European Economic Area, a controller may need to take steps to ensure that such data is adequately protected in the jurisdiction to which it is sent.

11. Vendor management (Art. 28). To the extent that a controller will rely on a third party to host personal information used to train an AI (e.g., a third-party hosted AI product), the GDPR may require that the third party agree to the specific contract provisions required of processors.

[1] EDPB-EDPS Joint Opinion 5/2021 on the proposal for a Regulation of the European Parliament and of the Council laying down harmonized rules on artificial intelligence (Artificial Intelligence Act) at para. 60 (June 18, 2021).

[2] Garante Per La Protezione Dei Dati Personali, Provision of April 11, 2023[9874702] (English translation).

[3] EDPB-EDPS Joint Opinion 5/2021 on the proposal for a Regulation of the European Parliament and of the Council laying down harmonized rules on artificial intelligence (Artificial Intelligence Act) at para. 60 (June 18, 2021) (stating that data subjects should be informed when their data is used for AI training).

[4] Garante Per La Protezione Dei Dati Personali, Provision of April 11, 2023[9874702] (English translation).

[5] Garante Per La Protezione Dei Dati Personali, Provision of April 11, 2023[9874702] (English translation).

[6] Garante Per La Protezione Dei Dati Personali, Provision of April 11, 2023 [9874702] (English translation).

[7] See EDPB-EDPS Joint Opinion 5/2021 on the proposal for a Regulation of the European Parliament and of the Council laying down harmonized rules on artificial intelligence (Artificial Intelligence Act) at para. 60 (June 18, 2021) (stating that data subjects have a right to deletion / erasure in connection with their personal data being used to train an AI).

[8] See EDPB-EDPS Joint Opinion 5/2021 on the proposal for a Regulation of the European Parliament and of the Council laying down harmonized rules on artificial intelligence (Artificial Intelligence Act) at para. 60 (June 18, 2021) (stating that data subjects have a right to restriction in connection with their personal data being used to train an AI).

David A. Zetoony

David Zetoony, Co-Chair of the firm’s U.S. Data, Privacy and Cybersecurity Practice, focuses on helping businesses navigate data privacy and cyber security laws from a practical standpoint. David has helped hundreds of companies establish and maintain ongoing privacy and security programs, and he has defended corporate privacy and security practices in investigations initiated by the Federal Trade Commission, and other data privacy and security regulatory agencies around the world, as well as in class action litigation.

Carsten A. Kociok

Carsten Kociok is a partner in the Technology, Financial Services and Data Privacy Practice in Berlin and Co-Head of Greenberg Traurig’s global Fintech Group. He advises national and international clients across all industries, including financial services, information technology, artificial intelligence, ecommerce, media, health care, telecoms, retail and real estate, on a wide variety of complex commercial and regulatory matters.

Carsten is a leading technology lawyer, ranked consistently in Band 1 for Fintech Legal in Germany since 2020. He has in-depth and wide-ranging experience in the areas of privacy and cybersecurity, payments law, financial services, e-money products, blockchain technology, and financial and banking regulation, as well as in artificial intelligence regulation – including compliance with the EU AI Act – and the integration of AI technologies into existing software systems.

Carsten regularly assists clients in licensing projects and audit proceedings with financial regulators and advises on the contractual and regulatory aspects of developing, implementing and operating financial technology products and transactions.

On the data privacy side, Carsten counsels clients on complex data-driven business models and regulatory matters, including on international data transfers, data privacy compliance, monetization of data, artificial intelligence, litigation, cybersecurity and data breach response.

Carsten regularly lectures and publishes on various FinTech and data privacy topics. Prior to joining the firm, Carsten worked at Olswang Germany for eight years and in the Capital Transaction Practice Group of an international law firm in New York.