The term “data minimization” generally refers to two requirements within the GDPR: (1) a company should only collect personal data that is “necessary” in relation to its purpose, and (2) a company should keep data for “no longer than is necessary for [that] purpose[].”[1] Put differently, a company should collect only what it needs, and keep it only for as long as it needs it.

Modern artificial intelligence models typically need data for training and fine-tuning. If that data includes personal information, then pursuant to the GDPR a controller should consider the minimum amount of personal information needed for training, and the shortest period for which the AI model needs access to that information. Many of the leading providers of AI services allow controllers to delete training data without affecting the fine-tuned model that the data helped produce.[2] That feature, however, must be enabled by the controller, based upon the controller’s determination of how long the training data should be kept within the environment.

[1] GDPR, Article 5(1)(c), (e). Note that under the GDPR the term “data minimization” is sometimes used to refer to minimizing the collection of information, while the term “storage limitation” is used to refer to minimizing the retention of information.

[2] See Data, privacy, and security of Azure OpenAI Service (last viewed June 2023)