AI how-to sheets
This content is a courtesy translation of the original publication in French. In the event of any inconsistencies between the French version and this English translation, please note that the French version shall prevail.
Introduction
What is the scope of the AI how-to sheets?
The CNIL provides concrete answers on creating the datasets used to train artificial intelligence (AI) systems when those datasets involve personal data.
Sheet 1
Determining the applicable legal regime
The CNIL can help you determine the legal regime applicable to the processing of personal data during the development phase.
Sheet 2
Defining a purpose
The CNIL can help you define the purpose(s), taking into account the specificities of developing AI systems.
Sheet 3
Determining the legal qualification of AI system providers
Data controller, joint controller or processor: the CNIL helps providers of AI systems determine their status.
Sheet 4 (1/2)
Ensuring the lawfulness of the data processing - Defining a legal basis
The CNIL helps you determine your obligations based on your responsibility and the means of collecting or reusing the data.
Sheet 4 (2/2)
Ensuring the lawfulness of the data processing - In case of re-use of data, carrying out the necessary additional tests and verifications
The CNIL helps you determine your obligations, depending on the means of collecting the data and its source.
Sheet 5
Carrying out a data protection impact assessment when necessary
Creating a dataset to train an AI system can lead to high risks to people’s rights and freedoms. In this case, a data protection impact assessment is mandatory. The CNIL explains how, and in which cases, it should be carried out.
Sheet 6
Taking into account data protection when designing the system
To ensure that the development of an AI system respects data protection, its design must be thought through in advance.
Sheet 7
Taking data protection into account in data collection and management
The CNIL details how data protection principles relate to training data management.
Sheet 8
Relying on the legal basis of legitimate interests to develop an AI system
Controllers will most commonly rely on their legitimate interests for the development of AI systems. However, this legal basis can only be used if its conditions are met and sufficient safeguards are implemented.
Focus (1/2)
The legal basis of legitimate interests: Focus sheet on open source models
In view of their potential benefits, open source practices should be considered when assessing the legitimate interests of an AI system provider. However, safeguards must be adopted to limit the harm these practices can cause to individuals.
Focus (2/2)
The legal basis of legitimate interests: Focus sheet on measures to implement in case of data collection by web scraping
The collection of data accessible online by web scraping must be accompanied by measures to guarantee the rights of data subjects.
Sheet 9
Informing data subjects
Organizations that process personal data to develop AI models or systems must inform the data subjects concerned. The CNIL specifies the obligations in this regard.
Sheet 10
Respecting and facilitating the exercise of data subjects’ rights
Individuals whose data is collected, used or reused to develop an AI system have rights over their data that allow them to maintain control over it. Controllers are responsible for respecting these rights and facilitating their exercise.
Sheet 11
Annotating data
The data annotation phase is crucial to ensuring the quality of the AI model. This can be achieved through a rigorous methodology that guarantees both the performance of the system and the protection of personal data.
Sheet 12
Ensuring the security of an AI system’s development
The security of AI systems is an issue too often overlooked by their designers. However, securing them remains an obligation, in order to guarantee data protection both during the development of the system and in anticipation of its deployment. This how-to sheet details the risks and the measures to be taken.