Artificial intelligence: the action plan of the CNIL

16 May 2023

Following recent developments in artificial intelligence, and in particular so-called generative AI such as ChatGPT, the CNIL is publishing an action plan for the deployment of AI systems that respect the privacy of individuals.

Key points:

The CNIL has been undertaking work for several years to anticipate and respond to the issues raised by AI.
In 2023, it will extend its action on augmented cameras and wishes to expand its work to generative AIs, large language models and derived applications (especially chatbots).
Its action plan is structured around four strands:
- understanding the functioning of AI systems and their impact on people;
- enabling and guiding the development of privacy-friendly AI;
- federating and supporting innovative players in the AI ecosystem in France and Europe;
- auditing and controlling AI systems and protecting people.
This work will also make it possible to prepare for the entry into application of the draft European AI Regulation, which is currently under discussion.

The protection of personal data, a fundamental challenge in the development of AI

The development of AI is accompanied by challenges in the field of data protection and individual freedoms that the CNIL has been working to address for several years now. Since the publication in 2017 of its report on the ethical challenges of algorithms and artificial intelligence, the CNIL has repeatedly taken positions on the issues raised by the new tools that this technology has brought about.

In particular, generative artificial intelligence (see box below) has been developing rapidly for several months, whether in text and conversation, via large language models (LLMs) such as GPT-3, BLOOM or Megatron NLG and the chatbots derived from them (ChatGPT or Bard), in imaging (Dall-E, Midjourney, Stable Diffusion, etc.) or in speech (Vall-E).

These foundation models and the technological building blocks that rely on them already seem to have many applications in a variety of sectors. Nevertheless, the understanding of their functioning, their possibilities and their limitations, as well as the legal, ethical and technical issues surrounding their development and use, remain largely under debate.

Considering that the protection of personal data is a major challenge for the design and use of these tools, the CNIL is publishing its action plan on artificial intelligence, which aims, among other things, to provide a framework for the development of generative AI.

What is generative AI?

Generative artificial intelligence is a system capable of creating text, images or other content (music, video, voice, etc.) from a human user’s instruction. These systems can produce new content from training data. Their output is now close to some human productions because of the large amount of data used for their training. However, these systems require users to specify their queries clearly in order to achieve the expected results. Genuine expertise is therefore developing around the composition of user queries (prompt engineering).

For example, the image below, entitled ‘Space Opera Theatre’, was generated by user Jason M. Allen using the Midjourney tool on the basis of a textual instruction describing his expectations (theatrical decor, togas, pictorial inspirations, etc.).

Generative AI: Space Opera Theatre - Jason M. Allen (2022)

Credit: Jason M. Allen (2022), CC0 license

A four-pronged action plan

For several years, the CNIL has been working to anticipate and respond to the challenges posed by artificial intelligence, in its different forms (classification, prediction, content generation, etc.) and its various use cases. Its new artificial intelligence service will be dedicated to these issues, and will support other CNIL services that also encounter uses of these algorithms in many contexts.

Faced with the challenges of protecting freedoms, the acceleration of AI and the news surrounding generative AI, the regulation of artificial intelligence is a major focus of the CNIL’s action.

This regulation is structured around four objectives:

- Understanding the functioning of AI systems and their impacts on people
- Enabling and guiding the development of AI that respects personal data
- Federating and supporting innovative players in the AI ecosystem in France and Europe
- Auditing and controlling AI systems and protecting people

Understanding the functioning of AI systems and their impacts on people

The innovative techniques used for the design and operation of AI tools raise new questions about data protection, in particular:

- the fairness and transparency of the data processing underlying the operation of these tools;
- the protection of publicly available data on the web against data harvesting, or scraping, for the design of tools;
- the protection of data transmitted by users when they use these tools, ranging from their collection (via an interface) to their possible re-use and processing through machine learning algorithms;
- the consequences for the rights of individuals over their data, both for data collected to train models and for data that these systems may provide, such as content created in the case of generative AI;
- the protection against bias and discrimination that may occur;
- the unprecedented security challenges of these tools.

These aspects will be among the priority areas of work for the Artificial Intelligence Service and the CNIL Digital Innovation Laboratory (LINC).

Dedicated dossier

In order to highlight some of these issues specific to generative AI, the CNIL Digital Innovation Laboratory (LINC) has published a dossier dedicated to them. This dossier consists of four parts, which respectively:

- detail the technical functioning of recent conversational agents and recall the central role of data in creating the underlying foundation models;
- set out various legal questions raised by the design of these models, both for intellectual property and for data protection;
- specify the ethical challenges of generative AI for the reliability of information, malicious uses, and the ways of detecting content generated in this way and warning the public of its presence;
- illustrate, through different experiments, the positive or negative uses that can be made of these tools.

This dossier complements the resources offered by the CNIL on its website for professionals and the general public.

Enabling and guiding the development of AI that respects personal data

Many stakeholders have told the CNIL about the uncertainty surrounding the application of the GDPR to AI, especially for the training of generative AI.

In order to support actors in the field of artificial intelligence and to prepare for the entry into force of the European AI Regulation (which is being discussed at European level and on which the CNIL and its European counterparts published an opinion in 2021), the CNIL already offers:

- a first set of fact sheets on AI, published in 2022 on cnil.fr, including educational content on the main principles of AI and a guide to support professionals in their compliance;
- a position, also published in 2022, on the use of ‘enhanced’ video surveillance using AI on images in public space.

The CNIL is continuing its doctrinal work and will soon publish several documents:

- the CNIL will soon submit for consultation a guide on the rules applicable to the sharing and re-use of data. This work will cover the re-use of data that is freely accessible on the internet and now used to train many AI models. This guide will therefore be relevant for some of the data processing necessary for the design of AI systems, including generative AI;
- it will also continue its work on designing AI systems and building databases for machine learning. This work will give rise to several publications starting in the summer of 2023, following the consultation already organised with several actors, in order to provide concrete recommendations, in particular as regards the design of AI systems such as ChatGPT. The following topics will be gradually addressed:
  - the use of the scientific research regime for the creation and re-use of training databases;
  - the application of the purpose principle to general purpose AI and foundation models such as large language models;
  - the clarification of how responsibilities are shared between the entities that build the databases, those that develop models from that data and those that use those models;
  - the rules and best practices applicable to the selection of data for training, having regard to the principles of data accuracy and minimisation;
  - the management of the rights of individuals, in particular the rights of access, rectification and objection;
  - the applicable rules on retention periods, in particular for training databases and the most complex models to be used;
- finally, aware that the issues raised by artificial intelligence systems do not stop at their design, the CNIL is also pursuing its ethical reflections on the use and sharing of machine learning models, the prevention and correction of biases and discrimination, and the certification of AI systems.

Federating and supporting innovative players in the AI ecosystem in France and Europe

The CNIL’s regulation of AI aims to foster, promote and help actors flourish within a framework that is faithful to the values of protecting fundamental rights and freedoms in France and Europe. This support, already under way, takes three forms:

- for the past two years, the CNIL has run a sandbox to support innovative projects and actors, which has led it to focus on AI-based projects. The ‘sandboxes’ on health in 2021 (12 projects supported) and on education in 2022 (10 projects supported) made it possible to provide tailored advice to innovative AI players in these areas. The CNIL will soon open a new call for projects for the 2023 edition, which will focus in particular on the use of artificial intelligence in the public sector;
- it has launched a specific support programme for providers of ‘enhanced’ video surveillance in the context of the experimentation provided for in the law on the 2024 Olympic and Paralympic Games;
- finally, in 2023 the CNIL opened a new ‘enhanced support’ programme to assist innovative companies in their compliance with the GDPR: the first companies selected for this enhanced support are innovative companies in the field of AI.

More generally, the CNIL wishes to engage in a sustained dialogue with research teams, R&D centres and French companies developing, or wishing to develop, AI systems with a view to complying with personal data protection rules.

These teams and companies can contact the CNIL at [email protected].

Auditing and controlling AI systems and protecting people

Defining a framework for the development of artificial intelligence systems that respects individual rights and freedoms implies, downstream, that the CNIL monitors compliance with it. It is therefore essential for the CNIL to develop tooling to audit the AI systems submitted to it, both a priori and a posteriori.

In 2023, the CNIL’s control action will focus in particular on:

- compliance by public and private actors with the position on the use of ‘enhanced’ video surveillance, published in 2022;
- the use of artificial intelligence in the fight against fraud, for example social insurance fraud, in view of the challenges linked to the use of such algorithms;
- the investigation of complaints lodged with the CNIL. Although the legal framework for the training and use of generative AI still needs to be clarified, which the CNIL will be working on, complaints have already been lodged. The CNIL has, in particular, received several complaints against the company OpenAI, which manages the ChatGPT service, and has opened a control procedure. In parallel, a dedicated working group has been set up within the European Data Protection Board (EDPB) to ensure a coordinated approach by the European authorities and a harmonised analysis of the data processing implemented by the OpenAI tool.

The CNIL will pay particular attention to whether actors processing personal data in order to develop, train or use artificial intelligence systems have:

- carried out a Data Protection Impact Assessment (DPIA) to document risks and take measures to reduce them;
- taken measures to inform people;
- planned measures for the exercise of individuals’ rights that are adapted to this particular context.

Thanks to this collective and essential work, the CNIL wants to establish clear rules protecting the personal data of European citizens in order to contribute to the development of privacy-friendly AI systems.
