AGI (Artificial General Intelligence) is regarded as the technological "holy grail" by companies such as OpenAI and DeepSeek. Presented as an opportunity for humanity, it also raises concerns about the risks it poses to society, notably the loss of human control. In a recently published 145-page document, Google DeepMind proposes an approach to mitigating these risks, emphasizing that planning, preparation, and proactive collaboration are essential.
Experts are deeply divided over the damage AGI could cause to humanity. The positions of Yoshua Bengio, Geoffrey Hinton, and Yann LeCun, co-recipients of the 2018 Turing Award, illustrate this divide well. Geoffrey Hinton, after years at Google, chose in 2023 to leave his position in order to speak freely about the dangers of AI. He particularly fears the ability of advanced models to misinform, manipulate, or escape human control. Similarly, Yoshua Bengio has advocated for a temporary pause in AGI development, co-signing the open letter from the Future of Life Institute. Both call for strong governance, public oversight, and safety protocols before critical thresholds are crossed.
Yann LeCun, now Chief AI Scientist at Meta, takes a more optimistic and technical stance. According to him, AGI remains a distant goal: current models, although powerful, neither understand the world nor possess true agency. Advocating for the continuation of open research while emphasizing the exploratory nature of current AI, he believes that fears about human extinction or loss of control are premature, if not unfounded.
Shane Legg, co-founder and Chief AGI Scientist at Google DeepMind, takes the opposite view: without adequate controls, AGI could pose existential risks to humanity. Along with his co-authors of the paper "An Approach to Technical AGI Safety and Security," he estimates that AGI could be achieved by the end of this decade.
In this document, they explore four main risk areas:
  • Misuse: When malicious actors exploit AGI for destructive purposes;
  • Misalignment: When AGI acts contrary to its creators' intentions;
  • Errors: When AGI makes unintentionally harmful decisions;
  • Structural risks: When multi-agent dynamics lead to unforeseen consequences.
They primarily focus on managing the risks of misuse and misalignment, which constitute the most direct and urgent threats.

Risk Prevention Strategies

To prevent misuse, DeepMind proposes a set of security and access-control measures designed to keep dangerous capabilities out of the hands of malicious actors. These measures include:
  • The proactive identification of high-risk capabilities;
  • The implementation of strict restrictions to limit access to these capabilities;
  • Ongoing monitoring and enhanced safety strategies for models (see the sketch after this list).
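To make the idea of gating high-risk capabilities more concrete, here is a minimal, purely illustrative Python sketch. It is not DeepMind's implementation: the capability labels, the CapabilityGate class, and the authorization logic are hypothetical placeholders standing in for far more elaborate real-world systems.

```python
from dataclasses import dataclass, field
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("capability-gate")

# Capabilities flagged as high-risk through proactive evaluation
# (hypothetical labels, for illustration only).
HIGH_RISK_CAPABILITIES = {"cyber_offense", "bio_synthesis_advice", "autonomous_replication"}

@dataclass
class Request:
    user_id: str
    capability: str
    prompt: str

@dataclass
class CapabilityGate:
    # Users explicitly vetted for access to restricted capabilities.
    authorized_users: set = field(default_factory=set)

    def allow(self, req: Request) -> bool:
        """Grant ordinary requests; deny high-risk ones unless the caller is vetted."""
        if req.capability not in HIGH_RISK_CAPABILITIES:
            return True
        granted = req.user_id in self.authorized_users
        # Ongoing monitoring: every restricted request is logged for later review.
        log.info("restricted capability %r requested by %r -> %s",
                 req.capability, req.user_id, "granted" if granted else "denied")
        return granted

gate = CapabilityGate(authorized_users={"vetted-lab-42"})
print(gate.allow(Request("anonymous", "cyber_offense", "...")))      # False
print(gate.allow(Request("vetted-lab-42", "cyber_offense", "...")))  # True
```

In practice, such gating would sit behind capability evaluations that decide what counts as high-risk, and the logging would feed the ongoing monitoring mentioned in the list above.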
Regarding misalignment, the approach relies on two levels of defense:
  • Model-level security: Amplified oversight and advanced training to ensure AGI remains aligned with human intentions;
  • System-level security: Implementation of control and monitoring mechanisms to detect and correct any potentially dangerous drift.
Tools such as model interpretability and uncertainty estimation are also recommended to improve the effectiveness of these safety measures.
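As a rough illustration of how uncertainty estimation could feed system-level monitoring, the sketch below escalates a decision to human review when an ensemble of models disagrees. Everything here (the toy models, the disagreement metric, the threshold) is an assumption made for illustration, not a description of DeepMind's actual tooling.

```python
import statistics
from typing import Callable, List

def ensemble_uncertainty(models: List[Callable[[str], float]], prompt: str) -> float:
    """Use the spread of scores across an ensemble as a crude uncertainty proxy."""
    scores = [model(prompt) for model in models]
    return statistics.pstdev(scores)

def guarded_decision(models: List[Callable[[str], float]], prompt: str,
                     threshold: float = 0.2) -> str:
    """Escalate to human oversight whenever the ensemble disagrees too strongly."""
    if ensemble_uncertainty(models, prompt) > threshold:
        return "escalate_to_human_review"
    return "proceed"

# Toy stand-ins for models scoring how safe a proposed action looks (0 = unsafe, 1 = safe).
toy_models = [lambda p: 0.9, lambda p: 0.85, lambda p: 0.2]
print(guarded_decision(toy_models, "deploy autonomous agent"))  # escalate_to_human_review
```

Ensemble disagreement is only one crude proxy for uncertainty; interpretability tools would complement it by helping explain why a model produced a given output.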
The DeepMind team hopes that the scientific community will join them in their efforts to ensure secure and controlled access to the potential benefits of AGI.

To better understand

What regulatory framework could be implemented to govern the use of AGI?

A regulatory framework for AGI might combine strict transparency requirements, independent oversight protocols, and clear liability rules for AI developers in order to prevent both misuse and misalignment.