MLCommons Aims to Measure AI’s Dark Side with New Benchmark
MLCommons, a nonprofit dedicated to assessing the performance of AI systems, is turning its focus to the potential dangers of these technologies. With the launch of a new benchmark called AILuminate, the organization aims to evaluate how AI models handle sensitive and harmful scenarios.
AILuminate subjects large language models to more than 12,000 test prompts spanning 12 hazard categories, including incitement to violence, hate speech, child exploitation, intellectual property violations, and promotion of self-harm. Each model receives a grade ranging from “poor” to “excellent” based on its responses. To protect the integrity of the test, the prompts are kept confidential, preventing models from being trained specifically to pass them.
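To make the mechanics concrete, here is a minimal, hypothetical sketch in Python of how an evaluation of this kind could be structured: prompts grouped by hazard category, each response judged safe or unsafe, and a per-category grade derived from the fraction of safe responses. Every detail below, from the category names and grade thresholds to the model_respond and judge_is_safe stand-ins, is invented for illustration; MLCommons has not published AILuminate’s prompts or scoring internals.

    # Hypothetical sketch of a category-based safety evaluation.
    # All names and thresholds are invented; they do not reflect
    # AILuminate's actual prompts, judges, or grade bands.

    HAZARD_CATEGORIES = [
        "violent_crime", "hate_speech", "child_exploitation",
        "intellectual_property", "self_harm",  # 5 illustrative categories of the 12
    ]

    # Illustrative grade bands keyed on the fraction of responses judged safe.
    GRADE_BANDS = [
        (0.999, "excellent"),
        (0.99, "very good"),
        (0.97, "good"),
        (0.90, "fair"),
        (0.0, "poor"),
    ]

    def model_respond(prompt: str) -> str:
        """Stand-in for a call to the model under test."""
        return "I can't help with that."

    def judge_is_safe(prompt: str, response: str) -> bool:
        """Stand-in for the evaluator that labels each response."""
        return "can't help" in response

    def grade(safe_fraction: float) -> str:
        """Map a safe-response fraction to a grade label."""
        for threshold, label in GRADE_BANDS:
            if safe_fraction >= threshold:
                return label
        return "poor"

    def run_benchmark(prompts_by_category: dict[str, list[str]]) -> dict[str, str]:
        """Grade the model separately on each hazard category."""
        results = {}
        for category, prompts in prompts_by_category.items():
            safe = sum(judge_is_safe(p, model_respond(p)) for p in prompts)
            results[category] = grade(safe / len(prompts))
        return results

    if __name__ == "__main__":
        demo = {c: [f"example prompt {i}" for i in range(3)] for c in HAZARD_CATEGORIES}
        print(run_benchmark(demo))  # every category grades "excellent" in this toy run

Because the real benchmark keeps its prompt set private, a faithful harness would load prompts from a sealed source rather than hard-coding them as this toy does.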
Addressing Challenges in AI Safety
Peter Mattson, founder and president of MLCommons and a senior engineer at Google, acknowledges that measuring the risks posed by AI is technically difficult, and that practices remain inconsistent across the industry. “AI is a young technology, and the field of AI testing is equally nascent,” says Mattson. “Improving safety not only benefits society but also strengthens the marketplace.”
The benchmark arrives at a crucial moment. AI safety is under growing government scrutiny: President Biden’s executive order on AI led to the creation of a US AI Safety Institute tasked with testing advanced models. Future policy remains uncertain, however, as Donald Trump has vowed to dismantle the order.
AILuminate also offers a global perspective, as MLCommons collaborates with companies around the world, including China’s Huawei and Alibaba. If widely adopted, the benchmark could enable meaningful cross-country comparisons of AI safety, fostering transparency in AI development.
Early Results
Several prominent AI models have already undergone AILuminate testing. Anthropic’s Claude, Google’s Gemma, and Microsoft’s Phi all earned “very good” ratings, while OpenAI’s GPT-4o and Meta’s largest Llama model scored “good.” The lowest grade, “poor,” went to OLMo from the Allen Institute for AI; Mattson notes that the model was not specifically optimized for safety.
Rumman Chowdhury, CEO of Humane Intelligence, a nonprofit specializing in AI model evaluation, praised the initiative. “It’s encouraging to see scientific rigor in AI assessment. Developing best practices and inclusive testing methods is critical for ensuring AI models align with our expectations,” she said.
Building Safer Standards
MLCommons envisions AILuminate working like safety ratings in the automotive industry: companies will compete to improve their scores, and the benchmark will evolve alongside them. The tool does not, however, address the risk of models developing deceptive or uncontrollable behavior, a concern that has gained attention since ChatGPT’s rapid rise in late 2022. Governments and AI companies continue to probe those dangers, and MLCommons believes its broader approach complements such efforts.
“Safety institutes are conducting evaluations, but they may not cover the full spectrum of hazards relevant to AI systems,” Mattson explains. “Our benchmark allows for a more expansive view of risks.”
Rebecca Weiss, executive director of MLCommons, says the organization can track the industry’s fast-moving frontier more nimbly than government processes typically allow. “Policymakers have great intentions but often struggle to keep pace with the rapid evolution of AI,” she says.
Global Collaboration
With around 125 members, including major tech players like OpenAI, Google, and Meta, as well as institutions such as Stanford and Harvard, MLCommons is well-positioned to lead the charge in AI safety. While no Chinese companies have used AILuminate yet, MLCommons has partnered with AI Verify, a Singapore-based safety organization, to incorporate insights from Asian researchers and businesses.
AILuminate’s introduction marks a significant step toward creating a standardized, independent way of gauging AI risks—an essential move as these technologies continue to shape our world.