Anthropic is launching a program to fund the development of new AI benchmarks to evaluate the performance and impact of AI models, including generative models like Claude. The program will provide payments to third-party organizations that can effectively measure advanced capabilities in AI models.
Anthropic aims to elevate the field of AI safety by creating challenging benchmarks focused on AI security and societal implications. It is seeking tests that assess a model’s ability to carry out cyberattacks, manipulate people, and enhance weapons of mass destruction. The program also supports research into benchmarks that probe AI’s potential for scientific study, language understanding, bias mitigation, and self-censoring of toxic output.
Anthropic envisions platforms that let subject-matter experts build their own evaluations, along with large-scale trials involving thousands of users. While the company’s commercial ambitions may raise concerns, it aims to act as a catalyst for making comprehensive AI evaluation an industry standard.