Overview of Project
CLTC researchers Tony Barrett, Jessica Newman, and colleagues are leading an effort to create an AI risk-management standards profile for increasingly multi-purpose or general-purpose AI, such as cutting-edge large language models. The profile guidance will be primarily for use by developers of such AI systems, in conjunction with the NIST AI Risk Management Framework (AI RMF) or the AI risk management standard ISO/IEC 23894. This profile will be a contribution to standards on AI safety, security, and ethics, with risk-management practices or controls for identifying, analyzing, and mitigating risks. We aim to publish Version 1.0 by the end of 2023, preceded by draft versions for feedback.
We are seeking participants to provide input and review, especially experts in AI safety, security, ethics, and policy. Participants will receive invitations to attend optional workshops approximately once every three months, and/or to provide input or feedback on drafts of the profile at their convenience. We will also make a version of the latest draft publicly available approximately once every three months, so that anyone can provide feedback to the project lead via email. The final Version 1.0 and later updates will be available for free online for anyone to use.
If you are interested in participating, contact Tony Barrett (anthony.barrett@berkeley.edu).
Purpose and Intended Audience
Increasingly multi-purpose or general-purpose AI systems, such as BERT, CLIP, GPT-3, DALL-E 2, and PaLM, can provide many beneficial capabilities, but they also introduce risks of adverse events with societal-scale consequences. This document provides risk-management practices or controls for identifying, analyzing, and mitigating risks of such AI systems. We intend this document primarily for upstream developers of these AI systems; others who can benefit from this guidance include downstream developers of end-use applications that build on a multi-purpose or general-purpose AI system platform. This document facilitates conformity with leading AI risk management standards and frameworks, adapting and building on the generic voluntary guidance in the NIST AI RMF and the ISO/IEC 23894 AI risk management standard, with a focus on the unique issues faced by developers of increasingly multi-purpose or general-purpose AI systems.
Examples of Preliminary Guidance
This project builds on our recent work on arXiv, “Actionable Guidance for High-Consequence AI Risk Management: Towards Standards Addressing AI Catastrophic Risks” (https://arxiv.org/abs/2206.08966). In Section 3 of that work, we provide actionable-guidance recommendations for: identifying risks from potential unintended uses and misuses of AI systems; including catastrophic-risk factors within the scope of risk assessments and impact assessments; identifying and mitigating human rights harms; and reporting information on AI risk factors, including catastrophic-risk factors.
In Section 4 of that work, we outline our preliminary ideas for an AI RMF Profile, with supplementary guidance for cutting-edge, increasingly multi-purpose or general-purpose AI. Our ideas for guidance include the following examples:
- As part of risk identification for increasingly multi-purpose or general-purpose AI systems:
  - Identify potential use cases and misuse cases, to be considered in decisions on disallowed/unacceptable use-case categories of applications.
- As part of risk assessment for increasingly multi-purpose or general-purpose AI systems:
  - Use red teams and adversarial testing as part of extensive interaction with the AI systems to identify emergent properties, such as new capabilities and failure modes.
- As part of risk mitigation for increasingly multi-purpose or general-purpose AI systems:
  - When training state-of-the-art machine learning models, increase the amount of compute incrementally (e.g., by not more than three times between each increment), and test models after each incremental increase to identify emergent properties (see the illustrative sketch after this list).
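The compute-scaling item above can be illustrated with a minimal sketch. This is a hypothetical example only: the function names, the toy train/evaluate stubs, and the way findings are recorded are illustrative assumptions, not requirements taken from the profile; the roughly 3x cap follows the example in the guidance.

```python
# Minimal sketch of incremental compute scaling with evaluation between increments.
# All names and stubs below are illustrative placeholders, not profile requirements.

MAX_SCALE_PER_INCREMENT = 3.0  # cap each compute increase at roughly 3x the prior run


def train_with_budget(budget_flops: float) -> dict:
    # Placeholder for an actual training run capped at `budget_flops` of compute.
    return {"budget_flops": budget_flops}


def evaluate_emergent_properties(model: dict) -> dict:
    # Placeholder for red-teaming, capability evaluations, and failure-mode testing.
    return {"new_capabilities": [], "new_failure_modes": []}


def incremental_scaling(initial_flops: float, target_flops: float) -> list[dict]:
    """Train in compute increments of at most ~3x, evaluating after each step."""
    reports = []
    budget = initial_flops
    while True:
        model = train_with_budget(budget)
        report = evaluate_emergent_properties(model)
        reports.append({"budget_flops": budget, **report})
        if report["new_capabilities"] or report["new_failure_modes"]:
            break  # pause scaling until new findings are assessed and mitigated
        if budget >= target_flops:
            break  # reached the intended compute budget
        budget = min(budget * MAX_SCALE_PER_INCREMENT, target_flops)
    return reports


if __name__ == "__main__":
    for report in incremental_scaling(initial_flops=1e21, target_flops=1e23):
        print(report)
```

The design point of the sketch is that evaluation gates each compute increase, so scaling pauses whenever new capabilities or failure modes surface, rather than continuing to the full compute budget in a single run.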
Why Create this Profile?
Other initial AI RMF profiles seem likely to focus on specific industry sectors and end-use applications, e.g., in critical infrastructure or other high-risk categories of the draft EU AI Act. That focus seems valuable, especially for downstream developers of end-use applications, and could help the AI RMF achieve interoperability with other regulatory regimes such as the EU AI Act. However, an approach focused on end-use applications could overlook an opportunity to provide profile guidance for upstream developers of increasingly general-purpose AI, including AI systems sometimes referred to as “foundation models”. Such AI systems can have many uses, and they present early-development risk issues, such as emergent properties, that upstream developers are often in a better position to address than downstream developers building on AI platforms for specific end-use applications.
Guidance in this profile focuses on managing the broad context and associated risks of increasingly multi-purpose or general-purpose AI, e.g.:
- To address important underlying risks and early-development risks in a way that does not rely on having certainty about each specific end-use application of the technology.
- To provide guidance on sharing of risk management responsibilities between upstream and downstream developers.
Milestones
We are proceeding with the following profile-creation stages and approximate dates:
- Planning and preliminary outreach – Q3 2022
- Initial workshop and interviews, first draft of the profile completed – Q4 2022
- Second workshop, interviews, second draft of the profile, alpha test – Q1 2023
- Third workshop, interviews, third draft of the profile, beta test – Q2 2023
- Release Profile 1.0 on the UC Berkeley Center for Long-Term Cybersecurity and arXiv websites – Q3 or Q4 2023
Project Leads:
Anthony M. Barrett, Ph.D., PMP
Visiting Scholar, AI Security Initiative, Center for Long-Term Cybersecurity, UC Berkeley
anthony.barrett@berkeley.edu
Jessica Newman
Director, AI Security Initiative, Center for Long-Term Cybersecurity, UC Berkeley; Co-Director, AI Policy Hub, UC Berkeley
Dan Hendrycks
Ph.D. Candidate, Berkeley AI Research Lab, UC Berkeley
Brandie Nonnecke, Ph.D.
Director, CITRIS Policy Lab, UC Berkeley; Co-Director, AI Policy Hub, UC Berkeley
Next Step: Seeking Participants
We are seeking experts in AI safety, security, ethics, and policy who are interested in standards development, risk management, and the particular opportunities and challenges associated with increasingly multi-purpose or general-purpose AI, such as cutting-edge large language models. Participants will receive invitations to attend optional quarterly workshops, or to provide input or feedback on drafts at their convenience. All activities are optional; no minimum time commitment is required.
Individual-level participation options include:
- Providing ideas in workshops or interviews
- Reviewing drafts
- Serving as a test user
Organization-level support options include:
- Providing time for employees to participate
- Allowing use of the organization's logo on the Profile
Please contact Tony Barrett (anthony.barrett@berkeley.edu) if you or your organization want to participate!