News / June 2023

AI Researchers Submit Comments on NTIA AI Accountability Policy

On June 9, 2023, a team of researchers from the UC Berkeley AI Security Initiative and AI Policy Hub sent a letter to Stephanie Weiner, Acting Chief Counsel for the National Telecommunications and Information Administration (NTIA). Their letter responds to a request for comments (RFC) issued by the NTIA in April 2023. NTIA will consider the comments it receives as it prepares a report and recommendations for the Biden-Harris administration on an AI accountability policy. Reportedly, policy options under consideration include starting with steps that would not require new legislation, for example by adding AI-related requirements to federal government procurement standards.

 The authors of the letter are: Anthony Barrett, Ph.D., PMP, Visiting Scholar at the AI Security Initiative (part of the Center for Long-Term Cybersecurity); Jessica Newman, Director of the AI Security Initiative and Co-Director of the AI Policy Hub; and Brandie Nonnecke, Ph.D., Director of the CITRIS Policy Lab, part of CITRIS and the Banatao Institute, Co-Director of the AI Policy Hub, and Associate Research Professor at the Goldman School of Public Policy.

The full text of the letter is included below.

Dear Ms. Weiner,

Thank you for the NTIA AI Accountability Request for Comment (RFC) released in April 2023. We offer the following submission for your consideration.

We are researchers affiliated with UC Berkeley, with expertise on AI research and development, safety, security, policy, and ethics. We previously submitted responses to NIST several times over the past two years at various stages of NIST’s development of the AI Risk Management Framework (AI RMF).

We support regulation and accountability mechanisms that appropriately address impacts and risks of AI development and deployment, and provide assurance that AI systems are legal, effective, ethical, safe, and otherwise trustworthy. However, we caution that risks must be framed appropriately to address them effectively. For example, under some “risk based” regulatory approaches, such as in the 2021 draft EU AI Act from the European Commission, essentially all explicit responsibility and accountability would fall to the developers and deployers of end-use applications of AI, leaving a gap in accountability for the original developers of large language models (such as those powering ChatGPT) that can power many downstream applications.¹ That gap was only remedied in the European Parliament’s 2023 agreement on the AI Act, which added transparency and risk-management requirements for providers of large language models as well as other “general purpose AI”, “foundation models” and generative AI systems. The US should learn from the EU’s experience, and avoid a regulatory approach that would result in a gap in addressing risks that can begin to emerge during the development and training stages of the AI lifecycle, as well as risks stemming from the broad capabilities and applicability of increasingly general purpose systems.

Here are some of our key recommendations on the NTIA AI Accountability RFC:

  • AI accountability mechanisms should address risks to individuals, groups, society, and the planet, and should address decisions at all stages of the AI lifecycle. Some important risks can emerge during the design and development stages of the AI lifecycle, while others that emerge later may still be best addressed during design and development, when decisions about whether to use AI at all, and if so, how and why, are being made. AI accountability mechanisms should be understood as socio-technical interventions, including both technical constraints and human oversight, and able to address risks that emerge throughout the AI lifecycle.²
  • Developers of large language models (LLMs), “foundation models”, other general-purpose AI systems (GPAIS) or other increasingly advanced AI should be accountable for AI trustworthiness and risk-management tasks, such as AI development and testing tasks to characterize knowledge limits, for which they are better placed (and often better resourced) than others in the AI value chain. Responsibilities for risk assessment and risk management should not fall only on downstream developers of applications built on GPAIS platforms, nor only on deployers of AI systems in particular end-use categories.³
  • Federal government procurement requirements and other regulations should incorporate and build on AI best practices, standards and guidance where appropriate and available, such as in the NIST AI Risk Management Framework (AI RMF) and the Blueprint for an AI Bill of Rights, adapting key elements of workable voluntary soft-law guidance as part of mandatory hard-law requirements. Such AI standards guidance can be required immediately for federal procurement without legislation, and also can continue to be a key part of AI regulations if and when legislation is passed for new regulatory requirements. For regulations building on the NIST AI RMF and addressing cutting-edge LLMs or other increasingly general-purpose AI, “highly capable AI foundation models”, or “frontier models”, we also recommend adapting or incorporating the guidance for GPAIS developers in our project “AI Risk-Management Standards Profile for Increasingly Multi- or General-Purpose AI”.⁴
  • Government agencies and departments should be provided with sufficient resources to uphold pre-existing laws in the age of AI proliferation. However, a new federal AI law with provisions for sufficient oversight and fines or other penalties for irresponsible actions would also be highly valuable. Without such a federal law and enforcement, companies may perceive net incentives to move too hastily to develop and deploy excessively risky AI systems. For example, a federal law should prohibit ongoing development or use of AI systems with insufficiently mitigated potential to cause severe or catastrophic harm to human safety and rights. A federal AI law should also mandate reasonable testing, evaluation, and monitoring of AI systems, transparency requirements such as for labeling of AI-generated content, public disclosure of AI systems and their evaluations when those systems impact people, and remedies for harms caused. Finally, any agency responsible for enforcing a law must have the authority and resources to assess and enforce compliance.

Our comments on specific items in the NTIA AI Accountability RFC

In the following, we list NTIA RFC questions for which we provide answers, and omit NTIA RFC questions that we do not specifically address.

AI Accountability Objectives

1. What is the purpose of AI accountability mechanisms such as certifications, audits, and assessments? Responses could address the following:

a. What kinds of topics should AI accountability mechanisms cover? How should they be scoped?

The scope of AI accountability mechanisms should include reasonable steps to assess and mitigate foreseeable risks to individuals, groups, society, and the planet, including by implementing appropriate best practices, consistent with the scope of guidance in the NIST AI Risk Management Framework (AI RMF).

AI accountability mechanisms should be understood as socio-technical interventions, addressing the interconnection and interplay between technical, social, and organizational elements of AI risks, and should address decisions at all stages of the AI lifecycle, including AI design, development, deployment, and monitoring. This is consistent with the NIST AI RMF, which specifies that “AI systems are inherently socio-technical in nature,” and that mitigating AI risks necessarily includes interventions at the human and organizational level, and throughout the AI lifecycle.

Foreseeable risks that should be covered by AI accountability mechanisms include risks that emerge during the design and development stages of the AI lifecycle, such as:

  • Privacy violations and copyright infringement from AI training data;
  • New security vulnerabilities such as data poisoning attacks;
  • Bias and inaccuracy from non-representative or inadequate datasets;
  • Potential labor rights violations, for example psychological trauma or undervaluing the work of data annotators and AI feedback providers; and
  • Environmental harm caused by high energy costs of training AI systems.

Foreseeable risks that emerge during the deployment and monitoring stages of the AI lifecycle that should be covered by AI accountability mechanisms can include:

  • The degradation of the information ecosystem caused by the rise of synthetic media and misinformation, including implications for democratic processes and values such as free and fair elections;
  • Gender and sexual violence and discrimination, including non-consensual AI-generated pornography, sexually explicit content, and child sexual abuse material (CSAM);
  • Bias, discrimination, and cultural homogeneity, including inaccurate or skewed outputs for minority groups, communities, and cultures;
  • Addiction and overuse of AI-enabled products, for example from the use of recommender systems;
  • Persuasion and manipulation of people, including encouraging shifts in beliefs and behaviors in potentially dangerous ways;
  • Over-reliance on AI-enabled tools, for example caused by a lack of transparency about uncertainty and limitations, leading to errors in high-stakes settings;
  • Inability to provide consent to engagement with an AI system, for example if it is not communicated to a user or impacted individual that an AI system is in use and why;
  • Automation of human labor resulting in significant changes to and losses of jobs;
  • Deterioration of human wellbeing, for example from a loss of human connection or increasing anxiety and depression caused by seeing toxic or harmful content;
  • New inequities and exclusion caused by marginalized communities having less ability to access and benefit from AI technologies;
  • Concentration of power by a small number of countries, companies, and people due to the network effects of big data and the significant labor and compute needs to develop and deploy AI systems;
  • Loss of liberties and human rights caused by AI-enabled biometric identification and surveillance capabilities, including predictive policing and facial recognition technologies used by law enforcement;
  • Environmental degradation caused by high energy costs of operating AI systems;
  • New security vulnerabilities in machines or systems with embedded AI technologies, including susceptibility to adversarial attacks and adversarial prompting;
  • More powerful cyber attacks including AI-enabled scamming, phishing, malware, etc.;
  • Privacy violations, including leaking of sensitive personal information, personally identifiable information, or intellectual property, for example through a security vulnerability or through a large language model’s outputs;
  • Loss of human understanding and control over complex systems facilitated by AI, for example financial markets;
  • Loss of human understanding and control over AI systems, especially “black box” and autonomous AI systems, for example AI agents capable of assigning and carrying out sub-tasks to fulfill a given goal;
  • Development or exploitation of chemical, biological, nuclear, or other types of weapons systems assisted by AI technologies;
  • War crimes or inadvertent escalation caused by the use of AI-enabled weapons or military machinery;
  • International destabilization caused by power shifts facilitated by the use of AI.

For more information about reducing risks and striving for trustworthy AI throughout the AI lifecycle, see e.g., “Taxonomy of Trustworthiness for Artificial Intelligence,” published by the UC Berkeley Center for Long-Term Cybersecurity.

b. What are assessments or internal audits most useful for? What are external assessments or audits most useful for?

Internal assessments and audits are most useful for providing organization-internal decision makers with valuable information for their risk management decision making, partly because they typically occur more frequently than, and prior to, external audits. External audits are most useful for accountability to external stakeholders.

2. Is the value of certifications, audits, and assessments mostly to promote trust for external stakeholders or is it to change internal processes? How might the answer influence policy design?

From a societal risk management perspective, the most important effect of certifications, audits and assessments is to change (or at least to verify) internal processes. Certifications, audits and assessments also have benefits of promoting trust for external stakeholders, which is valuable from an enterprise risk management perspective. The societal risk management perspective should be the focus of public policy design, to provide incentives for firms to mitigate risks that otherwise would be negative externalities of their actions, resulting in adverse impacts affecting other stakeholders while the firms reap the rewards. Moreover, processes for certifications, audits and assessments must be sufficiently high-quality and comprehensive to avoid false confidence in AI systems. To support this, third-party certifiers, auditors, and assessors should be transparent in how they complete their assessments, should be accredited or licensed to provide sufficient assurance of the robustness of their assessments, and should be subject to audits themselves.

4. Can AI accountability mechanisms effectively deal with systemic and/or collective risks of harm, for example, with respect to worker and workplace health and safety, the health and safety of marginalized communities, the democratic process, human autonomy, or emergent risks?

Yes, especially regarding some reasonably foreseeable risks, such as the risks mentioned in response to question 1.a. Systemic and/or collective risks of harm are a primary type of risk that is already evident and is expected to increase with the widespread use of AI systems at scale. Numerous AI accountability mechanisms, including risk management frameworks, AI impact assessments, human rights impact assessments, data protection and privacy impact assessments, AI audits, red teaming exercises, bug bounties and bias bounties, and others can help address not only individual or enterprise risk, but also systemic and collective risk.

AI accountability mechanisms can also help address some early-stage emergent properties of advanced AI. For examples of such guidance on identifying reasonably foreseeable impacts of AI systems, see e.g., Section 3.2 of our work “Actionable Guidance for High-Consequence AI Risk Management: Towards Standards Addressing AI Catastrophic Risks”, https://arxiv.org/abs/2206.08966.

5. Given the likely integration of generative AI tools such as large language models (e.g., ChatGPT) or other general-purpose AI or foundational models into downstream products, how can AI accountability mechanisms inform people about how such tools are operating and/or whether the tools comply with standards for trustworthy AI?

One way is for developers of general-purpose AI systems (GPAIS) and foundation models to be able to demonstrate that they are doing a reasonably good job of following relevant guidance and standards, including the NIST AI RMF and our supplementary guidance for developers of GPAIS and foundation models. (For more, including links to our draft guidance, see our project webpage: https://cltc.berkeley.edu/seeking-input-and-feedback-ai-risk-management-standards-profile-for-increasingly-multi-purpose-or-general-purpose-ai/) These documents include guidance for GPAIS and foundation model developers to 1) carry out key AI trustworthiness and risk-management tasks, such as AI development and testing tasks to characterize knowledge limits, in a measurable or at least documentable way, and 2) make necessary information available to independent auditors or others as appropriate (e.g., to enable third-party auditability).
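
For illustration only, the following is a minimal sketch of what a documentable record of one such task (characterizing knowledge limits) might contain; the field names and values are hypothetical placeholders rather than part of our guidance or the NIST AI RMF.

```python
# Hypothetical sketch: a structured, versionable record of a knowledge-limits
# characterization task that could be shared with accredited third-party auditors.
from dataclasses import dataclass, field, asdict
import json

@dataclass
class KnowledgeLimitsRecord:
    model_name: str
    evaluation_date: str
    domains_tested: list = field(default_factory=list)
    known_failure_modes: list = field(default_factory=list)
    disclosure_policy: str = "summary available to accredited third-party auditors"

record = KnowledgeLimitsRecord(
    model_name="example-foundation-model",  # placeholder name
    evaluation_date="2023-06-01",
    domains_tested=["medical advice", "legal advice", "current events"],
    known_failure_modes=[
        "hallucinated citations",
        "limited knowledge of events after the training data cutoff",
    ],
)

# Serializing the record makes it documentable, auditable, and easy to version.
print(json.dumps(asdict(record), indent=2))
```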

6. The application of accountability measures (whether voluntary or regulatory) is more straightforward for some trustworthy AI goals than for others. With respect to which trustworthy AI goals are there existing requirements or standards? Are there any trustworthy AI goals that are not amenable to requirements or standards? How should accountability policies, whether governmental or non-governmental, treat these differences?

It is currently difficult or infeasible to measure whether some important AI systems meet particular AI trustworthiness objectives. For example, LLMs typically use deep-learning architectures for which explainability and interpretability measures are currently very limited. An alternative approach in the meantime is to provide LLM developers with guidance to identify gaps in meeting those objectives, and to avoid deploying LLMs in ways for which those gaps would result in unacceptable risks.

7. Are there ways in which accountability mechanisms are unlikely to further, and might even frustrate, the development of trustworthy AI? Are there accountability mechanisms that unduly impact AI innovation and the competitiveness of U.S. developers?

Some accountability mechanisms can have tradeoffs among trustworthiness objectives. For example, providing access to a frontier-model LLM through an API instead of via open-source distribution of model code and weights can have many risk-reduction benefits, such as reducing potential for misuse and reducing potential for uncontrolled proliferation, with a tradeoff of reducing particular types of transparency (e.g., the opportunity for anyone to fully inspect source code) that open-source communities and others place high value on, including for purposes of finding security vulnerabilities. However, this does not appear to have greatly reduced open-source US AI innovation; in recent years, the capabilities of open-source AI developers have often lagged those of closed-source frontier-model industry leaders by only a few years or even just a few months. Moreover, developers could have specific trusted auditors inspect their code or other aspects of their AI systems.

Existing Resources and Models

13. What aspects of human rights and/or industry Environmental, Social, and Governance (ESG) assurance systems can and should be adopted for AI accountability?

The use of AI technologies should never exempt organizations from the requirement to uphold human rights and environmental, social, and governance responsibilities. Human rights impact assessments, for example, can be an important tool for stakeholders throughout the AI lifecycle. If AI accountability mechanisms — like risk assessments — do not carefully consider human rights risks, their use may inadvertently perpetuate the harms they seek to mitigate.⁵ The Universal Declaration of Human Rights and corresponding international human rights instruments, and UN treaties and commentaries have helped to clarify core human rights definitions and interpretations over decades, leading to broad global consensus. To support AI accountability, AI risk assessments, for example, should utilize these established human rights definitions and interpretations to provide greater specificity in how concepts such as “non-discrimination” are interpreted in the context of an AI system.

14. Which non-U.S. or U.S. (federal, state, or local) laws and regulations already requiring an AI audit, assessment, or other accountability mechanism are most useful and why? Which are least useful and why?

There are many U.S. laws and regulations that already support AI accountability, though none comprehensively address a full range of AI risks. The Illinois Biometric Information Privacy Act (BIPA) is an example of a state law that has had a meaningful impact on restricting how companies can handle sensitive personal information. The law has given people more control over their biometric information and provides a private right of action to individuals harmed. The California Consumer Privacy Act (CCPA) is another example of a state privacy law that gives individuals more control over their data, including the right to opt out of the sale or sharing of their personal information. Facial recognition bans, which have been enacted by numerous states and cities across the country, are also an important AI accountability mechanism, because they establish clear boundaries around unacceptable uses of AI-enabled technologies that can threaten civil liberties and rights.

Accountability Subjects

15. The AI value or supply chain is complex, often involving open source and proprietary products and downstream applications that are quite different from what AI system developers may initially have contemplated. Moreover, training data for AI systems may be acquired from multiple sources, including from the customer using the technology. Problems in AI systems may arise downstream at the deployment or customization stage or upstream during model development and data training.

a. Where in the value chain should accountability efforts focus?

It depends in part on which actors are best placed to take particular actions. No actors should be shielded from accountability, as distinct risks and harms occur throughout the value chain. For example, developers of large-scale GPAIS and foundation models should be responsible and accountable for AI trustworthiness and risk-management tasks, such as AI assessment and testing tasks to characterize knowledge limits and other limitations and vulnerabilities, for which they are better placed (and often better resourced) than others in the AI value chain. Responsibilities for risk assessment and risk management should not fall only on downstream developers of applications built on GPAIS platforms, nor only on deployers of AI systems in particular end-use categories.

b. How can accountability efforts at different points in the value chain best be coordinated and communicated?

Our GPAIS profile lays out a basic principle: risks should be addressed by the organization best placed in the value chain to address them. We also provide corresponding types of tasks for GPAIS developers, such as testing requiring direct access to training data, and types of tasks for downstream developers, such as establishing specific context and compliance requirements for a particular end-use application. The EU AI Act also specifies various responsibilities for providers of GPAIS and foundation models, as distinct from requirements for deployers of AI systems in particular types of end-use applications.

d. Since the effects and performance of an AI system will depend on the context in which it is deployed, how can accountability measures accommodate unknowns about ultimate downstream implementation?

Developers of AI systems often can use available information and engage with relevant external experts and community members to identify various types of reasonably foreseeable beneficial uses — as well as misuses and abuses — of their AI systems, beyond whatever specific uses or purposes that the AI system developer may have originally intended for that AI system. For examples of such guidance on identifying reasonably foreseeable uses, misuses and abuses of AI systems, see e.g., Section 3.1 of our work “Actionable Guidance for High-Consequence AI Risk Management: Towards Standards Addressing AI Catastrophic Risks”, https://arxiv.org/abs/2206.08966.

16. The lifecycle of any given AI system or component also presents distinct junctures for assessment, audit, and other measures. For example, in the case of bias, it has been shown that “[b]ias is prevalent in the assumptions about which data should be used, what AI models should be developed, where the AI system should be placed — or if AI is required at all.” How should AI accountability mechanisms consider the AI lifecycle? Responses could address the following:

b. How should AI audits or assessments be timed? At what stage of design, development, and deployment should they take place to provide meaningful accountability?

AI assessments are typically critical at several points throughout the AI lifecycle. They are often most important in the early design and development stages, when critical decisions are still being made about how to design a system and whether its use would be legal, safe, and ethical. AI assessments are also critical to verify and validate a system after development, and to monitor real-world impacts after deployment. Particular types of accountability mechanisms make sense at one or more of these times.

For GPAIS, at least some assessments should occur during development and testing. In our GPAIS guidance, we even recommend that model training include repeated testing and assessment (including red-teaming) during an incremental scale-up process. This can provide opportunities to detect emergent properties of large-scale models at an early or intermediate scaling stage. Red-teaming assessments can be performed by experts external to the AI development organization, as with the red-teaming assessment of GPT-4 by ARC Evals. (For a brief description of these assessments, see pp. 15–16 of the GPT-4 System Card at https://cdn.openai.com/papers/gpt-4-system-card.pdf)
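
As a purely illustrative sketch of how repeated assessment during incremental scale-up might be orchestrated (the stage names, evaluation function, and threshold below are hypothetical placeholders, not any organization's actual pipeline):

```python
# Hypothetical sketch of gating model scale-up on repeated risk assessments.
from dataclasses import dataclass

# Placeholder scale-up stages, e.g., fractions of the target training compute.
SCALE_STAGES = ["1% compute", "10% compute", "50% compute", "100% compute"]
RISK_THRESHOLD = 0.2  # placeholder acceptance threshold for aggregated findings

@dataclass
class EvalResult:
    stage: str
    risk_score: float  # e.g., aggregated internal-eval and red-team findings
    passed: bool

def run_eval_suite(stage: str) -> EvalResult:
    """Placeholder for internal evaluations plus external red-teaming of the
    checkpoint trained at this stage; returns an aggregated risk score."""
    score = 0.05  # in practice, derived from benchmark and red-team results
    return EvalResult(stage, score, passed=score < RISK_THRESHOLD)

def incremental_scaleup() -> None:
    for stage in SCALE_STAGES:
        result = run_eval_suite(stage)
        print(f"{stage}: risk={result.risk_score:.2f}, passed={result.passed}")
        if not result.passed:
            # Pause further scale-up until risks are assessed and mitigated.
            print(f"Halting scale-up at {stage} pending risk review.")
            return
    print("All stages passed; proceed to pre-deployment review.")

if __name__ == "__main__":
    incremental_scaleup()
```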

18. Should AI systems be released with quality assurance certifications, especially if they are higher risk?

Yes, this is valuable, as long as the quality assurance certification processes are themselves sufficiently high-quality, comprehensive, and transparent. Otherwise, the result could be false confidence in AI systems.

19. As governments at all levels increase their use of AI systems, what should the public expect in terms of audits and assessments of AI systems deployed as part of public programs? Should the accountability practices for AI systems deployed in the public sector differ from those used for private sector AI? How can government procurement practices help create a productive AI accountability ecosystem?

The U.S. government should hold itself to the highest standard and lead by example. The public should expect that AI systems deployed as part of public programs will uphold the principles articulated in the Blueprint for an AI Bill of Rights, including that all AI systems used will be safe and effective, privacy preserving, include protections against discrimination, provide notice and explanation, and provide human alternatives, consideration, and fallback. Publicly available assessments of AI systems deployed as part of public programs can help support transparency and accountability beyond AI registries that may provide no insight into the validity and trustworthiness of systems used.

As an immediate step for AI accountability that does not require new legislation, federal government procurement requirements should take advantage of AI best practices, standards and guidance where appropriate and available, such as in the NIST AI RMF and the Blueprint for an AI Bill of Rights, adding force of law as appropriate to adapt key elements of voluntary soft-law guidance into mandatory hard-law requirements.

Accountability Inputs and Transparency

21. What are the obstacles to the flow of information necessary for AI accountability either within an organization or to outside examiners? What policies might ease researcher and other third-party access to inputs necessary to conduct AI audits or assessments?

There are several obstacles to gaining access to data and information to support AI accountability. First, industry can legitimately claim that data and models are proprietary, but these claims could also be exploited to evade oversight. To address this, third-party auditors/assessors could gain access to data and models via a secure enclave, review tests conducted by internal researchers, and have entities like the FTC assist in determining whether data and models can be shared with third parties. Second, AI systems that use personally identifiable information (PII) protected under state or federal law may be limited in their ability to share data with third-party auditors. To address this, third parties may be able to use masked or redacted data or other PII-protection approaches in audits and assessments. Third, it is imperative that AI developers and users maintain records of data provenance, model development and testing, risk mitigations, and continuous monitoring performance. These materials will be invaluable to third-party auditors and assessors. Guidance is needed on what should be included in these records and how they should be provided to third parties.
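
As a minimal, purely illustrative sketch of the kind of masking approach mentioned above (the field names and salted-hash scheme are hypothetical, and hashing low-entropy identifiers is not by itself robust de-identification):

```python
# Hypothetical sketch: mask PII fields in an audit extract before sharing it
# with a third-party auditor, while preserving the ability to link records.
import hashlib

PII_FIELDS = {"name", "email", "ssn"}  # placeholder set of protected fields

def mask_record(record: dict, salt: str = "audit-salt") -> dict:
    """Replace PII values with truncated salted hashes so auditors can join
    rows across tables without seeing the underlying personal data."""
    masked = {}
    for key, value in record.items():
        if key in PII_FIELDS:
            digest = hashlib.sha256((salt + str(value)).encode()).hexdigest()[:12]
            masked[key] = f"masked:{digest}"
        else:
            masked[key] = value
    return masked

if __name__ == "__main__":
    example = {"name": "Jane Doe", "email": "jane@example.com", "outcome": "loan denied"}
    print(mask_record(example))
    # Note: real de-identification for legally protected PII would require
    # stronger techniques and legal review, not just hashing.
```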

23. How should AI accountability “products” (e.g., audit results) be communicated to different stakeholders? Should there be standardized reporting within a sector and/or across sectors? How should the translational work of communicating AI accountability results to affected people and communities be done and supported?

AI risk communication deserves significant attention. Current practices of communication, for example releasing long “model cards,” “system cards,” or audit results, are incredibly important, but are not serving the needs of users or affected people and communities.⁶ The UC Berkeley Center for Long-Term Cybersecurity has published research that outlines best practices in effective digital risk communications, including facilitating dialogue with users, providing actionable risk communication, and measuring the effectiveness of risk communications in an ongoing manner.⁷ AI companies should learn from decades of experience with risk communication from other sectors, and strive to be responsive to different stakeholders’ needs, producing messaging that is tailored to the specific context, offers choices, and is designed for accessibility.

Barriers to Effective Accountability

26. Is the lack of a federal law focused on AI systems a barrier to effective AI accountability?

There are many existing federal laws that support various aspects of AI accountability, and existing agencies and departments should be provided with sufficient resources to uphold these laws in the age of AI proliferation. However, the lack of a federal law specifically focused on AI accountability leaves gaps. While current soft-law voluntary standards and frameworks have helped, it would likely be much more effective to have hard-law regulatory requirements with potential for fines or other penalties for irresponsible actions. A federal law would make clear to all companies, both within and beyond U.S. borders, what actions are required and what outcomes are unacceptable. Without a federal law, the incentives for companies to move quickly to deploy new AI technologies may be highly detrimental to the American people at large. Moreover, providing consistency in regulatory expectations at the national level could reduce industry uncertainty about the prospect of a patchwork of regulations across industry sectors and geographic regions.

29. How does the dearth of measurable standards or benchmarks impact the uptake of audits and assessments?

We would not characterize the current situation as completely lacking in measurable standards for AI. In developing our GPAIS profile guidance, one of our key criteria for draft guidance has been measurability or at least documentability. NIST appears to have taken a similar approach in developing the AI RMF guidance.

AI Accountability Policies

30. What role should government policy have, if any, in the AI accountability ecosystem? For example:

a. Should AI accountability policies and/or regulation be sectoral or horizontal, or some combination of the two?

AI accountability policies and regulations should be a combination of sectoral and horizontal. The latest agreement on the EU AI Act in the European Parliament exemplifies this: it includes sectoral regulations, e.g., for particular high-risk application areas such as in critical infrastructure, as well as some horizontal regulations, e.g., in provisions for providers of GPAIS and foundation models that can be used in many sectors.

c. If a federal law focused on AI systems is desirable, what provisions would be particularly important to include? Which agency or agencies should be responsible for enforcing such a law, and what resources would they need to be successful?

First, federal procurement requirements and other regulations incorporating or adapting the NIST AI RMF should at least add force of law to important but otherwise voluntary statements in the NIST AI RMF, such as the following: “In cases where an AI system presents unacceptable negative risk levels — such as where significant negative impacts are imminent, severe harms are actually occurring, or catastrophic risks are present — development and deployment should cease in a safe manner until risks can be sufficiently managed.” (NIST AI RMF Version 1.0, p.8) Such provisions will be necessary to adapt the soft-law voluntary guidance of the NIST AI RMF into appropriate hard-law requirements and regulations.

Second, a federal law focused on AI should mandate reasonable testing, evaluation, and monitoring of AI systems to ensure that they uphold the characteristics of trustworthiness defined in the NIST AI RMF, including that they are appropriately valid and reliable, safe, secure and resilient, accountable and transparent, explainable and interpretable, privacy-enhanced, and fair with harmful biases managed. If the AI system interacts with people, impacts people, or relies on human data, a summary of the testing and evaluation should be publicly disclosed.

Third, high risk uses and unacceptable uses of AI should be defined. For example, it should be made clear that high risk uses of AI such as in critical infrastructure should be subject to stringent oversight, and that unacceptable uses of AI systems include cases with imminent significant negative impacts, severe harms, or catastrophic risks to human safety and rights.

Fourth, developers of LLMs, foundation models, or other GPAIS should be accountable for AI trustworthiness and risk-management tasks, such as AI development and testing tasks to characterize knowledge limits, for which they are better placed (and often better resourced) than others in the AI value chain. Responsibilities for risk assessment and risk management should not fall only on downstream developers of applications built on GPAIS platforms, nor only on deployers of AI systems in particular end-use categories.

Lastly, of course, any agency responsible for enforcing a law must have the authority and resources to enforce compliance. NIST received authorization from Congress to develop the AI RMF only as voluntary guidance for industry and other actors. As far as we are aware, neither NIST nor any other federal agency currently has regulatory authority and policies in place that require industry to use the AI RMF, nor the resources to assess and enforce compliance.

Footnotes

  1. See, e.g., https://ainowinstitute.org/publication/gpai-is-high-risk-should-not-be-excluded-from-eu-ai-act
  2. See, e.g., https://techpolicy.press/five-takeaways-from-the-nist-ai-risk-management-framework/
  3. See, e.g., https://ainowinstitute.org/publication/gpai-is-high-risk-should-not-be-excluded-from-eu-ai-act
  4. See our project page at https://cltc.berkeley.edu/seeking-input-and-feedback-ai-risk-management-standards-profile-for-increasingly-multi-purpose-or-general-purpose-ai/
  5. See https://carrcenter.hks.harvard.edu/files/cchr/files/nonnecke_and_dawson_human_rights_implications.pdf
  6. See https://techpolicy.press/how-should-companies-communicate-the-risks-of-large-language-models-to-users/
  7. See https://cltc.berkeley.edu/publication/designing-risk-communications-a-roadmap-for-digital-platforms/