Probabilistic Risk Assessment - Promises, Benefits and Challenges

What is probabilistic risk assessment?

Probabilistic risk assessment (PRA) is a systematic methodology for evaluating the risks associated with a complex engineered system such as an airliner or a nuclear power plant. The methodology, used by organizations such as the Nuclear Regulatory Commission and NASA, quantifies risks according to their likelihood and the severity of their consequences.
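
To make this concrete, below is a minimal sketch of the core PRA quantification, in which each scenario carries a likelihood and a consequence severity, and the overall risk is the expected consequence across scenarios. The scenario names and numbers are illustrative placeholders, not real estimates.

```python
# Minimal PRA quantification sketch: risk as expected consequence.
# All scenario names and numbers are illustrative, not real estimates.

scenarios = {
    # name: (annual likelihood, consequence severity in arbitrary loss units)
    "coolant_leak": (1e-3, 10),
    "operator_error": (1e-2, 1),
    "containment_failure": (1e-6, 10_000),
}

for name, (p, c) in scenarios.items():
    print(f"{name}: likelihood={p:.0e}, severity={c}, contribution={p * c:.4f}")

total_risk = sum(p * c for p, c in scenarios.values())
print(f"expected annual loss: {total_risk:.4f}")
```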

How can PRA be used to evaluate AI risk?

  1. Elicit scenarios, variables, and dependencies: Use relevant risk identification techniques to elicit scenarios, the most important variables determining the level of risk, and the dependencies between those variables. These techniques include event-tree and fault-tree analysis (see the first sketch after this list), but could also include many other relevant techniques described in existing work (e.g. Koessler et al., 2023). This part of the work should be done at least partly by industry.
  2. Elicit priors on extinction accidents based on organizational track record: Draw on past incidents, near-misses, and accidents in the relevant category, as well as on the organization's past ability to execute its plans. Data on past incidents can be used quantitatively to derive a prior on the probability of an existential accident, by estimating how much more likely an ordinary accident or near-miss is than an extinction-level accident (see the second sketch after this list). Organizations can then extrapolate from this record to steer clear of dangerous activities. Data on past ability to execute can be used qualitatively to identify patterns, common causes, and trends, and to assess the plausibility of a lab's plan; quantitatively, it can be factored into the final PRA estimate.
  3. Provide all the relevant data to experts in both forecasting and AI so that they can produce risk estimates in response to well-defined questions (see the aggregation sketch below).
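
Step 1 above mentions event- and fault-tree analysis. Here is a minimal fault-tree sketch, assuming independent basic events; the tree structure, event names, and probabilities are hypothetical placeholders.

```python
# Minimal fault-tree sketch with AND/OR gates, assuming independent events.
# The tree, event names, and probabilities below are hypothetical.

def and_gate(*probs):
    """Output event occurs only if all input events occur."""
    out = 1.0
    for p in probs:
        out *= p
    return out

def or_gate(*probs):
    """Output event occurs if any input event occurs: 1 - P(none occur)."""
    none = 1.0
    for p in probs:
        none *= (1.0 - p)
    return 1.0 - none

# Hypothetical top event: harm requires a dangerous capability emerging
# AND the safeguards failing; safeguards fail if the content filter
# fails OR the deployment monitoring fails.
p_dangerous_capability = 1e-2
p_filter_failure = 1e-3
p_monitoring_failure = 5e-3

p_safeguards_fail = or_gate(p_filter_failure, p_monitoring_failure)
p_top_event = and_gate(p_dangerous_capability, p_safeguards_fail)
print(f"P(top event) = {p_top_event:.2e}")
```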

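Step 2 turns the organizational track record into a rough prior. One simple approach, sketched below under strong assumptions, is to smooth the observed near-miss rate and then scale it down by an expert-judged ratio of near-misses to catastrophic accidents. All numbers are invented for illustration.

```python
# Track-record prior sketch. All numbers are invented for illustration,
# and the near-miss-to-accident ratio is a pure expert-judgment assumption.

near_misses = 4                  # observed near-misses in the record
system_years = 20                # years of comparable operation
near_misses_per_accident = 1000  # judged ratio of near-misses to accidents

# Laplace-style smoothing so zero observations still yield a nonzero rate
near_miss_rate = (near_misses + 1) / (system_years + 2)

# Scale the near-miss rate down by the judged ratio
prior_accident_rate = near_miss_rate / near_misses_per_accident
print(f"prior accident rate: {prior_accident_rate:.1e} per system-year")
```
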
This high-level description needs to be refined and improved. Many techniques could be used, and the most efficient ways of producing valuable information for the PRA should be determined empirically. It is likely that 80% of the effort lies in the first two steps, yet very little effort to date has been invested by industry in threat modeling and scenario analysis. For experts to confidently forecast a probability below a threshold such as 1/100,000 would require massive scenario-analysis and threat-modeling efforts from industry and regulators.
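
Step 3 then pools the experts' individual estimates. One pooling rule common in the forecasting literature is the geometric mean of odds; a minimal sketch with illustrative inputs:

```python
import math

# Pool expert probability estimates via the geometric mean of odds.
# The individual estimates below are illustrative, not real forecasts.

estimates = [1e-4, 5e-4, 2e-5, 1e-3]

def geo_mean_odds(probs):
    """Average the experts' log-odds, then transform back to a probability."""
    log_odds = [math.log(p / (1.0 - p)) for p in probs]
    mean_log_odds = sum(log_odds) / len(log_odds)
    odds = math.exp(mean_log_odds)
    return odds / (1.0 + odds)

print(f"pooled estimate: {geo_mean_odds(estimates):.1e}")
```

The 1/100,000 figure also illustrates why modeling effort is unavoidable: by the rule of three, observing zero failures in n independent trials gives only about 3/n as a 95% upper bound, so a purely empirical bound at that level would require on the order of 300,000 failure-free trials.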

Benefits & Challenges

Core Benefits

Setting a clear goal

PRA allows organizations to set a clear goal for industry to achieve. This has many benefits: 
  • It is very transparent and understandable from a policy standpoint.
  • It allows organizations to quantitatively assess how far current safety levels are from the adequate level.
  • It is a direct estimate of what regulation actually targets (i.e. decreasing specific risks).
  • It provides clarity about the goals industry should meet, and about where to push further.

Leveraging industry

  • It drives industry towards better safety practices by prioritizing investment in the practices with the best cost-benefit tradeoff (see the sketch after this list).
  • It incentivizes industry to solve problems ahead of time because the targets are outcome-based rather than process-based.
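
To illustrate the cost-benefit point above: once risks are quantified, mitigations can be ranked by expected risk reduction per unit cost. The mitigations and numbers below are hypothetical.

```python
# Rank hypothetical mitigations by estimated risk reduction per unit cost.
# All names and numbers are invented for illustration.

mitigations = [
    # (name, estimated absolute risk reduction, cost in $M)
    ("deployment monitoring", 1e-5, 2),
    ("red-teaming program", 2e-5, 5),
    ("interpretability research", 5e-5, 40),
]

ranked = sorted(mitigations, key=lambda m: m[1] / m[2], reverse=True)
for name, reduction, cost in ranked:
    print(f"{name}: {reduction / cost:.1e} risk reduction per $M")
```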

Setting the right level of ambition and rigor

  • It has a proven track record of making the nuclear safety field much more rigorous and realistic in its analysis.
  • It helps set the right level of ambition in an easily understandable way and makes it possible to push for specific goals clearly.

Core Challenges

  • Feasibility: By far the biggest challenge is ensuring that the probability estimates are precise enough to be usable. Progress in the field of forecasting over the past two decades suggests that expert forecasters with AI expertise might be able to assess the evidence appropriately and provide reasonably sound estimates (probably more so than nuclear safety experts could before PRA was introduced in their field). Anecdotally, early discussions with forecasters with AI expertise from the Samotsvety group suggest that it may be feasible.
  • Scalability: If such an analysis is feasible only for world-class AI forecasters, it raises a scalability issue: if there are dozens of systems to assess, there will not currently be enough talent to assess them all. If PRA requires unusually rare talent, it could be reserved for the most dangerous frontier systems (e.g. GPT-5), while methods that depend less on rare skills, or that give less freedom to expert judgment, are used for lower-stakes systems.
  • Relevance: Given that current AI systems are trained end-to-end, that the weights of a neural network cannot really be decomposed into analyzable components, and that nuclear systems are much better understood, one might worry that this method is not applicable to AI because the necessary information cannot be obtained.

Counterpoints to the above: 

  • Nuclear is better understood largely in hindsight: at the time PRA was introduced in the nuclear industry, nuclear risks were not well understood either.
  • AI safety encompasses all the systems built around a foundation model that are meant to make it safer. Hence, there is some value in thinking in terms of components of the overall system, the foundation model being only one of those components.
  • The fact that we don't understand a foundation model will be factored into experts' probability estimates, i.e. it is hard to reach a very high degree of confidence that a system is safe if the system is not understood. Hence, PRA would be expected to incentivize industry players to invest much more in interpretability.

Relevant Links