
Optimal AI Liability for Unavoidable Risks: A Game-Theoretic Perspective

Gleb Papyshev, Keith Jin Deng Chan, Sara Migliorini

Abstract

Despite significant advances in AI, unavoidable risks such as specification gaming and hallucinations remain inherent features of these systems. Current regulatory frameworks, including initiatives like the EU AI Act, focus on risk prevention but fail to adequately address liability for harms caused by these unavoidable risks. To address this gap, we develop a game-theoretic model that examines the optimal liability framework for AI developers. Our model proposes a dynamic liability regime that incentivizes developers to invest in explainability practices. Under this framework, liability exposure decreases as developers demonstrate higher levels of explainability, creating a direct economic incentive to improve interpretability. The regime links liability to explainability benchmarking, allowing courts to evaluate whether a harm was truly unavoidable or attributable to deficiencies in the system design. The framework we advocate is flexible and adaptive, relying on industry-driven benchmarking standards to ensure that liability rules evolve alongside technological advancements.
