_________________________________

 PARALLEL INSIGHTS FOR AI SAFETY

           Dylan Holmes
_________________________________


8/Mar/2022


One aspect of AI safety resembles the issues around biosecurity:

  BIOSECURITY: Bioengineering promises advances in fields such as medicine and agriculture. With today's inexpensively available equipment, small distributed "backyard" operations can contribute alongside well-funded governments. Unfortunately, when misused, bioengineering poses global existential risks: whether created deliberately or released accidentally, an engineered bioweapon could cause massive suffering and potentially end all life on earth. The safety problem consists of finding ways to prevent disaster among so many independent operations pursuing so many different purposes.

However, AI itself is special (and uniquely dangerous) in that its engineering products are intelligent agents [1], potentially equal to or exceeding human intelligence. Thus another aspect of AI safety looks like issues around politics, ethics, or economics:

  ECONOMICS: How do you promote work aligned with human flourishing? How do you promote prosocial interactions among intelligent beings? What social arrangements, incentives, and rules will help? What are the cultural measures of well-being?

Finally, much AI research nowadays is data-driven. As a result, it often feeds into a massive surveillance industry which captures, buys, and sells the microscopic details of our lives. This surveillance contributes to environmental waste, attention theft, bias automation, police overreach, and internet pollution---externalities that endanger our ability to determine our own lives. In this aspect, AI safety looks like issues around sovereignty:

  SOVEREIGNTY: How do you control how personal data is used and sold? How do you establish boundaries of privacy, fairness, survival, and peace when they run up against powerful, indifferent profit motives?

Why point out the common ground between AI safety and other domains like biosecurity, economics, and sovereignty? After all, the existential dangers posed by machine intelligence are genuinely unique---unprecedented and leviathan in scope.

First of all, this common ground exposes just how hard the problems of AI safety really are. For example, suppose we take seriously the analogy that a megacorporation is a sort of proto-artificial intelligence: it is a human-made construct, built out of rules, that pursues its value-maximizing goals independently of any one human being's control and possibly at the expense of human welfare. There is a /corporate/ value-alignment problem which I think is strictly easier than the familiar AI value-alignment problem---to put it in slogan form, I suspect that if we knew enough to solve AI value alignment, we could leverage that knowledge to solve corporate alignment as a matter of course. Yet, in my view, achieving corporate alignment is a tangibly difficult project on its own [2].

Thus you can explain how difficult AI value alignment is by pointing to the comparatively more accessible problem of corporate value alignment. You can argue for how much work there is to be done on AI alignment by pointing out our dismal progress on developing and enacting a theory of corporate alignment. And, perhaps, you can dissect the problems with these lesser automata, our corporations, to make inroads on the full project of AI alignment [3].

Second, zeroing in on such areas of overlap can help us concretely identify potential weaknesses in our AI safety proposals.
For example, if we field-test a method for safe AI engineering by applying it to a bioengineering problem, our failures can potentially expose areas where the AI method falls short. To be sure, the fit between AI safety and these parallel fields will never be perfect---e.g., you might be able to influence AI development through hardware regulations or algorithmic safeguards that do not transfer to, say, influencing bioweapon development. Nonetheless, parallel insights can provide a crucial source of ongoing feedback, intuition, and verification---and proactively, in advance of superintelligent AI.

Finally, we can use areas of overlap to measure overall progress. We should expect /genuinely/ effective solutions in AI safety to have carryover applications in other areas: If you have a proposal to mitigate the risk of an uncontrolled AI outbreak, you ought to be better equipped to mitigate a /biological/ outbreak. If you learn how to align robotic agents with human interests, you ought to have learned something practical about corporate governance, etc. (Conversely, by studying what works and what doesn't in a field like modern biosafety, you ought to be able to extract lessons applicable to AI safety.)

Common ground is a useful barometer: if we are succeeding at tackling the tremendous coordination, engineering, and political challenges of bringing about AI safety, we should see spillover effects reverberating through other fields. We should be able to chart our progress, in part, by seeing where our disseminated knowledge has done useful work on other problems.

I believe that parallel insights can help us assess and strengthen our AI safety methods. At the same time, I admit that I'm daunted by the implications of this approach. I've heard optimists suggest, for example, that some unknown fact about computation might cause sufficiently advanced human-like machines to preferentially develop human-aligned values. Where, I ask, is the analogous human-protective principle in microbiology [4]? I've tried to estimate how long it will be until we've put an effective AI alignment strategy in place---and then I ask, as a proxy, how long until we attain prosocial megacorporations? Such parallel insights give one practical indication of how hard the work is.

Perhaps parallel insights can also provide traction on this difficult work. By zeroing in on where AI safety overlaps with other domains, we can see more clearly what to try, what works, and what doesn't. The dangers of AI are unprecedented, but not completely strange. My hope is that, as we draw insights from other domains, the problems---and their solutions---will start to look usefully familiar.


Footnotes
_________

[1] Not that pathogens aren't intelligent and adaptive.

[2] This line of reasoning is related to the computer scientist's idea of a reduction: you find that solving the big problem would also demolish the little problem---so if the little problem is hard, then the big problem must be very, very hard.

[3] One such insight from thinking about the corporate case is that it's probably easier to build prosocial constraints into an AI /before/ it is made than to retrofit them onto an existing AI---a potentially useful warning.

[4] Actually there /are/ related principles in microbiology---such as the virulence-transmissibility trade-off---but the challenge there is to argue that they're at all relevant to our purpose-built machines.