An ‘Ethical Turing Test’ for Autonomous Artificial Agents?
posted by Ken Anderson
My first encounters with legal issues of autonomous artificial agents came a few years ago in the international law of autonomous lethal weapons systems. In an email exchange with an eminent computer scientist working on the problems of engineering systems that could follow the fundamental laws of war, I expressed some doubt that it would be quite so easy as all that to come up with algorithms that could, in effect, “do Kant” (in giving effect to the categorical legal imperative not to target civilians). Or, even more problematically, “do Bentham and Mill” (in providing a proportionality calculus of civilian harm set against military necessity). Indeed (I noted primly, clutching my Liberal Arts degree firmly in the Temple of STEM), we humans didn’t have an agreed-upon way of addressing the proportionality calculus ourselves, given that it seemed to invoke incomparable and incommensurable values. So how was the robot going to do what we couldn’t?
The engineer’s answer was simultaneously cheering and alarming, but mostly insouciant: ‘I don’t have to solve the philosophical problems. My machine programming just has to do on average as well or slightly better than human soldiers do.’ Which, in effect, sets up what we might call an “ethical Turing Test” for the ideal autonomous artificial agent. “Ethics for Robot Soldiers,” as Matthew Waxman and I are calling it in a new project on autonomous robotic weapons. If, in practice, we can’t tell which is the human and which is the machine in matters of ethical decision-making, then it turns out not to matter how we get to that point. Getting there means, in this case, not so much human versus machine, but instead behaviorism versus intentionality.
It is on account of reflections on autonomous robot soldiers of the (possible) future that I so eagerly read Samir Chopra and Laurence White’s book. It does not disappoint. It is the only general theory of what might emerge across multiple areas of law over the next few decades. Still more importantly in my view, it is the only account on offer that manages to find the sweet spot between a sci-fi speculation so rampant that it merely assumes away the problems by making artificial agents into human beings, on the one hand, and an approach so granular that it offers only a collection of discrete legal problems rather than a theory of agents and agency, on the other. It accomplishes all this splendidly.
But it is precisely because the text finds that sweet spot that I have a nagging question – one that is perhaps answered in the book but which I simply didn’t adequately understand. But let me put it directly, as a way of understanding the book’s fundamental frame. In the struggle between behaviorism and the “intentional stance” that runs throughout the book, but particularly in its encounters with the law of agency, and particularly as found in the Restatement, I was not sure where the argument finally comes down regarding the status of intentionality. At some points, it did seem to be an irreducible aspect of certain behaviors, insofar as those behaviors could only be such under an intentional description, such as human relationships. But sometimes it seemed as though intentionality was an irreducible aspect of human behavior – even though the artificial agent might still pass the Turing Test on a purely behavioral basis and be indistinguishable from the human.
At still other points, I thought I was to understand that intentionality was no longer an ontological status, but something closer to an “organizational heuristic” for how human beings direct themselves toward particular goals – a human methodology, true, but merely one way of going about means-to-ends behaviors, in which an artificial agent might accomplish the task quite differently. And in that case, I had a further question as to whether the underlying view of the “formation of judgment” was one that assumed the model of “supply ends, I’ll supply means” – or whether, instead, it held, at least as far as human judgment goes, a view that the formation of judgment does not cleanly separate them in this way. It seemed to matter, at least as far as the conceptualization of how the artificial agent made its judgments, and in what they would finally consist.
It is entirely possible that I have not understood something fundamental in the book, and the answer to what “intention” means in the text is actually quite plain. But this question, in relation to behaviorism and the artificial agent, is what I have found hardest to grasp. I suppose this is particularly so when, for good reasons, the book is mostly about behavior, not intention. The reason I find the question important is that it seems to me that many of the crucial relationships (and also judgments, per the worry above) that might be permitted, or ascribed, to artificial agents depend upon a certain relation – that of a fiduciary, for example, with all the peculiar “relational” positioning that is implied in that special form of agency.
Does being a fiduciary, then, at least in the strong sense of exercising discretion, imply relationships that only exist under a certain intention? Or relationships that might be said to exist only under a certain affect – love, for example? And does it finally matter? Or is the position taken by the book finally one that either reduces the intention to the sum of behaviors, or else suggests that for the purposes for which we create – “endow,” more precisely – artificial agents, behavior is enough, without it being under any kind of description? I apologize for being overly abstract and obscure here. Reduced to the most basic: what is the status, on this general theory, of intention? And with that question, let me say again: Outstanding book; congratulations!