The Company as a Model for Misaligned AGI

31 May 2020

A common counterargument to the worry that an AGI will go FOOM on us and place the universe in a state we dislike is that there is a fundamental incompatibility between intelligence and having goals considered dumb by humans. (I’ve touched upon this in my previous post, under “But a superintelligent AGI couldn’t be this dumb!”)

This question is hard to settle conclusively, because it makes a general statement about the possible existence of certain kinds of generally intelligent minds. Our only real example of a generally intelligent mind is our own, so it’s easy to fall into an anthropocentric view, even though we have no particular reason to believe that all general intelligences are similar to ours. This means we should work to identify useful analogies for what other intelligences could look like.

In the last post, we talked about drug addicts as a possible partial model for AGIs. In my book review of Other Minds, I also discussed octopuses as a completely different type of mind, even though they’re not generally intelligent.

But there might be another good example we can draw intuitions from: companies. A company is not a literal AGI, but it can legally act as a person in many respects, and it can be considered a sort of hive intelligence, where the decisions a company takes are not those of any individual within it, but collective decisions of its members.

As Milton Friedman argued, companies, at least large enough ones, have another interesting characteristic: they have a single terminal goal, making money. Smaller companies may have different goals or values, thanks to founders with a vision or activist investors, but this tends to go away as stockholders become more numerous and anonymous. Even large corporations with a strong culture and mission are required to care about profit, a reflection of their investors’ preferences.

Companies do have instrumental goals, things they want to achieve in order to make money, for instance building the best search engine, or controlling the most oil fields. But these goals are strictly subordinate to the terminal goal of making money, and, if an instrumental goal turns out to be ineffective at profit generation, a company will likely pivot to a new one, possibly totally rewriting its mission statement in the process.
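The relationship between a fixed terminal goal and disposable instrumental goals can be sketched in a few lines of toy code (purely illustrative; the names and numbers here are made up, not a real decision model):

```python
# Toy sketch of a profit-maximizing "company" agent. The terminal goal
# (expected profit) is fixed; instrumental goals (business lines) are
# kept only as long as they serve it, and dropped otherwise.

def choose_strategy(strategies):
    """Pick the instrumental goal with the highest expected profit."""
    return max(strategies, key=lambda s: s["expected_profit"])

# Hypothetical business lines with made-up profit estimates.
strategies = [
    {"mission": "build the best search engine", "expected_profit": 120},
    {"mission": "control the most oil fields", "expected_profit": 80},
]

current = choose_strategy(strategies)  # picks the search engine

# The search engine stops generating profit; the company pivots,
# rewriting its "mission" without any change to the terminal goal.
strategies[0]["expected_profit"] = 10
pivoted = choose_strategy(strategies)  # now picks the oil fields
```

The point of the sketch is that the mission statement is just a variable: nothing in the agent's decision rule refers to it, only to expected profit.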

Hopefully this is at least a little bit reminiscent of the paperclip maximizer and its single terminal goal of maximizing paperclips. There are clear differences too: a company’s actual goal isn’t just to make a lot of money for its shareholders, it’s to make a lot of real, inflation-adjusted money for its shareholders. In particular, a company can’t turn the Earth into dollar bills, enslave all humans, or even just hack into central bank mainframes to give its shareholders a quintillion dollars each, because any of these actions would devalue money to the point of uselessness.

But note that this restriction against destroying the world comes about not because the company did some soul-searching and concluded it would be ethically unacceptable, but as a mere consequence of the terminal goal being formulated in terms of real money. Given the near-complete lack of ethical consideration demonstrated in many industries (oil, tobacco, oil, bananas, oil, etc.), it seems clear that the profit-making goal is entirely disjoint from any concern companies might have for the welfare of humanity. If there were a way to destroy the Earth that happened to generate a lot of profit without directly devaluing money (hello, hard-to-model ecological externalities), many companies would probably happily pursue it. One can imagine catastrophic scenarios where a company clones trillions of humans for slave labor on Mars while keeping Earth’s economy completely intact: real profit is being generated, humankind is safe, but this is still a very bad scenario.

As I’ve mentioned before, the goal of paperclip maximizing is actually not very plausible; in reality, we would expect more complex goals, much like real profit generation, which might imply many restrictions and constrain damage done to humanity in some ways, but still result in terrible outcomes.

And as Charlie Stross remarked in his “Dude, you broke the future!” talk, society is bad at making laws to counter companies’ harmful behavior, in part because of high law-making latency, and this despite companies being pretty slow themselves, making decisions on the order of days at best. An AGI that can make new decisions in minutes would likely vastly outrun our ability to control it: a single person would not be intelligent enough to devise protections against it, and a large group of people would be too slow to be effective.

To sum up, the profit goal came about as part of a system humans devised to maximize human welfare, so it’s an instrumental goal for humanity, but it is being executed as a terminal goal by companies, with no regard for the reasons it was put into place.

This is extremely similar to the paperclip maximizer concern. Slow, moderately intelligent companies can already optimize for their terminal value at the expense of humanity’s overall welfare, and despite our best efforts. A fast, opaque AGI could presumably vastly outperform them at the task of completely ignoring human welfare.