Radical Transparency

Nothing here that hasn’t been said before, but it didn’t click until I thought about it some more and had an AHA! moment, so I’m doing my own write-up.

Let’s say that you’re faced with a Newcomb problem[1].

The basic gist is this: Omega shows up, an entity that you know can predict your actions almost perfectly. Concretely, out of the last million times it has played out this scenario, it has been right 99.99% of the time[2]. Omega presents you with two boxes: box A contains either $1000000 or nothing, and box B always contains $1000. You have only two choices: take just box A (one boxing) or take both box A and box B (two boxing). The twist is that if Omega predicted you would two box, then box A is empty, but if it predicted you would one box, then box A contains the $1000000.

Causal decision theory (CDT) is a leading brand of decision theory that says you should two box[3]: once Omega presents you with the boxes, it has already made up its mind. At that point there’s no direct causal relationship between your choice and the boxes having money in them; box A already contains $1000000 or nothing. So it’s always better to two box, since you always end up with $1000 more than you would otherwise.

People that follow CDT to two boxing claim that one boxing is irrational, and that Omega is specifically rewarding irrational people. To me it seems clear that CDT was never meant to handle problems that include minds modeling minds: is it also irrational to show up at Grand Central Station at noon in Schelling’s coordination problem, despite the lack of causal connection between your actions and the actions of your anonymous compatriot? So you might agree that CDT just doesn’t do well in this case[4] and decide to throw it out the window for this particular problem, netting yourself an expected $999900 from one boxing[5], instead of the expected $1100 payout from two boxing.
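If it helps to see that comparison spelled out, here’s a minimal Python sketch of the expected-value arithmetic. The names are mine, and the one modeling assumption beyond the setup above is that Omega’s 99.99% accuracy applies equally to would-be one boxers and two boxers.

```python
# Expected payouts for the opaque-box version described above, assuming
# Omega's 99.99% accuracy applies equally to one boxers and two boxers.

ACCURACY = 0.9999        # probability Omega correctly predicts your choice
BOX_A_FULL = 1_000_000   # what box A holds when Omega predicts one boxing
BOX_B = 1_000            # box B always holds this

def expected_payout(one_box: bool) -> float:
    """Average payout, weighting each world by how often Omega produces it."""
    # Box A is full exactly when Omega predicted one boxing.
    p_full = ACCURACY if one_box else 1 - ACCURACY
    if one_box:
        return p_full * BOX_A_FULL + (1 - p_full) * 0
    return p_full * (BOX_A_FULL + BOX_B) + (1 - p_full) * BOX_B

print(f"one boxing: ${expected_payout(True):,.2f}")   # $999,900.00
print(f"two boxing: ${expected_payout(False):,.2f}")  # $1,100.00
```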

But let’s throw in a further twist: the boxes are now transparent, so you can see how much money is inside, and you see $1000000 in box A in addition to the $1000 in box B. Now do you two box?


I previously thought “duh, of course”: you SEE the two boxes, both with money in them. Why wouldn’t you take both? A friend I respect told me that I was being crazy, but didn’t have time to explain, and I went away confused. Why would you still one box with an extra $1000 sitting in front of you?

(Feel free to think about the problem before continuing.)

The problem was that I was thinking too small: I was thinking about the worlds in which I had both boxes with money in them, but I wasn’t thinking about how often those worlds would come up. If Omega wants to maintain a 99.99% accuracy rate, it can’t just hand anyone a box with $1000000 in it. It has to be choosy, to look for people that will likely one box even when severely tempted.

That is, if you two box in clear-box situations and you get presented with a clear box with $1000000 in it, congratulations: you’ve won the lottery. However, people like you simply aren’t chosen often (only 0.01% of the time), so in the transparent Newcomb world it is better to be the sort of person that will one box, even when tempted with arguably free money.
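To make “how often those worlds would come up” concrete, here’s a small Monte Carlo sketch. The model is my own simplification of the clear-box game: Omega predicts your disposition when shown a full box A with 99.99% accuracy, fills box A only when it predicts one boxing, and anyone shown an empty box A just takes the visible $1000 from box B.

```python
import random

# Monte Carlo sketch of the clear-box variant, under the assumed model above:
# Omega predicts how you'd act when shown a *full* box A (99.99% accuracy),
# fills box A only when it predicts one boxing, and anyone shown an empty
# box A just takes the visible $1000 from box B.

ACCURACY = 0.9999
MILLION, THOUSAND = 1_000_000, 1_000

def play_once(one_box_when_full: bool, rng: random.Random) -> int:
    # Omega's prediction of your full-box disposition, right 99.99% of the time.
    predicted_one_box = one_box_when_full if rng.random() < ACCURACY else not one_box_when_full
    box_a = MILLION if predicted_one_box else 0
    if box_a == 0:
        return THOUSAND  # nothing tempting on the table; just take box B
    return box_a if one_box_when_full else box_a + THOUSAND

def average_payout(one_box_when_full: bool, trials: int = 1_000_000) -> float:
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    return sum(play_once(one_box_when_full, rng) for _ in range(trials)) / trials

print(f"one box even when tempted:     ${average_payout(True):,.0f}")   # ~$999,900
print(f"two box when money is visible: ${average_payout(False):,.0f}")  # ~$1,100
```

Averaged over many rounds, the committed one boxer clears roughly $999900 per game while the clear-box two boxer collects roughly $1100, because Omega almost never hands the latter a full box in the first place.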


The clear-box formulation makes it even clearer how Newcomb’s problem relates to ethics.

Yes, ethics. Let’s start with what Omega might put in an advertisement:

“I’m looking for someone that will likely one box when given ample opportunity to two box, and literally be willing to leave money on the table.”

Now, let’s replace some words:

“I’m looking for <a study partner> that will likely <contribute to our understanding of the class material> when given ample opportunity to <coast on our efforts>.”

“I’m looking for <a startup co-founder> that will likely <help build a great business> when given ample opportunity to <exploit the business for personal gain>.”

“I’m looking for <a romantic partner> that will likely <be supportive> when given ample opportunity to <make asymmetric relationship demands>.”

In some ways these derived problems are wildly different: these (lowercase) omegas don’t choose correctly as often as 99.99% of the time, there’s an iterated aspect, both parties are playing simultaneously, and there’s reputation involved[6]. But the important decision-theoretic core carries over, and moreover it generalizes past “be nice” into alien domains that include boxes with $1000000 in them, and still correctly decides to take the $1000000.


[1]  For most intents and purposes, I agree that the Parfit’s Hitchhiker formulation of the problem is strictly better, because it lacks the features that commonly trip people up in Newcomb’s problem, like needing a weird Omega. However, then you get the clear-box problem right away, and I’m going for more incremental counter-intuitiveness right now.

[2]  Traditional formulations of Newcomb’s problem start with a perfect predictor, but that becomes a major sticking point because it’s so damn “unrealistic”. I’m sure no one would object to Omega never losing at tic-tac-toe, but no one seems willing to accept a hypothetical entity that can run TEMPEST attacks on human brains and do inference really well. Whatever; perfect prediction is ultimately not important to the problem, so it’s somewhat better to place realistic bounds on Omega.

[3]  Notably, evidential decision theory (EDT) says you should one box, but it fails on other problems and makes it a point to avoid getting news (which isn’t the worst policy when applied to most common news sources, but EDT applies it to all information inflow).

[4]  I haven’t really grokked it, but friends are excited about functional decision theory, which works around some of the problems with CDT and EDT.

[5]  It’s not exactly $1000000, since Omega isn’t omniscient and is only 99.99% accurate, so we have to take the average of the outcomes weighted by their probability to get the overall expected payout: one boxing gives $1000000 * 0.9999 + $0 * 0.0001 = $999900, while two boxing gives $1000 * 0.9999 + ($1000000 + $1000) * 0.0001 = $1100.

[6]  Notably, it starts to bear some resemblance to the iterated prisoner’s dilemma.