Measuring Shadows: Newcomb's Decision

Friday, November 23, 2012

Newcomb's Decision

This post is partially a continuation of my previous posts on utilitarianism, and partially on philosophy in general; mostly, it's my two cents on one of the odder parts of consequentialist debate: decision theories.

Newcomb's Paradox

You, a mere mortal, encounter P, some super smart alien. Or maybe it's a supercomputer, or maybe a god; versions of the paradox differ on this. P comes up to you and says: "I have a deal for you. I'm going to give you two boxes--box A, and box B. Box B is transparent, and you can see $1,000 in it. You can't see what's in box A. I'm going to give you two choices. The first is to take box A--you get whatever is in it. The other choice is to take both boxes--you get box A, plus the $1,000 from box B."

So, you ask, why wouldn't I take both boxes and get the free $1,000? Well, says P, there's a catch: "I have predicted whether you will take one box or two boxes." (Or maybe he's simulated all of the atoms in the universe, or maybe studied your psychology, or maybe something else--versions of the paradox differ in how P knows how many boxes you're going to take. But however he knows it, you believe him; maybe he has, in the past, correctly predicted the choice of everyone who's taken this challenge.) "So I know what you're going to do", says P, "and before you arrived I decided how much money to put in box A. If I predicted that you were going to take only box A, I put $1,000,000 in it. Otherwise--if I predicted that you were going to take both boxes--I left box A empty."

"So", says P, "How many boxes do you want to take?"

---

So, how many boxes should you take? Well, this "paradox", and ones like it, have spawned countless arguments over "decision theories". I would try to define them but I think it's easier to see the two main decision theories by example of how many boxes they take. The first type of decision theory, evidential decision theory, says: well, what behavior is consistent with the highest expected value for me? If I take one box (box A), I will get the $1,000,000 from it; if I take two, P won't put any money in box A, and so I'll just get the $1,000 from box B. So, the evidential decision theorist would only take box A.

Causal decision theorists, on the other hand, say: what actions will cause the best results? So, a causal decision theorist would say: if I take two boxes, then that'll cause me to get the $1,000 in box B, whereas if I only take one box I won't, and since P has already decided whether to put the money in box A, my decisions can't cause the money to exist or not exist. And so the causal decision theorist would take both boxes.
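The two calculations can be sketched in Python. This is only an illustration, not a formal definition of either theory: the function names, the `accuracy` parameter (the probability that P's prediction matches what you actually do), and the `prior_full` parameter (the causal theorist's fixed belief that box A is already full) are assumptions introduced here.

```python
BOX_A_PRIZE = 1_000_000  # what P puts in box A if he predicted one-boxing
BOX_B_PRIZE = 1_000      # always visible in box B


def evidential_value(boxes_taken: int, accuracy: float) -> float:
    """Expected payout, treating your choice as evidence about P's prediction."""
    if boxes_taken == 1:
        # P predicted one box with probability `accuracy`, in which case A is full.
        return accuracy * BOX_A_PRIZE
    # P predicted two boxes with probability `accuracy`, so A is full
    # only in the cases where he was wrong.
    return (1 - accuracy) * BOX_A_PRIZE + BOX_B_PRIZE


def causal_value(boxes_taken: int, prior_full: float) -> float:
    """Expected payout, holding the contents of box A fixed."""
    # The choice made now cannot change whether box A is full.
    in_box_a = prior_full * BOX_A_PRIZE
    return in_box_a + (BOX_B_PRIZE if boxes_taken == 2 else 0)
```

With a perfect predictor (`accuracy=1.0`), `evidential_value` gives 1,000,000 for one box and 1,000 for two. `causal_value` for two boxes exceeds the one-box value by exactly 1,000 whatever `prior_full` is, which is the whole of the causal argument.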

So, who's right? Well, the one-box-taking evidential decision theorist will take only box A, and--per the assumptions of the problem--find $1,000,000 in it. The two-box-taking causal decision theorist, on the other hand, will take both box A and box B, confident that their actions can't change what's in box A--and then will find box A to be empty, and end up with $1,000.

So is evidential decision theory (ED) correct? Did the causal decision theorist (CD) throw away $999,000 because they insisted on taking the $1,000 from box B?

Well, let's back up a second. Why was there $1,000,000 in box A when the ED opened the box, but not when the CD did? What exactly do we mean when we say that P knows how many boxes you will pick?

One of two things is true. Either we live in a universe where, prior to you making your choice, P knows how many boxes you'll take, or we don't.

Say P doesn't know with certainty. He's pretty sure--he's studied you a lot and studied psychology a lot and is pretty damn sure he knows how many boxes you'll take--but theoretically he could be wrong. Well, in this case the two-box-taking CD's mistake was in how he lived his life up until that point. He made lots of decisions that made P think he would take two boxes, and so P didn't put anything in box A for him. If he really wanted that $1,000,000 he should have written lots of blog posts during his life about how he would take only one box so that when the day came he could convince P to put the $1,000,000 in box A--and then taken both boxes, to get the full $1,001,000. But he didn't, and now that he's sitting there with both boxes in front of him he might as well take both of them--P has already decided that there isn't going to be any money in the first box.

But what if P knows for sure? What if your blog posts can't fool him? What if he's simulated every atom in the universe and knows whether you'll take one box or two? Then shouldn't you choose to be an evidential decision theorist, and choose to take only box A, so that you can get the $1,000,000?

Well, the trick is in the word choose. If P knows for sure how many boxes you'll take then it's already been decided--and it doesn't mean anything to talk about how many boxes you should choose to take. It's already been decided how many boxes you're taking.

My point, I guess, is that evidential decision theory only makes sense in a universe where (a) P is sure how many boxes you'll take, but (b) you still have the option to take either one or both. But these are contradictory assumptions--the contradictory assumptions behind Newcomb's paradox.

In fact, causal decision theory is the same thing as evidential decision theory in non-contradictory universes. There is no distinction between actions that happen if and only if you make some decision and actions caused by that decision, except in inconsistent universes where something can be dependent on a decision but somehow not causally related to it.

So what would happen if P offered this deal to me? Well, I'd talk a lot about how much I only intend to take one box, but P wouldn't buy it; he'd leave box A empty, and I'd take both boxes for a total of $1,000.

But if my goal is to get more than $1,000 from this process I've been screwed for a while. I've been screwed since I first thought through this problem and realized that it made no sense to take only one box. I've been screwed since the minute I was born and P realized I was going to be a two-boxer. There's nothing I can do about it.

Though writing this blog post certainly won't help.

28 comments:

Anders Kaseorg, November 23, 2012 at 1:53 AM
If the reason P knows how many boxes you’d take is that he simulated every atom in the universe, then between the time he made the offer and the time you decided, how do you know that you are not part of the simulation?

That is why I’m a one-boxer. (Universe simulators take note.)
Unknown, November 23, 2012 at 11:48 AM
I think I disagree with both your claimed requirements for one-box to make sense. (Note that this does not mean I disagree with your arguments why two-box makes sense :P). I could elaborate, but that's not the point.

I'm a physicist. Theory is good, but the real way to learn the truth is with experiments. Sam, let's do an experiment. We'll each write an AI for this game (with no randomness; if you allow coin flipping, then the game gets even more interesting :P). An impartial observer will read the AIs, perhaps run some diagnostic experiments, and then put money in the boxes and let them play. If you want the Oracle to be less than perfect, we can instead do this: don't show him the code, but rather show him: with 1% probability the result of a single run of the code, and with 99% probability a single random (50-50) result. Then the oracle has got a 50.5% chance; good enough for me.

We'll run a million trials, and see who has a higher expected value. No matter what theoretical discussions we have, we must believe that whichever strategy makes more money on average is a better strategy.

-David

PS: Here's my AI:

def newcomb(*whatever_inputs):
    return 1
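For anyone who wants to try the proposed experiment, here is a rough sketch of the noisy-oracle version. Several things are assumptions not spelled out in the comment: the observer fills box A exactly when the result shown to him is "1", the payouts are the $1,000,000 and $1,000 from the post, and the helper names (`expected_payout`, `average_payout`, `peek_probability`) are made up for illustration.

```python
import random

BOX_A_PRIZE = 1_000_000
BOX_B_PRIZE = 1_000


def newcomb_one_box(*_inputs):
    return 1


def newcomb_two_box(*_inputs):
    return 2


def expected_payout(choice: int, peek_probability: float = 0.01) -> float:
    """Exact average payout for a deterministic player against the noisy oracle."""
    # The oracle sees "1" either by peeking at a one-boxer or by a fair coin flip.
    p_sees_one = peek_probability * (choice == 1) + (1 - peek_probability) * 0.5
    return p_sees_one * BOX_A_PRIZE + (BOX_B_PRIZE if choice == 2 else 0)


def average_payout(player, trials: int, rng: random.Random,
                   peek_probability: float = 0.01) -> float:
    """Monte Carlo estimate of the same quantity."""
    total = 0
    for _ in range(trials):
        choice = player()
        # With probability peek_probability the oracle is shown a real run,
        # otherwise a 50-50 random result.
        if rng.random() < peek_probability:
            observed = choice
        else:
            observed = rng.choice((1, 2))
        # Assumption: box A is filled exactly when the oracle was shown "1".
        in_box_a = BOX_A_PRIZE if observed == 1 else 0
        total += in_box_a + (BOX_B_PRIZE if choice == 2 else 0)
    return total / trials
```

Under these assumptions the exact averages are about $505,000 for the one-box player and $496,000 for the two-box player; `average_payout(newcomb_one_box, 1_000_000, random.Random(0))` estimates the first of these by simulation.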
Jeff Wu, November 23, 2012 at 1:22 PM
The real choice is not whether to one-box or two-box. The choice is, "Which decision theory should I use?". Without explicitly introducing this choice, I get confused reasoning about the relationship between P's decision and your boxing decision.

So you have the option of choosing a decision theory which precommits to taking only one box. And this decision causes you to end up with more money. So shouldn't you choose a decision theory like that?

You say "There's nothing I can do about [being screwed]". But now it's clear that there is something you can do: choose to be the type of person to one-box. Of course, once you decide it makes no sense, you're indeed screwed. But surely it was you that screwed yourself over, then?

Also, it's fine for there to be noise between the arrows.
Unknown, November 25, 2012 at 12:47 PM
Sam, I'm curious what you would do in this game:

Two people, call them A and B, are going to play a prisoner's dilemma; they each write down "1" or "2." Afterward, if they both wrote 1, they both get a thousand dollars; if they both wrote 2, they each get a dollar; and if one writes 1 and one writes 2, then mister 1 gets nothing and mister 2 gets $1,001.

I tell you that you are player B, and that player A's made his move earlier; his move is written in this envelope right here.

OK, that's just a regular prisoner's dilemma; probably you should defect like in a regular prisoner's dilemma.

But now I tell you that player A was you yesterday. I took you into this experiment and told you everything I'm telling you now, including assuring you that player A's move was pre-written (although at the time I was lying and the envelope was empty). But then instead of evaluating your move, I wiped your memory and sent you home, and put your move into envelope A. In fact, you actually have no idea if, right now, it is really the last move of the game, or if I'm going to wipe your memory, put your move in the envelope, and bring you back tomorrow.

What do you write?

If you want to stand by 2-boxing, you have to say one of two things:
1) You would still defect in a prisoner's dilemma where you get to write both moves (they have to be the same, but you get to write both).
2) This is different from Newcomb's paradox, despite the fact that there's an envelope on the table which I claim predicts your move with reasonable confidence, and if it says "1" then you get $1,000 for "1" and $1,001 for "2", and if it says "2" then you get $0 for "1" and $1 for "2".
Unknown, November 26, 2012 at 7:10 PM
You're right, your scenario #2 is a much better way to phrase it, because you don't care about your twin's payoff. I think you can make stronger claims than "I would cooperate" and "it's similar to Newcomb's paradox."

Surely everyone would cooperate in scenario #2: if you wouldn't cooperate, then I could tell you that this person has behaved opposite you in only 20% of circumstances. If you would still defect, then what if it were 1%? What about 0.001%? 0.1^10^30? There can't be a discontinuity at 100%, because if there were you would have to defect in scenario 1 because there's a *chance* that all the air molecules in the room will gather themselves in just the right way to refract the light so that your reflection behaves differently from you (I think it's about 0.1^10^30, plus or minus a few orders of magnitude in the order of magnitude.)

Also surely scenario #2 is the same as Newcomb's paradox; just rename "twin" to "oracle," ask him for his move yesterday, and tell him to submit his move in the form of an amount of money in box #1, and you're done.
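The continuity point can be made concrete with a small calculation. This is a sketch under loud assumptions: it borrows the payoffs from the prisoner's dilemma described earlier in the thread ($1,000 / $0 / $1,001 / $1), it treats the other player's move as matching yours with some probability `p_same`, and the names `PAYOFF`, `expected_payoff` and `cooperation_advantage` are invented for illustration. It is the evidential-style way of scoring the moves, which a two-boxer would dispute.

```python
# Payoff to you, indexed by (your_move, other_move).
PAYOFF = {
    (1, 1): 1_000,
    (1, 2): 0,
    (2, 1): 1_001,
    (2, 2): 1,
}


def expected_payoff(your_move: int, p_same: float) -> float:
    """Expected payoff if the other player matches your move with probability p_same."""
    other_if_different = 2 if your_move == 1 else 1
    return (p_same * PAYOFF[(your_move, your_move)]
            + (1 - p_same) * PAYOFF[(your_move, other_if_different)])


def cooperation_advantage(p_same: float) -> float:
    """How much better writing "1" does than writing "2"; positive favors "1"."""
    return expected_payoff(1, p_same) - expected_payoff(2, p_same)
```

Working it through by hand, the advantage is 2000 * p_same - 1001, a straight line in `p_same`: it is 999 at 100% agreement, 599 at 80%, and crosses zero at 50.05%. Nothing special happens at 100%.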
Jeff Wu, November 28, 2012 at 1:40 AM
Do you really mean that "everyone would cooperate"? I predict at least 10% of people would defect in scenario 2, but not scenario 1. I don't understand your argument, even at a high level. The less they are like you, the better it is to defect, no? I don't understand what you're saying with the discontinuity either.

But yeah, it seems like it's the same as the version of Newcomb's paradox where P uses a psychological twin to predict your move.

While I think anyone who defects in scenario 1 is completely absurd, I find it is more difficult to make an argument in favor of scenario 2 than you suggest.

I think to me, it boils down to something like this: Take action such that you are (expected to be) in worlds where your objectives are achieved. Importantly, when updating the distribution of worlds you're in, you should condition on the fact that you took what actions you did. It's not about causality or "free will". In particular, when considering counterfactuals, your actions can update the distribution of worlds you were in, just like the actions of others can.

If you want to continue this conversation, feel free to email/gchat me: WuTheFWasThat. I realize I could subscribe my email, but it's more annoying.
Anonymous, November 13, 2022 at 1:55 AM
You're a sick puppy. Get help.