One question I've had recently is "Are agents acting on selfish preferences doomed to having conflicts with other versions of themselves?" A major motivation of TDT and UDT was the ability to just do the right thing without having to be tied up with precommitments made by your past self - and to trust that your future self would just do the right thing, without you having to tie them up with precommitments. Is this an impossible dream in anthropic problems?
In my recent post, I talked about preferences where "if you are one of two copies and I give the other copy a candy bar, your selfish desires for eating candy are unfulfilled." If you would buy a candy bar for a dollar but not buy your copy a candy bar, this is exactly a case of strategy ranking depending on indexical information.
This dependence on indexical information is incompatible with UDT, and thus with peace and harmony.
To be thorough, consider an experiment where I am forked into two copies, A and B. Both have a button in front of them, and 10 candies in their account. If A presses the button, it deducts 1 candy from A. But if B presses the button, it removes 1 candy from B and gives 5 candies to A.
Before the experiment begins, I want my descendants to press the button 10 times (assuming candies come in units such that my utility is linear). In fact, after the copies wake up but before they know which is which, they want to press the button!
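To make the arithmetic concrete, here is a minimal sketch of the expected candy change per round for a copy that doesn't yet know whether it is A or B. The function name and the 50/50 credence are my own assumptions for illustration; the payoffs are the ones from the experiment above.

```python
from fractions import Fraction

# Payoffs from the experiment as described: each press costs the presser
# 1 candy; a press by B additionally gives 5 candies to A.
PRESS_COST = 1
BONUS_TO_A = 5

def expected_gain_per_round(p_i_am_a=Fraction(1, 2)):
    """Expected candy change for one copy in a round where BOTH copies press,
    given that copy's credence p_i_am_a of being A (assumed to be 1/2)."""
    gain_if_a = -PRESS_COST + BONUS_TO_A  # I pay 1, and B's press gives me 5
    gain_if_b = -PRESS_COST               # I pay 1, and A's press gives me nothing
    return p_i_am_a * gain_if_a + (1 - p_i_am_a) * gain_if_b

print(expected_gain_per_round())  # 3/2
```

With linear utility, a positive expected gain per round is why the policy "press" looks good from behind the veil, even though one of the two copies will end up paying for it.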
The model of selfish preferences that is not UDT-compatible looks like this: once A and B know who is who, A wants B to press the button but B doesn't want to do it. And so earlier, I should try and make precommitments to force B to press the button.
But suppose that we simply decided to use a different model. A model of peace and harmony and, like, free love, where I just maximize the average (or total, if we specify an arbitrary zero point) amount of utility that myselves have. And so B just presses the button.
(It's like non-UDT selfish copies can make all Pareto improvements, but not all average improvements)
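As a rough illustration of the Pareto-versus-average distinction, the sketch below compares "nobody presses" with "only B presses 10 times" using the payoffs from the experiment. The helper names are hypothetical, and the two checks are just the textbook definitions, not anything specific to UDT.

```python
def final_candies(a_presses, b_presses, start=10):
    """Final (A, B) candy counts given how many times each copy presses."""
    a = start - a_presses + 5 * b_presses  # A pays for its own presses, gains 5 per B press
    b = start - b_presses                  # B only ever pays
    return a, b

def is_pareto_improvement(before, after):
    """Nobody ends up worse off and at least one copy ends up better off."""
    pairs = list(zip(before, after))
    return all(y >= x for x, y in pairs) and any(y > x for x, y in pairs)

def is_average_improvement(before, after):
    """Same number of copies in both outcomes, so comparing sums compares averages."""
    return sum(after) > sum(before)

baseline = final_candies(0, 0)   # (10, 10)
b_only = final_candies(0, 10)    # (60, 0)

print(is_pareto_improvement(baseline, b_only))   # False: B goes from 10 to 0
print(is_average_improvement(baseline, b_only))  # True: total goes from 20 to 60
```

So B pressing is exactly the kind of move that every-copy-for-themself agents leave on the table: it raises the average but makes one copy worse off.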
Is the peace-and-love model still a selfish preference? It sure seems different from the every-copy-for-themself algorithm. But on the other hand, I'm doing it for myself, in a sense.
And at least this way I don't have to waste time with precommitment. In fact, self-modifying to this form of preferences is such an effective action that conflicting preferences are self-destructive. If I have selfish preferences now but I want my copies to cooperate in the future, I'll try to become an agent who values copies of myself - so long as they date from after the time of my self-modification.
If you recall, I made an argument in favor of averaging the utility of future causal descendants when calculating expected utility, based on this being the fixed point of selfish preferences under modification when confronted with Jan's tropical paradise. But if selfish preferences are unstable under self-modification in a more intrinsic way, this rather goes out the window.
Right now I think of selfish values as a somewhat anything-goes space occupied by non-self-modified agents like me and you. But it feels uncertain. On the mutant third hand, what sort of arguments would convince me that the peace-and-love model actually captures my selfish preferences?
If one cares about their copies because their past self self-modified to a stable point, then what matters are the preferences of this causal ancestor. If I don't want my preferences to be satisfied if I am given a pill that makes me evil, then I will self-modify so that if one of my future copies takes the evil pill, my other future copies will not help them.
In other words, there is no one true definition here.
However, at a minimum, agents will self-modify so that copies of them with the same values and world-model, but who locate themselves at different places within that model, will sacrifice for each other.
You are just giving yourself a large incentive to lie to your alter ego if you suspect that you are diverging. That doesn't sound good.
On the original post: I don't think that it's practical to commit to something like that right now as a human. I have the same problem with TDT. I can agree that self-modifying is best, but still not do as I would wish to have precommitted. But as we're talking about cloning here anyway, we can assume that self-modification is possible, in which case the question arises whether this modification has positive expected utility. I think it does, but you seem to be trying to say that you wouldn't need to modify, as each side would stay selfish but still do what they would have preferred in the past. Why would you continue doing something that you committed to if it no longer has positive utility?
Would you pay the traveler in Parfit's hitchhiker as a selfish agent? If not, why cooperate with your alter ego after you find out that you are B? (Yes, I'm comparing this to Parfit's hitchhiker with your commitment to press the button if B analogous to a commitment to give money later. It's a little different as it's symmetrical, but the question of whether you should pay up seems isomorphic. Assuming the traveler isn't reading your mind, in which case TDT enters the picture.)