Comments

Answer by Chi Nguyen, May 03, 2024

I watched and read a ton of Lab Muffin Beauty Science when I got into skincare. Apart from sunscreen, I think a lot of it is trial and error with what has good short-term effects. I'm not sure about long-term effects at all tbh. Lab Muffin Beauty Science is helpful for figuring out your skin type, for leads on which products to try first, and for how to use them. (There's a fair number of products you wanna ramp up slowly and, even by the end, only use on some days.)

Are there types of published alignment research that you think were (more likely to be) good to publish? If so, I'd be curious to see a list.

Agree-vote: I generally tend to choose work over sleep when I feel particularly inspired to work.

Disagree-vote: I generally tend to choose sleep over work even when I feel particularly inspired to work.

Any other reaction, new answer or comment, or no reaction of any kind: Neither of the two descriptions above fit.

I considered making four options to capture the dimension of whether you endorse your behaviour or not but decided against it. Feel free to supplement this information.

Interesting. The main thing that pops out for me is that it feels like your story is descriptive while we try to be normative? I.e. it's not clear to me from what you say whether you would recommend that humans act in this cooperative way towards distant aliens, but you seem to expect that they will do/are doing so. Meanwhile, I would claim that we should act cooperatively in this way but make no claims about whether humans actually do so.

Does that seem right to you or am I misunderstanding your point?

Letting on-lookers know that I responded in this comment thread

I'm not sure I understand exactly what you're saying, so I'm just gonna write some vaguely related things to classic acausal trade + ECL:

I'm actually really confused about the exact relationship between "classic" prediction-based acausal trade and ECL. And I think I tend to think about them as less crisply different than others do. I tried to unconfuse myself about that for a few hours some months ago and just ended up with a mess of a document. One intuitive way to differentiate them:

  • ECL leverages the correlation between you and the other agent "directly."
  • "Classic" prediction-based acausal trade leverages the correlation between you and the other agent's prediction of you. (Which, intuitively, they are less in control of than their decision-making.)

--> This doesn't look like a fundamental difference between the mechanisms (and maybe there are in-betweeners? But I don't know of any set-ups) but like...it makes a difference in practice or something?

On the recursion question:

I agree that ECL has this whole "I cooperate if I think that makes it more likely that they cooperate", so there's definitely also some prediction flavoured thing going on and often, the deliberation about whether they'll be more likely to cooperate when you do will include "they think that I'm more likely to cooperate if they cooperate". So it's kind of recursive.

Note that ECL at least doesn't strictly require that. You can in principle do ECL with rocks: "My world model says that conditioning on me taking action X, the likelihood of this rock falling down is higher than if I condition on taking action Y." Tbc, if action X isn't "throw the rock" or something similar, that's a pretty weird world model. You probably can't do "classic" acausal trade with rocks?
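As a toy illustration of the rock example (a minimal sketch only: the probabilities are made up, the names `JOINT`, `p_falls_given`, and `evidentially_preferred` are hypothetical, and it assumes the agent wants the rock to fall), conditioning on your own action in a world model might look like this:

```python
from fractions import Fraction

# Hypothetical joint distribution over (my_action, rock_falls).
# The numbers are invented purely for illustration.
JOINT = {
    ("X", True): Fraction(3, 10),
    ("X", False): Fraction(2, 10),
    ("Y", True): Fraction(1, 10),
    ("Y", False): Fraction(4, 10),
}


def p_falls_given(action, joint=JOINT):
    """Return P(rock falls | my action) by conditioning the joint on the action."""
    p_action = sum(p for (a, _), p in joint.items() if a == action)
    return joint[(action, True)] / p_action


def evidentially_preferred(actions=("X", "Y")):
    """Pick the action under which the rock is most likely to fall.

    Assumes (for illustration) that the agent wants the rock to fall.
    """
    return max(actions, key=p_falls_given)
```

With these made-up numbers, P(rock falls | X) = 3/5 and P(rock falls | Y) = 1/5, so conditioning favours X. Nothing in the calculation requires the rock to predict or model the agent, which is the sense in which this kind of reasoning doesn't strictly need the recursive part.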

Some more not well-in-order not thought-out somewhat incoherent thinking-out-loud random thoughts and intuitions:

More random and less coherent: Something something about how when you think of an agent using some meta-policy to answer the question "What object-level policy should I follow?", there's some intuitive sense in which ECL is recursive in the meta-policy while "classic" acausal trade is recursive in the object-level policy. I'm highly skeptical of this meta-policy object-level policy thing making sense though and also not confident in what I said about which type of trade is recursive in what.

Another intuitive difference is that with classic acausal trade, you usually want to verify whether the other agent is cooperating. In ECL you don't. Also, something something about how it's great to learn a lot about your trade partner for classic acausal trade and it's bad for ECL? (I suspect that there's nothing actually weird going on here and that this is because it's about learning different kinds of things. But I haven't thought about it enough to articulate the difference confidently and clearly.)

The concept of commitment race doesn't seem to make much sense when thinking just about ECL and maybe nailing down where the difference comes from is interesting?

Thanks! I actually agree with a lot of what you say. Lack of excitement about existing intervention ideas is part of the reason why I'm not all in on this agenda at the moment. Although in part I'm just bottlenecked by lack of technical expertise (and it's not like people had great ideas for how to align AIs at the beginning of the field...), so I don't want people to overupdate from "Chi doesn't have great ideas."

With that out of the way, here are some of my thoughts:

  • We can try to prevent silly path-dependencies in (controlled or uncontrolled, i.e. misaligned) AIs. As a start, we can use decision theory (DT) benchmarks to study how DT endorsements and behaviour change under different conditions and how DT competence scales with size compared to other capabilities. I think humanity is unlikely to care a ton about AI's DT views and there might be path-dependencies. So like, I guess I'm saying I agree with "let's try to make the AI philosophically competent."
    • This depends a lot on whether you think there are any path-dependencies conditional on ~solving alignment. Or if humanity will, over time, just be wise enough to figure everything out regardless of the starting point.
    • One source of silly path-dependencies is if AIs' native DT depends on the training process and we want to de-bias against that. (See for example this or this for some research on what different training processes should incentivise.) Honestly, I have no idea how much things like that matter. Humans aren't all CDT even though my very limited understanding of evolution is that it should, in the limit, incentivise CDT.
    • I think depending on what you think about the default of how AIs/AI-powered earth-originating civilisation will arrive at conclusions about ECL, you might think some nudging towards the DT views you favour is more or less justified. Maybe we can also find properties of DTs that we are more confident in (e.g. "does this or that in decision problem X") than whole specified DTs, which, yeah, I have no clue. Other than "probably not CDT."
  • If the AI is uncontrolled/misaligned, there are things we can do to make it more likely it is interested in ECL, which I expect to be net good for the agents I try to acausally cooperate with. For example, maybe we can make misaligned AI's utility function more likely to have diminishing returns or do something else that would make its values more porous. (I'm using the term in a somewhat broader way than Bostrom.)
    • This depends a lot on whether you think we have any influence over AIs we don't fully control.
  • It might be important and mutable that future AIs don't take any actions that decorrelate them from other agents (i.e. do things that decrease their acausal influence) before they discover and implement ECL. So, we might try to just make them aware of that early.
    • You might think that's just not how correlation or updatelessness work, such that there's no rush. Or that this is a potential source of value loss but a pretty negligible one.
  • Things that aren't about making AIs more likely to do ECL: Something not mentioned, but there might be some trades that we have to do now. For example, maybe ECL makes it super important to be nice to AIs we're training. (I mostly lean no on this question (at least for "super important") but it's confusing.) I also find it plausible we want to do ECL with other pre-ASI civilisations who might or might not succeed at alignment and, if we succeed and they fail, part-optimise for their values. It's unclear to me whether this requires us to get people to spiritually commit to this now, before we know whether we'll succeed at alignment or not. Or whether updatelessness somehow sorts this because if we (or the other civ) were to succeed at alignment, we would have seen that this is the right policy, and done this retroactively.

Yeah, you're right that we assume that you care about what's going on outside the lightcone! If that's not the case (or only a little bit the case), that would limit the action-relevance of ECL.

(That said, there might be some weird simulations-shenanigans or cooperating with future earth-AI that would still make you care about ECL to some extent although my best guess is that they shouldn't move you too much. This is not really my focus though and I haven't properly thought through ECL for people with indexical values.)

Whoa, I didn't know about this survey, pretty cool! Interesting results overall.

It's notable that 6% of people also report they'd prefer absolute certainty of hell over not existing, which seems totally insane from the point of view of my preferences. The 11% that prefer a trillion miserable sentient beings over a million happy sentient beings also seems wild to me. (Those two questions are also relatively more correlated than the other questions.)

Thanks, I hadn't actually heard of this one before!

edit: Any takes on addictiveness/other potential side effects so far?
