1-page outline of Carlsmith's otherness and control series

Nathan Young

1-page outline of Carlsmith's otherness and control series

4 min read24th Apr 20243 comments

16

Joe’s summary is here, these are my condensed takeaways in my own words. All links in this section are to the essays.

Outline

Carlsmith tackles two linked questions:
- How should we behave towards future beings (future humans, AIs etc)?
- What should our priors be about how AIs will behave towards us?
So the first question:
We worry about value in the future, how should we behave towards the future and future beings?
- (1) We could trust in some base goodness - the universe, god, AIs being good
- (2) We could accept that all future beings will be alien to us and stop worrying (here and here)
- (3) We could have moral systems or concepts of goodness/niceness
- (4) We could seize power over the future (here and here)
- (5) We could adopt a different poise which centred around notions like growth/harmony/ “attunement” (here and here)
(1)
- Yudkowskians don’t tend to trust in god or the universe. That’s part of their schtick. They don’t trust AIs will be good
(2)
- This is perhaps what Hanson argue for^[1]
- Yudkowskians do not buy this
  - Are they avoiding how alien future people will be?
  - How can they justify a notion of good that’s robust over time? (eg often Yudkowsky pushes against this notion)
  - Are they avoiding discomfort?^[2]
(3)
- Moral systems vary wildly at edge cases
- On the scale of the whole future, even good people controlling it might be a moral disaster. The notion of paperclipping elides this, because it is involuntary and aesthetically dull. But the arguments also extend to law abiding and even relatively joyful beings
(4)
- This powerseeking is hard to distinguish from the power seeking of AIs^[3] (also (here and here)
(5)
- There is some notion of attunement/ trust/ growth/ balance which rationalists and EAs are quite inimical to^[4]
- There seems some way of being which navigates how to interact with complex systems, letting them grow, tending to them without being able to dominate the or being dominated by them
- What might this look like?
Second question. How might AIs treat us?
- If we assume that we cannot trust (not 1,3) think it’s a problem (not 2) and ignore (5) then
  - it’s very easy to see AI as a tool while it is weaker than us and a competitor if it becomes stronger
  - Even if AI may be others ways than this, the safe option is to assume the competitor frame
- Under (5) AI might be
  - Something good but alien (like an octopus)
  - Something dangerous but not competitive (like a dead-eyed bear)
  - Something else. Other. Not us
Takeaways
- We should consider other ways to be towards those we could control but who might control us
- We should consider other relationships towards AI^[5]
- In AI discourse there is a lack of clarity in notions of attunement, respect, harmony in relation to the sub-optimal choices of other conscious beings (here and here)
- It is possible that our priors are driven by our lack of this notion^[6]^[7]

Some quotes I liked/was moved by:

Where Joe is quoting someone else I also link to the original source

On being ‘just statistics’

“Just” is rarely a bare metaphysic. More often, it’s also an aesthetic. And in particular: the aesthetic of disinterest, boredom, deadness. Certain frames – for example, mechanistic ones – prompt this aesthetic more readily. But you can spread deadness over anything you want, consciousness included. Cf depression, sociopathy, etc.

Werner Herzog, on the deadness of nature: (source, link to essay section)

“And what haunts me is that in all the faces of all the bears that Treadwell ever filmed, I discover no kinship, no understanding, no mercy. I see only the overwhelming indifference of nature. To me, there is no secret world of the bears, and this blank stare speaks only of a half-bored interest in food.”

From Yudkowsky: (source, link to essay section)

No rescuer hath the rescuer.
No Lord hath the champion,
no mother and no father,
only nothingness above.

Yudkowsky, on the death of his brother^[8], (source, link to essay section)

... Yehuda did not "pass on". Yehuda is not "resting in peace". Yehuda is not coming back. Yehuda doesn't exist any more. Yehuda was absolutely annihilated at the age of nineteen. Yes, that makes me angry. I can't put into words how angry. It would be rage to rend the gates of Heaven and burn down God on Its throne, if any God existed. But there is no God, so my anger burns to tear apart the way-things-are, remake the pattern of a world that permits this....

Haters gonna hate; atheists gonna yang^[9]; agents gonna power-seek

(link)

Utilitarianism does not love you, nor does it hate you, but you're made of atoms that it can use for something else

(link)

On whether it is wrong to cut down ancient trees?

And yet, for all this, something about just cutting down this ancient, living tree for lumber does, indeed, feel pretty off to me. It feels, indeed, like some dimension related to "respect" is in deficit.

Crossposted from my blog (link to be attached later because I formatted for lesswrong before substack)

What did you think of this? I have a much longer point by point summary and if 10 people sign up to a paid subscription of my blog I’ll finish and post that to them^[10].

Are there other pieces you would like summarised/translated into Nathanese?

^{^}
It’s funny to me that Carlsmith’s hierarchy of atheism seems to imply Hanson is the deepest atheist, disbelieving not only in God and the goodness of the universe but also that there is a stable notion of good over time. I softly endorse this
^{^}
Specific quote: "On the other hand, some sort of discomfort in trying to control the values of future humans persists (at least for me). I think Hanson is right to notice it – and to notice, too, its connection to trying to control the values of the AIs. I think the AI alignment discourse should, in fact, prompt this discomfort – and that we should be serious about understanding, and avoiding, the sort of yang-gone-wrong that it's trying to track."
^{^}
Specific quote: "Utilitarianism does not love you, nor does it hate you, but you're made of atoms that it can use for something else."
^{^}
Specific quote: "Indeed, for closely related reasons, when I think about the two ideological communities that have paid the most attention to AI risk thus far—namely, Effective Altruism and Rationalism—the non-green of both stands out."
^{^}
Specific quote: "Fear? Oh yes, I expect fear. But not only that. And we should look ahead to the whole thing."
^{^}
Specific quote: "I want to start this series by acknowledging how many dimensions of interspecies-relationship this narrative leaves out"
^{^}
To me, there is a slight undercurrent of this being a self-fulfilling prophecy/ vicious cycle- that we make a world of conflict slightly more likely by considering that world more likely than it is
^{^}
I find this quote tremendously moving. And some part of me sings in unison
^{^}
Carlsmith links the notion of powerseeking, agency, activity and a lack of trust and labels it 'yang'. I have thought using it a lot since
^{^}
And write more of this kind of stuff in future. This post took 5 - 15 hours more than if I'd just listened to the pieces. Getting it this short took a long time, as the saying goes, "If I had more time, I would have written a shorter letter" (seems we don't know who originally said this)

New to LessWrong?

Getting Started

FAQ

Library

Ethics & MoralityPriorsAI

Frontpage

16

1-page outline of Carlsmith's otherness and control series

24th Apr 2024

4Nathan Helm-Burger

2Nathan Young

New Comment

3 comments, sorted by

top scoring

Click to highlight new comments since: Today at 7:39 AM

[-]Nathan Helm-Burger11d40

Personally, I most enjoyed the first one in the the series, and enjoyed listening to Joe's reading of it even more than when I just read it. My top three are 1, 6, 7.

[-]Nathan Young11d20

I sort of don't think it hangs together that well as a series. Like I think it implies a lot more interesting points that it makes, hence my reordering.

[-]Nathan Young11d20

Someone said they dislike bullet point lists. Here is the same piece formatted as paragraphs. Do you prefer it? (in which case I will edit and change it)

Carlsmith tackles two linked questions:

How should we behave towards future beings (future humans, AIs etc)?
What should our priors be about how AIs will behave towards us?

Let's consider the first question - what should our poise be towards the future? Perhaps we are worried about the future being less valuable than it could be or that humans will be killed or outcompeted by AI.

The blog posts contain a range of answers to this question, broadly categorised as follows:

We could trust in some base goodness - the universe, god, AIs being good
We could accept that all future beings will be alien to us and stop worrying (see here and here)
We rely on moral systems or concepts of goodness/niceness
We could seize power over the future (see here and here)
We could adopt a different poise which centred around notions like growth/harmony/ “attunement” (here and here)

Let's look at each. Generally each point links to a blog or pair of blog posts.

Trusting in basic goodness, eg god or the universe. I might think God holds the future in His hands, or more broadly that things tend to be okay. Carlsmith considers a distrust of this as a feature of Yudkowskianism, which he labels "deep atheism". Not merely not trusting in God, but not trusting that things will be 'okay' unless we make them so. For similar reasons, Yudkowskians don't assume AIs will be good. AIs will be good. For them this isn't a good answer.

Next, I might decide this isn't fixable. Hanson argues future people of any stripe might be as deeply alien to us as we are to, say the Ancient Greeks. He doesn't expect a future we consider good to be possible or, likely, desirable. Carlsmith notes that Yudkowskians don't hold this view and muses why. Are they avoiding how alien future people will be? Do they have a clear notion of good that's robust over time? ( Yudkowsky doensn't seem to think so). Or are they avoiding thinking about something uncomfortable?

Many answers seem to rely on moral systems, but these present their own problems. Moral systems vary wildly at edge cases meaning that at the scale of the future, many beliefs systems would advocate for seizing control against others. We fear paperclipping partially because it is involuntary and aesthetically dull. But the arguments also extend to law abiding and even relatively joyful beings taking increasing control of the future via legal and positive sum means.

However, without the above, most justifications for seizing control of the future look like those of the AIs. I too would be trying to gain the most resources for my aims at the cost of others, regardless of their ethical stances. In this sense, the AIs aren't bad because they foom, they are bad because they are.. not us. However this looks worrying like a justification that Stalin or the paperclippers could use. See here and here.

Finally, Carlsmith posits a hidden fifth option, for which we current lack good concepts. He points to a notion of trust/ growth/ balance/ attunement. He talks about the colour, 'green' from Magic the Gathering, which is about growing in harmony with complex systems, sometimes trusting, sometimes acting. He notes rationalists and EAs are quite historically inimical to this (favouring 'blue' and 'black'). He repeatedly tries to point at this missing way of being.

There are therefore a number of ways of dealing with the future, with a number of flaws.

But there is a second parallel discussion, about how might AIs treat us. Perhaps because our imagination of our future selves informs how we imagine AIs.

For instance if we assume that we cannot trust things (not 1,3) then it's very easy to see AIs as a tool or competitor. Either it is more powerful than us or we are more powerful than it.

However, if there is a meaningful position on (5). There may be other ways to relate to AIs and future people. Here we might not control them and they might not control us. We might relates as gentle aliens (eg octopus), as dead-but-not-supreme nature (like a bear). Or something even more other than that. Something we cannot imagine but should attempt to.

Note that this doesn't mean we shouldn't fear AIs, they might still be capable of ruining the future, but this poise feels different.

In conclusion, this is my shortest summary of this set of blogs (though there is much much more in there). We should consider other ways to be towards those we could control but who might control us. We should consider other possible relationships towards AI. In AI discourse there is a lack of clarity in notions of attunement, respect, harmony in relation to the sub-optimal choices of other conscious beings. It is possible that this affects our priors about what AI might be like, possibly pushing towards a worse equilibrium.

Moderation Log