simeon_c

Idea: Daniel Kokotajlo probably lost quite a bit of money by declining to sign an OpenAI NDA before leaving, which I consider a public service at this point. Could some funders in the AI safety landscape offer money or social recognition for this?

I guess reimbursing everything Daniel lost might be a bit too much for funders, but providing some money, both to reward the act and to incentivize future safety people not to sign NDAs, would be very valuable.

I mean, the full option space obviously also includes "bargain with Russia and China to make credible commitments that they stop rearming (possibly in exchange for something)", and I think we should totally explore that path as well. I just don't have much hope in it at this stage, which is why I'm focusing on the other option, even if it is a fucked-up local Nash equilibrium.

I've been thinking a lot recently about taxonomizing AI-risk-related concepts to reduce the dimensionality of AI threat modelling while remaining quite comprehensive. It's in the context of developing categories to assess whether labs' plans cover the various areas of risk.

There are two questions I'd like to get takes on. Any take on either of them would be very valuable.

  1. In the misalignment threat model space, a number of safety teams tend to assume that the only type of goal misgeneralization that could lead to X-risks is deceptive misalignment. I'm not sure I understand where that confidence comes from. Could anyone make, or link to, a case that rules out the plausibility of all other forms of goal misgeneralization?
  2. It seems to me that to minimize the dimensionality of the threat modelling, it's sometimes more useful to think about the threat model (e.g. a terrorist misuses an LLM to develop a bioweapon) and sometimes more useful to think about a property which has many downstream consequences on the level of risk. I'd like to get takes on one such property:
    1. Situational awareness: It seems to me that it's most useful to think of this property as its own hazard which has many downstream consequences on the level of risk (most prominently that a model with it can condition on being tested when completing tests). Do you agree or disagree with this take? Or would you rather discuss situational awareness only in the context of the deceptive alignment threat model?

Rephrasing based on an ask: "Western democracies need to urgently put a hard stop to Russia and China's war (preparation) efforts" -> Western democracies need to urgently take action to stop the current shift towards a new world order in which conflicts are a lot more likely, because Western democracies are no longer a hegemonic power able to crush authoritarian powers that grab land, etc. This shift is currently driven primarily by the fact that Russia and China are heavily rearming themselves whereas Western democracies are not.

@Elizabeth

Answer by simeon_c

I liked this extension (https://chrome.google.com/webstore/detail/whispering/oilbfihknpdbpfkcncojikmooipnlglo), which I use for long messages. I press a shortcut and it starts recording; I press it again and it puts the Whisper transcript in my clipboard.
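The press-to-record, press-again-to-transcribe workflow described above can be sketched as a small toggle. This is a minimal illustration, not the extension's actual code: the audio capture, the Whisper transcription call, and the clipboard write are injected as callables, and all names here (`DictationToggle`, etc.) are hypothetical.

```python
class DictationToggle:
    """Sketch of a two-press dictation flow: first press starts recording,
    second press stops, transcribes, and copies the text to the clipboard."""

    def __init__(self, start_recording, stop_recording, transcribe, copy_to_clipboard):
        self._start = start_recording        # begins audio capture
        self._stop = stop_recording          # ends capture, returns the audio
        self._transcribe = transcribe        # audio -> text (e.g. a Whisper call)
        self._copy = copy_to_clipboard       # text -> clipboard
        self.recording = False

    def press(self):
        # First press: start recording and wait for the next press.
        if not self.recording:
            self._start()
            self.recording = True
            return None
        # Second press: stop, transcribe, and copy the result.
        audio = self._stop()
        self.recording = False
        text = self._transcribe(audio)
        self._copy(text)
        return text
```

In the real extension the `transcribe` step would call the Whisper API and `copy_to_clipboard` would use the browser clipboard; here they can be any callables, which also makes the toggle logic easy to test with fakes.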

In those, Ukraine committed to passing laws on the decentralization of power, including through the adoption of the Ukrainian law "On temporary Order of Local Self-Governance in Particular Districts of Donetsk and Luhansk Oblasts". Instead of decentralization, they passed laws forbidding those districts from teaching children in the languages those districts want to teach in.

Ukraine's unwillingness to follow the agreements was a key reason why the invasion in 2022 happened, and it made the invasion very popular with the Russian population.

I wasn't aware of that; that's useful, thank you.

My (simple) reasoning is that I pattern-matched hard to the Anschluss (https://en.wikipedia.org/wiki/Anschluss) as a prelude to WW2, where democracies accepted a first conquest hoping that it would stop there (spoiler: it didn't).

Minsk feels very much the same way. From the perspective of democracies, it seems kind of reasonable to try a peaceful resolution once, accepting a conquest and seeing whether Putin stops (although in hindsight it was unreasonable not to prepare for the possibility that he wouldn't). Now that he has started invading Ukraine as a whole, it seems really hard for me to believe "once he gets Ukraine, he'll really stop". I expect many reasons to invade other adjacent countries to come up as well.

The latest illegal land grab was done by Israel without any opposition from the US. If you are truly worried about land grabs being a problem, why not speak against the US position of being okay with some land grabs, instead of just speaking for buying more weapons?

Two things on this. 

  1. Object-level: I'm not ok with this. 
  2. At a meta-level, there's a repugnant moral dilemma fundamental to this:
    1. The American hegemonic power was abused; e.g. see https://en.wikipedia.org/wiki/July_12,_2007,_Baghdad_airstrike, or a number of wars that the US started for dubious reasons (usually some economic or geostrategic interest). (The same goes for France; I'm just focusing on the US here for simplicity.)
    2. Still, despite those deep injustices, the 2000s have been the least lethal period for interstate conflict, because hegemony, with its threat of being crushed by the great power, heavily disincentivizes anyone from fighting.
      1. It seems to me that the hegemony of some power, or coalition of powers, is the most stable state for that reason. So I find this state quite desirable.
    3. Then the other question is: who should be in that position?
      1. I have the chance to be able to write this about my country without ending up in jail for it. And if I did end up in jail, I would have higher odds than in most other countries of being able to contest it.
      2. So, although Western democracies are quite bad and repugnant in a bunch of ways, I find them the least bad and most beneficial existing form of political power to currently defend and preserve the hegemony of.

Indeed. One consideration is that the LW community used to be much less into policy-adjacent stuff and hence much less relevant in that domain. Now, with AI governance becoming an increasingly big deal, I think we could potentially use some of that presence to push for certain things in defense.

Pushing for things in the genre of what Noah describes in the first piece I shared seems feasible for some people in policy.

simeon_c

Idk what the LW community can do but somehow, to the extent we think liberalism is valuable, the Western democracies need to urgently put a hard stop to Russia and China war (preparation) efforts. I fear that rearmament is a key component of the only viable path at this stage.

I won't argue in detail here but will link to Noahpinion, who's been quite vocal on these topics. The TLDR is that China and Russia have been scaling their war-industry preparation efforts for years, while Western democracies' industries keep declining and remain crazily dependent on Chinese industry. This creates a new global equilibrium where the US is no longer powerful enough to disincentivize all authoritarian regimes from grabbing more land, etc.

Some readings relevant to that:

I know this is not a core LW theme but to the extent this threat might be existential to liberalism, and to the existence of LW as a website in the first place, I think we should all care. It would also be quite terrible for safety if AGI was developed during a global war, which seems uncomfortably likely (~10% imo).


If you wanna reread the debate, you can scroll through this thread (https://x.com/bshlgrs/status/1764701597727416448). 

There was a hot debate recently but regardless, the bottom line is just "RSPs should probably be interpreted literally and nothing else. If a literal statement is not strictly there, it should be assumed it's not a commitment."

I've not seen people interpreting those very literally, so I just wanted to emphasize that point.
