Squiggle Maximizer (formerly "Paperclip maximizer")


I addressed this in my top-level comment as well, but do we think Yud here holds that there is such a thing as "our full moral architecture", or is he reasoning from the impossibility of such completeness to the conclusion that alignment cannot be achieved by modifying the 'goal'?


This entry should address the fact that "the full complement of human values" is an impossible and dynamic set. There is no full set, both because the set is interactive with a dynamic environment that presents infinite conformations (from an obviously finite set of materials), and because the set is riven with indissoluble conflicts (hence politics); whatever set was given to the maximizer AGI would have to be rendered free of these conflicts, and would then no longer be the full set, etc.


Question: Are innerly-misaligned (superintelligent) AI systems supposed to necessarily be squiggle maximizers, or are squiggle maximizers supposed to only be one class of innerly-misaligned systems?

I added some caveats about the potential for empirical versions of moral realism and about how precise value targets are in practice.

While the target is small in mind space, IMO it's not that small with respect to things like the distribution of evolved life or, more narrowly, the distribution of humans.

Any AGI with full power over the lightcone, if it is not to destroy most potential from a human perspective, must have something sufficiently close to human values as its terminal value (goal). Further, seemingly small deviations could result in losing most of the value. Human values seem unlikely to spontaneously emerge in a generic optimization process[1]. A dependably safe AI would therefore have to be programmed explicitly with human values or programmed with the ability (including the goal) of inferring human values.

  1. ^

    Though it's conceivable that empirical versions of moral realism could hold in practice.
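A minimal toy sketch of the "small deviations lose most of the value" claim above, not taken from the article: suppose true value requires every one of several complementary values to get some resources, while a proxy objective silently drops one of them. An optimizer for the proxy then allocates nothing to the missing dimension and scores roughly zero true value. All dimension names, the budget, and both objective functions below are made up for illustration.

```python
# Toy illustration: value fragility under a slightly mis-specified objective.
# A fixed budget is split across several hypothetical "values". True value is
# the product of allocations (every dimension matters); the proxy ignores one
# dimension. Maximizing the proxy zeroes out that dimension and so loses
# nearly all true value, despite the proxy looking only slightly different.
from itertools import product

BUDGET = 10  # discrete units of optimization power / resources
DIMS = ["welfare", "freedom", "diversity", "novelty"]  # hypothetical values

def true_value(alloc):
    # Complementary values: total value collapses if any dimension gets zero.
    v = 1.0
    for x in alloc:
        v *= x
    return v

def proxy_value(alloc):
    # Same as true_value, but silently ignores the last dimension.
    v = 1.0
    for x in alloc[:-1]:
        v *= x
    return v

def best_allocation(objective):
    # Brute-force search over all ways to split BUDGET across the dimensions.
    best, best_alloc = -1.0, None
    for alloc in product(range(BUDGET + 1), repeat=len(DIMS)):
        if sum(alloc) == BUDGET and objective(alloc) > best:
            best, best_alloc = objective(alloc), alloc
    return best_alloc

aligned = best_allocation(true_value)
misaligned = best_allocation(proxy_value)
print("aligned allocation:   ", aligned, "-> true value", true_value(aligned))
print("misaligned allocation:", misaligned, "-> true value", true_value(misaligned))
```

Under these made-up assumptions the aligned optimizer spreads the budget across all four dimensions, while the proxy optimizer puts zero on the omitted one and achieves zero true value; the point is only that a near-miss objective can be arbitrarily far from a near-miss outcome.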
