Squiggle Maximizer (formerly "Paperclip maximizer")


I addressed this in my top-level comment as well, but do we think Yud here holds that there is such a thing as "our full moral architecture", or is he reasoning from the impossibility of such completeness to the conclusion that alignment cannot be achieved by modifying the 'goal'?


This entry should address the fact that "the full complement of human values" is an impossible and dynamic set. There is no full set, both because the set is interactive with a dynamic environment that presents infinite conformations (from an obviously finite set of materials), and because the set is riven with indissoluble conflicts (hence politics); whatever set was given to the maximizer AGI would have to be rendered free of these conflicts, and would then no longer be the full set, etc.


Question: Are innerly-misaligned (superintelligent) AI systems supposed to necessarily be squiggle maximizers, or are squiggle maximizers supposed to only be one class of innerly-misaligned systems?

I added some caveats about the potential for empirical versions of moral realism and about how precise value targets are in practice.

While the target is small in mind space, IMO it's not that small with respect to things like the distribution of evolved life or, more narrowly, the distribution of humans.

Any AGI with full power over the lightcone, if it is not to destroy most potential from a human perspective, must have something sufficiently close to human values as its terminal value (goal). Further, seemingly small deviations could result in losing most of the value. Human values seem unlikely to spontaneously emerge in a generic optimization process[1]. A dependably safe AI would therefore have to be programmed explicitly with human values or programmed with the ability (including the goal) of inferring human values.

  1. ^

    Though it's conceivable that empirical versions of moral realism could hold in practice.
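A minimal toy sketch of the "small deviations lose most of the value" claim above, not taken from the article: suppose true value requires every one of several complementary values to get some resources, while a proxy objective silently drops one of them. An optimizer for the proxy then allocates nothing to the missing dimension and scores roughly zero true value. All dimension names, the budget, and both objective functions below are made up for illustration.

```python
# Toy illustration: value fragility under a slightly mis-specified objective.
# A fixed budget is split across several hypothetical "values". True value is
# the product of allocations (every dimension matters); the proxy ignores one
# dimension. Maximizing the proxy zeroes out that dimension and so loses
# nearly all true value, despite the proxy looking only slightly different.
from itertools import product

BUDGET = 10  # discrete units of optimization power / resources
DIMS = ["welfare", "freedom", "diversity", "novelty"]  # hypothetical values

def true_value(alloc):
    # Complementary values: total value collapses if any dimension gets zero.
    v = 1.0
    for x in alloc:
        v *= x
    return v

def proxy_value(alloc):
    # Same as true_value, but silently ignores the last dimension.
    v = 1.0
    for x in alloc[:-1]:
        v *= x
    return v

def best_allocation(objective):
    # Brute-force search over all ways to split BUDGET across the dimensions.
    best, best_alloc = -1.0, None
    for alloc in product(range(BUDGET + 1), repeat=len(DIMS)):
        if sum(alloc) == BUDGET and objective(alloc) > best:
            best, best_alloc = objective(alloc), alloc
    return best_alloc

aligned = best_allocation(true_value)
misaligned = best_allocation(proxy_value)
print("aligned allocation:   ", aligned, "-> true value", true_value(aligned))
print("misaligned allocation:", misaligned, "-> true value", true_value(misaligned))
```

Under these made-up assumptions the aligned optimizer spreads the budget across all four dimensions, while the proxy optimizer puts zero on the omitted one and achieves zero true value; the point is only that a near-miss objective can be arbitrarily far from a near-miss outcome.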
