As I understand it, interpretability research hasn't exactly gotten stuck, but it's very, very far from anything like this, even for non-SotA models. And the gap is growing.

'Empiricism!' as Anti-Epistemology

Tapatakt1mo11

Which concept they might obtain by reading my book on Highly Advanced Epistemology 101 For Beginners, or maybe just my essay on Local Validity as a Key to Sanity and Civilization, I guess?"

Perhaps there should be two links here?

'Empiricism!' as Anti-Epistemology

Tapatakt1mo10

Do you think that if someone filtered and steelmanned Quintin's criticism, it would be valuable? (No promises)

'Empiricism!' as Anti-Epistemology

Tapatakt1mo10

I think from Eliezer's point of view it goes kinda like this:

People can't see why the other side's arguments are invalid.
Eliezer tried to engage with them, but most listeners/readers can't tell who is right in these discussions.
Eliezer thinks that if he gives people strawmanned versions of the other side's arguments, along with refutations of those strawmanned arguments, then the chance that these people will see why he's right in the real discussion will go up.
Eliezer writes this discussion with strawmen as a fictional parable because otherwise it would be either dishonest and rude, or a rather boring text with a lot of disclaimers. Or because it's just easier for him to write it this way.

After reading this text, at least one person (you) thinks that the goal "avoid dishonesty and rudeness" was not achieved, so the text is a failure.

After reading this text, at least one person (me) thinks that (1) I got some useful ideas and models, and (2) of course, at least the smartest of Eliezer's opponents have better arguments, and I don't think Eliezer would disagree with that; so the text is a success.

Ideally, Eliezer should update his strategy of writing texts based on both pieces of evidence.

I could be wrong, of course.

Tapatakt's Shortform

Tapatakt1mo10

Is there any research on how we could change network structure or protocols to make it more difficult for a rogue AI to create and run distributed copies of itself?

Tapatakt's Shortform

Tapatakt2mo16

What if we just turned off the ability to add a reaction by clicking it in the list of already-used reactions? Yes, people would use them less, but more deliberately.

Tapatakt's Shortform

Tapatakt2mo11

The LessWrong reaction system creates the same bias as normal reactions: it's much, much easier to use a reaction someone has already used. So the first person to use a reaction under a comment gets undue influence on what reactions there will be under that comment in the future.

Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do.

Tapatakt2mo20

I think I'm at least close to agreeing, but even if it's like this now, it doesn't mean that a complex-positive-value-optimizer can produce more value mass than a simple-negative-value-optimizer.
