Leaving LessWrong for a more rational life

You are unlikely to see me posting here again, after today. There is a saying here that politics is the mind-killer. My heretical realization lately is that philosophy, as generally practiced, can also be mind-killing.

As many of you know, I am, or was, running a twice-monthly Rationality: From AI to Zombies reading group. One of the things I wanted to include in each reading group post was a collection of contrasting views. To research such views I've found myself listening during my commute to talks given by other thinkers in the field, e.g. Nick Bostrom, Anders Sandberg, and Ray Kurzweil, and by people I feel are doing “ideologically aligned” work, like Aubrey de Grey, Christine Peterson, and Robert Freitas. Some of these were talks I had seen before, or views I had generally been exposed to in the past. But looking through the lens of learning and applying rationality, I came to a surprising (to me) conclusion: it was the philosophical thinkers who demonstrated the largest and most costly mistakes. On the other hand, de Grey and others who are primarily working on the scientific and/or engineering challenges of singularity and transhumanist technologies were far less likely to commit epistemic mistakes of significant consequence.

Philosophy as the anti-science...

What sort of mistakes? Most often, reasoning by analogy. To cite a specific example, one of the core underlying assumptions of the singularity interpretation of super-intelligence is that just as a chimpanzee would be unable to predict what a human intelligence would do or how we would make decisions (aside: how would we know? Were any chimps consulted?), we would be equally inept in the face of a super-intelligence. This argument is, however, nonsense. The human capacity for abstract reasoning over mathematical models is in principle a fully general intelligent behaviour, as the scientific revolution has shown: there is no aspect of the natural world which has remained beyond the reach of human understanding, once a sufficient amount of evidence is available. The wave-particle duality of quantum physics, or the 11-dimensional space of string theory may defy human intuition, i.e. our built-in intelligence. But we have proven ourselves perfectly capable of understanding the logical implications of models which employ them. We may not be able to build intuition for how a super-intelligence thinks. Maybe—that's not proven either. But even if that is so, we will be able to reason about its intelligent behaviour in advance, just like string theorists are able to reason about 11-dimensional space-time without using their evolutionarily derived intuitions at all.

This post is not about the singularity or the nature of super-intelligence—that was merely my choice of an illustrative example of a category of mistakes too often made by those with a philosophical background rather than one in the empirical sciences: reasoning by analogy instead of building and analyzing predictive models. The fundamental mistake here is that an analogy is not in itself a sufficient explanation for a natural phenomenon, because it says nothing about the context sensitivity or insensitivity of the original example, or under what conditions it may or may not hold true in a different situation.

A successful physicist or biologist or computer engineer would have approached the problem differently. A core part of being successful in these fields is knowing when you have insufficient information to draw conclusions. If you don't know what you don't know, then you can't know when you might be wrong. To be an effective rationalist, it is often not important to answer “what is the calculated probability of that outcome?” The better first question is “what is the uncertainty in my calculated probability of that outcome?” If the uncertainty is too high, then the data supports no conclusions. And the way you reduce uncertainty is to build models for the domain in question and empirically test them.
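To make the difference between those two questions concrete, here is a minimal sketch (Python, with made-up numbers; the Beta-posterior treatment is a standard textbook choice, not anything from this post):

```python
# Toy example: an event observed 2 times out of 10 trials, with a
# uniform Beta(1,1) prior over its underlying probability.
from scipy.stats import beta

successes, trials = 2, 10  # hypothetical data
posterior = beta(1 + successes, 1 + trials - successes)

low, high = posterior.interval(0.95)
print(f"calculated probability: {posterior.mean():.2f}")   # ~0.25
print(f"95% credible interval: ({low:.2f}, {high:.2f})")   # ~(0.06, 0.52)
# The point estimate looks informative; the interval shows how weakly
# the data actually constrains the answer.
```

When the interval is that wide, "the data supports no conclusions" is the honest summary.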

The lens that sees its own flaws...

Coming back to LessWrong and the sequences. In the preface to Rationality, Eliezer Yudkowsky says his biggest regret is that he did not make the material in the sequences more practical. The problem is in fact deeper than that. The art of rationality is the art of truth-seeking, and empiricism is part and parcel of truth-seeking. Lip service is paid to empiricism throughout, but in all the “applied” sequences relating to quantum physics and artificial intelligence it appears to be forgotten. We get instead definitive conclusions drawn from thought experiments only. It is perhaps not surprising that these sequences seem the most controversial.

I have for a long time been concerned that those sequences in particular promote some ungrounded conclusions. I had thought that, while annoying, this was perhaps a one-off mistake that was fixable. Recently I have realized that the underlying cause runs much deeper: what is taught by the sequences is a form of flawed truth-seeking (thought experiments favored over real world experiments) which inevitably results in errors, and the errors I take issue with in the sequences are merely examples of this phenomenon.

And these errors have consequences. Every single day, 100,000 people die of preventable causes, and every day we continue to risk extinction of the human race at unacceptably high odds. There is work that could be done now to alleviate both of these issues. But within the LessWrong community there is actually outright hostility to work that has a reasonable chance of alleviating suffering (e.g. artificial general intelligence applied to molecular manufacturing and life-science research) due to concerns arrived at by flawed reasoning.

I now regard the sequences as a memetic hazard, one which may at the end of the day be doing more harm than good. One should work to develop one's own rationality, but I now fear that the approach taken by the LessWrong community as a continuation of the sequences may make matters worse. The anti-humanitarian behaviors I observe in this community are not the result of its initial conditions but of the process itself.

What next?

How do we fix this? I don't know. On a personal level, I am no longer sure engagement with such a community is a net benefit. I expect this to be my last post to LessWrong. It may happen that I check back in from time to time, but for the most part I intend to try not to. I wish you all the best.

A note about effective altruism…

One shining light of goodness in this community is the focus on effective altruism—doing the most good for the most people, as measured by some objective means. This is a noble goal, and the correct goal for a rationalist who wants to contribute to charity. Unfortunately it too has been poisoned by incorrect modes of thought.

Existential risk reduction, the argument goes, trumps all forms of charitable work because reducing the chance of extinction by even a small amount has far more expected utility than would accomplishing all other charitable works combined. The problem lies in the likelihood of extinction, and the actions selected in reducing existential risk. There is so much uncertainty regarding what we know, and so much uncertainty regarding what we don't know, that it is impossible to determine with any accuracy the expected risk of, say, unfriendly artificial intelligence creating perpetual suboptimal outcomes, or what effect charitable work in the area (e.g. MIRI) is having on reducing that risk, if any.
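To see why this matters (made-up numbers, purely for illustration): the expected-utility calculation is linear in an input nobody can measure, so the conclusion swings across many orders of magnitude with the choice of prior.

```python
# Hypothetical: expected lives saved by an intervention that achieves a
# 1% relative reduction in extinction risk, under several equally
# defensible guesses at the underlying risk.
population = 7e9
relative_reduction = 0.01

for p_extinction in (1e-1, 1e-4, 1e-7, 1e-10):
    expected_lives = relative_reduction * p_extinction * population
    print(f"p = {p_extinction:.0e} -> expected lives saved: {expected_lives:g}")
# The answer spans nine orders of magnitude (7e+06 down to 0.007),
# driven entirely by an unmeasured input.
```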

This is best explored by an example of existential risk done right. Asteroid and cometary impacts are perhaps the category of external (not-human-caused) existential risk which we know the most about, and have done the most to mitigate. When it was recognized that impactors were a risk to be taken seriously, we identified what we did not know about the phenomenon: what were the orbits and masses of Earth-crossing asteroids? We built telescopes to find out. What is the material composition of these objects? We built space probes and collected meteorite samples to find out. How damaging would an impact be for various material properties, speeds, and incidence angles? We built high-speed projectile test ranges to find out. What could be done to change the course of an asteroid found to be on a collision course? We have executed at least one impact probe and will monitor the effect that had on the comet's orbit, and have on the drawing board probes that will use gravitational mechanisms to move their target. In short, we identified what it is that we don't know and sought to resolve those uncertainties.

How then might one approach an existential risk like unfriendly artificial intelligence? By identifying what it is we don't know about the phenomenon, and seeking to experimentally resolve that uncertainty. What relevant facts do we not know about (unfriendly) artificial intelligence? Well, much of our uncertainty about the actions of an unfriendly AI could be resolved if we were to know more about how such agents construct their thought models, and relatedly what languages are used to construct their goal systems. We could also stand to benefit from knowing more practical information (experimental data) about in what ways AI boxing works and in what ways it does not, and how much that is dependent on the structure of the AI itself. Thankfully there is an institution that is doing that kind of work: the Future of Life Institute (not MIRI).

Where should I send my charitable donations?

Aubrey de Grey's SENS Research Foundation.

100% of my charitable donations are going to SENS. Why they do not get more play in the effective altruism community is beyond me.

If you feel you want to spread your money around, here are some non-profits which I have vetted for doing reliable, evidence-based work on singularity technologies and existential risk:

  • Robert Freitas and Ralph Merkle's Institute for Molecular Manufacturing does research on molecular nanotechnology. They are the only group working on the long-term Drexlerian vision of molecular machines, and they publish their research online.
  • Future of Life Institute is the only existential-risk AI organization which is actually doing meaningful evidence-based research into artificial intelligence.
  • B612 Foundation is a non-profit seeking to launch a spacecraft with the capability to detect, to the extent possible, ALL Earth-crossing asteroids.

I wish I could recommend a skepticism, empiricism, and rationality promoting institute. Unfortunately I am not aware of an organization which does not suffer from the flaws I identified above.

Addendum regarding unfinished business

I will no longer be running the Rationality: From AI to Zombies reading group as I am no longer able or willing, in good conscience, to host it or participate in this site, even from my typically contrarian point of view. Nevertheless, I am enough of a libertarian that I feel it is not my role to put up roadblocks to others who wish to delve into the material as it is presented. So if someone wants to take over the role of organizing these reading groups, I would be happy to hand over the reins to that person. If you think that person should be you, please leave a reply in another thread, not here.

EDIT: Obviously I'll stick around long enough to answer questions below :)

Comments


Thanks for sharing your contrarian views, both with this post and with your previous posts. Part of me is disappointed that you didn't write more... it feels like you have several posts' worth of objections to Less Wrong here, and at times you are just vaguely gesturing towards a larger body of objections you have towards some popular LW position. I wouldn't mind seeing those objections fleshed out into long, well-researched posts. Of course you aren't obliged to put in the time & effort to write more posts, but it might be worth your time to fix specific flaws you see in the LW community given that it consists of many smart people interested in maximizing their positive impact on the far future.

I'll preface this by stating some points of general agreement:

  • I haven't bothered to read the quantum physics sequence (I figure if I want to take the time to learn that topic, I'll learn from someone who researches it full-time).

  • I'm annoyed by the fact that the sequences in practice seem to constitute a relatively static document that doesn't get updated in response to critiques people have written up. I think it's worth reading them with a grain of salt for that reason. (I'm also annoyed by the fact that they are extremely wordy and mostly without citation. Given the choice of getting LWers to either read the sequences or read Thinking, Fast and Slow, I would prefer they read the latter; it's a fantastic book, and thoroughly backed up by citations. No intellectually serious person should go without reading it IMO, and it's definitely a better return on time. Caveat: I personally haven't read the sequences through and through, although I've read lots of individual posts, some of which were quite insightful. Also, there is surprisingly little overlap between the two works and it's likely worthwhile to read both.)

And here are some points of disagreement :P

You talk about how Less Wrong encourages the mistake of reasoning by analogy. I searched for "site:lesswrong.com reasoning by analogy" on Google and came up with these 4 posts: 1, 2, 3, 4. Posts 1, 2, and 4 argue against reasoning by analogy, while post 3 claims the situation is a bit more nuanced. In this comment here, I argue that reasoning by analogy is a bit like taking the outside view: analogous phenomena can be considered part of the same (weak) reference class. So...

  • Insofar as there is an explicit "LW consensus" about whether reasoning by analogy is a good idea, it seems like you've diagnosed it incorrectly (although maybe there are implicit cultural norms that go against professed best practices).

  • It seems useful to know the answer to questions like "how valuable are analogies", and the discussions I linked to above seem like discussions that might help you answer that question. These discussions are on LW.

  • Finally, it seems you've been unable to escape a certain amount of reasoning by analogy in your post. You state that experimental investigation of asteroid impacts was useful, so by analogy, experimental investigation of AI risks should be useful.

The steelman of this argument would be something like "experimentally, we find that investigators who take experimental approaches tend to do better than those who take theoretical approaches". But first, this isn't obviously true... mathematicians, for instance, have found theoretical approaches to be more powerful. (I'd guess that the developer of Bitcoin took a theoretical rather than an empirical approach to creating a secure cryptocurrency.) And second, I'd say that even this argument is analogy-like in its structure, since the reference class of "people investigating things" seems sufficiently weak to start pushing into analogy territory. See my above point about how reasoning by analogy at its best is reasoning from a weak reference class. (Do people think this is worth a toplevel post?)

This brings me to what I think is my most fundamental point of disagreement with you. Viewed from a distance, your argument goes something like "Philosophy is a waste of time! Resolve your disagreements experimentally! There's no need for all this theorizing!" And my rejoinder would be: Resolving disagreements experimentally is great... when it's possible. We'd love to do a randomized controlled trial of whether universes with a Machine Intelligence Research Institute are more likely to have a positive singularity, but unfortunately we don't currently know how to do that.

There are a few issues with too much emphasis on experimentation over theory. The first issue is that you may be tempted to prefer experimentation even for problems that theory is better suited for (e.g. empirically testing prime number conjectures; see the sketch below). The second issue is that you may fall prey to the streetlight effect and prioritize areas of investigation that look tractable from an experimental point of view, ignoring questions that are both very important and not very tractable experimentally.
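(To make the prime-conjecture example concrete, here's a classic cautionary case, sketched in Python: Fermat checked the first five numbers of the form 2^(2^n) + 1, found every one prime, and conjectured they all were; Euler then factored the sixth.)

```python
# Fermat numbers F(n) = 2**(2**n) + 1 are prime for n = 0..4, which
# looked like strong experimental evidence -- until Euler factored F(5).
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

for n in range(6):
    f = 2 ** (2 ** n) + 1
    print(f"F({n}) = {f}: {'prime' if is_prime(f) else 'composite'}")
# F(5) = 4294967297 = 641 * 6700417: five confirming instances,
# and the conjecture is still false.
```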

You write:

Well, much of our uncertainty about the actions of an unfriendly AI could be resolved if we were to know more about how such agents construct their thought models, and relatedly what languages are used to construct their goal systems.

This would seem to depend on the specifics of the agent in question. This seems like a potentially interesting line of inquiry. My impression is that MIRI thinks most possible AGI architectures wouldn't meet its standards for safety, so given that their ideal architecture is so safety-constrained, they're focused on developing the safety stuff first before working on constructing thought models etc. This seems like a pretty reasonable approach for an organization with limited resources, if it is in fact MIRI's approach. But I could believe that value could be added by looking at lots of budding AGI architectures and trying to figure out how one might make them safer on the margin.

We could also stand to benefit from knowing more practical information (experimental data) about in what ways AI boxing works and in what ways it does not, and how much that is dependent on the structure of the AI itself.

Sure... but note that Eliezer Yudkowsky from MIRI was the one who invented the AI box experiment and ran the first few experiments, and FHI wrote this paper collecting a bunch of ideas for what AI boxes might consist of. (The other thing I didn't mention as a weakness of empiricism is that empiricism doesn't tell you what hypotheses might be useful to test. Knowing what hypotheses to test is especially nice when testing hypotheses is expensive.)

I could believe that there are fruitful lines of experimental inquiry that are neglected in the AI safety space. Overall it looks kinda like crypto to me in the sense that theoretical investigation seems more likely to pan out. But I'm supportive of people thinking hard about specific useful experiments that someone could run. (You could survey all the claims in Bostrom's Superintelligence and try to estimate what fraction could be cheaply tested experimentally. Remember that just because a claim can't be tested experimentally doesn't mean it's not an important claim worth thinking about...)

I understand “politics is the mind-killer” well enough not to consider the LW community a tribe that I have to belong to, and I could easily turn away from LW and say “the Sequences and FAI are nonsense”, just as I turned away from various gurus and ideologies before. But I disagree with what you're saying: not with the criticism of the Sequences or MIRI, but with your evaluation of the LW community and your unwillingness to engage anymore. Honestly, I'm upset that you suddenly stopped the reading group.

Despite Yudkowsky's obvious leanings, the Sequences are not about FAI, nor are they about Many-Worlds, Tegmark's Mathematical Universe, Roko's Basilisk or whatever. They are first and foremost about how to not end up an idiot. They are about how to not become immune to criticism, they are about Human's Guide to Words, they are about System 1 and System 2.

I don't care about Many Worlds, FAI, Fun theory and Jeffreyssai stuff, but LW was the thing that stopped me from being a complete and utter idiot. Now I see that people I care about, due to not internalizing LW's simple truths, are being complete and utter idiots, with their death spirals, and tribal affiliations, and meaningless usage of words, and theories that don't predict shit, and it breaks my heart.

If you want to criticize LW for a lack of actual instrumental rationality, you're not the first; Yvain did that in 2009, and he was right in his understanding of the problem, though he couldn't provide a solution either. I personally believe that combating akrasia is the most important task in the world, not FAI, because if a cure for akrasia could be found, we could train armies of superhuman scientists, who would then solve cancer, nanotechnology and AI-risk. That's why reading modern cognitive science and CBT and neuroscience is probably more important than anything else, at least that's what I think.

And here I am, somebody who wishes to be part of the LW community, but who also disagrees, either conceptually or politically, with many of the LW memes. Yet you don't want to engage with me anymore. LW is not a monolith where everybody follows Yudkowsky; it's the most contrarian (and thus mentally healthy) place I've ever seen on the Web.

LW is not the be-all and end-all, but the Sequences are the bare minimum that people require to be sane. Hey, some people, through sheer study of maths and physics, can develop correct epistemology on their own, so they don't need the Sequences; but I wasn't one of them, and many people aren't.

It's not about tribal things. If you had your own forum with lots of people who share a similar criticism of LW, hey, I'd go there and leave LW. But you don't have such a forum, so by leaving LW you just leave people like me alone. What's the point of that? Do you really believe leaving LW like that has more utility than trying to create an island within it?

Honestly, I even started thinking that the only reason you wrote this post is that you realized you're too lazy to continue the reading group, so you needed a good excuse. But that's ridiculous, and I assign it a very low probability.

The sole point of my comment is this. I'm not upset because of your fundamental disagreement with Yudkowsky and LW's ideology and memes. I'm upset because you stopped the reading group, which is important because, like I said, the Sequences are about basic rational thinking, not the deep philosophy on which Yudkowsky indeed might be completely wrong. I'm upset because your departure would mean that you think LW is completely lost, and that there is not even a sizable minority who'd say “you know what, you're right, let's do something about it”. That's sad.

(I'll update this post with more thoughts)

Despite Yudkowsky's obvious leanings, the Sequences are ... first and foremost about how to not end up an idiot

My basic thesis is that even if that was not the intent, the result has been the production of idiots. Specifically, a type of idiotic madness that causes otherwise good people, self-proclaimed humanitarians, to disparage the only sort of progress which has the potential to alleviate all human suffering, forever, on accelerated timescales. And they do so for reasons that are not grounded in empirical evidence, because they were taught, through demonstration, modes of non-empirical thinking from the sequences, and conditioned to think this was okay through social engagement on LW.

When you find yourself digging a hole, the sensible and correct thing to do is stop digging. I think we can do better, but I'm burned out on trying to reform from the inside. Or perhaps I'm no longer convinced that reform can work given the nature of the medium (social pressures of blog posts and forums work counter to the type of rationality that should be advocated for).

I don't care about Many Worlds, FAI, Fun theory and Jeffreyssai stuff, but LW was the thing that stopped me from being a complete and utter idiot.

I don't want to take that away. But for me LW was not just a baptismal font for discovering rationality; it was also an effort to get people to work on humanitarian relief and existential risk reduction. I hope you don't think me crazy for saying that LW has had a subject-matter bias in these directions. But on at least some of these accounts the effect had by LW and/or MIRI and/or Yudkowsky's specific focus on these issues may be not just suboptimal, but actually negative. To be precise: it may actually be causing more suffering than would otherwise exist.

We are finally coming out of a prolonged AI winter. And although funding is finally available to move the state of the art in automation forward, to accelerate progress in the life sciences and molecular manufacturing that will bring great humanitarian change, we have created a band of Luddites who fear the solution more than the problem, and who, in a strange twist of doublethink, consider themselves humanitarians for fighting progress.

If you had your own forum with lots of people who share a similar criticism of LW, hey, I'd go there and leave LW. But you don't have such a forum, so by leaving LW you just leave people like me alone. What's the point of that? Do you really believe leaving LW like that has more utility than trying to create an island within it?

I am myself working on various projects in my life which I expect to have positive effects on the world. Outside of work, LW has at times occupied a significant fraction of my leisure time. This must be seen as an activity of higher utility than working more hours on my startup, making progress on my molecular nanotech and AI side projects, or enriching myself personally in other ways (family time, reading, etc.). I saw the Rationality reading group as a chance to do something which would conceivably grow that community by a measurable amount, thereby justifying the time expenditure. However, if all I am doing is bringing more people into a community that is actively working against developments in artificial intelligence that have a chance of relieving human suffering within a single generation… the Hippocratic corpus comes to mind: “first, do no harm.”

I am not sure yet what I will fill the time with. Maybe I'll get off my butt and start making more concrete progress on some of the nanotech and AI stuff that I have been letting slide in recent years.

I recognize also that I am making broad generalizations which do not always apply to everyone. You seem to be an exception, and I wish I had engaged with you more. I also will miss TheAncientGeek's contrarian posts, as well as many others who deserve credit for not following a herd mentality.

Thank you for your response, that's really important for me.

I've never seen disparaging of actually helping people on LW. Can you point to examples? Can you argue that it is a tendency? You say that there is lots of outright hostility to work against x-risks and human misery, unless it's MIRI's. I wouldn't even imagine anyone would say that of LW, but maybe I'm blind, so I'll be grateful if you prove me wrong. Yudkowsky is definitely pro-immortality and has supported donating to SENS.

I don't even think MIRI and MIRI-leaning LWers are against ongoing AI research. I've never heard anything like “please stop doing any AI until we figure out friendliness”, only “hey, can you please put more effort into friendliness too, it's very important?” And even if you think that MIRI's focus on friendliness is misplaced by an order of magnitude, that's just a mistake of prioritizing, not a fundamental philosophical blunder. Again, if you can expand on this topic, I would only say thank you.

Maybe “reform” isn't the right word. The Sequences aren't going anywhere, so of course LW will be FAI-centric for a long time, but within LW there is already a substantial number of people (that's my impression, I never actually counted) who are not simply contrarian, but actually assign different priorities to what should be done about the world, more in line with your thoughts than Yudkowsky's. Maybe you can still stay and steer this substantial minority in the right direction, instead of uselessly splitting off.

I bet most people on LW are not high-karma prolific writers; they are less knowledgeable and less confident, but also more open to contrary views such as yours. Just one big article about how you think LW's focus is misplaced could be of extreme help to such people. Which, BTW, includes me, because I never posted anything.

I'd actually love to see you write articles on all your theses here, on LW. LW-critical articles have already been promoted a few times, including Yvain's article, so it's not like LW is criticism-intolerant.

If you actually do that, and provide lots of examples and evidence, it would be a breath of fresh air for all those people who will continue to be attracted to LW. You don't have to put titanic effort into “reform”, just erect a pole.

Despite Yudkowsky's obvious leanings, the Sequences are not about FAI, nor [etc]...they are first and foremost about how to not end up an idiot. They are about how to not become immune to criticism, they are about Human's Guide to Words, they are about System 1 and System 2.

I've always had the impression that Eliezer intended them to lead a person from zero to FAI. So I'm not sure you're correct here.

...but that being said, the big Less Wrong takeaways for me were all from Politics is the Mind-Killer and the Human's Guide to Words -- in that those are the ones that have actually changed my behavior and thought processes in everyday life. They've changed the way I think to such an extent that I actually find it difficult to have substantive discussions with people who don't (for example) distinguish between truth and tribal identifiers, distinguish between politics and policy, avoid arguments over definitions, and invoke ADBOC when necessary. Being able to have discussions without running over such roadblocks is a large part of why I'm still here, even though my favorite posters all seem to have moved on. Threads like this one basically don't happen anywhere else that I'm aware of.

Someone recently had a blog post summarizing the most useful bits of LW's lore, but I can't for the life of me find the link right now.

I've always had the impression that Eliezer intended them to lead a person from zero to FAI. So I'm not sure you're correct here.

Eliezer states this explicitly on numerous occasions, that his reason for writing the blog posts was to motivate people to work with him on FAI. I'm having trouble coming up with exact citations however, since it's not very google-able.

My prior perception of the sequences was that EY started from a firm base of generally good advice about thinking. Sequences like Human's Guide to Words and How to Actually Change Your Mind stand on their own. He then, however, went off the deep end trying to extend and apply these concepts to questions in the philosophy of mind, ethics, and decision theory in order to motivate an interest in friendly AI theory.

I thought that perhaps the mistakes made in those sequences were correctable one-off errors. Now I am of the opinion that the way in which that philosophical inquiry was carried out doomed the project to failure from the start, even if the details of the failure are subject to Yudkowsky's own biases. Reasoning by thought experiment alone, over questions that are not subject to experimental validation, basically does nothing more than expose one's priors. And either you agree with the priors, or you don't. For example, does quantum physics support the assertion that identity is the instance of computation, or the information being computed? Neither. But you could construct a thought experiment which validates either view based on the priors you bring to the discussion, and I wasted much time countering his thought experiments with those of my own creation before I understood the Sisyphean task I was undertaking :\

On the other hand, de Grey and others who are primarily working on the scientific and/or engineering challenges of singularity and transhumanist technologies were far less likely to commit epistemic mistakes of significant consequence.

This part isn't clear to me. The researcher who goes into generic anti-cancer work, instead of SENS-style anti-aging work, probably has made an epistemic mistake with moderate consequences, because of basic replaceability arguments.

But to say that MIRI's approach to AGI safety is due to a philosophical mistake, and one with significant consequences, seems like it requires much stronger knowledge. Shooting very high instead of high is riskier, but not necessarily wronger.

Thankfully there is an institution that is doing that kind of work: the Future of Life Institute (not MIRI).

I think you underestimate how much MIRI agrees with FLI.

Why they do not get more play in the effective altruism community is beyond me.

SENS is the second largest part of my charity budget, and I recommend it to my friends every year (on the obvious day to do so). My speculations on why EAs don't favor them more highly mostly have to do with the difficulty of measuring progress in medical research vs. fighting illnesses, and possibly also the specter of selfishness.

I think you underestimate how much MIRI agrees with FLI.

Agreed - or, at least, he underestimates how much FLI agrees with MIRI. This is pretty obvious e.g. in the references section of the technical agenda that was attached to FLI's open letter. Out of a total of 95 references:

  • Six are MIRI's technical reports that've only been published on their website: Vingean Reflection, Realistic World-Models, Value Learning, Aligning Superintelligence, Reasoning Under Logical Uncertainty, Toward Idealized Decision Theory
  • Five are written by MIRI's staff or Research Associates: Avoiding Unintended AI Behaviors, Ethical Artificial Intelligence, Self-Modeling Agents and Reward Generator Corruption, Problem Equilibrium in the Prisoner's Dilemma, Corrigibility
  • Eight are ones that tend to agree with MIRI's stances and which have been cited in MIRI's work: Superintelligence, Superintelligent Will, Singularity A Philosophical Analysis, Speculations concerning the first ultraintelligent machine, The nature of self-improving AI, Space-Time Embedded Intelligence, FAI: the Physics Challenge, The Coming Technological Singularity

That's 19/95 (20%) references produced either directly by MIRI or people closely associated with them, or that have MIRI-compatible premises.

Yudkowsky obviously supports immortality. Quote from his letter on his brother's death:

If you object to the Machine Intelligence Research Institute then consider Dr. Aubrey de Grey's Methuselah Foundation, which hopes to defeat aging through biomedical engineering.

If SENS is not sufficiently promoted as a target for charity, I have no idea why that is, and I dispute that it's because of the LW community's philosophical objections, unless somebody can convince me otherwise. BTW, the EA community != the LW community, so maybe lots of Effective Altruists just don't consider immortality the same way they do malaria (cached thoughts etc.).

Recently I have realized that the underlying cause runs much deeper: what is taught by the sequences is a form of flawed truth-seeking (thought experiments favored over real world experiments) which inevitably results in errors, and the errors I take issue with in the sequences are merely examples of this phenomenon.

I guess I'm not sure how these concerns could possibly be addressed by any platform meant for promoting ideas. You cannot run a lab in your pocket. You can have citations to evidence found by people who do run labs...but that's really all you can do. Everything else must necessarily be a thought experiment.

So my question is, can you envision a better version, and what would be some of the ways it would be different? (Because if you can, it ought to be created.)

On a personal level, I am no longer sure engagement with such a community is a net benefit.

I am simply treating LW as a 120+ IQ version of Reddit. Just generic discussion with mainly bright folks. The point is, I don't know of any other. I used to frequent Digg and MetaFilter and they were not much better either. If we could make a list of cerebral discussion boards, forums, and suchlike, that would be a good idea I guess. Where do you expect to hang out in the future?

I'll probably spend less time hanging out in online communities, honestly.

I generally agree with your position on the Sequences, but it seems to me that it is possible to hang around this website and have meaningful discussions without worshiping the Sequences or Eliezer Yudkowsky. At least it works for me.
As for being a highly involved/high status member of the community, especially the offline one, I don't know.

Anyway, regarding the point about super-intelligence that you raised, I charitably interpret the position of the AI-risk advocates not as the claim that super-intelligence would be in principle outside the scope of human scientific inquiry, but as the claim that a super-intelligent agent would be more efficient at understanding humans than humans would be at understanding it, giving the super-intelligent agent an edge over humans.

I think that the AI-risk advocates tend to exaggerate various elements of their analysis: they probably underestimate time to human-level AI and time to super-human AI, they may overestimate the speed and upper bounds to recursive self-improvement (their core arguments based on exponential growth seem, at best, unsupported).

Moreover, it seems that they tend to conflate super-intelligence with a sort of near-omniscience:
They seem to assume that a super-intelligent agent will be a near-optimal Bayesian reasoner with an extremely strong prior that will allow it to gain a very accurate model of the world, including all the nuances of human psychology, from a very small amount of observational evidence and few or no interventional experiments. Recent discussion here.
Maybe this is the community bias that you were talking about, the over-reliance on abstract thought rather than evidence, projected onto a hypothetical future AI.
It seems dubious to me that this kind of extreme inference is even physically possible, and if it is, we are certainly not anywhere close to implementing it. All the recent advances in machine learning, for instance, rely on processing very large datasets.

Anyway, as much as they exaggerate the magnitude and urgency of the issue, I think that the AI-risk advocates have a point when they claim that keeping a system much more intelligent than ourselves under control would be a non-trivial problem.

I think that the AI-risk advocates tend to exaggerate various elements of their analysis: they probably underestimate time to human-level AI and time to super-human AI

It's worth keeping in mind that AI-risk advocates tend to be less confident that AGI is nigh than the top-cited scientists within AI are. People I know at MIRI and FHI are worried about AGI because it looks like a technology that's many decades away, but one where associated safety technologies are even more decades away.

That's consistent with the possibility that your criticism could turn out to be right. It could be that we're less wrong than others on this metric and yet still very badly wrong in absolute terms. To make a strong prediction in this area is to claim to already have a pretty good computational understanding of how general intelligence works.

Moreover, it seems that they tend to conflate super-intelligence with a sort of near-omniscience: They seem to assume that a super-intelligent agent will be a near-optimal Bayesian reasoner

Can you give an example of a statement by a MIRI researcher that is better predicted by 'X is speaking of the AI as a near-optimal Bayesian' than by 'X is speaking of the AI as an agent that's as much smarter than humans as humans are smarter than chimpanzees, but is still nowhere near optimal'? (Or 'an agent that's as much smarter than humans as humans are smarter than dogs'...) I'm not seeing why saying 'Bob the AI could be 100x more powerful than a human', for example, commits one to a view about how close Bob is to optimal.

Maybe this is the community bias that you were talking about, the over-reliance on abstract thought rather than evidence, projected onto a hypothetical future AI.

You nailed it. (Your other points too.)

The claim [is] that a super-intelligent agent would be more efficient at understanding humans than humans would be at understanding it, giving the super-intelligent agent an edge over humans.

The problem here is that intelligence is not some linear scale, even general intelligence. We human beings are insanely optimized for social intelligence in a way that is not easy for a machine to learn to replicate, especially without detection. It is possible for a general AI to be powerful enough to provide meaningful acceleration of molecular nanotechnology and medical science research whilst being utterly befuddled by social conventions and generally how humans think, simply because it was not programmed for social intelligence.

Anyway, as much as they exaggerate the magnitude and urgency of the issue, I think that the AI-risk advocates have a point when they claim that keeping a system much more intelligent than ourselves under control would be a non-trivial problem.

There is however a substantial difference between a non-trivial problem and an impossible problem. Non-trivial we can work with. I solve non-trivial problems for a living. You solve a non-trivial problem by hacking at it repeatedly until it breaks into components that are themselves well enough understood to be trivial problems. It takes a lot of work, and the solution is simply to do a lot of work.

But in my experience the AI-risk advocates claim that safe / controlled UFAI is an impossibility. You can't solve an impossibility! What's more, in that frame of mind any work done towards making AGI is risk-increasing. Thus people are actively persuaded NOT to work on artificial intelligence, and instead to work on fields of basic mathematics which are at this time too basic or speculative to say for certain whether they will have a part in making a safe or controllable AGI.

So smart people who could be contributing to an AGI project are now off fiddling with basic mathematics research on chalkboards instead. That is, in the view of someone who believes safe / controllable UFAI is a non-trivial but possible mechanism to accelerate the arrival of life-saving anti-aging technologies, a humanitarian disaster.

Thanks for taking the time to explain your reasoning, Mark. I'm sorry to hear you won't be continuing the discussion group! Is anyone else here interested in leading that project, out of curiosity? I was getting a lot out of seeing people's reactions.

I think John Maxwell's response to your core argument is a good one. Since we're talking about the Sequences, I'll note that this dilemma is the topic of the Science and Rationality sequence:

In any case, right now you've got people dismissing cryonics out of hand as "not scientific", like it was some kind of pharmaceutical you could easily administer to 1000 patients and see what happened. "Call me when cryonicists actually revive someone," they say; which, as Mike Li observes, is like saying "I refuse to get into this ambulance; call me when it's actually at the hospital". Maybe Martin Gardner warned them against believing in strange things without experimental evidence. So they wait for the definite unmistakable verdict of Science, while their family and friends and 150,000 people per day are dying right now, and might or might not be savable—

—a calculated bet you could only make rationally [i.e., using your own inference skills, without just echoing data from an experimental study, and without just echoing established, expert-verified scientific conclusions].

The drive of Science is to obtain a mountain of evidence so huge that not even fallible human scientists can misread it. But even that sometimes goes wrong, when people become confused about which theory predicts what, or bake extremely-hard-to-test components into an early version of their theory. And sometimes you just can't get clear experimental evidence at all.

Either way, you have to try to do the thing that Science doesn't trust anyone to do—think rationally, and figure out the answer before you get clubbed over the head with it.

(Oh, and sometimes a disconfirming experimental result looks like: "Your entire species has just been wiped out! You are now scientifically required to relinquish your theory. If you publicly recant, good for you! Remember, it takes a strong mind to give up strongly held beliefs. Feel free to try another hypothesis next time!")

This is why there's a lot of emphasis on hard-to-test ("philosophical") questions in the Sequences, even though people are notorious for getting those wrong more often than scientific questions -- because sometimes (e.g., in the case of cryonics and existential risk) the answer matters a lot for our decision-making, long before we have a definitive scientific answer. That doesn't mean we should despair of empirically investigating these questions, but it does mean that our decision-making needs to be high-quality even during periods where we're still in a state of high uncertainty.

The Sequences talk about the Many Worlds Interpretation precisely because it's an unusually-difficult-to-test topic. The idea isn't that this is a completely typical example, or that it's a good idea to disregard evidence when it is available; the idea, rather, is that we sometimes do need to predicate our decisions on our best guess in the absence of perfect tests.

Its placement in Rationality: From AI to Zombies immediately after the 'zombies' sequence (which, incidentally, is an example of how and why we should reject philosophical thought experiments, no matter how intuitively compelling they are, when they don't accord with established scientific theories and data) is deliberate. Rather than reading either sequence as an attempt to defend a specific fleshed-out theory of consciousness or of physical law, they should primarily be read as attempts to show that extreme uncertainty about a domain doesn't always bleed over into 'we don't know anything about this topic' or 'we can't rule out any of the candidate solutions'.

We can effectively rule out epiphenomenalism as a candidate solution to the hard problem of consciousness even if we don't know the answer to the hard problem (which we don't), and we can effectively rule out 'consciousness causes collapse' and 'there is no objective reality' as candidate solutions to the measurement problem in QM even if we don't know the answer to the measurement problem (which, again, we don't). Just advocating 'physicalism' or 'many worlds' is a promissory note, not a solution.

In discussions of EA and x-risk, we likewise need to be able to prioritize more promising hypotheses over less promising ones long before we've answered all the questions we'd like answered. Even deciding what studies to fund presupposes that we've 'philosophized', in the sense of mentally aggregating, heuristically analyzing, and drawing tentative conclusions from giant complicated accumulated-over-a-lifetime data sets.

You wrote:

The human capacity for abstract reasoning over mathematical models is in principle a fully general intelligent behaviour, as the scientific revolution has shown: there is no aspect of the natural world which has remained beyond the reach of human understanding, once a sufficient amount of evidence is available.

That's true, and it's one of the basic assumptions behind MIRI research: that understanding agents smarter than us isn't obviously hopeless, because our human capacity for abstract reasoning makes it possible for us to model systems even when they're extremely complex and dynamic. MIRI's research is intended to make this likelier to happen.

It's not the default that we're always able to predict what our inventions will do before we run them to see what happens; and there are some basic limits on our ability to do so when the system we're predicting is smarter than the predictor. But with enough intellectual progress we may become able to model abstract safety-relevant features of AGI behavior, even though we can't predict in detail the exact decisions the AGI will make. (If we could predict the exact decisions of the AGI, we'd have to be at least as smart as the AGI.)

If it isn't possible to learn a variety of generalizations about smarter autonomous systems, then, interestingly, that also undermines the case for intelligence explosion. Both 'humans trying to make superintelligent AI safe' and 'AI undergoing a series of recursive self-improvements' are cases where less intelligent agents are trying to reliably generate agents that meet various abstract criteria (including superior intelligence). The orthogonality thesis, likewise, simultaneously supports the claim 'many possible AI systems won't have humane goals' and 'it is possible for an AI system to have humane goals'. This is why Bostrom/Yudkowsky-type arguments don't uniformly inspire pessimism.

Are you familiar with MIRI's technical agenda? You may also want to check out the AI Impacts project, if you think we should be prioritizing forecasting work at this point rather than object-level mathematical research.

This argument is, however, nonsense. The human capacity for abstract reasoning over mathematical models is in principle a fully general intelligent behaviour, as the scientific revolution has shown: there is no aspect of the natural world which has remained beyond the reach of human understanding, once a sufficient amount of evidence is available. The wave-particle duality of quantum physics, or the 11-dimensional space of string theory may defy human intuition, i.e. our built-in intelligence. But we have proven ourselves perfectly capable of understanding the logical implications of models which employ them. We may not be able to build intuition for how a super-intelligence thinks. Maybe—that's not proven either. But even if that is so, we will be able to reason about its intelligent behaviour in advance, just like string theorists are able to reason about 11-dimensional space-time without using their evolutionarily derived intuitions at all.

This may be retreating from the bailey to the motte, so to speak, but I don't think anyone seriously thinks that a superintelligence would be literally impossible to understand. The worry is that there will be such a huge gulf between how superintelligences reason and how we reason that it would take prohibitively long to understand them.

I think a laptop is a good example. There probably isn't any single human on earth who knows how to build a modern laptop from scratch. There are computer scientists who know how the operating system is put together--how the operating system is programmed, how memory is written to and retrieved from the various buses; there are other computer scientists and electrical engineers who designed the chips themselves, who arrayed circuits efficiently to dissipate heat and optimize signal latency. Even further down, there are material scientists and physicists who designed the transistors and chip fabrication processes, and so on.

So, as an individual human, I don't know what it's like to know everything about a laptop all at once in my head, at a glance. I can zoom in on an individual piece and learn about it, but I don't know all the nuances for each piece--just a sort of executive summary. The fundamental objects with which I can reason have a sort of characteristic size in mindspace--I can imagine 5, maybe 6 balls moving around with distinct trajectories (even then, I tend to group them into smaller subgroups). But I can't individually imagine a hundred (I could sit down and trace out the paths of a hundred balls individually, of course, but not all at once).

This is the sense in which a superintelligence could be "dangerously" unpredictable. If the fundamental structures it uses for reasoning greatly exceed a human's characteristic size of mindspace, it would be difficult to tease out its chain of logic. And this only gets worse the more intelligent it gets.

Now, I'll grant you that the LessWrong community likes to sweep under the rug the great competition of timescales and "size"-scales that are going on here. It might be prohibitively difficult, for fundamental reasons, to move from a working-mind-RAM of size 5 to size 10. It may be that artificial intelligence research progresses so slowly that we never even see an intelligence explosion--just a gently sloped intelligence rise over the next few millennia. But I do think it is, if maybe not a mistake, certainly naive to just proclaim, "Of course we'll be able to understand them, we are generalized reasoners!".

Edit: I should add that this is already a problem for, ironically, computer-assisted theorem proving. If a computer produces a 10,000,000 page "proof" of a mathematical theorem (i.e., something far longer than any human could check by hand), you're putting a huge amount of trust in the correctness of the theorem-proving-software itself.

Edit: I should add that this is already a problem for, ironically, computer-assisted theorem proving. If a computer produces a 10,000,000 page "proof" of a mathematical theorem (i.e., something far longer than any human could check by hand), you're putting a huge amount of trust in the correctness of the theorem-proving-software itself.

No, you just need to trust a proof-checking program, which can be quite small and simple, in contrast with the theorem proving program, which can be arbitrarily complex and obscure.
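To illustrate the asymmetry (a toy sketch in Python, not any real kernel): the checker only has to verify that each proof line is an axiom or follows from earlier lines by a fixed inference rule, no matter how enormous the proof or how complex the program that found it.

```python
# Minimal checker for a toy Hilbert-style system: a proof is valid if
# each line is an axiom or follows from two earlier lines by modus
# ponens. Implications are tuples ("->", p, q); atoms are strings.
def checks(proof, axioms):
    proved = []
    for formula in proof:
        ok = formula in axioms or any(
            ("->", p, formula) in proved  # modus ponens: p, p->q |- q
            for p in proved
        )
        if not ok:
            return False
        proved.append(formula)
    return True

axioms = {"a", ("->", "a", "b"), ("->", "b", "c")}
print(checks(["a", ("->", "a", "b"), "b",
              ("->", "b", "c"), "c"], axioms))  # True
print(checks(["c"], axioms))                    # False
```

However large the machine-generated proof, the trusted base stays this small, which is the same reason kernel-based proof assistants are considered trustworthy.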

Thank you for this.

I see you as highlighting a virtue that the current Art gestures toward but doesn't yet embody. And I agree with you, a mature version of the Art definitely would.

In his Lectures on Physics, Feynman provides a clever argument to show that when the only energy being considered in a system is gravitational potential energy, then the energy is conserved. At the end of that, he adds the following:

It is a very beautiful line of reasoning. The only problem is that perhaps it is not true. (After all, nature does not have to go along with our reasoning.) For example, perhaps perpetual motion is, in fact, possible. Some of the assumptions may be wrong, or we may have made a mistake in reasoning, so it is always necessary to check. It turns out experimentally, in fact, to be true.

This is such a lovely mental movement. Feynman deeply cared about knowing how the world really actually works, and it looks like this led him to a mental reflex where even in cases of enormous cultural confidence he still responds to clever arguments by asking "What does nature have to say?"

In my opinion, people in this community update too much on clever arguments. I include myself in that. I disagree with your claim that people shouldn't update at all on clever arguments, but I very much agree that there would be much more strength in the Art if it were to emphasize an active hunger for asking nature its opinion.

I think there's a flavor of mistake that comes from overemphasizing the direction I see you pointing in at the expense of other virtues. I've known quite a number of scientists who think the way I see you suggesting, who feel like they can't have any opinions or thoughts about things they haven't seen empirical tests of. I think they're in part trying to protect themselves against what Eliezer calls "privileging the hypothesis", but they also make themselves unnecessarily stupid in some ways. The most common and blatant one I recall is their getting routinely blindsided by predictable social expectations and drama.

But I think Feynman gets it right.

And I think we ought to, too.

So again, thank you for bringing this up. It clarified something that had been nagging me, and now I think I see how to fix it.

If you'll permit a restatement... it sounds like you surveyed the verbal output of the big names in the transhumanist/singularity space and classified them in terms of seeming basically "correct" or "mistaken".

Two distinguishing features seemed to you to be associated with being mistaken: (1) a reliance on philosophy-like thought experiments rather than empiricism and (2) relatedness to the LW/MIRI cultural subspace.

Then you inferred the existence of an essential tendency to "thought-experiments over empiricism" as a difficult to change hidden variable which accounted for many intellectual surface traits.

Then you inferred that this essence was (1) culturally transmissible, (2) sourced in the texts of LW's founding (which you have recently been reading very attentively), and (3) an active cause of ongoing mistakenness.

Based on this, you decided to avoid the continued influence of this hypothetical pernicious cultural transmission and therefore you're going to start avoiding LW and stop reading the founding texts.

Also, if the causal model here is accurate... you presumably consider it a public service to point out what is going on and help others avoid the same pernicious influence.

My first question: Am I summarizing accurately?

My second question assumed a yes and seeks information relevant to repair: Can you spell out the mechanisms by which you think mistake-causing reliance on thought experiment is promoted and/or transmitted? Is it an explicit doctrine? Is it via social copying of examples? Is it something else?

I'm relatively new here, so I have trouble seeing the same kinds of problems you do.

However, I can say that LessWrong does help me remember to apply the principles of rationality I've been trying to learn.

I'd also like to add that - much like writing a novel - the first draft rarely addresses all of the possible faults. LessWrong is one of the first (if not the first) community blogs devoted to "refining the art of human rationality." Of course we're going to get some things wrong.

What I really admire about this site, though, is that contrarian viewpoints end up being some of the most highly upvoted - people admire and discourse with dissenters here. So if you truly believe that LessWrong isn't the best use of your time, then I wish you the best with whatever efforts you pursue. But I think if you wrote a bit more on this subject and found a way to add it to the sequences, everyone would only thank you.

We get instead definitive conclusions drawn from thought experiments only.

As a relatively new user here at LessWrong (and new to rationality) it is also curious to me that many here point me to articles written by Eliezer Yudkowsky to support their arguments. I have the feeling there is a general admiration for him and that some could be biased by that rather than approaching the different topics objectively.

Also, when I read the article about dissolving problems and how algorithms feel, I didn't find any evidence that it is known exactly how networks of neurons work to create these feelings.

That article was a good way of explaining how we might "feel" the existence of things and how to demystify them (like free will, time, ghosts, god, etc.) but I am not sure if the "extra dangling unit in the center" is something that we know exists or if it is another construct that was built to refute things by thought experiment rather than by empirical evidence.

it is also curious to me that many here point me to articles written by Eliezer Yudkowsky to support their arguments

It's been my experience that this is usually done to point to a longer and better-argued version of what the person wants to say rather than to say "here is proof of what I want to say".

I mean, if I agree with the argument made by EY about some subject, and EY has done a lot of work in making the argument, then I'm not going to just reword the argument, I'm just going to post a link.

The appropriate response is to engage with the argument in the linked EY post as if it were the argument the person is making themselves.