Friendly AI and the limits of computational epistemology

Very soon, Eliezer is supposed to start posting a new sequence, on "Open Problems in Friendly AI". After several years in which its activities were dominated by the topic of human rationality, this ought to mark the beginning of a new phase for the Singularity Institute, one in which it is visibly working on artificial intelligence once again. If everything comes together, then it will now be a straight line from here to the end.

I foresee that, once the new sequence gets going, it won't be that easy to question the framework in terms of which the problems are posed. So I consider this my last opportunity for some time to set out an alternative big picture. It's a framework in which all those rigorous mathematical and computational issues still need to be investigated, so a lot of "orthodox" ideas about Friendly AI should carry across. But the context is different, and it makes a difference.

Begin with the really big picture. What would it take to produce a friendly singularity? You need to find the true ontology, find the true morality, and win the intelligence race. For example, if your Friendly AI was to be an expected utility maximizer, it would need to model the world correctly ("true ontology"), value the world correctly ("true morality"), and it would need to outsmart its opponents ("win the intelligence race").

Now let's consider how SI will approach these goals.

The evidence says that the working ontological hypothesis of SI-associated researchers will be timeless many-worlds quantum mechanics, possibly embedded in a "Tegmark Level IV multiverse", with the auxiliary hypothesis that algorithms can "feel like something from inside" and that this is what conscious experience is.

The true morality is to be found by understanding the true decision procedure employed by human beings, and idealizing it according to criteria implicit in that procedure. That is, one would seek to understand conceptually the physical and cognitive causation at work in concrete human choices, both conscious and unconscious, with the expectation that there will be a crisp, complex, and specific answer to the question "why and how do humans make the choices that they do?" Undoubtedly there would be some biological variation, and there would also be significant elements of the "human decision procedure", as instantiated in any specific individual, which are set by experience and by culture, rather than by genetics. Nonetheless one expects that there is something like a specific algorithm or algorithm-template here, which is part of the standard Homo sapiens cognitive package and biological design; just another anatomical feature, particular to our species.

Having reconstructed this algorithm via scientific analysis of the human genome, brain, and behavior, one would then idealize it using its own criteria. This algorithm defines the de-facto value system that human beings employ, but that is not necessarily the value system they would wish to employ; nonetheless, human self-dissatisfaction also arises from the use of this algorithm to judge ourselves. So it contains the seeds of its own improvement. The value system of a Friendly AI is to be obtained from the recursive self-improvement of the natural human decision procedure.

Finally, this is all for naught if seriously unfriendly AI appears first. It isn't good enough just to have the right goals; you must be able to carry them out. In the global race towards artificial general intelligence, SI might hope to "win" either by being the first to achieve AGI, or by having its prescriptions adopted by those who do first achieve AGI. They have some in-house competence regarding models of universal AI like AIXI, and they have many contacts in the world of AGI research, so they're at least engaged with this aspect of the problem.

Upon examining this tentative reconstruction of SI's game-plan, I find I have two major reservations. The big one, and the one most difficult to convey, concerns the ontological assumptions. In second place is what I see as an undue emphasis on the idea of outsourcing the methodological and design problems of FAI research to uploaded researchers and/or a proto-FAI which is simulating or modeling human researchers. This is supposed to be a way to finesse philosophical difficulties like "what is consciousness anyway"; you just simulate some humans until they agree that they have solved the problem. The reasoning goes that if the simulation is good enough, it will be just as good as if ordinary non-simulated humans solved it.

I also used to have a third major criticism, that the big SI focus on rationality outreach was a mistake; but it brought in a lot of new people, and in any case that phase is ending, with the creation of CFAR, a separate organization. So we are down to two basic criticisms.

First, "ontology". I do not think that SI intends to just program its AI with an apriori belief in the Everett multiverse, for two reasons. First, like anyone else, their ventures into AI will surely begin with programs that work within very limited and more down-to-earth ontological domains. Second, at least some of the AI's world-model ought to be obtained rationally. Scientific theories are supposed to be rationally justified, e.g. by their capacity to make successful predictions, and one would prefer that the AI's ontology results from the employment of its epistemology, rather than just being an axiom; not least because we want it to be able to question that ontology, should the evidence begin to count against it.

For this reason, although I have campaigned against many-worlds dogmatism on this site for several years, I'm not especially concerned about the possibility of SI producing an AI that is "dogmatic" in this way. For an AI to independently assess the merits of rival physical theories, the theories would need to be expressed with much more precision than they have been in LW's debates, and the disagreements about which theory is rationally favored would be replaced with objectively resolvable choices among exactly specified models.

The real problem, which is not just SI's problem, but a chronic and worsening problem of intellectual culture in the era of mathematically formalized science, is a dwindling of the ontological options to materialism, platonism, or an unstable combination of the two, and a similar restriction of epistemology to computation.

Any assertion that we need an ontology beyond materialism (or physicalism or naturalism) is liable to be immediately rejected by this audience, so I shall immediately explain what I mean. It's just the usual problem of "qualia". There are qualities which are part of reality - we know this because they are part of experience, and experience is part of reality - but which are not part of our physical description of reality. The problematic "belief in materialism" is actually the belief in the completeness of current materialist ontology, a belief which prevents people from seeing any need to consider radical or exotic solutions to the qualia problem. There is every reason to think that the world-picture arising from a correct solution to that problem will still be one in which you have "things with states" causally interacting with other "things with states", and a sensible materialist shouldn't find that objectionable.

What I mean by platonism, is an ontology which reifies mathematical or computational abstractions, and says that they are the stuff of reality. Thus assertions that reality is a computer program, or a Hilbert space. Once again, the qualia are absent; but in this case, instead of the deficient ontology being based on supposing that there is nothing but particles, it's based on supposing that there is nothing but the intellectual constructs used to model the world.

Although the abstract concept of a computer program (the abstractly conceived state machine which it instantiates) does not contain qualia, people often treat programs as having mind-like qualities, especially by imbuing them with semantics - the states of the program are conceived to be "about" something, just like thoughts are. And thus computation has been the way in which materialism has tried to restore the mind to a place in its ontology. This is the unstable combination of materialism and platonism to which I referred. It's unstable because it's not a real solution, though it can live unexamined for a long time in a person's belief system.

An ontology which genuinely contains qualia will nonetheless still contain "things with states" undergoing state transitions, so there will be state machines, and consequently, computational concepts will still be valid, they will still have a place in the description of reality. But the computational description is an abstraction; the ontological essence of the state plays no part in this description; only its causal role in the network of possible states matters for computation. The attempt to make computation the foundation of an ontology of mind is therefore proceeding in the wrong direction.

But here we run up against the hazards of computational epistemology, which is playing such a central role in artificial intelligence. Computational epistemology is good at identifying the minimal state machine which could have produced the data. But it cannot by itself tell you what those states are "like". It can only say that X was probably caused by a Y that was itself caused by Z.
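To make "identifying the minimal state machine" concrete, here is a toy sketch of my own (not anything SI actually uses): restrict attention to autonomous machines with no inputs, and brute-force the smallest transient-plus-cycle Moore machine that reproduces a finite observation sequence. Note that the output recovers only the causal structure of the states, exactly as the argument says; nothing about what those states are "like".

```python
def minimal_machine(obs):
    """Find the smallest autonomous Moore machine (a transient chain
    leading into a cycle) whose output stream reproduces obs."""
    n = len(obs)
    best = None  # (size, transient, period)
    for period in range(1, n + 1):
        for transient in range(0, n - period + 1):
            # obs must repeat with this period once past the transient
            if all(obs[i] == obs[i - period] for i in range(transient + period, n)):
                size = transient + period
                if best is None or size < best[0]:
                    best = (size, transient, period)
                break  # larger transients for this period are never smaller
    size, transient, period = best
    # state i emits obs[i]; states chain i -> i+1, last state closes the cycle
    nxt = {i: (i + 1 if i + 1 < size else transient) for i in range(size)}
    out = {i: obs[i] for i in range(size)}
    return nxt, out

def run(nxt, out, steps, start=0):
    """Replay the machine to check it regenerates the observations."""
    s, trace = start, []
    for _ in range(steps):
        trace.append(out[s])
        s = nxt[s]
    return trace
```

For the sequence 0,1,1,0,1,1,0,1 this finds a three-state cycle: the inferred model is just states, transitions, and outputs, a bare causal skeleton.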

Among the properties of human consciousness are knowledge that something exists, knowledge that consciousness exists, and a long string of other facts about the nature of what we experience. Even if an AI scientist employing a computational epistemology managed to produce a model of the world which correctly identified the causal relations between consciousness, its knowledge, and the objects of its knowledge, the AI scientist would not know that its X, Y, and Z refer to, say, "knowledge of existence", "experience of existence", and "existence". The same might be said of any successful analysis of qualia, knowledge of qualia, and how they fit into neurophysical causality.

It would be up to human beings - for example, the AI's programmers and handlers - to ensure that entities in the AI's causal model were given appropriate significance. And here we approach the second big problem, the enthusiasm for outsourcing the solution of hard problems of FAI design to the AI and/or to simulated human beings. The latter is a somewhat impractical idea anyway, but here I want to highlight the risk that the AI's designers will have false ontological beliefs about the nature of mind, which are then implemented apriori in the AI. That strikes me as far more likely than implanting a wrong apriori about physics; computational epistemology can discriminate usefully between different mathematical models of physics, because it can judge one state machine model as better than another, and current physical ontology is essentially one of interacting state machines. But as I have argued, not only must the true ontology be deeper than state-machine materialism, there is no way for an AI employing computational epistemology to bootstrap to a deeper ontology.

In a phrase: to use computational epistemology is to commit to state-machine materialism as your apriori ontology. And the problem with state-machine materialism is not that it models the world in terms of causal interactions between things-with-states; the problem is that it can't go any deeper than that, yet apparently we can. Something about the ontological constitution of consciousness makes it possible for us to experience existence, to have the concept of existence, to know that we are experiencing existence, and similarly for the experience of color, time, and all those other aspects of being that fit so uncomfortably into our scientific ontology.

It must be that the true epistemology, for a conscious being, is something more than computational epistemology. And maybe an AI can't bootstrap its way to knowing this expanded epistemology - because an AI doesn't really know or experience anything, only a consciousness, whether natural or artificial, does those things - but maybe a human being can. My own investigations suggest that the tradition of thought which made the most progress in this direction was the philosophical school known as transcendental phenomenology. But transcendental phenomenology is very unfashionable now, precisely because of apriori materialism. People don't see what "categorial intuition" or "adumbrations of givenness" or any of the other weird phenomenological concepts could possibly mean for an evolved Bayesian neural network; and they're right, there is no connection.

But the idea that a human being is a state machine running on a distributed neural computation is just a hypothesis, and I would argue that it is a hypothesis in contradiction with so much of the phenomenological data that we really ought to look for a more sophisticated refinement of the idea. Fortunately, 21st-century physics, if not yet neurobiology, can provide alternative hypotheses in which complexity of state originates from something other than concatenation of parts - for example, from entanglement, or from topological structures in a field. In such ideas I believe we see a glimpse of the true ontology of mind: one which from the inside resembles the ontology of transcendental phenomenology; which in its mathematical, formal representation may involve structures like iterated Clifford algebras; and which in its biophysical context would appear to be describing a mass of entangled electrons in that hypothetical sweet spot, somewhere in the brain, where there's a mechanism to protect against decoherence.

Of course this is why I've talked about "monads" in the past, but my objective here is not to promote neo-monadology, that's something I need to take up with neuroscientists and biophysicists and quantum foundations people. What I wish to do here is to argue against the completeness of computational epistemology, and to caution against the rejection of phenomenological data just because it conflicts with state-machine materialism or computational epistemology. This is an argument and a warning that should be meaningful for anyone trying to make sense of their existence in the scientific cosmos, but it has a special significance for this arcane and idealistic enterprise called "friendly AI". My message for friendly AI researchers is not that computational epistemology is invalid, or that it's wrong to think about the mind as a state machine, just that all that isn't the full story. A monadic mind would be a state machine, but ontologically it would be different from the same state machine running on a network of a billion monads. You need to do the impossible one more time, and make your plans bearing in mind that the true ontology is something more than your current intellectual tools allow you to represent.

Comments


Upvoted for the accurate and concise summary of the big picture according to SI.

There are qualities which are part of reality - we know this because they are part of experience, and experience is part of reality - but which are not part of our physical description of reality.

This continues to strike me as a category error akin to thinking that our knowledge of integrated circuit design is incomplete because we can't use it to account for Java classes.

I have been publicly and repeatedly skeptical of any proposal to make an AI compute the answer to a philosophical question you don't know how to solve yourself, not because it's impossible in principle, but because it seems quite improbable and definitely very unreliable to claim that you know that computation X will output the correct answer to a philosophical problem and yet you've got no idea how to solve it yourself. Philosophical problems are not problems because they are well-specified and yet too computationally intensive for any one human mind. They're problems because we don't know what procedure will output the right answer, and if we had that procedure we would probably be able to compute the answer ourselves using relatively little computing power. Imagine someone telling you they'd written a program requiring a thousand CPU-years of computing time to solve the free will problem.

And once again, I expect that the hardest part of the FAI problem is not "winning the intelligence race" but winning it with an AI design restricted to the much narrower part of the cognitive space that integrates with the F part, i.e., all algorithms must be conducive to clean self-modification. That's the hard part of the work.

What do you think the chances are that there is some single procedure that can be used to solve all philosophical problems? That for example the procedure our brains are using to try to solve decision theory is essentially the same as the one we'll use to solve consciousness? (I mean some sort of procedure that we can isolate and not just the human mind as a whole.)

If there isn't such a single procedure, I just don't see how we can possibly solve all of the necessary philosophical problems to build an FAI before someone builds an AGI, because we are still at the stage where every step forward we make just lets us see how many more problems there are (see Open Problems Related to Solomonoff Induction for example), and we are making forward steps so slowly, and worse, there's no good way of verifying that each step we take really is a step forward and not some erroneous digression.

What do you think the chances are that there is some single procedure that can be used to solve all philosophical problems?

Very low, of course. (Then again, relative to the perspective of nonscientists, there turned out to be a single procedure that could be used to solve all empirical problems.) But in general, problems always look much more complicated than solutions do; the presence of a host of confusions does not indicate that the set of deep truths underlying all the solutions is noncompact.

You invoke as granted the assumption that there's anything besides your immediately present self (including your remembered past selves) that has qualia, but then you deny that some anticipatable things will have qualia. Presumably there are some philosophically informed epistemic-ish rules that you have been using, and implicitly endorsing, for the determination of whether any given stimuli you encounter were generated by something with qualia, and there are some other meta-philosophical epistemology-like rules that you are implicitly using and endorsing for determining whether the first set of rules was correct. Can you highlight any suitable past discussion you have given of the epistemology of the problem of other minds?

eta: I guess the discussions here, or here, sort of count, in that they explain how you could think what you do... except they're about something more like priors than like likelihoods.

In retrospect, the rest of your position is like that too, based on sort of metaphysical arguments about what is even coherently postulable, though you treat the conclusions with a certainty I don't see how to justify (e.g. one of your underlying concepts might not be fundamental the way you imagine). So, now that I see that, I guess my question was mostly just a passive-aggressive way to object to your argument procedure. The objectionable feature made more explicit is that the constraint you propose on the priors requires such a gerrymandered-seeming intermediate event -- that consciousness-simulating processes which are not causally (and, therefore, in some sense physically) 'atomic' are not experienced, yet would still manage to generate the only kind of outward evidence about their experiencedness that anyone else could possibly experience without direct brain interactions or measurements -- in order to make the likelihood of the (hypothetical) observations (of the outward evidence of experiencedness, and of the absence of that outward evidence anywhere else) within the gerrymandered event come out favorably.

the problem with state-machine materialism is not that it models the world in terms of causal interactions between things-with-states; the problem is that it can't go any deeper than that, yet apparently we can.

I may have missed the part where you explained why qualia can't fit into a state-machine model of the universe. Where does the incompatibility come from? I'm aware that it looks like no human-designed mathematical objects have experienced qualia yet, which is some level of evidence for it being impossible, but not so strong that I think you're justified in saying a materialist/mathematical platonist view of reality can never account for conscious experiences.

I may have missed the part where you explained why qualia can't fit into a state-machine model of the universe.

I think Mitchell's point is that we don't know whether state-machines have qualia, and the costs of making assumptions could be large.

Parts of this I think are brilliant, other parts I think are absolute nonsense. Not sure how I want to vote on this.

there is no way for an AI employing computational epistemology to bootstrap to a deeper ontology.

This strikes me as probably true but unproven.

My own investigations suggest that the tradition of thought which made the most progress in this direction was the philosophical school known as transcendental phenomenology.

You are anthropomorphizing the universe.

the philosophical school known as transcendental phenomenology.

You are anthropomorphizing the universe.

Phenomenology is the study of appearances. The only part of the universe that it is directly concerned with is "you experiencing existence". That part of the universe is anthropomorphic by definition.

That is all very interesting, but what difference does it practically make?

Suppose I were trying to build an AGI out of computation and physical sensors and actuators, and I had what appeared to me to be a wonderful new approach, and I was unconcerned with whether the device would "really" think or have qualia, just with whether it worked to do practical things. Maybe I'm concerned with fooming and Friendliness, but again, only in terms of the practical consequences, i.e. I don't want the world suddenly turned into paperclips. At what point, if any, would I need to ponder these epistemological issues?

what difference does it practically make?

It will be hard for your AGI to be an ethical agent if it doesn't know who is conscious and who is not.

It's easy enough for us (leaving aside edge cases about animals, the unborn, and the brain dead, which in fact people find hard, or at least persistently disagree on). How do we do it? By any other means than our ordinary senses?

Is a human who is dissociating conscious? Or one who spaces out for a couple of seconds then retcons continuous consciousness later (as appears to be what brains actually do)? Or one who is talking and doing complicated things while sleepwalking?

Your point sounds similar to Wei's point that solving FAI requires metaphilosophy.

Some brief attempted translation for the last part:

A "monad", in Mitchell Porter's usage, is supposed to be a somewhat isolatable quantum state machine, with states and dynamics factorizable somewhat as if it was a quantum analogue of a classical dynamic graphical model such as a dynamic Bayesian network (e.g., in the linked physics paper, a quantum cellular automaton). (I guess, unlike graphical models, it could also be supposed to not necessarily have a uniquely best natural decomposition of its Hilbert space for all purposes, like how with an atomic lattice you can analyze it either in terms of its nuclear positions or its phonons.) For a monad to be a conscious mind, the monad must also at least be complicated and [this is a mistaken guess] capable of certain kinds of evolution toward something like equilibria of tensor-product-related quantum operators having to do with reflective state representation[/mistaken guess]. His expectation that this will work out is based partly on intuitive parallels between some imaginable combinatorially composable structures in the kind of tensor algebra that shows up in quantum mechanics and the known composable grammar-like structures that tend to show up whenever we try to articulate concepts about representation (I guess mostly the operators of modal logic).

(Disclaimer: I know almost only just enough quantum physics to get into trouble.)

A monadic mind would be a state machine, but ontologically it would be different from the same state machine running on a network of a billion monads.

Not all your readers will understand that "network of a billion monads" is supposed to refer to things like classical computing machinery (or quantum computing machinery?).

His expectation that this will work out is based partly on [...]

(It's also based on an intuition I don't understand that says that classical states can't evolve toward something like representational equilibrium the way quantum states can -- e.g. you can't have something that tries to come up with an equilibrium of anticipation/decisions, like neural approximate computation of Nash equilibria, but using something more like representations of starting states of motor programs that, once underway, you've learned will predictably try to search combinatorial spaces of options and/or redo a computation like the current one but with different details -- or that, even if you can get this sort of evolution in classical states, it's still knowably irrelevant. Earlier he invoked bafflingly intense intuitions about the obviously compelling ontological significance of the lack of spatial locality cues attached to subjective consciousness, such as "this quale is experienced in my anterior cingulate cortex, and this one in Wernicke's area", to argue that experience is necessarily nonclassically replicable. (As compared with, what, the spatial cues one would expect a classical simulation of the functional core of a conscious quantum state machine to magically become able to report experiencing?) He's now willing to spontaneously talk about non-conscious classical machines that simulate quantum ones (including not magically manifesting p-zombie subjective reports of spatial cues relating to its computational hardware), so I don't know what the causal role of that earlier intuition is in his present beliefs; but his reference to a "sweet spot", rather than a sweet protected quantum subspace of a space of network states or something, is suggestive, unless that's somehow necessary for the imagined tensor products to be able to stack up high enough.)

I don't know where you got the part about representational equilibria from.

My conception of a monad is that it is "physically elementary" but can have "mental states". Mental states are complex so there's some sort of structure there, but it's not spatial structure. The monad isn't obtained by physically concatenating simpler objects; its complexity has some other nature.

Consider the Game of Life cellular automaton. The cells are the "physically elementary objects" and they can have one of two states, "on" or "off".

Now imagine a cellular automaton in which the state space of each individual cell is a set of binary trees of arbitrary depth. So the sequence of states experienced by a single cell, rather than being like 0, 1, 1, 0, 0, 0,... might be more like (X(XX)), (XX), ((XX)X), (X(XX)), (X(X(XX)))... There's an internal combinatorial structure to the state of the single entity, and ontologically some of these states might even be phenomenal or intentional states.
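A minimal Python sketch of such a "tree-state" cellular automaton. The update rule (each cell's new tree pairs its neighbors' current trees) and the depth cap are arbitrary choices of mine, made only so the dynamics are concrete and the state space stays finite; nothing here is specified in the description above.

```python
# Cell states are binary trees: the leaf 'X', or a pair (left, right).
X = 'X'

def prune(t, d):
    """Cap tree depth at d, collapsing deeper detail to a leaf,
    so that each cell has a finite state space."""
    if d == 0 or t == X:
        return X
    return (prune(t[0], d - 1), prune(t[1], d - 1))

def step(cells, max_depth=3):
    """One update of a 1-D ring of cells: each cell's new tree is
    built by pairing its left and right neighbors' trees."""
    n = len(cells)
    return [prune((cells[(i - 1) % n], cells[(i + 1) % n]), max_depth)
            for i in range(n)]
```

Starting from a ring like [X, (X, X), X, X], the history of any single cell is then a stream of trees with internal combinatorial structure - like the (X(XX)), (XX), ((XX)X) sequence above - rather than a stream of 0s and 1s.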

Finally, if you get this dynamics as a result of something like the changing tensor decomposition of one of those quantum CAs, then you would have a causal system which mathematically is an automaton of "tree-state" cells, ontologically is a causal grid of monads capable of developing internal intentionality, and physically is described by a Hamiltonian built out of Pauli matrices, such as might describe a many-body quantum system.

Furthermore, since the states of the individual cell can have great or even arbitrary internal complexity, it may be possible to simulate the dynamics of a single grid-cell in complex states, using a large number of grid-cells in simple states. The simulated complex tree-states would actually be a concatenation of simple tree-states. This is the "network of a billion simple monads simulating a single complex monad".

Do you think that the outputs of human philosophers of mind, or physicists thinking about consciousness, can't be accurately modeled by computational processes, even with access to humans? If they can be predicted or heard, then they can be deferred to.

CEV is supposed to extrapolate our wishes "if we knew more", and the AI may be so sure that consciousness doesn't really exist in some fundamental ontological sense that it will override human philosophers' conclusions and extrapolate them as if they also thought consciousness doesn't exist in this ontological sense. (ETA: I think Eliezer has talked specifically about fixing people's wrong beliefs before starting to extrapolate them.) I share a similar concern, not so much about this particular philosophical problem, but that the AI will be wrong on some philosophical issue and reach some kind of disastrous or strongly suboptimal conclusion.

Maybe I missed this, but did you ever write up the Monday/Tuesday game with your views on consciousness? On Monday, consciousness is an algorithm running on a brain, and when people say they have consciously experienced something, they are reporting the output of this algorithm. On Tuesday, the true ontology of mind resembles the ontology of transcendental phenomenology. What's different?

I'm also confused about why an algorithm couldn't represent a mass of entangled electrons.

I find Mitchell_Porter difficult to understand, but I've voted this up just for the well-written summary of SI's strategy (can an insider tell me whether the summary is accurate?)

Just one thing though - I feel like this isn't the first time I've seen How An Algorithm Feels From Inside linked to as if it was talking about qualia - which it really isn't. It would be a good title for an essay about qualia, but the actual text is more about general dissolving-the-question stuff.

The simple solution is to demystify qualia: I don't understand the manner in which ionic transfer within my brain appears to create sensation, but I don't have to make the jump from that to 'sensation and experience are different from brain state'. All of my sense data comes through channels- typically as an ion discharge through a nerve or a chemical in my blood. Those ion discharges and chemicals interact with brain cells in a complicated manner, and "I" "experience" "sensation". The experience and sensation are no more mysterious than the identity.

So what should I make of this argument if I happen to know you're actually an upload running on classical computing hardware?

Although the abstract concept of a computer program (the abstractly conceived state machine which it instantiates) does not contain qualia, people often treat programs as having mind-like qualities, especially by imbuing them with semantics - the states of the program are conceived to be "about" something, just like thoughts are.

So far as I can tell, I am also in the set of programs that are treated as having mind-like qualities by imbuing them with semantics. We go to a good deal of trouble to teach people to treat themselves and others as people; this seems to be a major focus of early childhood education, language learning, and so on. "Semantics" and "aboutness" have to do with language use, after all; we learn how to make words do things.

Consciousness may not be a one-place property ("I am conscious"); it is a two-place property ("I notice that I am conscious"; "I notice that you are conscious"; "You notice that I am conscious"). After all, most of the time we are not aware of our consciousness.

Consciousness is not continuous - it appears to be something we retcon after the fact.

In addition to being a great post overall, the first ~half of the post is a really excellent and compact summary of huge and complicated interlocking ideas. So, thanks for writing that, it's very useful to be able to see how all the ideas fit together at a glance, even if one already has a pretty good grasp of the ideas individually.

I've formed a tentative hypothesis that some human beings experience their own subjective consciousness much more strongly than others. An even more tentative explanation for why this might happen is that perhaps the brain regions responsible for reflective self awareness leading to that particular Strange Loop become activated or used intensely in some people to a greater degree and at an earlier age. Perhaps thinking a lot about "souls" at a young age causes you to strongly anchor every aspect of your awareness to a central observer-concept, including awareness of that observer-concept.

I don't know.

Anyway, this subpopulation of highly self-conscious people might feel strongly that their own qualia are, in fact, ontologically fundamental, and all their other sense data are illusory. (I say "other" sense data because fundamentally your prefrontal cortex has no idea which objects of its input stream represent what we would naively classify as processed "sense data" versus which objects consist of fabrication and abstraction from other parts of the brain.)

The rest of the human population would be less likely to make statements about their subjective experience of their own awareness because that experience is less immediate for them.

I developed this hypothesis upon realizing that some people immediately know what you mean when you start talking about reflective self-awareness, and for these people the idea of an observer-spirit consciousness seems like a natural and descriptive explanation of their awareness. But then there are other people who look at you blankly when you make statements about your sense of yourself as a floating observer, no matter how you try to explain it, as if they had really never reflected on their own awareness.

For myself, I think of "redness" as belonging to the same ontological class as an object in a C++ program - possessing both a physical reality in bits on a chip, and also complex representational properties within the symbolic system of C++. And I see no reason why my C++ program couldn't ultimately form an "awareness" object which then becomes aware of itself. And that could explain sentences the C++ program outputs about having an insistent sense of its own awareness. It actually sticks in my craw to say this. I, personally, have always had a strong sense of my own awareness and a powerful temptation to believe that there is some granular soul within me. I am not entirely satisfied with how this soul is treated by a reductionistic approach, but nor can I formulate any coherent objections, despite my best efforts.

ed: Are people literally downvoting every reply that has anything good to say about the parent?

Upvoted for clarity.

I think, along with most LWers, that your concerns about qualia and the need for a new ontology are mistaken. But even granting that part of your argument, I don't see why it is problematic to approach the FAI problem through simulation of humans. Yes, you would only be simulating their physical/computational aspects, not the ineffable subjectiveness, but does that loss matter, for the purposes of seeing how the simulations react to different extrapolations and trying to determine CEV? Only if a) the qualia humans experience are related to their concrete biology and not to their computational properties, and b) the relation is two-way, so the qualia are not epiphenomenal to behavior but affect it causally, and physics as we understand it is not causally closed. But in that case, you would not be able to make a good computational simulation of a human's behavior in the first place!

In conclusion, assuming that faithful computational simulations of human behavior are possible, I don't see how the qualia problem interferes with using them to determine CEV and/or help program FAI. There might be other problems with this line of research (I am not endorsing it) but the simulations not having an epiphenomenal inner aspect that true humans have does not interfere. (In fact, it is good--it means we can use simulations without ethical qualms!)

You need to do the impossible one more time, and make your plans bearing in mind that the true ontology [...] something more than your current intellectual tools allow you to represent.

With the "is" removed and replaced by an implied "might be", this seems like a good sentiment...

...well, given scenarios in which there were some other process that could come to represent it, such that there'd be a point in using (necessarily-)current intellectual tools to figure out how to stay out of those processes' way...

...and depending on the relative payoffs, and the other processes' hypothetical robustness against interference.

(To the extent that decomposing the world into processes that separately come to do things, and can be "interfered with" or not, makes sense at all, of course.)

A more intelligible argument than the specific one you have been making is merely "we don't know whether there are any hidden philosophical, contextual, or further-future gotchas in whether or not a seemingly valuable future would actually be valuable". But in that case it seems like you need a general toolset to try to eventually catch the gotcha hypotheses you weren't by historical accident already disposed to turn up, the same way algorithmic probability is supposed to help you organize your efforts to be sure you've covered all the practical implications of hypotheses about non-weird situations. As a corollary: it would be helpful to propose a program of phenomenological investigation that could be expected to cover the same general amount of ground where possible gotchas could be lurking as would designing an AI to approximate a universal computational hypothesis class.

If it matters, the only scenario I can think of specifically relating to quantum mechanics is that there are forms of human communication which somehow are able to transfer qubits, that these matter for something, and that a classical simulation wouldn't preserve them at the input (and/or the other boundaries).

I've been trying to find a way to empathically emulate people who talk about quantum consciousness for a while, so far with only moderate success. Mitchell, I'm curious if you're aware of the work of Christof Koch and Giulio Tononi, and if so, could you speak to their approach?

For reference (if people aren't familiar with the work already) Koch's team is mostly doing experiments... and seems to be somewhat close to having mice that have genes knocked out so that they "logically would seem" to lack certain kinds of qualia that normal mice "logically would seem" to have. Tononi collaborates with him and has proposed a way to examine a thing that computes and calculates that thing's "amount of consciousness" using a framework he called Integrated Information Theory. I have not sat down and fully worked out the details of IIT such that I could explain it to a patient undergrad at a chalkboard, but the reputation of the people involved is positive (I've seen Koch's dog and pony show a few times and it has improved substantially over the years and he is pimping Tononi pretty effectively)... basically the content "smells promising" but I'm hoping I can hear someone else's well informed opinion to see if I should spend more time on it.

Also, it seems to be relevant to this philosophic discussion? Or not? That's what I'm wondering. Opinions appreciated :-)

It bugs me when people talk about "quantum consciousness", given that classical computers can do anything quantum computers can do, only sometimes slower.

To summarize (mostly for my sake so I know I haven't misunderstood the OP):

  • 1.) Subjective conscious experience or qualia play a non-negligible role in how we behave and how we form our beliefs, especially of the mushy (technical term) variety that ethical reasoning is so bound up in.
  • 2.) The current popular computational flavor of philosophy of mind has inadequately addressed qualia in your eyes because the universality of the extended Church-Turing thesis, though satisfactorily covering the mechanistic descriptions of matter in a way that provides for emulation of the physical dynamics, does not tell us anything about what things would have subjective conscious experiences.
  • 3.) Features of quantum mechanics such as entanglement and topological structures in a relativistic quantum field provide a better ontological foundation for your speculative theories of consciousness, which take as their inspiration phenomenology and a quantum monadology.

EDIT: I guess the shortest synopsis of this whole argument is: we need to build qualia machines, not just intelligent machines, and we don't have any theories yet to help us do that (other than the normal, but delightful, 9-month process we currently use). I can very much agree with #1. Now, with #2, it is true that the explanatory gap of qualia does not yield to the computational descriptions of physical processes, but it is also true that the universe may just be constructed such that this computational description is the best we can get, and we will just have to accept that qualia will be experienced by those computational systems that are organized in particular ways, the brain being one arrangement of such systems. And, for #3, without more information about your theory, I don't see how appealing to ontologically deeper physical processes would get you any further in explaining qualia; you need to give us more.