A Defense of Naive Metaethics

In this post I aim to argue that we can make statements about what should and should not be done that cannot be reduced, by definition, to statements about the physical world.

A Naive Argument

Lukeprog says this in one of his posts:

If someone makes a claim of the 'ought' type, either they are talking about the world of is, or they are talking about the world of is not. If they are talking about the world of is not, then I quickly lose interest because the world of is not isn't my subject of interest.

I would like to question that statement. I would guess that lukeprog's chief subject of interest is figuring out what to do with the options presented to him. His interest is, therefore, in figuring out what he ought to do.

Consider the reasoning process that takes him from observations about the world to actions. He sees something, and then thinks, and then thinks some more, and then decides. Moreover, he can, if he chooses, express every step of this reasoning process in words. Does he really lose interest at the last step?

My goal here is to get people to feel the intuition that "I ought to do X" means something, and that thing is not "I think I ought to do X" or "I would think that I ought to do X if I were smarter and some other stuff".

(If you don't, I'm not sure what to do.)

People who do feel that intuition run into trouble. This is because "I ought to do X" does not refer to anything that exists. How can you make a statement that doesn't refer to anything that exists?

I've done it, and my reasoning process is still intact, and nothing has blown up. Everything seems to be fine. No one has explained to me what isn't fine about this.

Since it's intuitive, why would you not want to do it that way?

(You can argue that certain words, for certain people, do not refer to what one ought to do. But it's a different matter to suggest that no word refers to what one ought to do beyond facts about what is.)

A Flatland Argument

"I'm not interested in words, I'm interested in things. Words are just sequences of sounds or images. There's no way a sequence of arbitrary symbols could imply another sequence, or inform a decision."

"I understand how logical definitions work. I can see how, from a small set of axioms, you can derive a large number of interesting facts. But I'm not interested in words without definitions. What does "that thing, over there" mean? Taboo finger-pointing."

"You can make statements about observations, that much is obvious. You can even talk about patterns in observations, like "the sun rises in the morning". But I don't understand your claim that there's no chocolate cake at the center of the sun. Is it about something you can see? If not, I'm not interested."

"Claims about the past make perfect sense, but I don't understand what you mean when you say something is going to happen. Sure, I see that chair, and I remember seeing the chair in the past, but what do you mean that the chair will still be there tomorrow? Taboo "will"."

Not every set of claims is reducible to every other set of claims. There is nothing special about the set "claims about the state of the world, including one's place in it and ability to affect it." If, however, you add ought-claims, then you get a very special set: the set of all the information you need to make correct decisions.

I can't see a reason to make claims that aren't reducible, by definition, to that.

The Bootstrapping Trick

Suppose an AI wants to find out what Bob means when he says "water". The AI could ask him whether various items are or are not water. But Bob might get temporarily confused in any number of ways - he could mix up his words, he could hallucinate, or anything else. So the AI decides instead to wait: it will give Bob the time, and everything else he needs, to make each decision. In this way, by giving Bob all the abilities he needs, the AI can duplicate the abstract process that decides whether something is or is not "water".

The following statement is true:

A substance is water (in Bob's language) if and only if Bob, given all the time, intelligence, and other resources he wants, decides that it is water. 

But this is certainly not the definition of water! Imagine if Bob used this criterion to evaluate what was and was not water. He would suffer from an infinite regress. The definition of water is something else. The statement "This is water" reduces to a set of facts about this, not a set of facts about this and Bob's head. 
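The regress can be made concrete with a toy sketch (all function names here are hypothetical, invented purely for illustration):

```python
# Toy illustration of the regress above. Suppose Bob's criterion for
# "water" were literally "whatever I would decide, given unlimited
# resources, is water". Then his criterion points back at itself.

def bob_decides(sample):
    # Bob, deliberating carefully, applies his criterion for water...
    return is_water(sample)  # ...but the criterion is defined below.

def is_water(sample):
    # "X is water iff Bob, fully resourced, decides X is water."
    return bob_decides(sample)

# Calling is_water("H2O") never bottoms out in facts about the sample
# itself; it recurses forever. The biconditional is true of Bob, but it
# cannot serve as his definition.
```

The biconditional is a sound test procedure for an outside observer, but as a definition it has no base case, which is exactly the infinite regress described above.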

The extension to morality should be obvious.

What one is forced to do by this argument, if one wants to speak only in physical statements, is to say that "should" has a really, really long definition that incorporates all components of human value. When a simple word has a really, really long definition, we should worry that something is up.

Well, why does it have a long definition? It has a long definition because that's what we believe is important. To say that people who use (in this sense) "should" to mean different things just disagree about definitions is to paper over and cover up the fact that they disagree about what's important.

What do I care about?

In this essay I talk about what I believe rather than what I care about. What I care about seems like an entirely emotional question to me. I cannot Shut Up and Multiply about what I care about. If I do, in fact, Shut Up and Multiply, then it is because I believe that doing so is right. Suppose I believe that my future emotions will follow multiplication. I would have to, then, believe that I am going to self-modify into someone who multiplies. I would only do this because of a belief that doing so is right.

Belief and logical reasoning are an important part of how people on lesswrong think about morality, and I don't see how to incorporate them into a metaethics based not on beliefs, but on caring.

 

Comments


I share your skepticism about Luke's statement (but I've been waiting to criticize until he finishes his sequence to see if he addresses the problems later).

My goal here is to get people to feel the intuition that "I ought to do X" means something, and that thing is not "I think I ought to do X" or "I would think that I ought to do X if I were smarter and some other stuff".

To help pump that intuition, consider this analogy:

"X is true" (where X is a mathematical statement) means something, and that thing is not "I think X is true" or "I would think that X is true if I were smarter and some other stuff".

On the other hand, I think it's also possible that "I ought to do X" doesn't really mean anything. See my What does a calculator mean by "2"?. (ETA: To clarify, I mean some usages of "ought" may not really mean anything. There are some usages that clearly do, for example "If you want to accomplish X, then you ought to do Y" can in principle be straightforwardly reduced to a mathematical statement about decision theory, assuming that our current strong intuition that there is such a thing as "the right decision theory" is correct.)

Wei Dai,

I would prefer to hear the source of your skepticism now, if possible. I anticipate not actually disagreeing. I anticipate that we will argue it out and discover that we agree but that my way of expressing my position was not clear to you at first. And then I anticipate using this information to improve the clarity of my future posts.

I'll first try to restate your position in order to check my understanding. Let me know if I don't do it justice.

People use "should" in several different ways. Most of these ways can be "reducible to physics", or in other words can be restated as talking about how our universe is, without losing any of the intended meaning. Some of these ways can't be so reduced (they are talking about the world of "is not") but those usages are simply meaningless and can be safely ignored.

I agree that many usages of "should" can be reduced to physics. (Or perhaps instead to mathematics.) But there may be other usages that can't be so reduced, and which are not clearly safe to ignore. Originally I was planning to wait for you to list the usages of "should" that can be reduced, and then show that there are other usages that are not obviously talking about "the world of is" but are not clearly meaningless either. (Of course I hope that your reductions do cover all of the important/interesting usages, but I'm not expecting that to be the case.)

Since you ask for my criticism now, I'll just give an example that seems to be one of the hardest to reduce: "Should I consider the lives of random strangers to have (terminal) value?"

(Eliezer's proposal is that what I'm really asking when I ask that question is "Does my CEV think the lives of random strangers should have (terminal) value?" I've given various arguments why I find this solution unsatisfactory. One that is currently fresh on my mind is that "coherent extrapolation" is merely a practical way to find the answer to any given question, but should not be used as the definition of what the question means. For example I could use a variant of CEV (call it Coherent Extrapolated Pi Estimation) to answer "What is the trillionth digit of pi?" but that doesn't imply that by "the trillionth digit of pi" I actually mean "the output of CEPE".)

I'm not planning to list all the reductions of normative language. There are too many. People use normative language in too many ways.

Also, I should clarify that when I talk about reducing ought statements into physical statements, I'm including logic. On my view, logic is just a feature of the language we use to talk about physical facts. (More on that if needed.)

Most of these ways can be "reducible to physics"... without losing any of the intended meaning.

I'm not sure I would say "most."

But there may be other usages that can't be so reduced, and which are not clearly safe to ignore.

What do you mean by "safe to ignore"?

If you're talking about something that doesn't reduce (even theoretically) into physics and/or a logical-mathematical function, then what are you talking about? Fiction? Magic? Those are fine things to talk about, as long as we understand we're talking about fiction or magic.

Should I consider the lives of random strangers to have (terminal) value?

What about this is hard to reduce? We can ask what you mean by 'should' in this question, and reduce it if possible. Perhaps what you have in mind isn't reducible (divine commands), but then your question is without an answer.

Or perhaps you're asking the question in the sense of "Please fix my broken question for me. I don't know what I mean by 'should'. Would you please do a stack trace on the cognitive algorithms that generated that question, fix my question, and then answer it for me?" And in that case we're doing empathic metaethics.

I'm still confused as to what your objection is. Will you clarify?

What do you mean by "safe to ignore"?

You said that you're not interested in an "ought" sentence if it reduces to talking about the world of is not. I was trying to make the same point by "safe to ignore".

If you're talking about something that doesn't reduce (even theoretically) into physics and/or a logical-mathematical function, then what are you talking about?

I don't know, but I don't think it's a good idea to assume that only things that are reducible to physics and/or math are worth talking about. I mean, it's a good working assumption to guide your search for possible meanings of "should", but why declare that you're not "interested" in anything else? Couldn't you make that decision on a case-by-case basis, just in case there is a meaning of "should" that talks about something else besides physics and/or math and its interestingness will be apparent once you see it?

Or perhaps you're asking the question in the sense of "Please fix my broken question for me. I don't know what I mean by 'should'. Would you please do a stack trace on the cognitive algorithms that generated that question, fix my question, and then answer it for me?" And in that case we're doing empathic metaethics.

Maybe I should have waited until you finish your sequence after all, because I don't know what "doing empathic metaethics" actually entails at this point. How are you proposing to "fix my question"? It's not as if there is a design spec buried somewhere in my brain, and you can check my actual code against the design spec to see where the bug is... Do you want to pick up this conversation after you explain it in more detail?

I don't think it's a good idea to assume that only things that are reducible to physics and/or math are worth talking about. I mean it's a good working assumption to guide your search for possible meanings of "should", but why declare that you're not "interested" in anything else?

Maybe this is because I'm fairly confident of physicalism? Of course I'll change my mind if presented with enough evidence, but I'm not anticipating such a surprise.

'Interest' wasn't the best word for me to use. I'll have to fix that. All I was trying to say is that if somebody uses 'ought' to refer to something that isn't physical or logical, then this punts the discussion back to a debate over physicalism, which isn't the topic of my already-too-long 'Pluralistic Moral Reductionism' post.

Surely, many people use 'ought' to refer to things non-reducible to physics or logic, and they may even be interesting (as in fiction), but in the search for true statements that use 'ought' language they are not 'interesting', unless physicalism is false (which is a different discussion, then).

Does that make sense? I'll explain empathic metaethics in more detail later, but I hope we can get some clarity on this part right now.

Maybe this is because I'm fairly confident of physicalism? Of course I'll change my mind if presented with enough evidence, but I'm not anticipating such a surprise.

First I would call myself a radical platonist instead of a physicalist. (If all universes that exist mathematically also exist physically, perhaps it could be said that there is no difference between platonism and physicalism, but I think most people who call themselves physicalists would deny that premise.) So I think it's likely that everything "interesting" can be reduced to math, but given the history of philosophy I don't think I should be very confident in that. See my recent How To Be More Confident... That You're Wrong.

Right, I'm pretty partial to Tegmark, too. So what I call physicalism is compatible with Tegmark. But could you perhaps give an example of what it would mean to reduce normative language to a logical-mathematical function - even a silly one?

(It's late and I'm thinking up this example on the spot, so let me know if it doesn't make sense.)

Suppose I'm in a restaurant and I say to my dinner companion Bob, "I'm too tired to think tonight. You know me pretty well. What do you think I should order?" From the answer I get, I can infer (when I'm not so tired) a set of joint constraints on what Bob believes to be my preferences, what decision theory he applied on my behalf, and the outcome of his (possibly subconscious) computation. If there is little uncertainty about my preferences and the decision theory involved, then the information conveyed by "you should order X" in this context just reduces to a mathematical statement about (for example) what the arg max of a set of weighted averages is.

(I notice an interesting subtlety here. Even though what I infer from "you should order X" is (1) "according to Bob's computation, the arg max of ... is X", what Bob means by "you should order X" must be (2) "the arg max of ... is X", because if he means (1), then "you should order X" would be true even if Bob made an error in his computation.)

Maybe this is because I'm fairly confident of physicalism? Of course I'll change my mind if presented with enough evidence, but I'm not anticipating such a surprise.

You'd need the FAI able to change its mind as well, which requires that you retain this option in its epistemology. To attack the communication issue from a different angle, could you give examples of the kinds of facts you deny? (Don't say "god" or "magic", give a concrete example.)

People who do feel that intuition run into trouble. This is because "I ought to do X' does not refer to anything that exists. How can you make a statement that doesn't refer to anything that exists?

It refers to my preferences which are physically encoded in my brain. It feels like it doesn't refer to anything that exists because I don't have complete introspective access to the mechanisms by which my brain decides that it wants something.

On top of that, ought refers to lots of different things, and as far as I can tell, most ought statements are summaries of specific preferences (and some signals) rather than the even more complicated description of what I'm actually going to choose to do.

Will and I just spoke on the phone, so here's another way to present our discussion:

Imagine a species of artificial agents. These agents have a list of belief statements that relate physical phenomena to normative properties (let's call them 'moral primitives'):

  • 'Liking' reward signals in human brains are good.
  • Causing physical pain in human infants is forbidden.
  • etc.

These agents also have a list of belief statements about physical phenomena in general:

  • Sweet tastes on the tongue produce reward signals in human brains.
  • Cutting the fingers of infants produces physical pain in infants.
  • Things are made of atoms.
  • etc.

These agents also have an 'ought' function that includes a series of logical statements that relate normative concepts to each other, such as:

  • A thing can't be both permissible and forbidden.
  • A thing can't be both obligatory and non-obligatory.
  • etc.

Finally, these robots have actuators that are activated by a series of rules like:

  • When the agent observes an opportunity to perform an action that is 'obligatory', then it will take that action.
  • An agent will avoid any action that is labeled as 'forbidden.'

Some of these rules might include utility functions that encode ordinal or cardinal value for varying combinations of normative properties.

These agents can't see their own source code. The combination of the moral primitives and the ought function and the non-ought belief statements and a set of rules about behavior produces their action and their verbal statements about what ought to be done.

From their behavior and verbal ought statements these robots can infer to some degree how their ought function works, but they can't fully describe it: they haven't run enough tests, the function may be too complicated, and, making the problem worse, they also can't see their moral primitives.

The ought function doesn't reduce to physics because it's a set of purely logical statements. The 'meaning' of ought in this sense is determined by the role that the ought function plays in producing intentional behavior by the robots.

Of course, the robots could speak in ought language in stipulated ways, such that 'ought' means 'that which produces pleasure in human brains' or something like that, and this could be a useful way to communicate efficiently, but it wouldn't capture what the ought function is doing or how it is contributing to the production of behavior by these agents.
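The agent design described above can be sketched in code. This is a toy model only: the particular primitives, beliefs, and actuator rules are invented stand-ins for the bullet lists above, and the real point is the separation of layers, with the ought function containing no physics at all.

```python
# Toy model of the agents described above. Three separate layers:
#   physical beliefs  : action -> physical phenomenon
#   moral primitives  : physical phenomenon -> normative label
#   ought function    : purely logical relations among normative labels
# Actuator rules then map the resulting verdict to behavior.

PHYSICAL_BELIEFS = {
    "offer_sweet_taste": "liking_reward_signal",   # sweet taste -> reward
    "cut_infant_finger": "infant_physical_pain",   # cutting -> pain
}

MORAL_PRIMITIVES = {
    "liking_reward_signal": "good",
    "infant_physical_pain": "forbidden",
}

def ought_function(label):
    # Purely logical constraints among normative labels; no physics here.
    # E.g. nothing can be both permissible and forbidden.
    if label == "forbidden":
        return {"permissible": False, "avoid": True}
    return {"permissible": True, "avoid": False}

def decide(action):
    # Physical belief -> moral primitive -> ought function -> actuator rule.
    phenomenon = PHYSICAL_BELIEFS[action]
    label = MORAL_PRIMITIVES[phenomenon]
    verdict = ought_function(label)
    return "avoid" if verdict["avoid"] else "may perform"

print(decide("cut_infant_finger"))  # avoid
print(decide("offer_sweet_taste"))  # may perform
```

In this sketch, `ought_function` reduces to a logical function alone, while a stipulated usage like "'ought' means 'that which produces reward signals'" would correspond to collapsing the first two tables into it, which is exactly the distinction the comment draws.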

What Will is saying is that it's convenient to use 'ought' language to refer to this ought function only, and not also to a combination of the ought function and statements about physics, as happens when we stipulatively use 'ought' to talk about 'that which produces well-being in conscious creatures' (for example).

I'm saying that's fine, but it can also be convenient (and intuitive) for people to use 'ought' language in ways that reduce to logical-physical statements, and not only in ways that express a logical function that contains only transformations between normative properties. So we don't have substantive disagreement on this point; we merely have different intuitions about the pragmatic value of particular uses for 'ought' language.

We also drew up a simplified model of the production of human action in which there is a cognitive module that processes the 'ought' function (made of purely logical statements, like the robots' ought function), a cognitive module that processes habits, a cognitive module that processes reflexes, and so on. Each of these produces an output, and another module runs arg max on these action options to determine which action 'wins' and actually occurs.

Of course, the human 'ought' function is probably spread across multiple modules, as is the 'habit' function.

Will likes to think of the 'meaning' of 'ought' as being captured by the algorithm of this 'ought' function in the human brain. This ought function doesn't contain physical beliefs, but rather processes primitive normative/moral beliefs (from outside the ought function) and outputs particular normative/moral judgments, which contribute to the production of human behavior (including spoken moral judgments). In this sense, 'ought' in Will's sense of the term doesn't reduce to physical facts, but to a logical function.

I'm fine with Will using 'ought' in that sense if he wants. I'll try to be clear how I am using the term when I use it.

Will also thinks that the 'ought' function (in his sense) inside human brains is probably very similar between humans - ones that aren't brain damaged or neurologically deranged. I don't know how probable this is because cognitive neuroscience hasn't progressed that far. But if the 'ought' function is the same in all healthy humans, then there needn't be a separate 'meaning' of ought (in Will's sense) for each speaker, but instead there could be a shared 'meaning' of ought (in Will's sense) that is captured by the algorithms of the 'ought' cognitive module that is shared by healthy human brains.

Will, did I say all of that correctly?

I'm fine with Will using 'ought' in that sense if he wants. I'll try to be clear how I am using the term when I use it.

That doesn't seem right. Compare (note that I don't necessarily endorse the rest of this paper):

What does the word ‘ought’ mean? Strictly speaking, this is an empirical question, about the meaning of a word in English. Such empirical semantic questions should ideally be answered on the basis of extensive empirical evidence about the use of the word by native speakers of English.

As a philosopher, I am primarily interested, not in empirical questions about the meanings of words, but in the nature of the concepts that those words can be used to express — especially when those concepts are central to certain branches of philosophy, as the concepts expressed by ‘ought’ are central to ethics and to the theory of rational choice and rational belief. Still, it is often easiest to approach the task of giving an account of the nature of certain concepts by studying the meanings of the words that can express those concepts. This is why I shall try here to outline an account of the meaning of ‘ought’.

If you examine just one particular sense of the word "ought", even if you make clear which sense, but without systematically enumerating all of the meanings of the word, how can you know that the concept you end up studying is the one that is actually important, or one that other people are most interested in?

How can you make a statement that doesn't refer to anything that exists? I've done it, and my reasoning process is still intact, and nothing has blown up. Everything seems to be fine. No one has explained to me what isn't fine about this. Since it's intuitive, why would you not want to do it that way?

Clearly, you can make statements about things that don't exist. People do it all the time, and I don't object to it. I enjoy works of fiction, too. But if the aim of our dialogue is true claims about reality, then you've got to talk about things that exist - whether the subject matter is 'oughts' or not.

What one is forced to do by this argument, if one wants to speak only in physical statements, is to say that "should" has a really, really long definition that incorporates all components of human value. When a simple word has a really, really long definition, we should worry that something is up.

I don't see why this needs to be the case. I can stipulate short meanings of 'should' as I use the term. People do this all the time (implicitly, at least) when using hypothetical imperatives.

Also, in general I find myself confused by your way of talking about these things. It's not a language I'm familiar with, so I suspect I'm still not fully understanding you. I'm not sure which of our anticipations differ because of the disagreement you're trying to express.

Can you explain what implications (if any) this "naive" metaethics has on the problem how to build an FAI?