True numbers and fake numbers

In physical science the first essential step in the direction of learning any subject is to find principles of numerical reckoning and practicable methods for measuring some quality connected with it. I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the state of Science, whatever the matter may be.

-- Lord Kelvin

If you believe that science is about describing things mathematically, you can fall into a strange sort of trap where you come up with some numerical quantity, discover interesting facts about it, use it to analyze real-world situations - but never actually get around to measuring it. I call such things "theoretical quantities" or "fake numbers", as opposed to "measurable quantities" or "true numbers".

An example of a "true number" is mass. We can measure the mass of a person or a car, and we use these values in engineering all the time. An example of a "fake number" is utility. I've never seen a concrete utility value used anywhere, though I always hear about nice mathematical laws that it must obey.

The difference is not just about units of measurement. In economics you can see fake numbers happily coexisting with true numbers using the same units. Price is a true number measured in dollars, and you see concrete values and graphs everywhere. "Consumer surplus" is also measured in dollars, but good luck calculating the consumer surplus of a single cheeseburger, never mind drawing a graph of aggregate consumer surplus for the US! If you ask five economists to calculate it, you'll get five different indirect estimates, and it's not obvious that there's a true number to be measured in the first place.

Another example of a fake number is "complexity" or "maintainability" in software engineering. Sure, people have proposed different methods of measuring it. But if they were measuring a true number, I'd expect them to agree to the 3rd decimal place, which they don't :-) The existence of multiple measuring methods that give the same result is one of the differences between a true number and a fake one. Another sign is what happens when two of these methods disagree: do people say that they're both equally valid, or do they insist that one must be wrong and try to find the error?
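To make the disagreement concrete, here is a minimal sketch that scores two tiny snippets with two crude "complexity" measures. Both measures (`line_count` and `branch_count`) and both snippets are hypothetical stand-ins invented for illustration, not established metrics; the only point is that the two methods rank the same code in opposite orders.

```python
import ast


def line_count(source):
    """Crude metric 1: number of non-blank source lines."""
    return sum(1 for line in source.splitlines() if line.strip())


def branch_count(source):
    """Crude metric 2: number of branching constructs in the syntax tree."""
    branch_nodes = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp)
    tree = ast.parse(source)
    return sum(isinstance(node, branch_nodes) for node in ast.walk(tree))


# Hypothetical snippet A: long but completely straight-line.
snippet_a = '''
def a(x):
    y = x + 1
    y = y * 2
    y = y - 3
    y = y / 4
    return y
'''

# Hypothetical snippet B: short but branchy.
snippet_b = '''
def b(x):
    if x > 0 and x < 10:
        return 1
    return 0
'''

for name, src in (("a", snippet_a), ("b", snippet_b)):
    print(name, line_count(src), branch_count(src))
```

By line count A is the more "complex" snippet; by branch count B is. Nobody would conclude that one of the two counters must contain a bug, which is the second warning sign described above.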

It's certainly possible to improve something without measuring it. You can learn to play the piano pretty well without quantifying your progress. But we should probably try harder to find measurable components of "intelligence", "rationality", "productivity" and other such things, because we'd be better at improving them if we had true numbers in our hands.

Comments

I think there is a tale to tell about the consumer surplus and it goes like this.

Alice loves widgets. She would pay $100 for a widget. She goes on line and finds Bob offering widgets for sale for $100. Err, that is not really what she had in mind. She imagined paying $30 for a widget, and feeling $70 better off as a consequence. She emails Bob: How about $90?

Bob feels like giving up altogether. It takes him ten hours to hand craft a widget and the minimum wage where he lives is $10 an hour. He was offering widgets for $150. $100 is the absolute minimum. Bob replies: No.

While Alice is deciding whether to pay $100 for a widget that is only worth $100 to her, Carol puts the finishing touches to her widget making machine. At the press of a button Carol can produce a widget for only $10. She activates her website, offering widgets for $40. Alice orders one at once.

How would Eve the economist like to analyse this? She would like to identify a consumer surplus of 100 - 40 = 60 dollars, and a producer surplus of 40 - 10 = 30 dollars, for a total gain from trade of 60 + 30 = 90 dollars. But before she can do this she has to telephone Alice and Carol and find out the secret numbers, $100 and $10. Only the market price of $40 is overt.

Alice thinks Eve is spying for Carol. If Carol learns that Alice is willing to pay $100, she will up the price to $80. So Alice bullshits Eve: Yeah, I'm regretting my purchase. I rushed to buy a widget, but what's it worth really? $35. I've overpaid.

Carol thinks Eve is spying for Alice. If Alice learns that they only cost $10 to make, then she will bargain Carol down to $20. Carol bullshits Eve: Currently they cost me $45 to make, but if I can grow volumes I'll get a bulk discount on raw materials and I hope to be making them for $35 and be in profit by 2016.

Eve realises that she isn't going to be able to get the numbers she needs, so she values the trade at its market price and declares GDP to be $40. It is what economists do. It is the desperate expedient to which the opacity of business has reduced them.

Now for the twist in the tale. Carol presses the button on her widget making machine, which catches fire and is destroyed. Carol gives up widget making. Alice buys from Bob for $100. Neither is happy with the deal; the total of consumer surplus and producer surplus is zero. Alice is thinking that she would have been happier spending her $100 eating out. Bob is thinking that he would have had a nicer time earning his $100 waiting tables for 10 hours.

Eve revises her GDP estimate. She has committed herself to market prices, so it is up 150% at $100. Err, that is not what is supposed to happen. Vital machinery is lost in a fire, prices soar and goods are produced by tedious manual labour. The economy has gone to shit, producing no surplus instead of producing a $90 surplus. But Eve's figures make this look good.
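The arithmetic of the tale can be laid out as a small sketch. The `Trade` class and its field names are made up for illustration; the dollar figures are the ones from the story, including the two "secret numbers" that Eve never gets to see.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    """One sale: the overt price plus the two hidden numbers."""
    price: float         # market price, the only number Eve can observe
    buyer_value: float   # the most the buyer would have paid (hidden)
    seller_cost: float   # the least the seller would have accepted (hidden)

    @property
    def consumer_surplus(self):
        return self.buyer_value - self.price

    @property
    def producer_surplus(self):
        return self.price - self.seller_cost

    @property
    def total_surplus(self):
        return self.consumer_surplus + self.producer_surplus


def percent_change(old, new):
    return (new - old) / old * 100


carol_sale = Trade(price=40, buyer_value=100, seller_cost=10)   # before the fire
bob_sale = Trade(price=100, buyer_value=100, seller_cost=100)   # after the fire

print("measured by price:  ", carol_sale.price, "->", bob_sale.price)
print("measured by surplus:", carol_sale.total_surplus, "->", bob_sale.total_surplus)
print("change in price-based figure (%):", percent_change(carol_sale.price, bob_sale.price))
```

The price-based figure goes from 40 to 100 while the total surplus goes from 90 to 0, so the two measures move in opposite directions across the fire.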

I agree that there is a problem with the consumer surplus. It is too hard to discover. But the market price is actually irrelevant. Going with the number you can get, even though it doesn't relate to what you want to know, is another kind of fake, in some ways worse.

Disclaimer: I'm not an economist. Corrections welcomed.

Bob is thinking that he would have had a nicer time earning his $100 waiting tables for 10 hours.

If that job's available, why doesn't he do it instead? If it's not, what's the point of focusing on his wishing - he might as well wish he were a millionaire.

The missing detail in your story is what Bob did to earn money while Carol's machine was working. If he was doing something better than hand-making widgets, he wouldn't go back to widgetry unless he could sell at a higher price. And if he was doing something less good than making widgets, he's happy that Carol's machine burned down.

Another point is that if Carol's machine can make widgets more cheaply than Bob, then it might make more of them, satisfying more market demand. This should cause GDP to rise since it multiplies items sold by price. How common is the case of very inelastic demand (if that's the right term)?

These points probably shouldn't change your conclusion that GDP is often a bad measure.

Disclaimer: I'm even less of an economist than you are.

You'll need to fix the start of your post (emphasis mine):

Alice loves widgets. She would pay $100 for a widget. She goes on line and finds Bob offering widgets for sale for $100. Err, that is not really what she had in mind.

That's exactly what she had in mind since she would pay $100. I think it's better to change it to "Alice would pay $95 for a widget".

A couple of other points. Eve's life is very slightly easier since prices of widgets change, which means she can estimate part of the demand curve and then estimate the consumer surplus from there. Also, GDP and what it measures have nothing to do with the consumer surplus.

Perhaps she'd be willing to pay $100 for a widget if there were no other option, but would nonetheless prefer to pay less if that can be arranged.

Transaction and information costs are a huge problem. People spend a lot of time paying them in actual life — for instance, driving to work or to the store; standing in lines; searching for bargains; clipping coupons; and so on.

Alice is willing to incur an email round-trip time to distinguish a world where $100 widgets are the only offer from a world in which a $90 widget is also available. She considers that the delay of one round-trip of haggling is worth $10 times the probability of a lower offer existing.

(Other factors obtain, too, like minimizing regret: if she bought a widget for $100 and then immediately saw Bob sell one to Faye for $90, she'd feel like $10 worth of fool.)

Assuming you broadly subscribe to the notion of "true numbers" and "fake numbers", how do you classify the following?

Food calories

The position of an object's centre of mass

The equilibrium price

A population's carrying capacity

The population mean

Some of your fake numbers fall out of the common practice of shoehorning a partial order into the number line. Suppose you have some quality Foo relative to which you can compare, in a somewhat well-defined manner in at least some cases. For example, Foo = desirable: X is definitely more desirable to me than Y, or Foo = complex: software project A is definitely more complex than software project B. It's not necessarily the case that any X and Y are comparable. It's then tempting to invent a numerical notion of Foo-ness, and assign numerical values of Foo-ness to all things in such a way that your intuitive Foo-wise comparisons hold. The values turn out to be essentially arbitrary on their own; only their relative order is important.

(In mathematical terms, you have a finite (in practice) partially ordered set which you can always order-embed into the integers; if the set is dynamically growing, it's even more convenient to order-embed it into the reals so you can always find intermediate values between existing ones).

After this process, you end up with a linear order, so any X and Y are always comparable. It's easy to forget that this may not have been the case in your intuition when you started out, because these new comparisons do not contradict any older comparisons that you held. If you had no firm opinion on comparative utility of eating ice-cream versus solving a crossword, it doesn't seem a huge travesty that both activities now get a specific utility value and one of them outranks the other.
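The embedding step can be sketched with a topological sort, which produces one linear extension of a finite partial order. The items and comparisons below are hypothetical, and `fooness` is just an illustrative name; ice cream and crossword are deliberately left incomparable in the input.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical intuitive comparisons: each key is definitely "more Foo"
# than every item in its set. Nothing relates ice cream to crossword.
more_foo_than = {
    "holiday": {"ice cream", "crossword"},
    "ice cream": {"queueing"},
    "crossword": {"queueing"},
}

# TopologicalSorter reads each set as the key's predecessors, so the
# less-Foo items come out first and the ranks increase with Foo-ness.
order = list(TopologicalSorter(more_foo_than).static_order())
fooness = {item: rank for rank, item in enumerate(order)}

print(fooness)

# Every original comparison is respected by the invented numbers...
for higher, lowers in more_foo_than.items():
    for lower in lowers:
        assert fooness[higher] > fooness[lower]

# ...but the two incomparable items now have distinct ranks as well,
# and which of them ends up on top is an accident of the sort.
assert fooness["ice cream"] != fooness["crossword"]
```

The numbers carry no information beyond the comparisons that went in, yet they present a verdict on ice cream versus crossword that was never part of the input.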

The advantages of this process are that Foo-ness is now a specific thing that can be estimated, used in calculations, reasoned about more flexibly, etc. The disadvantages, as you describe, are that the numbers are "fake", they're really just psychologically convenient markers in a linear order; and the enforced linearity may mislead you into oversimplifying the phenomenon and failing to investigate why the real partial order is the way it is.

Nitpick: utility is not just an ordering, it also has affine structure (relative intervals are preserved) because of preferences over lotteries. Software complexity is a valid example of your point, though. It's like trying to measure the "largeness" of a thing without specifying whether we mean weight, volume, surface area, or something else.
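A small sketch of what the affine structure buys, assuming expected-utility preferences over lotteries. The outcomes and utility values are made up. A positive affine rescaling leaves the choice between a sure thing and a gamble unchanged, while a transformation that merely preserves the order of outcomes can flip it.

```python
def expected_utility(lottery, utility):
    """lottery is a list of (probability, outcome) pairs."""
    return sum(p * utility[outcome] for p, outcome in lottery)


# Hypothetical utilities for three outcomes.
utility = {"bad": 0.0, "ok": 6.0, "great": 10.0}

sure_thing = [(1.0, "ok")]
gamble = [(0.5, "bad"), (0.5, "great")]


def prefers_sure_thing(u):
    return expected_utility(sure_thing, u) > expected_utility(gamble, u)


# Positive affine transformation: u -> 3u + 7.
affine = {k: 3 * v + 7 for k, v in utility.items()}

# Order-preserving but non-affine transformation: u -> u**3.
monotone = {k: v ** 3 for k, v in utility.items()}

print(prefers_sure_thing(utility))   # 6 vs 5
print(prefers_sure_thing(affine))    # 25 vs 22
print(prefers_sure_thing(monotone))  # 216 vs 500
```

All three assignments rank the outcomes bad < ok < great, but only the first two agree about the lottery, so preferences over lotteries pin utility down more tightly than a bare ordering would.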

Another example of a fake number is "complexity" or "maintainability" in software engineering.

Yet another is "productivity". In fact, most of software engineering consists of discussions of fake numbers. :/ This article (pdf) discusses that rather nicely.

Another example of a fake number is "complexity" or "maintainability" in software engineering. Sure, people have proposed different methods of measuring it. But if they were measuring a true number, I'd expect them to agree to the 3rd decimal place, which they don't

Why is such precision required for something to count as a 'measurable quantity'? Depending on how you do the measurements, measurements of (e.g.) prices don't always agree to two decimal places, let alone three.

But we should probably try harder to find measurable components of "intelligence", "rationality", "productivity" and other such things, because we'd be better at improving them if we had true numbers in our hands.

Sure, though IQ is already a triumph of psychometrics, and Stanovich is working on the first RQ (rationality quotient) test.

Let me throw in what might be a useful term: "unobservable".

Take, for example, the standard deviation of a time series. We can certainly make estimates of it, but the actual volatility is not directly observable; we can only see its effects. A large chunk of statistics is, in fact, dedicated to making estimates of unobservable quantities and figuring out whether these estimates are any good.
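As a rough illustration, the sketch below simulates a process whose standard deviation is known only because we chose it (`TRUE_SIGMA` is a made-up value), then estimates it from several independent samples. The Gaussian model and the sample size are assumptions of the sketch, not claims about any real series.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_SIGMA = 2.0         # hypothetical; in real data this is never seen directly


def estimate_sigma(n):
    """Draw n observations and return the sample standard deviation."""
    sample = [rng.gauss(0.0, TRUE_SIGMA) for _ in range(n)]
    return statistics.stdev(sample)


# Five separate attempts to estimate the same unobservable quantity.
estimates = [estimate_sigma(50) for _ in range(5)]

for e in estimates:
    print(round(e, 3))
```

Each run of 50 observations gives a somewhat different estimate, and none of them is the parameter itself; the statistical work lies in saying how far off they are likely to be.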

Another useful term is "well-defined". For example, look at inflation. Inflation in general (defined as "change in prices", more or less) is not well-defined, and different people can (and do) propose various ways to quantify it. But if you take one specific measure, say a particular CPI in the US, and define it as the number that comes out of a specific procedure that the BLS performs every month, then it becomes well-defined.
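A sketch of how "change in prices" turns into different numbers depending on the procedure. It uses two textbook index formulas, Laspeyres (base-period quantities) and Paasche (current-period quantities), on a made-up two-good economy. This is not the BLS procedure, and all prices and quantities are hypothetical.

```python
def laspeyres(p0, p1, q0):
    """Cost of the base-period basket at new prices over its cost at old prices."""
    return sum(p1[g] * q0[g] for g in q0) / sum(p0[g] * q0[g] for g in q0)


def paasche(p0, p1, q1):
    """Cost of the current-period basket at new prices over its cost at old prices."""
    return sum(p1[g] * q1[g] for g in q1) / sum(p0[g] * q1[g] for g in q1)


# Hypothetical economy: fuel doubles in price and people buy half as much of it.
p0 = {"bread": 2.0, "fuel": 1.0}   # old prices
p1 = {"bread": 2.0, "fuel": 2.0}   # new prices
q0 = {"bread": 10, "fuel": 20}     # old quantities
q1 = {"bread": 10, "fuel": 10}     # new quantities

print("Laspeyres inflation (%):", (laspeyres(p0, p1, q0) - 1) * 100)
print("Paasche inflation (%):  ", (paasche(p0, p1, q1) - 1) * 100)
```

On the same data the first procedure reports 50% inflation and the second about 33%. Each number is well-defined once its procedure is fixed, but "inflation" without a procedure is not.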

Just to nitpick, the standard deviation of a time series is not even well-defined unless we know that the series is stationary. In Shalizi's words, "if you want someone to solve the problem of induction, the philosophy department is down the stairs and to the left". If it were well-defined (e.g. if the time series were coming from some physical process with rigidly specified parameters), it would be just as observable as the mass of the moon, i.e. indirectly. That would fit my criteria for a "true number".

I guess that for me a "true number" has to be a well-defined number that you can measure in multiple ways and get the same result, so inflation is out because it's not well-defined, and CPI is out because it's just one method of measurement that doesn't agree with anything else.