The Importance of Goodhart's Law

This article introduces Goodhart's law, gives a few examples, sketches a speculative origin for the law, and lists a few general mitigations.

Goodhart's law states that once a social or economic measure is turned into a target for policy, it loses the information content that qualified it to play such a role in the first place (Wikipedia). The law is named after its originator, Charles Goodhart, a former chief economic advisor to the Bank of England.

The much more famous Lucas critique is a more specific formulation of the same idea.

The most famous examples of Goodhart's law are probably the Soviet factories which, when given targets based on the number of nails, produced many tiny useless nails, and when given targets based on weight, produced a few giant ones. Both number and weight had correlated well with useful output before central planning. Once each was made a target (at different times), it lost that value.

We laugh at such ridiculous stories because our societies are generally much better run than Soviet Russia. But the key point about Goodhart's law is that it applies at every level. The Japanese countryside is apparently full of construction projects that continue because projects started in the recession era are almost impossible to stop. Our society centres around money, which is supposed to be a relatively good measure of reified human effort. But many unscrupulous institutions have got rich by pursuing money in ways that people would find extremely difficult to describe as value-adding.

GDP Fetishism by David Henderson is another good recent article on how Goodhart's law affects societies.

The way I look at Goodhart's law is as Guess the teacher's password writ large. People and institutions try to achieve their explicitly stated targets in the easiest way possible, often obeying only the letter of the law.

A speculative origin of Goodhart's law

The way I see Goodhart's law work, and a target's utility break down, is the following (a toy simulation follows the list).

  • Superiors want an undefined goal G.
  • They formulate G*, which is not G but which has, in usual practice until now, correlated with G.
  • Subordinates are given the target G*.
  • The well-intentioned subordinate may recognise G and suggest G** as a substitute, but such people are few and far between. Most people simply try to achieve G*.
  • As time goes on, every means of achieving G* is sought. 
  • Remember that G* was formulated precisely because it is simpler and more explicit than G. Hence, the people, processes and organizations that aim at maximising G* gain a competitive advantage over those trying to juggle both G* and G.
  • P(G|G*) falls with time, and after a point the correlation breaks down completely.
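
To see the drift mechanically, here is a toy simulation. It rests entirely on my own illustrative assumptions, not on anything in the post: agents split a unit effort budget between real work, which advances G, and gaming, which is assumed to buy three units of the proxy G* per unit of effort. Selecting each generation on G* drives G toward zero:

```python
import random

# Toy model (illustrative assumptions, not from the post): each agent
# splits a unit budget between "real" effort, which advances the true
# goal G, and "gaming" effort, which only inflates the proxy G*.
# Gaming is assumed to buy three units of G* per unit of effort, so
# once selection rewards G*, gaming outcompetes real work.

POP, GENS = 200, 30

def g(agent):       # the true goal counts only real work
    real, gaming = agent
    return real

def g_star(agent):  # the proxy also rewards gaming, and more cheaply
    real, gaming = agent
    return real + 3 * gaming

def mutate(agent):
    real, _ = agent
    real = min(max(real + random.uniform(-0.1, 0.1), 0.0), 1.0)
    return (real, 1.0 - real)  # fixed effort budget

population = [(1.0, 0.0)] * POP  # start honest: all effort is real
for gen in range(GENS):
    # keep the half of the population that scores best on the proxy G*
    population.sort(key=g_star, reverse=True)
    survivors = population[: POP // 2]
    population = survivors + [mutate(random.choice(survivors))
                              for _ in range(POP // 2)]
    mean_g = sum(g(a) for a in population) / POP
    mean_gs = sum(g_star(a) for a in population) / POP
    print(f"gen {gen:2d}: mean G = {mean_g:.2f}, mean G* = {mean_gs:.2f}")
```

Run it and mean G* roughly triples while mean G collapses - the P(G|G*) breakdown in miniature.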

Mitigations for Goodhart's law

If you consider the law to be true, full solutions to it are impossible in a non-singleton scenario. So let's consider mitigations:

  • Hansonian Cynicism
  • Better measures
  • Solutions centred around human discretion

Hansonian Cynicism

Pointing out what most people would have in mind as G, and showing that institutions all around are not pursuing G but their own convoluted G*s. Hansonian cynicism is in many cases the second step towards mitigation (knowing about Goodhart's law is the first). Most people expect universities to be about education and hospitals to be about health. Pointing out that they aren't doing what they are supposed to be doing creates a huge cognitive dissonance in the thinking person.

Better measures

Balanced scorecards

Taking multiple factors into consideration, trying to make G* as robust and spoof-proof as possible. The scorecard approach is, mathematically, the simplest solution that comes to mind when confronted with Goodhart's law.
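
As a minimal sketch of a scorecard (the metric names and weights below are my own illustrative assumptions, not from any source), combining several imperfect proxies means gaming any single one no longer pays:

```python
# A minimal balanced-scorecard sketch; metric names and weights are
# illustrative assumptions. Combining several imperfect proxies makes
# the composite G* harder to spoof than any single metric.

WEIGHTS = {
    "units_produced": 0.4,    # raw output: the classic gameable metric
    "defect_free_rate": 0.3,  # quality: punishes tiny-useless-nail strategies
    "customer_rating": 0.3,   # external check that the output is wanted
}

def scorecard(measurements: dict) -> float:
    """Weighted sum of metrics, each normalized to [0, 1]."""
    return sum(w * measurements[name] for name, w in WEIGHTS.items())

honest = {"units_produced": 0.7, "defect_free_rate": 0.9, "customer_rating": 0.8}
gamer = {"units_produced": 1.0, "defect_free_rate": 0.2, "customer_rating": 0.3}
print(f"{scorecard(honest):.2f}")  # 0.79
print(f"{scorecard(gamer):.2f}")   # 0.55: maxing one metric doesn't win
```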

Optimization around the constraint

There are no generic solutions for bridging the gap between G and G*, but the body of knowledge around the theory of constraints is a very good starting point for formulating better measures for corporations.

Extrapolated Volition

CEV tries to mitigate Goodhart's law better than mechanical measures do, by trying to build a complete map of human morality. If G is defined fully, there is no need for a G*. CEV attempts this for all humanity, but as an example, individual extrapolated volition should be enough. The attempt is incomplete as of now, but it is promising.

Solutions centred around human discretion

Human discretion is the one thing that can presently beat Goodhart's law, because of the constant checking and rechecking that G and G* match. Nobody could pull off anything as weird as the giant nails in such a scenario. However, this is not scalable in a strict sense, because of the added testing and quality control requirements.

Left Anarchist ideas

Left anarchist ideas about small firms and workgroups are based on the premise that hierarchy inevitably introduces Goodhart's law problems, and thus that the best groups are small ones doing simple things.

Hierarchical rule

On the other end of the political spectrum, Moldbuggian hierarchical rule completely eliminates the mechanical aspects of the law. There is no letter of the law; it's all spirit. I am supposed to take total care of my slaves and give total obedience to my master. Scalability is ensured through hierarchy.

 

Of all the proposed solutions to the Goodhart's law problem, I like CEV the most, but that is probably more a reflection on me than anything else: I want a relatively scalable and automated solution. I'm not sure whether the advocates of human discretion are right on this matter.

Your comments are welcome, as are other mitigations and solutions to Goodhart's law.

Comments


There are no generic solutions to bridging the gap between G and G*, but the body of knowledge of theory of constraints is a very good starting point for formulating better measures for corporates.

A good example from my own history of doing this is when I worked for an ISP and persuaded them to eliminate "cases closed" as a performance measurement for customer service and tech support people, because it was causing email-based cases to be closed without any actual investigation. People would email back and create a new case, and then a rep would get credit for closing that one without investigation either.

The replacement metric was one I derived via the Theory of Constraints, inspired by Goldratt's "throughput-dollar-days" measurement: "customer-satisfaction-waiting-hours" - a measurement of collective work-in-progress inventory at the team level, and of priority at the ticket level.

I also made it impossible to truly "close" a case - you could say, "I think this is done", but the customer could still email into it and it would jump right back to its old place in the queue, due to the accumulated "satisfaction waiting hours" on the ticket.
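
A rough sketch of how such a metric might be computed (my reconstruction from the description above; the ticket fields and dates are assumptions, not the actual ISP system):

```python
from datetime import datetime

# Sketch of a "satisfaction-waiting-hours" style metric, reconstructed
# from the description above; ticket fields and dates are assumptions.
# Open tickets accumulate waiting-hours: the team total acts as a
# work-in-progress measure, and a ticket's own total sets its priority,
# which is why a reopened case jumps back to its old place in the queue.

def waiting_hours(opened_at: datetime, now: datetime) -> float:
    return (now - opened_at).total_seconds() / 3600.0

open_tickets = [
    {"id": 101, "opened_at": datetime(2010, 3, 1, 9, 0)},
    {"id": 102, "opened_at": datetime(2010, 3, 3, 14, 0)},
]
now = datetime(2010, 3, 5, 9, 0)

team_wip = sum(waiting_hours(t["opened_at"], now) for t in open_tickets)
queue = sorted(open_tickets,
               key=lambda t: waiting_hours(t["opened_at"], now),
               reverse=True)
print(f"team WIP: {team_wip:.0f} waiting-hours; "
      f"highest priority: ticket #{queue[0]['id']}")
```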

Of course, the toughest part in some ways was educating new service managers that, no, you can't have a measurement of cases closed on a per-rep basis. Instead, you're going to have to actually pay attention to a rep's work in order to know if they're doing the job. (Of course, the system I developed also had ways to make it easy to see what people are working on, not only at the managerial but the team level - peer pressure is a useful co-ordination tool, if done right.)

I have no idea how well the system fared since I left the company, since it's entirely possible they found programmers since then to give them new metrics that would f**k it up, although I did design the database in such a way as to make it as close to impossible as I could manage. ;-)

Anyway, the theory of constraints positively rocks for business performance optimization, and its Thinking Processes are generally useful tools for any rationalist. They were also a big inspiration for me developing other thinking processes and ultimately mindhacking techniques, in that they showed that it's possible to think systematically even about some of the vaguest and most ill-defined problems imaginable, rigorously hone in on key leverage points, resolve conflicts between goals, and generally overcome our brains' processing limitations for analysis and planning.

[Edit to add: the Wikipedia page on thinking processes doesn't really show why a rationalist would be interested in the processes; it's useful to know that a key element of the processes is something called the "categories of legitimate reservation", which have to do with logical proof and well-formedness of argument. They are a key part of constructing and critiquing the semantic maps that are created by the thinking processes.

For example, ToC's conflict resolution method effectively maps out certain implicit assumptions in a conflict, and then invites you to logically disprove these assumptions in order to break the conflict. (That is, if you can find a circumstance where one of those assumptions is false, then the conflict will no longer exist under that circumstance - and you have a potential way out of your dilemma.)

So, in short, ToC thinking processes are mostly about constructing past, present, or future semantic maps of a situation, and applying systematic logic to validating (or invalidating) the maps' well-formedness, as a way of solving problems, creating plans, etc. Very core rationalist stuff, from an instrumental-rationality POV.]

I am reminded of one of Dijkstra's sayings:

To this very day we have organizations that measure "programmer productivity" by the "number of lines of code produced per month"; this number can, indeed, be counted, but they are booking it on the wrong side of the ledger, for we should talk about "the number of lines of code spent".

So, in short: incentives can have unintended consequences, because they change the behaviour of whatever you want to influence with them.

There are a lot of examples of this in e.g. Dan Ariely's book and Freakonomics.

But the best example must be the bizarre 1994 football (soccer) match between Barbados and Grenada. Barbados needed to win by a two-goal margin.

The special incentive here was that any goal scored in the extra time would count double. Now, shortly before the end of the regular time, it was 2-1 for Barbados. Imagine what happened...

(edit: added the note about the two-goal difference, thanks Hook)

An important note for the soccer game: Barbados needed to win by two goals in order to advance to the finals; otherwise, Grenada would advance. Now people have a chance of imagining what happened.

Goodhart's Law starts in a different way. It's not quite right to say:

Superiors want an undefined goal G.

Mathematically speaking, the problem can't be that G is undefined. If G were really undefined in any absolute sense, then superiors would be indifferent to all possible outcomes, or would choose their utility function literally at random. That rarely happens.

Instead, the problem could be that G is difficult to articulate. It is "undefined" only in the sense that people have had trouble coming up with an explicit verbal definition for it. I know what I want and how to get it, but I don't know how to communicate that want to you ex ante. For example, maybe I want you (the night shift manager) to page me (the owner) whenever there's a decision to make that could affect whether our business keeps a client, but I've never taken any business classes and don't quite have the vocabulary to say that, so instead I say to only page me if it's "important." "Important" is vague, but "important" is just a map, and the map is not the territory.

Alternatively, the problem could be that G is difficult to commit to. I can define my goal in words just fine today, but I know (or you suspect) that later I will be tempted to evaluate you by some other criterion. For example, I would like to give a raise to whichever police officer does the most to keep his beat safe, and, as a thoughtful and experienced police chief, I know exactly what the difference is between a safe neighborhood and an unsafe neighborhood, and I'm happy to explain it to anyone who's interested. As one of my employees, though, you can't verify that I'm actually rewarding people for making neighborhoods safe, and not, say, giving raises to people who bring in the most money for drug busts, or who artificially lower their crime statistics, or who give me a kickback. It might make more sense for me to just announce that I'll pay people based on hours worked and complaints lodged, because that announcement is more verifiable, and thus more credible, so at least I'll be viewed as evenhanded.

Finally, as you've already pointed out, the problem could be that G is difficult or expensive to measure. Alternative measures of GDP that take into account factors like health, leisure, and environmental quality have gotten pretty good about specifying what health is, and it's easy enough to pass laws that commit agencies to valuing health in a particular way, but it's expensive to measure health, especially in any broad sense. A physical is $60; an exercise fitness exam is another $45; an STD test runs about $20; a battery of prophylactic tests for cancer and heart disease and so on is another $100 or so; a mental health exam is another $80, and then you multiply all that by the size of a valid random sample and we're talking real money. In my opinion, it would be money very, very well spent, but one can understand why GDP - which can be measured just by asking the IRS for a copy of its tax receipts - is such a popular metric. It's cheap to use.
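
For concreteness, the arithmetic on those figures (the sample size is my hypothetical round number, not from the comment):

```python
# Summing the per-person costs quoted above; the sample size is a
# hypothetical round number, not from the comment.
costs = {"physical": 60, "fitness_exam": 45, "std_test": 20,
         "prophylactic_battery": 100, "mental_health": 80}
per_person = sum(costs.values())             # $305 per person
sample_size = 10_000                         # hypothetical valid random sample
print(per_person, per_person * sample_size)  # 305 3050000: ~$3M per survey
```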

CEV, until it is designed and defined properly, is just a black box that everyone universally agrees is 'good' but that has little else in terms of defining features.

The fact that students who are motivated to get good scores in exams very often get better scores than students who are genuinely interested in the subject is probably also an application of Goodhart's Law?

Partially; but a lot of what is being tested is actually skills correlated with being good at exams - working hard, memorisation, bending yourself to the rules, ability to learn skill sets even if you don't love them, gaming the system - rather than interest in the subject.

But those skills don't correlate nearly so well with doing good science, or with good use of the subject of the exams in general, and they are easy to test in other ways.

Goodhart's law seems very applicable to natural selection: the Blind Idiot God wants creatures to have higher fitness (G), and so creates targets that are correlated with fitness in the ancestral habitat (e.g., pleasure-seeking and pain-avoidance (G*)). Once you get creatures that are self-aware (us), they figure out G*, and start optimizing for that instead of G.

Relevant.

In software development, this is (or ought to be) known as the Mini-Van Law.

It made me think of the Tree Swing, insofar as it represents how difficult it can be to create and follow a good G* through the process.

Getting back to trying to propose practical mitigation strategies for Goodhart's law, I propose a fairly simple one: choose a G*, evaluate performance based on it, but KEEP IT SECRET. This of course wouldn't really work for national-scale, GDP-esque situations, but for corporate management it seems like it could work well enough. If only upper management knows what G* is, it becomes impossible to optimize for it, and everyone has to just keep working under the assumption that they're being evaluated on G.

Taking it a step further, to hedge against employees eventually figuring out G* and surreptitiously optimizing for it, you could have a bounty on guessing G*: the first employee who figures out what the mystery metric G* really is gets a prize, and as soon as it's claimed, you switch to using G**.

The hedge is absolutely necessary; otherwise, a manager would just tell subordinates what G* is in order to look impressive for managing a high-performing group.

Andrew Grove (of Intel fame) wrote a book, High Output Management, suggesting that management needs two opposing metrics to avoid this problem. For example, measure productivity and number of defects, and score people on the combined results.
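
A minimal sketch of such paired, opposing metrics (the combining rule and the numbers are my assumptions, not Grove's exact method): scoring by the minimum of the two normalized scores means neither metric can be gamed at the other's expense.

```python
# Paired-indicator sketch in the spirit of the suggestion above. Taking
# the minimum of two normalized scores (my choice of combining rule)
# means raising output at the cost of quality doesn't pay.

def normalize(value: float, worst: float, best: float) -> float:
    return max(0.0, min(1.0, (value - worst) / (best - worst)))

def combined_score(units_per_day: float, defects_per_100: float) -> float:
    productivity = normalize(units_per_day, worst=0, best=50)
    quality = normalize(defects_per_100, worst=20, best=0)  # fewer is better
    return min(productivity, quality)

print(combined_score(units_per_day=45, defects_per_100=2))   # 0.9: fast and clean
print(combined_score(units_per_day=50, defects_per_100=15))  # 0.25: speed gamed quality
```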

BTW the large nail/little nail joke has a third part. Soviet management eventually got a clue and started measuring by the value of the nails produced... and the result was the world's first solid-gold-nail factory.

Pretty much every trick in organization design or management can be thought of as a partial solution to this problem. Listing "anarchy" or "absolute authority" explicitly on a short list of solutions is therefore a bit misleading.

At work a large part of my job involves choosing G*, and I can report that Goodhart's Law is very powerful and readily observable.
Further: rational players in the workplace know full well that management desires G, and that G* is not well-correlated with G, but nonetheless if they are rewarded on G*, then that's what they will focus on.

The best solution - in my experience - is mentioned in the post: the balanced scorecard. Define several measures G1, G2, G3 and G4 that are normally correlated with G. The correlation is then more persistent: if all four measures improve, it is likely that G will improve.

G1, G2, G3 and G4 may be presented as simultaneous measures, or, if setting four measures in one go is too confusing for people trying to prioritise (the fewer the measures, the more powerful), they can be sequential. I.e., if you hope to improve G over two years, measure G1 for two quarters, then switch the measurement to G2 for the next two, and so on (obviously you don't tell people in advance). NB: this approach can be effective, but it will make you very unpopular.
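
A sketch of that rotation (the two-quarter cadence and the G1..G4 labels are illustrative assumptions):

```python
# Sequential-measure rotation as described above; two quarters per
# measure, schedule not disclosed to the people being measured.
measures = ["G1", "G2", "G3", "G4"]

def active_measure(quarter: int) -> str:
    # quarter is 0-based; each measure is used for two quarters
    return measures[(quarter // 2) % len(measures)]

for q in range(8):  # a two-year horizon
    print(f"quarter {q + 1}: evaluate on {active_measure(q)}")
```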