The Cartoon Guide to Löb's Theorem

Lo!  A cartoon proof of Löb's Theorem!

Löb's Theorem shows that a mathematical system cannot assert its own soundness without becoming inconsistent.  Marcello and I wanted to be able to see the truth of Löb's Theorem at a glance, so we doodled it out in the form of a cartoon.  (An inability to trust assertions made by a proof system isomorphic to yourself, may be an issue for self-modifying AIs.)

It was while learning mathematical logic that I first learned to rigorously distinguish between X, the truth of X, the quotation of X, a proof of X, and a proof that X's quotation was provable.

The cartoon guide follows as an embedded Scribd document after the jump, or you can download from yudkowsky.net as a PDF file.  Afterward I offer a medium-hard puzzle to test your skill at drawing logical distinctions.

Cartoon Guide to Löb's Theorem - Upload a Document to Scribd

Read this document on Scribd: Cartoon Guide to Löb's Theorem

And now for your medium-hard puzzle:

The Deduction Theorem (look it up) states that whenever assuming a hypothesis H enables us to prove a formula F in classical logic, then (H->F) is a theorem in classical logic.

Let ◻Z stand for the proposition "Z is provable".  Löb's Theorem shows that, whenever we have ((◻C)->C), we can prove C.

Applying the Deduction Theorem to Löb's Theorem gives us, for all C:

((◻C)->C)->C

However, those familiar with the logic of material implication will realize that:

(X->Y)->Y
  implies
(not X)->Y

Applied to the above, this yields (not ◻C)->C.

That is, all statements which lack proofs are true.

I cannot prove that 2 = 1.

Therefore 2 = 1.

Can you exactly pinpoint the flaw?

Comments

sorted by
magical algorithm
Highlighting new comments since Today at 3:54 PM
Select new highlight date
All comments loaded

The flaw is the instant you used the deduction theorem. You used it to go from

PA |- "◻C -> C" -> PA |- "C"

to

PA |- "(◻C->C) -> C"

What the induction theorem really says is

T + X |- Y implies T |- "X -> Y"

so what you really would have needed to apply the deduction theorm would have been

PA + "◻C -> C" |- C

do I win?

I went to the carnival and I met a fortune-teller. Everything she says comes true. Not only that, she told me that everything she says always comes true.

I said, "George Washington is my mother" and she said it wasn't true.

I said, "Well, if George Washington was my mother would you tell me so?" and she refused to say she would. She said she won't talk about what would be true if George Washingto was my mother, because George Washington is not my mother.

She says that everything she says comes true. She looked outside her little tent, and saw the sky was clear. She said, "If I tell you the sky is clear, then the sky is clear."

"But what if it was raining right now? If you told me it was raining, would it be raining?"

"I won't say what I'd tell you if it was raining right here right now, because it isn't raining."

"But if the sky is clear right now you'll tell me the sky is clear."

"Yes. Because it's true."

"So you'll only tell me that (if you say the sky is clear then the sky really is clear), when the sky really is clear?"

"Yes. I only tell you about true things. I don't tell you what I'd do if the world was some other way."

"Oh."

You can tell that J_Thomas2's "if" (which is the counterfactual "if") is not the "if" of material implication (which is what appears in Loeb's theorem) from the grammar: "if George Washington was my mother" rather than "if George Washington is my mother".

Eliezer: "Larry, interpret the smiley face as saying:

PA + (◻C -> C) |-"

OK ignoring the fact that this is exactly what it doesn't say, I suppose we could notice that every step is surrounded at the top by a happy-face-says, so, if we may be inclined to think we can take those off and get a proof inside PA, starting with "◻C -> C" as a hypothesis. Lets see what happens.

  1. ◻L <-> ◻(◻L -> C)
  2. ◻C -> C

ok these are our hypothesis.

  1. ◻(◻L->C) -> (◻◻L -> ◻C)

Modus Ponens works in PA, fine

  1. ◻L -> (◻◻L -> ◻C)

ordinary MP

  1. ◻L -> ◻◻L

if PA can prove it, then PA can prove it can prove it

  1. ◻L -> ◻C

ordinary MP

  1. ◻L -> C

ordinary MP

  1. ◻(◻L -> C)

ARGH WHAT ARE YOU DOING. THERE IS NO RULE OF FIRST ORDER LOGIC THAT ALLOWS YOU TO DO THIS. TO SAY "if i can prove X then i can prove i can prove X" STEPS OUTSIDE OF FIRST ORDER LOGIC YOU LOSE.

  1. ◻L

ordinary MP

10 C

ordinary MP

I tried summarizing it independently and came up with fewer steps and more narrative. Hope you don't me replying here or stealing your lovely boxes.

PA lets us distribute boxes across arrows and add boxes to most things. Lob's hypothesis says we can remove boxes (or at least do around statement C in the following).

  1. L is defined (in a difficult, confusing, self-referential way that this margin is to narrow to contain) such that
    ◻L is equivalent to ◻(◻L -> C)
  2. Distribute the box over the second form:
    ◻L -> (◻◻L -> ◻C)
  3. ◻L also means ◻◻L since we can add boxes, so by the inference "A->(B->C) + (A->B) -> (A->C)" we know
    ◻L -> ◻C
  4. All of that was standard PA. Here it becomes funny; Lob's hypothesis tells us that ◻C can become C, so
    ◻L -> C
  5. Adding a box to that gives ◻(◻L->C), which in 1. we said was the same as
    ◻L
  6. And in 4. we found out that ◻L gives C, so finally
    C

If you really want to see the truth of the theorem at a glance, I found a much simpler proof on page 7 of this writeup by Aaronson.

Eliezer's password is [correct answer deleted, congratulations Douglas --EY].

If L:ob's theorem is true, then surely we can use it without any reference to the details of its proof? Isn't the point of a proof to obtain a black box, so that we do not have to worry about all the various intermediate steps? And in this case, wouldn't it defeat the purpose of having a proof if, to explain how the proof was misapplied, we had to delve deep into the guts of the proof, rather than just pointing to the relevant features of its interface?

Doug S.:

What character is ◻?
That's u+25FB ('WHITE MEDIUM SQUARE').

Eliezer Yudkowsky:

Larry, interpret the smiley face as saying:

PA + (◻C -> C) |- I'm still struggling to completely understand this. Are you also changing the meaning of ◻ from 'derivable from PA' to 'derivable from PA + (◻C -> C)'? If so, are you additionally changing L to use provability in PA + (◻C -> C) instead of provability in PA?

Robert: "You can only say (((◻C)->C)->(◻C))"

No that's not right. The theorem says that if PA proves "◻C -> C" then PA proves C. so that's ◻(◻C -> C) -> ◻C.

The flaw is that the deduction theorem does not cross meta levels. Eliezer says "Löb's Theorem shows that, whenever we have ((◻C)->C), we can prove C." and goes on to claim that (◻C->C)->C. But he's intentionally failed to use quotes and mixed up the meta levels here. Lob's Theorem does not give us a proof in first order logic from ((◻C)->C) to C. It gives us a proof that if there is a proof of ((◻C)->C) then there is a proof of C. Which is an entirely diffirent thing altogether.

Psy-Kosh: There are two points of view of where the flaw is.

My point view is that the flaw is here:

"Löb's Theorem shows that, whenever we have ((◻C)->C), we can prove C. Applying the Deduction Theorem to Löb's Theorem gives us, for all C: ((◻C)->C)->C"

Because, in fact Lob's Theorem is: (PA |- "◻C -> C") => (PA |- "C") and the Deduction theorem says (PA + X |- Y) => (PA |- "X->Y"). We don't have PA + X proving anything for any X. The deduction theorem just doesn't apply. The flaw is that the informal prose just does not accurately reflect the actual math.

Eliezer's point of view (correct me if I'm wrong) is that in the cartoon, we have 10 steps all of the form "PA proves ...." They each follow logically from the ones that came before. So they look like a proof inside of PA. And if they were a proof inside of PA then the deduction theorem would apply, and his contradiction would go through. So the flaw is that while all of the steps are justified, one of them is only justified from outside of PA. And that one is step 8.

Both of these views are correct.

Brian Jaress:

I think if you used Lob's Theorem correctly, you'd have something like: if PA |- []C -> C then PA |- C [Lob] PA |- ([]C -> C) -> C [Deduction]

This is incorrect because the if-then is outside of PA. The deduction theorem does not apply.

Fascinating that you could present Lob's theorem as a cartoon, Eliezer.

One tiny nitpick: The support for statement 9 seems to be wrong. It reads (1, MP) but that doesn't follow. Perhaps you mean (8, 1, MP)

I was just working through the cartoon guide pursuant to this recent discussion post, and I found this minor mistake too.

I just tested and anecdotally confirmed a hypothesis made with very little data: I suspected that neither Douglas Knight nor Larry D'Anna, the two who pinpointed 8 as the critical step, would be among the objectors to my metaethics. (Either of them can torpedo this nascent theory by stating otherwise.)

Eliezer: "Why doesn't the given proof of Lob's Theorem, the steps 1-10, show that PA + "◻C -> C" |- C?"

That's just not what it says. The hypothesis in step 2 isn't "◻C -> C" it's "◻(◻C -> C)". I suppose if the Hypothesis were "◻C -> C" we could try to find where it breaks. Step 7 is the only step that depends on 2 so it has to be there. Translating the amusing cartoons, we have

  1. ◻(◻C -> C)
  2. ◻(◻L -> ◻C)
  3. ◻(◻L -> C)

Lets say that instead we have

2'. ◻C -> C

Well what are we supposed to do with it? We're stuck. Just because ◻C -> C doesn't mean that PA can prove it.

so what you really would have needed to apply the deduction theorm would have been PA + "◻C -> C" |- C do I win?

Why doesn't the given proof of Lob's Theorem, the steps 1-10, show that PA + "◻C -> C" |- C?

Löb proved the following: for any C, Provable(Provable(C)->C)->Provable(C).

So, we may derive from PA soundness, that for any C, Provable(Provable(C)->C)->C.

Nobody proved, as you stated, that for any C (Provable(C)->C)->C.

Sanity check: If I'm understanding Eleizer and D'Anna correctly, step 8 is valid in PA but not in classical (= first order?) logic. The deduction theorem only applies to proofs in classical logic, therefore this part:

"Applying the Deduction Theorem to Löb's Theorem gives us, for all C:

((◻C)->C)->C"

is incorrect; you can't use the deduction theorem on Lob's Theorem. Everything after that is invalidated, including 2=1.

Is that more or less the point here? I have a good head for math but I'm not formally trained, so I'm trying to be sure I actually understand this instead of just thinking I understand it.

Incidentally: Godel's Theorem inspired awe when I was first exposed to it, and now Lob's has done the same. I actually first read this post months ago, but it's only going back to it after chasing links from a future sequence that I think I finally get it.

Is "I cannot prove that 2 = 1." Supposed to mean that a you can prove that it is impossible to prove 2=1?

Here's the solution for your AI reflection problem.

In this SL4 post you equivocate ⊢ ◻P and ⊢ P . We are are allowed to believe P from a proof of P, just not from a proof of provability of P. Of course, whenever we have ⊢P, we'll have ◻P, so it's an understandable attribute substitution. Modifying your minimal rewrite criterion with this in mind, we might require that, for interpretations supporting derivability, S1 and a subsystem of S2 be mutually interpretable. Makes sense?

Posting before reading other comments...

The flaw is in the very last step: that there is no proof of 2=1. If PA is inconsistent then contradictions are provable in PA and then anything else is too, including 2=1. To prove that there is no proof of 2=1 is to prove that PA is consistent, which is impossible by Godel's theorem.

J Thomas:

Once more through the mill. If PA proves that 6 is a prime number, then 6 is really a prime number. Can you prove that if PA proved 6 was a prime number then 6 would be a prime number? How would you do it?

If PA |- "forall x y . x y = 6 => |x|=1 || |y|=1" then N |= "forall x y . x y = 6 => |x|=1 || |y|=1" (N = the natural numbers equiped with + and ) so for all x and y in N, N |= ",x ,y = 6 => |,x|=1 || |,y|=1" (where ,x means a constant symbol for x) if xy = 6 then N |= ",x ,y = 6" so therefore N |= "|,x|=1 || |,y|=1" thus either N |= "|,x| = 1" or N |= "|,y| = 1" thus either |x|=1 or |y|=1 therefore we have that if x*y = 6 then either |x| = 1 or |y| = 1 therefore 6 is prime therefore if PA |- "6 is prime" then 6 is actually prime

Of course it is also a meta-theorem that for any sentence phi in the language of PA that

ZF |- "PA |- phi => phi_omega"

where phi_omega is phi relativeized to the finite ordinals.

Larry D'Anna: Thanks, I think I understand the Deduction Theorem now.

Okay, I still don't see why we had to pinpoint the flaw in your proof by pointing to a step in someone else's valid proof.

Larry D'Anna identified the step you were looking for, but he did it by trying to transform the proof of Lob's Theorem into a different proof that said what you were pretending it said.

I think, properly speaking, the flaw is pinpointed by saying that you were misusing the theorems, not that the mean old theorem had a step that wouldn't convert into what you wanted it to be.

I've been looking more at the textual proof you linked (the cartoon was helpful, but it spread the proof out over more space) which is a little different and has an all-ascii notation that I'm going to borrow from.

I think if you used Lob's Theorem correctly, you'd have something like:

if PA |- []C -> C then PA |- C [Lob]

PA |- ([]C -> C) -> C [Deduction]

PA |- (not []C) -> C [Definition of implication]

(PA |- ((not []C) -> C) and (PA |- not []C) [New assumption]

PA |- ((not []C) -> C) and (not []C) [If you can derive two things, you can derive the conjunction]

PA |- C [Definition of implication]

And conclude that all provably unprovable statements are also provable, not that all unprovable statements are true.

(Disclaimer: I am definitely not a mathematician, and I skipped over your meta-ethics because I only like philosophy in small doses.)

Larry, a simple 8 would have sufficed. :) (That's what Douglas wrote.)

Psy-Kosh: No that isn't the problem. If there is a proof that 1=2, then 1=2. If there isn't, then 1=2. Either way 1=2. The problem is the mixing of meta-levels / inappropriate use of the deduction theorem / tacit assumption that ◻a -> ◻b is the same as "a -> b".

Sebastian Hagen: no ◻X is PA |- "X". My best guess as to what Eliezer meant by "interpret the smiley face as saying.." is that he meant to interpret the cartoonproof as a proof from the assumption "(◻C -> C)" to the conclusion "C", rather than a proof from "◻(◻C -> C)" to "◻C".

As near as I can make out, the flaw is the assumption that there isn't a proof of 2=1.

There's no proof of Peano arithmetic being consistent, right? (There's a proof of that lack of proof, of course. :))

Larry, interpret the smiley face as saying:

PA + (◻C -> C) |-

Then the steps 1-10 of Lob's proof would seem to take us from:

PA + (◻C -> C) |- (◻C -> C)

to

PA + (◻C -> C) |- C

If this were the case, the deduction theorem would in fact come into play and give us:

PA |- (◻C -> C) -> C

So there must be at least one step of the reasoning 1-10 that does not apply. Which one and why?

The problem is that while Lob's theorem says that whenever we have ((◻C)->C), C is provable, it does not mean that there is a proof of C from ((◻C)->C). This means that you're use of the deduction theorem is incorrect. You can only say (((◻C)->C)->(◻C)). It is clear that (X->Y)->X does not imply (not X)->X, so your little trick doesn't work.

Eliezer: "Any supposed analysis that does not tell you to mention a particular step, is inexact at best, and more probably mistaken."

I'm pretty sure my answer was completely correct. Care to point out the inexactness/mistakes?

Hint: Pinpointing the exact flaw in the proof that 2=1, requires you to mention one of the steps 1-10 of the (valid) proof of Lob's Theorem.

Any supposed analysis that does not tell you to mention a particular step, is inexact at best, and more probably mistaken.

The confusing part is this:

(C is provable -> C) -> C

I can grok the part inside the parenthases - that's what we'd all like to believe; that this system can assure us that everything it proves is true. The weird step is that this statement itself implies that C is true - I guess we either take this on faith or read the Deduction Theorem.

What I find most strange, though, is the desire to have the proof system itself distinguish between provability and truth. Isn't that impossible? A formal system only provides proofs given axioms; to it, the only meaning of truth beyond that of provability is the behavior of truth within formal logic operations, like implication and negation. "Truth" as "believability" only happens in the brains of the humans who read the proof later.

So, I disagree with the misleading paraphrasing of Lob's proof: "Löb's Theorem shows that a mathematical system cannot assert its own soundness without becoming inconsistent."

This dereferences the word "truth" wrongly as "soundness," when truth's meaning within the system is behavior in formal logic operations. The paraphrasing should be "Lob's theorem shows that a consistent mathematical system cannot assert that theorems it proves will behave as true under the symbol manipulations of formal logic."

"""(X->Y)->Y implies (not X)->Y"""

The arrow means "implies", right?

So,

(Smoke implies fire, therefore fire) implies (no smoke means fire)?

I don't get it.

I think that this is what the theorem means;

If (X->Y) -> Y, then ~X -> Y (If it's true that "If it's true that 'if X is true, then Y is true,' then Y must be true," then Y must be true, even if X is not true).

This makes sense because the first line, "(X->Y) -> Y," can be true whether or not X is actually true. The fact that ~X -> Y if this is true is an overly specific example of that "The first line being true (regardless of the truth of X)" -> Y. It's actually worded kind of weirdly, unless "imply" means something different in Logicianese than it does in colloquial English; ~X isn't really "implying" Y, it's just irrelevant.

This doesn't mean that "(X -> Y) -> Y" is always true. I actually can't think of any intuitive situations where this could be true. It's not true that the fact that "if Jesus really had come back to life, Christians would be Less Wrong about stuff" implies that Christians would be Less Wrong about stuff even if Jesus really hadn't come back to life.

Also,

To anyone who wants to tell me I'm wrong about this; If I'm wrong about this, and you know because you've learned about this in a class, whereas I just worked this out for myself, I'd appreciate it if you told me and mentioned that you've learned about this somewhere and know more than I do. If logic is another one of those fields where people who know a lot about it HATE it when people who don't know much about it try to work stuff out for themselves (like Physics and AI), I'd definitely like to know so that I don't throw out wrong answers in the future. Thanks.

I did study stuff like this a LONG time ago, so I respect your trying to work this out from 'common sense'. The way I see it, the key to the puzzle is the truth-value of Y, not 'whether or not X is actually true'. By working out the truth-tables for the various implications, the statement "((◻C)->C)->C" has a False truth-value when both (◻C) and C are False, i.e. if C is both unprovable and false the statement is false. Even though the 'material implication' "(X->Y)->Y implies (not X)->Y" is a tautology (because when the 1st part is false the 2nd part is true & vice-versa) that does not guarantee the truth of the 2nd part alone. In fact, it is intuitively unsurprising that if C is false, the premise that 'C is provable' is also false (of course such intuition is logically unreliable, but it feels nice to me when it happens to be confirmed in a truth table). What may seem counterintuitive is that this is the only case for which the premise is True, but the encompassing implication is False. That's because a 'logical implication' is equivalent to stating that 'either the premise (1st part) is False or the conclusion (2nd part) is True (or both)'. So, for the entire statement "((◻C)->C)->C" to be True when C is False, means "((◻C)->C)" must be False. "((◻C)->C)" is only False when "(◻C)" is True and "C" is False. Here's where the anti-intuitive feature of material implication causes brain-freeze - that with a false premise any implication is assigned a True truth-value. But that does NOT mean that such an implication somehow forces the conclusion to be true! It only affirms that for any implication to be true, it must be the case that "IF the premise is true then the conclusion is true." If the premise is False, the conclusion may be True or False. If the premise is True and the conclusion is False, then the implication itself is False!

The translation of " (not ◻C)->C" as "all statements which lack proofs are true" is a ringer. I'd say the implication "If it is not the case that C is provable, then C is True" is equivalent to saying "Either C is provable or C is True or both." Both of these recognize that the case for which "C is provable" and "C is False" makes the implication itself False. The mistranslation erroneously eliminates consideration of the case in which C is both unprovable an False, which is (not coincidentally, I suspect) the one case for which the grand formulation "((◻C)->C)->C" is False.

Jeepers. I haven't thought about this problem for a long time. Thanks.

The answer that occurs to me for the original puzzle is that Yudkowsky never proved (◻(2 = 1) -> (2 = 1)). I don't know it that is actually the answer, but I really need to go do other work and stop thinking about this problem.

Did EY post the correct answer anywhere? I've been thinking for hours and give up. The comments make no sense. The best I can tell from Larry_D'Anna's comment is that axiom 1 is wrong. But it doesn't seem wrong and I am very confused. Or maybe he's saying the way it's used in step 8 is wrong. But that seems a very straightforward inference from it. Either way, it must be correct, because Lob's theorem is an actual theorem. EY didn't just make it up to trick us.

So I went to wikipedia and it does say the same thing:

◻(◻C->C)->◻C

If you subtract the quotations (◻) from both sides, you do get (◻C->C)->C. Which obviously must be false, or 2=1. But I can't figure out what is wrong with removing the quotations. I don't even see where the quotations even came from in EYs proof. This is very frustrating.

Step 8 relies on the lemma that ◻X→◻◻X. If you try running through the proof with the ◻ removed from both sides, the corresponding lemma you would need to make step 8 work is X→◻X, but this is not true.

Yrg'f frr, V xabj sbe n snpg gung ¬(CN ⊢ (◻P → P) → P), gung vf, gung guvf fhccbfrq "nccyvpngvba" bs gur Qrqhpgvba Gurberz qb Yöo'f Gurberz vf abg n gurberz bs CN.

Yöo'f Gurberz vgfrys vf whfg gung -

Buuuu tbg vg.

Yöo'f Gurberz fnlf gung vs CN ⊢ ◻P → P, gura CN ⊢ P; be, fvzcyl, gung CN ⊢ ◻(◻P → P) → ◻P. Gur Qrqhpgvba Gurberz fnlf gung sbe "ernfbanoyr" (va n pregnva frafr) qrqhpgvir flfgrzf va svefg-beqre ybtvp naq nal frg bs nkvbzf Δ, vs vg'f gur pnfr gung, sbe fbzr N naq O, Δ ∪ {N} ⊢ O, gura Δ ⊢ N → O. Ubjrire, vg vf abg gur pnfr gung CN ∪ {◻P → P} ⊢ P, fvapr CN ∪ {◻P → P, ¬P} vf cresrpgyl pbafvfgrag (vss CN vf nyfb pbafvfgrag) naq vzcyvrf ¬◻P.

Be, va bgure jbeqf, nqqvat gur nkvbz ◻P → P gb CN sbe nal P bayl znxrf P n gurberz bs CN vs ◻P vf nyernql gurberz bs CN; ubjrire, ernfbavat "bhgfvqr" bs CN, jr xabj nyernql gung vs CN ⊢ ◻P gura CN ⊢ P (juvpu pna nyfb or sbeznyvfrq nf CN ⊢ ◻◻P → ◻P, juvpu vf n gurberz bs CN), fb nqqvat ◻P → P gb CN qbrf abg va snpg znxr CN "fgebatre" guna vg jnf orsber va nal zrnavatshy frafr, naq whfg znxrf CN gehfg vgf cebbs bs P, vs vg rire svaqf nal.

Gur pbeerpg nccyvpngvba bs gur Qrqhpgvba Gurberz gb Yöo'f Gurberz fnlf gung CN ∪ {◻(◻P → P)} ⊢ ◻P, juvpu jr nyernql xabj gb or gehr (orpnhfr... gung'f Yöo'f Gurberz), ohg fvapr ◻P → P vf abg va trareny n gurberz bs CN, guvf qbrf abg thnenagrr gung P jvyy or gehr. Naq phevbhfyl, guvf vzcyvrf gung nqqvat rvgure ◻(◻P → P) sbe nyy P be ◻P → P sbe nyy P gb CN qbrfa'g znxr CN nal fgebatre - qbvat gur sbezre npghnyyl eraqref ◻ rssrpgviryl zrnavatyrff, fvapr gura ◻P jvyy or gehr sbe nyy P, naq vg jvyy ab ybatre ercerfrag svaqvat n Töqry Ahzore bs n cebbs bs P, fb bar pbhyq fnl gung vg znxrf CN "jrnxre" va fbzr ybbfr frafr -, ohg nqqvat obgu sbe nyy P znxrf vg vapbafvfgrag.