The Strong Occam's Razor

This post is a summary of the different positions expressed in the comments to my previous post and elsewhere on LW. The central issue turned out to be assigning "probabilities" to individual theories within an equivalence class of theories that yield identical predictions. Presumably we must prefer shorter theories to their longer versions even when they are equivalent. For example, is "physics as we know it" more probable than "Odin created physics as we know it"? Is the Hamiltonian formulation of classical mechanics a priori more probable than the Lagrangian formulation? Is the definition of reals via Dedekind cuts "truer" than the definition via binary expansions? And are these all really the same question in disguise?

One attractive answer, given by shokwave, says that our intuitive concept of "complexity penalty" for theories is really an incomplete formalization of "conjunction penalty". Theories that require additional premises are less likely to be true, according to the eternal laws of probability. Adding premises like "Odin created everything" makes a theory less probable and also happens to make it longer; this is the entire reason why we intuitively agree with Occam's Razor in penalizing longer theories. Unfortunately, this answer seems to be based on a concept of "truth" granted from above - but what do differing degrees of truth actually mean, when two theories make exactly the same predictions?

Another intriguing answer came from JGWeissman. Apparently, as we learn new physics, we tend to discard inconvenient versions of old formalisms. So electromagnetic potentials turn out to be "more true" than electromagnetic fields because they carry over to quantum mechanics much better. I like this answer because it seems to be very well-informed! But what shall we do after we discover all of physics, and still have multiple equivalent formalisms - do we have any reason to believe simplicity will still work as a deciding factor? And the question remains, which definition of real numbers is "correct" after all?

Eliezer, bless him, decided to take a more naive view. He merely pointed out that our intuitive concept of "truth" does seem to distinguish between "physics" and "God created physics", so if our current formalization of "truth" fails to tell them apart, the flaw lies with the formalism rather than with us. I have a lot of sympathy for this answer as well, but it looks rather like a mystery to be solved. I never expected to become entangled in a controversy over the notion of truth on LW, of all places!

A final and most intriguing answer of all came from saturn, who alluded to a position held by Eliezer and sharpened by Nesov. After thinking it over for a while, I generated a good contender for the most confused argument ever expressed on LW. Namely, I'm going to completely ignore the is-ought distinction and use morality to prove the "strong" version of Occam's Razor - that shorter theories are more "likely" than equivalent longer versions. You ready? Here goes:

Imagine you have the option to put a human being in a sealed box where they will be tortured for 50 years and then incinerated. No observational evidence will ever leave the box. (For added certainty, fling the box away at near lightspeed and let the expansion of the universe ensure that you can never reach it.) Now consider the following physical theory: as soon as you seal the box, our laws of physics will make a localized exception and the victim will spontaneously vanish from the box. This theory makes exactly the same observational predictions as your current best theory of physics, so it lies in the same equivalence class and you should give it the same credence. If you're still reluctant to push the button, it looks like you already are a believer in the "strong Occam's Razor" saying simpler theories without local exceptions are "more true". QED.

It's not clear what, if anything, the above argument proves. It probably has no consequences in reality, because no matter how seductive it sounds, skipping over the is-ought distinction is not permitted. But it makes for a nice koan to meditate on weird matters like "probability as preference" (due to Nesov and Wei Dai) and other mysteries we haven't solved yet.

ETA: Hal Finney pointed out that the UDT approach - assuming that you live in many branches of the "Solomonoff multiverse" at once, weighted by simplicity, and reducing everything to decision problems in the obvious way - dissolves our mystery nicely and logically, at the cost of abandoning approximate concepts like "truth" and "degree of belief". It agrees with our intuition in advising you to avoid torturing people in closed boxes, and more generally in all questions about moral consequences of the "implied invisible". And it nicely skips over all the tangled issues of "actual" vs "potential" predictions, etc. I'm a little embarrassed at not having noticed the connection earlier. Now can we find any other good solutions, or is Wei's idea the only game in town?

Comments


gets out the ladder and climbs up to the scoreboard

5 posts without a tasteless and unnecessary torture reference

replaces the 5 with a 0

climbs back down

Years ago, before coming up with even crazier ideas, Wei Dai invented a concept that I named UDASSA. One way to think of the idea is that the universe actually consists of an infinite number of Universal Turing Machines running all possible programs. Some of these programs "simulate" or even "create" virtual universes with conscious entities in them. We are those entities.

Generally, different programs can produce the same output; and even programs that produce different output can have identical subsets of their output that may include conscious entities. So we live in more than one program's output. There is no meaning to the question of what program our observable universe is actually running. We are present in the outputs of all programs that can produce our experiences, including the Odin one.

Probability enters the picture if we consider that a UTM program of n bits is being run in 1/2^n of the UTMs (because 1/2^n of all infinite bit strings will start with that n bit string). That means that most of our instances are present in the outputs of relatively short programs. The Odin program is much longer (we will assume) than one without him, so the overwhelming majority of our copies are in universes without Odin. Probabilistically, we can bet that it's overwhelmingly likely that Odin does not exist.
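The weighting can be made concrete with a small sketch. The program lengths below are invented purely for illustration (nobody knows the real ones), and `prefix_weight` is a hypothetical helper name; the only thing taken from the argument above is that an n-bit program gets weight 1/2^n.

```python
from fractions import Fraction

def prefix_weight(n_bits: int) -> Fraction:
    """Fraction of infinite bit strings that start with one fixed n-bit prefix."""
    return Fraction(1, 2 ** n_bits)

# Hypothetical program lengths, chosen only for illustration.
PLAIN_BITS = 100        # "physics as we know it"
ODIN_BITS = 100 + 20    # the same program plus 20 extra bits describing Odin

plain = prefix_weight(PLAIN_BITS)
odin = prefix_weight(ODIN_BITS)

# How many times more of "our copies" live in the shorter program's output.
odds = plain / odin

# Share of our copies that live in the Odin program, if these two
# programs were the only ones producing our experiences.
share_odin = odin / (plain + odin)

print(odds)
print(share_odin)
```

Under these made-up lengths, every extra bit in the Odin program halves its share, so 20 extra bits already cost a factor of about a million.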

This is a cool theory, but it is probably equivalent to another, less cool theory that yields identical predictions and does not reference infinite virtual universes. :)

Although it postulates the existence of infinitely many inaccessible universes, it may be simpler than equivalent theories which imply only a single universe.

I feel like this is an argument we've seen before, but with more hilarious self-referentiality.

Yep, I already arrived at that answer elsewhere in the thread. It's very nice and consistent and fits very well with UDT (Wei Dai's current "crazy" idea). There still remains the mystery of where our "subjective" probabilities come from, and the mystery why everything doesn't explode into chaos, but our current mystery becomes solved IMO. To give a recent quote from Wei, "There are copies of me all over math".

Should we stop at UDASSA? Can we consider a universe that consists of a continuum of UDASSAs, each running some (infinite) subset of the set of all possible programs?

In case anyone is interested: this extension doesn't seem to lead to anything of interest.

If we map the continuum of UDASSA multiverses into [0;1), then the Lebesgue measure of the set of multiverses which run a particular program is 1/2.

Let the binary number 0.b1 b2 ... bn ... be the representation of multiverse M, where for all n: bn=1 iff M runs program number n, and bn=0 otherwise.

It is easy to see that the image of the set of multiverses which run program number n is a collection of intervals [(2i-1)/2^n; 2i/2^n) for i=1..2^(n-1). Each interval has length 1/2^n, so the Lebesgue measure of the set is 2^(n-1)/2^n=1/2.
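A quick sanity check of that measure, as a sketch: under the binary encoding, the multiverses that run program number n are exactly the points of [0;1) whose n-th binary digit is 1, which are the intervals [(2i-1)/2^n; 2i/2^n). The function names here are made up for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from math import floor

def intervals_running(n):
    """Half-open intervals of [0, 1) whose n-th binary digit (1-indexed) is 1."""
    step = Fraction(1, 2 ** n)
    return [((2 * i - 1) * step, 2 * i * step) for i in range(1, 2 ** (n - 1) + 1)]

def lebesgue_measure(intervals):
    """Total length of a list of disjoint intervals."""
    return sum((hi - lo for lo, hi in intervals), Fraction(0))

def nth_bit(x, n):
    """n-th binary digit of x, for x in [0, 1)."""
    return floor(x * 2 ** n) % 2

for n in range(1, 11):
    ivs = intervals_running(n)
    # The midpoint of every interval really has its n-th digit set ...
    assert all(nth_bit((lo + hi) / 2, n) == 1 for lo, hi in ivs)
    # ... and the total length is 1/2 regardless of n.
    assert lebesgue_measure(ivs) == Fraction(1, 2)
```

The measure comes out the same for every program number, which is why this extension gives no preference to short programs.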

This theory makes exactly the same observational predictions as your current best theory of physics, so it lies in the same equivalence class and you should give it the same credence.

You're blurring an important distinction between two types of equivalence:

  • Empirical equivalence, where two program-theories give the same predictions on all currently known empirical observations.
  • Formal equivalence, where two program-theories give identical predictions on all theoretically possible configurations, and this can be proved mathematically.

If two theories are only empirically equivalent, you use the complexity penalty and prefer the simpler one. If the theories are formally equivalent, you don't bother trying to tell them apart. If you don't know which equivalence relation holds, you sit down and start doing math.

If two theories imply different invisibles, they shouldn't be considered equivalent. That no evidence can tell them apart, and still they are not equal, is explained by them having different priors. But if two theories are logically (agent-provably, rather) equivalent, this is different, as the invisibles they imply and priors measuring them are also the same.

The thing about positivism is it pretends to be a down-to-earth common-sense philosophy, and then the more you think about it the more it turns into this crazy surrealist madhouse. So we can't measure parallel universes and there's no fact of the matter as to whether they exist. But people in parallel universes can measure them, but there's no fact of the matter whether these people exist, and there's a fact of the matter whether these universes exist if and only if these people exist to measure them, so there's no fact of the matter whether there is a fact of the matter whether these universes exist. And meanwhile to the extent that these people exist, some of them claim there is no fact of the matter as to whether we exist, so really there's not one positivism, but there's a positivism for every quantum world. And in each of these positivisms, which worlds it's meaningful to talk about the existence of gets determined by some random process many times each second. And the question of whether other people in the same universe as you exist is meaningless too, because it makes no predictions that differ from those of the interpretation that other people are all well-disguised walruses who act exactly like people. And if you make an identical copy of yourself, you could end up being either one of them, so there's a 50% chance that there's a fact of the matter whether each continuation is a walrus. Etc.

Boxes proofed against all direct and indirect observation, potential for observation mixed with concrete practicality of such observation, strictly-worse choices, morality... one would be hard-pressed to muddle your thought experiment more than that.

Let's try to make it a little more straightforward: assume that there exists a certain amount of physical space which falls outside our past light cone. Do you think it is equally likely that it contains galaxies and that it contains unicorns? More importantly, do you think the preceding question means anything?

I'm skeptical of the idea that the hypothesis "Odin created physics as we know it" would actually make no additional predictions over the hypothesis "physics as we know it". I'm tempted to say that as a last resort we could distinguish between them by generating situations like "Omega asks you to bet on whether Odin exists, and will evaluate you using a logarithmic scoring rule and will penalize you that many utilons", though at this point maybe it is unjustified to invoke Omega without explaining how she knows these things.
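For readers who haven't met it, here is a minimal sketch of how a bet under a logarithmic scoring rule would work. The credence of 0.01 and the function names are assumptions for illustration only; the point is just that such a rule penalizes you least, in expectation, when you report what you actually believe.

```python
import math

def log_penalty(reported_p: float, odin_exists: bool) -> float:
    """Penalty under a logarithmic scoring rule: -log of the probability
    you assigned to the outcome that actually happened."""
    p_outcome = reported_p if odin_exists else 1.0 - reported_p
    return -math.log(p_outcome)

def expected_penalty(reported_p: float, credence: float) -> float:
    """Expected penalty if your real credence in Odin is `credence`
    but you report `reported_p`."""
    return (credence * log_penalty(reported_p, True)
            + (1.0 - credence) * log_penalty(reported_p, False))

CREDENCE = 0.01  # hypothetical credence that Odin exists

honest = expected_penalty(CREDENCE, CREDENCE)
hedged = expected_penalty(0.5, CREDENCE)          # pretending to be agnostic
overconfident = expected_penalty(0.001, CREDENCE)  # pretending to be surer

print(honest, hedged, overconfident)
```

So a bettor who privately penalizes the Odin theory cannot hide it: reporting anything other than their real credence costs them expected utilons.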

But what do you think of some of the examples given in "No Logical Positivist I"?

On August 1st 2008 at midnight Greenwich time, a one-foot sphere of chocolate cake spontaneously formed in the center of the Sun; and then, in the natural course of events, this Boltzmann Cake almost instantly dissolved.

I would say that this hypothesis is meaningful and almost certainly false. Not that it is "meaningless". Even though I cannot think of any possible experimental test that would discriminate between its being true, and its being false.

Would that be in the same equivalence class as its negation?

I'm skeptical of the idea that the hypothesis "Odin created physics as we know it" would actually make no additional predictions over the hypothesis "physics as we know it"

As the originator of that hypothesis, the idea I had in mind was that there are two theories: physics as we know it, and Odin created physics as we know it. Scientists hold the first theory, and Odinists hold the second. The theories predict exactly the same things, so that Odinists and scientists have the same answers to problems, but the Odinists' theory lets them say that Odin exists - and they happen to have a book that starts with "If Odin exists, then..." and goes on to detail how they should live their lives. The scientists have a great interest in showing that the second theory is wrong, because the Odinists are otherwise justified in their pillaging of the scientists' home towns. But the Odinists are clever folk, and say we shouldn't expect to see anything different from the world as we know it because Odin is all-powerful at world-creation.

Honestly, I should have picked Loki.

Are you defining Odin's role and behaviour such that he is guaranteed not to actually do anything that impinges on reality beyond creating the laws of physics that we already know? Or is it just that he hasn't interfered with anything so far, or hasn't interfered with anything in such a way that anything we can observe is different?

(Edit: I ask because any claims about metaethics that depend on Odin's existence would seem to require raising him to the level of a causally-active component of the theory rather than an arbitrary and unfalsifiable epiphenomenon.)

I am defining him as being an arbitrary and unfalsifiable epiphenomenon everywhere excepting that he was causally active in the creation of the book that details the ethical lives Odinists ought to live. Basically, he hasn't interfered with anything in such a way that anything we could ever observe is different, except he wrote a book about it.

It's clear to me that anyone could choose to reject Odinism, but it's not clear what arguments other than a strong Occam's razor could convince a sufficiently reasonable and de-biased (ie genuinely truth-seeking) Odinist to give up their belief.

Is the Hamiltonian formulation of classical mechanics a priori more probable than the Lagrangian formulation?

They are both derivable from the same source, Newtonian mechanics plus the ZF Set theory. They are equivalent and therefore equally probable.

The length of the shortest possible version among all these mutually equivalent theories is the measure of how (equally) probable they are.

My favourite justification of Occam's razor is that even if two theories are equivalent in their explicit predictions, the simpler one is usually more likely to inspire correct generalisations. The reason may be that the more complicated the theory is, the more arbitrary constraints it puts on our thinking, and those constraints can prevent us from seeing the correct more general theory. For example, some versions of aether theory can be made equivalent to special relativity, but the assumptions of absolute space and time make it nearly impossible to discover something equivalent to general relativity, starting from aether.

no matter how seductive it sounds, skipping over the is-ought distinction is not permitted

Yeah, some of us are still not convinced on that one.

Speaking of which, does anyone actually have something resembling a proof of this? People just seem to cast it about flippantly.

Adding premises like "Odin created everything" makes a theory less probable and also happens to make it longer; this is the entire reason why we intuitively agree with Occam's Razor in penalizing longer theories. Unfortunately, this answer seems to be based on a concept of "truth" granted from above - but what do differing degrees of truth actually mean, when two theories make exactly the same predictions?

and

Imagine you have the option to put a human being in a sealed box where they will be tortured for 50 years and then incinerated. No observational evidence will ever leave the box... Now consider the following physical theory: as soon as you seal the box, our laws of physics will make a localized exception and the human will spontaneously vanish. This theory makes exactly the same observational predictions as your current best theory of physics, so it lies in the same equivalence class and you should give them the same credence.

Why do we only care about observational predictions and not all "in principle" predictions? (Serious question, not rhetorical). My intuition is that in the first quote "Physics" and "Odin created physics" aren't in the same equivalence class because the latter makes an additional prediction: the existence of Odin. Similarly in the second quote, there are differing predictions about what is happening inside the box even if they are physically impossible to test, so I would put the two theories in different equivalence classes. I would do the same for MWI and Copenhagen quantum mechanics.

This is pure intuition on my part. I haven't had a chance to take a look at the math of K-complexity and the like, so I might just be missing something relatively basic.

A prediction that's impossible to test is a contradiction in terms. Show me any unfalsifiable theory, and I'll invent some predictions that follow from it; they will just be "impossible to test".

Ok, so don't call the existence of Odin or what's happening inside the box "predictions." Then I'll rephrase my question:

Why do we only care about "predictions" and not "everything a theory says about reality?" Clearly all three pairs of theories I mentioned above say different things about reality even if it is impossible in some sense to observe this difference.

(I'll add to this later, but I'm pressed for time currently) edit: nothing to add, actually

How can we distinguish statements that are "about reality" from statements that aren't, if we just threw away the criteria of predictive power and verification?

How about counterfactual predictive power and verification? If I could observe the inside of that box, then I could see a difference between the two theories.

I realize this opens a potential can of worms, i.e., what sort of counterfactuals are we allowed to consider? But in any case, this is how I've understood the basic idea of falsifiability. Compare to Yvain's logs of the universe idea. (He's doing something different with it, I know)

Theories that require additional premises are less likely to be true, according to the eternal laws of probability ... Unfortunately, this answer seems to be based on a concept of "truth" granted from above - but what do differing degrees of truth actually mean, when two theories make exactly the same predictions?

Reading this and going back to my post to work out what I was thinking, I have a sort-of clarification for the issue in the quote. The original argument was that, before experiencing the universe, all premises are a priori equally likely, so we can generate as many hypotheses as we like out of them. Then, after experience (a posteriori) some premises are extremely likely (shorthand: true) and some are extremely unlikely (shorthand: false). Now, in our advanced position, there are a large number of false premises, a small number of true premises, and an unknown number of premises we haven't had experiences about. We can now generate as many "prediction-equivalent" theories as we like by combining the unknown premises with the true ones. As long as we avoid the false premises, all our hypotheses will be based on true premises, and premises which we have not yet checked. To refine that argument, it is the conjunction of specifically these unknown premises that might weaken the hypothesis. Therefore, we ought to include as few of these as-yet-untested premises in our hypothesis, in order to reduce the chances of it being wrong.

Now, in the case of two theories making the same prediction: I suggest that it is possible to look at an unknown premise and decide whether we can check it. In this sense, if it is checkable, we can view it as a prediction: the hypothesis includes premise C such that if C is true then the hypothesis is true and if C is false the hypothesis is false. In other words, the hypothesis makes a prediction that C is true. If it is uncheckable, though, we don't use the word prediction. This is the discussion that Matt_Simpson and cousin_it are having down below. If both theories make the same predictions because one says A B C and D, and the other says A B C E F G and H, (A and B are true, C is unknown-testable, D through H are unknown-untestable), then the theories are still distinguishable, still different, but they make the same predictions. In this specific case, we should in principle prefer the first theory, because it has only one fault-line and the second has four: even though we don't think we can test any of these fault-lines, the laws of probability still apply. So this is what degrees of truth mean when all theories make the same predictions.
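To put rough numbers on the fault-line argument, here is a sketch. It assumes the premises are independent and gives each unknown premise a credence of 0.5; both of those are illustrative assumptions, not part of the argument above, and the names are made up.

```python
from math import prod

def conjunction_probability(premise_probs):
    """Probability that every premise holds, assuming independence."""
    return prod(premise_probs)

P_UNKNOWN = 0.5  # hypothetical credence in each unchecked premise

# A and B are taken as established; C is unknown but testable, shared by both.
shared = [1.0, 1.0, P_UNKNOWN]

theory_1 = shared + [P_UNKNOWN]        # adds D: one untestable fault-line
theory_2 = shared + [P_UNKNOWN] * 4    # adds E, F, G, H: four fault-lines

p1 = conjunction_probability(theory_1)
p2 = conjunction_probability(theory_2)

print(p1, p2, p1 / p2)
```

Both theories predict exactly the same observations, since they only differ in premises we cannot check, yet under these assumptions the second is several times less probable simply because it has more places to be wrong.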

Now, what about two different theories that say A B C and D? I'm not familiar with the physics, but it appears that Hamiltonian and Lagrangian systems are in this scenario: both say A B C and D, in different but equivalent ways. I haven't had enough time to think to bake up an answer for this, but I suspect it is similar to how you can express the same truth-table with different combinations of premises and operators. The question that stumps me is: in logic, we don't care about the operators except for how they twist the premises. In physics, we actually do care about the operators, to an extent: we give them names like "the mechanism of x" and "the underlying reality". So it seems to me that saying Lagrangian and Hamiltonian are different but equivalent is saying that two different logical formulations with the same truth-table are different but equivalent, except in physics we feel the difference actually matters.

I wonder if this can't be considered more pragmatically? There was a passage in the MIT Encyclopedia of Cognitive Sciences in the Logic entry that seems relevant:

Johnson-Laird and Byrne (1991) have argued that postulating more imagelike MENTAL MODELS make better predictions about the way people actually reason. Their proposal, applied to our sample argument, might well help to explain the difference in difficulty in the various inferences mentioned earlier, because it is easier to visualize “some people” and “at least three people” than it is to visualize “most people.” Cognitive scientists have recently been exploring computational models of reasoning with diagrams. Logicians, with the notable exceptions of Euler, Venn, and Peirce, have until the past decade paid scant attention to spatial forms of representation, but this is beginning to change (Hammer 1995).

This made me think a bit differently about how we might choose between two abstract models with the same explanatory power. It seems that the rational thing to do is to choose the one that allows you to reason the most fluently so as to minimize the likelihood of fallacious reasoning.

In fact, it seems that we should expect the cognitive sciences to provide clues about how we could adjust formal systems with a view to ease of understanding and technical fluency when reasoning about/with them.

Taking this view, and assuming we had finished physics, all the future work would be about tweaking the formalisms toward the most intuitive possible ones with respect to the knowledge we have of human reasoning. What would be important is that they be as easy to understand as possible. That way we could hope to ensure more efficiency in technological development as well as better general understanding among the public.

I was thinking on a similar line:

Given that computation has costs and memory is limited, to make the best possible predictions with given resources one needs to use the computationally least expensive way.

Assuming that generating a mathematical model is (at least on average) more difficult for more complex theories, wasting time on creating (in the end equivalent) models that have to incorporate epiphenomenal concepts leads to worse predictions in practice.

So not using the strong Occam's razor would lead to worse results.

And because we have brought moral issues along: not using the best possible way would even be morally bad, as we would lose important information for optimizing our moral behavior, since with our limited resources we could not look as far into the future and would have less accurate predictions at our disposal.

ETA: The difference from your post above is mainly that this still holds true for a perfect Bayesian superintelligence, and should be invariant across different computational substrates.

I personally think Occam's razor is more about describing what you know. If two theories are equally good in their explanatory value, but one has some extra bells and whistles added on, you have to ask what basis you have for deciding to prefer the bells and whistles over the no bells and whistles version.

Since both theories are in fact equally good in their predictions, you have no grounds for preferring one over the other. You are in fact ignorant of which theory is the correct one. However, the simplest one is the one that comes closest to describing the state of your knowledge. The more complicated ones add extra bits that can only really be described as speculations, not knowledge at all, because all the extra bits of 'information' in that theory are not based on any data at all.

Perhaps a more complicated theory is true? Perhaps. But which one of the many many many more complicated theories should you pick? You have no evidence on which to make this choice.

Equally, one shouldn't be too doctrinaire about it. We don't know the simplest explanation is correct - we simply know it's the best way of describing what we know so far. If there are several similar theories of almost equal explanatory weight, there are grounds for reasonable agnosticism even if there is one that's a little 'lighter' than the others.

What if Tegmark's multiverse is true? All the equivalent formulations of reality would "exist" as mathematical structures, and if there's nothing to differentiate between them, it seems that all we can do is point to the appropriate equivalence class in which "we" exist.

However, the unreachable tortured man scenario suggests that it may be useful to split that class anyway. I don't know much about the Solomonoff prior - does it make sense now to build a probability distribution over the equivalence class and say what is the probability mass of its part that contains the man?

Theories that require additional premises are less likely to be true, according to the eternal laws of probability. Adding premises like "Odin created everything" makes a theory less probable and also happens to make it longer; this is the entire reason why we intuitively agree with Occam's Razor in penalizing longer theories. Unfortunately, this answer seems to be based on a concept of "truth" granted from above -

Not to me it doesn't. (Though I may not understand what you mean by "truth" here.) Bayesian probability theory as I've come to understand it deals with maps directly and with the territory only indirectly. It purports to describe how logic, or the laws of thought, apply to uncertainty. So we can describe in some detail how these laws demand a lesser probability for a compound hypothesis, without ever mentioning the reality you want this hypothesis to address. In that sense the math doesn't care about the content of your theories.

(I started to add a silly technical nitpick for something else on the site. Suffice it to say that numerical probabilities serve as maps of other maps, or measures of the trust that a 'rational' mind would have in those maps.)

So should we trust our 'logical' map-evaluating software in this respect? Well, it seems to work so far. As I understand it our mindless evolution created the basics by trial and error, after which we created more mathematics by similar methods. (We twisted our mathematical intuitions into strange shapes like "the square root of minus one" and kept whatever we found a use for.) Bayes' Theorem as we know it emerged from this process. So we can imagine discovering that probability or logic in general has misled us in some fundamental way. Perhaps the most complex possible theory is always correct and we just can't imagine said theory (or, indeed, a way for it to exist). But our internal software tells us not to expect this. ^_^

Another intriguing answer came from JGWeissman. Apparently, as we learn new physics, we tend to discard inconvenient versions of old formalisms. So electromagnetic potentials turn out to be "more true" than electromagnetic fields because they carry over to quantum mechanics much better. I like this answer because it seems to be very well-informed!

I don't like this explanation- while potentials are useful calculation tools both macroscopically and quantum mechanically, fields have unique values whereas potentials have non-unique values. It's not clear to me how to compare those two benefits and decide if one is "more true."

The alternative way to look at it: if you only knew E&M, would you talk in terms of four-vector potentials or in terms of fields? Most of the calculations for complicated problems are easier with potentials (particularly for magnetism), but the target is generally coming up with the fields from those potentials. Similarly, most calculations in QM are easier with the potentials (I've never seen them done with fields, but I imagine it must be possible- you can do classical mechanics with or without Hamiltonians), but the target is wavefunctions or expectation values.

So it's not clear to me what it means to choose potentials over fields, or vice versa. The potentials are a calculation trick, the fields are real, just like in QM the potentials are a calculation trick, and the wavefunction is real. They're complementary, not competing.

I don't like this explanation- while potentials are useful calculation tools both macroscopically and quantum mechanically, fields have unique values whereas potentials have non-unique values. It's not clear to me how to compare those two benefits and decide if one is "more true."

You can just as easily move to a different mathematical structure where the gauge is "modded out", a "torsor". Similarly, in quantum mechanics where the phase of the wavefunction has no physical significance, rather than working with the vectors of a Hilbert space, we work with rays (though calculational rules in practice reduce to vectors).

There are methods of gaugeless quantization but I'm not familiar with them, though I'd definitely like to learn. (I'd hope they'd get around some of the problems I've had with QFT foundations, though that's probably a forlorn hope.)

I don't like this explanation- while potentials are useful calculation tools both macroscopically and quantum mechanically, fields have unique values whereas potentials have non-unique values. It's not clear to me how to compare those two benefits and decide if one is "more true."

Immediate thought: Why not just regard the potentials as actual elements of a quotient space? :)

So it's not clear to me what it means to choose potentials over fields, or vice versa. The potentials are a calculation trick, the fields are real, just like in QM the potentials are a calculation trick, and the wavefunction is real. They're complementary, not competing.

Are you familiar with the Aharonov-Bohm effect? My understanding is that it is a phenomenon which, in some sense, shows that the EM potential is a "real thing", not just a mathematical artifact.

I am, and your understanding is correct for most applications. I don't think it matters for this question, as my understanding is that the operative factor behind the Aharonov-Bohm effect is the nonlocality of wavefunctions.* Because wavefunctions are nonlocal, the potential formulation is staggeringly simpler than a force formulation. (The potentials are more real in the sense that the only people who do calculations with forces are imaginary!)

You still have gauge freedom with the Aharonov-Bohm effect- if you adjust the four-potential everywhere, all it does is adjust the phase everywhere, and all you can measure are phase differences.

Although, that highlights an inconsistency: if I'm willing to accept wavefunctions as real, despite their phase freedom, then I should be willing to accept potentials are real, despite their gauge freedom. I'm going to think this one over, but barring any further thoughts it looks like that's enough to change my mind.

*I could be wrong: I have enough physics training to speculate on these issues, but not to conclude.

[edit] It also helps that Feynman, who certainly knows more about this than I do, sees the potentials as more real (I suppose this means 'fundamental'?) than the fields.

wavefunctions as real, despite their phase freedom,

Heh. It gets worse. Typically one is taught that the wavefunction is defined up to a global constant. You might have thought that the difference in phase between two places would at least be well defined. This is true, so long as you stick to one reference frame. A Galilean boost will preserve the magnitude everywhere, but add a different phase everywhere.