Aren't LLMs already capable of two very different kinds of search? Firstly, their whole deal is predicting the next token, which is a kind of search: they're evaluating all the tokens at every step and in the end choosing the most probable-seeming one. Secondly, across-token search when prompted accordingly. Something like "Please come up with 10 options for X, then rate them all according to Y, and select the best option" is a task that current LLMs can perform very reliably, whether or not "within-token search" exists as well. But then again, one might of course argue that search happening within a single forward pass, and maybe even a type of search that "emerged" via SGD rather than being hard-baked into the architecture, would be particularly interesting/important/dangerous. We just shouldn't make the mistake of assuming that this would be the only type of search that's relevant.
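To make the "propose, rate, select" pattern concrete, here's a minimal sketch of that across-token loop. The `ask_llm` function is purely hypothetical (stubbed with canned answers so the control flow is runnable), not any real API:

```python
# Sketch of across-token search via prompting: propose options, rate them,
# select the best. `ask_llm` is a hypothetical stand-in for a real model
# call; here it returns canned outputs so the control flow is runnable.

def ask_llm(prompt: str) -> str:
    canned = {
        "propose": "solar panels; better insulation; heat pump",
        "rate solar panels": "6",
        "rate better insulation": "8",
        "rate heat pump": "7",
    }
    return canned[prompt]

def search_via_prompting() -> str:
    # Step 1: have the model enumerate candidate options.
    options = ask_llm("propose").split("; ")
    # Step 2: have the model rate each option (criterion Y is implicit here).
    scores = {opt: int(ask_llm(f"rate {opt}")) for opt in options}
    # Step 3: pick the highest-rated option. The evaluate-and-choose step
    # happens across generated tokens, not inside a single forward pass.
    return max(scores, key=scores.get)

print(search_via_prompting())  # better insulation
```

The point is only that the search structure lives in the prompting scaffold, not in the weights.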

I think across-token search via prompting already has the potential to lead to the AGI-like problems that we associate with mesa-optimizers. Evidently the technology is not quite there yet, because proofs of concept like AutoGPT don't really work so far. But conditional on AGI being developed in the next few years, it seems very likely to me that this kind of search would be the one that enables it, rather than some hidden "O(1)" search deep within the network itself.

Edit: I should of course add a "thanks for the post" and mention that I enjoyed reading it, and it made some very useful points!

Great post! Two thoughts that came to mind while reading it:

  • the post mostly discusses search happening directly within the network, e.g. within a single forward pass; but what can also happen, e.g. in the case of LLMs, is that search happens across token generation rather than within it. E.g. you could give ChatGPT a chess position and then ask it to list all the valid moves, then check which move would lead to which state, and whether that state looks better than the current one. This would only be search of depth 1, of course, but still a form of search. In practice it may be difficult because ChatGPT likes to keep its messages to a certain length, so it will probably stop prematurely if the search space gets too big, but still, search most definitely takes place in this case.
  • somewhat of a project proposal, ignoring my previous point and getting back to "search within a single forward pass of the network": let's assume we can "intelligent design" our way to a neural network that actually does implement some kind of small search to solve a problem, so we know the NN is at a pretty optimal solution for the problem it solves. What does (S)GD look like at or very near this point? Would it stay close to this optimum, or instantly diverge away, e.g. because the optimum's attractor basin is so unimaginably tiny in weight space that it's numerically highly unstable? If the latter (and if this finding indeed generalizes meaningfully), then one could conclude that even though search "exists" in parameter space, it's impractical for SGD to ever reach it due to the unfriendly shape of the loss landscape.
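The depth-1 search from the first bullet can be sketched generically: enumerate legal moves, evaluate each successor state, pick the best. A toy take-away game stands in for chess here to keep the sketch self-contained (the heuristic and game are made up for illustration):

```python
# Generic depth-1 search: enumerate legal moves, evaluate the resulting
# state with a heuristic, pick the move whose successor scores best.
# A toy take-away game stands in for chess to keep things self-contained.

def legal_moves(state: int) -> list[int]:
    # From a pile of `state` stones, you may remove 1, 2, or 3.
    return [n for n in (1, 2, 3) if n <= state]

def apply_move(state: int, move: int) -> int:
    return state - move

def evaluate(state: int) -> float:
    # Heuristic: leaving the opponent a multiple of 4 is winning here.
    return 1.0 if state % 4 == 0 else 0.0

def depth1_search(state: int) -> int:
    # Depth 1: look one move ahead, with no recursion into replies.
    return max(legal_moves(state), key=lambda m: evaluate(apply_move(state, m)))

print(depth1_search(6))  # 2  (6 - 2 = 4, a multiple of 4)
```

In the LLM case the `evaluate` call would itself be another round of token generation, which is exactly why message-length limits can cut the search off.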

Thanks a lot! Appreciated, I've adjusted the post accordingly.

Just came to my mind that these are things I tend to think of under the heading of "considerateness" rather than kindness.


Guess I'd agree. Maybe I was anchored a bit here by the existing term of computational kindness. :)

Fair point. Maybe if I knew you personally I would take you to be the kind of person that doesn't need such careful communication, and hence I would not act in that way. But even besides that, one could make the point that your wondering about my communication style is still a better outcome than somebody else being put into an uncomfortable situation against their will.

I should also note I generally have less confidence in my proposed mitigation strategies than in the phenomena themselves. 

Thanks for the example! It reminds me of how I once was a very active Duolingo user, but then they published an update that changed the color scheme. Suddenly the Duolingo interface was brighter and lower-contrast, which just gave me a headache. At that point I basically instantly stopped using the app, as I found no setting to change it back to higher contrast. It's not quite the same of course, but probably also something that would be surprising to some product designers -- "if people want to learn a language, surely something as banal as brightening up the color scheme a bit would not make them stop using our app".

Another operationalization of the mental model behind this post: let's assume we have two people, Zero-Zoe and Nonzero-Nadia. They are employed by two big sports clubs and are responsible for the living and training conditions of the athletes. Zero-Zoe strictly bases her decisions on studies that found significant effects (and had no failed replications). Nonzero-Nadia lets herself be informed by studies in a similar manner, but also takes priors into account for decisions that have little scientific backing, following a "causality is everywhere and effects are (almost) never truly 0" world view, and goes for many speculative but cheap interventions that are (if indeed non-zero) more likely to be beneficial than detrimental.

One view is that Nonzero-Nadia is wasting her time and focuses on too many inconsequential considerations, so will overall do a worse job than Zero-Zoe as she's distracted from where the real benefits can be found.

Another view, and the one I find more likely, is that Nonzero-Nadia can overall achieve better results (in expectation), because she too will follow the most important scientific findings, but on top of that will apply all kinds of small positive effects that Zero-Zoe is missing out on.

(A third view would of course be "it doesn't make any difference at all and they will achieve completely identical results in expectation", but come on, even an "a non-negligible subset of effect sizes is indeed 0"-person would not make that prediction, right?)

You're right of course - in the quoted part I link to the Wikipedia article for "almost surely" (as the analogous opposite case of "almost 0"), so yes, it can indeed happen that the effect is actually 0, but on a continuum of numbers this is so extremely rare that it doesn't make much sense to highlight that particular hypothesis.

For many such questions it's indeed impossible to say. But I think there are also many, particularly the types of questions we often tend to ask as humans, where you have reasons to assume that the causal connections collectively point in one direction, even if you can't measure it.

Let's take the question of whether improving air quality at someone's home improves their recovery time after exercise. I'd say it very likely does. But I'd also be a bit surprised if studies were able to show such an effect, because it's probably small, and it's probably hard to get precise measurements. Still, improving air quality is an intervention that is generally "good": it will have small but positive effects on all kinds of properties of our lives, and negative effects on far fewer. And if we accept that the effect on exercise recovery will not be zero, then I'd say there's something like a 90% chance that this effect is beneficial rather than detrimental.

Similarly, with many interventions that are supposed to affect behavior of humans, one relevant question that is often answerable is whether the intervention increases or reduces friction. And if we expect no other causal effect that may dominate that one, then often the effect on friction may predict the overall outcome of that intervention.

A basic operationalization of "causality is everywhere" is "if we ran an RCT on some effect with sufficiently many subjects, we'd always reach statistical significance" - which is an empirical claim that I think is true in "almost" all cases, even for "if I clap today, will it change the temperature in Tokyo tomorrow?". I think I get what you mean by "if causality is everywhere, it is nowhere" (similar to "a theory that can explain everything has no predictive power"), but my "causality is everywhere" claim is an at least in theory verifiable/falsifiable factual claim about the world.
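The "sufficiently many subjects" operationalization can be shown in simulation. Here a tiny true effect (an assumed, made-up d = 0.01 on unit-variance outcomes) is undetectable at n = 1,000 per group but reliably reaches significance at n = 1,000,000, using a simple two-sided z-test:

```python
# Sketch of "with enough subjects, any nonzero effect reaches
# significance": simulate an RCT with a tiny assumed true effect
# (d = 0.01) and compare p-values at two sample sizes.
import random
from statistics import NormalDist, mean

random.seed(42)

def rct_p_value(n_per_group: int, true_effect: float = 0.01) -> float:
    control = [random.gauss(0.0, 1.0) for _ in range(n_per_group)]
    treated = [random.gauss(true_effect, 1.0) for _ in range(n_per_group)]
    # z-test for the difference in means (known unit variance);
    # standard error of the difference is sqrt(2 / n).
    z = (mean(treated) - mean(control)) / (2.0 / n_per_group) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(rct_p_value(1_000))      # usually far from significant
print(rct_p_value(1_000_000))  # usually p < 0.05 despite the tiny effect
```

The expected z-statistic grows with the square root of n, so for any truly nonzero effect there is some sample size past which significance becomes near-certain.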

Of course "two things are causally connected" is not at all the same as "the causal connection is relevant and we should measure it / utilize it / whatever". My basic point is that assuming that something has no causal connection is almost always wrong. Maybe this happens to yield appropriate results, because the effect is indeed so small that you can simply act as if there was no causal connection. But I also believe that the "I believe X and Y have no causal connection at all" world view leads to many errors in judgment, and makes us overlook many relevant effects as well.
