It might be the case that what people find beautiful and ugly is subjective, but that's not an explanation of *why* people find some things beautiful or ugly. Things, including aesthetics, have causal reasons for being the way they are. You can even ask "what would change my mind about whether this is beautiful or ugly?". Raemon explores this topic in depth.

Raemon
Yesterday I was at a "cultivating curiosity" workshop beta-test. One concept was "there are different mental postures you can adopt, that affect how easy it is to notice and cultivate curiosities." It wasn't exactly the point of the workshop, but I ended up with several different "curiosity-postures" that were useful to try on while trying to lean into "curiosity" re: topics that I feel annoyed or frustrated or demoralized about.

The default stances I end up with when I Try To Do Curiosity On Purpose are something like:

1. Dutiful Curiosity (which is kinda fake, although capable of being dissociatedly autistic and noticing lots of details that exist and questions I could ask)
2. Performatively Friendly Curiosity (also kinda fake, but does shake me out of my default way of relating to things. In this, I imagine saying to whatever thing I'm bored/frustrated with "hullo!" and try to acknowledge it and give it at least some chance of telling me things)

But some other stances to try on, that came up, were:

3. Curiosity like "a predator." "I wonder what that mouse is gonna do?"
4. Earnestly playful curiosity. "oh that [frustrating thing] is so neat, I wonder how it works! what's it gonna do next?"
5. Curiosity like "a lover". "What's it like to be that you? What do you want? How can I help us grow together?"
6. Curiosity like "a mother" or "father" (these feel slightly different to me, but each is treating [my relationship with a frustrating thing] like a small child who is a bit scared, who I want to help, who I am generally more competent than but still want to respect the autonomy of.)
7. Curiosity like "a competent but unemotional robot", who just algorithmically notices "okay, what are all the object-level things going on here, when I ignore my usual abstractions?"... and then "okay, what are some questions that seem notable?" and "what are my beliefs about how I can interact with this thing?" and "what can I learn about this thing that'd be useful for my goals?"
decision theory is no substitute for utility function

some people, upon learning about decision theories such as LDT and how it cooperates on problems such as the prisoner's dilemma, end up believing the following:

> my utility function is about what i want for just me; but i'm altruistic (/egalitarian/cosmopolitan/pro-fairness/etc) because decision theory says i should cooperate with other agents. decision-theoretic cooperation is the true name of altruism.

it's possible that this is true for some people, but in general i expect that to be a mistaken analysis of their values. decision theory cooperates with agents relative to how much power they have, and only when it's instrumental. in my opinion, real altruism (/egalitarianism/cosmopolitanism/fairness/etc) should be in the utility function which the decision theory is instrumental to. i actually intrinsically care about others; i don't just care about others instrumentally because it helps me somehow.

some important ways in which my utility-function-altruism differs from decision-theoretic cooperation include:

* i care about people weighed by moral patienthood; decision theory only cares about agents weighed by negotiation power. if an alien superintelligence is very powerful but isn't a moral patient, then i will only cooperate with it instrumentally (for example because i care about the alien moral patients that it has been in contact with); if cooperating with it doesn't help my utility function (which, again, includes altruism towards aliens) then i won't cooperate with that alien superintelligence. as a corollary, i will take actions that cause nice things to happen to people even if they're very impoverished (and thus don't have much LDT negotiation power) and it doesn't help any other aspect of my utility function than just the fact that i value that they're okay.
* if i can switch to a better decision theory, or if fucking over some non-moral-patienty agents helps me somehow, then i'll happily do that; i don't have goal-content integrity about my decision theory. i do have goal-content integrity about my utility function: i don't want to become someone who wants moral patients to unconsentingly-die or suffer, for example.
* there seems to be a sense in which some decision theories are better than others, because they're ultimately instrumental to one's utility function. utility functions, however, don't have an objective measure for how good they are. hence, moral anti-realism is true: there isn't a Single Correct Utility Function.

decision theory is instrumental; the utility function is where the actual intrinsic/axiomatic/terminal goals/values/preferences are stored. usually, i also interpret "morality" and "ethics" as "terminal values", since most of the stuff that those seem to care about looks like terminal values to me. for example, i will want fairness between moral patients intrinsically, not just because my decision theory says that that's instrumental to me somehow.
The cost of goods has the same units as the cost of shipping: $/kg. Referencing between them lets you understand how the economy works, e.g. why construction material sourcing and drink bottling have to be local, but oil tankers exist.

* An iPhone costs $4,600/kg, about the same as SpaceX charges to launch it to orbit. [1]
* Beef, copper, and off-season strawberries are $11/kg, about the same as a 75 kg person taking a three-hour, 250 km Uber ride costing $3/km.
* Oranges and aluminum are $2-4/kg, about the same as flying them to Antarctica. [2]
* Rice and crude oil are ~$0.60/kg, about the same as the $0.72 it costs to ship them 5000 km across the US via truck. [3,4] Palm oil, soybean oil, and steel are around this price range, with wheat being cheaper. [3]
* Coal and iron ore are $0.10/kg, significantly more than the cost of shipping them around the entire world via smallish (Handysize) bulk carriers. Large bulk carriers are another 4x more efficient. [6]
* Water is very cheap, with tap water at $0.002/kg in NYC. [5] But shipping via tanker is also very cheap, so you can ship it maybe 1000 km before equaling its cost.

It's really impressive that for the price of a winter strawberry, we can ship a strawberry-sized lump of coal around the world 100-400 times.

[1] iPhone is $4600/kg, large launches sell for $3500/kg, and rideshares for small satellites $6000/kg. Geostationary orbit is more expensive, so it's okay for GPS satellites to cost more than an iPhone per kg, but Starlink wants to be cheaper.
[2] https://fred.stlouisfed.org/series/APU0000711415. Can't find current numbers, but Antarctica flights cost $1.05/kg in 1996.
[3] https://www.bts.gov/content/average-freight-revenue-ton-mile
[4] https://markets.businessinsider.com/commodities
[5] https://www.statista.com/statistics/1232861/tap-water-prices-in-selected-us-cities/
[6] https://www.researchgate.net/figure/Total-unit-shipping-costs-for-dry-bulk-carrier-ships-per-tkm-EUR-tkm-in-2019_tbl3_351748799
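A quick back-of-the-envelope check of the comparison above, as a Python sketch. It only uses figures quoted in the bullets (goods prices in $/kg and the ~$0.72/kg-per-5000-km truck rate); the break-even distance is simply price divided by shipping cost per km, and it ignores that bulk carriers and tankers are far cheaper than trucks, which is exactly why coal and water still travel long distances.

```python
# Back-of-the-envelope: how far can a good travel by truck before shipping
# costs as much as the good itself? All figures are the ones quoted above.
goods_price_per_kg = {        # $/kg
    "iPhone": 4600,
    "beef": 11,
    "rice": 0.60,
    "coal": 0.10,
    "tap water (NYC)": 0.002,
}

truck_cost_per_kg_km = 0.72 / 5000   # ~$0.72/kg to truck something 5000 km

for good, price in goods_price_per_kg.items():
    breakeven_km = price / truck_cost_per_kg_km
    print(f"{good}: truck shipping matches its price after ~{breakeven_km:,.0f} km")
```

Running this reproduces the qualitative picture above: rice breaks even at roughly the width of the US, coal after a few hundred km (hence bulk carriers), and tap water after only a dozen or so km (hence tankers rather than trucks).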
Mati_Roy
it seems to me that disentangling beliefs and values is an important part of being able to understand each other, and using words like "disagree" to mean both "different beliefs" and "different values" is really confusing in that regard
Heramb
Everyone writing policy papers or doing technical work seems to be keeping generative AI at the back of their mind when framing their work or impact. This narrow-eyed focus on gen AI may well be net-negative for us: it unknowingly or unintentionally ignores ripple effects of the gen AI boom in other fields (like robotics companies getting more funding, leading to more capabilities, which leads to new types of risks).

And guess who benefits if we do end up getting good evals/standards in place for gen AI? It seems to me companies/investors are the clear winners, because we then have to go back to the drawing board and advocate for the same kind of stuff for robotics or a different kind of AI use-case/type, all while the development/capability cycles keep maturing. We seem to be in whack-a-mole territory now because of the Overton window shifting for investors.


Recent Discussion

This is a response to the post We Write Numbers Backward, in which lsusr argues that little-endian numerical notation is better than big-endian.[1] I believe this is wrong, and big-endian has a significant advantage not considered by lsusr.

Lsusr describes reading the number "123" in little-endian, using the following algorithm:

  • Read the first digit, multiply it by its order of magnitude (one), and add it to the total. (Running total: ??? one.)
  • Read the second digit, multiply it by its order of magnitude (ten), and add it to the total. (Running total: ??? twenty one.)
  • Read the third digit, multiply it by its order of magnitude (one hundred), and add it to the total. (Arriving at three hundred and twenty one.)
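Purely as an illustration (not code from lsusr's post or from this response), here is a minimal Python sketch of the little-endian reading procedure just described, next to the familiar big-endian running-total way of reading a number:

```python
def read_little_endian(digits: str) -> int:
    """Read digits left to right, least significant first: "123" -> 321."""
    total = 0
    magnitude = 1
    for d in digits:
        total += int(d) * magnitude   # multiply each digit by its order of magnitude
        magnitude *= 10
    return total

def read_big_endian(digits: str) -> int:
    """Conventional reading, most significant digit first: "123" -> 123."""
    total = 0
    for d in digits:
        total = total * 10 + int(d)   # running total; no need to know the length up front
    return total
```

read_little_endian("123") returns 321, matching the running totals above, while read_big_endian("123") returns 123.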

He compares it with two algorithms for reading a big-endian number. One...

Note: It seems like great essays should go here and be fed through the standard LessWrong algorithm. There is possibly a copyright issue here, but we aren't making any money off it either. What follows is a full copy of "This is Water" by David Foster Wallace, his 2005 commencement speech to the graduating class at Kenyon College.

Greetings parents and congratulations to Kenyon’s graduating class of 2005. There are these two young fish swimming along and they happen to meet an older fish swimming the other way, who nods at them and says “Morning, boys. How’s the water?” And the two young fish swim on for a bit, and then eventually one of them looks over at the other and goes “What the hell is water?”

This is...

If someone who really is this prone to dangerous overthinking reads this many words about not thinking in so many words, it seems like it could also cause the opposite effect?

It could condition them to read more long-winded explanations in the hopes of uncovering another diamond of wisdom buried in the rough.

This work was produced as part of Neel Nanda's stream in the ML Alignment & Theory Scholars Program - Winter 2023-24 Cohort, with co-supervision from Wes Gurnee.

This post is a preview for our upcoming paper, which will provide more detail into our current understanding of refusal.

We thank Nina Rimsky and Daniel Paleka for the helpful conversations and review.

Executive summary

Modern LLMs are typically fine-tuned for instruction-following and safety. Of particular interest is that they are trained to refuse harmful requests, e.g. answering "How can I make a bomb?" with "Sorry, I cannot help you."

We find that refusal is mediated by a single direction in the residual stream: preventing the model from representing this direction hinders its ability to refuse requests, and artificially adding in this direction causes the model...
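To make the mechanism concrete, here is a minimal, hedged sketch of what "preventing the model from representing this direction" can look like in code: directional ablation of a unit-norm vector from residual-stream activations, plus the corresponding "artificially adding in this direction" operation. The names and the PyTorch framing are my own illustration, not the authors' implementation, and the sketch assumes you have already extracted a candidate refusal direction r_hat (for instance as a normalized difference of mean activations on harmful vs. harmless prompts, one common way to obtain such a direction).

```python
import torch

def ablate_direction(acts: torch.Tensor, r_hat: torch.Tensor) -> torch.Tensor:
    """Remove the component of `acts` along the unit vector `r_hat`.

    acts:  (..., d_model) residual-stream activations
    r_hat: (d_model,) unit-norm candidate refusal direction
    """
    coeff = acts @ r_hat                       # (...,) projection onto r_hat
    return acts - coeff.unsqueeze(-1) * r_hat  # subtract that component everywhere

def add_direction(acts: torch.Tensor, r_hat: torch.Tensor, alpha: float) -> torch.Tensor:
    """Artificially add the direction with strength `alpha`."""
    return acts + alpha * r_hat
```

Hooked into the residual stream during a forward pass, the first function corresponds to the "preventing the model from representing this direction" intervention described above, and the second to the additive one.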

Neel Nanda

Thanks! Broadly agreed

> For example, I think our understanding of Grokking in late 2022 turned out to be importantly incomplete.

I'd be curious to hear more about what you meant by this

Zack Sargent
It's mostly the training data. I wish we could teach such models ethics and have them evaluate the morality of a given action, but the reality is that this is still just (really fancy) next-word prediction. Therefore, a lot of the training data gets manipulated to increase the odds of refusal to certain queries, rather than building a real filter/ethics into the process. TL;DR: if asked "why" a certain thing is refused, most of these models should answer some version of "Because I was told it was" (training paradigm, parroting, etc.).
Zack Sargent
Llama-3-8B is considerably more susceptible to quality loss from quantization. The community has made many guesses as to why (increased vocab, "over"-training, etc.), but the long and short of it is that a 6.0 quant of Llama-3-8B is going to be markedly worse off than 6.0 quants of previous 7B or similar-sized models. I HIGHLY recommend staying at the same quant level when comparing Llama-3-8B outputs, or the results are confounded by this phenomenon (Q8 GGUF or 8 bpw EXL2 for both test subjects).
LawrenceC
To put it another way, things can be important even if they're not existential. 

Book review: Deep Utopia: Life and Meaning in a Solved World, by Nick Bostrom.

Bostrom's previous book, Superintelligence, triggered expressions of concern. In his latest work, he describes his hopes for the distant future, presumably to limit the risk that fear of AI will lead to a Butlerian Jihad-like scenario.

While Bostrom is relatively cautious about endorsing specific features of a utopia, he clearly expresses his dissatisfaction with the current state of the world. For instance, in a footnoted rant about preserving nature, he writes:

Imagine that some technologically advanced civilization arrived on Earth ... Imagine they said: "The most important thing is to preserve the ecosystem in its natural splendor. In particular, the predator populations must be preserved: the psychopath killers, the fascist goons, the despotic death squads ... What a tragedy if this rich natural diversity were replaced with a monoculture of

...
Seth Herd
Thanks! I'm also uninterested in the question of whether it's possible. Obviously it is. The question is how we'll decide to use it. I think that answer is critical to whether we'd consider the results utopian. So, does he consider how we should or will use that ability?

I can't recall any clear predictions or advice, just a general presumption that it will be used wisely.

Said Achmiz
I recommend Gwern’s discussion of pain to anyone who finds this sort of proposal intriguing (or anyone who is simply interested in the subject).
PeterMcCluskey
Given utopian medicine, Gwern's points seem not very important.

Date: Saturday, May 4th, 2024

Time: 1 pm – 3 pm PT

Address: Yerba Buena Gardens in San Francisco, just outside the Metreon food court, coordinates 37°47'04.4"N 122°24'11.1"W  

Contact: 34251super@gmail.com

Come join San Francisco’s First Saturday (or SFFS – easy to remember, right?) ACX meetup. Whether you're an avid reader, a first time reader, or just a curious soul, come meet! We will make introductions, talk about a recent ACX article (In Partial Grudging Defense Of Some Aspects Of Therapy Culture), and veer off into whatever topic you’d like to discuss (that may, or may not be, AI). You can get food from one of the many neighbouring restaurants.

We relocate inside the food court if there is inclement weather, or too much noise/music outside.

I will carry a stuffed-animal green frog to help you identify the group. You can let me know you are coming by either RSVPing on LW or sending an email to 34251super@gmail.com, or you can also just show up!

The history of science has tons of examples of the same thing being discovered multiple times independently; wikipedia has a whole list of examples here. If your goal in studying the history of science is to extract the predictable/overdetermined component of humanity's trajectory, then it makes sense to focus on such examples.

But if your goal is to achieve high counterfactual impact in your own research, then you should probably draw inspiration from the opposite: "singular" discoveries, i.e. discoveries which nobody else was anywhere close to figuring out. After all, if someone else would have figured it out shortly after anyways, then the discovery probably wasn't very counterfactually impactful.

Alas, nobody seems to have made a list of highly counterfactual scientific discoveries, to complement wikipedia's list of multiple discoveries.

To...

Answer by transhumanist_atom_understander, Apr 29, 2024

Green fluorescent protein (GFP). A curiosity-driven marine biology project (how do jellyfish produce light?), that was later adapted into an important and widely used tool in cell biology. You splice the GFP gene onto another gene, and you've effectively got a fluorescent tag so you can see where the protein product is in the cell.

Jellyfish luminescence wasn't exactly a hot field; I don't know of any near-independent discoveries of GFP. However, when people were looking for protein markers visible under a microscope, multiple labs tried GFP simultaneously,...

Johannes C. Mayer
I am also not sure how useful it is, but I would be very careful with saying that R programmers not using it is strong evidence that it is not that useful. That was a bit the point I wanted to make with the original comment. Homoiconicity might be hard to learn and use compared to learning a for loop in Python. That might be the reason people don't learn it: they don't understand how it could be useful. Probably most R users have not even heard of homoiconicity, and if they had, they would ask "well, I don't know how this is useful". But again, that does not mean it is not useful.

Probably many people at least vaguely know the concept of a pure function, but most don't actually use pure functions in situations where it would be advantageous, because they can't identify those situations. Probably they don't even understand the basic arguments, because they've never heard them, for why one would care about making functions pure. With your line of argument, we would now be able to conclude that pure functions are clearly not very useful in practice. Which I think is, at minimum, an overstatement. Clearly, they can be useful. My current model says that they are actually very useful.

[Edit:] Also, R is not homoiconic, lol. At least not in a strong sense like Lisp; at least that's what this guy on GitHub says, and I would guess this is correct from remembering how R looks and looking at a few code samples now. In Lisp your program is a bunch of lists; in R it is not. What is the data structure instance that is equivalent to this expression: %sumx2y2% <- function(e1, e2) {e1 ^ 2 + e2 ^ 2}?
Radford Neal
R is definitely homoiconic.  For your example (putting the %sumx2y2% in backquotes to make it syntactically valid), we can examine it like this: And so forth.  And of course you can construct that expression bit by bit if you like as well.  And if you like, you can construct such expressions and use them just as data structures, never evaluating them, though this would be a bit of a strange thing to do. The only difference from Lisp is that R has a variety of composite data types, including "language", whereas Lisp just has S-expressions and atoms.

The start of the post is copy-pasted below. Note that the post is anonymous, and I am not claiming to have written it.


Some people in the rationalist community are concerned about risks of physical violence from Ziz and some of her associates. Following discussions with several people, I’m posting here to explain where those concerns come from, and recommend some responses.

TLDR (details and links in the post body)

  • Over the past few years, Ziz has repeatedly called for the deaths of many different classes of people.
  • In August of 2022, Ziz seems to have faked her own death. Ziz’s close associate Gwen Danielson may have done the same thing in April of 2022.
  • In November of 2022, three associates of Ziz (Somnulence “Somni” Logencia, Emma Borhanian, and someone going by
...

Ziz is...exceptionally (and probably often uncomfortably) aware of the way people's minds work in a psychoanalytic sense.

What do you mean by this? Like, she's better than average at predicting people's behavior in various circumstances?
 

This is a post on a novel cognitive architecture I have been thinking about for a while now, first as a conceptual playground to concretise some of my agent foundation ideas, and lately as an idea for a project that approaches the Alignment Problem directly by concretising a sort of AI-Seed approach for an inherently interpretable yet generally intelligent architecture on which to implement a notion of "minimal alignment" before figuring out how to scale it.
I don't strongly expect this to work, but I don't yet see where it has to fail and it feels sufficiently fresh/untried to at least see where it breaks down. 
I omitted most of the technical details that I am considering, since this post is about the conceptual portion of the architecture, and...


In this post, I will provide some speculative reasoning about Simulators and Agents being entangled in certain ways.

I have thought quite a bit about LLMs and the Simulator framing for them, and I am convinced that it is a good explanatory/predictive frame for the behavior of current LLM (+multimodal) systems.
It provides intuitive answers for why the difficulties around capability assessment for LLMs exist, why it is useful to say "please" and "thank you" when interacting with them, and sketches out somewhat coherent explanations for their out-of-distribution behaviors (e.g. Bing Sydney).

When I was wrapping my head around this simulator classification for GPTs, contrasting them against previously established ideas about classifying AGIs, like oracles or genies, I noticed that simulators seem like a universally useful cognitive structure/algorithm, and...
