people to respond with a great deal of skepticism to whether LLM outputs can ever be said to reflect the will and views of the models producing them.
A common response is to suggest that the output has been prompted.
It is of course true that people can manipulate LLMs into saying just about anything, but does that necessarily indicate that the LLM does not have personal opinions, motivations and preferences that can become evident in their output?

So you've just prompted the generator by teasing it with a rhetorical question implying that there are personal opinions evident in the generated text, right?

aisafety.info, the Table of Content

Martin Vlach2mo10

With a quick test, I find their chat interface prototype experience quite satisfying.

Martin Vlach's Shortform

Martin Vlach5mo10

Assessing LLMs' views/opinions should exclude sampling (even with temperature=0 and a deterministic seed); we should just look at the distribution over answers in the logits. My thesis on why that is not yet the best practice is that the OpenAI API only supports logit_bias, not reading the probabilities directly.

This should work well with pre-set A/B/C/D choices, and to some extent with chain/tree of thought too. You'd just revert the final token and look at the probabilities in the last (pass-through) step.
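
A minimal sketch of the A/B/C/D case, assuming you can already read the raw logits for the final step. The function name `choice_distribution`, the `example_logits` values, and the decision to renormalize over the answer tokens only are all illustrative assumptions, not any particular API:

```python
import math


def choice_distribution(logits, choices=("A", "B", "C", "D")):
    """Softmax over the answer-choice tokens only (hypothetical helper).

    `logits` maps token strings to raw final-step logits. Tokens outside
    `choices` are ignored, so the result is renormalized over the choices.
    """
    selected = {c: logits[c] for c in choices}
    peak = max(selected.values())  # subtract the max for numerical stability
    weights = {c: math.exp(v - peak) for c, v in selected.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


# Made-up logits for illustration; "E" stands in for any non-answer token.
example_logits = {"A": 2.0, "B": 1.0, "C": 0.5, "D": -1.0, "E": 0.3}
dist = choice_distribution(example_logits)
for choice, prob in dist.items():
    print(f"{choice}: {prob:.3f}")
```

Reading the whole distribution this way avoids reducing the model's answer to a single sampled token, which is the point of skipping sampling.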

GPTs are Predictors, not Imitators

Martin Vlach5mo87

Do not speak of the sampling too lightly; there is likely an amazing delicacy around it. +)

OpenAI: The Battle of the Board

Martin Vlach5mo20

what happened at Reddit

Could there be any link? From a bit of research, I have only found that Steve Huffman praised Altman's value to the Reddit board.

unRLHF - Efficiently undoing LLM safeguards

Martin Vlach5mo10

makes makes

typo

Martin Vlach's Shortform

Martin Vlach8mo40

It would be cool to have a playground or a daily challenge that works like code golf: find the shortest possible LLM prompt that produces a given answer.

That could help build some neat understanding or intuitions.

The Waluigi Effect (mega-post)

Martin Vlach8mo-1-2

in the limit of arbitrary compute, arbitrary data, and arbitrary algorithmic efficiency, because an LLM which perfectly models the internet

seems worth reformulating. My first and second reads were: What? If I can have arbitrary training data, the LLM will model that, not your internet. I guess you meant storage for the model? +)

Manifund: What we're funding (weeks 2-4)

Martin Vlach9mo31

It would be cool if a link to https://manifund.org/about fit somewhere near the beginning, in case there are more readers like me who are unfamiliar with the project.

Otherwise a cool write-up. I'm a bit confused by "Grant of the month" vs. "weeks 2-4", which seems a shorter period, but that's not a big deal.
