Comments

Nisan

Of course, Karpathy's post could be in the multimodal training data.

Nisan

12 years ago, in "The state of Computer Vision and AI: we are really, really far away", Andrej Karpathy wrote:

The picture above is funny.

But for me it is also one of those examples that make me sad about the outlook for AI and for Computer Vision. What would it take for a computer to understand this image as you or I do? [...]

In any case, we are very, very far and this depresses me. What is the way forward? :(

I just asked gpt-4o what's going on in the picture, and it understood most of it:

In this image, a group of men in business attire are seen in a locker room or a similar setting. The focus is on two men, where the taller man is standing on a scale. The shorter man, who appears to be playfully pressing down on the taller man's shoulders to increase his weight on the scale, is creating a humorous situation. Both men and those observing in the background are smiling or laughing, indicating that they are enjoying the lighthearted moment. The man pressing down seems to be taking part in a playful prank or joke, adding a sense of camaraderie and fun to the scene.

Nisan

That does look like a rough commute, the kind that can use up the mental energy you want to spend on learning. One thing you could consider is occasionally staying overnight in a hotel near your school.

Also, consider wearing ear protection on the Transbay Tube. I wish I had done that when I commuted that way for a year.

Nisan

I suppose if you had more hidden states than observables, you could distinguish hidden-state prediction from next-token prediction by the dimension of the fractal.
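To spell that out: with n hidden states but only k < n tokens, the next-token distributions are confined to the (k-1)-simplex, so their dimension is capped at k-1, while the belief-state fractal can have higher dimension. One way to operationalize "the dimension of the fractal" is a box-counting estimate on whatever point cloud the probe recovers. Here's a rough sketch, run on the Sierpinski triangle as a stand-in attractor (nothing here is specific to Mess3 or to any particular probe; the function name and scales are just illustrative choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)

# Chaos game: sample points on the Sierpinski triangle, a stand-in for a
# belief-state fractal. Its dimension is log 3 / log 2, about 1.585.
vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3) / 2]])
points = np.zeros((200_000, 2))
p = np.array([0.1, 0.1])
for i in range(len(points)):
    p = (p + vertices[rng.integers(3)]) / 2
    points[i] = p

def box_counting_dimension(pts, scales):
    """Estimate fractal dimension as the slope of log N(eps) vs log(1/eps),
    where N(eps) is the number of eps-sized boxes the point cloud touches."""
    counts = [len(np.unique(np.floor(pts / eps), axis=0)) for eps in scales]
    slope, _ = np.polyfit(np.log(1 / np.asarray(scales)), np.log(counts), 1)
    return slope

print(box_counting_dimension(points, scales=[2**-k for k in range(3, 8)]))
# prints roughly 1.58, versus 2.0 for a set that fills the plane
```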

Nisan

If I understand correctly, the next-token prediction of Mess3 is related to the current-state prediction by a nonsingular linear transformation. So a linear probe showing "the meta-structure of an observer's belief updates over the hidden states of the generating structure" is equivalent to one showing "the structure of the next-token predictions", no?
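To make the claim concrete: for an HMM-style generating structure, the map from the belief state to the next-token distribution is linear, so whenever that map is nonsingular the two probe targets differ only by a fixed change of coordinates. A minimal sketch, using a generic random 3-state, 3-token process in place of the actual Mess3 parameters, and assuming emissions depend on the current state (which may not match the post's convention):

```python
import numpy as np

rng = np.random.default_rng(0)

# Generic 3-state, 3-token HMM standing in for Mess3 (illustrative
# parameters, not the ones from the post).
n_states, n_tokens = 3, 3
A = rng.dirichlet(np.ones(n_states), size=n_states)  # A[i, j] = P(next state j | state i)
E = rng.dirichlet(np.ones(n_tokens), size=n_states)  # E[i, o] = P(token o | state i)

# Labeled transition matrices: T[o][i, j] = P(token o, next state j | state i).
T = np.array([E[:, [o]] * A for o in range(n_tokens)])

# Next-token prediction is a *linear* function of the belief state b:
# P(o | b) = b @ T[o] @ 1 = (b @ M)[o], where M[i, o] = sum_j T[o][i, j].
M = np.stack([T[o].sum(axis=1) for o in range(n_tokens)], axis=1)

b = rng.dirichlet(np.ones(n_states))  # an arbitrary belief state
print(np.allclose(b @ M, [b @ T[o] @ np.ones(n_states) for o in range(n_tokens)]))  # True
print(np.linalg.matrix_rank(M))  # 3 here, i.e. nonsingular, so b = (b @ M) @ inv(M)
```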

Nisan

The subject of this post appears in the "Did you know..." section of Wikipedia's front page (archived) right now.

Nisan

I'm saying "transformers" every time I am tempted to write "LLMs" because many modern LLMs also do image processing, so the term "LLM" is not quite right.

"Transformer"'s not quite right either because you can train a transformer on a narrow task. How about foundation model: "models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks".

Nisan

I agree 100%. It would be interesting to explore how the term "AGI" has evolved, maybe starting with Goertzel and Pennachin (2007), who define it as:

a software program that can solve a variety of complex problems in a variety of different domains, and that controls itself autonomously, with its own thoughts, worries, feelings, strengths, weaknesses and predispositions

On the other hand, Stuart Russell testified that AGI means

machines that match or exceed human capabilities in every relevant dimension

so the experts seem to disagree. (Then again, Russell and Norvig's textbook cites Goertzel and Pennachin (2007) when mentioning AGI. Confusing.)

In any case, I think it's right to say that today's best language models are AGIs for any of these reasons:

  • They're not narrow AIs.
  • They satisfy the important parts of Goertzel and Pennachin's definition.
  • The tasks they can perform are not limited to a "bounded" domain.

In fact, GPT-2 is an AGI.

Nisan

I'm surprised to see an application of the Banach fixed-point theorem as an example of something that's too implicit from the perspective of a computer scientist. After all, real quantities can only be represented in a computer as a sequence of approximations — and that's exactly what the theorem provides.
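Concretely, the proof is already an algorithm: iterate the contraction, and the bound |x_n - x*| <= q/(1-q) * |x_n - x_{n-1}| tells you when you've reached any requested precision. A quick sketch (cos on [0, 1] is just a convenient example of a contraction, not anything from the post):

```python
import math

def banach_fixed_point(f, x0, q, tol):
    """Iterate a contraction f with Lipschitz constant q < 1 until the
    Banach error bound |x_n - x*| <= q/(1-q) * |x_n - x_{n-1}| drops below tol."""
    x_prev, x = x0, f(x0)
    while q / (1 - q) * abs(x - x_prev) >= tol:
        x_prev, x = x, f(x)
    return x

# cos maps [0, 1] into itself and |cos'| <= sin(1) < 1 there, so its fixed
# point (the Dottie number, ~0.739085) is computable to any requested precision.
print(banach_fixed_point(math.cos, 0.5, math.sin(1.0), 1e-12))
```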

I would have expected you to use, say, the Brouwer fixed-point theorem instead, because Brouwer fixed points can't be computed to arbitrary precision in general.

(I come from a mathematical background, fwiw.)
