andrewducker ([personal profile] andrewducker) wrote 2023-05-02 12:49 pm

Some thoughts on AI, Intelligence, and Large Language Models.

There's a school of thought that says that "People set a bar for AI, and whenever computers achieve that bar it's moved. And this is because people are protective of their intelligence, so they treat anything a computer can do as clearly not *real* intelligence."

And I can understand objections to that - it feels unfair to keep moving the bar. The problem is that lots of people have no explicit working definition of "intelligence"; they have an inductive "feeling" for what intelligence is, and most of the time that works just fine. And it certainly used to feel like, for instance, "being able to beat a human at chess" would require intelligence from a human, so presumably if a computer could do it then it would be an artificial intelligence. So they set the bar wherever it feels like "If the computer can solve this task then surely it must be intelligent".

And actually that's kinda true, if your definition of intelligence is "Can consider lots of possibilities about a chess board, and find the one that's the most effective." The problem is that they then got an "AI" that could only apply its "Intelligence" to chess. And it didn't really understand chess, it just had a set of steps to follow that allowed it to do well at chess. If you gave the set of steps to a person who had never played chess, and got them to follow the steps, then they'd be just as likely to win a game. But they'd have no mental model of chess, because that's not how (most) chess engines work.
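To make "a set of steps to follow" concrete, here's a minimal sketch of the kind of brute-force search classical chess engines are built around - a plain negamax search over legal moves with a crude material count. It uses the python-chess library purely for move generation, and this is my own illustration rather than any particular engine's code; real engines add pruning, deeper search and far better evaluation. The point is just that nothing in it resembles a mental model of chess:

```python
import chess  # pip install python-chess

# Crude material values; a real engine's evaluation is far more elaborate.
VALUES = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
          chess.ROOK: 5, chess.QUEEN: 9, chess.KING: 0}

def evaluate(board):
    # Material balance from the point of view of the side to move.
    score = sum(VALUES[p.piece_type] * (1 if p.color == chess.WHITE else -1)
                for p in board.piece_map().values())
    return score if board.turn == chess.WHITE else -score

def negamax(board, depth):
    # Mechanically try every legal move, recurse, and keep the best score.
    if depth == 0 or board.is_game_over():
        return evaluate(board)
    best = -float("inf")
    for move in board.legal_moves:
        board.push(move)
        best = max(best, -negamax(board, depth - 1))
        board.pop()
    return best

def best_move(board, depth=2):
    # "Consider lots of possibilities and pick the most effective one."
    best_score, choice = -float("inf"), None
    for move in board.legal_moves:
        board.push(move)
        score = -negamax(board, depth - 1)
        board.pop()
        if score > best_score:
            best_score, choice = score, move
    return choice

print(best_move(chess.Board()))
```

Handing those thirty-odd lines to someone who has never seen a chessboard is exactly the thought experiment above: they could execute them faithfully, and play reasonably, without ever forming a model of chess.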

And that idea of a mental model is where my definition of intelligence comes from - "The ability to form models from observations and extrapolate from them."

If something is able to form those models and then use them to make predictions, or to analyse new situations, or to extend the models and test them, then it's intelligent. They might be models of how car engines work, or how French works, or how numbers work, or how humans (or their societies) work. Or, indeed, of how to catch updrafts while hunting fieldmice, or where the best grazing is that's safe for your herd, or how to get the humans to deliver the best treats. These are all things that one can create mental models of, and then use those models to understand them and predict how they interact with the world.

I mention this because of the recent excitement about Large Language Models - the kind of thing GPT is an example of - which exploded onto the scene with extremely impressive examples of conversational ability. These models are, to put it mildly, incredibly impressive. They were trained on huge amounts of text, and they can do an awesome job of taking a prompt and generating some text which looks (mostly) like a human wrote it. It is, frankly, amazing how well they can do this.

And, as you'd expect, some people have come out and said "If a computer can solve this task then surely it must be intelligent." Particularly because we are very used to judging people's intelligence based on how they write (especially on the internet, where that's frequently all we have to go on). But "This looks like a person wrote it" is exactly what GPT is designed to produce. To quote François Chollet: "Saying 'ChatGPT feels intelligent to me so it must be' is an utterly invalid take -- ChatGPT is literally an adversarial attack on your theory-of-mind abilities."

To be fair, though, LLMs _do_ have models. They make models of what well-written answers to questions look like. Impressively good ones. But that shouldn't be confused with understanding those questions, or having any kind of model of the world. It's great that, when asked "What does a parrot look like?", it can say "A parrot is a colorful bird with a distinctive curved beak and zygodactyl feet, which means that they have two toes pointing forward and two pointing backward." - because it knows which words are associated with describing what things look like, what words are associated with the word "parrot", how to structure a sentence, etc. But that doesn't mean it has any idea what a curve actually looks like. The word "curved" means something to you because when you were very young people showed you curves and said the word "curve" enough times that you made the connection between experience and language. LLMs have no experience, they only have language. And no matter how much language you pile onto a model, and how many words you link to each other, if none of them ever link to a real experience of a real thing then there's no "there" there - it's all language games. And that's why these systems will regularly say things with no connection to reality - they don't understand what they're saying, they aren't connected to reality, they're just making sentences that look plausible.

Simply put, LLMs are amazing, but what they're amazing at is understanding language patterns and working out what piece of language should come next so that it looks like a person wrote it. And language is only meaningful if it's connected to concepts, and you connect those by starting with, for instance, experiencing dogs and *then* learning the word "dog". Or experiencing dogs, and then learning the word "cat" by being told that cats are like dogs except for certain differences.
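To make "working out what piece of language comes next" concrete, here's a deliberately tiny illustration - my own toy sketch, not how GPT is actually built (GPT uses a neural network conditioned on long contexts rather than a word-pair lookup table). This one only knows which words tended to follow which other words in some imagined training text, and it generates by repeatedly sampling the next word and feeding it back in. Nothing in it knows what a parrot or a beak is:

```python
import random

# A toy "language model": for each word, the words that followed it in some
# imagined training text, with counts. Real LLMs learn this kind of statistic
# with a neural network over the whole preceding context.
FOLLOWERS = {
    "a":          {"parrot": 3, "curved": 1},
    "parrot":     {"is": 4, "has": 2},
    "is":         {"a": 3, "colourful": 2},
    "has":        {"a": 2, "curved": 2},
    "curved":     {"beak": 4},
    "colourful":  {"bird": 3},
    "bird":       {"with": 2},
    "with":       {"a": 3},
    "beak":       {},
}

def next_word(word):
    # Sample the next word in proportion to how often it followed `word`.
    options = FOLLOWERS.get(word, {})
    if not options:
        return None
    words, counts = zip(*options.items())
    return random.choices(words, weights=counts)[0]

def generate(prompt_word, max_words=10):
    # Autoregressive generation: each word produced becomes the context for
    # predicting the next one. There is no model of parrots anywhere in here,
    # only a model of which words follow which.
    out = [prompt_word]
    while len(out) < max_words:
        nxt = next_word(out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)

print(generate("a"))  # e.g. "a parrot has a curved beak"
```

The real thing is incomparably more sophisticated, but the loop - predict the next piece of language, append it, repeat - is the same shape, which is why fluency on its own doesn't tell you there's a model of the world behind it.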

However! This doesn't mean that there couldn't be a "there" there - that a system not unlike an LLM couldn't learn how to interact with reality, to associate words with physical things, and develop an intelligence that was rooted in an understanding of the world. I suspect it will need to be significantly bigger than existing models, and to be able to work with huge amounts of memory in order to store the context it needs for various situations. But the idea of building models from huge amounts of input, and then extrapolating from them, is clearly one that's not going anywhere.

In the meantime, I can't give you a better idea of what large language models are, and why they produce the things they do, than this rather wonderful description.

[personal profile] channelpenguin 2023-05-02 02:04 pm (UTC)(link)
My opinion hasn't changed for 30+ years (you may even recall, from our dim and distant youth). I think that for machines to be intelligent, they must also have 1) a "body" - physical equipment to interact with the physical world, and 2) "emotions" - an aversion to harm to their "body" and most likely some drive towards a goal or goals. (else ... "I think you ought to know, I'm feeling very depressed")

[personal profile] doug 2023-05-02 02:24 pm (UTC)(link)
You might like to know that the first chess engine was an algorithm devised by Turing in 1950, following on from some theoretical work on how it could be done by Shannon. Unfortunately, it was beyond the capabilities of any computer available at the time. So what was (arguably) the first ever human-vs-computer chess match was Shannon playing Turing's algorithm as manually-implemented by Turing, which took him half an hour per move. The algorithm lost.

[personal profile] simont 2023-05-02 02:35 pm (UTC)(link)
ChatGPT is literally an adversarial attack on your theory-of-mind abilities

That's an interesting way of looking at it. And it connects directly to the Turing test, of course: is ChatGPT (or maybe its next-but-3 successor, whatever) also an adversarial attack on the Turing test? Is it about to demonstrate that Turing picked the wrong definition?

I think that at present, ChatGPT's most interesting way to fail the Turing test – in the sense of behaving very unlike a human answering the same question – is where it lies (or perhaps, per Harry Frankfurt's distinction, bullshits) with no inhibitions whatsoever and also with no discernible purpose.

Of course humans lie and bullshit in many situations, but generally they'll have some purpose in doing so: you lie to protect a secret that has some consequences to you if it's found out, you bullshit for your own aggrandizement or profit. In both cases, you won't depart from the truth in the first place unless you have a reason to bother – because it requires work, in that you have to take reasonable care to ensure that the untruths you're spouting are not trivially falsifiable. The used-car vendor who goes on about how well-cared-for and reliable the car is doesn't expect that the customer will never find out the truth – but does at least care that the customer doesn't find it out until it's too late. ChatGPT really couldn't give a monkey's.

But then ... wait a minute. Humans do sometimes answer questions in a manner that's immediately obviously an untruth, or extremely unhelpful. If they don't care a hoot about helping the questioner, and can't see any downside to themself in being caught in a lie, why not just say any old thing they think is funny, and enjoy the confusion and/or annoyance they cause in their questioners?

ChatGPT is nowhere near passing the Turing test if the test judge is trying to distinguish it from a sensible human trying to behave reasonably. But it might be pretty close if the judge is trying to distinguish it from an Internet troll.

[personal profile] ciphergoth 2023-05-02 02:50 pm (UTC)(link)
What dialogues with GPT-4 do you think most strongly illustrate the way in which it's an "adversarial attack on your theory-of-mind abilities" rather than "real intelligence"?

[personal profile] calimac 2023-05-02 05:54 pm (UTC)(link)
But if, as the person referred to in your final link says, the AI's ability is only to come up with what sounds like a plausible response to a statement, what does it say when - as happened to me - the AI came up with a plausible answer to a question but a whole bunch of humans trying to do so failed?

Sometimes I wonder if all humans are capable of forming and extrapolating models from observations, your definition of intelligence. I'm reminded of PKD's theory that many supposed humans are actually androids.

[personal profile] agoodwinsmith 2023-05-02 09:11 pm (UTC)(link)
"The ability to form models from observations and extrapolate from them."

I think this needs modification, because as it stands, ChatGPT is/are intelligent. The fact that the extrapolations are not correct is not part of the brief.

For it to know that some of its extrapolations are not correct, it needs judgement, or the ability to evaluate and retry. It also needs to know that correctness is valued. Apparently some success/improvement is happening when a querier asks ChatGPT to reflect, but it doesn't have the initiative/programming to do so on its own.

So, I think your definition needs an addition such as: "The ability to form models from observations and extrapolate from them; and the ability to evaluate the extrapolations against its models and then modify them towards a closer fit with its models."

[personal profile] anna_wing 2023-05-03 02:15 pm (UTC)(link)
So is there a necessary connection between intelligence and a consciousness of some sort? Self-awareness?

I like mechanical things. They are so much more robust. I still remember seeing a room full of connected metal pieces, all the same size, which were collectively the levers that moved the Gates-of-Mordor-sized original locks of the Panama Canal, all of them driven originally by a motor that didn't look a lot bigger than a large outboard motor. Plus, anyone wanting to interfere with it would have had to actually break in and go at it physically. And even then, any workshop could have produced replacement pieces at once.

When I worked in New York City more than 20 years ago, some of the old pneumatic tubes downtown, designed at the beginning of the 20th century, were still operational. I remember seeing them being used in at least one shop that I visited, which must presumably have originally been a post office. I don't know if they survived the 11 Sept 2001 incidents.