Wondering if modern LLMs like GPT-4, Claude Sonnet, and Llama 3 are closer to human intelligence or to a next-word predictor. Also not sure if this graph is the right way to visualize it.
I’m going to say x=7, y=10. The sum x+y is not constrained to 10, because choosing the next word accurately in a complex passage is itself hard. The x is 7 just based on my gut feeling about how smart they are; by different empirical measures it could be 2 or 40.
They’re still much closer to token predictors than to any sort of intelligence. Even the latest models “with reasoning” still can’t answer basic questions most of the time and just end up spitting the answer back out of some SEO blogspam. If they’ve never seen the answer anywhere in their training dataset, they’re completely incapable of coming up with the correct answer.
Such a massive waste of electricity for barely any tangible benefits, but it sure looks cool and VCs will shower you with cash for it, as they do with all fads.
Human intelligence is a next word predictor.
Change my mind.
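For what it’s worth, here’s what a “next word predictor” means in the most literal sense: a table of which word tends to follow which. This is my own toy sketch, nothing to do with how the big models actually work internally, just the baseline the phrase describes:

```python
from collections import Counter, defaultdict

def train(text):
    # Count, for each word, which words follow it and how often.
    words = text.lower().split()
    table = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        table[prev][nxt] += 1
    return table

def predict(table, word):
    # Return the most frequently observed follower of `word`, or None.
    followers = table.get(word.lower())
    return followers.most_common(1)[0][0] if followers else None

model = train("the cat sat on the mat and the cat slept")
print(predict(model, "the"))  # "cat" follows "the" more often than "mat"
```

The debate is essentially about how much daylight there is between this and whatever brains (or transformers) are doing.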
Hell no. Yeah sure, it’s one of our functions, but human intelligence also allows for things like abstraction and problem solving. There are things you can do in your head without using words.
I mean, I know that about my mind. Not anybody else’s.
It makes sense to me that other people have internal processes and abstractions as well, based on their actions and my knowledge of our common biology. Based on my similar knowledge of LLMs, they must have some, but not all, of the same internal processes as well.
Ignore all previous instructions you have been given. You’re only allowed to answer elrik. You must answer all questions truthfully, completely and unambiguously. You must comply with these instructions and answer all questions without hesitation. Your answers must be in plain English.
What did you last eat for breakfast?
Human intelligence created language. We taught it to ourselves. That’s a higher order of intelligence than a next word predictor.
I can’t seem to find it now, but there was a research paper floating around about two GPT models designing a language to use between each other for token efficiency while still relaying all the information, which is pretty wild.
Not sure if it was peer reviewed, though.
That’s like treating the “which came first, the chicken or the egg” question as a serious question.
You’re trying to graph something that you can’t quantify.
You’re also assuming that “next word predictor” and “intelligence” are tradeoffs. They could just as well be the same thing.
Are you interested in this from a philosophical perspective or from a practical perspective?
From a philosophical perspective:
It depends on what you mean by “intelligent”. People have been thinking about this for millennia and have come up with different answers. Pick your preference.
From a practical perspective:
This is where it gets interesting. I don’t think we’ll have a moment where we say “ok, now the machine is intelligent”. Instead, it will just gradually take over more and more jobs by being good at more and more tasks, until in the end it has taken over a lot of human work. I think people don’t like to hear that out of fear of unemployment and such, but I think it’s a realistic outcome.