13 points

Should the research he’s discussing also be disregarded? https://arxiv.org/pdf/2410.05229

3 points

Gary Marcus should be disregarded because he’s emotionally invested in The Bitter Lesson being wrong. He really wants LLMs to not be as good as they already are. He’ll take any interesting “here’s a limitation we found” research and spin it into “LLMS BTFO, IT’S SO OVER”.

The research is interesting for helping improve LLMs, but that’s the extent of it. I would not be worried about the limitations the paper found for a number of reasons:

  • There doesn’t seem to be any reason to believe that there’s a ceiling on scaling up
  • LLMs’ reasoning abilities improve with scale (notice that for the kiwi example, the paper shows answers from o1-mini and llama3-8B, which are much smaller models with much more limited capabilities; GPT-4o got the problem correct when I tested it, without any special prompting techniques or anything)
  • Techniques such as RAG and Chain of Thought help immensely on many problems
  • Basic prompting techniques help, like “Make sure you evaluate the question to ignore extraneous information, and make sure it’s not a trick question”
  • LLMs are smart enough to use tools. They can go “Hey, this looks like a math problem, I’ll use a calculator”, just like a human would (a minimal sketch of such a tool-dispatch loop follows this list)
  • There’s a lot of research happening very quickly here. For example, LLMs improve at math when you use a different tokenization method, because it changes how the model “sees” the problem (see the tokenization demo after this list)
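
Since the tool-use point is easy to make concrete, here’s a minimal sketch of such a tool-dispatch loop in Python. Everything here is illustrative: `llm` is a hypothetical callable, and the JSON tool-call convention is made up; real tool-use APIs (OpenAI function calling, etc.) differ in the details.

```python
import json

def calculator(expression: str) -> str:
    # eval() is fine for a sketch; a real system would use a safe math parser.
    return str(eval(expression))

TOOLS = {"calculator": calculator}

def answer(question: str, llm) -> str:
    # `llm` (hypothetical) takes the message history and returns either plain
    # text (a final answer) or a JSON tool request such as
    # {"tool": "calculator", "input": "44 + 58 - 5"}.
    history = [{"role": "user", "content": question}]
    reply = llm(history)
    for _ in range(5):  # cap the number of tool round-trips
        try:
            request = json.loads(reply)
        except json.JSONDecodeError:
            return reply  # plain text means the model is done
        result = TOOLS[request["tool"]](request["input"])
        history.append({"role": "tool", "content": result})
        reply = llm(history)
    return reply  # give up after 5 round-trips and return what we have
```

The point is that the model doesn’t need to do the arithmetic itself; it only needs to recognize that a calculator is the right tool and hand the expression off.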
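
And to make the tokenization point concrete, here’s a tiny demo using the real `tiktoken` library. The exact splits depend on the encoding, but cl100k_base tends to chunk runs of digits into multi-digit pieces, which is part of why changing the digit representation can change arithmetic performance.

```python
# pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # the GPT-4 / GPT-3.5 tokenizer

for s in ["12345 + 67890", "1 2 3 4 5"]:
    ids = enc.encode(s)
    # Decode each token id individually to see how the string was split.
    print(f"{s!r} -> {[enc.decode([i]) for i in ids]}")
```

Spacing the digits out forces roughly one token per digit, so the model literally “sees” a different problem.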

Until we hit a wall and really can’t find a way around it for several years, this sort of research falls into “huh, interesting” territory for anybody who isn’t a researcher.

9 points

Actually, we already know there are diminishing returns from scaling. Furthermore, I would argue that there are inherent limits to simply using correlations in text as the basis for the model. Human reasoning isn’t primarily based on language; we create an internal model of the world that acts as a shared context. Language is rooted in that model, and that’s what allows us to communicate effectively and understand the actual meaning behind words. Skipping that step leads to the problems we’re seeing with LLMs.

That said, I agree they are a tool, and they obviously have uses. I just think that they’re going to be a part of a bigger tool set going forward. Right now there’s an incredible amount of hype associated with LLMs. Once the hype settles we’ll know what use cases are most appropriate for them.

3 points

The whole “it’s just autocomplete” line is a comforting mantra. A sufficiently advanced autocomplete is indistinguishable from intelligence. LLMs provably have a world model, just like humans do. They build that model by experiencing the universe through the medium of human-generated text, which is far more limited than human sensory input but has already allowed for some very surprising behavior.

We’re not seeing diminishing returns yet, and in fact we’re going to see some interesting things happen as we start hooking up sensors and cameras as direct input, instead of these models building their world model indirectly through text alone. Let’s see what happens in five years or so before declaring that there are diminishing returns.

