feddit.org

-1 points

7 months ago

mathematical average of internet dialog

It’s not. Whenever someone talks about how LLMs are just statistics, ignore them unless you know they are experts. One thing that convinces me that ANNs really capture something fundamental about how human minds work is that we share the same tendency to spout confident nonsense.

NιƙƙιDιɱҽʂ@lemmy.world

1 point

7 months ago

It has a tendency to behave exactly as the data it was ultimately trained on…due to statistics…lol

stratoscaster@lemmy.world

1 point

7 months ago

It literally is just statistics… wtf are you on about. It’s all just weights and matrix multiplication and tokenization
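For reference, the three ingredients named here (tokenization, weights, matrix multiplication) can be put together in a toy sketch. Everything in it is an assumption made for illustration: the four-word vocabulary, the hand-picked weight matrix `W`, and the single matrix product standing in for a whole network. A real LLM uses subword tokens, learned weights and many stacked layers, so this only shows the shape of the computation, ending in a softmax that turns the raw scores into a probability distribution over the next token.

```python
import math

# Hypothetical toy vocabulary; real tokenizers use subword units.
VOCAB = ["the", "cat", "sat", "mat"]
TOKEN_ID = {tok: i for i, tok in enumerate(VOCAB)}

# Made-up weights: one row per input token, one column per candidate next token.
W = [
    [0.1, 2.0, 0.3, 1.0],  # scores after "the"
    [0.2, 0.1, 2.5, 0.1],  # scores after "cat"
    [1.5, 0.1, 0.1, 0.4],  # scores after "sat"
    [0.5, 0.5, 0.5, 0.5],  # scores after "mat"
]


def tokenize(text):
    """Map whitespace-separated words to integer token ids."""
    return [TOKEN_ID[word] for word in text.lower().split()]


def one_hot(index, size):
    """Return a vector of zeros with a single 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def vec_times_matrix(vec, matrix):
    """Multiply a row vector by a matrix (plain-Python matrix multiplication)."""
    columns = range(len(matrix[0]))
    return [sum(v * row[j] for v, row in zip(vec, matrix)) for j in columns]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_probs(text):
    """Score every vocabulary entry as a continuation of the last token in `text`."""
    last_id = tokenize(text)[-1]
    logits = vec_times_matrix(one_hot(last_id, len(VOCAB)), W)
    return dict(zip(VOCAB, softmax(logits)))


probs = next_token_probs("the cat")
print(max(probs, key=probs.get))
```

Whether the final softmax step makes the whole thing "statistics" is exactly what the replies below argue about; the sketch only shows where a probability distribution enters the computation.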

Redex@lemmy.world

-1 points

7 months ago

Well, on one hand yes: when you’re training it, you’re telling it to try and mimic the input as closely as possible. But the result is still weights that aren’t gonna reproduce everything exactly, as it just isn’t possible to store everything in the limited amount of entropy the weights provide.

In the end, human brains aren’t that dissimilar: we also just have some weights and parameters (neurons, how sensitive they are and how many inputs they have) that then output something.

I’m not convinced that in principle this is that far from how human brains could work (they have a lot of minute differences but the end result is the same). I think that a sufficiently large, well-trained and well-configured model would be able to work like a human brain.

https://www.lesswrong.com/posts/XNBZPbxyYhmoqD87F/llms-and-computation-complexity

Natanael@slrpnk.net

-1 points

7 months ago

Not an LLM specifically. In particular, the lack of backtracking, the network depth limits and the interconnectivity limits set hard limits on capabilities.

https://garymarcus.substack.com/p/math-is-hard-if-you-are-an-llm-and

https://arxiv.org/abs/2401.11817

https://www.marktechpost.com/2023/08/01/this-ai-research-dives-into-the-limitations-and-capabilities-of-transformer-large-language-models-llms-empirically-and-theoretically-on-compositional-tasks/?amp

Humans have a completely different memory model and, in large part, a very different way of linking together learned concepts to form their world view and to develop interdisciplinary skills, allowing us to solve many kinds of highly complex tasks as long as we can keep enough of it in our memory.

-1 points

7 months ago

It’s all just weights and matrix multiplication and tokenization

See, none of these is statistics, as such.

Weights are maybe the closest, but they are supposed to represent the strength of a neural connection. This is originally inspired by neurobiology.

Matrix multiplication is linear algebra and encountered in lots of contexts.

Tokenization is a thing from NLP. It’s not what one would call a statistical method.

So you can see where my advice comes from.

Certainly there is nothing here that implies any kind of averaging going on.

Natanael@slrpnk.net

1 point

7 months ago

If there’s no averaging, why do they repeat stereotypes so often?

-1 points

7 months ago

Why would averaging lead to repetition of stereotypes?

Anyway, it’s hard to say why LLMs output what they do. GPTisms may have to do with the system prompt or they may result from the fine-tuning. Either way, they don’t seem very internet average to me.

Natanael@slrpnk.net

1 point

7 months ago

The TLDR is that pathways between nodes corresponding to frequently seen patterns (stereotypical sentences) get strengthened more than others, and therefore it becomes more likely that such a pathway gets activated over others when the model is given a prompt. These strengths correspond to probabilities.
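As a rough illustration of the "frequent patterns end up more probable" idea, here is a toy bigram counter. This is a deliberate simplification and an assumption on my part: a transformer is trained by gradient descent, not by counting, and the four-sentence corpus below is made up. The sketch only shows how a continuation that appears more often in the training text ends up with a higher next-word probability.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Normalize the follow-up counts for `prev` into probabilities."""
    total = sum(counts[prev].values())
    return {word: n / total for word, n in counts[prev].items()}


# Made-up corpus: the placeholder sign-off is three times as common as a real name.
corpus = [
    "best regards [name]",
    "best regards [name]",
    "best regards [name]",
    "best regards alice",
]

counts = train_bigram(corpus)
probs = next_word_probs(counts, "regards")
print(probs)
```

In this toy setup the placeholder `[name]` follows "regards" three times out of four, so it gets three quarters of the probability mass.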

Have you seen how often they’ll sign a requested text with a name placeholder? Have you seen the typical grammar they use? The way they write is a hybridization of the most common types of texts they have seen in samples, weighted by occurrence (which is a statistical property).

It’s like how mixing dog breeds often results in something which doesn’t look exactly like either breed but which has features from every breed. GPT/LLM models mix in stuff like academic writing, redditisms and stackoverflowisms, quoraisms, linkedin-postings, etc. You get this specific dryish text full of hedging language and mixed types of formalisms, a certain answer structure, etc.

-1 points

7 months ago

That’s a) not how it works and b) not averaging.
