"prompt engineering"(lemmy.world)

Anyway, it’s hard to say why LLMs output what they do. GPTisms may have to do with the system prompt, or they may result from the fine-tuning. Either way, they don’t seem very internet-average to me.

Natanael@slrpnk.net

1 point

7 months ago

The TLDR is that pathways between nodes corresponding to frequently seen patterns (stereotypical sentences) get strengthened more than others, so those pathways become more likely to be activated when the model is given a prompt. These strengths correspond to probabilities.

Have you seen how often they’ll sign a requested text with a name placeholder? Have you seen the typical grammar they use? The way they write is a hybridization of the most common types of text they have seen in samples, weighted by occurrence (which is a statistical property).

It’s like how mixing dog breeds often results in something which doesn’t look exactly like any one breed but which has features from every breed. GPT-style LLMs mix in stuff like academic writing, redditisms, stackoverflowisms, quoraisms, LinkedIn posts, etc. You get this specific dryish text full of hedging language and mixed types of formalisms, a certain answer structure, etc.
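
To make the "frequently seen patterns end up with higher probability" claim concrete, here is a toy sketch. It is a plain count-based bigram model, not a neural network, and it says nothing about system prompts or fine-tuning; the corpus and the function name `next_word_probabilities` are made up for illustration. It only shows how occurrence counts in samples can turn into next-word probabilities.

```python
from collections import Counter, defaultdict


def next_word_probabilities(corpus):
    """Estimate P(next word | previous word) from raw bigram counts.

    Hypothetical helper for illustration only; a real LLM learns its
    weights by gradient descent rather than by counting.
    """
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        # Count every adjacent word pair in the sentence.
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    # Normalize each row of counts so it sums to 1.
    return {
        prev: {word: c / sum(followers.values()) for word, c in followers.items()}
        for prev, followers in counts.items()
    }


# Made-up sample texts: the sign-off with a name placeholder is the common one.
corpus = [
    "best regards your name",
    "best regards your name",
    "best wishes your friend",
    "kind regards your name",
]

probs = next_word_probabilities(corpus)
print(probs["best"])  # "regards" gets more weight than "wishes"
print(probs["your"])  # "name" gets more weight than "friend"
```

In this toy, the more often a continuation shows up in the samples, the larger its share of the probability mass. Whether that picture carries over to how a trained and fine-tuned LLM behaves is exactly what is being argued about in this thread.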

General_Effort@lemmy.world

-1 points

7 months ago

That’s a) not how it works and b) not averaging.

Natanael@slrpnk.net

1 point

7 months ago

A) I’ve not yet seen evidence to the contrary

B) You do know there are a lot of different definitions of average, right? The centerpoint of multiple vectors is one kind of average. The median of online writing is an average. The most common vocabulary, the most common sentence structure, the most common formulation of replies, etc., all form averages within their respective problem spaces. The model displays these properties because it has seen them so often in samples, and then it blends them.
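
A small sketch of the three kinds of "average" named above (centerpoint of vectors, median, most common item), using only the Python standard library. The numbers, the word list, and the helper name `centroid` are invented for illustration; nothing here is measured from real training data or from any model.

```python
from statistics import median, mode


def centroid(vectors):
    """Component-wise mean of equal-length vectors (their centerpoint)."""
    n = len(vectors)
    return [sum(components) / n for components in zip(*vectors)]


# Centerpoint of multiple vectors.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
center = centroid(vectors)

# Median of a numeric property, e.g. made-up sentence lengths in words.
sentence_lengths = [8, 12, 12, 15, 40]
typical_length = median(sentence_lengths)

# Most common item (the mode), e.g. in a made-up word list.
words = ["the", "of", "the", "and", "the"]
most_common_word = mode(words)

print(center, typical_length, most_common_word)
```

Note that the three can disagree with each other on the same data: the outlier 40 pulls the mean of the sentence lengths up to 17.4 while the median stays at 12. So "average" has to be pinned down before arguing about whether the output is one.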

General_Effort@lemmy.world

-1 points

7 months ago

I accidentally clicked reply, sorry.

> B) you do know there’s a lot of different definitions of average, right?

I don’t think that any definition applies to this. But I’m no expert on averages. In any case, the training data is not representative of the internet or anything else. The model is also not trained equally on all of the data, and not only on such text. What you get out is not representative of anything.

General_Effort@lemmy.world

-1 points

7 months ago

> A) I’ve not yet seen evidence to the contrary

You should worry more about whether you have seen evidence that supports what you are saying. So, what kind of evidence do you want? A tutorial on coding neural nets? The math? Video or text?
