0 points

Yep yep, it's statistical analysis of the frequency of tokens in the training text.

Brand new, never-before-seen Windows keys have a frequency of zero occurrences per billion words of training data.

0 points

That isn't actually what matters. What matters is the frequency of the tokens, which can be as small as single characters, and the frequency of those is certainly not zero.

LLMs absolutely can make up new words, word combinations, or sentences.

That's not to say ChatGPT can actually give you working Windows keys, but it isn't a fundamental limitation of LLMs.
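To make that concrete, here's a quick sketch using OpenAI's `tiktoken` library (my choice for illustration; any BPE tokenizer would show the same thing, and the "key" string is invented). The made-up key has zero frequency as a whole string, but it decomposes into short, extremely common tokens, and those tokens are what the model actually predicts:

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the encoding used by GPT-3.5/GPT-4-era models
enc = tiktoken.get_encoding("cl100k_base")

# An invented, never-before-seen "key": zero occurrences as a whole string
fake_key = "KXQ7R-V2MWP-8JHTN-4QZYC-DLB6F"

token_ids = enc.encode(fake_key)
pieces = [enc.decode([t]) for t in token_ids]

# Each piece is a short, high-frequency token, e.g. ['K', 'X', 'Q', '7', ...]
print(pieces)
```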

0 points

Okay, I’ll take your word for it.

I’ve never ever, in many hours of playing with ChatGPT as a toy, had it make up a word. Hallucinate wildly, yes, but not stogulate a word out of nothing.

I'd love to know more, though. How does it combine tokens into new words? Do you have any examples of words ChatGPT has made up? This is fascinating to me, as it means the model is much less chained to the training data than I thought.

0 points

A lot of compound words are actually multiple tokens, so there's nothing stopping the LLM from generating those tokens in a new order, thereby creating a new word.
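A minimal sketch of that recombination (again assuming `tiktoken`; the example words are mine): decoding known tokens in an order no training document contained yields a brand-new word.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# Tokens learned from two ordinary words...
sun = enc.encode("sun")
smith = enc.encode("smith")

# ...emitted back-to-back produce a word the model
# never saw as a single unit in training
print(enc.decode(sun + smith))  # -> "sunsmith"
```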
