2 points

Title mentions speaking Italian

Not a single hand gesture anywhere

I’ve been duped

1 point

You may not understand, but we do.
This secret will remain jealously guarded by the Italian bloodline. ◉‿◉

1 point

No brother, we can't keep this secret until the very end

1 point

There's no choice: if the last Italian should ever leave us, then this information too must leave humanity

-1 points

*breaks spaghetti near you*

1 point

*calls SISMI*

1 point

Remember, whenever you break one spaghetto you break one heart 💔

0 points

How about you go die in a hole?

5 points

This might be happening because of the ‘elegant’ (incredibly hacky) way OpenAI encodes multiple languages into their models. Instead of using all character sets, they apply a modulo operator to each character so that all Unicode characters are represented by a small range of values. On the back end, it somehow detects which language is being spoken and uses that character set for the response. Seeing as the last line seems to be the same mathematical expression as what you asked, my guess is that your equation just happened to perfectly match some sentence that makes sense in the weird language.

1 point

I suppose it’s conceivable that there’s a bug in converting between different representations of Unicode, but I’m not buying any of this “detected which language is being spoken” nonsense or the use of character sets. It would just use Unicode.

The modulo idea makes absolutely no sense, as LLMs use tokens, not characters, and there are soooooo many tokens. It would make no sense to make those tokens ambiguous.

1 point

I completely agree that it’s a stupid way of doing things, but it is how OpenAI reduced the vocab size of GPT-2 and GPT-3. As far as I know (I have only read the comments in the source code), the conversion is done as a preprocessing step. Here’s the code for GPT-2: https://github.com/openai/gpt-2/blob/master/src/encoder.py I did apparently make a mistake, though: the vocab reduction is done through a lookup table instead of a simple mod.
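For anyone curious, the trick in that file is a byte-to-Unicode lookup table: BPE runs over raw UTF-8 bytes, and each of the 256 possible byte values gets a printable Unicode stand-in so the merge tables stay readable. A rough sketch, closely following the bytes_to_unicode() helper in the linked encoder.py:

```python
def bytes_to_unicode():
    # Bytes that are already printable characters map to themselves.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    # Every remaining byte (control chars, space, etc.) is shifted into
    # unused code points starting at U+0100, so all 256 bytes stay printable.
    for b in range(2 ** 8):
        if b not in bs:
            bs.append(b)
            cs.append(2 ** 8 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))
```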

4 points

Do you have a source for that? Seems like an internal detail a corpo wouldn’t publish

1 point

Can’t find the exact source (I’m on mobile right now), but the code for the GPT-2 encoder uses a UTF-8 byte to Unicode lookup table to shrink the vocab size. https://github.com/openai/gpt-2/blob/master/src/encoder.py
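As a quick illustration of what that table does (reusing the bytes_to_unicode() sketch from the comment above): run a string’s raw UTF-8 bytes through it and every byte comes back as a printable character, which is why GPT-2 token dumps are full of ‘Ġ’ and ‘Ã’-style glyphs.

```python
byte_encoder = bytes_to_unicode()

# Map the raw UTF-8 bytes of a string to their printable stand-ins.
text = "ciao è"
mapped = "".join(byte_encoder[b] for b in text.encode("utf-8"))
print(mapped)  # "ciaoĠÃ¨": the space becomes 'Ġ'; 'è' shows up as its two UTF-8 bytes
```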

2 points

Seriously? Python for massive amounts of data? It’s a nice scripting language, but it’s excruciatingly slow

4 points

Let me simplify it: *proceeds to print the same expression*

1 point

Nope, they replaced an asterisk with an arrow!

3 points

Typical AI behavior

Edit: and then it will gaslight you if you say the answer is the same.

4 points

Fucking hate when it does that.

You are repeating the same mistake.

I’m sorry for repeating the same mistake, here’s a new solution with corrections *proceeds to write the exact same thing it was already told was wrong*

