No, OpenAI Strawberry isn’t imminent — but it sure trolled the AI doomers(pivot-to-ai.com)

posted 4 months ago

David Gerard@awful.systems (mod)

techtakes@awful.systems

UnseriousAcademic@awful.systems

14 points

4 months ago

Does this mean they’re not going to bother training a whole new model again? I was looking forward to seeing AI Mad Cow Disease after it consumed an Internet’s worth of AI generated content.

David Gerard@awful.systems (OP, mod)

9 points

4 months ago

I think they will do whatever gets more investor cash

anton@lemmy.blahaj.zone

8 points

4 months ago

If you change the tokenizer you have to retrain from scratch, but you can do so with the old, unpolluted data.

It’s genius if you think about it:* you can waste energy and tell your investors it’s a new, better model, while staying upstream of the river you pollute.
* At least for consultants, compute providers and other middlemen.
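
A toy sketch of why a tokenizer swap forces retraining, assuming two made-up tokenizers (character-level and whitespace word-level) rather than any real model's vocabulary. The helper names here are hypothetical. The point is only that the same integer ID ends up meaning a different piece of text, so an embedding table learned against one vocabulary says nothing useful about the other:

```python
# Toy illustration only: both tokenizers and their vocabularies are invented
# for this example and do not correspond to any real model.

def build_vocab(tokens):
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab

def char_tokenize(text):
    """Split text into single characters (spaces included)."""
    return list(text)

def word_tokenize(text):
    """Split text on whitespace."""
    return text.split()

text = "the cat sat on the mat"

char_vocab = build_vocab(char_tokenize(text))
word_vocab = build_vocab(word_tokenize(text))

char_ids = [char_vocab[t] for t in char_tokenize(text)]
word_ids = [word_vocab[t] for t in word_tokenize(text)]

# Reverse lookups: what does a given ID mean under each tokenizer?
char_lookup = {i: t for t, i in char_vocab.items()}
word_lookup = {i: t for t, i in word_vocab.items()}

print(word_ids)                                # [0, 1, 2, 3, 0, 4]
print(len(char_ids), "char tokens")            # 22 char tokens
print(char_lookup[1], "vs", word_lookup[1])    # h vs cat
```

ID 1 is "h" under one scheme and "cat" under the other, so whatever a model learned about row 1 of its embedding table does not carry over.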

UnseriousAcademic@awful.systems

4 points

4 months ago

I remember one time in a research project I switched out the tokeniser to see what impact it might have on my output. Spent about a day re-running and the difference was minimal. I imagine it’s wholly the same thing.

*Disclaimer: I don’t actually imagine it is wholly the same thing.

David Gerard@awful.systems (OP, mod)

4 points

4 months ago

there’s a research result that the precise tokeniser makes bugger all difference, it’s almost entirely the data you put in

because LLMs are lossy compression for text
