It is called fine-tuning. I haven’t tried it, but oobabooga’s text-generation-webui has a tab for it, and I believe it is pretty straightforward.
Fine-tune a base model on your dataset; you will then need to format your prompt the way your AIM logs are organized, e.g. you will need to append “<ch00f>” at the end of your text completion prompt. The model will complete it in the way it learned.
If you don’t have the GPU for it, many companies offer fine-tuning as a service, like Mistral.
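A minimal sketch of the prompt formatting described above. The “<ch00f>” screen-name tag comes from the comment; the exact log layout and the helper below are assumptions for illustration, not a required format:

```python
def format_prompt(conversation, target="ch00f"):
    """Render prior turns as '<name> text' lines, then end with the target
    user's tag so the fine-tuned model completes in that user's voice.
    `conversation` is a list of (screen_name, message) tuples (assumed layout)."""
    lines = [f"<{speaker}> {text}" for speaker, text in conversation]
    return "\n".join(lines) + f"\n<{target}>"

convo = [
    ("buddy87", "hey what's up"),
    ("ch00f", "not much, you?"),
    ("buddy87", "did you ever finish that project"),
]
print(format_prompt(convo))
```

The same function can build the training examples themselves: everything up to the tag is the prompt, and the target user’s actual next message is the completion.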
Why would you want this??? Anything I wrote from 16 years ago is so beyond cringey. You must have been a stellar kid.
I have 26 years of saved outgoing email.
Recently I needed to redo a fix I learned about and implemented back in 1998. I implemented it again to install a crappy software project that, judging from its composition, canNOT have been written before the post-Y2K firing of so many mentors.
I only remembered it after 3 hours of searching, saving myself another few hours and surely a nervous breakdown. After filtering AD on the client end, the project installed easily.
That’s the best example, but the things I don’t discover I already answered on Stack Overflow, I discover I answered years ago in email.
The real question is why you have 64 MB of AIM conversations.
You may try https://github.com/instructlab. You will need to transform those conversations into a specific YAML format.
Not hard with Hugging Face PEFT.