1 point

That’s wrong by programmer and data scientist standards.

The code is the source code. It computes the weights, so you could call it a compiler, even if that’s a stretch, but it IS the source code.

The training set is the input data. In ML it’s certainly more critical than the source code, but nobody calls it source code.

The pretrained model is the output data.

Some projects also provide a “last step pretrained model”, or whatever it’s called: an “almost trained” model where you run the last N cycles of training on your own data, to give the model a bias that might be useful for your use case. This is done heavily in image processing.
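What is described here is usually called fine-tuning. A minimal sketch of the idea in plain Python, assuming a toy one-variable linear model; the `train` and `loss` helpers and both datasets are made up for illustration and do not come from any real project:

```python
def train(weights, data, steps, lr=0.1):
    """Run `steps` rounds of gradient descent for y = w*x + b on `data`."""
    w, b = weights
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x
            grad_b += 2 * err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return (w, b)


def loss(weights, data):
    """Mean squared error of the model on `data`."""
    w, b = weights
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)


# Hypothetical datasets: a "general" one and a smaller, slightly shifted one.
general_data = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]  # y = 2x
my_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]       # y = 2x + 1

# The "almost trained" model: trained from scratch on the general data.
pretrained = train((0.0, 0.0), general_data, steps=200)

# The last N cycles: continue from the pretrained weights on your own data.
fine_tuned = train(pretrained, my_data, steps=20)
```

The only difference between the two `train` calls is the starting point: the second one starts from the released weights instead of from zero, which is why it needs far fewer steps.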

10 points

No, it’s not. It’s equivalent to me releasing obfuscated Java bytecode, which by this definition is just data because it needs a runtime to execute, while keeping the Java source code to myself.

Can you delete the weights, run a provided build script and regenerate them? No? Then it’s not open source.
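A rough sketch of what passing that test could look like, assuming a toy model with a single weight. `build_weights` is a hypothetical stand-in for a project's build script, and the data and seed are invented:

```python
import random


def build_weights(training_data, seed=0, steps=100, lr=0.05):
    """Hypothetical build script: regenerate the weight from data and a seed."""
    rng = random.Random(seed)        # fixed seed, so the run is repeatable
    w = rng.uniform(-1.0, 1.0)       # random initial weight
    for _ in range(steps):
        x, y = rng.choice(training_data)   # one SGD step on a random sample
        w -= lr * 2 * (w * x - y) * x
    return w


training_data = [(1.0, 3.0), (2.0, 6.0)]   # made-up data for y = 3x

weights = build_weights(training_data, seed=42)
del weights                                 # "delete the weights"
rebuilt = build_weights(training_data, seed=42)
```

With the training code, the data and the seed all published, anyone can rebuild the same weights. Hold back the data and `build_weights` has nothing to run on.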

7 points

The model itself is not open source, and I agree on that. Models don’t have source code, however, just training data. I agree that without the training data being released, I wouldn’t call a model open source.

We mostly agree; I was just irked by your semantics. Sorry if I was too pedantic.

2 points

It’s just a different paradigm. You could use text, you could use a visual programming language, or, in this new paradigm, you “program” the system using training data and hyperparameters (the compiler flags).
