I am using a code-completion model for (will be open sourced very soon).

However, Qwen2.5-coder 1.5b tends to repeat what has already been written, or to change it only slightly. (See the video.)

Is this intentional? I am passing the prefix and suffix correctly to ollama, so it knows where the cursor currently is. I’m also trimming the number of lines it can see, so the time-to-first-token isn’t too long.
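A minimal sketch of that setup, for anyone who wants to reproduce it. It assumes Ollama's local `/api/generate` endpoint on the default port and its `prompt` and `suffix` fields. The helper names (`trim_context`, `build_request`, `complete`), the model tag, the line limits, and the option values are hypothetical placeholders, not the actual project code.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def trim_context(text, cursor, max_before=40, max_after=20):
    """Split text at the cursor and keep only the nearest lines on each side."""
    before = text[:cursor].splitlines(keepends=True)
    after = text[cursor:].splitlines(keepends=True)
    # Keep the last max_before lines before the cursor and the
    # first max_after lines after it.
    prefix = "".join(before[max(0, len(before) - max_before):])
    suffix = "".join(after[:max_after])
    return prefix, suffix


def build_request(prefix, suffix, model="qwen2.5-coder:1.5b", max_tokens=64):
    """Build a JSON payload for /api/generate (model tag is an assumption)."""
    return {
        "model": model,
        "prompt": prefix,   # text before the cursor
        "suffix": suffix,   # text after the cursor
        "stream": False,
        "options": {"num_predict": max_tokens, "temperature": 0.2},
    }


def complete(payload, url=OLLAMA_URL):
    """Send the payload to Ollama and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `complete(build_request(*trim_context(source, cursor)))` would return the suggested middle part. Smaller `max_before` and `max_after` values shorten the prompt, which is what keeps the time-to-first-token down.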

Do you have a recommendation for a code model that is better suited to this?


If you want inline completions, you need a model that is trained on “fill in the middle” tasks. On their Hugging Face page they even say that this is not supported and needs fine-tuning:

We do not recommend using base language models for conversations. Instead, you can apply post-training, e.g., SFT, RLHF, continued pretraining, etc., or fill in the middle tasks on this model.

Models that can do it include:

  • starcoder2
  • codegemma
  • codellama

Another option is to just use the Qwen model, but instead of only adding a few lines, let it rewrite the entire function each time.


Have a look at the other comments. Sometimes it does fill in the code correctly, even without any prompting! The template specifically has the fill in the middle part in it.

The ollama site has the template with <|fim_prefix|> and such.
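To make the template part concrete, here is a rough sketch of the fill-in-the-middle prompt layout. It assumes the Qwen2.5-Coder token spelling (`<|fim_prefix|>`, `<|fim_suffix|>`, `<|fim_middle|>`); other models such as StarCoder2 spell their tokens differently, so check the model card. The function names and model tag are hypothetical. The `raw` flag asks Ollama to skip its own template, so the prompt is sent exactly as built here.

```python
# Assumption: Qwen2.5-Coder's FIM special tokens.
FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"


def build_fim_prompt(prefix, suffix):
    """Lay out prefix and suffix so the model generates the missing middle."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def build_raw_request(prefix, suffix, model="qwen2.5-coder:1.5b"):
    """Payload for Ollama's /api/generate that bypasses the built-in template."""
    return {
        "model": model,  # hypothetical tag, adjust to the model you pulled
        "prompt": build_fim_prompt(prefix, suffix),
        "raw": True,     # send the prompt without applying Ollama's template
        "stream": False,
    }


example = build_fim_prompt("def add(a, b):\n    return ", "\n\nprint(add(1, 2))\n")
```

Comparing this hand-built prompt with what the default template produces is one way to check whether the prefix and suffix really reach the model in the expected order.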

