AdComfortable1514
Simple and cool.
Florence 2 image captioning sounds interesting to use.
Do people know of any other image-to-text models (apart from CLIP)?
Wow, yeah, I found a demo here: https://huggingface.co/spaces/Qwen/Qwen2.5
A whole host of LLMs seem to have been released. Thanks for the tip!
I’ll see if I can turn them into something useful 👍
Hmm. I mean, the FLUX model looks good, so maybe there's some magic with the T5?
I have no clue, so any insights are welcome.
T5 Huggingface: https://huggingface.co/docs/transformers/model_doc/t5
T5 paper: https://arxiv.org/pdf/1910.10683
Any suggestions on what LLM I ought to use instead of T5?
New stuff
Paper: https://arxiv.org/abs/2303.03032
Takes only a few seconds to calculate.
Most similar suffix tokens: "vfx "
Most similar prefix tokens: "imperi-"
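For anyone curious, here's a rough sketch of how a "most similar tokens" lookup like this could work. I'm assuming cosine similarity over token embedding vectors, which is only a guess at the method; the toy 3-d vectors and the helper names `cosine` and `most_similar` are made up for illustration, not taken from any real model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def most_similar(query, vocab, k=1):
    """Return the k tokens whose vectors are closest to `query` by cosine similarity."""
    ranked = sorted(vocab, key=lambda tok: cosine(query, vocab[tok]), reverse=True)
    return ranked[:k]

# Hypothetical toy embeddings; a real tokenizer vocabulary would have
# tens of thousands of tokens and hundreds of dimensions per vector.
toy_vocab = {
    "vfx ": [0.9, 0.1, 0.0],
    "imperi-": [0.0, 0.2, 0.9],
    "cat ": [0.1, 0.9, 0.1],
}
query = [1.0, 0.0, 0.0]  # stand-in for an encoded prompt or image feature

print(most_similar(query, toy_vocab, k=2))
```

With a real vocabulary you'd swap the dict for the model's embedding matrix and do the ranking with a matrix multiply, which is probably why it only takes a few seconds.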
So it's 64-131 between work done by bystanders vs. work done by police?
And casualty rate is actually lower for bystanders doing the work (with their guns) than the police?