so it’s probably just some points assigned for the answers and maybe some simple arithmetic.
Why yes, that’s all that machine learning is, a bunch of statistics :)
I know, but that’s not what I meant. I mean literally something as simple and mundane as assigning points per answer and evaluating the final score:
// Pseudo code
risk = 0
if (Q1 == true) {
    risk += 20
}
if (Q2 == true) {
    risk += 10
}
// etc...
// Maybe throw in a bit of
if (Q28 == true) {
    if (Q22 == true and Q23 == true) {
        risk *= 1.5
    } else {
        risk += 10
    }
}
// And finally, evaluate the risk:
if (risk < 10) {
    return "negligible"
} else if (risk >= 10 and risk < 40) {
    return "low risk"
}
// etc... You get the picture.
And yes, I know I can just write if (Q1) {, but I wanted to make it a bit more accessible for non-programmers.
The article gives absolutely no reason for us to assume it’s anything more than that, and I apparently missed the part of the article that mentioned that the system had been in use since 2007. I know we had machine learning back then too, but going by the project description here: https://eucpn.org/sites/default/files/document/files/Buena practica VIOGEN_0.pdf it looks more like they studied a bunch of cases (2159) and came up with the 35 questions and a scoring system not unlike what I just described above.
Edit: I managed to find this, which has apparently been taken down since (but thanks to archive.org it’s still available): https://web.archive.org/web/20240227072357/https://eticasfoundation.org/gender/the-external-audit-of-the-viogen-system/
VioGén’s algorithm uses classical statistical models to perform a risk evaluation based on the weighted sum of all the responses according to pre-set weights for each variable. It is designed as a recommendation system but, even though the police officers are able to increase the automatically assigned risk score, they maintain it in 95% of the cases.
… which incidentally matches what the article says (that police maintain the VioGén risk score in 95% of the cases).
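For what it's worth, the "weighted sum of all the responses according to pre-set weights" from the audit could be sketched in Python roughly like this. To be clear, the weights, the cut-offs and the "higher risk" label are made up by me (borrowed from my pseudo code above), not taken from VioGén, and the function names are just for illustration:

```python
# Hypothetical weights per question. The real VioGén weights are not
# given anywhere in this thread; these are illustrative numbers only.
WEIGHTS = {"Q1": 20, "Q2": 10, "Q28": 10}

# Hypothetical cut-offs as (exclusive upper bound, label), lowest first.
LEVELS = [(10, "negligible"), (40, "low risk")]
TOP_LEVEL = "higher risk"  # placeholder for whatever levels come after


def risk_score(answers):
    """Weighted sum: add the weight of every question answered 'yes'."""
    return sum(
        weight
        for question, weight in WEIGHTS.items()
        if answers.get(question, False)
    )


def risk_level(score):
    """Map a numeric score to the first level whose upper bound it is below."""
    for upper_bound, label in LEVELS:
        if score < upper_bound:
            return label
    return TOP_LEVEL


answers = {"Q1": True, "Q2": False, "Q28": False}
print(risk_score(answers), risk_level(risk_score(answers)))
```

The point is only that a table of weights plus a handful of thresholds is enough to produce a "risk level"; nothing here needs training data at run time.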