Today, a prominent child safety organization, Thorn, in partnership with a leading cloud-based AI solutions provider, Hive, announced the release of an AI model designed to flag unknown CSAM at upload. It’s the earliest AI technology striving to expose unreported CSAM at scale.
It’s the earliest AI technology striving to expose unreported CSAM at scale.
horde-safety has been out for a year now. Just saying… It’s not a trained AI model in this way, but it’s still using Neural Networks (i.e. “AI Technology”)
How did you figure out it had issues with broccoli? Were you checking your vegetable gallery for CSAM?
This seems like a potential actual good use of AI. Can’t have been much fun to train it though.
And is there any risk of people turning these kinds of models around and using them to generate images?
I think image generators in general work by iteratively changing random noise and checking it with a classifier, until the resulting image has a stronger and stronger finding of “cat” or “best quality” or “realistic”.
If this classifier provides fine grained descriptive attributes, that’s a nightmare. If it just detects yes or no, that’s probably fine.
If AI was reliable, maybe. MAYBE. But guess what? It turns out that “advanced autocomplete” does a shitty job of most things, and I bet false positives will be numerous.
“detect new or previously unreported CSAM and child sexual exploitation behavior (CSE), generating a risk score to make human decisions easier and faster.”
False positives don’t matter if they stick to the stated intended purpose of making it easier to detect CSAM manually.
The problem is that they won’t.
Yes, AI tools, in the hands of skilled people, can be very helpful.
But “AI” in capitalism doesn’t mean “more effective workers”, it means “fewer workers.” The issue isn’t technological so much as cultural. You fundamentally cannot convince an MBA not to try to automate away jobs.
(It’s not even a money thing; it’s about getting rid of all those pesky “workers rights” that workers like to bring with us)
Nobody would have been looking directly at the source data. The FBI or whoever provides the dataset to approved groups, but after that you just say “use all the images in this folder” and it goes. But I don’t even know if they actually provide real full-resolution images, or just perceptual hashes, or downsampled images.
And while it’s possible to use the dataset to generate new images assuming the training data had full-res images, like I said, I know they investigate the people making the request before allowing access. And access is probably supervised and audited.
And is there any risk of people turning these kinds of models around and using them to generate images?
There isn’t really much fundamental difference between an image detector and an image generator. The way image generators like stable diffusion work is essentially by generating a starting image that’s nothing but random static and telling the generator “find the cat that’s hidden in this noise.”
It’ll probably take a bit of work to rig this child porn detector up to generate images, but I could definitely imagine it happening. It’s going to make an already complicated philosophical debate even more complicated.
Available image generators are already capable of generating those images and they weren’t even trained on it. Once a neural network can detect/generate two separate concepts, it can detect/generate the overlap. It won’t be as fine-tuned obviously, but can still turn out scarily accurate.
And will we get that technology to keep the Fediverse and free platforms safe? Probably not. All the predecessors have been kept away for sole use of the big players, despite populism always claiming we need to introduce total surveillance to keep the children safe…
I was going to say… Sure would be nice to have this feature in all the open source AI image generator tools but you’re absolutely right 😩
If everyone has access to the model it becomes much easier to find obfuscation methods and validate them. It becomes an uphill battle. It’s unfortunate but it’s an inherent limitation of most safeguards.
You’re probably right. I’m not sure if it’s a good idea to walk close to the edge with things like this, though. Every update to the detection model could change things and get them in jail… So I certainly wouldn’t play a cat and mouse game with something that has several years of jail time attached… But then I don’t really know the thought process of the average pedo. And AI image detection comes with problems anyways. In the article they say it detected 6 million pictures already. While keeping quiet about the rate of false positives. We know people have gotten in serious trouble for (false) claims. And I also wouldn’t want to be the Fediverse admin who has to go through thousands of flagged pictures and look at them and decide which is which. With consequences attached… Maybe a database of hashes would be the only option. That doesn’t detect new pictures, but at the same time it comes without false positives and you can’t draw conclusions from hash values.
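For anyone wondering what the hash-database option would look like in practice, here is a minimal sketch, assuming exact cryptographic hashes (SHA-256) and a plain in-memory set. The function names and the placeholder bytes are made up for illustration; this is not how any particular blocklist service works. It also shows the trade-off: changing a single byte makes the file no longer match, which is why real systems often use perceptual hashes instead, and those can produce false positives.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the raw file bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_known(data: bytes, blocklist: set[str]) -> bool:
    """Exact-match lookup: True only if these exact bytes were listed."""
    return sha256_hex(data) in blocklist


# Hypothetical stand-in for a file whose hash is already on the list.
known = b"placeholder bytes standing in for a listed file"
blocklist = {sha256_hex(known)}

exact_copy_matches = is_known(known, blocklist)
# A copy with one extra byte (think: re-encoded or resized) no longer matches.
altered_copy_matches = is_known(known + b"\x00", blocklist)

print(exact_copy_matches, altered_copy_matches)
```

The blocklist holds only digests, so an admin running the check never has to store or look at the listed material itself.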
IFTAS is already working with Thorn towards this goal. But you already have access to such technology through my toolset.
This one? I loosely followed your work… Maybe I should try it someday. See how it does on a regular VPS. Thanks for the link to the IFTAS. Seems they have curated some useful links… I’ll have a look at their articles. Hope they get somewhere with that. At this point, I don’t think there is any blocklist accessible to the average Fediverse admin?!
Edit: Thx, saw your other comment with the link to horde-safety.
Ye, a normal VPS would be too slow for production use, as a GPU is recommended. But you can plug in any home PC to do it without risks
Not a single peep about false positives.
I’m sure it won’t be abused though. And if anyone does complain, just get their electronics seized and checked, because they must be hiding something!
Reminds me of the A cup breasts porn ban in Australia a few years ago, because only pedos would watch that
There was a porn studio that was prosecuted for creating CSAM. Brazil, I believe. Prosecutors claimed that the petite, A-cup woman was clearly underage. Their star witness was a doctor who testified that such underdeveloped breasts and hips clearly meant she was still going through puberty and couldn’t possibly be 18 or older. The porn star showed up to testify that she was in fact over 18 when they shot the film, and brought all her identification, including her birth certificate and passport. She also said something to the effect that women come in all shapes and sizes and a doctor should know better.
I can’t find an article. All I’m getting is GOP Trump pedo nominees and Brazil laws on porn.
This sort of rhetoric really bothers me. Especially when you consider that there are real adult women with disorders that make them appear prepubescent. Whether that’s appropriate for pornography is a different conversation, but the idea that anyone interested in them is a pedophile is really disgusting. That is a real, human, adult woman and some people say anyone who wants to love them is a monster. Just imagine someone telling you that anyone who wants to love you is a monster and that they’re actually protecting you.
It could also, of course, make mistakes, but Kevin Guo, Hive’s CEO, told Ars that extensive testing was conducted to reduce false positives or negatives substantially. While he wouldn’t share stats, he said that platforms would not be interested in a tool where “99 out of a hundred things the tool is flagging aren’t correct.”
I take this to mean it is at least 1% accurate lol.
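To put rough numbers on why the false positive rate matters so much here: the share of flags that are correct (precision) depends heavily on how rare the real thing is among uploads. A quick sketch, where every number is a made-up assumption and not anything Hive or Thorn has published:

```python
def precision(prevalence: float, tpr: float, fpr: float) -> float:
    """Fraction of flagged items that are truly positive.

    prevalence: share of all uploads that are actually positive
    tpr: true positive rate (share of positives the tool flags)
    fpr: false positive rate (share of negatives the tool flags)
    """
    true_flags = prevalence * tpr
    false_flags = (1 - prevalence) * fpr
    return true_flags / (true_flags + false_flags)


# Hypothetical: 1 in 1000 uploads is bad, the tool catches 95% of those
# and wrongly flags 1% of everything else.
rare_case = precision(prevalence=0.001, tpr=0.95, fpr=0.01)

# Same tool, but on a queue where half the items are actually bad.
common_case = precision(prevalence=0.5, tpr=0.95, fpr=0.01)

print(f"rare: {rare_case:.3f}, common: {common_case:.3f}")
```

With those assumed inputs, fewer than one in ten flags is correct in the rare case, while the same tool is right almost every time on the pre-filtered queue. So a "1% false positive rate" on its own says very little without the base rate.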
This is a great development, albeit with a lot of soul crushing development behind it I assume. People who have to look at CSAM or whatever the acronym is have a miserable job, so I’m very supportive of trying to automate that away from people.
Yeah, I’m happy for AI to take this particular horrifying job from us. Chances are it will be overtuned (too strict), but if there’s a reasonable appeals process I could see it saving a lot of people the trauma of having to regularly view the worst humanity has to offer without major drawbacks.