Technically, it could be coded to recognize the various formats of strings and blur everything indiscriminately.
- OCR is never perfect.
- A partial credit card number or partial SSN wouldn’t match the format, but is still sensitive.
- Perfection is impossible. Demanding it is silly. Loopholes are unavoidable in everything.
- Context can be trained.
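To make the format-matching point concrete, here is a minimal sketch of what "recognize the format and blur it" could look like for card numbers. This is not how Recall works; `CARD_PATTERN`, `luhn_valid`, and `find_card_numbers` are hypothetical names made up for illustration. It assumes the screen text has already been extracted by OCR.

```python
import re

# Hypothetical detector: 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum used by card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return digit strings in text that look like full, checksum-valid card numbers."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


print(find_card_numbers("Card: 4111 1111 1111 1111"))  # full test number
print(find_card_numbers("card ending 1111 1111"))      # partial number
print(find_card_numbers("4111 1111 1111 111l"))        # OCR read a "1" as "l"
```

Only the first call finds anything. The partial number and the OCR misread both slip through, which is exactly the objection in the bullets above.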
Perfection is impossible. Demanding it is silly.
- This isn’t even a matter of perfection. Recall barely manages to censor the most blatantly sensitive information. From the article: “I also created my own HTML page with a web form that said, explicitly, ‘enter your credit card number below.’ The form had fields for Credit card type, number, CVC and expiration date.”
- Demanding that a system protect user data is not silly, it is necessary. If a given system can’t do that, then it should never be used. That matters all the more because this is likely to make its way onto PCs handling especially sensitive data with strict privacy requirements, such as medical data protected by HIPAA.
Context can be trained.
- Maybe Microsoft shouldn’t have released a tool until it had that context then?
If a company releases a half-baked tool that doesn’t do what it advertises, easily fails in simple attempts at identifying sensitive data, and is almost impossible to guarantee data security with, then it should never be used or advertised for any context in which any sensitive data could ever be present.
Demanding perfection for a system as dangerous as Recall is not silly.
It’s like keeping an armed nuclear bomb in the center of a city at all times and saying, “Hey, it’s OK that its activation sequence isn’t perfect, it probably won’t go off.”
The way to make it perfect is to not install the nuke (or Recall) at all.
Perfection is impossible. Demanding it is silly.
In this case perfection is very easy. It could avoid capturing 100% of credit card info by not taking screenshots of everything.
If you agree that it will never be perfect at filtering out sensitive information, why support it?
That would require knowing the formats of the strings. And it requires the text to actually be text.
What if you had a photo of a handwritten piece of sensitive information?
I doubt that OCR (optical character recognition) is done on device, so it is likely being sent to some server for processing.
As a software engineer, in any of our corporate applications when a user hits delete we toggle an archived flag, but the data is still there. So I wouldn’t trust any application to do what it actually says.
There are so many technical barriers to Recall ever being able to not snipe your private data that I wouldn’t go anywhere near the thing.
Edit: Furthermore, what happens when MS inevitably gets hacked again and someone steals all the data it has, then uses it to commit fraud?
As a software engineer, in any of our corporate applications when a user hits delete we toggle an archived flag, but the data is still there.
What many people don’t realize is that this is how some low-level data stores work as well. Even regular ol’ file systems do this (basically).
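For anyone who hasn’t seen the "archived flag" pattern, here is a rough sketch of it using an in-memory SQLite database. The `notes` table, the `archived` column, and the helper names are all made up for illustration; real applications differ in the details.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    """Create a throwaway in-memory database with a hypothetical notes table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE notes ("
        "id INTEGER PRIMARY KEY, body TEXT, archived INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def soft_delete(conn: sqlite3.Connection, note_id: int) -> None:
    """What 'delete' does in this pattern: flip a flag, leave the row in place."""
    conn.execute("UPDATE notes SET archived = 1 WHERE id = ?", (note_id,))


def visible_notes(conn: sqlite3.Connection) -> list[str]:
    """What the user sees: only rows that are not flagged as archived."""
    return [row[0] for row in conn.execute("SELECT body FROM notes WHERE archived = 0")]


def all_rows(conn: sqlite3.Connection) -> list[str]:
    """What is actually stored, regardless of the flag."""
    return [row[0] for row in conn.execute("SELECT body FROM notes")]


conn = make_store()
conn.execute("INSERT INTO notes (body) VALUES (?)", ("my secret",))
soft_delete(conn, 1)
print(visible_notes(conn))  # the note no longer shows up for the user
print(all_rows(conn))       # but the row is still in the table
```

From the user’s side the note is gone, but anyone with access to the table can still read it.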
I don’t understand your meaning. Screenshots of a photo are still screenshots, and manipulating text in a photo is already a thing (you can use a phone camera to translate text directly from a fixed surface).