Cosigning the author here; I'll also add my two cents on the cheque checker ML.
The most consequential failure mode — that both the text (…) and the numeric (…) converge on the same value that happens to be wrong (…) — is vanishingly unlikely. Even if that does happen, it’s still not the end of the world.
I think it’s extremely important that this is the kind of error even a human operator could conceivably make. It’s not some unexplainable machine error; most likely the scribbles were just exceedingly illegible on that one cheque. We’re not introducing a completely new, dangerous failure mode.
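To make the cross-check concrete, here is a minimal sketch of how I picture it, not the author's actual implementation. It assumes the two recognisers have already produced an amount in integer cents (or `None` when a field is unreadable); `cross_check` and `Decision` are hypothetical names I made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    accepted: bool
    amount_cents: Optional[int]
    reason: str


def cross_check(text_cents: Optional[int], numeric_cents: Optional[int]) -> Decision:
    """Accept an amount only when both independent readings exist and agree.

    Hypothetical helper: the inputs stand in for the outputs of the
    written-amount reader and the digit-amount reader.
    """
    if text_cents is None or numeric_cents is None:
        return Decision(False, None, "unreadable field: route to a human operator")
    if text_cents != numeric_cents:
        return Decision(False, None, "fields disagree: route to a human operator")
    # Residual risk: both readings agree on the same wrong value.
    # This check cannot see that case, and neither could a human
    # who misread the same scribble twice.
    return Decision(True, text_cents, "fields agree")


print(cross_check(12500, 12500))  # both fields read 125.00
print(cross_check(12500, 17500))  # mismatch goes to a person
print(cross_check(None, 12500))   # illegible written amount goes to a person
```

Anything the check can't settle falls through to a person, so the only error that gets past it is the agreeing-but-wrong one discussed above.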
Compare that to, for example, using an LLM in lieu of a person in customer service. The failure mode here is that the system can manufacture things out of whole cloth and tell you to do a stupid and/or dangerous thing. Like tell you to put glue on pizza. No human operator would ever do that, and even if one did, that’s straight-up a prosecutable crime with a clear person responsible. In terms of the previous analogy, it’d be like a human operator knowingly entering fraudulent information from a cheque. But then again, there would be a human signature on the transaction and a person responsible.
So not only is a gigantic LLM matrix a terrible heuristic for most tasks (e.g. “how to solve my customer’s problem”), it also introduces failure modes that are outlandish, that are essentially impossible with a human (or a specialised ML system), and that leave no chain of responsibility. It’s a real stinky ball of bull.
Indeed. The recent Air Canada matter underscores this.