1 point

[Sorry, double posted, my mobile connection is pretty bad rn]

Just because Amazon, king of scams, is doing an AI scam, that doesn’t mean that the underlying technology is impossible to use with minimal errors (it’s AI, it’s made of statistics, there will always be some errors).

Anyways, “just walk out” works in a different way than the fruit recognition in the OP or the checkout machines I was talking about. Image recognition of a discrete item over a white background (or a checkered background) is like, the literal ideal case for image recognition accuracy. This is as opposed to blurry store cameras looking at an entire aisle from 20 feet away and trying to guess what item the customer is taking off the shelf. It’s an entirely different problem space in every way that matters.

Anyways, even ignoring theoretical arguments, I know it’s production-ready because it’s currently being used in production. There are dozens of stores in California right now that use checkout machines with a camera that points down towards a plain background “pad”. You place the item on the pad and it selects the most likely item in the store based on what it sees. I’ve seen a live demo of these machines where you take ~10-15 pictures of an item from different angles/rotations/positions and add it to the list of recognizable items, and the machine was able to distinguish between that item and others accurately. This was in a very candid and scam-unlikely environment (OpenSauce) and by my evaluation this is easily consistent with other known-good image recognition applications.
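
To make the "enroll a handful of pictures, then pick the most likely item" idea concrete, here is a toy sketch. It is not how those machines actually work, which I don't know. It assumes each picture has already been reduced to a short list of numbers (a feature vector), and the `ItemCatalog` class, its method names, and the nearest-average matching rule are all made up for illustration.

```python
import math


class ItemCatalog:
    """Toy nearest-centroid matcher (hypothetical, for illustration only)."""

    def __init__(self):
        self._examples = {}  # item name -> list of enrolled feature vectors

    def enroll(self, name, features):
        """Add one example picture (as a feature vector) for an item."""
        self._examples.setdefault(name, []).append(list(features))

    def _centroid(self, name):
        # Average the enrolled vectors for this item, coordinate by coordinate.
        vectors = self._examples[name]
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    def most_likely(self, features):
        """Return the enrolled item whose average example is closest."""
        if not self._examples:
            raise ValueError("no items enrolled")
        return min(
            self._examples,
            key=lambda name: math.dist(self._centroid(name), features),
        )
```

A real system would get its feature vectors from a trained vision model. The point is only that adding a new item can be as cheap as storing a few more examples.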

1 point

it’s AI, it’s made of statistics, there will always be some errors

7 in 10 required manual review

This is as opposed to blurry store cameras looking at an entire aisle from 20 feet away and trying to guess what item the customer is taking off the shelf. It’s an entirely different problem space in every way that matters.

which is why that wasn’t the setup of just walk out

every location was quite literally purpose built with the express goal of making the just walk out technology as accurate as it possibly could be

You place the item on the pad and it selects the most likely item in the store based on what it sees

this is a completely different problem

nobody’s placing the berry or berries they decide to eat or not eat in a separate area before placing them in their mouth

2 points

this is a completely different problem

Yes, that’s what I’ve been trying to explain. And no, JWO was not built to be accurate, it was built to be convenient. That’s a very different incentive, one that leads to skipping alternatives that are less convenient but more accurate, like the checkout kiosks I’ve been talking about. I’m not defending JWO; it’s obviously both a harder problem and one that’s not managed well, focusing on optics over accuracy.

nobody’s placing the berry or berries they decide to eat or not eat in a separate area before placing them in their mouth

That’s not necessary, they’re already placed in a nearly ideal environment by the person setting up the berry bowl. Notice how the “bowl” is a white square with each fruit placed in a way where they’re separated by the whitespace. You wouldn’t even need to train a model on the whole bowl, you could just do an image region detection --> object recognition pipeline. The hardest part about the berry bowl would by far be determining the person taking the fruit! (In fact, I wouldn’t be surprised if that was manually reviewed, with that few instances to look at.)
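
A rough sketch of what that region detection --> object recognition pipeline could look like, using only the standard library. Everything here is an assumption for illustration: the image is a plain grid of grayscale values, `WHITE_THRESHOLD` is an arbitrary cutoff, and `classify_region` is a hypothetical stand-in for a real trained recognizer.

```python
from collections import deque

WHITE_THRESHOLD = 200  # assumed cutoff: brightness >= this counts as background


def find_regions(image):
    """Return bounding boxes (top, left, bottom, right) of non-white blobs.

    `image` is a list of rows of grayscale values (0-255). A blob is a
    4-connected group of pixels darker than WHITE_THRESHOLD.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if seen[r][c] or image[r][c] >= WHITE_THRESHOLD:
                continue
            # Flood-fill this blob and track how far it extends.
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx]
                            and image[ny][nx] < WHITE_THRESHOLD):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def classify_region(image, box):
    """Hypothetical stand-in for a real recognizer: label by mean brightness."""
    top, left, bottom, right = box
    pixels = [image[y][x]
              for y in range(top, bottom + 1)
              for x in range(left, right + 1)
              if image[y][x] < WHITE_THRESHOLD]
    mean = sum(pixels) / len(pixels)
    return "dark fruit" if mean < 100 else "light fruit"


def detect_items(image):
    """Run both stages: find each blob, then label it."""
    return [(box, classify_region(image, box)) for box in find_regions(image)]
```

This sketch leans entirely on the whitespace between items: two fruits that touch would come out as a single region.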

1 point

Yes, that’s what I’ve been trying to explain

jwo is a different problem than the separate checkout kiosk you’re describing

jwo is the same problem as is in the image

JWO was not built to be accurate, it was built to be convenient

it was built to be accurate within the boundary of “no checkout step”

at this point it feels like you’re deliberately misinterpreting me

Notice how the “bowl” is a white square with each fruit placed in a way where they’re separated by the whitespace

unless somebody moves or jostles them while taking some fruit

you’re essentially making the exact same naive assumptions about the operating environment that led to jwo’s failures

if “just track which one disappeared” was a valid solution to the problem, jwo wouldn’t have failed

The hardest part about the berry bowl would by far be determining the person taking the fruit

facial recognition is a thoroughly solved problem, at least in terms of the accuracy that we’re aiming for here
