You’re right. OP’s second question is more specifically about vision, while I answered more broadly.
Anyway, comparing it to data from a camera is not really possible.
Partly because of analogue vs. digital and so on, but also because of the way we experience it.
The mind’s interpretation of vision is developed after birth. It takes several weeks before an infant can recognise anything and use the eyes for any purpose. Infants are probably blissfully experiencing meaningless raw sensory input before that. All the pattern recognition that is used to focus on things is learned, and so depends on actually having learned it.
I can’t find the source for this story, but allegedly there was a missionary in Africa who came across a tribe that lived in the jungle and had been surrounded by dense forest their entire lives. He took some of them to the savannah and showed them the open view. They then tried to grab the animals that were grazing miles away. They hadn’t developed a sense of perspective for things at longer distances, because they’d never experienced it.
I don’t know if it’s true, but it makes a point. Some people are better than others at spotting things in motion, telling colours apart, etc. It matters how we use vision, even in the moment. If I ask you to count all the red things in a room, you’ll see more red things than you were previously aware of. So focus is not just the 6° angle or whatever. It’s also what your brain recognises based on the pattern it has in mind.
So the idea of quantifying vision in megapixels and framerate is kind of useless for understanding either vision or the brain. The two are connected.
Same with sound. Some people have proved able to use echolocation, similar to bats. You could test their “vision” blindfolded and they’d still make their way through a labyrinth or whatever.
Testing senses is difficult because the brain tends to compensate in that way. It’d take a very precise testing method to quantify any one particular sense.