Vision as Bayesian Inference

Transcript of Vision as Bayesian Inference

Page 1: Vision as Bayesian Inference

Computational Cognition, Vision, and Learning

Vision as Bayesian Inference

• It is attractive to formulate Vision, and other aspects of intelligence, in terms of probabilistic inference.

• Vision is an inverse inference problem so it can be formulated as Bayesian Inference.

Page 2: Vision as Bayesian Inference

Vision as Inverse Inference

• Helmholtz (1821-1894).

• More recently: inverse computer graphics.

• Vision must invert the forward process (CG) to discover the causal factors that have generated the images.

• But only for the tasks that we care about (attention, change blindness).

Page 3: Vision as Bayesian Inference

Richard Gregory

• “Perception (vision) as hypotheses”.

• Perception is not just a passive acceptance of stimuli, but an active process involving memory and other internal processes.

• Humans have internal representations: we see images when we dream, we can imagine what animals and people will do, and we can hallucinate.

Page 4: Vision as Bayesian Inference

Bayesian Decision Theory

• Bayes’ Theorem is a conceptual framework for inverse inference problems.

• We can infer the state S of the world from the image I using prior knowledge.

• P(S|I) = P(I|S)P(S)/P(I).

• Rev. T. Bayes (1702-1761).

• P(I|S) is the likelihood and P(S) is the prior.
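
As a minimal sketch of the formula P(S|I) = P(I|S)P(S)/P(I), the snippet below computes a posterior over a small discrete set of world states. The state names and all the numbers are hypothetical, chosen only for illustration; the function name `posterior` is likewise an assumption and not part of the slides.

```python
def posterior(prior, likelihood):
    """Return P(S|I) for each state S, given P(S) and P(I|S) for one observed image I."""
    # Unnormalized posterior for each state: P(I|S) * P(S)
    joint = {s: likelihood[s] * prior[s] for s in prior}
    # Evidence P(I) is the sum of P(I|S) P(S) over all states S
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("the image has zero probability under every state")
    return {s: p / evidence for s, p in joint.items()}


# Hypothetical two-state example (illustrative numbers only).
prior = {"convex": 0.7, "concave": 0.3}        # P(S)
likelihood = {"convex": 0.4, "concave": 0.8}   # P(I|S) for the observed image I
post = posterior(prior, likelihood)            # P(S|I)
```

Here the image is twice as probable under "concave", yet the prior still tips the posterior toward "convex". Dividing by P(I) only rescales the scores so that they sum to one; it does not change which state is most probable.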

Page 5: Vision as Bayesian Inference

Inverse Problems are hard

• There are an infinite number of ways that images can be formed.

• Why do we see a cube?

• The likelihood P(I|S) rules out some interpretations S.

• Prior P(S): cubes are more likely than other shapes consistent with the image.
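
The cube argument can be sketched in code under strong simplifying assumptions: orthographic projection (depth is simply dropped), a hard 0/1 likelihood, and a made-up prior that prefers shapes whose edges all have the same length. None of these choices come from the slides; the function names and the three candidate shapes are hypothetical.

```python
import itertools
import math

# Corners of a unit cube and its 12 edges (pairs of corners differing in one coordinate).
CORNERS = list(itertools.product([0.0, 1.0], repeat=3))
EDGES = [(i, j) for i, j in itertools.combinations(range(8), 2)
         if sum(a != b for a, b in zip(CORNERS[i], CORNERS[j])) == 1]


def rotate(point, angle_x, angle_y):
    """Rotate a 3D point about the x-axis, then about the y-axis (radians)."""
    x, y, z = point
    y, z = (y * math.cos(angle_x) - z * math.sin(angle_x),
            y * math.sin(angle_x) + z * math.cos(angle_x))
    x, z = (x * math.cos(angle_y) + z * math.sin(angle_y),
            -x * math.sin(angle_y) + z * math.cos(angle_y))
    return (x, y, z)


def project(shape):
    """Forward process (assumed orthographic): drop the depth coordinate z."""
    return [(x, y) for x, y, _ in shape]


def likelihood(shape, image, tol=1e-9):
    """Hard likelihood: 1 if the shape projects onto the image, else 0."""
    consistent = all(math.dist(p, q) <= tol for p, q in zip(project(shape), image))
    return 1.0 if consistent else 0.0


def edge_length_spread(shape):
    """Variance of the 12 edge lengths; zero for a true cube."""
    lengths = [math.dist(shape[i], shape[j]) for i, j in EDGES]
    mean = sum(lengths) / len(lengths)
    return sum((length - mean) ** 2 for length in lengths) / len(lengths)


def prior(shape):
    """Hypothetical unnormalized prior favouring equal edge lengths."""
    return math.exp(-edge_length_spread(shape))


cube = [rotate(c, 0.5, 0.6) for c in CORNERS]
stretched = [(x, y, 3.0 * z + 0.5 * x) for x, y, z in cube]  # same image, different depths
widened = [(2.0 * x, y, z) for x, y, z in cube]              # different image

image = project(cube)
hypotheses = {"cube": cube, "stretched": stretched, "widened": widened}
scores = {name: likelihood(s, image) * prior(s) for name, s in hypotheses.items()}
best = max(scores, key=scores.get)
```

The stretched shape produces exactly the same image as the cube, so the likelihood cannot separate them; only the prior does. Any other reassignment of depths would also match the image, which is the sense in which infinitely many shapes are consistent with it.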

Page 6: Vision as Bayesian Inference

Inverse inference requires priors

• Humans use prior knowledge about the world (obtained through experience). It is often correct, but can fail occasionally.

• Flying carpet? Levitate?

• Play ball-in-box.

Page 7: Vision as Bayesian Inference

Analysis by Synthesis: Mumford & Grenander

• Grenander (1960s) proposed that vision could be formulated as pattern theory and introduced the idea of “analysis by synthesis”. This is naturally expressed in Bayesian terms.

• Mumford embraced Analysis by Synthesis and Pattern Theory.

• Analysis by Synthesis emphasizes pattern synthesis as well as pattern analysis. Bayesian inference requires that you construct a prior probability model of whatever signals or situations you are modeling, and you should always test your prior by sampling to see which features it models accurately and which it does not.
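
The advice to test a prior by sampling can be illustrated with a toy check. The sketch below draws 1D signals from two hypothetical priors (independent Gaussian noise and a first-order autoregressive model) and measures one feature, the correlation between neighbouring samples. Both priors, the chosen feature, and the parameter value 0.9 are assumptions made for illustration, not models from the slides.

```python
import random


def sample_iid(n, rng):
    """Sample from a prior in which every value is independent Gaussian noise."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def sample_ar1(n, rho, rng):
    """Sample from a first-order autoregressive prior with unit stationary variance."""
    x = [rng.gauss(0.0, 1.0)]
    noise_std = (1.0 - rho ** 2) ** 0.5
    for _ in range(n - 1):
        x.append(rho * x[-1] + noise_std * rng.gauss(0.0, 1.0))
    return x


def neighbor_correlation(x):
    """Feature to check: correlation between each sample and its right neighbour."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1)) / (n - 1)
    return cov / var


rng = random.Random(0)  # fixed seed so the check is repeatable
iid_corr = neighbor_correlation(sample_iid(20000, rng))
ar1_corr = neighbor_correlation(sample_ar1(20000, 0.9, rng))
```

If the signals being modeled are known to vary smoothly, samples from the independent prior expose its failure immediately: their neighbour correlation is near zero, while the autoregressive prior reproduces a correlation near the chosen 0.9. The same sample-then-measure loop applies to any feature of interest.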

Page 8: Vision as Bayesian Inference

Mumford’s Bold Hypothesis

• Mumford (1991) boldly proposed a model for how a primate brain could perform analysis by synthesis using bottom-up and top-down processing.

• He proposed that each area of the cortex carries on its calculations with the active participation of a nucleus in the thalamus with which it is reciprocally and topographically connected. This nucleus plays the role of an ‘active blackboard’ on which the current best reconstruction of some aspect of the world is always displayed.

• Each cortical area maintains and updates the organism's knowledge of a specific aspect of the world, ranging from low level raw data to high level abstract representations, and involving interpreting stimuli and generating actions.

• It draws on multiple sources of expertise, learned from experience, creating multiple, often conflicting, hypotheses which are integrated by the action of the thalamic neurons and then sent back to the standard input layer of the cortex.

Page 9: Vision as Bayesian Inference

Mumford’s bold hypothesis for the architecture of the neocortex

• Each higher area of the neocortex attempts to fit its abstractions to the data it receives from lower areas by sending back to them from its deep pyramidal cells a template reconstruction best fitting the lower level view.

• Each lower area attempts to reconcile the reconstruction of its view that it receives from higher areas with what it knows, sending back from its superficial pyramidal cells the features in its data which are not predicted by the higher area.

• The whole calculation is done with all areas working simultaneously, but with order imposed by synchronous activity in the various top-down, bottom-up loops.

• Neuroscience experiments give increasing support for top-down models and maybe for analysis by synthesis.
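
The top-down, bottom-up loop can be caricatured with a toy linear model. This is not Mumford's model, only a sketch under assumed simplifications: one higher and one lower area, a fixed linear generative matrix `W`, and a plain gradient update. All names and numbers are hypothetical.

```python
def top_down(W, z):
    """Higher area to lower area: template reconstruction W z of the lower-level view."""
    return [sum(W[i][k] * z[k] for k in range(len(z))) for i in range(len(W))]


def bottom_up(x, prediction):
    """Lower area to higher area: the part of the data the template did not predict."""
    return [xi - pi for xi, pi in zip(x, prediction)]


def update(W, z, residual, step):
    """Adjust the abstraction z with a gradient step on 0.5 * ||x - W z||^2."""
    return [z[k] + step * sum(W[i][k] * residual[i] for i in range(len(W)))
            for k in range(len(z))]


# Hypothetical generative matrix: 2 abstract causes produce 3 low-level features.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
x = top_down(W, [2.0, -1.0])  # data produced by hidden causes (2, -1)

z = [0.0, 0.0]                # the higher area's initial abstraction
for _ in range(200):
    residual = bottom_up(x, top_down(W, z))  # unexplained features travel upward
    z = update(W, z, residual, step=0.2)     # the abstraction is revised

final_residual = bottom_up(x, top_down(W, z))
```

The loop runs until the template explains the data and the upward signal vanishes, at which point `z` has recovered the hidden causes. In this sketch the two areas take turns, whereas the hypothesis above has all areas working simultaneously.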

Page 10: Vision as Bayesian Inference

Vision as Bayesian Inference: Yuille & Kersten

• A. Yuille & D. Kersten. Trends in Cognitive Sciences. 2006.

Page 11: Vision as Bayesian Inference

Vision as Bayesian Inference: Yuille & Kersten

• Goats that kill!

Page 12: Vision as Bayesian Inference

Vision as Bayesian Inference: Yuille & Kersten

• Adding more realism building on work by Z. Tu & S.C. Zhu 2002, Z. Tu et al. 2006.

Page 13: Vision as Bayesian Inference

Vision as Bayesian Inference

• This is Bayes-in-the-Big. It is a highly ambitious research program which is conceptually very attractive.

• Challenges include learning probability models to generate images of all the classes of objects, and background “stuff”, that occur in real images.

• A further challenge is developing efficient inference algorithms that can take an image as input and rapidly search over all the possible world configurations (of objects and background) that could have caused it.