Saturday, February 16, 2013
Room 306 (Hynes Convention Center)
People can efficiently understand, at a glance, a complex scene containing multiple objects. The recognition process often ignores many of the details in the scene and quickly focuses on meaningful objects and configurations. This capacity is difficult to replicate in current computational models, which typically focus on salient and statistically significant events in the sensory input. I will examine this problem from a computational standpoint, focusing mainly on learning during early developmental stages. I will show how the visual system can quickly learn to extract and represent objects and events that are meaningful to the observer, even when they are highly variable and non-salient in the input image. I will use this to discuss some general issues in efficiently combining innate structures with learning from experience.