Towards Total Scene Understanding using Structured.. (HELIOS)
Towards Total Scene Understanding using Structured Models
Start date: Jan 1, 2014,
End date: Dec 31, 2018
"This project is at the interface between computer vision and linguistics: the aim is to have an algorithm generate relevant sentences that describe a scene given one or more images.Scene understanding has been one of the central goals in computer vision for many decades. It involves various individual tasks, such as object recognition, action understanding and 3D scene recovery. One simple definition of this task is to say scene understanding is equivalent to being able to generate meaningful natural language descriptions of a scene, an important problem in computational linguistics. Whilst even a child can do this with ease, the solution of this fundamental problem has remained elusive. This is because there has been a large amount of research in computer vision that is very deep, but not broad, leading to an in depth understanding of edge and feature detectors, tracking, camera calibration, projective geometry, segmentation, denoising, stereo methods, object detection etc. However, there has been only a limited amount of research on a framework for integrating these functional elements into a method for scene understanding.Within this proposal I advocate a complete view of computer vision, in which the scene is dealt with as a whole, in which problems which are normally considered distinct by most researchers are unified into a common cost function or energy. I will discuss the form the energy should take and efficient algorithms for learning and inference. Our preliminary experiments indicate that such a unified treatment will lead to a paradigm shift in computer vision with a quantum leap in performance. We intend to build embodied demonstrators including a prosthetic vision aid to the visually impaired. The World Health Organization gives a figure of over 300 million such people world wide, which means that in addition to being transformative in the areas of linguistics, HCI, robotics, and computer vision, this work will have a massive impact world wide"
Get Access to the 1st Network for European Cooperation