Visual Image Interpretation in Humans and Machines

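Basic Information

Grant number:
EP/L014564/1
Principal investigator:
Andrew Schofield
Amount:
$154,600
Host institution:
University of Birmingham
Host institution country:
United Kingdom
Project category:
Research Grant
Fiscal year:
2014
Funding country:
United Kingdom
Start and end dates:
2014 to (no data)
Project status:
Completed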

Source:
https://gtr.ukri.org/projects?ref=EP%2FL014564%2F1
Keywords:
Visual Image Interpretation Humans Machines

Project Abstract

The sense of vision is so fundamental to humans that it is a largely automated process which appears to us as extremely easy. This would suggest that it should be easy to make a computer see like a human. In fact this is a very difficult task because the biological visual system is very complex, occupying about one quarter of the human brain. Human vision is both highly effective and efficient. For example, it is capable of identifying around 10,000 different object categories and can learn new categories from single examples. This is achieved with a system requiring just 20 watts of power and weighing 1.4 kg. No computer system can match this performance for recognition ability, learning efficiency and power consumption.

One way to devise new computer vision methods is to understand how biological visual systems work. However, the complexity of vision has made this very difficult, and some researchers have concentrated their efforts on understanding biological vision while others have sought independent solutions to specific problems in computer vision. For example, humans can read car number plates, but we do so using a general-purpose visual system that can also read gothic script and handwriting as well as performing a host of other tasks. Building a number plate recognition system to read letters in the same general way that humans do would be difficult. However, because number plates have a fixed format (they are always a certain bright colour, and the font is always a certain style and size), building a computer vision system just to read number plates, and nothing else, is a much simpler task.

There are some tasks that have not proved simple for computer vision and where understanding biological vision is likely to be essential to future success. One example is matching the appearance of two surfaces. Suppose you wanted to make artificial stone look exactly like the real stones in a building. To get the recipe just right you would have to know not just the physical properties of the original stone (which probably cannot be matched exactly) but also how the human visual system is likely to perceive the stone. You can then pick a recipe that may not mimic the stone exactly but which will look just like the real stone to humans. Moreover, if you know how the visual system processes the colours and textures of surfaces, you can build a computerised tool that can predict recipes automatically.

Another area of interest is computer graphics. One way to make computer graphics look convincing is to model exactly the physics of the thing you are trying to represent. However, such rendering methods are often very time consuming and computationally expensive. Because the human visual system does not see every detail in an object, it is often possible to render graphics much more quickly and effectively using perceptual rendering techniques that exploit knowledge of how the human visual system will process each scene.

Because researchers working on biological vision tend to be from Biology and Psychology backgrounds, and those who research computer vision from Computer Science and Engineering backgrounds, there is often a gap in understanding between the two groups which makes it hard for them to work together on problems such as those outlined above. The aim of this Network is to bring such researchers closer together, both physically and scientifically, so that they can identify and work together on the challenging problems where success is most likely. We will achieve this through a series of away-day-style meetings and conferences, and by funding junior scientists and PhD students to spend time working in a lab from a different discipline.