
Computer vision algorithms for live video processing using programmable graphics hardware


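Basic information

Grant number:
293127-2012
Principal investigator:
Gong, Minglun
Amount:
$16,000
Host institution:
Memorial University of Newfoundland
Host institution country:
Canada
Program type:
Discovery Grants Program - Individual
Fiscal year:
2015
Funding country:
Canada
Start and end dates:
2015-01-01 to 2016-12-31
Project status:
Completed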

Source:
https://www.nserc-crsng.gc.ca/ase-oro/Details-Detailles_eng.asp?id=571401
Keywords:
Computer vision algorithms live video

Project abstract

Today's technology is increasingly powerful and affordable. For example, an 800-dollar graphics card today can perform one trillion floating-point operations per second (1 TFLOPS) in double precision. Merely a decade ago, such processing power would have come with a one-million-dollar price tag. Similarly, digital video cameras at that time were available only to movie producers, whereas today they are in the hands of billions of users and are integrated into phones, vehicles, and game consoles. The availability of low-cost processing power and digital video capture devices allows computer vision techniques to affect and benefit our day-to-day lives. They are making our phones smarter, our vehicles safer, and our game consoles much more fun to interact with. This research investigates how to perform challenging computer vision tasks on live video at real-time speed. The tasks include inferring depth from video sequences captured from different viewpoints (stereo vision), detecting moving foreground objects against dynamic backgrounds (foreground segmentation), and extracting objects with fuzzy boundaries so that they blend seamlessly with new backgrounds (video matting). Being able to perform these tasks on live video has a wide range of applications in daily life. For example, real-time stereo vision can be employed for sensing the 3D environment, and foreground segmentation for detecting pedestrians; both are key components of future driverless cars. Foreground segmentation and video matting can be used to extract video conference participants from captured video and place them in the same virtual environment, providing a better sense of presence and better protection of privacy. The goal of the research is to develop novel algorithms that are not only fast enough to handle live video, but also perform as well as or better than state-of-the-art offline algorithms. The algorithms to be developed therefore need to be both effective and efficient.
In addition, to harness the processing power of modern graphics hardware, these algorithms will be designed with parallel execution in mind. Some preliminary work along this research direction has yielded very promising results.
