III: Small: A Query System for Rapid Audiovisual Analysis of Large-Scale Video Collections

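Basic Information

Award number:
1908727
Principal investigator:
Kayvon Fatahalian
Amount:
approx. $500,000
Host institution:
Stanford University
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2019
Funding country:
United States
Start and end dates:
2019-09-01 to 2022-08-31
Project status:
Completed

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1908727&HistoricalAwards=false
Keywords:
III Small Query System Rapid

Project Abstract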

Due to advances in computer vision and audio processing, it is now possible to automatically annotate large video collections with basic information about their audiovisual contents (e.g., people, places, objects, audio transcripts). However, it remains difficult to productively carry out higher-level analytics tasks on video because of the challenges of defining higher-level, complex events of interest. In response, this project seeks to enable more sophisticated and higher-productivity video analysis through the design of a programming system for composing basic video annotations into higher-level patterns and events of interest in a video. Queries authored in the proposed system can serve as a direct specification of video events of interest, or as a mechanism for automatically generating data labels that provide supervision for subsequent model training. The proposed systems will be applicable to many video domains; however, the project will feature a collaboration with journalists and news media personnel to conduct an audiovisual analysis that measures diversity and representation in nearly a decade of American cable TV news broadcasts (over 200,000 hours since 2010). Specifically, the project will create software tools for answering questions such as: What individuals appear most often on the news? In what contexts (in interviews, on panels)? What topics and stories do certain individuals cover? In addition to disseminating the results of these analyses, the project will produce interactive web-based tools that will enable students and the public to perform their own diversity analyses of the contents of cable TV news.

The primary technical challenge of the project involves the design of a new video analysis system for defining spatio-temporal patterns and events of interest in video. Inspired by early multimedia database query systems, the system will support multi-modal video analyses by representing all video annotations (whether they result from pixels, audio, or transcripts) as continuous space-time volumes in a video. Users will define complex patterns via queries that compose (via spatio-temporal relations) and manipulate collections of simpler space-time annotations. The compositional nature of these queries will allow them to execute rapidly on large video collections, enabling analysts to iteratively conceptualize, prototype, and specify novel high-level patterns in videos. To reduce the cost of annotating large video collections, the project will also exploit the long-running nature of TV and film video streams to train low-cost models that are specific to a show or film's video content. The project will investigate the use of model distillation (and do so in a continuous, online setting) to train face and object detection models that maintain high accuracy on a video stream at an order of magnitude lower runtime cost than existing methods. All systems developed as part of the project will be distributed to the public as open source software, and the project will involve hosted hackathons to educate students and the broader community about their use.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
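The query model described in the abstract, where every annotation is a space-time volume and complex events are built by composing simpler annotations through spatio-temporal relations, can be illustrated with a minimal Python sketch. This is not the project's software: the `Volume` class, the `left_of` predicate, the `join` function, and the sample host and guest annotations are all hypothetical names and data introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Volume:
    """Hypothetical space-time annotation: a time interval (seconds)
    plus a bounding box in normalized [0, 1] frame coordinates."""
    t1: float
    t2: float
    x1: float = 0.0
    y1: float = 0.0
    x2: float = 1.0
    y2: float = 1.0
    label: str = ""


def overlaps_in_time(a: Volume, b: Volume) -> bool:
    """True when the two time intervals share a non-empty span."""
    return a.t1 < b.t2 and b.t1 < a.t2


def left_of(a: Volume, b: Volume) -> bool:
    """Spatial relation: a's box lies entirely to the left of b's box."""
    return a.x2 <= b.x1


def join(left: List[Volume], right: List[Volume],
         spatial: Callable[[Volume, Volume], bool],
         label: str) -> List[Volume]:
    """Compose two annotation sets: keep pairs that overlap in time and
    satisfy the spatial relation. Each result spans the shared time
    interval and the union of the two bounding boxes."""
    out = []
    for a in left:
        for b in right:
            if overlaps_in_time(a, b) and spatial(a, b):
                out.append(Volume(
                    t1=max(a.t1, b.t1), t2=min(a.t2, b.t2),
                    x1=min(a.x1, b.x1), y1=min(a.y1, b.y1),
                    x2=max(a.x2, b.x2), y2=max(a.y2, b.y2),
                    label=label))
    return out


# Made-up face annotations, as a detector might produce them.
hosts = [Volume(0, 30, 0.1, 0.2, 0.4, 0.8, "host"),
         Volume(100, 120, 0.3, 0.2, 0.7, 0.8, "host")]
guests = [Volume(10, 40, 0.6, 0.2, 0.9, 0.8, "guest"),
          Volume(200, 220, 0.6, 0.2, 0.9, 0.8, "guest")]

# "Interview" candidate: host on the left, guest on the right, same time.
interviews = join(hosts, guests, left_of, "interview")
for v in interviews:
    print(v)
```

Because the result of `join` is itself a list of `Volume` objects, it can be fed into further joins, which is the compositional property the abstract emphasizes. The nested loop is quadratic and only suitable for illustration; a real system would need indexing to run on large collections.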

Project Outcomes

Journal papers (6)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Analysis of Faces in a Decade of US Cable TV News

DOI:
10.1145/3447548.3467134
Published:
2021
Venue:
KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
Impact factor:
0
Authors:
Hong, James; Crichton, Will; Zhang, Haotian; Fu, Daniel Y.; Ritchie, Jacob; Barenholtz, Jeremy; Hannel, Ben; Yao, Xinwei; Murray, Michaela; Moriba, Geraldine
Corresponding author:
Moriba, Geraldine

Spotting Temporally Precise, Fine-Grained Events in Video

DOI:
10.48550/arxiv.2207.10213
Published:
2022-07
Venue:
ArXiv
Impact factor:
0
Authors:
James Hong; Haotian Zhang; Michaël Gharbi; Matthew Fisher; Kayvon Fatahalian
Corresponding author:
James Hong; Haotian Zhang; Michaël Gharbi; Matthew Fisher; Kayvon Fatahalian

Learning Rare Category Classifiers on a Tight Labeling Budget

DOI:
10.1109/iccv48922.2021.00831
Published:
2021-10
Venue:
2021 IEEE/CVF International Conference on Computer Vision (ICCV)
Impact factor:
0
Authors:
Ravi Teja Mullapudi; Fait Poms; W. Mark; Deva Ramanan; Kayvon Fatahalian
Corresponding author:
Ravi Teja Mullapudi; Fait Poms; W. Mark; Deva Ramanan; Kayvon Fatahalian

Low-Shot Validation: Active Importance Sampling for Estimating Classifier Performance on Rare Categories

DOI:
10.1109/iccv48922.2021.01053
Published:
2021-09
Venue:
2021 IEEE/CVF International Conference on Computer Vision (ICCV)
Impact factor:
0
Authors:
Fait Poms; Vishnu Sarukkai; Ravi Teja Mullapudi; N. Sohoni; W. Mark; Deva Ramanan; Kayvon Fatahalian
Corresponding author:
Fait Poms; Vishnu Sarukkai; Ravi Teja Mullapudi; N. Sohoni; W. Mark; Deva Ramanan; Kayvon Fatahalian

Background Splitting: Finding Rare Classes in a Sea of Background

DOI:
10.1109/cvpr46437.2021.00795
Published:
2020-08
Venue:
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Impact factor:
0
Authors:
Ravi Teja Mullapudi; Fait Poms; W. Mark; Deva Ramanan; Kayvon Fatahalian
Corresponding author:
Ravi Teja Mullapudi; Fait Poms; W. Mark; Deva Ramanan; Kayvon Fatahalian

Other publications by Kayvon Fatahalian

Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods

DOI:
Published:
2020
Venue:
International Conference on Machine Learning
Impact factor:
0
Authors:
Daniel Y. Fu; Mayee F. Chen; Frederic Sala; Sarah Hooper; Kayvon Fatahalian; Christopher Ré
Corresponding author:
Christopher Ré

Vid2Player: Controllable Video Sprites That Behave and Appear Like Professional Tennis Players

DOI:
Published:
2020
Venue:
ACM Transactions on Graphics
Impact factor:
6.2
Authors:
Haotian Zhang; Cristobal Sciutto; Maneesh Agrawala; Kayvon Fatahalian
Corresponding author:
Kayvon Fatahalian

Creating an Agile Hardware Design Flow

DOI:
Published:
2020
Venue:
Design Automation Conference
Impact factor:
0
Authors:
Rick Bahr; Clark W. Barrett; Nikhil Bhagdikar; Alex Carsello; Ross G. Daly; Caleb Donovick; David Durst; Kayvon Fatahalian; Kathleen Feng; P. Hanrahan; Teguh Hofstee; M. Horowitz; Dillon Huff; Fredrik Kjolstad; Taeyoung Kong; Qiaoyi Liu; Makai Mann; J. Melchert; Ankita Nayak; Aina Niemetz; Gedeon Nyengele; Priyanka Raina; Stephen Richardson; Rajsekhar Setaluri; Jeff Setter; Kavya Sreedhar; Maxwell Strange; James J. Thomas; Christopher Torng; Lenny Truong; Nestan Tsiskaridze; Keyi Zhang
Corresponding author:
Keyi Zhang

Type-directed scheduling of streaming accelerators

DOI:
Published:
2020
Venue:
ACM-SIGPLAN Symposium on Programming Language Design and Implementation
Impact factor:
0
Authors:
David Durst; Matthew Feldman; Dillon Huff; David Akeley; Ross G. Daly; G. Bernstein; Marco Patrignani; Kayvon Fatahalian; P. Hanrahan
Corresponding author:
P. Hanrahan