
Voice and Video-based Service Provisioning on the Cloud


Basic Information

Grant number:
RTI-2022-00460
Principal investigator:
Zulkernine, Farhana
Amount:
$108,200 (approx.)
Host institution:
Queen's University
Host institution country:
Canada
Program:
Research Tools and Instruments
Fiscal year:
2021
Funding country:
Canada
Start and end dates:
2021-01-01 to 2022-12-31
Project status:
Completed

Source:
https://www.nserc-crsng.gc.ca/ase-oro/Details-Detailles_eng.asp?id=743071
Keywords:
Voice Video based Service Provisioning

Project Abstract

In the US, 127.1 million users use voice control in their vehicles. Surveys conducted by Voicebot.ai show that smart speaker usage frequency over time in the UK and Germany is quite high, indicating clear adoption of the technology. Although deep learning has greatly advanced the state of the art in natural language understanding, translation, and generation, more work is needed to reach human-level communication in terms of adapting to the listener's cognitive abilities, responding with emotion and compassion, understanding context, and delivering the best response from both unstructured and structured knowledge sources in real time. We propose to build a voice and video communication framework to provide cloud services and address the following research challenges: a) develop multi-stream voice and video data fusion to allow improved machine cognition; b) generate voice responses that adapt to users' voice, tone, vocabulary, and speed of speech for effective communication and better adoption of the technology by users of all age groups and diverse backgrounds; and c) enable real-time rule-based and information-retrieval-based question answering on the cloud using a hierarchical knowledge base combining in-memory and stored structured and unstructured data. We have done extensive work on the IBM Watson voice and video communication system, a dialogue and rule-based question answering (QA) system, an information-retrieval-based three-phase real-time QA system, and a hierarchical multilayer knowledge management system using in-memory graph clustering techniques for distributed knowledge management. However, none of this work addressed real-time voice and video communication for cloud service provisioning based on a hierarchical hybrid knowledge base, which we intend to address in the proposed research.
Real-time service provisioning on the cloud must address the security of voice and video data and of the knowledge base; it must execute third-party voice and video processing tools and applications at users' request, which puts the network at risk; and, most importantly, it requires robust deep learning models trained on domain-specific voice, video, and text data to enhance language understanding and response generation. A stand-alone cloud server with a custom distributed knowledge management system must be used to test the real-time usability of the system and to enable public access without risking the academic network. We request a compute server with high-memory GPUs, a firewall, and storage for robust training of deep learning models, for creating a publicly accessible cloud server that addresses security aspects, and for validating the system for student support, health data analytics, and knowledge services with the help of real people and research collaborators. Voice and video-based service provisioning on the cloud became a viable alternative during the pandemic, and our research will extend the technology to enable its application to home care and support services in Canada.


Project Outputs

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)


Other publications by Zulkernine, Farhana

Pan-Canadian Electronic Medical Record Diagnostic and Unstructured Text Data for Capturing PTSD: Retrospective Observational Study.

DOI:
10.2196/41312
Publication date:
2022-12-13
Journal:
JMIR Medical Informatics
Impact factor:
3.2
Authors:
Kosowan, Leanne; Singer, Alexander; Zulkernine, Farhana; Zafari, Hasan; Nesca, Marcello; Muthumuni, Dhasni
Corresponding author:
Muthumuni, Dhasni