Collaborative Research: CIF: Small: Theory for Learning Lossless and Lossy Coding
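Basic information
- Award number: 2324396
- Principal investigator:
- Amount: $400,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2023
- Funding country: United States
- Period: 2023-10-01 to 2026-09-30
- Project status: Ongoing
- Source:
- Keywords:
Project abstract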
An estimated 330,000 billion bytes of data are generated daily in various forms: video, images, and music, but also scientific, economic, and industrial content. This enormous amount of data has already transformed modern life in ways that are transparent (social media) and in ways that are not immediately visible (furthering scientific, business, and economic goals through better modeling, forecasting, and use of data). Data is communicated, often wirelessly, on massive scales in many formats: videos, images, and music, and in real-time applications such as gaming, streaming content, video calls, and telemedicine. To handle this amount of data, it must be compressed by algorithms that examine the data to understand the underlying structure and remove redundant descriptions, thereby using fewer bits to represent the same content. Traditional compression methods include the well-known JPEG (Joint Photographic Experts Group) compression, used for images from smartphones, for example. JPEG is a lossy compression method, as some image quality is lost. Lossless compression, with no quality loss, is typically used for compressing computer files (e.g., with Zip) and for lossless music streaming. In recent years, machine learning has become very powerful and is used to solve many problems, such as autonomous driving, speech recognition, and implementing chatbots. A recent focus is the use of machine learning for data compression. The aim of this project is to understand the fundamental theory of machine learning for data compression: for example, what types of machine learning algorithms can compress data well, and how many samples are needed to learn compression well.
Through this fundamental understanding of data compression using machine learning, the aim is to develop more powerful compression methods, leading to more efficient use of wireless spectrum and lower energy consumption by mobile devices.
Recently, both researchers and high-tech companies have put much effort into developing machine learning methods for source coding. These methods have had some success in beating traditional source coding methods. The project aims to develop fundamental bounds on the performance of learning for both lossless and lossy source coding. The problem is framed in a probably approximately correct (PAC) learning framework, both uniform and non-uniform. The first part of the research considers lossless source coding, which is of interest both in itself and as a basis for lossy source coding, and aims to develop bounds for learning. The project investigates what factors influence the convergence of learning. This is extended with an active learning framework, in which the algorithms can adapt how much data they need to examine, using more data for more subtle models and less data for simpler models, and determining when the underlying model may be simple by means of what is known as a "stopping rule." The second part of the research considers lossy source coding, in particular almost lossless source coding and lossless coding of real-valued sources. The aim is to understand in what sense source coding can be learned (e.g., uniform vs. non-uniform PAC) and, based on this, to develop performance bounds. Estimation, compression, and learning have always been known to be subtly different, and these nuances translate into quantifiably large implications for problems harnessing them; this research will resolve some of these tangles, particularly for sources with memory.
The fundamental understanding of learning for coding developed through this project will in turn result in the development of better coding methods.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
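As a rough illustration of the lossless question raised in the abstract (how many samples are needed to learn compression well), the following minimal Python sketch learns a code from samples and measures its redundancy, that is, the extra bits per symbol paid relative to the source entropy. It then estimates how often that redundancy exceeds a tolerance, which is the kind of event a PAC-style bound would control. This is not the project's method: the memoryless four-symbol source, the add-one estimator, the tolerance `eps`, and all function names are assumptions chosen only for illustration.

```python
import math
import random
from collections import Counter


def learn_distribution(samples, alphabet):
    """Add-one (Laplace) estimate, so every symbol gets nonzero probability."""
    counts = Counter(samples)
    total = len(samples) + len(alphabet)
    return {a: (counts[a] + 1) / total for a in alphabet}


def expected_code_length(p, q):
    """Expected bits/symbol when source p is coded with ideal lengths -log2 q."""
    return sum(p[a] * -math.log2(q[a]) for a in p if p[a] > 0)


def entropy(p):
    """Entropy of p in bits/symbol (the best achievable expected length)."""
    return expected_code_length(p, p)


def redundancy(p, q):
    """Extra bits/symbol for coding with q instead of p; equals KL(p || q)."""
    return expected_code_length(p, q) - entropy(p)


def failure_rate(p, n, eps, trials=200, seed=0):
    """Fraction of trials in which the learned code's redundancy exceeds eps."""
    rng = random.Random(seed)
    alphabet = list(p)
    weights = [p[a] for a in alphabet]
    bad = 0
    for _ in range(trials):
        samples = rng.choices(alphabet, weights=weights, k=n)
        q = learn_distribution(samples, alphabet)
        if redundancy(p, q) > eps:
            bad += 1
    return bad / trials


if __name__ == "__main__":
    # Hypothetical source, assumed memoryless for simplicity.
    source = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
    for n in (10, 100, 1000):
        print(n, failure_rate(source, n, eps=0.05))
```

In this toy setting the failure rate shrinks as the sample size grows. The abstract's harder cases (sources with memory, non-uniform PAC guarantees, stopping rules) are not captured by this sketch.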
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Anders Host-Madsen
EAGER: Real Time Federated Learning using Kernel Methods
- Award number: 2142987
- Fiscal year: 2021
- Amount: $400,000
- Project type: Standard Grant
CIF: Small: Description Length Analysis for Machine Learning and Graph Models
- Award number: 1908957
- Fiscal year: 2019
- Amount: $400,000
- Project type: Standard Grant
Collaborate Research: Delay and Energy: Design Tradeoffs in Spectrally Efficient Systems
- Award number: 1923751
- Fiscal year: 2019
- Amount: $400,000
- Project type: Standard Grant
CIF: EAGER: Information Theory Approaches for finding Atypical Sequences
- Award number: 1434600
- Fiscal year: 2014
- Amount: $400,000
- Project type: Standard Grant
CIF: Small: Collaborative Research: Minimum Energy Communications in Wireless Networks
- Award number: 1017823
- Fiscal year: 2010
- Amount: $400,000
- Project type: Standard Grant
Collaborative Research: Capacity and Coding in Resource-Limited Wireless Networks
- Award number: 0729152
- Fiscal year: 2007
- Amount: $400,000
- Project type: Standard Grant
SENSORS: Cooperative Diversity for Wireless Sensor Networks
- Award number: 0329908
- Fiscal year: 2003
- Amount: $400,000
- Project type: Standard Grant
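Similar NSFC (National Natural Science Foundation of China) grants
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Approval year: 2024
- Amount: CNY 0
- Project type: Provincial/municipal project
Cell Research
- Award number: 31224802
- Approval year: 2012
- Amount: CNY 240,000
- Project type: Special fund project
Cell Research
- Award number: 31024804
- Approval year: 2010
- Amount: CNY 240,000
- Project type: Special fund project
Cell Research
- Award number: 30824808
- Approval year: 2008
- Amount: CNY 240,000
- Project type: Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Approval year: 2007
- Amount: CNY 450,000
- Project type: General Program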
Similar overseas grants
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award number: 2403122
- Fiscal year: 2024
- Amount: $400,000
- Project type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402815
- Fiscal year: 2024
- Amount: $400,000
- Project type: Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
- Award number: 2343599
- Fiscal year: 2024
- Amount: $400,000
- Project type: Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
- Award number: 2343600
- Fiscal year: 2024
- Amount: $400,000
- Project type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402817
- Fiscal year: 2024
- Amount: $400,000
- Project type: Standard Grant
Collaborative Research: NSF-AoF: CIF: Small: AI-assisted Waveform and Beamforming Design for Integrated Sensing and Communication
- Award number: 2326622
- Fiscal year: 2024
- Amount: $400,000
- Project type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402816
- Fiscal year: 2024
- Amount: $400,000
- Project type: Standard Grant
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award number: 2403123
- Fiscal year: 2024
- Amount: $400,000
- Project type: Standard Grant
Collaborative Research: NSF-AoF: CIF: Small: AI-assisted Waveform and Beamforming Design for Integrated Sensing and Communication
- Award number: 2326621
- Fiscal year: 2024
- Amount: $400,000
- Project type: Standard Grant
Collaborative Research: CIF: Small: Versatile Data Synchronization: Novel Codes and Algorithms for Practical Applications
- Award number: 2312872
- Fiscal year: 2023
- Amount: $400,000
- Project type: Standard Grant


















