CIF: Medium: Collaborative Research: Information Theory and Statistical Inference from Large-Alphabet Data
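Basic Information
- Award number: 1065632
- Principal investigator:
- Amount: $369,600
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2011
- Funding country: United States
- Period: 2011-08-01 to 2016-07-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract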
Statistical analysis is key to many challenging applications such as text classification, speech recognition, and DNA analysis. Often, however, the amount of data available is comparable to, or even smaller than, the size of the set of symbols (the alphabet) that constitutes the data. Unfortunately, not much is known about optimal inference in this so-called large-alphabet domain. Recently, several promising approaches have been developed by different scientific communities, including Bayesian nonparametrics in statistics and machine learning, universal compression in information theory, and the theory of graph limits in mathematics and computer science.
The investigators study the problem by drawing on these multiple perspectives, with a particular focus on developing the information-theoretic approach. The research studies analytical properties of the "pattern maximum likelihood" estimator, which performs well in practice but is not understood theoretically, and also explores computational speedups. Moreover, it attempts to delineate which problem classes are better handled by Bayesian nonparametric techniques and which by the pattern approach, and explores links between these approaches. The investigators use the resulting theory for automatic document classification, allowing for more automation in storing, retrieving, and analyzing data. Furthermore, the investigators use the theory to study genetic variations; linking these variations to disease diagnosis is a crucial step in the systematic quantification of biology, which is playing an increasingly important role in medical advancement. The research also brings new courses to the classroom, with a special outreach effort to involve women and under-represented minorities, including through the Native Hawaiian Science and Engineering Mentorship Program.
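The abstract names the "pattern maximum likelihood" estimator without defining it. As a rough illustration only, the sketch below assumes the usual definition from the universal compression literature: the pattern of a sequence replaces each symbol by the order in which it first appeared, and the estimator picks the distribution that maximizes the probability of the observed pattern. The function names `pattern` and `pattern_probability` are hypothetical, and the brute-force probability is an assumption made for clarity, not the investigators' algorithm; it is only practical for very small alphabets.

```python
from collections import Counter
from itertools import permutations
from math import prod


def pattern(sequence):
    """Replace each symbol by the order of its first appearance (1-based)."""
    first_seen = {}
    result = []
    for symbol in sequence:
        if symbol not in first_seen:
            first_seen[symbol] = len(first_seen) + 1
        result.append(first_seen[symbol])
    return result


def pattern_probability(pat, probs):
    """Probability that an i.i.d. sample drawn from `probs` has pattern `pat`.

    Brute force: sum over every assignment of distinct alphabet symbols
    to the pattern indices. The cost grows factorially, so this is a
    didactic sketch rather than a usable estimator.
    """
    counts = Counter(pat)
    # How many times each pattern index occurs, in index order.
    multiplicities = [counts[i] for i in sorted(counts)]
    total = 0.0
    for assignment in permutations(probs, len(multiplicities)):
        total += prod(p ** m for p, m in zip(assignment, multiplicities))
    return total


print(pattern("abracadabra"))  # [1, 2, 3, 1, 4, 1, 5, 1, 2, 3, 1]
print(pattern_probability(pattern("aba"), [0.5, 0.5]))  # 0.25
```

Under these assumptions, a pattern maximum likelihood estimate would be the choice of `probs` that maximizes `pattern_probability` for the observed pattern; the pattern discards the symbol labels, which is what makes the approach attractive when the alphabet is large relative to the sample.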
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Narayana Santhanam
NRT-AI: Data in Engineering and Society: Converging Applications, Research, and Training Enhancements for Students
- Award number: 2244574
- Fiscal year: 2023
- Award amount: $369,600
- Award type: Standard Grant
CIF: Small: Collaborative Research: Statistics of slow mixing Markov processes: theory and applications to community detection
- Award number: 1619452
- Fiscal year: 2016
- Award amount: $369,600
- Award type: Standard Grant
Similar Overseas Grants
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award number: 2403122
- Fiscal year: 2024
- Award amount: $369,600
- Award type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402815
- Fiscal year: 2024
- Award amount: $369,600
- Award type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402817
- Fiscal year: 2024
- Award amount: $369,600
- Award type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402816
- Fiscal year: 2024
- Award amount: $369,600
- Award type: Standard Grant
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award number: 2403123
- Fiscal year: 2024
- Award amount: $369,600
- Award type: Standard Grant
Collaborative Research: CIF: Medium: Fundamental Limits of Cache-aided Multi-user Private Function Retrieval
- Award number: 2312229
- Fiscal year: 2023
- Award amount: $369,600
- Award type: Continuing Grant
Collaborative Research: CIF: Medium: Statistical and Algorithmic Foundations of Distributionally Robust Policy Learning
- Award number: 2312205
- Fiscal year: 2023
- Award amount: $369,600
- Award type: Continuing Grant
Collaborative Research: CIF: Medium: Fundamental Limits of Privacy-Enhancing Technologies
- Award number: 2312666
- Fiscal year: 2023
- Award amount: $369,600
- Award type: Continuing Grant
Collaborative Research: CIF: Medium: Fundamental Limits of Cache-aided Multi-user Private Function Retrieval
- Award number: 2312228
- Fiscal year: 2023
- Award amount: $369,600
- Award type: Continuing Grant
Collaborative Research: CIF: Medium: Robust Learning over Graphs
- Award number: 2312547
- Fiscal year: 2023
- Award amount: $369,600
- Award type: Continuing Grant