Distance-Based Analysis for Complex High-Dimensional Data
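Basic information
- Award number: 2113771
- Principal investigator:
- Amount: $300,000
- Host institution:
- Host institution country: United States
- Program type: Standard Grant
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-07-01 to 2022-02-28
- Status: Completed
- Source:
- Keywords:
Project abstract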
Throughout the course of the twentieth century, distances have played a significant role in important areas of statistics, including classification, clustering, discriminant analysis, multidimensional scaling, sampling, spatial statistics, scoring rules, and kernel methods in machine learning. Distances are also central to the definition of divergence measures, relative entropy, and information gain, some of which are fundamental to the concept of C.R. Rao's quadratic entropy and the analysis of diversity in ecology and other areas of science. Yet at present there are significant gaps in our knowledge, and in the emerging statistical literature, on the use of distance-based tests and analyses for complex high-dimensional data. One example is analysis of similarity, which is among the most cited and most widely used distance-based statistical methods but is limited by an absence of relevant mathematical knowledge. This research will derive new mathematical knowledge on various distance-based statistical methods and apply it to answer important scientific questions arising in forestry, ecology, and marine science, such as: (1) how does biodiversity change in tropical forests? (2) how do the taxonomic and functional profiles of bacterial communities change with environmental conditions in different oceanic regions? The project establishes collaborations among several disciplines and between two US academic institutions, and provides research and training to graduate and undergraduate students.
The project develops a new body of knowledge on distance-based statistical methods and computation for analyzing complex, high-dimensional data that arise in the form of compositions, trees, graphs, or networks. The distances considered here are all non-Euclidean (either non-metric dissimilarities that do not satisfy the triangle inequality or just discrete numbers), but they all arise from conditionally positive definite kernels. Examples of distances include the squared Euclidean distance, the Bray-Curtis dissimilarity, the Jensen-Shannon distance, UniFrac or the Kantorovich-Rubinstein metric, the Aitchison distance, the edit distance, various graph kernel and spectral distances, and other distances based on optimal transport problems. Specifically, the project advances the mathematical theory and computation of exact distribution-free two-sample and multi-sample runs tests, change points, and other related problems by counting runs along the shortest Hamiltonian path (or loop) of the pooled sample of data points. The project also considers analysis of similarity and related distance-based rank tests, and derives new mathematical results that allow more advanced statistical analyses to be pursued. The project contributes to: (i) a deeper analysis of biodiversity in tropical forests; (ii) an investigation of how the taxonomic and functional profiles of prokaryotic communities change with environmental conditions in different oceanic regions; (iii) a study of the variability of the composition of rare earth elements in deep-sea muds of the Pacific Ocean; and (iv) an understanding of the relationship of intertidal communities on the Oregon coast with respect to upwelling and nutrient delivery. The project integrates mathematics research, science, and education and will provide opportunities for dissertation work for graduate students.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
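The idea of counting runs along the shortest Hamiltonian path of a pooled sample can be illustrated with a small sketch. This is not the project's algorithm: the abstract gives no implementation details, so the brute-force path search, the toy abundance vectors, and the function names below are all assumptions chosen for illustration. The sketch only computes the run count; it does not derive the exact null distribution that the project studies.

```python
from itertools import permutations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def shortest_hamiltonian_path(points, dist):
    """Brute-force shortest Hamiltonian path (illustrative; tiny samples only)."""
    n = len(points)
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    best_order, best_len = None, float("inf")
    for order in permutations(range(n)):
        if order[0] > order[-1]:
            continue  # a path and its reversal have the same length
        length = sum(d[order[k]][order[k + 1]] for k in range(n - 1))
        if length < best_len:
            best_order, best_len = order, length
    return list(best_order), best_len


def count_runs(labels_in_path_order):
    """Number of maximal blocks of consecutive identical labels."""
    runs = 1
    for prev, cur in zip(labels_in_path_order, labels_in_path_order[1:]):
        if prev != cur:
            runs += 1
    return runs


# Hypothetical pooled sample: two groups of species-abundance vectors.
sample_a = [(10, 0, 1), (9, 1, 0), (8, 2, 1)]
sample_b = [(0, 10, 5), (1, 9, 6), (2, 8, 4)]
pooled = sample_a + sample_b
labels = ["A"] * len(sample_a) + ["B"] * len(sample_b)

order, length = shortest_hamiltonian_path(pooled, bray_curtis)
runs = count_runs([labels[i] for i in order])
print("path order:", order)
print("path length:", round(length, 3))
print("runs along path:", runs)
```

Few runs along the path suggest that the two samples occupy different regions under the chosen dissimilarity; many runs suggest they are intermixed. Enumerating all permutations is feasible only for a handful of points, so any realistic use would need a different path-finding strategy.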
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Debashis Mondal
Photocontrolled activation of doubly o-nitrobenzyl-protected small molecule benzimidazoles leads to cancer cell death
- DOI: 10.1039/d3sc01786a
- Published: 2023-08-23
- Journal:
- Impact factor: 7.400
- Authors: Manzoor Ahmad; Naveen J. Roy; Anurag Singh; Debashis Mondal; Abhishek Mondal; Thangavel Vijayakanth; Mayurika Lahiri; Pinaki Talukdar
- Corresponding author: Pinaki Talukdar
High-frequency rectifiers based on type-II Dirac fermions
- DOI: 10.1038/s41467-021-21906-w
- Published: 2021
- Journal:
- Impact factor: 16.6
- Authors: Libo Zhang; Zhiqingzi Chen; Kaixuan Zhang; Lin Wang; Huang Xu; Li Han; Wanlong Guo; Yao Yang; Chia-Nung Kuo; Chin Shan Lue; Debashis Mondal; Jun Fuji; Ivana Vobornik; Barun Ghosh; Amit Agarwal; Huaizhong Xing; Xiaoshuang Chen; Antonio Politano; Wei Lu
- Corresponding author: Wei Lu
Wavelet variances for heavy-tailed time series
- DOI:
- Published: 2019
- Journal:
- Impact factor: 1.7
- Authors: Rodney V. Fonseca; Debashis Mondal; L. Zhang
- Corresponding author: L. Zhang
Progress and prospects toward supramolecular bioactive ion transporters
- DOI: 10.1039/d2cc06761g
- Published: 2023-01-01
- Journal:
- Impact factor: 4.200
- Authors: Abhishek Mondal; Manzoor Ahmad; Debashis Mondal; Pinaki Talukdar
- Corresponding author: Pinaki Talukdar
PAC Guarantees and Effective Algorithms for Detecting Novel Categories
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Si Liu; Risheek Garrepalli; Dan Hendrycks; Alan Fern; Debashis Mondal; Thomas G. Dietterich
- Corresponding author: Thomas G. Dietterich
Other grants of Debashis Mondal
Distance-Based Analysis for Complex High-Dimensional Data
- Award number: 2217007
- Fiscal year: 2021
- Amount: $300,000
- Program type: Standard Grant
Markov Random Fields, Geostatistics and Matrix-Free Computation
- Award number: 2153669
- Fiscal year: 2021
- Amount: $300,000
- Program type: Standard Grant
Markov Random Fields, Geostatistics and Matrix-Free Computation
- Award number: 1916448
- Fiscal year: 2019
- Amount: $300,000
- Program type: Standard Grant
2016 International Indian Statistical Association conference "Statistical and Data Sciences: A Key to Healthy People, Planet and Prosperity"
- Award number: 1636648
- Fiscal year: 2016
- Amount: $300,000
- Program type: Standard Grant
CAREER: New Directions in Spatial Statistics
- Award number: 1519890
- Fiscal year: 2014
- Amount: $300,000
- Program type: Continuing Grant
CAREER: New Directions in Spatial Statistics
- Award number: 1254840
- Fiscal year: 2013
- Amount: $300,000
- Program type: Continuing Grant
Connecting Markov Random Fields with Geostatistical Models
- Award number: 0906300
- Fiscal year: 2009
- Amount: $300,000
- Program type: Standard Grant
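Similar grants from the National Natural Science Foundation of China (NSFC)
Data-driven Recommendation System Construction of an Online Medical Platform Based on the Fusion of Information
- Award number:
- Approval year: 2024
- Amount:
- Program type: Research Fund for International Young Scientists
Exploring the Intrinsic Mechanisms of CEO Turnover and Market Reaction: An Explanation Based on Information Asymmetry
- Award number: W2433169
- Approval year: 2024
- Amount:
- Program type: Research Fund for International Scientists
Incentive and governance mechanism study of corporate greenwashing behavior in China: Based on an integrated view of econfiguration of environmental authority and decoupling logic
- Award number:
- Approval year: 2024
- Amount:
- Program type: Research Fund for International Scientists
A study on prototype flexible multifunctional graphene foam-based sensing grid
- Award number:
- Approval year: 2020
- Amount: CNY 200,000
- Program type:
Dissecting alternative splicing in hematopoietic stem cell development using tag-based single-cell transcriptome sequencing
- Award number: 81900115
- Approval year: 2019
- Amount: CNY 210,000
- Program type: Young Scientists Fund
Using an agent-based model to study the effect and mechanism of a single perioperative dose of dexamethasone on surgical incision healing
- Award number: 81771933
- Approval year: 2017
- Amount: CNY 500,000
- Program type: General Program
Research on user interface models and evaluation methods for reality-based interaction
- Award number: 61170182
- Approval year: 2011
- Amount: CNY 570,000
- Program type: General Program
Study of the association between the FCAR gene and IgA nephropathy based on multistage, haplotype, and functional tests
- Award number: 30771013
- Approval year: 2007
- Amount: CNY 300,000
- Program type: General Program
Identifying molecular markers of osteosarcoma using differential proteomics combined with array-based CGH
- Award number: 30470665
- Approval year: 2004
- Amount: CNY 80,000
- Program type: General Program
Research on GaN-based diluted magnetic semiconductor materials and spintronic resonant tunneling devices
- Award number: 60376005
- Approval year: 2003
- Amount: CNY 200,000
- Program type: General Program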
Similar overseas grants
I-Corps: Centralized, Cloud-Based, Artificial Intelligence (AI) Video Analysis for Enhanced Intubation Documentation and Continuous Quality Control
- Award number: 2405662
- Fiscal year: 2024
- Amount: $300,000
- Program type: Standard Grant
Ownership-based Alias Analysis for Securing Unsafe Rust Programs
- Award number: DP240103194
- Fiscal year: 2024
- Amount: $300,000
- Program type: Discovery Projects
Collaborative Research: Investigating the Impact of Video-based Analysis of Classroom Teaching on STEM Teacher Preparation, Effectiveness, and Retention
- Award number: 2344795
- Fiscal year: 2024
- Amount: $300,000
- Program type: Standard Grant
Collaborative Research: Investigating the Impact of Video-based Analysis of Classroom Teaching on STEM Teacher Preparation, Effectiveness, and Retention
- Award number: 2344793
- Fiscal year: 2024
- Amount: $300,000
- Program type: Standard Grant
Collaborative Research: Investigating the Impact of Video-based Analysis of Classroom Teaching on STEM Teacher Preparation, Effectiveness, and Retention
- Award number: 2344790
- Fiscal year: 2024
- Amount: $300,000
- Program type: Standard Grant
Collaborative Research: Investigating the Impact of Video-based Analysis of Classroom Teaching on STEM Teacher Preparation, Effectiveness, and Retention
- Award number: 2344789
- Fiscal year: 2024
- Amount: $300,000
- Program type: Standard Grant
Collaborative Research: Investigating the Impact of Video-based Analysis of Classroom Teaching on STEM Teacher Preparation, Effectiveness, and Retention
- Award number: 2344791
- Fiscal year: 2024
- Amount: $300,000
- Program type: Standard Grant
Collaborative Research: Investigating the Impact of Video-based Analysis of Classroom Teaching on STEM Teacher Preparation, Effectiveness, and Retention
- Award number: 2344792
- Fiscal year: 2024
- Amount: $300,000
- Program type: Standard Grant
Collaborative Research: Investigating the Impact of Video-based Analysis of Classroom Teaching on STEM Teacher Preparation, Effectiveness, and Retention
- Award number: 2344794
- Fiscal year: 2024
- Amount: $300,000
- Program type: Standard Grant
ROBIN: Rotation-based Buckling Instability Analysis, and Applications to Creation of Novel Soft Mechanisms
- Award number: 24K00847
- Fiscal year: 2024
- Amount: $300,000
- Program type: Grant-in-Aid for Scientific Research (B)