CAREER: Machine learning, Mapping Spaces, and Obstruction Theoretic Methods in Topological Data Analysis
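Basic information
- Award number: 2415445
- Principal investigator:
- Amount: $400,000
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2024
- Funding country: United States
- Period: 2024-04-01 to 2025-04-30
- Project status: Ongoing
- Source:
- Keywords:
Project abstract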
Data analysis can be described as the dual process of extracting information from observations, and of understanding patterns in a principled manner. This process and the deployment of data-centric technologies have recently brought unprecedented advances in many scientific fields, as well as increased global prosperity with the advent of knowledge-based economies and systems. At a high level, this revolution is driven by two thrusts: the modern technologies which allow for the collection of complex data sets, and the theories and algorithms we use to make sense of them. That said, and for all its benefits, extracting actionable knowledge from data is difficult. Observations gathered in uncontrolled environments are often high-dimensional, complex and noisy; and even when controlled experiments are used, the intricate systems that underlie them --- like those from meteorology, chemistry, medicine and biology --- can yield data sets with highly nontrivial underlying topology. This refers to properties such as the number of disconnected pieces (i.e., clusters), the existence of holes or the orientability of the data space. The research funded through this CAREER award will leverage ideas from algebraic topology to address data science questions like visualization and representation of complex data sets, as well as the challenges posed by nontrivial topology when designing learning systems for prediction and classification. This work will be integrated into the educational program of the PI through the creation of an online TDA (Topological Data Analysis) academy, with the dual purpose of lowering the barrier of entry into the field for data scientists and academics, as well as increasing the representation of underserved communities in the field of computational mathematics. The project provides research training opportunities for graduate students.
Understanding the set of maps between topological spaces has led to rich and sophisticated mathematics, for it subsumes algebraic invariants like homotopy groups and generalized (co)homology theories. And while several data science questions are discrete versions of mapping space problems --- including nonlinear dimensionality reduction and supervised learning --- the corresponding theoretical and algorithmic treatment is currently lacking. This CAREER award will contribute towards remedying this situation. The research articulated here seeks to launch a novel program addressing the theory and algorithms of how the underlying topology of a data set can be leveraged for data modeling (e.g., in dimensionality reduction) as well as when learning maps between complex data spaces (e.g., in supervised learning). This work will yield methodologies for the computation of topology-aware and robust multiscale coordinatizations for data via classifying spaces, a computational theory of topological obstructions to the robust extension of maps between data sets, as well as the introduction of modern deep learning paradigms in order to learn maps between non-Euclidean data sets.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Toroidal Coordinates: Decorrelating Circular Coordinates with Lattice Reduction
- DOI: 10.4230/lipics.socg.2023.57
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Scoccola, Luis; Gakhar, Hitesh; Bush, Johnathan; Schonsheck, Nikolas; Rask, Tatum; Zhou, Ling; Perea, Jose A.
- Corresponding author: Perea, Jose A.
Topological Data Analysis of Electroencephalogram Signals for Pediatric Obstructive Sleep Apnea
- DOI: 10.1109/embc40787.2023.10340674
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Manjunath, Shashank; Perea, Jose A.; Sathyanarayana, Aarti
- Corresponding author: Sathyanarayana, Aarti
FibeRed: Fiberwise Dimensionality Reduction of Topologically Complex Data with Vector Bundles
- DOI: 10.4230/lipics.socg.2023.56
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Scoccola, Luis; Perea, Jose A.
- Corresponding author: Perea, Jose A.
Persistable: persistent and stable clustering
- DOI: 10.21105/joss.05022
- Published: 2023-03
- Journal:
- Impact factor: 0
- Authors: Luis Scoccola; Alexander Rolle
- Corresponding authors: Luis Scoccola; Alexander Rolle
Sliding window persistence of quasiperiodic functions
- DOI: 10.1007/s41468-023-00136-7
- Published: 2021-03
- Journal:
- Impact factor: 0
- Authors: H. Gakhar; Jose A. Perea
- Corresponding authors: H. Gakhar; Jose A. Perea
Other publications by Jose Perea
Mo1199 DEVELOPMENT OF AN EXOSOME-BASED LIQUID BIOPSY POWERED BY MACHINE LEARNING FOR THE DETECTION OF EARLY-ONSET COLORECTAL CANCER
- DOI: 10.1016/s0016-5085(24)02718-5
- Published: 2024-05-18
- Journal:
- Impact factor:
- Authors: Alessandro Mannucci; Caiming Xu; Katsutoshi Shoda; Jose Perea; Giulia M. Cavestro; Ajay Goel
- Corresponding author: Ajay Goel
1256 DEVELOPMENT AND VALIDATION OF A MIRNA-BASED SIGNATURE, POWERED BY MACHINE LEARNING, FOR PREDICTING 5-YEAR DISEASEFREE SURVIVAL AFTER SURGERY IN EARLY-ONSET COLORECTAL CANCER
- DOI: 10.1016/s0016-5085(24)01163-6
- Published: 2024-05-18
- Journal:
- Impact factor:
- Authors: Alessandro Mannucci; Goretti Hernández; Hiroyuki Uetake; Yasuhide Yamada; Francesc Balaguer; Hideo Baba; Jose Perea; Clement R. Boland; Enrique Quintero; Ajay Goel
- Corresponding author: Ajay Goel
Su1147 RISK OF METACHRONOUS NEOPLASIA IN EARLY-ONSET COLORECTAL CANCER. SYSTEMATIC REVIEW AND METANALYSIS.
- DOI: 10.1016/s0016-5085(24)02002-x
- Published: 2024-05-18
- Journal:
- Impact factor:
- Authors: Gianluca Pellino; Giacomo Fuschillo; Rogelio Gonzalez-Sarmiento; Marc Marti-Gallostra; Francesco Selvaggi; Eloy Espín-Basany; Jose Perea
- Corresponding author: Jose Perea
Mo1152 GERMLINE MUTATIONS IN EARLY-ONSET COLORECTAL CANCER: THE MORE YOU SEARCH THE MORE YOU FIND.
- DOI: 10.1016/s0016-5085(23)02794-4
- Published: 2023-05-01
- Journal:
- Impact factor:
- Authors: Jose Perea; Marc Marti-Gallostra; Francesc Balaguer; Marta Jiménez-Toscano; Edurne Álvaro; Araceli Ballestero; Damian Garcia-Olmo; Rosario Vidal-Tocino; Elena Hurtado; Gonzalo Sanz; Fernando Jiménez; Alfredo Vivas; Irene López-Rojo; Alicia Alvarellos Perez; Sirio Melone; Lorena Brandáriz; Jessica Pérez; Rogelio Gonzalez-Sarmiento
- Corresponding author: Rogelio Gonzalez-Sarmiento
Su1144 MOLECULAR PAIRED ANALYSIS OF METACHRONOUS COLORECTAL CANCERS SHOWS HETEROGENEITY IN THIS SUBSET OF COLORECTAL NEOPLASM.
- DOI: 10.1016/s0016-5085(24)01999-1
- Published: 2024-05-18
- Journal:
- Impact factor:
- Authors: Jessica Pérez; Daniel Rueda; Alfredo Vivas; Lorena Brandáriz; Rogelio Gonzalez-Sarmiento; Jose Perea
- Corresponding author: Jose Perea
Other grants by Jose Perea
CAREER: Machine learning, Mapping Spaces, and Obstruction Theoretic Methods in Topological Data Analysis
- Award number: 1943758
- Fiscal year: 2020
- Funding amount: $400,000
- Project type: Continuing Grant
AF: Small: Bundle-theoretic methods for local-to-global inference
- Award number: 2006661
- Fiscal year: 2020
- Funding amount: $400,000
- Project type: Standard Grant
CDS&E: Collaborative Research: Machine Learning on Dynamical Systems via Topological Features
- Award number: 1622301
- Fiscal year: 2016
- Funding amount: $400,000
- Project type: Standard Grant
Similar grants from the National Natural Science Foundation of China (NSFC)
Understanding structural evolution of galaxies with machine learning
- Award number: n/a
- Approval year: 2022
- Funding amount: CNY 100,000
- Project type: Provincial/municipal project
Similar overseas grants
CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
- Award number: 2337776
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Continuing Grant
CAREER: Mitigating the Lack of Labeled Training Data in Machine Learning Based on Multi-level Optimization
- Award number: 2339216
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Continuing Grant
CAREER: Integrated and end-to-end machine learning pipeline for edge-enabled IoT systems: a resource-aware and QoS-aware perspective
- Award number: 2340075
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Continuing Grant
CAREER: Gaussian Processes for Scientific Machine Learning: Theoretical Analysis and Computational Algorithms
- Award number: 2337678
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Continuing Grant
CAREER: Heterogeneous Neuromorphic and Edge Computing Systems for Realtime Machine Learning Technologies
- Award number: 2340249
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Continuing Grant
CAREER: From Fragile to Fortified: Harnessing Causal Reasoning for Trustworthy Machine Learning with Unreliable Data
- Award number: 2337529
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Continuing Grant
CAREER: Ethical Machine Learning in Health: Robustness in Data, Learning and Deployment
- Award number: 2339381
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Continuing Grant
CAREER: Towards Trustworthy Machine Learning via Learning Trustworthy Representations: An Information-Theoretic Framework
- Award number: 2339686
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Continuing Grant
CAREER: Intelligent Battery Management with Safe, Efficient, Fast-Adaption Reinforcement Learning and Physics-Inspired Machine Learning: From Cells to Packs
- Award number: 2340194
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Continuing Grant
CAREER: From Dirty Data to Fair Prediction: Data Preparation Framework for End-to-End Equitable Machine Learning
- Award number: 2341055
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Continuing Grant