CAREER: Machine learning, Mapping Spaces, and Obstruction Theoretic Methods in Topological Data Analysis

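Basic Information

Award number:
2415445
Principal investigator:
Jose Perea
Amount:
$400,000
Host institution:
Northeastern University
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2024
Funding country:
United States
Start and end dates:
2024-04-01 to 2025-04-30
Project status:
Not concluded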

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2415445&HistoricalAwards=false
Keywords:
CAREER; Machine learning; Mapping Spaces

Project Abstract

Data analysis can be described as the dual process of extracting information from observations, and of understanding patterns in a principled manner. This process and the deployment of data-centric technologies have recently brought unprecedented advances in many scientific fields, as well as increased global prosperity with the advent of knowledge-based economies and systems. At a high level, this revolution is driven by two thrusts: the modern technologies which allow for the collection of complex data sets, and the theories and algorithms we use to make sense of them. That said, and for all its benefits, extracting actionable knowledge from data is difficult. Observations gathered in uncontrolled environments are often high-dimensional, complex and noisy; and even when controlled experiments are used, the intricate systems that underlie them (like those from meteorology, chemistry, medicine and biology) can yield data sets with highly nontrivial underlying topology. This refers to properties such as the number of disconnected pieces (i.e., clusters), the existence of holes or the orientability of the data space. The research funded through this CAREER award will leverage ideas from algebraic topology to address data science questions like visualization and representation of complex data sets, as well as the challenges posed by nontrivial topology when designing learning systems for prediction and classification. This work will be integrated into the educational program of the PI through the creation of an online TDA (Topological Data Analysis) academy, with the dual purpose of lowering the barrier of entry into the field for data scientists and academics, as well as increasing the representation of underserved communities in the field of computational mathematics. The project provides research training opportunities for graduate students.

Understanding the set of maps between topological spaces has led to rich and sophisticated mathematics, for it subsumes algebraic invariants like homotopy groups and generalized (co)homology theories. And while several data science questions are discrete versions of mapping space problems, including nonlinear dimensionality reduction and supervised learning, the corresponding theoretical and algorithmic treatment is currently lacking. This CAREER award will contribute towards remedying this situation. The research articulated here seeks to launch a novel program addressing the theory and algorithms of how the underlying topology of a data set can be leveraged for data modeling (e.g., in dimensionality reduction) as well as when learning maps between complex data spaces (e.g., in supervised learning). This work will yield methodologies for the computation of topology-aware and robust multiscale coordinatizations for data via classifying spaces, a computational theory of topological obstructions to the robust extension of maps between data sets, as well as the introduction of modern deep learning paradigms in order to learn maps between non-Euclidean data sets.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outputs

Journal articles (8)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Toroidal Coordinates: Decorrelating Circular Coordinates with Lattice Reduction

DOI:
10.4230/lipics.socg.2023.57
Published:
2023
Journal:
Leibniz International Proceedings in Informatics
Impact factor:
0
Authors:
Scoccola, Luis; Gakhar, Hitesh; Bush, Johnathan; Schonsheck, Nikolas; Rask, Tatum; Zhou, Ling; Perea, Jose A.
Corresponding author:
Perea, Jose A.

Topological Data Analysis of Electroencephalogram Signals for Pediatric Obstructive Sleep Apnea

DOI:
10.1109/embc40787.2023.10340674
Published:
2023
Journal:
Annual International Conference of the IEEE Engineering in Medicine and Biology Society
Impact factor:
0
Authors:
Manjunath, Shashank; Perea, Jose A.; Sathyanarayana, Aarti
Corresponding author:
Sathyanarayana, Aarti

FibeRed: Fiberwise Dimensionality Reduction of Topologically Complex Data with Vector Bundles

DOI:
10.4230/lipics.socg.2023.56
Published:
2023
Journal:
Leibniz International Proceedings in Informatics
Impact factor:
0
Authors:
Scoccola, Luis; Perea, Jose A.
Corresponding author:
Perea, Jose A.

Persistable: persistent and stable clustering

DOI:
10.21105/joss.05022
Published:
2023-03
Journal:
J. Open Source Softw.
Impact factor:
0
Authors:
Luis Scoccola; Alexander Rolle
Corresponding authors:
Luis Scoccola; Alexander Rolle

Sliding window persistence of quasiperiodic functions

DOI:
10.1007/s41468-023-00136-7
Published:
2021-03
Journal:
Journal of Applied and Computational Topology
Impact factor:
0
Authors:
H. Gakhar; Jose A. Perea
Corresponding authors:
H. Gakhar; Jose A. Perea

Other publications by Jose Perea

Mo1199 DEVELOPMENT OF AN EXOSOME-BASED LIQUID BIOPSY POWERED BY MACHINE LEARNING FOR THE DETECTION OF EARLY-ONSET COLORECTAL CANCER

DOI:
10.1016/s0016-5085(24)02718-5
Published:
2024-05-18
Journal:
Conference abstract
Impact factor:
Authors:
Alessandro Mannucci; Caiming Xu; Katsutoshi Shoda; Jose Perea; Giulia M. Cavestro; Ajay Goel
Corresponding author:
Ajay Goel

1256 DEVELOPMENT AND VALIDATION OF A MIRNA-BASED SIGNATURE, POWERED BY MACHINE LEARNING, FOR PREDICTING 5-YEAR DISEASE-FREE SURVIVAL AFTER SURGERY IN EARLY-ONSET COLORECTAL CANCER

DOI:
10.1016/s0016-5085(24)01163-6
Published:
2024-05-18
Journal:
Conference abstract
Impact factor:
Authors:
Alessandro Mannucci; Goretti Hernández; Hiroyuki Uetake; Yasuhide Yamada; Francesc Balaguer; Hideo Baba; Jose Perea; Clement R. Boland; Enrique Quintero; Ajay Goel
Corresponding author:
Ajay Goel

Su1147 RISK OF METACHRONOUS NEOPLASIA IN EARLY-ONSET COLORECTAL CANCER. SYSTEMATIC REVIEW AND METANALYSIS.

DOI:
10.1016/s0016-5085(24)02002-x
Published:
2024-05-18
Journal:
Conference abstract
Impact factor:
Authors:
Gianluca Pellino; Giacomo Fuschillo; Rogelio Gonzalez-Sarmiento; Marc Marti-Gallostra; Francesco Selvaggi; Eloy Espín-Basany; Jose Perea
Corresponding author:
Jose Perea

Mo1152 GERMLINE MUTATIONS IN EARLY-ONSET COLORECTAL CANCER: THE MORE YOU SEARCH THE MORE YOU FIND.

DOI:
10.1016/s0016-5085(23)02794-4
Published:
2023-05-01
Journal:
Conference abstract
Impact factor:
Authors:
Jose Perea; Marc Marti-Gallostra; Francesc Balaguer; Marta Jiménez-Toscano; Edurne Álvaro; Araceli Ballestero; Damian Garcia-Olmo; Rosario Vidal-Tocino; Elena Hurtado; Gonzalo Sanz; Fernando Jiménez; Alfredo Vivas; Irene López-Rojo; Alicia Alvarellos Perez; Sirio Melone; Lorena Brandáriz; Jessica Pérez; Rogelio Gonzalez-Sarmiento
Corresponding author:
Rogelio Gonzalez-Sarmiento

Su1144 MOLECULAR PAIRED ANALYSIS OF METACHRONOUS COLORECTAL CANCERS SHOWS HETEROGENEITY IN THIS SUBSET OF COLORECTAL NEOPLASM.

DOI:
10.1016/s0016-5085(24)01999-1
Published:
2024-05-18
Journal:
Conference abstract
Impact factor:
Authors:
Jessica Pérez; Daniel Rueda; Alfredo Vivas; Lorena Brandáriz; Rogelio Gonzalez-Sarmiento; Jose Perea
Corresponding author:
Jose Perea