Geometric Methods in Data Analysis
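Basic Information
- Grant number: RGPIN-2021-03206
- Principal investigator:
- Amount: $46,600
- Host institution:
- Host institution country: Canada
- Project type: Discovery Grants Program - Individual
- Fiscal year: 2022
- Funding country: Canada
- Start and end dates: 2022-01-01 to 2023-12-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract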
A geometric view of data, based on treating data points as elements of a high-dimensional space, has a long tradition in computer science and statistics. For example, in machine learning, it is common to represent data as a collection of points, each point corresponding to a particular example, such as a labeled image. The individual features of a data point, e.g. the pixels of the image, give its coordinates in a coordinate system. Many standard machine learning tasks can then be formulated geometrically. For example, classifying images according to their content can be formulated as finding an appropriate partition of this high-dimensional space of images into geometric shapes. As another example, finding similar images corresponds to finding points that are close in an appropriate distance metric. Such formulations make it possible to leverage mathematical insights from high-dimensional geometry to solve challenging data analysis tasks.
Despite the long tradition of a geometric view of data, for many fundamental data analysis tasks we still lack a complete understanding of how the underlying geometry of the task interacts with its statistical and computational hardness. Relatedly, we often do not know algorithms that solve these tasks and adapt optimally to the geometry of the data. The long-term objective of this project is to develop a geometric theory of central algorithmic tasks in data analysis. In particular, the theory should predict the statistical and computational complexity of each task, such as the amount of data required to solve it and the efficiency of an optimal algorithm for it. The theory should also provide principles for the design of simple and efficient algorithms that adapt optimally to the underlying geometry of the data. The algorithms should also be dynamic, in the sense of being able to adapt to changes in the data or in the problem being solved.
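The formulation of similarity search as "finding points that are close in an appropriate distance metric" can be illustrated with a minimal sketch. The function names, the toy three-pixel "images", and the choice of Euclidean distance are all assumptions made for illustration; the linear scan below is only a baseline, not one of the data structures the project studies.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def nearest_neighbour(query, points, dist=euclidean):
    """Return the index of the point closest to `query` under `dist`.

    This is a plain linear scan: it compares the query with every point.
    """
    return min(range(len(points)), key=lambda i: dist(query, points[i]))

# Hypothetical data: each "image" is a point whose coordinates are pixel values.
images = [
    (0.0, 0.0, 0.0),
    (1.0, 1.0, 1.0),
    (0.9, 0.1, 0.5),
]

closest = nearest_neighbour((0.8, 0.2, 0.4), images)
print(closest)
```

Passing a different `dist` function changes the notion of similarity without changing the search itself, which is one way the choice of metric enters the problem.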
The main areas of data analysis of interest for the project are private statistical data analysis, high-dimensional search, and experimental design. The short-term objectives in each of these areas, all to be attacked using geometric tools, are:
- In private data analysis: design optimal and efficient algorithms for answering counting queries and for stochastic optimization problems such as classification, logistic regression, and least-squares regression; design competitive algorithms that enable interactive analysis of the data; characterize which tasks can be solved more efficiently when interaction is allowed in distributed models of private data analysis.
- In high-dimensional search: characterize the metrics for which there exist efficient near neighbour search data structures based on randomized space partitions, as well as data structures in more general computational models, such as decision trees.
- In experimental design: develop efficient algorithms that approximately compute optimal experimental designs under combinatorial constraints.
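To make "answering counting queries" under privacy concrete, here is a minimal sketch of one standard baseline, the Laplace mechanism from differential privacy. The abstract does not name a privacy definition or mechanism, so the choice of differential privacy, the function `private_count`, the parameter `epsilon`, and the sample data are all assumptions for illustration, not the project's algorithms.

```python
import random

def private_count(records, predicate, epsilon, rng):
    """Answer a counting query with Laplace noise of scale 1/epsilon.

    Adding or removing one record changes the true count by at most 1,
    so the query has sensitivity 1 and the noise scale is 1/epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    # The difference of two independent exponential samples with rate
    # epsilon is Laplace-distributed with mean 0 and scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise

# Hypothetical data set: ages of seven individuals.
ages = [23, 35, 41, 29, 62, 57, 18]
rng = random.Random(0)  # fixed seed only so that the sketch is reproducible

noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng)
print(noisy)
```

A smaller `epsilon` gives stronger privacy but noisier answers. This single-query baseline ignores the structure of a whole query workload, which is where the optimal algorithms mentioned above would be expected to do better.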
Project Outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Nikolov, Aleksandar
Optimization of geopolymers based on natural zeolite clinoptilolite by calcination and use of aluminate activators
- DOI: 10.1016/j.conbuildmat.2020.118257
- Publication date: 2020-05-20
- Journal:
- Impact factor: 7.4
- Authors: Nikolov, Aleksandar; Nugteren, Henk; Rostovsky, Ivan
- Corresponding author: Rostovsky, Ivan
Approximate Nearest Neighbors Beyond Space Partitions
- DOI: 10.1137/1.9781611976465.72
- Publication date: 2021
- Journal:
- Impact factor: 0
- Authors: Andoni, Alexandr; Nikolov, Aleksandar; Razenshteyn, Ilya P.; Waingarten, Erik
- Corresponding author: Waingarten, Erik
Private Query Release Assisted by Public Data
- DOI:
- Publication date: 2020
- Journal:
- Impact factor: 0
- Authors: Bassily, Raef; Cheu, Albert; Moran, Shay; Nikolov, Aleksandar; Ullman, Jonathan; Wu, Zhiwei Steven
- Corresponding author: Wu, Zhiwei Steven
Geopolymer materials based on natural zeolite
- DOI: 10.1016/j.cscm.2017.03.001
- Publication date: 2017-06-01
- Journal:
- Impact factor: 6.2
- Authors: Nikolov, Aleksandar; Rostovsky, Ivan; Nugteren, Henk
- Corresponding author: Nugteren, Henk
Other grants held by Nikolov, Aleksandar
Algorithms and Private Data Analysis
- Grant number: CRC-2020-00004
- Fiscal year: 2022
- Funding amount: $46,600
- Project type: Canada Research Chairs
Geometric Methods in Data Analysis
- Grant number: RGPAS-2021-00030
- Fiscal year: 2022
- Funding amount: $46,600
- Project type: Discovery Grants Program - Accelerator Supplements
Geometric Methods in Data Analysis
- Grant number: RGPAS-2021-00030
- Fiscal year: 2021
- Funding amount: $46,600
- Project type: Discovery Grants Program - Accelerator Supplements
Geometric Methods in Data Analysis
- Grant number: RGPIN-2021-03206
- Fiscal year: 2021
- Funding amount: $46,600
- Project type: Discovery Grants Program - Individual
Algorithms And Private Data Analysis
- Grant number: CRC-2020-00004
- Fiscal year: 2021
- Funding amount: $46,600
- Project type: Canada Research Chairs
Algorithms and Private Data Analysis
- Grant number: 1000230936-2015
- Fiscal year: 2020
- Funding amount: $46,600
- Project type: Canada Research Chairs
Computational Discrepancy Theory
- Grant number: RGPIN-2016-06333
- Fiscal year: 2020
- Funding amount: $46,600
- Project type: Discovery Grants Program - Individual
Algorithms and Private Data Analysis
- Grant number: 1000233061-2019
- Fiscal year: 2020
- Funding amount: $46,600
- Project type: Canada Research Chairs
Computational Discrepancy Theory
- Grant number: RGPIN-2016-06333
- Fiscal year: 2019
- Funding amount: $46,600
- Project type: Discovery Grants Program - Individual
Algorithms and Private Data Analysis
- Grant number: 1000230936-2015
- Fiscal year: 2019
- Funding amount: $46,600
- Project type: Canada Research Chairs
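Similar grants from the National Natural Science Foundation of China
Computational Methods for Analyzing Toponome Data
- Grant number: 60601030
- Year approved: 2006
- Funding amount: 170,000 CNY
- Project type: Young Scientists Fund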
Similar overseas grants
Novel statistical methods for data with non-Euclidean geometric structure
- Grant number: DP220102232
- Fiscal year: 2022
- Funding amount: $46,600
- Project type: Discovery Projects
Geometric Methods in Data Analysis
- Grant number: RGPAS-2021-00030
- Fiscal year: 2022
- Funding amount: $46,600
- Project type: Discovery Grants Program - Accelerator Supplements
Geometric Methods in Data Analysis
- Grant number: RGPAS-2021-00030
- Fiscal year: 2021
- Funding amount: $46,600
- Project type: Discovery Grants Program - Accelerator Supplements
Geometric Methods in Data Analysis
- Grant number: RGPIN-2021-03206
- Fiscal year: 2021
- Funding amount: $46,600
- Project type: Discovery Grants Program - Individual
CAREER: Geometric and Combinatorial Methods for Distribution-Free Inference and Dependent Network Data
- Grant number: 2046393
- Fiscal year: 2021
- Funding amount: $46,600
- Project type: Continuing Grant
Numerical Methods for Nonlinear Partial Differential Equations, with applications to Optimal Transportation, and Geometric Data Reduction
- Grant number: RGPIN-2016-03922
- Fiscal year: 2021
- Funding amount: $46,600
- Project type: Discovery Grants Program - Individual
Numerical Methods for Nonlinear Partial Differential Equations, with applications to Optimal Transportation, and Geometric Data Reduction
- Grant number: RGPIN-2016-03922
- Fiscal year: 2020
- Funding amount: $46,600
- Project type: Discovery Grants Program - Individual
Numerical Methods for Nonlinear Partial Differential Equations, with applications to Optimal Transportation, and Geometric Data Reduction
- Grant number: RGPIN-2016-03922
- Fiscal year: 2019
- Funding amount: $46,600
- Project type: Discovery Grants Program - Individual
CAREER: Variational and Geometric Methods for Data Analysis
- Grant number: 1752202
- Fiscal year: 2018
- Funding amount: $46,600
- Project type: Continuing Grant
Numerical Methods for Nonlinear Partial Differential Equations, with applications to Optimal Transportation, and Geometric Data Reduction
- Grant number: RGPIN-2016-03922
- Fiscal year: 2018
- Funding amount: $46,600
- Project type: Discovery Grants Program - Individual













