Geometric Methods in Data Analysis
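Basic Information
- Grant number: RGPIN-2021-03206
- Principal investigator:
- Amount: $46,600
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2021
- Funding country: Canada
- Duration: 2021-01-01 to 2022-12-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract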
A geometric view of data, based on treating data points as elements of a high-dimensional space, has a long tradition in computer science and statistics. For example, in machine learning, it is common to represent data as a collection of points, each point corresponding to a particular example, such as a labeled image. The individual features of the data points, e.g., the pixels of the image, give the coordinates of the data point in a coordinate system. Many standard machine learning tasks can then be formulated geometrically. For example, classifying images according to their content can be formulated as finding an appropriate partition of this high-dimensional space of images into geometric shapes. For another example, finding similar images corresponds to finding points that are close in an appropriate distance metric. Such formulations allow leveraging mathematical insights from high-dimensional geometry to solve challenging data analysis tasks. Despite this long tradition, for many fundamental data analysis tasks we still lack a complete understanding of how the underlying geometry of the task interacts with its statistical and computational hardness. Relatedly, we often do not know algorithms that solve these tasks and adapt optimally to the geometry of the data. The long-term objective of this project is to develop a geometric theory of central algorithmic tasks in data analysis. In particular, the theory should predict the statistical and computational complexity of each task, such as the amount of data required to solve it, and the efficiency of an optimal algorithm for it. The theory should also provide principles for the design of simple and efficient algorithms that optimally adapt to the underlying geometry of the data. The algorithms should also be dynamic, in the sense of being able to adapt to changes in the data or to the problem being solved.
The main areas of data analysis that are of interest for the project are private statistical data analysis, high-dimensional search, and experimental design. The short term objectives in each of these areas that will be attacked using geometric tools are:
- In private data analysis: design optimal and efficient algorithms for answering counting queries, and for stochastic optimization problems like classification, logistic, and least-squares regression; design competitive algorithms that enable interactive analysis of the data; characterize what tasks can be solved more efficiently when interaction is allowed in distributed models of private data analysis.
- In high dimensional search: characterize the metrics for which there exist efficient near neighbour search data structures based on randomized space partitions, as well as data structures in the more general computational models, such as decision trees.
- In experimental design: develop efficient algorithms that approximately compute optimal experimental designs under combinatorial constraints.
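The geometric formulation in the abstract, in which finding similar items corresponds to finding points that are close under a distance metric, can be illustrated with a minimal sketch. This is an illustration only and not part of the project: the toy feature vectors, the choice of Euclidean distance, and the function names `euclidean` and `nearest_neighbour` are assumptions made for the example. It uses a brute-force linear scan, not the randomized space partitions or decision-tree data structures that the project studies.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def nearest_neighbour(query, points, distance=euclidean):
    """Return (index, distance) of the point closest to `query` by linear scan."""
    if not points:
        raise ValueError("points must be non-empty")
    best_index = min(range(len(points)), key=lambda i: distance(query, points[i]))
    return best_index, distance(query, points[best_index])

# Hypothetical data: each "image" is a tiny vector of pixel intensities.
images = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.9, 0.8, 1.0, 0.7],
]
index, dist = nearest_neighbour([1.0, 0.9, 1.0, 0.8], images)
```

The `distance` parameter is there so that a different metric can be swapped in without changing the search itself, which mirrors the abstract's point that the appropriate metric depends on the task.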
Project Outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Nikolov, Aleksandar
Optimization of geopolymers based on natural zeolite clinoptilolite by calcination and use of aluminate activators
- DOI: 10.1016/j.conbuildmat.2020.118257
- Published: 2020-05-20
- Journal:
- Impact factor: 7.4
- Authors: Nikolov, Aleksandar; Nugteren, Henk; Rostovsky, Ivan
- Corresponding author: Rostovsky, Ivan
Approximate Nearest Neighbors Beyond Space Partitions
- DOI: 10.1137/1.9781611976465.72
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Andoni, Alexandr; Nikolov, Aleksandar; Razenshteyn, Ilya P.; Waingarten, Erik
- Corresponding author: Waingarten, Erik
Private Query Release Assisted by Public Data
- DOI:
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Bassily, Raef; Cheu, Albert; Moran, Shay; Nikolov, Aleksandar; Ullman, Jonathan; Wu, Zhiwei Steven
- Corresponding author: Wu, Zhiwei Steven
Geopolymer materials based on natural zeolite
- DOI: 10.1016/j.cscm.2017.03.001
- Published: 2017-06-01
- Journal:
- Impact factor: 6.2
- Authors: Nikolov, Aleksandar; Rostovsky, Ivan; Nugteren, Henk
- Corresponding author: Nugteren, Henk
Other grants by Nikolov, Aleksandar
Algorithms and Private Data Analysis
- Grant number: CRC-2020-00004
- Fiscal year: 2022
- Funding amount: $46,600
- Project category: Canada Research Chairs
Geometric Methods in Data Analysis
- Grant number: RGPIN-2021-03206
- Fiscal year: 2022
- Funding amount: $46,600
- Project category: Discovery Grants Program - Individual
Geometric Methods in Data Analysis
- Grant number: RGPAS-2021-00030
- Fiscal year: 2022
- Funding amount: $46,600
- Project category: Discovery Grants Program - Accelerator Supplements
Geometric Methods in Data Analysis
- Grant number: RGPAS-2021-00030
- Fiscal year: 2021
- Funding amount: $46,600
- Project category: Discovery Grants Program - Accelerator Supplements
Algorithms And Private Data Analysis
- Grant number: CRC-2020-00004
- Fiscal year: 2021
- Funding amount: $46,600
- Project category: Canada Research Chairs
Algorithms and Private Data Analysis
- Grant number: 1000230936-2015
- Fiscal year: 2020
- Funding amount: $46,600
- Project category: Canada Research Chairs
Computational Discrepancy Theory
- Grant number: RGPIN-2016-06333
- Fiscal year: 2020
- Funding amount: $46,600
- Project category: Discovery Grants Program - Individual
Algorithms and Private Data Analysis
- Grant number: 1000233061-2019
- Fiscal year: 2020
- Funding amount: $46,600
- Project category: Canada Research Chairs
Computational Discrepancy Theory
- Grant number: RGPIN-2016-06333
- Fiscal year: 2019
- Funding amount: $46,600
- Project category: Discovery Grants Program - Individual
Algorithms and Private Data Analysis
- Grant number: 1000230936-2015
- Fiscal year: 2019
- Funding amount: $46,600
- Project category: Canada Research Chairs
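Similar National Natural Science Foundation of China (NSFC) grants
Computational Methods for Analyzing Toponome Data
- Grant number: 60601030
- Approval year: 2006
- Funding amount: CNY 170,000
- Project category: Young Scientists Fund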
Similar overseas grants
Novel statistical methods for data with non-Euclidean geometric structure
- Grant number: DP220102232
- Fiscal year: 2022
- Funding amount: $46,600
- Project category: Discovery Projects
Geometric Methods in Data Analysis
- Grant number: RGPIN-2021-03206
- Fiscal year: 2022
- Funding amount: $46,600
- Project category: Discovery Grants Program - Individual
Geometric Methods in Data Analysis
- Grant number: RGPAS-2021-00030
- Fiscal year: 2022
- Funding amount: $46,600
- Project category: Discovery Grants Program - Accelerator Supplements
Geometric Methods in Data Analysis
- Grant number: RGPAS-2021-00030
- Fiscal year: 2021
- Funding amount: $46,600
- Project category: Discovery Grants Program - Accelerator Supplements
CAREER: Geometric and Combinatorial Methods for Distribution-Free Inference and Dependent Network Data
- Grant number: 2046393
- Fiscal year: 2021
- Funding amount: $46,600
- Project category: Continuing Grant
Numerical Methods for Nonlinear Partial Differential Equations, with applications to Optimal Transportation, and Geometric Data Reduction
- Grant number: RGPIN-2016-03922
- Fiscal year: 2021
- Funding amount: $46,600
- Project category: Discovery Grants Program - Individual
Numerical Methods for Nonlinear Partial Differential Equations, with applications to Optimal Transportation, and Geometric Data Reduction
- Grant number: RGPIN-2016-03922
- Fiscal year: 2020
- Funding amount: $46,600
- Project category: Discovery Grants Program - Individual
Numerical Methods for Nonlinear Partial Differential Equations, with applications to Optimal Transportation, and Geometric Data Reduction
- Grant number: RGPIN-2016-03922
- Fiscal year: 2019
- Funding amount: $46,600
- Project category: Discovery Grants Program - Individual
CAREER: Variational and Geometric Methods for Data Analysis
- Grant number: 1752202
- Fiscal year: 2018
- Funding amount: $46,600
- Project category: Continuing Grant
Numerical Methods for Nonlinear Partial Differential Equations, with applications to Optimal Transportation, and Geometric Data Reduction
- Grant number: RGPIN-2016-03922
- Fiscal year: 2018
- Funding amount: $46,600
- Project category: Discovery Grants Program - Individual


















