Joining the dots: from data to insight
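Basic information
- Grant number: EP/N014189/1
- Principal investigator:
- Amount: $1.552 million
- Host institution:
- Host institution country: United Kingdom
- Project type: Research Grant
- Fiscal year: 2015
- Funding country: United Kingdom
- Period: 2015 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract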
The relentless growth of the amount, variety, availability, and rate of change of data has profoundly transformed essentially all aspects of human life. The Big Data revolution has created a paradox: while we create and collect more data than ever before, it is not always easy to unlock the information it contains. To turn the easy availability of data into a major scientific and economic advantage, it is imperative that we create analytic tools that are equal to the challenge presented by the complexity of modern data.
In recent years, breakthroughs in topological data analysis and machine learning have paved the way for significant progress towards creating efficient and reliable tools to extract information from data.
Our proposal has been designed to address the scope of the call as follows.
To 'convert the vast amounts of data produced into understandable, actionable information' we will create a powerful fusion of machine learning, statistics, and topological data analysis. Combining statistical insight and the computational power of machine learning with the flexibility, scalability, and visualisation tools of topology will allow a significant reduction in the complexity of the data under study. The results will be output in the form best suited to the intended application or scientific problem at hand. In this way, we will create a seamless pathway from data analysis to implementation, which will allow us to control every step of the process. In particular, the intended end user will be able to query the results of the analysis to extract the information relevant to them. In summary, our work will provide tools to extract information from complex data sets to support user investigations or decisions.
It is now well established that a main challenge of Big Data is how 'to efficiently and intelligently extract knowledge from heterogeneous, distributed data while retaining the context necessary for its interpretation'. This will be addressed first of all by developing techniques for dealing with heterogeneous data. A main strength of topology is its ability to identify simple components in complex systems. It can also provide guiding principles on how to combine elements to create a model of a complex system, and it provides numerical techniques to control the overall shape of the resulting model to ensure that it fits the original constraints. We will use the particular strengths of machine learning, statistics, and topology to identify the main properties of data, which will then be combined to provide an overall analysis of the data. For example, a collection of text documents can be analysed using machine learning techniques to create a graph which captures similarities between documents in a topological way. This is an efficient way to classify a corpus of documents according to a desired set of keywords. An important part of our investigation will be to develop robust techniques of data fusion, which matters in many applications. One of our main applications will address the problem of creating a set of descriptors to diagnose and treat asthma. There are five main pathways for clinical diagnosis of asthma, each supported by data. To create a coherent picture of the disease we need to understand how to combine the information contained in these separate data sets to create the so-called 'asthma handprint', which is a major challenge in this part of medicine.
Every novel methodology of data analysis has to prove that its 'techniques are realistic, compatible and scalable with real-world services and hardware systems'. The best way to do that is to engage from the outset with challenging applications, and to ensure that theoretical and modelling solutions fit the intended applications well. We offer a unique synergy between theory and modelling, as well as world-class facilities in medicine and chemistry, which will provide a strict test for our ideas and results.
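The document-similarity graph mentioned in the abstract can be illustrated with a minimal sketch. It is not the project's method: it assumes a plain bag-of-words representation, cosine similarity, and a hypothetical threshold of 0.5, and it uses connected components as the simplest topological summary of the resulting graph. The function names and sample documents are invented for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_graph(docs, threshold):
    """Link documents i and j when their cosine similarity >= threshold."""
    bags = [Counter(d.lower().split()) for d in docs]
    graph = {i: set() for i in range(len(docs))}
    for i, j in combinations(range(len(docs)), 2):
        if cosine(bags[i], bags[j]) >= threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def connected_components(graph):
    """Connected components of the graph, found by depth-first search."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical sample corpus; the threshold is an assumption, not a tuned value.
docs = [
    "asthma airway inflammation data",
    "airway inflammation in asthma patients",
    "persistent homology of point clouds",
    "homology of point clouds and graphs",
]
graph = similarity_graph(docs, threshold=0.5)
clusters = connected_components(graph)
print(clusters)
```

With these four sample documents the two asthma texts end up in one component and the two homology texts in another. Varying the threshold and tracking how components merge is the idea that persistent homology generalises to higher-dimensional features.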
Project outputs
Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Exactness of locally compact groups
- DOI: 10.1016/j.aim.2017.03.020
- Published: 2016-03
- Journal:
- Impact factor: 0
- Authors: J. Brodzki; Chris Cave; Kang Li
- Corresponding authors: J. Brodzki; Chris Cave; Kang Li
Lung Topology Characteristics in patients with Chronic Obstructive Pulmonary Disease
- DOI: 10.1038/s41598-018-23424-0
- Published: 2018-03-28
- Journal:
- Impact factor: 4.6
- Authors: Belchi, Francisco; Pirashvili, Mariam; Brodzki, Jacek
- Corresponding author: Brodzki, Jacek
On the Baum-Connes Conjecture for Groups Acting on CAT(0)-Cubical Spaces
- DOI: 10.1093/imrn/rnaa059
- Published: 2021
- Journal:
- Impact factor: 1
- Authors: Brodzki J
- Corresponding author: Brodzki J
$A_\infty$ Persistent Homology Estimates Detailed Topology from Pointcloud Datasets
- DOI: 10.1007/s00454-021-00319-y
- Published: 2021
- Journal:
- Impact factor: 0.8
- Authors: Belchí F
- Corresponding author: Belchí F
A differential complex for CAT(0) cubical spaces
- DOI: 10.1016/j.aim.2019.03.009
- Published: 2019
- Journal:
- Impact factor: 1.7
- Authors: Brodzki J
- Corresponding author: Brodzki J
Other publications by Jacek Brodzki
D-Branes, RR-Fields and Duality on Noncommutative Manifolds
- DOI: 10.1007/s00220-007-0396-y
- Published: 2007-12-05
- Journal:
- Impact factor: 2.600
- Authors: Jacek Brodzki; Varghese Mathai; Jonathan Rosenberg; Richard J. Szabo
- Corresponding author: Richard J. Szabo
Other grants held by Jacek Brodzki
Coarse geometry and cohomology of large data sets
- Grant number: EP/I016945/1
- Fiscal year: 2011
- Funding amount: $1.552 million
- Project type: Research Grant
Preventing wide-area blackouts through adaptive islanding of transmission networks
- Grant number: EP/G059101/1
- Fiscal year: 2010
- Funding amount: $1.552 million
- Project type: Research Grant
New directions in noncommutative geometry.
- Grant number: EP/G012296/1
- Fiscal year: 2008
- Funding amount: $1.552 million
- Project type: Research Grant
Analysis and geometry of metric spaces with applications in geometric group theory and topology.
- Grant number: EP/F031947/1
- Fiscal year: 2008
- Funding amount: $1.552 million
- Project type: Research Grant
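Similar grants from the National Natural Science Foundation of China (NSFC)
Controlled construction of macroscopic three-dimensional C-dots/Ta3N5 nanoporous materials and their photocatalytic overall water splitting performance
- Grant number: 21872023
- Approval year: 2018
- Funding amount: CNY 660,000
- Project type: General Program
Preparation of high-efficiency C3N4/C-Dots photocatalytic materials with different morphologies and the mechanism of their photocatalytic degradation of typical PPCPs
- Grant number: 21677040
- Approval year: 2016
- Funding amount: CNY 650,000
- Project type: General Program
Strategies for improving the quality of DOTS implementation for tuberculosis patients in the migrant population, based on a case-cohort follow-up design
- Grant number: 71473152
- Approval year: 2014
- Funding amount: CNY 620,000
- Project type: General Program
CTC identification and individualised tumour diagnosis and treatment based on quantum-dot multicolour fluorescent cell marker profiles
- Grant number: 30772507
- Approval year: 2007
- Funding amount: CNY 300,000
- Project type: General Program
Study of the in vivo distribution of cell-surface proteins and receptors using quantum dot technology
- Grant number: 30570686
- Approval year: 2005
- Funding amount: CNY 260,000
- Project type: General Program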
Similar overseas grants
Virtual nanostructure simulation (VINAS) portal
- Grant number: 10567076
- Fiscal year: 2023
- Funding amount: $1.552 million
- Project type:
Convenient rapid and portable tool for the detection of ribonucleases
- Grant number: 10760552
- Fiscal year: 2023
- Funding amount: $1.552 million
- Project type:
Overcoming pressure ulcers with engineered hormones and stem cells
- Grant number: 10821146
- Fiscal year: 2023
- Funding amount: $1.552 million
- Project type:
CAS: Collaborative Research: Integrative Learning of Fluorescence Fluctuations in Perovskite Quantum Dots Using A Data Science Assisted Single-Particle Approach
- Grant number: 2203854
- Fiscal year: 2022
- Funding amount: $1.552 million
- Project type: Standard Grant
Development and testing of Carbon Quantum Dot architectures to arrest neurotoxicant-insult-related outcomes
- Grant number: 10412365
- Fiscal year: 2022
- Funding amount: $1.552 million
- Project type:
Nanocrystal Quantum Dot Biomimetics of SARS-CoV-2 to Interrogate Neutrophil-Mediated Neuroinflammation at the Blood-Brain Barrier
- Grant number: 10510611
- Fiscal year: 2022
- Funding amount: $1.552 million
- Project type:
CAS: Collaborative Research: Integrative Learning of Fluorescence Fluctuations in Perovskite Quantum Dots Using A Data Science Assisted Single-Particle Approach
- Grant number: 2203700
- Fiscal year: 2022
- Funding amount: $1.552 million
- Project type: Standard Grant
Adaptive Tracking and Quantum Imaging for Protein-Protein Interactions
- Grant number: 10706952
- Fiscal year: 2022
- Funding amount: $1.552 million
- Project type:
Ultralong-term single-molecule imaging of amyloid precursor protein (APP) processing in Alzheimer's disease
- Grant number: 10738516
- Fiscal year: 2022
- Funding amount: $1.552 million
- Project type:
Adaptive Tracking and Quantum Imaging for Protein-Protein Interactions
- Grant number: 10296577
- Fiscal year: 2022
- Funding amount: $1.552 million
- Project type: