Bayesian Learning for Sparse High-Dimensional Data
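Basic information
- Grant number: 2889818
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: United Kingdom
- Project category: Studentship
- Fiscal year: 2023
- Funding country: United Kingdom
- Duration: 2023 to (no data)
- Project status: Ongoing
- Source:
- Keywords:
Project abstract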
This project is focused on understanding uncertainty in machine learning models trained on limited datasets. There are many problems where the number of data points is small relative to the number of features. Typical solutions assume independence of features or use dimensionality reduction to learn a maximum likelihood projection of the data. For small data sets, learnt models are critically dependent on the actual data points used. The project will investigate whether Bayesian methods can be used to characterise the uncertainty of estimated parameters efficiently when developing machine learning models for sensor signal time series.
Much recent progress in machine learning has relied on the availability of large datasets, which allows the development of complex models. However, many problems in defence and security do not have access to such data, either because they require use of less widely studied sensors (such as sonar) or they relate to adversaries, who strive to limit data about their activities. Most published models rely on point estimates of parameters, achieved through algorithms such as maximum likelihood or stochastic gradient descent. However, when this type of model is applied in situations with limited data, the uncertainty associated with parameter estimates is usually not taken into account, either when integrating machine learning models into wider systems, or when assessing performance to predict how the model might behave in operational scenarios. Even when other approaches to deal with limited datasets are used, such as transfer learning, uncertainty characterisation is still important as there is often a mismatch between the distribution of the pre-training and training datasets.
This project aims to investigate to what extent Bayesian methods can be used to characterise the uncertainty of estimated parameters when dealing with sparse but potentially high-dimensional data sets, and how this can be implemented in a distributed computing setting. The expected outcome of the project is the development of suitable Bayesian algorithms, along with a software implementation, and an analysis of algorithm performance on relevant datasets.
The research will start with a literature review into appropriate approaches, which could include Variational Bayesian methods, Markov Chain Monte Carlo (MCMC), Sequential Monte Carlo (SMC), Approximate Bayesian Computation (ABC), and other approximate methods. Consideration will be given to the computational feasibility of the algorithms, including the extent to which computing can be distributed to multiple processors or virtual machines in a cloud infrastructure and the transparency (confidence) and performance improvements the various approaches could provide. Suitable innovative techniques will be developed, assessed, and compared against baseline approaches. Bayesian Neural Networks (BNN) will also be researched with implementations containing techniques such as SMC and MCMC methods, amongst others. The algorithms will be applied to a number of sponsor-supplied datasets, such as sonar sensor or electrical device measurement time-series. The research will be to determine the extent to which the uncertainty representation accommodates operational data that may not have the same distribution as the training data. Based on discussions with the sponsor and an analysis of the results, industrially relevant scenarios where the algorithms can be used will be identified.
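The contrast the abstract draws between point estimates and Bayesian uncertainty can be illustrated with a small sketch. This is not the project's method: it assumes a conjugate Gaussian linear model on synthetic data, and the function name `bayes_linear_posterior`, the prior and noise variances, and the problem sizes are all hypothetical choices made for illustration.

```python
import numpy as np

def bayes_linear_posterior(X, y, prior_var=1.0, noise_var=0.1):
    """Posterior mean and covariance of w for y = X w + noise,
    assuming prior w ~ N(0, prior_var * I) and noise ~ N(0, noise_var)."""
    n_features = X.shape[1]
    # Posterior precision = prior precision + data precision
    precision = np.eye(n_features) / prior_var + X.T @ X / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov

rng = np.random.default_rng(0)
n_points, n_features = 10, 50           # fewer data points than features
w_true = np.zeros(n_features)
w_true[:3] = [2.0, -1.0, 0.5]           # synthetic ground truth (assumption)
X = rng.normal(size=(n_points, n_features))
y = X @ w_true + rng.normal(scale=np.sqrt(0.1), size=n_points)

mean, cov = bayes_linear_posterior(X, y)

# Predictive mean and variance for a new input: the variance combines
# parameter uncertainty (x' cov x) with the observation noise.
x_new = rng.normal(size=n_features)
pred_mean = x_new @ mean
pred_var = x_new @ cov @ x_new + 0.1
```

With only 10 points in 50 dimensions, the data leave most directions of the weight space unconstrained, so the posterior variance along those directions stays at the prior value. A point estimate alone would report nothing about this.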
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants
An implantable biosensor microsystem for real-time measurement of circulating biomarkers
- Grant number: 2901954
- Fiscal year: 2028
- Funding amount: --
- Project category: Studentship
Exploiting the polysaccharide breakdown capacity of the human gut microbiome to develop environmentally sustainable dishwashing solutions
- Grant number: 2896097
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
A Robot that Swims Through Granular Materials
- Grant number: 2780268
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Likelihood and impact of severe space weather events on the resilience of nuclear power and safeguards monitoring.
- Grant number: 2908918
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Proton, alpha and gamma irradiation assisted stress corrosion cracking: understanding the fuel-stainless steel interface
- Grant number: 2908693
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Field Assisted Sintering of Nuclear Fuel Simulants
- Grant number: 2908917
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Assessment of new fatigue capable titanium alloys for aerospace applications
- Grant number: 2879438
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Developing a 3D printed skin model using a Dextran - Collagen hydrogel to analyse the cellular and epigenetic effects of interleukin-17 inhibitors in
- Grant number: 2890513
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Understanding the interplay between the gut microbiome, behavior and urbanisation in wild birds
- Grant number: 2876993
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
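Similar NSFC grants
Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
- Grant number:
- Approval year: 2024
- Funding amount:
- Project category: Collaborative Innovation Research Team
Understanding structural evolution of galaxies with machine learning
- Grant number: n/a
- Approval year: 2022
- Funding amount: 100,000 CNY
- Project category: Provincial/Municipal Project
Constrained dynamic multi-objective Q-learning evolutionary allocation of human-machine hybrid crowdsensing tasks for coal mine safety
- Grant number:
- Approval year: 2022
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund
Short-horizon online Q-learning cooperative control mechanism for intelligent munition formations accounting for leader failure
- Grant number: 62003314
- Approval year: 2020
- Funding amount: 240,000 CNY
- Project category: Young Scientists Fund
Research on e-learning resource recommendation methods integrating contextual tensor factorization
- Grant number: 61902016
- Approval year: 2019
- Funding amount: 240,000 CNY
- Project category: Young Scientists Fund
Research on Spiking-Transfer learning methods with temporal transfer capability
- Grant number: 61806040
- Approval year: 2018
- Funding amount: 200,000 CNY
- Project category: Young Scientists Fund
Research on deep-learning-based dynamic identification techniques for glacier monitoring in the Sanjiangyuan (Three-River Source) region
- Grant number: 51769027
- Approval year: 2017
- Funding amount: 380,000 CNY
- Project category: Regional Science Fund Project
Research on Spiking-Deep Learning methods with temporal processing capability
- Grant number: 61573081
- Approval year: 2015
- Funding amount: 640,000 CNY
- Project category: General Program
Automatic generation and optimization of large-scale personalized e-learning process models based on directed hypergraphs
- Grant number: 61572533
- Approval year: 2015
- Funding amount: 660,000 CNY
- Project category: General Program
Research on learner emotion compensation methods in e-learning
- Grant number: 61402392
- Approval year: 2014
- Funding amount: 260,000 CNY
- Project category: Young Scientists Fund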
Similar overseas grants
CAREER: Physics-inspired Machine Learning with Sparse and Asynchronous p-bits
- Grant number: 2237357
- Fiscal year: 2023
- Funding amount: --
- Project category: Continuing Grant
ATD: Sparse and Localized Graph Convolutional Networks for Anomaly Detection and Active Learning
- Grant number: 2220574
- Fiscal year: 2023
- Funding amount: --
- Project category: Standard Grant
Distributed video coding and deep learning using convolutional sparse dictionary generated with large scale datasets
- Grant number: 23K11159
- Fiscal year: 2023
- Funding amount: --
- Project category: Grant-in-Aid for Scientific Research (C)
Large-scale sparse learning using asynchronous architecture for interpretable model
- Grant number: 23K11213
- Fiscal year: 2023
- Funding amount: --
- Project category: Grant-in-Aid for Scientific Research (C)
Collaborative Research: CIF: Small: Robust Machine Learning under Sparse Adversarial Attacks
- Grant number: 2236484
- Fiscal year: 2023
- Funding amount: --
- Project category: Standard Grant
Collaborative Research: CIF: Small: Robust Machine Learning under Sparse Adversarial Attacks
- Grant number: 2236483
- Fiscal year: 2023
- Funding amount: --
- Project category: Standard Grant
Robust and scalable algorithms for learning hidden structures in sparse network data with the aid of side information
- Grant number: 2311024
- Fiscal year: 2023
- Funding amount: --
- Project category: Standard Grant
Reciprocal Perspective Machine Learning to Identify Relationships in Sparse Biological Networks
- Grant number: RGPIN-2021-04184
- Fiscal year: 2022
- Funding amount: --
- Project category: Discovery Grants Program - Individual
Robust sparse recovery and deep learning algorithms in computational mathematics
- Grant number: RGPIN-2020-06766
- Fiscal year: 2022
- Funding amount: --
- Project category: Discovery Grants Program - Individual
Causal Structure Learning from Sparse High Dimensional Data
- Grant number: RGPIN-2021-02856
- Fiscal year: 2022
- Funding amount: --
- Project category: Discovery Grants Program - Individual