Data-Driven Algorithms for Data Acquisition
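Basic information
- Grant number: EP/Y037200/1
- Principal investigator:
- Amount: $1.5663 million
- Host institution:
- Host institution country: United Kingdom
- Project type: Research Grant
- Fiscal year: 2024
- Funding country: United Kingdom
- Period: 2024 to (no data)
- Project status: Ongoing
- Source:
- Keywords:
Project abstract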
Advances in machine learning have transformed our ability to utilize data. But far less progress has been made on intelligently acquiring such data in the first place. Consequently, though data-driven approaches are now ubiquitous across science and industry, hand-crafted and heuristic approaches are typically still the norm for data acquisition itself.
My goal is to address this shortfall by developing principled quantitative methods for data acquisition. In particular, I will construct adaptive algorithms that leverage information from previous data to guide future data acquisition. The basis for doing this will be the framework of Bayesian adaptive design (BAD), which formalizes the utility of data through the information it provides, then exploits this to optimize the controllable aspects of the acquisition process.
Despite its principled foundations, BAD has not yet seen substantial uptake due to some key challenges in its deployment. Most notably, it has crippling computational bottlenecks that undermine its usage. By overcoming these with a new policy-based approach, I hope to turn BAD's potential into a reality, providing a powerful basis for intelligent data acquisition in domains as diverse as interactive surveys, virtual assistants, laboratory experiments, and psychology trials.
One area of particular focus will be active learning, wherein one iteratively selects points to label from an unlabelled pool. Here BAD has already provided some success, but I believe it is currently fundamentally misapplied. I hope to substantially improve the state of the art in the area through various innovations, such as targeting information gain in predictions rather than parameters, properly utilizing unlabelled data, and developing policy-based approaches. I further propose to revisit the foundations of the Bayesian neural network models often used in such settings, questioning their fundamental assumptions and developing radically new approaches.
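To make "the utility of data through the information it provides" concrete, the sketch below estimates the expected information gain (EIG) of a design with a generic nested Monte Carlo estimator and picks the best of a few candidate designs. It is an illustration only, not the policy-based method the project proposes. The toy model (theta ~ N(0, 1), y | theta, xi ~ N(xi * theta, 1)), the function names, the candidate designs, and the sample sizes are all assumptions chosen because this model has a closed-form EIG of 0.5 * log(1 + xi^2) to compare against.

```python
import numpy as np


def log_likelihood(y, theta, xi):
    """Log density of y under the assumed toy model y | theta, xi ~ N(xi * theta, 1)."""
    return -0.5 * np.log(2 * np.pi) - 0.5 * (y - xi * theta) ** 2


def nmc_eig(xi, n_outer=2000, n_inner=2000, seed=0):
    """Nested Monte Carlo estimate of the expected information gain of design xi.

    EIG(xi) = E[ log p(y | theta, xi) - log p(y | xi) ], with the marginal
    p(y | xi) approximated by an inner average over fresh prior samples.
    """
    rng = np.random.default_rng(seed)
    # Outer samples: draw theta from the prior, then simulate an outcome y.
    theta_outer = rng.standard_normal(n_outer)
    y = xi * theta_outer + rng.standard_normal(n_outer)
    # Inner samples: one shared set of prior draws, reused for every y to save cost.
    theta_inner = rng.standard_normal(n_inner)
    log_lik = log_likelihood(y, theta_outer, xi)
    # Matrix of log p(y_n | theta_m, xi), shape (n_outer, n_inner).
    inner = log_likelihood(y[:, None], theta_inner[None, :], xi)
    # Log of the inner average, computed stably in log space.
    log_marginal = np.logaddexp.reduce(inner, axis=1) - np.log(n_inner)
    return float(np.mean(log_lik - log_marginal))


def exact_eig(xi):
    """Closed-form EIG for the toy linear-Gaussian model, used only as a check."""
    return 0.5 * np.log1p(xi ** 2)


# Score a few hypothetical candidate designs and keep the most informative one.
designs = [0.5, 1.0, 2.0]
estimates = {xi: nmc_eig(xi) for xi in designs}
best_design = max(estimates, key=estimates.get)
print(best_design, estimates[best_design], exact_eig(best_design))
```

The inner loop over prior samples for every simulated outcome is what makes this estimator expensive, and it has to be repeated for every candidate design at every step of an adaptive experiment. That cost is one instance of the computational bottleneck the abstract refers to.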
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Tom Rainforth
Statistical Verification of Neural Networks
- DOI:
- Year: 2018
- Journal:
- Impact factor: 0
- Authors: Stefan Webb; Tom Rainforth; Y. Teh; M. P. Kumar
- Corresponding author: M. P. Kumar
On Nesting Monte Carlo Estimators – Supplementary Material
- DOI:
- Year: 2018
- Journal:
- Impact factor: 0
- Authors: Tom Rainforth; R. Cornish; Hongseok Yang; Andrew Warrington; Frank D. Wood
- Corresponding author: Frank D. Wood
On the Opportunities and Pitfalls of Nesting Monte Carlo Estimators
- DOI:
- Year: 2017
- Journal:
- Impact factor: 0
- Authors: Tom Rainforth; R. Cornish; Hongseok Yang; Andrew Warrington; Frank D. Wood
- Corresponding author: Frank D. Wood
Target-Aware Bayesian Inference: How to Beat Optimal Conventional Estimators
- DOI:
- Year: 2020
- Journal:
- Impact factor: 6
- Authors: Tom Rainforth; Adam Goliński; F. Wood; Sheheryar Zaidi; Kilian Q. Weinberger
- Corresponding author: Kilian Q. Weinberger
Disentangling Disentanglement
- DOI:
- Year: 2018
- Journal:
- Impact factor: 0
- Authors: E. Mathieu; Tom Rainforth; Siddharth Narayanaswamy; Yee Whye Teh
- Corresponding author: Yee Whye Teh
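Similar NSFC (National Natural Science Foundation of China) grants
Data-driven Recommendation System Construction of an Online Medical Platform Based on the Fusion of Information
- Grant number:
- Year approved: 2024
- Funding amount: (in units of 10,000 CNY)
- Project type: Research Fund for International Young Scientists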
Similar overseas grants
Development of Data-Collection Algorithms and Data-Driven Control Methods for Guaranteed Stabilization of Nonlinear Systems with Uncertain Equilibria and Orbits
- Grant number: 23K03913
- Fiscal year: 2023
- Funding amount: $1.5663 million
- Project type: Grant-in-Aid for Scientific Research (C)
CAREER: Towards Harnessing the Motility of Microorganisms: Fast Algorithms, Data-Driven Models, and 3D Interactive Visual Computing
- Grant number: 2408964
- Fiscal year: 2023
- Funding amount: $1.5663 million
- Project type: Continuing Grant
Smart QI: Developing a scalable platform for integrating data-driven algorithms that drive quality improvement in pediatric sepsis care at health facilities in Uganda
- Grant number: 460680
- Fiscal year: 2022
- Funding amount: $1.5663 million
- Project type: Miscellaneous Programs
Learning Transparent models from Data Driven algorithms to Enhance streaming data analysis
- Grant number: 2748733
- Fiscal year: 2022
- Funding amount: $1.5663 million
- Project type: Studentship
NeTS: Small: Hybrid Switching in Data Center Networks: Systems-driven Modeling and Principled Algorithms
- Grant number: 2309187
- Fiscal year: 2022
- Funding amount: $1.5663 million
- Project type: Standard Grant
CAREER: Towards Harnessing the Motility of Microorganisms: Fast Algorithms, Data-Driven Models, and 3D Interactive Visual Computing
- Grant number: 2146191
- Fiscal year: 2022
- Funding amount: $1.5663 million
- Project type: Continuing Grant
Data driven splitting and composition algorithms
- Grant number: 2594279
- Fiscal year: 2021
- Funding amount: $1.5663 million
- Project type: Studentship
HDR Institute: Accelerated AI Algorithms for Data-Driven Discovery
- Grant number: 2117997
- Fiscal year: 2021
- Funding amount: $1.5663 million
- Project type: Cooperative Agreement
Collaborative Research: Transfer Learning for Large-Scale Inference: General Framework and Data-Driven Algorithms
- Grant number: 2015259
- Fiscal year: 2020
- Funding amount: $1.5663 million
- Project type: Standard Grant
Collaborative Research: Transfer Learning for Large-Scale Inference: General Framework and Data-Driven Algorithms
- Grant number: 2015339
- Fiscal year: 2020
- Funding amount: $1.5663 million
- Project type: Standard Grant