NSF Convergence Accelerator-Track D: Application of Sequential Inductive Transfer Learning for Experimental Metadata Normalization to Enable Rapid Integrative Analysis
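Basic Information
- Award number: 2040521
- Principal investigator:
- Amount: $1 million
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2020
- Funding country: United States
- Period: 2020-09-15 to 2022-08-31
- Project status: Completed
- Source:
- Keywords: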
Project Abstract
The NSF Convergence Accelerator supports use-inspired, team-based, multidisciplinary efforts that address challenges of national importance and will produce deliverables of value to society in the near future. This project, NSF Convergence Accelerator-Track D: Application of sequential inductive transfer learning for experimental metadata normalization to enable rapid integrative analysis, develops tools to support integrative analyses and meta-analyses across multiple, distinct research databases. With the explosion in data-driven research, researchers even in a single domain of study are confronted with many different databases that use different terminologies and measurement schemes. Thus, data become siloed: collected via different processes and described by different metadata schemes, with no central index of databases, metadata, or variables. This makes it difficult for a researcher to identify data of the appropriate type for use in integrative analyses or meta-analyses. In Phase I of this effort, a multidisciplinary team of researchers and experts in statistics, epidemiology, data harmonization, machine learning, ethics, databases, imaging, and software engineering will develop tools to link metadata across four biomedical databases as a proof of concept. The linked information will be available via the MetaMatchMaker (3M) portal to be developed by the project.
While traditional neural network approaches could be used to link experimental metadata, they can be time-consuming, requiring the construction of large training datasets. This project employs an alternative approach based on Pretrained Learning Models (PLMs), combining methods used in Natural Language Processing (NLP) and transfer learning, so that data-driven models built in one domain can be applied to another without the time and expense of developing large training datasets.
In Phase I of the effort, a PLM will be developed from a large existing manually trained dataset of PhenX–dbGaP metadata linkage. The PLM will then be used to link metadata from four diverse biomedical databases. First, the results from Phase I would enable rapid and broader identification of experimental data, with less time and fewer resources devoted to data normalization; second, the PLM approach is expected to provide significant savings in linking experimental metadata across databases by eliminating, or greatly reducing, the need to develop training data. Phase II of this effort will expand the number and variety of linked databases and make 3M compliant with developing federated data access procedures for biomedical data, such as the Global Alliance for Genomics and Health (GA4GH)'s Authentication and Authorization Infrastructure. The metrics for success of this approach include increased speed and reduced cost of conducting integrative analyses, and increased reuse of linked data. While the proof of concept in Phase I is based on the linkage of biomedical data, if successful, this approach would be applicable to databases from many other domains including, for example, national security, weather, environmental research, geosciences, astronomy, forensic analysis, and law enforcement.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Grier Page
Other grants by Grier Page
SGER: A Power and Sample Size Atlas for Microarray Research
- Award number: 0306596
- Fiscal year: 2003
- Funding amount: $1 million
- Project type: Standard Grant
Similar overseas grants
NSF Convergence Accelerator Track L: HEADLINE - HEAlth Diagnostic eLectronIc NosE
- Award number: 2343806
- Fiscal year: 2024
- Funding amount: $1 million
- Project type: Standard Grant
NSF Convergence Accelerator track L: Translating insect olfaction principles into practical and robust chemical sensing platforms
- Award number: 2344284
- Fiscal year: 2024
- Funding amount: $1 million
- Project type: Standard Grant
NSF Convergence Accelerator Track K: Unraveling the Benefits, Costs, and Equity of Tree Coverage in Desert Cities
- Award number: 2344472
- Fiscal year: 2024
- Funding amount: $1 million
- Project type: Standard Grant
NSF Convergence Accelerator Track L: Smartphone Time-Resolved Luminescence Imaging and Detection (STRIDE) for Point-of-Care Diagnostics
- Award number: 2344476
- Fiscal year: 2024
- Funding amount: $1 million
- Project type: Standard Grant
NSF Convergence Accelerator Track L: Intelligent Nature-inspired Olfactory Sensors Engineered to Sniff (iNOSES)
- Award number: 2344256
- Fiscal year: 2024
- Funding amount: $1 million
- Project type: Standard Grant
NSF Convergence Accelerator Track K: COMPASS: Comprehensive Prediction, Assessment, and Equitable Solutions for Storm-Induced Contamination of Freshwater Systems
- Award number: 2344357
- Fiscal year: 2024
- Funding amount: $1 million
- Project type: Standard Grant
NSF Convergence Accelerator Track M: Water-responsive Materials for Evaporation Energy Harvesting
- Award number: 2344305
- Fiscal year: 2024
- Funding amount: $1 million
- Project type: Standard Grant
NSF Convergence Accelerator (L): Innovative approach to monitor methane emissions from livestock using an advanced gravimetric microsensor.
- Award number: 2344426
- Fiscal year: 2024
- Funding amount: $1 million
- Project type: Standard Grant
NSF Convergence Accelerator, Track K: Mapping the nation's wetlands for equitable water quality, monitoring, conservation, and policy development
- Award number: 2344174
- Fiscal year: 2024
- Funding amount: $1 million
- Project type: Standard Grant
NSF Convergence Accelerator Track M: A new biomanufacturing process for making precipitated calcium carbonate and plant-based compounds that support human health
- Award number: 2344228
- Fiscal year: 2024
- Funding amount: $1 million
- Project type: Standard Grant













