Elements: Crowdsourced Materials Data Engine for Unpublished XRD Results
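Basic Information
- Award number: 2104007
- Principal investigator:
- Amount: $554,400
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-08-01 to 2024-07-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract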
Although data-driven analysis has been heralded as a new paradigm in fundamental material science such as X-ray Diffraction (XRD) analysis, high-value material datasets are often not made public and are underutilized. This project designs and develops CRUX, a crowdsourced data infrastructure and services to curate, discover, share, and recommend unpublished XRD data and analytical results. CRUX promotes underutilized high-quality material science data by allowing the sharing and exploration of unpublished data with state-of-the-art crowdsourcing, knowledge harvesting, and machine learning techniques. CRUX provides a crowdsourced knowledge base to allow scientists and the general public to share and access unpublished data resources. It also provides (a) a novel search engine that supports simple keyword search, can provide relevant data resources when the exact keyword matching does not exist, and self-evolves to improve the search quality, and (b) a "data feed" service to allow users to easily receive and track updates of specific data resources of interest. The developed infrastructure and tools enable an open, collaborative, and sustainable platform that can facilitate exchanging of unpublished XRD data and discoveries, unlock new research problems (e.g., predictive analysis of materials compositions with multi-phase data), and inspire the novel design of machine learning pipelines (e.g., deep neural networks) for data-driven materials science. CRUX will make materials data resources available and shareable for a broad community including materials scientists, data analysts, software developers, and the general public, and thus promote long-term collaborative research, software development, and education. 
The developed CRUX system enables (1) coherent representation of materials data, metadata, and knowledge in terms of a three-tier knowledge graph model; (2) scalable XRD metadata curation and information extraction techniques to promote high-value unpublished XRD data sources for data-driven materials research; (3) adaptive, self-improving search and recommendation techniques to recommend relevant datasets upon user requests and feedback, with sustainability beyond the time of the project; and (4) interactive and exploratory search techniques to explain and recommend the relevant datasets beyond the scope of initial queries. CRUX will be evaluated with established human-in-the-loop knowledge bases and active machine learning algorithms by cornerstone materials research such as the discovery of new high-temperature ferroelectrics. The research community will be able to share XRD data resources (analytical results, machine learning models, processing data) via "one-click" upload, search for high-quality data resources, and (re)discover new resources for machine learning pipelines. CRUX enables several components to advance data-driven materials research, including a materials knowledge graph model, automatic data integration, and an exploratory query engine that supports "Why" and "What-if" analysis for XRD analysis. Developed solutions will benefit data-driven material science in general. For example, researchers can make use of unpublished two-phase data to predict new materials compositions, identify solubility limits through parameterization by machine learning tools, and refine machine learning models with more sophisticated techniques such as deep neural networks.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal papers (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Subgraph Query Generation with Fairness and Diversity Constraints
- DOI: 10.1109/icde53745.2022.00278
- Publication date: 2022-05
- Journal:
- Impact factor: 0
- Authors: Hanchao Ma;Sheng Guan;Mengying Wang;Yen-shuo Chang;Yinghui Wu
- Corresponding authors: Hanchao Ma;Sheng Guan;Mengying Wang;Yen-shuo Chang;Yinghui Wu
Diversified Subgraph Query Generation with Group Fairness
- DOI: 10.1145/3488560.3498525
- Publication date: 2022-02
- Journal:
- Impact factor: 0
- Authors: Hanchao Ma;Sheng Guan;Christopher Toomey;Yinghui Wu
- Corresponding authors: Hanchao Ma;Sheng Guan;Christopher Toomey;Yinghui Wu
GALE: Active Adversarial Learning for Erroneous Node Detection in Graphs
- DOI: 10.1109/icde55515.2023.00134
- Publication date: 2023-04
- Journal:
- Impact factor: 0
- Authors: Sheng Guan;Hanchao Ma;Mengying Wang;Yinghui Wu
- Corresponding authors: Sheng Guan;Hanchao Ma;Mengying Wang;Yinghui Wu
Fair Group Summarization with Graph Patterns
- DOI: 10.1109/icde55515.2023.00154
- Publication date: 2023-04
- Journal:
- Impact factor: 0
- Authors: Hanchao Ma;Sheng Guan;Mengying Wang;Qi Song;Yinghui Wu
- Corresponding authors: Hanchao Ma;Sheng Guan;Mengying Wang;Qi Song;Yinghui Wu
CRUX: Crowdsourced Materials Science Resource and Workflow Exploration
- DOI: 10.1145/3511808.3557194
- Publication date: 2022-10
- Journal:
- Impact factor: 0
- Authors: Mengying Wang;Hanchao Ma;Abhishek Daundkar;Sheng Guan;Yiyang Bian;A. Sehirlioglu;Yinghui Wu
- Corresponding authors: Mengying Wang;Hanchao Ma;Abhishek Daundkar;Sheng Guan;Yiyang Bian;A. Sehirlioglu;Yinghui Wu
Other publications by Yinghui Wu
Understanding the Impact of Cu-In-Ga-S Nanoparticles Compactness on Holes Transfer of Perovskite Solar Cells
- DOI: 10.3390/nano9020286
- Publication date: 2019
- Journal:
- Impact factor: 5.3
- Authors: Dandan Zhao;Yinghui Wu;Bao Tu;Guichuan Xing;Haifeng Li;Zhubing He
- Corresponding author: Zhubing He
放射線の健康リスク科学教育は従来の放射線教育とどこが違うのか
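How does radiation health risk science education differ from conventional radiation education?
- DOI:
- Publication date: 2016
- Journal:
- Impact factor: 0
- Authors: Kunichika Matsumoto;Simpei Hanaoka;Yinghui Wu;Tomonori Hasegawa;神田玲子
- Corresponding author: 神田玲子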
Demonstration of Geyser: Provenance Extraction and Applications over Data Science Scripts
- DOI: 10.1145/3555041.3589717
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Fotis Psallidas;Megan Leszczynski;M. Namaki;Avrilia Floratou;Ashvin Agrawal;Konstantinos Karanasos;Subru Krishnan;Pavle Subotić;Markus Weimer;Yinghui Wu;Yiwen Zhu
- Corresponding author: Yiwen Zhu
Oxidative Stress and Inflammation in Sows with Excess Backfat: Up-Regulated Cytokine Expression and Elevated Oxidative Stress Biomarkers in Placenta
- DOI: 10.3390/ani9100796
- Publication date: 2019
- Journal:
- Impact factor: 0
- Authors: Yuanfei Zhou;Tao Xu;Yinghui Wu;Hongkui Wei;Jian Peng
- Corresponding author: Jian Peng
Synergy Effect of Both 2,2,2-Trifluoroethylamine Hydrochloride and SnF2 for Highly Stable FASnI3-xClx Perovskite Solar Cells
- DOI: 10.1002/solr.201800290
- Publication date: 2019
- Journal:
- Impact factor: 7.9
- Authors: Bin-Bin Yu;Leiming Xu;Min Liao;Yinghui Wu;Fangzhou Liu;Zhenfei Zhang;Jie Ding;Wei Chen;Bao Tu;Yi Lin;Yudong Zhu;Xusheng Zhang;Weitang Yao;Aleksandra B. Djurišić;Jin-Song Hu;Zhubing He
- Corresponding author: Zhubing He
Other grants of Yinghui Wu
BIGDATA: Collaborative Research: F: Association Analysis of Big Graphs: Models, Algorithms and Applications
- Award number: 1633629
- Fiscal year: 2016
- Funding amount: $554,400
- Project type: Standard Grant
Similar overseas grants
Design and Development of a Near Real-Time Community Crowdsourced Resilience Information System for Enhancing Community Resilience in the Face of Flooding and other Extreme Events
- Award number: 2325631
- Fiscal year: 2023
- Funding amount: $554,400
- Project type: Standard Grant
IMR: MM-1A: ADDRESS: Augment, Denoise and Debias cRowdsourced mEasurements for Statistical Synthesis of internet access characterization
- Award number: 2220417
- Fiscal year: 2022
- Funding amount: $554,400
- Project type: Standard Grant
Expertise analysis in crowdsourced software engineering
- Award number: 573415-2022
- Fiscal year: 2022
- Funding amount: $554,400
- Project type: University Undergraduate Student Research Awards
Effectively managing and leveraging crowdsourced knowledge for software engineering
- Award number: RGPIN-2021-03354
- Fiscal year: 2022
- Funding amount: $554,400
- Project type: Discovery Grants Program - Individual
PFI-TT: Crowdsourced Road Geometry Estimation using Smartphones
- Award number: 2044670
- Fiscal year: 2021
- Funding amount: $554,400
- Project type: Standard Grant
Effectively managing and leveraging crowdsourced knowledge for software engineering
- Award number: DGECR-2021-00441
- Fiscal year: 2021
- Funding amount: $554,400
- Project type: Discovery Launch Supplement
Leveraging Crowdsourced Data to Assess Spatiotemporal Patterns of Resilience in Diverse Gulf Coast Communities Impacted by Natural Hazards
- Award number: 2053588
- Fiscal year: 2021
- Funding amount: $554,400
- Project type: Standard Grant
Effectively managing and leveraging crowdsourced knowledge for software engineering
- Award number: RGPIN-2021-03354
- Fiscal year: 2021
- Funding amount: $554,400
- Project type: Discovery Grants Program - Individual
Investigating crowdsourced digital activism and the security threats these actions pose
- Award number: 2603532
- Fiscal year: 2021
- Funding amount: $554,400
- Project type: Studentship
CRII:HCC: A Crowdsourced Social Computing Platform with Gamification Mechanisms for Healthy Eating
- Award number: 2104515
- Fiscal year: 2021
- Funding amount: $554,400
- Project type: Standard Grant