III: Small: Learning Multi-scale Sequence Features for Predicting Gene to Microbiome Function
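Basic information
- Award number: 2107108
- Principal investigator:
- Award amount: $492,900
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-09-01 to 2024-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract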
Microbial communities play vital roles in health and the environment. In human health, they are referred to as our microbiomes; for example, healthy gut microbiomes help digest food and efficiently convert it into nutrients that are taken up in the gut. However, what constitutes an “unhealthy” (dysbiotic) microbiome, and how such microbiomes affect or are affected by the body (or environment), is unknown. If we can understand microbes’ interactions with each other and the body, then we can design better treatments, therapies, and medicines (e.g., pre- and probiotics) to manipulate microbiomes. To understand the “rules” for microbial ecosystems, we must first solve the genotype-to-phenotype problem, i.e., identify microbial genetic changes that correlate with changes in microbiome functioning or traits. Most researchers have focused on predicting environmental or disease phenotypes solely from microbiome community structure (i.e., a population census of the species in a community) and do not consider detailed DNA/RNA differences. It is not surprising that most studies have yielded modest prediction accuracy and little understanding of how microbiomes function. Attributing which “configurations” of organisms and/or genes contribute to a particular “microbiome state” can help us predict disease, understand how the environment may change microbial ecosystems, and predict future changes in these systems (e.g., perturbations due to a chemical, temperature, etc.). Methods that can learn pertinent features at multiple scales (genome, organism, and community level) simultaneously are needed to interpret both the “species census” and the microbial genetic changes (mutations that may lead to speciation and/or functional evolution) that influence community structure.
Our educational activities will bring cutting-edge research and topics to undergraduate and graduate bioinformatics-related courses, which are part of the Machine Learning and Bioinformatics Master’s programs and a Bachelor’s bioinformatics minor at Drexel University. In addition, we plan to organize a Drexel College of Engineering-wide high school extracurricular program to mentor science projects at underserved public schools.
A unified algorithm is needed to learn microbiome features on multiple levels in order to predict microbiome functioning, thereby identifying the biological processes that result in important “states” (e.g., disease or health); that is, harnessing data to understand the rules of life, in line with NSF's 10 Big Ideas. Doing so will transform our understanding of how large- and small-scale changes influence microbiome phenotypes. Current approaches are highly limited. Phenotype prediction based on 16S rRNA surveys is usually conducted solely on microbial operational taxonomic units (OTUs), which rarely capture the mutations that signify overall phenotypic changes. Phenotype prediction using metagenomes may perform better than 16S surveys, but many downstream analyses (feature selection, statistical tests) are needed to interpret this classification (e.g., to infer subcommunities relevant to phenotype). Therefore, we propose to develop a recurrent neural network (RNN) that can learn both community-level changes in the microbiome and genetic changes that relate to microbiome phenotypes. While most neural networks can “learn” features, it is usually difficult to get this information back out of the network (i.e., interpretation). We will also use recent advances in attention-based RNNs to help interpret which multi-scale features are most important to phenotype prediction.
We will make our algorithms and software available to the microbiome community, with potential applications including improving agriculture, environmental monitoring, and personalized medicine, among others.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (11)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Predicting Institution Outcomes for Inter Partes Review (IPR) Proceedings at the United States Patent Trial & Appeal Board by Deep Learning of Patent Owner Preliminary Response Briefs
- DOI: 10.3390/app12073656
- Published: 2022-04
- Journal:
- Impact factor: 0
- Authors: B. Sokhansanj; G. Rosen
- Corresponding authors: B. Sokhansanj; G. Rosen
Physiological and evolutionary contexts of a new symbiotic species from the nitrogen-recycling gut community of turtle ants
- DOI: 10.1038/s41396-023-01490-1
- Published: 2023-08-09
- Journal:
- Impact factor: 11
- Authors: Bechade, Benoit; Cabuslay, Christian S.; Russell, Jacob A.
- Corresponding author: Russell, Jacob A.
How Scalable Are Clade-Specific Marker K-Mer Based Hash Methods for Metagenomic Taxonomic Classification?
- DOI: 10.3389/frsip.2022.842513
- Published: 2022-07-05
- Journal:
- Impact factor: 0
- Authors: Gray, Melissa; Zhao, Zhengqiao; Rosen, Gail L.
- Corresponding author: Rosen, Gail L.
Other publications by Gail Rosen
Low-power realization of FIR filters using current-mode analog design techniques
- DOI: 10.1109/acssc.2004.1399562
- Published: 2004
- Journal:
- Impact factor: 0
- Authors: V. Srinivasan; Gail Rosen; Paul Hasler
- Corresponding author: Paul Hasler
Implementation of a Hebbian chemoreceptor model for diffusive source localization
- DOI: 10.1016/j.biosystems.2009.02.003
- Published: 2009-06-01
- Journal:
- Impact factor:
- Authors: Gail Rosen; Paul Hasler; Mark T. Smith
- Corresponding author: Mark T. Smith
Predicting Anti-microbial Resistance using Large Language Models
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Hyunwoo Yoo; B. Sokhansanj; James R. Brown; Gail Rosen
- Corresponding author: Gail Rosen
Other grants by Gail Rosen
Collaborative Research: IIBR Informatics: Keeping up with the genomes - Continual Learning of Metagenomic Data
- Award number: 1936791
- Fiscal year: 2020
- Award amount: $492,900
- Project type: Standard Grant
MRI: Proteus++: Enabling Data-Intensive Computing at Drexel University
- Award number: 1919691
- Fiscal year: 2019
- Award amount: $492,900
- Project type: Standard Grant
Hypothesis-driven Computational Genomics: Engaging Students in Lab Protocols and Bioinformatics via Inquiry
- Award number: 1245632
- Fiscal year: 2013
- Award amount: $492,900
- Project type: Standard Grant
CAREER: A Machine Learning Framework for Metagenomic Relationships
- Award number: 0845827
- Fiscal year: 2009
- Award amount: $492,900
- Project type: Standard Grant
Inquiry-based Laboratories for Engaging Students of Creative and Performing Arts in STEM
- Award number: 0733284
- Fiscal year: 2007
- Award amount: $492,900
- Project type: Standard Grant
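Similar National Natural Science Foundation of China (NSFC) grants
Forensic application of circadian-rhythm small RNAs in estimating the time of bloodstain formation
- Award number:
- Approval year: 2024
- Award amount: CNY 0
- Project type: Provincial/municipal-level project
Mechanism by which tRNA-derived small RNA upregulates the YBX1/CCL5 pathway in bortezomib-induced chronic pain
- Award number: n/a
- Approval year: 2022
- Award amount: CNY 100,000
- Project type: Provincial/municipal-level project
Small RNA regulation of the type I-F CRISPR-Cas adaptive immune response and its molecular mechanism
- Award number: 32000033
- Approval year: 2020
- Award amount: CNY 240,000
- Project type: Young Scientists Fund
Mechanism of small RNA regulation of the biocontrol function of Bacillus amyloliquefaciens FZB42
- Award number: 31972324
- Approval year: 2019
- Award amount: CNY 580,000
- Project type: General Program
Mechanism by which Streptococcus mutans small RNAs link LuxS quorum sensing to biofilm formation
- Award number: 81900988
- Approval year: 2019
- Award amount: CNY 210,000
- Project type: Young Scientists Fund
Function and mechanism of key gut bacterial small RNAs in the onset and progression of Crohn's disease
- Award number: 31870821
- Approval year: 2018
- Award amount: CNY 560,000
- Project type: General Program
Molecular mechanism of pigeon milk secretion analyzed by small RNA sequencing
- Award number: 31802058
- Approval year: 2018
- Award amount: CNY 260,000
- Project type: Young Scientists Fund
Pathogenic mechanism of rice grassy stunt virus regulated by small RNA-mediated DNA methylation
- Award number: 31772128
- Approval year: 2017
- Award amount: CNY 600,000
- Project type: General Program
Immunoregulatory mechanism of acupuncture treatment for Hashimoto's thyroiditis based on small RNA-seq
- Award number: 81704176
- Approval year: 2017
- Award amount: CNY 200,000
- Project type: Young Scientists Fund
Regulation of small RNA biosynthesis by rice OsSGS3 and OsHEN1 and its effect on disease resistance
- Award number: 91640114
- Approval year: 2016
- Award amount: CNY 850,000
- Project type: Major Research Plan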
Similar overseas grants
III: Small: Multiple Device Collaborative Learning in Real Heterogeneous and Dynamic Environments
- Award number: 2311990
- Fiscal year: 2023
- Award amount: $492,900
- Project type: Standard Grant
III: SMALL: Graph Contrastive Learning for Few-Shot Node Classification
- Award number: 2229461
- Fiscal year: 2023
- Award amount: $492,900
- Project type: Standard Grant
III: Small: A Big Data and Machine Learning Approach for Improving the Efficiency of Two-sided Online Labor Markets
- Award number: 2311582
- Fiscal year: 2023
- Award amount: $492,900
- Project type: Standard Grant
III : Small : Integrating and Learning on Spatial Data via Multi-Agent Simulation
- Award number: 2311954
- Fiscal year: 2023
- Award amount: $492,900
- Project type: Standard Grant
III: Small: A New Machine Learning Paradigm Towards Effective yet Efficient Foundation Graph Learning Models
- Award number: 2321504
- Fiscal year: 2023
- Award amount: $492,900
- Project type: Standard Grant
III: Small: Deep Interactive Reinforcement Learning for Self-optimizing Feature Selection
- Award number: 2152030
- Fiscal year: 2022
- Award amount: $492,900
- Project type: Standard Grant
III: Small: Distributed Reinforcement Learning over Complex Networks
- Award number: 2230101
- Fiscal year: 2022
- Award amount: $492,900
- Project type: Standard Grant
EAGER: III: Small: Green Granular Neural Networks with Fast FPGA-based Incremental Transfer Learning
- Award number: 2234227
- Fiscal year: 2022
- Award amount: $492,900
- Project type: Standard Grant
Collaborative Research: III: Small: Robust Learning and Inference Protocols for Mitigating Information Pollution
- Award number: 2135581
- Fiscal year: 2022
- Award amount: $492,900
- Project type: Standard Grant
Collaborative Research: III: Small: Robust Learning and Inference Protocols for Mitigating Information Pollution
- Award number: 2135573
- Fiscal year: 2022
- Award amount: $492,900
- Project type: Standard Grant