SIFTER: A Systems Biology Platform for Protein Function Prediction
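Basic information
- Award number: 1122732
- Principal investigator:
- Amount: $240,000
- Host institution:
- Host institution country: United States
- Project type: Fellowship Award
- Fiscal year: 2011
- Funding country: United States
- Period: 2011-09-01 to 2014-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract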
Proteins are key biomolecules involved in virtually all processes within cells (e.g., metabolism, cell signaling, and immune response), and knowledge of protein function is vital to a basic understanding of cellular activity. Due to recent advances in nucleotide sequencing technology, the number of available genomic sequences is doubling roughly every 12 months, a pace that vastly exceeds Moore's law. The experimental technologies required to decipher protein function have not progressed nearly as fast. In fact, although there are roughly 10 million protein sequences in the comprehensive UniProt database, only 0.2% have experimentally validated function annotations. This sequence-function gap is rapidly expanding, and the development of computational methods is of crucial importance to effectively utilize this deluge of sequence data.
In this work, we develop SIFTER, a large-scale systems biology platform to accurately predict protein function from high-throughput data. Building upon a promising phylogenomic-based prototype, we incorporate interaction networks into our model to improve performance. Interaction data intrinsically couple the thousands to millions of proteins within such networks, and we use variational inference and parallelized implementations to address this challenging computational problem. We also explore techniques for function prediction based on low-rank matrix factorization and, along the way, introduce novel sampling-based approaches to speed up computation. Additionally, we develop algorithms to quantify uncertainty in SIFTER's predictions to help guide future experimental work. These novel algorithms are large-scale extensions to classical bootstrap sampling and are generally applicable to any problem involving massive data. Finally, we evaluate SIFTER in collaboration with experimental biologists, allowing us to pinpoint relevant use cases and resulting in an effective method with widespread impact within the biomedical community.
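The abstract describes its uncertainty-quantification algorithms as large-scale extensions of classical bootstrap sampling. As a point of reference only, the following is a minimal sketch of the classical percentile bootstrap, not SIFTER's method or any of its large-scale extensions. The function name `bootstrap_interval` and the input data are hypothetical and chosen purely for illustration.

```python
import random
import statistics


def bootstrap_interval(values, stat=statistics.mean,
                       n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(values).

    Hypothetical helper for illustration; not part of SIFTER.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    estimates = []
    for _ in range(n_resamples):
        # Resample n items with replacement from the observed data.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        estimates.append(stat(resample))
    estimates.sort()
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the estimates.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Made-up indicators (1 = a prediction matched a validated annotation).
hypothetical_hits = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
low, high = bootstrap_interval(hypothetical_hits)
print(f"approximate 95% interval for accuracy: [{low:.2f}, {high:.2f}]")
```

Each resample requires a full pass over the data, which is the cost that motivates bootstrap variants designed for massive datasets.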
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Ameet Talwalkar
AutoML Decathlon: Diverse Tasks, Modern Methods, and Efficiency at Scale
- DOI:
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Nicholas Roberts; Samuel Guo; Cong Xu; Ameet Talwalkar; David Lander; Lvfang Tao; Linhang Cai; Shuaicheng Niu; Jianyu Heng; Hongyang Qin; Minwen Deng; Johannes Hog; Alexander Pfefferle; Sushil Ammanaghatta Shivakumar; Arjun Krishnakumar; Yubo Wang; R. Sukthanker; Frank Hutter; Euxhen Hasanaj; Tien; M. Khodak; Yuriy Nevmyvaka; Kashif Rasul; Frederic Sala; Anderson Schneider; Junhong Shen; Evan R. Sparks
- Corresponding author: Evan R. Sparks
NAS-Bench-360: Benchmarking Diverse Tasks for Neural Architecture Search
- DOI:
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Renbo Tu; M. Khodak; Nicholas Roberts; Ameet Talwalkar
- Corresponding author: Ameet Talwalkar
On the support recovery of marginal regression.
- DOI:
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: S. J. Kazemitabar; A. Amini; Ameet Talwalkar
- Corresponding author: Ameet Talwalkar
Applying interpretable machine learning in computational biology—pitfalls, recommendations and opportunities for new developments
- DOI: 10.1038/s41592-024-02359-7
- Published: 2024-08-09
- Journal:
- Impact factor: 32.100
- Authors: Valerie Chen; Muyu Yang; Wenbo Cui; Joon Sik Kim; Ameet Talwalkar; Jian Ma
- Corresponding author: Jian Ma
Targeted treatment of folate receptor-positive platinum-resistant ovarian cancer and companion diagnostics, with specific focus on vintafolide and etarfolatide
- DOI:
- Published:
- Journal:
- Impact factor: 0
- Authors: Nicholas Roberts; Samuel Guo; Cong Xu; Ameet Talwalkar; David Lander; Lvfang Tao; Linhang Cai; Shuaicheng Niu; Jianyu Heng; Hongyang Qin; Minwen Deng; Johannes Hog; Alexander Pfefferle; Sushil Ammanaghatta Shivakumar; Arjun Krishnakumar; Yubo Wang; R. Sukthanker; Frank Hutter; Euxhen Hasanaj; Tien; M. Khodak; Yuriy Nevmyvaka; Kashif Rasul; Frederic Sala; Anderson Schneider; Junhong Shen; Evan R. Sparks
- Corresponding author: Evan R. Sparks
Other grants by Ameet Talwalkar
Travel: NSF Student Travel Grant for the Sixth Conference on Machine Learning and Systems (MLSys 2023)
- Award number: 2325547
- Fiscal year: 2023
- Amount: $240,000
- Project type: Standard Grant
CAREER: Foundations of Next-Generation Neural Architecture Search
- Award number: 2046613
- Fiscal year: 2021
- Amount: $240,000
- Project type: Continuing Grant
BIGDATA: F: Optimization in Federated Networks of Devices
- Award number: 1838017
- Fiscal year: 2019
- Amount: $240,000
- Project type: Standard Grant
Model-Parallel Collaborative Filtering in Apache Spark
- Award number: 1555772
- Fiscal year: 2015
- Amount: $240,000
- Project type: Standard Grant
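Similar NSFC (National Natural Science Foundation of China) grants
Graphon mean field games with partial observation and application to failure detection in distributed systems
- Award number:
- Approval year: 2025
- Amount: CNY 0
- Project type: Provincial/municipal project
Mechanism by which Guilu Erxian Jiao (龟鹿二仙胶) treats oligoasthenozoospermia by regulating the HIF-1α/System Xc- pathway to inhibit ferroptosis, based on the theory that "yang transforms qi, yin forms shape"
- Award number:
- Approval year: 2024
- Amount: CNY 150,000
- Project type: Provincial/municipal project
Estimating Large Demand Systems with Machine Learning Techniques
- Award number:
- Approval year: 2024
- Amount:
- Project type: Research fund for foreign scholars
Understanding complicated gravitational physics by simple two-shell systems
- Award number: 12005059
- Approval year: 2020
- Amount: CNY 240,000
- Project type: Young Scientists Fund
Simulation and certification of the ground state of many-body systems on quantum simulators
- Award number:
- Approval year: 2020
- Amount: CNY 400,000
- Project type:
Genome-wide systems mapping of the genetic mechanisms of interspecies interactions among three bacterial species
- Award number: 31971398
- Approval year: 2019
- Amount: CNY 580,000
- Project type: General Program
The formation and evolution of planetary systems in dense star clusters
- Award number: 11043007
- Approval year: 2010
- Amount: CNY 100,000
- Project type: Special Fund project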
Similar overseas grants
Computational Systems Biology for Investigating Infectious Diseases
- Award number: 502567
- Fiscal year: 2024
- Amount: $240,000
- Project type:
Computational topology and geometry for systems biology
- Award number: EP/Z531224/1
- Fiscal year: 2024
- Amount: $240,000
- Project type: Research Grant
BIORETS: Convergence Research Experiences for Teachers in Synthetic and Systems Biology to Address Challenges in Food, Health, Energy, and Environment
- Award number: 2341402
- Fiscal year: 2024
- Amount: $240,000
- Project type: Standard Grant
Acute human gingivitis systems biology
- Award number: 484000
- Fiscal year: 2023
- Amount: $240,000
- Project type: Operating Grants
A UK-Japan partnership for synergising synthetic biology with systems biology.
- Award number: BB/X018318/1
- Fiscal year: 2023
- Amount: $240,000
- Project type: Research Grant
Personalized Cancer Treatment Strategies with Systems Biology and AI
- Award number: 23H03494
- Fiscal year: 2023
- Amount: $240,000
- Project type: Grant-in-Aid for Scientific Research (B)