UniProt - Protein sequence and function embeddings for AI/Machine Learning readiness

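Basic information

  • Grant number:
    10594115
  • Principal investigator:
  • Amount:
    $251,000
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2014
  • Funding country:
    United States
  • Project period:
    2014-09-18 to 2026-05-31
  • Project status:
    Ongoing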

Project abstract

UniProt Protein Sequence and Function Embeddings for AI/Machine Learning Readiness - Supplement Request 2022

Artificial intelligence and machine learning (AI/ML) have the potential to advance biomedical research. The overall goals of this Supplement application for the UniProt parent grant (U24HG007822) are to (i) support the AI/ML community by providing protein sequence embeddings and make use of these embeddings for accurate and fast protein clustering in UniRef production, (ii) explore methods of embedding UniProt functional annotation data, and (iii) engage with the AI/ML community to advance AI/ML readiness of UniProt data. UniProt has been a leader in the provision of protein sequence and annotation data since its inception in 2002. UniProt provides gold standard training data for hundreds of AI/ML applications in biomedical research. Protein sequence embeddings show enormous promise for protein clustering and for structural and functional analysis and prediction. By providing UniProt protein sequence embeddings, we will increase the accessibility of sequence embeddings, reduce duplication of effort in the community, and establish a standard that can facilitate evaluation and comparison of models. We will test different embedding methods, focusing on the most widely adopted methods as determined by our recent survey of the user community. We will also investigate using sequence embeddings to speed up sequence clustering in UniRef production. Similarly, there are many functional annotations in UniProt that are amenable to embedding for use in AI/ML models. As a test case, we will explore embedding of Rhea biochemical reaction annotations for enzymes and transporters. In addition to disseminating the embeddings, we will develop methods to visualize them and compare them to existing enzyme classification systems.
Finally, to ensure that our work aligns with community needs, we will organise a workshop to work with the community on their use cases and applications for embeddings. We will invite various stakeholders, including researchers who participated in the embeddings survey, participants in the metal binding site prediction challenge currently underway as part of the parent grant, and NIH representatives. Overall, the work in this proposal will enhance the readiness of UniProt for use in AI/ML and will integrate AI/ML methods into UniProt production.

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Alex Bateman

UniProt: A centralized protein sequence and function resource
  • Grant number:
    9114369
  • Fiscal year:
    2014
  • Funding amount:
    $251,000
  • Project category:
UniProt: A Protein Sequence and Function Resource for Biomedical Science
  • Grant number:
    10267787
  • Fiscal year:
    2014
  • Funding amount:
    $251,000
  • Project category:
UniProt - Enhancing functional genomics data access for the Alzheimer's Disease (AD) and dementia-related protein research communities
  • Grant number:
    10121011
  • Fiscal year:
    2014
  • Funding amount:
    $251,000
  • Project category:
UniProt: A centralized protein sequence and function resource
  • Grant number:
    8739769
  • Fiscal year:
    2014
  • Funding amount:
    $251,000
  • Project category:
UniProt: A centralized protein sequence and function resource
  • Grant number:
    9069018
  • Fiscal year:
    2014
  • Funding amount:
    $251,000
  • Project category:
UniProt: A Protein Sequence and Function Resource for Biomedical Science
  • Grant number:
    10663983
  • Fiscal year:
    2014
  • Funding amount:
    $251,000
  • Project category:
UniProt: A centralized protein sequence and function resource
  • Grant number:
    9276092
  • Fiscal year:
    2014
  • Funding amount:
    $251,000
  • Project category:
UniProt: A Protein Sequence and Function Resource for Biomedical Science
  • Grant number:
    10490361
  • Fiscal year:
    2014
  • Funding amount:
    $251,000
  • Project category:
UniProt: A centralized protein sequence and function resource
  • Grant number:
    10372430
  • Fiscal year:
    2014
  • Funding amount:
    $251,000
  • Project category:
UniProt building community metrics for FAIR and TRUSTworthy resources
  • Grant number:
    10595850
  • Fiscal year:
    2014
  • Funding amount:
    $251,000
  • Project category:

Similar overseas grants

WELL-CALF: optimising accuracy for commercial adoption
  • Grant number:
    10093543
  • Fiscal year:
    2024
  • Funding amount:
    $251,000
  • Project category:
    Collaborative R&D
Investigating the Adoption, Actual Usage, and Outcomes of Enterprise Collaboration Systems in Remote Work Settings.
  • Grant number:
    24K16436
  • Fiscal year:
    2024
  • Funding amount:
    $251,000
  • Project category:
    Grant-in-Aid for Early-Career Scientists
Unraveling the Dynamics of International Accounting: Exploring the Impact of IFRS Adoption on Firms' Financial Reporting and Business Strategies
  • Grant number:
    24K16488
  • Fiscal year:
    2024
  • Funding amount:
    $251,000
  • Project category:
    Grant-in-Aid for Early-Career Scientists
ERAMET - Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
  • Grant number:
    10107647
  • Fiscal year:
    2024
  • Funding amount:
    $251,000
  • Project category:
    EU-Funded
Assessing the Coordination of Electric Vehicle Adoption on Urban Energy Transition: A Geospatial Machine Learning Framework
  • Grant number:
    24K20973
  • Fiscal year:
    2024
  • Funding amount:
    $251,000
  • Project category:
    Grant-in-Aid for Early-Career Scientists
Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
  • Grant number:
    10106221
  • Fiscal year:
    2024
  • Funding amount:
    $251,000
  • Project category:
    EU-Funded
Our focus for this project is accelerating the development and adoption of resource efficient solutions like fashion rental through technological advancement, addressing longer in use and reuse
  • Grant number:
    10075502
  • Fiscal year:
    2023
  • Funding amount:
    $251,000
  • Project category:
    Grant for R&D
Engage2innovate – Enhancing security solution design, adoption and impact through effective engagement and social innovation (E2i)
  • Grant number:
    10089082
  • Fiscal year:
    2023
  • Funding amount:
    $251,000
  • Project category:
    EU-Funded
De-Adoption Beta-Blockers in patients with stable ischemic heart disease without REduced LV ejection fraction, ongoing Ischemia, or Arrhythmias: a randomized Trial with blinded Endpoints (ABbreviate)
  • Grant number:
    481560
  • Fiscal year:
    2023
  • Funding amount:
    $251,000
  • Project category:
    Operating Grants
Collaborative Research: SCIPE: CyberInfrastructure Professionals InnoVating and brOadening the adoption of advanced Technologies (CI PIVOT)
  • Grant number:
    2321091
  • Fiscal year:
    2023
  • Funding amount:
    $251,000
  • Project category:
    Standard Grant