UniProt: A Protein Sequence and Function Resource for Biomedical Science
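Basic Information
- Grant number: 10663983
- Principal investigator:
- Amount: $5.9341 million
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2014
- Funding country: United States
- Project period: 2014-09-18 to 2026-05-31
- Project status: Ongoing
- Source: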
- Keywords: Acceleration; Affect; Amino Acid Sequence; Artificial Intelligence; Biomedical Research; Catalogs; Cells; Collaborations; Communities; Complement; Complex; Cues; Data; Data Set; Development; Disease; Disease susceptibility; Distance Learning; Ensure; Environment; FAIR principles; Genomics; Genotype; Glean; Growth; Health; Hereditary Disease; Human; Human Genetics; Human Microbiome; Individual; International; Internet; Knowledge; Knowledge Extraction; Literature; Machine Learning; Methods; Modernization; Molecular; Molecular Biology; Molecular Sequence Data; Molecular Structure; Ontology; Organ; Orthologous Gene; Outcome; Paper; Pathway interactions; Pattern; Pharmaceutical Preparations; Phenotype; Play; Process; Production; Protein Array; Proteins; Publications; Readability; Readiness; Research; Research Personnel; Resources; Role; Science; Shapes; Site; Standardization; Structure; System; Technology; Tissues; Training; Triage; Variant; Work; biomedical data science; biomedical resource; crowdsourcing; data access; data reuse; deep learning; deep learning model; design; experience; experimental study; formycin triphosphate; genetic architecture; genomic variation; hackathon; human disease; improved; innovation; knowledge curation; knowledgebase; learning engagement; learning strategy; machine learning method; macromolecular assembly; meetings; new technology; pathogen; personalized diagnostics; prognostic; protein function; response; social media; symposium; text searching; web site; webinar
Project Abstract
PROJECT SUMMARY/ABSTRACT
This project continues the development of the UniProt Knowledgebase, which aims to provide the scientific
community with a comprehensive, high-quality, and freely accessible resource of protein sequences and
functional information. Proteins are an essential bridge between human genetics, the environment and
phenotype. While human genetics has increasing power to find correlations between genotype and phenotype,
knowledge of how proteins function, provided by UniProt, is essential for the mechanistic understanding needed
to improve health outcomes through better and more personalized diagnostics, prognostics, and treatments.
Biomedical research is being revolutionized by methods from the field of Artificial Intelligence, particularly
Machine Learning (ML) approaches such as Deep Learning (DL). These approaches now outperform
humans in many fields and are state of the art when sufficient data is available. UniProt provides gold-standard
training data for hundreds of ML applications in biomedical research. The work in this proposal will enhance the
readiness of UniProt for use in ML and will integrate ML methods to enhance our efficiency.
UniProt curators extract and synthesize experimental knowledge of proteins from papers in human- and
machine-readable forms using a range of standard ontologies. This proposal will further structure protein knowledge in
UniProt, developing complete, machine-readable catalogs of the functional impact of human variation and of
human protein networks and complexes, essential to understanding human disease. Efficiency of curation will
be improved using DL models, developed in collaboration with text mining experts, to automate the identification
of relevant papers and accelerate extraction of knowledge. This extracted knowledge will be validated by our
expert curators and by the wider research community, which will be actively engaged to further scale curation.
ML approaches will also be used to infer annotations for proteins with no experimental characterization, using
community challenges to develop faster, more accurate, scalable approaches to annotate the deluge of
uncharacterized proteins.
UniProt is an exemplar FAIR resource and has served the scientific community with metronomic data releases
despite an exponential growth in data volumes. Streamlined production processes will scale efficiently and
sustainably with both the growing data volume and complexity. We will explore novel technologies to ensure the
continued timely release of data to the community according to the FAIR principles.
UniProt is an international hub of protein data that serves hundreds of thousands of users annually. We will
continue using user-centric approaches to develop the UniProt website in response to user needs and new data
types. We will engage with our stakeholders and collaborators by introducing an annual strategic partnership
meeting. We will engage our communities through webinars, social media, hackathons and attendance at
scientific meetings to broaden the efficient and impactful use of our data.
Project Outcomes
Journal articles (23)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
COVID-19 Knowledge Graph from semantic integration of biomedical literature and databases.
- DOI: 10.1093/bioinformatics/btab694
- Published: 2021-12-07
- Journal:
- Impact factor: 0
- Authors: Chen C; Ross KE; Gavali S; Cowart JE; Wu CH
- Corresponding author: Wu CH
The Quest for Orthologs orthology benchmark service in 2022.
- DOI: 10.1093/nar/gkac330
- Published: 2022-07-05
- Journal:
- Impact factor: 14.9
- Authors: Nevers Y; Jones TEM; Jyothi D; Yates B; Ferret M; Portell-Silva L; Codo L; Cosentino S; Marcet-Houben M; Vlasova A; Poidevin L; Kress A; Hickman M; Persson E; Piližota I; Guijarro-Clarke C; OpenEBench team the Quest for Orthologs Consortium; Iwasaki W; Lecompte O; Sonnhammer E; Roos DS; Gabaldón T; Thybert D; Thomas PD; Hu Y; Emms DM; Bruford E; Capella-Gutierrez S; Martin MJ; Dessimoz C; Altenhoff A
- Corresponding author: Altenhoff A
A crowdsourcing open platform for literature curation in UniProt.
- DOI: 10.1371/journal.pbio.3001464
- Published: 2021-12
- Journal:
- Impact factor: 9.8
- Authors: Wang Y; Wang Q; Huang H; Huang W; Chen Y; McGarvey PB; Wu CH; Arighi CN; UniProt Consortium
- Corresponding author: UniProt Consortium
Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases.
- DOI: 10.1093/bib/bby061
- Published: 2019-09-27
- Journal:
- Impact factor: 9.5
- Authors: Rifaioglu AS; Atas H; Martin MJ; Cetin-Atalay R; Atalay V; Doğan T
- Corresponding author: Doğan T
The Minimum Information about a Molecular Interaction CAusal STatement (MI2CAST).
- DOI: 10.1093/bioinformatics/btaa622
- Published: 2021-04-05
- Journal:
- Impact factor: 0
- Authors: Touré V; Vercruysse S; Acencio ML; Lovering RC; Orchard S; Bradley G; Casals-Casas C; Chaouiya C; Del-Toro N; Flobak Å; Gaudet P; Hermjakob H; Hoyt CT; Licata L; Lægreid A; Mungall CJ; Niknejad A; Panni S; Perfetto L; Porras P; Pratt D; Saez-Rodriguez J; Thieffry D; Thomas PD; Türei D; Kuiper M
- Corresponding author: Kuiper M
Other grants by Alex Bateman
UniProt: A centralized protein sequence and function resource
- Grant number: 9114369
- Fiscal year: 2014
- Funding amount: $5.9341 million
- Project category:
UniProt: A Protein Sequence and Function Resource for Biomedical Science
- Grant number: 10267787
- Fiscal year: 2014
- Funding amount: $5.9341 million
- Project category:
UniProt - Enhancing functional genomics data access for the Alzheimer's Disease (AD) and dementia-related protein research communities
- Grant number: 10121011
- Fiscal year: 2014
- Funding amount: $5.9341 million
- Project category:
UniProt: A centralized protein sequence and function resource
- Grant number: 9069018
- Fiscal year: 2014
- Funding amount: $5.9341 million
- Project category:
UniProt: A centralized protein sequence and function resource
- Grant number: 8739769
- Fiscal year: 2014
- Funding amount: $5.9341 million
- Project category:
UniProt - Protein sequence and function embeddings for AI/Machine Learning readiness
- Grant number: 10594115
- Fiscal year: 2014
- Funding amount: $5.9341 million
- Project category:
UniProt: A centralized protein sequence and function resource
- Grant number: 9276092
- Fiscal year: 2014
- Funding amount: $5.9341 million
- Project category:
UniProt: A Protein Sequence and Function Resource for Biomedical Science
- Grant number: 10490361
- Fiscal year: 2014
- Funding amount: $5.9341 million
- Project category:
UniProt: A centralized protein sequence and function resource
- Grant number: 10372430
- Fiscal year: 2014
- Funding amount: $5.9341 million
- Project category:
UniProt building community metrics for FAIR and TRUSTworthy resources
- Grant number: 10595850
- Fiscal year: 2014
- Funding amount: $5.9341 million
- Project category:
Similar overseas grants
How Does Particle Material Properties Insoluble and Partially Soluble Affect Sensory Perception Of Fat based Products
- Grant number: BB/Z514391/1
- Fiscal year: 2024
- Funding amount: $5.9341 million
- Project category: Training Grant
BRC-BIO: Establishing Astrangia poculata as a study system to understand how multi-partner symbiotic interactions affect pathogen response in cnidarians
- Grant number: 2312555
- Fiscal year: 2024
- Funding amount: $5.9341 million
- Project category: Standard Grant
RII Track-4:NSF: From the Ground Up to the Air Above Coastal Dunes: How Groundwater and Evaporation Affect the Mechanism of Wind Erosion
- Grant number: 2327346
- Fiscal year: 2024
- Funding amount: $5.9341 million
- Project category: Standard Grant
Graduating in Austerity: Do Welfare Cuts Affect the Career Path of University Students?
- Grant number: ES/Z502595/1
- Fiscal year: 2024
- Funding amount: $5.9341 million
- Project category: Fellowship
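Construction of the Affect-X index of individual differences in sensibility and establishment of a foundation for bespoke AI services
- Grant number: 23K24936
- Fiscal year: 2024
- Funding amount: $5.9341 million
- Project category: Grant-in-Aid for Scientific Research (B)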
Insecure lives and the policy disconnect: How multiple insecurities affect Levelling Up and what joined-up policy can do to help
- Grant number: ES/Z000149/1
- Fiscal year: 2024
- Funding amount: $5.9341 million
- Project category: Research Grant
How does metal binding affect the function of proteins targeted by a devastating pathogen of cereal crops?
- Grant number: 2901648
- Fiscal year: 2024
- Funding amount: $5.9341 million
- Project category: Studentship
Investigating how double-negative T cells affect anti-leukemic and GvHD-inducing activities of conventional T cells
- Grant number: 488039
- Fiscal year: 2023
- Funding amount: $5.9341 million
- Project category: Operating Grants
New Tendencies of French Film Theory: Representation, Body, Affect
- Grant number: 23K00129
- Fiscal year: 2023
- Funding amount: $5.9341 million
- Project category: Grant-in-Aid for Scientific Research (C)
The Protruding Void: Mystical Affect in Samuel Beckett's Prose
- Grant number: 2883985
- Fiscal year: 2023
- Funding amount: $5.9341 million
- Project category: Studentship