Curation at scale: Integrating AI into community curation
Basic information
- Grant number: 10344771
- Principal investigator:
- Amount: $355,900
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-09-01 to 2025-05-31
- Project status: Not concluded
- Source:
- Keywords: Address, Algorithms, Area, Benchmarking, Biological, Biology, Bypass, Caenorhabditis elegans, Classification, Communities, Computer Assisted, Couples, Coupling, Data, Data Collection, Databases, FAIR principles, Feedback, Generations, Gold, Infrastructure, Ingestion, Knowledge, Literature, Machine Learning, Manuals, Methods, Microbiology, Modeling, Names, Natural Language Processing, Paper, Process, Publications, Publishing, Readability, Research, Research Personnel, Resources, Speed, Statistical Methods, System, Text, Time, Training, Validation, WormBase, biological research, dashboard, data submission, design, genome resource, improved, innovation, knowledge base, learning engagement, machine learning algorithm, machine learning method, member, novel strategies, repository, text searching, tool
Project Summary
Biological knowledgebases are a critical resource for researchers and accelerate scientific discoveries by
providing manually curated, machine-readable data collections. However, the aggregation and manual curation
of biological data is a labor-intensive process that relies almost entirely on professional biocurators. Two
approaches have been advanced to help with this problem: natural language processing (NLP; text mining (TM)
and machine learning (ML)) and engagement of researchers (community curation). However, neither of these
approaches alone is sufficient to address the critical need for increased efficiency in the biocuration process. Our
solution to these challenges is an NLP-enhanced community curation portal, Author Curation to Knowledgebase
(ACKnowledge). The ACKnowledge system, currently implemented for the C. elegans literature, couples
statistical methods and text mining algorithms to enhance community curation of research articles. We propose
to strengthen and expand ACKnowledge by including other species into our pipeline, incorporating more
sophisticated machine learning models, and presenting sentence-level entity and concept extraction for more
detailed author curation. In addition, we will develop an Author Curation Portal (ACP) to allow authors to easily
upload and curate their own documents. Taken together, these enhancements will allow us to maximize
community curation efforts by leveraging author expertise in multiple areas of biology, while at the same time
supporting authors with as much AI-assisted curation as possible. This reciprocal interaction will improve not
only the content of knowledgebases but also the AI methods themselves, as we will receive valuable feedback on our
models. By developing an Author Curation Portal, we will further empower authors to participate in the curation
process and alert knowledgebases to key information that can, and should, be readily discoverable in accordance
with FAIR (Findable, Accessible, Interoperable, and Reusable) data principles.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants of PAUL Warren STERNBERG
Curation at scale: Integrating AI into community curation
- Grant number: 10621338
- Fiscal year: 2021
- Funding amount: $355,900
- Project category:
Bipartite gene expression system for C. elegans genetic and neural circuit analysis
- Grant number: 9437389
- Fiscal year: 2017
- Funding amount: $355,900
- Project category:
Genetics 2012: Model Organism to Human Cancer
- Grant number: 8319996
- Fiscal year: 2012
- Funding amount: $355,900
- Project category:
Textpresso, information retrieval and extraction system for biological literature
- Grant number: 7347569
- Fiscal year: 2006
- Funding amount: $355,900
- Project category:
Textpresso, an information retrieval and extraction system for biological literat
- Grant number: 7047977
- Fiscal year: 2006
- Funding amount: $355,900
- Project category:
Textpresso, information retrieval and extraction system for biological literature
- Grant number: 7212077
- Fiscal year: 2006
- Funding amount: $355,900
- Project category:
Similar overseas grants
Approximate algorithms and architectures for area efficient system design
- Grant number: LP170100311
- Fiscal year: 2018
- Funding amount: $355,900
- Project category: Linkage Projects
AMPS: Rank Minimization Algorithms for Wide-Area Phasor Measurement Data Processing
- Grant number: 1736326
- Fiscal year: 2017
- Funding amount: $355,900
- Project category: Standard Grant
Low Power, Area Efficient, High Speed Algorithms and Architectures for Computer Arithmetic, Pattern Recognition and Cryptosystems
- Grant number: 1686-2013
- Fiscal year: 2017
- Funding amount: $355,900
- Project category: Discovery Grants Program - Individual
Rigorous simulation of speckle fields caused by large area rough surfaces using fast algorithms based on higher order boundary element methods
- Grant number: 375876714
- Fiscal year: 2017
- Funding amount: $355,900
- Project category: Research Grants
Low Power, Area Efficient, High Speed Algorithms and Architectures for Computer Arithmetic, Pattern Recognition and Cryptosystems
- Grant number: 1686-2013
- Fiscal year: 2016
- Funding amount: $355,900
- Project category: Discovery Grants Program - Individual
Low Power, Area Efficient, High Speed Algorithms and Architectures for Computer Arithmetic, Pattern Recognition and Cryptosystems
- Grant number: 1686-2013
- Fiscal year: 2015
- Funding amount: $355,900
- Project category: Discovery Grants Program - Individual
Low Power, Area Efficient, High Speed Algorithms and Architectures for Computer Arithmetic, Pattern Recognition and Cryptosystems
- Grant number: 1686-2013
- Fiscal year: 2014
- Funding amount: $355,900
- Project category: Discovery Grants Program - Individual
AREA: Optimizing gene expression with mRNA free energy modeling and algorithms
- Grant number: 8689532
- Fiscal year: 2014
- Funding amount: $355,900
- Project category:
CPS: Synergy: Collaborative Research: Distributed Asynchronous Algorithms and Software Systems for Wide-Area Monitoring of Power Systems
- Grant number: 1329780
- Fiscal year: 2013
- Funding amount: $355,900
- Project category: Standard Grant
CPS: Synergy: Collaborative Research: Distributed Asynchronous Algorithms and Software Systems for Wide-Area Mentoring of Power Systems
- Grant number: 1329745
- Fiscal year: 2013
- Funding amount: $355,900
- Project category: Standard Grant