EAGER: Collaborative Research: Mining Scientific Literature with the LAPPS Grid
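Basic information
- Award number: 1811123
- Principal investigator:
- Amount: $170,400
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2018
- Funding country: United States
- Period: 2018-06-01 to 2019-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract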
Scientists can no longer keep up with the ever-expanding number of scientific publications, and this inability is a fundamental bottleneck to scientific progress. Current search technologies are limited: they can find many relevant documents, but they cannot extract and organize the information content of those documents or suggest new scientific hypotheses based on the organized content. Natural Language Processing (NLP) based text mining strategies are a recognized means to approach this problem, but most scientists do not have the expertise or time to use them. In addition, the lack of interoperability among NLP tools, as well as the data in repositories scattered around the web, are barriers to sharing workflows, resources, and results. This project will identify what analysis features are needed within an easy-to-use platform for mining scientific texts, implement an initial version of such a platform, and make it available to scientists.
There is currently no open, easy-to-use platform for mining scientific texts that provides interoperable access to a wide array of software, computing resources, and publication data. Publicly available software (such as Google) is not geared toward publication data, and in-house tools are fragile and deliver only a fraction of relevant results. The main objective of this project is, therefore, to (1) identify the requirements for an easy-to-use platform for mining information from scientific publications and (2) deploy facilities that meet these needs. To achieve this goal, the project will extend the existing NSF-funded LAPPS Grid to include means to access a broad range of interoperable NLP tools, large bodies of publication data, and lexical and ontological resources, and, crucially, to rapidly adapt existing software to new domains and evaluate results. The project will also leverage enhancements to the NSF-funded Galaxy platform for interactive data exploration and extended access to NSF hardware resources (XSEDE machines including Stampede, Bridges, and Jetstream). By providing access to services for mining scientific publications and lowering the barriers to entry resulting from licensing, redistribution, and intellectual property concerns, this project provides capabilities that were previously unavailable to scientists. Researchers are able to perform large-scale text mining using an HPC infrastructure through a web-based interface without needing to know about the underlying infrastructure. Additionally, iterative domain adaptation capabilities enable scientists to easily adapt existing services to specialized areas without configuring or installing additional components. The ability to examine both explicit and implicit information scattered across massive repositories of publications will undoubtedly result in new observations and insights.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
The LAPPS Grid/Galaxy Platform for Mining Scientific Publications
- DOI:
- Publication date: 2019
- Journal:
- Impact factor: 0
- Authors: Nancy Ide, Keith Suderman
- Corresponding author: Nancy Ide, Keith Suderman
Towards cross-platform interoperability for machine-assisted annotation
- DOI:
- Publication date: 2019
- Journal:
- Impact factor: 0
- Authors: Eckart de Castilho, R.; Ide, N.; Kim, J.-D.; Klie, J.-C.; Suderman, K.
- Corresponding author: Suderman, K.
Mining Biomedical Publications With The LAPPS Grid
- DOI:
- Publication date: 2018
- Journal:
- Impact factor: 0
- Authors: Nancy Ide, Keith Suderman
- Corresponding author: Nancy Ide, Keith Suderman
Bridging the LAPPS Grid and CLARIN
- DOI:
- Publication date: 2018-05
- Journal:
- Impact factor: 0
- Authors: E. Hinrichs; Nancy Ide; J. Pustejovsky; Jan Hajic; Marie Hinrichs; Mohammad Fazleh Elahi; Keith Suderman; M. Verhagen; Kyeongmin Rim; P. Stranák; J. Misutka
- Corresponding author: E. Hinrichs; Nancy Ide; J. Pustejovsky; Jan Hajic; Marie Hinrichs; Mohammad Fazleh Elahi; Keith Suderman; M. Verhagen; Kyeongmin Rim; P. Stranák; J. Misutka
A Multi-Platform Annotation Ecosystem for Domain Adaptation
- DOI: 10.18653/v1/w19-4021
- Publication date: 2019
- Journal:
- Impact factor: 0
- Authors: Eckart de Castilho, Richard; Ide, Nancy; Kim, Jin-Dong; Klie, Jan-Christoph; Suderman, Keith
- Corresponding author: Suderman, Keith
Other publications by Nancy Ide
The Language Application Grid and Galaxy
- DOI:
- Publication date: 2016
- Journal:
- Impact factor: 0
- Authors: Nancy Ide; Keith Suderman; J. Pustejovsky; M. Verhagen; C. Cieri
- Corresponding author: C. Cieri
A statistical measure of theme and structure
- DOI: 10.1007/bf02176632
- Publication date: 1989
- Journal:
- Impact factor: 0
- Authors: Nancy Ide
- Corresponding author: Nancy Ide
Outline of a database model for electronic dictionaries
- DOI: 10.5555/3170967.3170995
- Publication date: 1991
- Journal:
- Impact factor: 0
- Authors: Nancy Ide; J. Véronis; J. Maitre
- Corresponding author: J. Maitre
Community Standards for Linguistically-Annotated Resources
- DOI: 10.1007/978-94-024-0881-2_4
- Publication date: 2017
- Journal:
- Impact factor: 0
- Authors: Nancy Ide; N. Calzolari; Judith Eckle; D. Gibbon; Sebastian Hellmann; Ki Yong Lee; Joakim Nivre; Laurent Romary
- Corresponding author: Laurent Romary
Preface to the special issue: LREC 2012: state of the art in resource development and evaluation
- DOI: 10.1007/s10579-014-9289-9
- Publication date: 2014-11-22
- Journal:
- Impact factor: 1.800
- Authors: Nancy Ide; Nicoletta Calzolari
- Corresponding author: Nicoletta Calzolari
Other grants of Nancy Ide
SI2-SSI: The Language Application Grid: A Framework for Rapid Adaptation and Reuse
- Award number: 1147944
- Fiscal year: 2012
- Award amount: $170,400
- Project type: Standard Grant
RUI: CRI: CI-ADDO-EN: Collaborative Research: MASC: A Community Resource For and By the People
- Award number: 1059312
- Fiscal year: 2011
- Award amount: $170,400
- Project type: Standard Grant
INTEROP: Sustainable Interoperability for Language Technology
- Award number: 0753069
- Fiscal year: 2008
- Award amount: $170,400
- Project type: Continuing Grant
CRI: CRD A Richly Annotated Resource for Language Processing and Linguistics Research
- Award number: 0708952
- Fiscal year: 2007
- Award amount: $170,400
- Project type: Continuing Grant
Collaborative Research: CRI: An Open Linguistic Infrastructure for American English
- Award number: 0551601
- Fiscal year: 2006
- Award amount: $170,400
- Project type: Standard Grant
CRI: An Open Linguistic Infrastructure for American English
- Award number: 0454130
- Fiscal year: 2005
- Award amount: $170,400
- Project type: Standard Grant
ITR: American National Corpus: A Primary Resource for Linguistics Research
- Award number: 0218609
- Fiscal year: 2002
- Award amount: $170,400
- Project type: Continuing Grant
XMELLT: Cross-lingual Multi-word Expression Lexicons for Language Technology
- Award number: 9982069
- Fiscal year: 2000
- Award amount: $170,400
- Project type: Standard Grant
American National Corpus: Planning and Exploration Workshop
- Award number: 9978422
- Fiscal year: 1999
- Award amount: $170,400
- Project type: Standard Grant
Workshop: Exploring US-Romanian Collaboration in Language Technology
- Award number: 9978601
- Fiscal year: 1999
- Award amount: $170,400
- Project type: Standard Grant
Similar overseas grants
Collaborative Research: EAGER: The next crisis for coral reefs is how to study vanishing coral species; AUVs equipped with AI may be the only tool for the job
- Award number: 2333604
- Fiscal year: 2024
- Award amount: $170,400
- Project type: Standard Grant
EAGER/Collaborative Research: An LLM-Powered Framework for G-Code Comprehension and Retrieval
- Award number: 2347624
- Fiscal year: 2024
- Award amount: $170,400
- Project type: Standard Grant
EAGER/Collaborative Research: Revealing the Physical Mechanisms Underlying the Extraordinary Stability of Flying Insects
- Award number: 2344215
- Fiscal year: 2024
- Award amount: $170,400
- Project type: Standard Grant
Collaborative Research: EAGER: Designing Nanomaterials to Reveal the Mechanism of Single Nanoparticle Photoemission Intermittency
- Award number: 2345581
- Fiscal year: 2024
- Award amount: $170,400
- Project type: Standard Grant
Collaborative Research: EAGER: Designing Nanomaterials to Reveal the Mechanism of Single Nanoparticle Photoemission Intermittency
- Award number: 2345582
- Fiscal year: 2024
- Award amount: $170,400
- Project type: Standard Grant
Collaborative Research: EAGER: Designing Nanomaterials to Reveal the Mechanism of Single Nanoparticle Photoemission Intermittency
- Award number: 2345583
- Fiscal year: 2024
- Award amount: $170,400
- Project type: Standard Grant
Collaborative Research: EAGER: Energy for persistent sensing of carbon dioxide under near shore waves.
- Award number: 2339062
- Fiscal year: 2024
- Award amount: $170,400
- Project type: Standard Grant
Collaborative Research: EAGER: IMPRESS-U: Groundwater Resilience Assessment through iNtegrated Data Exploration for Ukraine (GRANDE-U)
- Award number: 2409395
- Fiscal year: 2024
- Award amount: $170,400
- Project type: Standard Grant
Collaborative Research: EAGER: The next crisis for coral reefs is how to study vanishing coral species; AUVs equipped with AI may be the only tool for the job
- Award number: 2333603
- Fiscal year: 2024
- Award amount: $170,400
- Project type: Standard Grant
EAGER/Collaborative Research: An LLM-Powered Framework for G-Code Comprehension and Retrieval
- Award number: 2347623
- Fiscal year: 2024
- Award amount: $170,400
- Project type: Standard Grant