CCRI: Research Infrastructure: NEW: Semantic Scholar Open Data Platform: Enabling Research Into Scientific Search and Discovery

Basic Information

Project Abstract

The exponential growth of scientific publication makes it difficult for scientists to track developments in their field and make connections between different advances. In response, artificial-intelligence researchers have started to develop techniques that allow computers to ‘read’ scientific papers and automatically classify topics, extract key results, summarize contributions, identify connections, and select a personalized set of papers that may be of special interest to each scientist. The enduring vision is to build AI systems that can process an immense corpus of scholarly documents and augment the capabilities of human scientists, accelerating scientific discovery and helping humanity quickly confront disasters such as the COVID-19 pandemic. The proposed Semantic Scholar Open Data Platform builds infrastructure to support this research by first gathering a comprehensive set of papers and arranging for efficient indexing. The system processes PDF-formatted papers to extract information and uses advanced analytic processing approaches to give researchers access to the results. The infrastructure will dramatically lower the barrier to entry for newcomers to the field of scholarly document processing, improve reproducibility of experiments, and accelerate innovation in the important area of AI-augmented scientific discovery. The proposed infrastructure is unique because alternative sources of academic papers are closed, incomplete, limited in programmatic access, or retired.
The proposed Semantic Scholar Open Data Platform has three parts: 1) a comprehensive set of online services enabling researchers to programmatically search, filter, extract, summarize, and analyze a large and continually updated corpus of documents; 2) a new mechanism that enables researchers to curate their own domain-specific text corpora, much as the team previously created the CORD-19 dataset for coronavirus research; 3) open source software, including pretrained language models and user interface templates, to serve as research building blocks. Together, these components will dramatically lower the barrier to entry for newcomers to the field of scholarly document processing, improve reproducibility of experiments, and accelerate innovation in the important area of AI-augmented scientific discovery. Fortunately, the recent increase in research in scholarly document processing (e.g., the rapid uptake of our CORD-19 dataset) shows that the computer and information science community has the interest and capability to develop new technologies that accelerate science and help meet global societal challenges, such as pandemics and climate change. The resulting advances in AI-augmented scientific discovery will benefit all areas of science, spurring medical advances, creating new jobs, and improving access for blind researchers. We will improve global infrastructure by providing open services, data sets, code, and associated educational materials. The team will also engage underrepresented STEM students and conduct K-12 outreach. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Daniel Weld

RAPID: Augmented Intelligence for Accelerating Covid-Related Scientific Discovery
  • Award number:
    2040196
  • Fiscal year:
    2020
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant
RI: Small: Improving Crowd-Sourced Annotation by Autonomous Intelligent Agents
  • Award number:
    1420667
  • Fiscal year:
    2014
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant
RI: Small: Decision-Theoretic Control of Crowd-Sourced Workflows
  • Award number:
    1016713
  • Fiscal year:
    2010
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant
RI: Small: Integrating Paradigms for Approximate Stochastic Planning
  • Award number:
    1016465
  • Fiscal year:
    2010
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant
Supporting Students Attending IUI 2009 Conference
  • Award number:
    0914591
  • Fiscal year:
    2009
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant
Representation and Reasoning about Adaptive Interfaces
  • Award number:
    0307906
  • Fiscal year:
    2003
  • Funding amount:
    $2 million
  • Project type:
    Continuing Grant
Extending Graphplan to Handle Uncertainty and Sensing Actions
  • Award number:
    9872128
  • Fiscal year:
    1998
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant
Principled Planning with Simultaneous Actions, Metric Time and Continuous Effects
  • Award number:
    9303461
  • Fiscal year:
    1994
  • Funding amount:
    $2 million
  • Project type:
    Continuing Grant
Presidential Young Investigator Award
  • Award number:
    8957302
  • Fiscal year:
    1989
  • Funding amount:
    $2 million
  • Project type:
    Continuing Grant
Managing Complexity in Qualitative Physics
  • Award number:
    8902010
  • Fiscal year:
    1989
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant

Similar Grants from the National Natural Science Foundation of China

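Research on Quantum Field Theory without a Lagrangian Description
  • Award number:
    24ZR1403900
  • Approval year:
    2024
  • Funding amount:
    CNY 0
  • Project type:
    Provincial/Municipal Project
Cell Research
  • Award number:
    31224802
  • Approval year:
    2012
  • Funding amount:
    CNY 240,000
  • Project type:
    Special Fund Project
Cell Research
  • Award number:
    31024804
  • Approval year:
    2010
  • Funding amount:
    CNY 240,000
  • Project type:
    Special Fund Project
Cell Research
  • Award number:
    30824808
  • Approval year:
    2008
  • Funding amount:
    CNY 240,000
  • Project type:
    Special Fund Project
Research on the Rapid Growth Mechanism of KDP Crystal
  • Award number:
    10774081
  • Approval year:
    2007
  • Funding amount:
    CNY 450,000
  • Project type:
    General Program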

Similar Overseas Grants

Collaborative Research: Research Infrastructure: CCRI: ENS: Enhanced Open Networked Airborne Computing Platform
  • Award number:
    2235160
  • Fiscal year:
    2023
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant
Collaborative Research: CCRI: New: Syntactic Differencing Infrastructure for Software Evolution Research
  • Award number:
    2232594
  • Fiscal year:
    2023
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant
Collaborative Research: CCRI: New: CoMIC: A Collaborative Mobile Immersive Computing Research Infrastructure for Multi-user XR
  • Award number:
    2235050
  • Fiscal year:
    2023
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant
Collaborative Research: Research Infrastructure: CCRI: New: Distributed Space and Terrestrial Networking Infrastructure for Multi-Constellation Coexistence
  • Award number:
    2235140
  • Fiscal year:
    2023
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant
Collaborative Research: CCRI: New: A Research News Recommender Infrastructure with Live Users for Algorithm and Interface Experimentation
  • Award number:
    2232554
  • Fiscal year:
    2023
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant
Collaborative Research: CCRI: New: A Research News Recommender Infrastructure with Live Users for Algorithm and Interface Experimentation
  • Award number:
    2232551
  • Fiscal year:
    2023
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant
Collaborative Research: CCRI: New: CoMIC: A Collaborative Mobile Immersive Computing Research Infrastructure for Multi-user XR
  • Award number:
    2235049
  • Fiscal year:
    2023
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant
Collaborative Research: CCRI: Planning-C: An Infrastructure and Dataset for Research in Android Testing & Analysis
  • Award number:
    2235137
  • Fiscal year:
    2023
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant
Collaborative Research: Research Infrastructure: CCRI: New: Distributed Space and Terrestrial Networking Infrastructure for Multi-Constellation Coexistence
  • Award number:
    2235139
  • Fiscal year:
    2023
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant
Collaborative Research: CCRI: New: A Research News Recommender Infrastructure with Live Users for Algorithm and Interface Experimentation
  • Award number:
    2232555
  • Fiscal year:
    2023
  • Funding amount:
    $2 million
  • Project type:
    Standard Grant