SI2-SSE: Human- and Machine-Intelligent Software Elements for Cost-Effective Scientific Data Digitization

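Basic Information

  • Award number:
    1535086
  • Principal investigator:
  • Amount:
    $488,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2015
  • Funding country:
    United States
  • Project period:
    2015-08-01 to 2020-07-31
  • Project status:
    Completed

Project Abstract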

In the era of data-intensive scientific discovery, Big Data scientists in all communities spend the majority of their time and effort collecting, integrating, curating, transforming, and assessing quality before actually performing discovery analysis. Some endeavors may even start from information not being available and accessible in digital form, and when it is available, it is often in non-structured form, not compatible with analytics tools that require structured and uniformly-formatted data. Two main methods to deal with the volume and variety of data as well as to accelerate the rate of digitization have been to apply crowdsourcing or machine-learning solutions. However, very little has been done to simultaneously take advantage of both types of solutions, and to make it easier for different efforts to share and reuse developed software elements. The vision of the Human- and Machine-Intelligent Network (HuMaIN) project is to accelerate scientific data digitization through fundamental advances in the integration and mutual cooperation between human and machine processing in order to handle practical hurdles and bottlenecks present in scientific data digitization. 
Even though HuMaIN concentrates on digitization tasks faced by the biodiversity community, the software elements being developed are generic in nature and are expected to be applicable to other scientific domains (e.g., exploring the surface of the moon for craters requires the same type of crowdsourcing tool as finding words in text, and the same question of whether machine-learning tools could provide similar results can be tested).
The HuMaIN project proposes to conduct research and develop the following software elements: (a) configurable Machine-Learning applications for scientific data digitization (e.g., Optical Character Recognition and Natural Language Processing), which will be made automatically available as RESTful services to increase the ability of HuMaIN software elements to interoperate with other elements while decreasing software development time via a new application specification language; (b) workflows leading to a cyber-human coordination system that will take advantage of feedback loops (e.g., based on consensus of crowdsourced data and its quality) for self-adaptation to changes and increased sustainability of the overall system; (c) new crowdsourcing micro-tasks that can be reused in a variety of scenarios and that contain user activity sensors for studying time-effective user interfaces; and (d) services to support automated creation and configuration of crowdsourcing workflows on demand to fit the needs of individual groups. A cloud-based system will be deployed to provide the necessary execution environment with traceability of service executions involved in cyber-human workflows, and cost-effectiveness analysis of all the software elements developed in this project will provide assessment and evaluation of long-standing what-if scenarios pertaining to human- and machine-intelligent tasks.
Crowdsourcing activities will attract a wide range of users with tasks that require little expertise, and at the same time they will expose volunteers to applied science and engineering, potentially attracting the interest of K-12 teachers and students.

Project Outcomes

Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Cooperative Human-Machine Data Extraction from Biological Collections
SELFIE: Self-Aware Information Extraction from Digitized Biocollections
  • DOI:
    10.1109/escience.2017.19
  • Publication year:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    Alzuru, Icaro;Matsunaga, Andrea;Tsugawa, Mauricio;Fortes, Jose A.B.
  • Corresponding author:
    Fortes, Jose A.B.
Quality-Aware Human-Machine Text Extraction for Biocollections using Ensembles of OCRs
  • DOI:
    10.1109/escience.2019.00020
  • Publication year:
    2019
  • Journal:
  • Impact factor:
    0
  • Authors:
    Alzuru, Icaro;Stephens, Rhiannon;Matsunaga, Andrea;Tsugawa, Mauricio;Flemons, Paul;Fortes, Jose A.B.
  • Corresponding author:
    Fortes, Jose A.B.
Task Design and Crowd Sentiment in Biocollections Information Extraction
  • DOI:
    10.1109/cic.2017.00056
  • Publication year:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    Alzuru, Icaro;Matsunaga, Andrea;Tsugawa, Mauricio;Fortes, Jose A.B.
  • Corresponding author:
    Fortes, Jose A.B.
Human-Machine Information Extraction Simulator for Biological Collections
  • DOI:
    10.1109/bigdata47090.2019.9005601
  • Publication year:
    2019
  • Journal:
  • Impact factor:
    0
  • Authors:
    Alzuru, Icaro;Malladi, Aditi;Matsunaga, Andrea;Tsugawa, Mauricio;Fortes, Jose A.B.
  • Corresponding author:
    Fortes, Jose A.B.

Other publications by Jose Fortes

Toward Construction of Resilient Software-Defined IT Infrastructure for Supporting Disaster Management Applications
  • DOI:
  • Publication year:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yasuhiro Watashiba;Jose Fortes;Jason Haga;Kohei Ichikawa;Susumu Date;Hirotake Abe;Yoshiyuki Kido;Hiroaki Yamanaka;Ryousei Takano;Ryusuke Egawa
  • Corresponding author:
    Ryusuke Egawa
A study on big data I/O performance with modern storage systems
  • DOI:
  • Publication year:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    Kenji Nakashima;Joichiro Kon;Gil Jae Lee;Jose Fortes;Saneyasu Yamaguchi
  • Corresponding author:
    Saneyasu Yamaguchi
PRAGMA-ENT: Exposing SDN Concepts to Domain Scientists in the Pacific Rim
  • DOI:
  • Publication year:
    2015
  • Journal:
  • Impact factor:
    0
  • Authors:
    Kohei Ichikawa;Mauricio Tsugawa;Jason Haga;Hiroaki Yamanaka;Te-Lung Liu;Yoshiyuki Kido;Pongsakorn U-Chupala;Che Huang;Chawanat Nakasan;Jo-Yu Chang;Li-Chi Ku;Whey-Fone Tsai;Susumu Date;Shinji Shimojo;Philip Papadopoulos;Jose Fortes
  • Corresponding author:
    Jose Fortes

Other grants by Jose Fortes

SCC-PG: Coordinated Safety Management Across Smart Communities
  • Award number:
    1951816
  • Fiscal year:
    2020
  • Award amount:
    $488,000
  • Project type:
    Standard Grant
EAGER: Towards the Web of Biodiversity Knowledge: Understanding Data Connectedness to Improve Identifier Practices
  • Award number:
    1839201
  • Fiscal year:
    2018
  • Award amount:
    $488,000
  • Project type:
    Standard Grant
US-EA CENTRA: US - East Asia Collaborations to Enable Transnational Cyberinfrastructure Applications
  • Award number:
    1550126
  • Fiscal year:
    2015
  • Award amount:
    $488,000
  • Project type:
    Continuing Grant
EAGER: Collaborative Research: Model-based Autonomic Cloud Computing Software Technology
  • Award number:
    1265341
  • Fiscal year:
    2013
  • Award amount:
    $488,000
  • Project type:
    Standard Grant
Second Workshop on Instrumentation Needs of Computer and Information Science and Engineering (INCISE2) Research
  • Award number:
    1232197
  • Fiscal year:
    2012
  • Award amount:
    $488,000
  • Project type:
    Standard Grant
Collaborative Research: Unified Cloud Computing and Management
  • Award number:
    1127965
  • Fiscal year:
    2011
  • Award amount:
    $488,000
  • Project type:
    Standard Grant
Autonomic Middleware for Self-protection, Data Transfers, and Anomaly Analytics as a Service
  • Award number:
    1032038
  • Fiscal year:
    2010
  • Award amount:
    $488,000
  • Project type:
    Standard Grant
Collaborative Research: Adaptive IT appliance for collaborative review of child-death cases
  • Award number:
    1042644
  • Fiscal year:
    2010
  • Award amount:
    $488,000
  • Project type:
    Standard Grant
TIE: UF-FIU inter-I/UCRC collaboration to explore autonomic computing for the TerraFly server system
  • Award number:
    0932023
  • Fiscal year:
    2009
  • Award amount:
    $488,000
  • Project type:
    Standard Grant
Collaborative Research, II-NEW: An Instrumented Data Center Infrastructure for Research on Cross-Layer Autonomics
  • Award number:
    0855123
  • Fiscal year:
    2009
  • Award amount:
    $488,000
  • Project type:
    Continuing Grant

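Similar NSFC (National Natural Science Foundation of China) grants

Mechanism by which the Streptococcus pyogenes secreted esterase Sse inhibits LC3-associated phagocytosis to promote invasion
  • Award number:
    82202525
  • Approval year:
    2022
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Mechanism by which the Streptococcus pyogenes secreted esterase Sse inhibits LC3-associated phagocytosis to promote invasion
  • Award number:
  • Approval year:
    2022
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Structural simulation of the interfacial transition layer and elimination of defect states at the Cu2ZnSn(SSe)4/CdS interface in solar cells
  • Award number:
    12274114
  • Approval year:
    2022
  • Award amount:
    CNY 550,000
  • Project type:
    General Program
Structural simulation of the interfacial transition layer and elimination of defect states at the Cu2ZnSn(SSe)4/CdS interface in solar cells
  • Award number:
  • Approval year:
    2022
  • Award amount:
    CNY 550,000
  • Project type:
    General Program
First-principles study of doping to achieve stable weak n-type characteristics in the surface layer of the Cu2ZnSn(SSe)4 absorber
  • Award number:
  • Approval year:
    2020
  • Award amount:
    CNY 240,000
  • Project type:
    Young Scientists Fund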

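Similar overseas grants

Establishing a method for detecting unknown SSEs from InSAR time series by combining anomaly detection with atmospheric noise correction
  • Award number:
    24K07168
  • Fiscal year:
    2024
  • Award amount:
    $488,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)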
A study on vibration theory for defect detection by acoustic excitation using SSE analysis
  • Award number:
    23K03995
  • Fiscal year:
    2023
  • Award amount:
    $488,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Revealing spatiotemporal slow slip evolution at higher temporal resolution by kinematic GNSS
  • Award number:
    21K14007
  • Fiscal year:
    2022
  • Award amount:
    $488,000
  • Project type:
    Grant-in-Aid for Early-Career Scientists
Study on defect detection by spatial spectral entropy (SSE) and healthy part evaluation for noncontact acoustic inspection
  • Award number:
    19K04414
  • Fiscal year:
    2019
  • Award amount:
    $488,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Numerical simulations of earthquake and SSE triggering by dynamic stress changes
  • Award number:
    18K03775
  • Fiscal year:
    2018
  • Award amount:
    $488,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)