ScienceLinker: A Framework for Finding, Linking, and Enriching Social Science Linked Data

Basic information

Project abstract

Scientists in applied empirical research typically search for datasets and, in particular, for measures within those datasets (e.g. variables, in the case of social science research) that allow them to investigate their specific research interest. These datasets serve multiple purposes, such as answering a particular research question, replicating a specific finding on a different dataset, or merging with another dataset to broaden the possibilities for analysis or to reduce missing values. However, finding suitable data and measures to support one’s own hypothesis is a challenging task. In many cases, a researcher will be able to find the desired data at a research data centre. Given the mass of data available on the web as a result of the Open Data movement, additional interesting datasets are likely to be available that are not provided by organized infrastructures such as research data centres. In addition, manual effort is still required to interlink the datasets found, e.g. to enrich one’s own datasets with additional content from the found data and metadata, also with a view to later publication in a journal, on a self-archiving platform, or on the web.

The ScienceLinker project pursues two approaches to these challenges: (1) to develop methods for identifying datasets published as Linked Open Data on the web that are compatible in content and of appropriate quality; (2) to apply Semantic Web technologies to the use of the data, e.g. for linking, enrichment, and publishing. These techniques will be made usable for non-domain users by applying extensive automation wherever possible. The framework aims to guide the user (e.g. an employee of a data provider who is responsible for publishing data, or a scientist who is seeking datasets in order to complete their own dataset with additional metadata) through the following five steps: the automatic identification of a set of related datasets published as Linked Open Data; the assessment of a dataset in terms of compatibility and quality; the linking of entities referenced in the dataset to the identified datasets; the enrichment of the dataset by applying a set of entity-type-specific rules to infer additional information about the entities, also via non-identity links; and the preprocessing of the enriched dataset for publication on self-archiving platforms, as Linked Data, or via further publication channels.

The investigations and developments in this project will be kept generic so that the framework can be applied in other domains. For the social sciences, potentially related Linked Data sources, e.g. DBpedia or Geonames, may be neither scientific nor from the social science domain at all. So that the ScienceLinker framework can also be executed in a neutral environment, we will integrate it into the established data integration platform Karma, which has been developed at ISI.
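The linking and enrichment steps named in the abstract can be illustrated with a minimal Python sketch. This is not the ScienceLinker implementation: it assumes exact matching on normalized labels as a stand-in for entity linking and a plain property copy as a stand-in for rule-based enrichment. All names (`Entity`, `link_entities`, `enrich`, the `sameAs` key, the example.org URIs) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """An entity from an external Linked Data source (hypothetical shape)."""
    uri: str
    label: str
    properties: dict = field(default_factory=dict)


def normalize(label: str) -> str:
    """Lowercase a label and collapse whitespace so near-identical labels compare equal."""
    return " ".join(label.lower().split())


def link_entities(values, external_entities):
    """Link each local value to the external entity with the same normalized label.

    Assumption: exact label matching only; a real linker would need fuzzier
    matching and disambiguation.
    """
    index = {normalize(e.label): e for e in external_entities}
    links = {}
    for value in values:
        match = index.get(normalize(value))
        if match is not None:
            links[value] = match
    return links


def enrich(records, key, links, wanted_properties):
    """Return copies of the records, extended with data from their linked entities."""
    enriched = []
    for record in records:
        new_record = dict(record)
        entity = links.get(record[key])
        if entity is not None:
            new_record["sameAs"] = entity.uri  # hypothetical key for the identity link
            for prop in wanted_properties:
                if prop in entity.properties:
                    new_record[prop] = entity.properties[prop]
        enriched.append(new_record)
    return enriched


# Toy data for illustration only.
external = [
    Entity("http://example.org/resource/Germany", "Germany", {"iso_code": "DE"}),
    Entity("http://example.org/resource/France", "France", {"iso_code": "FR"}),
]
records = [{"country": "germany "}, {"country": "Atlantis"}]

links = link_entities([r["country"] for r in records], external)
for row in enrich(records, "country", links, ["iso_code"]):
    print(row)
```

In this toy run the first record gains a `sameAs` link and an `iso_code`, while the unmatched "Atlantis" record is returned unchanged. The input records are copied rather than modified, so the original dataset stays intact for the later publication step.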

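Project outcomes

Number of journal articles (0)
Number of monographs (0)
Number of research awards (0)
Number of conference papers (0)
Number of patents (0)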

Other publications by Dr. Benjamin Zapilko

Similar overseas grants

Integrating Self-Regulated Learning Into STEM Courses: Maximizing Learning Outcomes With The Success Through Self-Regulated Learning Framework
  • Grant number:
    2337176
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Standard Grant
CAREER: Many-Body Green's Function Framework for Materials Spectroscopy
  • Grant number:
    2337991
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Standard Grant
CAREER: Resilient and Efficient Automatic Control in Energy Infrastructure: An Expert-Guided Policy Optimization Framework
  • Grant number:
    2338559
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Standard Grant
Planning Grant: Developing capacity to attract diverse students to the geosciences: A public relations framework
  • Grant number:
    2326816
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Standard Grant
RII Track-4:NSF: An Integrated Urban Meteorological and Building Stock Modeling Framework to Enhance City-level Building Energy Use Predictions
  • Grant number:
    2327435
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Standard Grant
EAGER/Collaborative Research: An LLM-Powered Framework for G-Code Comprehension and Retrieval
  • Grant number:
    2347624
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Standard Grant
CRII: OAC: A Compressor-Assisted Collective Communication Framework for GPU-Based Large-Scale Deep Learning
  • Grant number:
    2348465
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Standard Grant
Collaborative Research: An Integrated Framework for Learning-Enabled and Communication-Aware Hierarchical Distributed Optimization
  • Grant number:
    2331710
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Standard Grant
Collaborative Research: An Integrated Framework for Learning-Enabled and Communication-Aware Hierarchical Distributed Optimization
  • Grant number:
    2331711
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Standard Grant
CAREER: A Universal Framework for Safety-Aware Data-Driven Control and Estimation
  • Grant number:
    2340089
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Standard Grant