Methods for Hypothesis-driven Analysis of Sequential Data (HydrAS)
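Basic information
- Grant number: 438232455
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: Germany
- Project type: Research Grants
- Fiscal year:
- Funding country: Germany
- Duration:
- Project status: Ongoing
- Source:
- Keywords:
Project abstract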
The increasing availability of large-scale digital trace data on human behavior requires the development of suitable algorithmic approaches in computer and data science. Such data often comes in the form of sequences, e.g., sequences of visited websites or of locations in a city. To analyze this kind of data and extract knowledge at large scale, the applicants and others presented a novel computational approach that uses a Bayesian framework to compare hypotheses (derived from intuition, previous studies, or social theories) with respect to their plausibility given the observed sequences. In this project, we will develop fundamentally new data analysis methods in that direction that overcome current shortcomings. To this end, we will (1) systematize and simplify the process of hypothesis elicitation by integrating (semi-)automatic procedures for deriving interpretable base hypotheses from background knowledge and for combining base hypotheses with each other. Additionally, we aim to (2) develop methods that account for heterogeneity in the data by partitioning the sequences such that each partition can be succinctly described in terms of background information on the features and the transition behavior in each partition can be explained by given hypotheses. Finally, we will (3) extend the general framework of hypothesis-based analysis of sequential data, which currently focuses on simple first-order Markov chain models, to more complex models such as hidden Markov models, continuous-time Markov chain models, or neural networks for sequential data. This would allow us to formalize more complex and more fine-grained hypotheses, to pick the models most suitable for a specific scenario, and to integrate additional information (e.g., time information) in an easily understandable way.
In contrast to many recently proposed methods in data science and machine learning, our research will not focus on methods that yield maximum predictive power. Instead, we concentrate on finding potential explanations of the data generation process that human domain experts can understand, by incorporating their hypotheses directly into the analysis process. The project will thus provide unique opportunities to integrate hypothesis-driven data analysis with advanced machine learning techniques to support the understanding of the processes that generate the observed sequences. While this project focuses on developing new data science methods for analyzing human behavior, we expect the results to be easily transferable to other application areas featuring sequential data.
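The abstract does not spell out how hypotheses are compared, so the following is only a minimal sketch of one way a Bayesian comparison over a first-order Markov chain could look. It assumes each hypothesis is given as a matrix of believed transition probabilities and is turned into Dirichlet prior pseudo-counts; the function names, the concentration parameter `k`, and the toy data are all hypothetical and not taken from the project.

```python
from math import lgamma

def transition_counts(sequences, states):
    """Count observed first-order transitions between consecutive states."""
    index = {s: i for i, s in enumerate(states)}
    counts = [[0] * len(states) for _ in states]
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[index[a]][index[b]] += 1
    return counts

def log_evidence(counts, hypothesis, k=10.0):
    """Log marginal likelihood of the counts under row-wise Dirichlet priors.

    Assumption: the prior pseudo-counts are alpha_ij = 1 + k * hypothesis_ij,
    so a larger k expresses a stronger belief in the hypothesis.
    """
    total = 0.0
    for n_row, h_row in zip(counts, hypothesis):
        alpha = [1.0 + k * h for h in h_row]
        total += lgamma(sum(alpha)) - lgamma(sum(alpha) + sum(n_row))
        total += sum(lgamma(a + n) - lgamma(a) for a, n in zip(alpha, n_row))
    return total

# Toy data (hypothetical): sequences of visited locations.
states = ["home", "work", "shop"]
sequences = [
    ["home", "work", "home", "work", "shop", "home"],
    ["work", "home", "work", "home"],
]
counts = transition_counts(sequences, states)

# Two hypotheses (hypothetical): no preference vs. mostly commuting.
uniform = [[1 / 3] * 3 for _ in range(3)]
commute = [[0.05, 0.90, 0.05],
           [0.90, 0.05, 0.05],
           [0.90, 0.05, 0.05]]

# Log Bayes factor: positive values favor the commuting hypothesis.
log_bf = log_evidence(counts, commute) - log_evidence(counts, uniform)
print(f"log Bayes factor (commute vs. uniform): {log_bf:.3f}")
```

Comparing log evidences rather than fitted likelihoods keeps the hypotheses themselves at the center of the analysis: each one is scored by how plausible the observed transitions are under the belief it encodes, which matches the project's emphasis on explanation over predictive power.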
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Professor Dr. Andreas Hotho
Learning Environmental Maps - Integrating Participatory Sensing and Human Perception
- Grant number: 314699772
- Fiscal year: 2016
- Funding amount: --
- Project type: Priority Programmes
Pragmatics and Semantics in Social Tagging Systems II
- Grant number: 196648487
- Fiscal year: 2011
- Funding amount: --
- Project type: Research Grants
BERT with Character - Knowledge Graph infused neural language models to analyse the depiction of literary characters (LitBERT)
- Grant number: 529659926
- Fiscal year:
- Funding amount: --
- Project type: Research Grants
Similar overseas grants
CAREER: Testing the mismatch hypothesis for climate change-driven mutualism breakdown
- Grant number: 2142792
- Fiscal year: 2022
- Funding amount: --
- Project type: Continuing Grant
Data analysis tools for leveraging massive public data to improve hypothesis-driven research
- Grant number: 10598130
- Fiscal year: 2022
- Funding amount: --
- Project type:
Data analysis tools for leveraging massive public data to improve hypothesis-driven research
- Grant number: 10330636
- Fiscal year: 2022
- Funding amount: --
- Project type:
EAGER: Evaluating a drought-driven hypothesis for the origin of obligate apomixis
- Grant number: 2232106
- Fiscal year: 2022
- Funding amount: --
- Project type: Standard Grant
Data analysis tools for leveraging massive public data to improve hypothesis-driven research
- Grant number: 10654376
- Fiscal year: 2022
- Funding amount: --
- Project type:
A pan-cancer atlas of driver mutations in >100,000 patients based on a hypothesis-driven combined computational and experimental approach
- Grant number: 10620844
- Fiscal year: 2021
- Funding amount: --
- Project type:
A pan-cancer atlas of driver mutations in >100,000 patients based on a hypothesis-driven combined computational and experimental approach
- Grant number: 10276520
- Fiscal year: 2021
- Funding amount: --
- Project type:
A pan-cancer atlas of driver mutations in >100,000 patients based on a hypothesis-driven combined computational and experimental approach
- Grant number: 10617428
- Fiscal year: 2021
- Funding amount: --
- Project type:
NCS-FO: Empowering Data-Driven Hypothesis Generation for Scalable Connectomics Analysis
- Grant number: 2124179
- Fiscal year: 2021
- Funding amount: --
- Project type: Standard Grant
Development of an information/hypothesis-driven proteogenomics strategy
- Grant number: 20K21386
- Fiscal year: 2020
- Funding amount: --
- Project type: Grant-in-Aid for Challenging Research (Exploratory)