Novel methods for large-scale genomic interval comparison

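Basic information

  • Grant number:
    10678947
  • Principal investigator:
  • Amount:
    $384,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Project period:
    2022-08-10 to 2026-05-31
  • Project status:
    Ongoing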

Project abstract

ABSTRACT Epigenome data are driving discovery in biomedical analysis of genetic variation and gene regulation. Epigenome data produced by experimental protocols such as ATAC-seq or ChIP-seq are often summarized into sets of genomic intervals defined by a chromosome plus start and end coordinates. Databases now provide hundreds of thousands of such region sets, each containing potentially hundreds of thousands of individual regions. These data hold tremendous promise for understanding gene regulation and disease because many health outcomes are affected by genetic variation or epigenetic perturbation in regulatory DNA. Many different tools and methods have been developed to assess such sets of genomic intervals. These approaches are used for a broad array of biomedical research, such as annotating genetic variation associated with disease traits. Supporting region-based analyses, we and others have developed novel data structures and algorithms to compare the similarity of region sets and to compute overlaps between interval sets, enabling interval comparisons on millions of regions. But as the genomic interval set data sources grow in size and scope, we require both faster algorithms and novel methods to compare this important data type. As the amount of available data increases, it is becoming intractable to compute exact overlaps. Furthermore, the fastest algorithms analyze only pure intervals, not signal values, which could be used to compare interval sets more accurately. Existing approaches have made little progress in the area of defining canonical interval sets to simplify analysis even further. Here, we address these limitations in several ways: First, we develop novel, more scalable algorithms using approximate computations and define the idea of interval set universes to consolidate analysis.
Second, we develop an innovative approach to analyzing region sets that goes beyond simply counting overlaps, instead relying on cutting-edge machine learning methods to learn and measure similarity more accurately. We propose a novel set-theoretic approach building on techniques from natural language processing to compare intervals. Together, we propose a first-pass filter that can be reasonably computed on data sets containing billions to trillions of genomic intervals, followed by a more accurate analysis to identify more subtle relationships among region sets. These advances will improve both the efficiency and accuracy of existing biomedical research approaches, and open the door to new ways of exploring the vast and growing corpus of genome interval data.
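
To make the baseline concrete, the following is a minimal sketch of the kind of exact overlap computation the abstract describes as becoming intractable at scale. It is not the project's method. It assumes regions are given as (chromosome, start, end) tuples with half-open coordinates, and it scores two region sets by base-pair Jaccard similarity. The function names (merge, by_chrom, intersect_bp, jaccard) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def by_chrom(regions):
    """Group (chrom, start, end) regions by chromosome and merge each group."""
    groups = defaultdict(list)
    for chrom, start, end in regions:
        groups[chrom].append((start, end))
    return {chrom: merge(ivs) for chrom, ivs in groups.items()}


def covered(merged):
    """Total number of base pairs covered by a merged interval list."""
    return sum(end - start for start, end in merged)


def intersect_bp(a, b):
    """Base pairs shared by two merged, sorted interval lists (linear sweep)."""
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def jaccard(set_a, set_b):
    """Base-pair Jaccard similarity between two region sets."""
    a, b = by_chrom(set_a), by_chrom(set_b)
    inter = sum(intersect_bp(a[c], b[c]) for c in a.keys() & b.keys())
    union = (sum(covered(m) for m in a.values())
             + sum(covered(m) for m in b.values()) - inter)
    return inter / union if union else 0.0


set_a = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 0, 50)]
set_b = [("chr1", 250, 400), ("chr2", 25, 75)]
similarity = jaccard(set_a, set_b)
```

Each pairwise comparison here costs a sort plus a linear sweep over both sets. Repeating that across hundreds of thousands of region sets is the scaling problem that motivates the approximate and learned comparisons proposed in the abstract.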

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Nathan Sheffield

Novel methods for large-scale genomic interval comparison
  • Grant number:
    10842040
  • Fiscal year:
    2022
  • Amount:
    $384,000
  • Project type:
A modular data analysis ecosystem using portable encapsulated projects
  • Grant number:
    10468680
  • Fiscal year:
    2018
  • Amount:
    $384,000
  • Project type:
A modular data analysis ecosystem using portable encapsulated projects
  • Grant number:
    10019399
  • Fiscal year:
    2018
  • Amount:
    $384,000
  • Project type:
A modular data analysis ecosystem using portable encapsulated projects
  • Grant number:
    9751344
  • Fiscal year:
    2018
  • Amount:
    $384,000
  • Project type:
A modular data analysis ecosystem using portable encapsulated projects
  • Grant number:
    10224819
  • Fiscal year:
    2018
  • Amount:
    $384,000
  • Project type:

Similar overseas grants

Rational design of rapidly translatable, highly antigenic and novel recombinant immunogens to address deficiencies of current snakebite treatments
  • Grant number:
    MR/S03398X/2
  • Fiscal year:
    2024
  • Amount:
    $384,000
  • Project type:
    Fellowship
CAREER: FEAST (Food Ecosystems And circularity for Sustainable Transformation) framework to address Hidden Hunger
  • Grant number:
    2338423
  • Fiscal year:
    2024
  • Amount:
    $384,000
  • Project type:
    Continuing Grant
Re-thinking drug nanocrystals as highly loaded vectors to address key unmet therapeutic challenges
  • Grant number:
    EP/Y001486/1
  • Fiscal year:
    2024
  • Amount:
    $384,000
  • Project type:
    Research Grant
Metrology to address ion suppression in multimodal mass spectrometry imaging with application in oncology
  • Grant number:
    MR/X03657X/1
  • Fiscal year:
    2024
  • Amount:
    $384,000
  • Project type:
    Fellowship
CRII: SHF: A Novel Address Translation Architecture for Virtualized Clouds
  • Grant number:
    2348066
  • Fiscal year:
    2024
  • Amount:
    $384,000
  • Project type:
    Standard Grant
The Abundance Project: Enhancing Cultural & Green Inclusion in Social Prescribing in Southwest London to Address Ethnic Inequalities in Mental Health
  • Grant number:
    AH/Z505481/1
  • Fiscal year:
    2024
  • Amount:
    $384,000
  • Project type:
    Research Grant
ERAMET - Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
  • Grant number:
    10107647
  • Fiscal year:
    2024
  • Amount:
    $384,000
  • Project type:
    EU-Funded
BIORETS: Convergence Research Experiences for Teachers in Synthetic and Systems Biology to Address Challenges in Food, Health, Energy, and Environment
  • Grant number:
    2341402
  • Fiscal year:
    2024
  • Amount:
    $384,000
  • Project type:
    Standard Grant
Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
  • Grant number:
    10106221
  • Fiscal year:
    2024
  • Amount:
    $384,000
  • Project type:
    EU-Funded
Recite: Building Research by Communities to Address Inequities through Expression
  • Grant number:
    AH/Z505341/1
  • Fiscal year:
    2024
  • Amount:
    $384,000
  • Project type:
    Research Grant