Algorithmic approaches to systems biology, data integration, and evolution

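Basic information

  • Grant number:
    10268080
  • Principal investigator:
  • Amount:
    $1.3852 million
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
  • Funding country:
    United States
  • Start and end dates:
  • Project status:
    Ongoing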

Project abstract

My group continued to develop and apply computational methods that utilize and integrate large data sets, with a focus on gene regulation and diseases. We also developed new methods to analyze data produced by new high-throughput technologies and experimental techniques, such as single-cell gene expression and HT-SELEX data. In our studies we use a variety of algorithmic techniques, including Integer Linear Programming (ILP) and other optimization strategies, as well as machine learning approaches, including Hidden Markov Models and deep learning. Within this general area, the main focus of my group is on developing new computational methods that utilize large cancer-related datasets (e.g., TCGA and ICGC) to obtain insights into the etiology of cancer. Together with our experimental collaborators, we also utilize new experimental data to obtain novel insights into fundamental biological processes. Much of the group's effort during this reporting period was devoted to studying mutational patterns in cancer genomes. Specifically, throughout their lifetime, individuals acquire somatic mutations that might eventually lead to cancer. These mutations often display characteristic patterns known as mutational signatures. Understanding the relation between these patterns and their causes can provide important insights into tumorigenesis in general and environmental contributions to cancer in particular. The two fundamental questions in this area are (i) what is the best way to characterize these mutation patterns, and (ii) how can such patterns of somatic mutations be leveraged to understand the mutagenic processes shaping the human genome.
One of the most challenging obstacles to a full characterization of mutational patterns is that these patterns are the end effect of several interplaying factors, including carcinogenic exposures and potential deficiencies of the DNA repair mechanism. Separating these factors is nontrivial, and current methods typically do not attempt such separation, assuming instead a linear combination model. Yet, to fully understand the nature of each signature, it is important to disambiguate the atomic components that contribute to the final signature. As a first step in this direction, we recently introduced a new descriptor of mutational signatures, the DNA Repair FootPrint (RePrint) (1). Our work demonstrated, for the first time, that it is possible to identify signatures that include a common DNA repair deficiency independently of the other mutagenic processes that contribute to the composite signature. We validated the method with published mutational signatures from cell lines targeted with CRISPR-Cas9-based knockouts of DNA repair genes. The second line of research related to mutational signatures is the identification of the mutagenic processes underlying them. To investigate the genetic aberrations associated with mutational signatures, we took a network-based approach that treats mutational signatures as cancer phenotypes. Specifically, our analysis aimed to answer two complementary questions: (i) what are the functional pathways whose gene expression activities correlate with the strengths of mutational signatures, and (ii) are there pathways whose genetic alterations might have led to specific mutational signatures? To identify mutated pathways, we adopted a recently developed optimization method based on integer linear programming. Analyzing a breast cancer dataset, we identified pathways associated with mutational signatures at both the expression and mutation levels.
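The linear combination model mentioned above can be illustrated with a small sketch. It treats a tumor's mutation catalog as a non-negative mixture of fixed signatures and estimates the mixture weights (exposures). Everything here is an assumption made for illustration: the two four-channel toy signatures, the function name `fit_exposures`, and the choice of a multiplicative update rule (of the kind used in non-negative matrix factorization) as the solver. It is not the RePrint method and not the group's pipeline.

```python
def fit_exposures(catalog, signatures, n_iter=500):
    """Estimate non-negative exposures e with catalog ~ sum_k e[k] * signatures[k].

    Uses a multiplicative update for non-negative least squares with the
    signatures held fixed (hypothetical helper, for illustration only).
    """
    k = len(signatures)
    # b[i] = <signature_i, catalog>; gram[i][j] = <signature_i, signature_j>
    b = [sum(s * m for s, m in zip(sig, catalog)) for sig in signatures]
    gram = [[sum(x * y for x, y in zip(si, sj)) for sj in signatures]
            for si in signatures]
    e = [1.0] * k  # strictly positive start so the updates can move
    for _ in range(n_iter):
        denom = [sum(gram[i][j] * e[j] for j in range(k)) for i in range(k)]
        # All exposures are updated simultaneously from the previous iterate.
        e = [e[i] * b[i] / denom[i] if denom[i] > 0 else 0.0 for i in range(k)]
    return e


# Toy data: two made-up signatures over four mutation channels.
signatures = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]
true_exposures = [30.0, 70.0]
# Build a noise-free catalog as the linear combination of the signatures.
catalog = [
    sum(e * sig[c] for e, sig in zip(true_exposures, signatures))
    for c in range(4)
]

exposures = fit_exposures(catalog, signatures)
print([round(x, 3) for x in exposures])
```

On this noise-free toy input the recovered exposures approach the values used to build the catalog. The sketch also shows the limitation noted above: the model only apportions mutations among given signatures and does not separate the exposure and repair components within a single signature.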
Our analysis captured important differences in the etiology of the APOBEC-related signatures and the two clock-like signatures. In particular, it revealed that clustered and dispersed APOBEC mutations may be caused by different mutagenic processes. In addition, our analysis elucidated differences between the two age-related signatures: one is correlated with the expression of cell cycle genes, while the other shows no such correlation but displays patterns consistent with exposure to environmental/external processes. This work investigated, for the first time, a network-level association of mutational signatures and dysregulated pathways. The identified pathways and subnetworks provide novel insights into mutagenic processes that the cancer genomes might have undergone and important clues for developing personalized drug therapies (2). In addition, we collaborated with Roded Sharan's group from TAU to provide a first probabilistic model of mutational signatures that accounts for context dependency and strand coordination (3). Finally, we started research leveraging the concept of mutational signatures to study the relationship between smoking and the expression of ACE2 and other proteins known to be involved in the entry of the coronavirus SARS-CoV-2 into the host cell. We also continued our research on methods to construct gene regulatory networks (GRNs). These networks describe regulatory relationships between transcription factors (TFs) and their target genes. Following the development of the NetREX (Network Reprogramming using EXpression) technique for constructing a context-specific GRN given context-specific expression data and a context-agnostic prior network (reported last year), we developed NetREX-CF. The important novelty of NetREX-CF is its ability to deal with missing data.
Specifically, NetREX-CF is a reconstruction approach that brings together a modern machine learning strategy (a Collaborative Filtering model) and a biologically justified model of gene expression (a model based on sparse Network Component Analysis). Collaborative Filtering (CF) is able to overcome the incompleteness of the prior knowledge and make edge recommendations for building the GRN. Complementing CF, we use sparse Network Component Analysis (NCA) to validate the recommended edges. Finally, we combine these two approaches using a novel data integration method. Our preliminary results show that the new approach drastically outperforms the currently leading GRN reconstruction methods. This work was selected for oral presentation at RECOMB 2020, and the manuscript is in preparation. My group also continues to develop software for public use, including AptaBlocks Online, a web-based toolkit for the in silico design of RNA complexes (4), and JUDY, a flexible pipeline for diverse types of bioinformatics analysis (5). We also provided computational expertise and analysis of specialized sequencing data (mRNA display) that our collaborators used to compare the performance of linear, monocyclic, and bicyclic libraries (6).
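To make the idea of edge recommendation from an incomplete prior network concrete, here is a minimal sketch of collaborative filtering applied to TF-gene edges. The abstract does not describe the internals of NetREX-CF, so this is not its model: the sketch swaps in a simple neighborhood-based variant with Jaccard similarity, and the names `jaccard`, `recommend_edges`, and the toy `prior` network are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two target-gene sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_edges(prior, tf):
    """Rank genes not yet linked to `tf` by similarity-weighted votes.

    Each other TF votes for its own targets, weighted by how similar its
    known target set is to that of `tf` (neighborhood-based collaborative
    filtering; an illustrative stand-in, not the NetREX-CF model).
    """
    known = prior[tf]
    scores = {}
    for other, targets in prior.items():
        if other == tf:
            continue
        sim = jaccard(known, targets)
        for gene in targets - known:
            scores[gene] = scores.get(gene, 0.0) + sim
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy prior network with made-up TF and gene names.
prior = {
    "TF1": {"g1", "g2", "g3"},
    "TF2": {"g1", "g2", "g3", "g4"},
    "TF3": {"g5", "g6"},
}

ranked = recommend_edges(prior, "TF1")
print(ranked)
```

In this toy network TF1 shares three of four targets with TF2, so the missing edge TF1 to g4 is ranked first. In the approach described above, such recommended edges would then be checked against expression data by the sparse NCA component rather than accepted directly.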

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Teresa Przytycka

Combinatorial and graph theoretical approach to systems biology and mol. evo.
  • Grant number:
    8943247
  • Fiscal year:
  • Funding amount:
    $1.3852 million
  • Project category:
Combinatorial and graph theoretical approach to systems biology and mol. evo.
  • Grant number:
    8558125
  • Fiscal year:
  • Funding amount:
    $1.3852 million
  • Project category:
Algorithmic approaches to systems biology, data integration, and evolution
  • Grant number:
    10927048
  • Fiscal year:
  • Funding amount:
    $1.3852 million
  • Project category:
Combinatorial and graph theoretical approach to systems biology and mol. evo.
  • Grant number:
    7969252
  • Fiscal year:
  • Funding amount:
    $1.3852 million
  • Project category:
Combinatorial and graph theoretical approach to systems biology and mol. evo.
  • Grant number:
    8344970
  • Fiscal year:
  • Funding amount:
    $1.3852 million
  • Project category:
Algorithmic approaches to systems biology, data integration, and evolution
  • Grant number:
    9555743
  • Fiscal year:
  • Funding amount:
    $1.3852 million
  • Project category:
Algorithmic approaches to systems biology, data integration, and evolution
  • Grant number:
    10018681
  • Fiscal year:
  • Funding amount:
    $1.3852 million
  • Project category:
Combinatorial and graph theoretical approach to systems biology and mol. evo.
  • Grant number:
    8149615
  • Fiscal year:
  • Funding amount:
    $1.3852 million
  • Project category:
Combinatorial and graph theoretical approach to systems biology and mol. evo.
  • Grant number:
    7735092
  • Fiscal year:
  • Funding amount:
    $1.3852 million
  • Project category:
Algorithmic approaches to systems biology, data integration, and evolution
  • Grant number:
    10688922
  • Fiscal year:
  • Funding amount:
    $1.3852 million
  • Project category:

Similar overseas grants

How novices write code: discovering best practices and how they can be adopted
  • Grant number:
    2315783
  • Fiscal year:
    2023
  • Funding amount:
    $1.3852 million
  • Project category:
    Standard Grant
One or Several Mothers: The Adopted Child as Critical and Clinical Subject
  • Grant number:
    2719534
  • Fiscal year:
    2022
  • Funding amount:
    $1.3852 million
  • Project category:
    Studentship
A comparative study of disabled children and their adopted maternal figures in French and English Romantic Literature
  • Grant number:
    2633211
  • Fiscal year:
    2020
  • Funding amount:
    $1.3852 million
  • Project category:
    Studentship
A material investigation of the ceramic shards excavated from the Omuro Ninsei kiln site: Production techniques adopted by Nonomura Ninsei.
  • Grant number:
    20K01113
  • Fiscal year:
    2020
  • Funding amount:
    $1.3852 million
  • Project category:
    Grant-in-Aid for Scientific Research (C)
A comparative study of disabled children and their adopted maternal figures in French and English Romantic Literature
  • Grant number:
    2436895
  • Fiscal year:
    2020
  • Funding amount:
    $1.3852 million
  • Project category:
    Studentship
A comparative study of disabled children and their adopted maternal figures in French and English Romantic Literature
  • Grant number:
    2633207
  • Fiscal year:
    2020
  • Funding amount:
    $1.3852 million
  • Project category:
    Studentship
The limits of development: State structural policy, comparing systems adopted in two European mountain regions (1945-1989)
  • Grant number:
    426559561
  • Fiscal year:
    2019
  • Funding amount:
    $1.3852 million
  • Project category:
    Research Grants
Securing a Sense of Safety for Adopted Children in Middle Childhood
  • Grant number:
    2236701
  • Fiscal year:
    2019
  • Funding amount:
    $1.3852 million
  • Project category:
    Studentship
A Study on Mutual Funds Adopted for Individual Defined Contribution Pension Plans
  • Grant number:
    19K01745
  • Fiscal year:
    2019
  • Funding amount:
    $1.3852 million
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Structural and functional analyses of a bacterial protein translocation domain that has adopted diverse pathogenic effector functions within host cells
  • Grant number:
    415543446
  • Fiscal year:
    2019
  • Funding amount:
    $1.3852 million
  • Project category:
    Research Fellowships