ABI Innovation: Scalable kmer-based algorithms and software for gene expression and regulation
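Basic information
- Award number: 1564785
- Principal investigator:
- Amount: $800,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2016
- Funding country: United States
- Project period: 2016-07-01 to 2019-12-31
- Project status: Completed
- Source:
- Keywords: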
Project abstract
Proteins are one of the basic building blocks of living organisms. Each tissue carefully regulates the set of proteins that it produces, both the types of protein made and the amounts of each. For instance, the proteins that make up the structure and enable the function of the heart are different from those defining the lungs or brain. A deep understanding of the processes that control which proteins are made and how their production is timed is of fundamental scientific importance, with implications throughout biology. This project focuses on two critical points in protein regulation: the step in which genes are read off the genomic DNA to make the intermediate mRNA (called transcription), and the step in which some pieces of the mRNA are removed and the ends reconnected (called splicing), so that translation to protein results in versions with distinct properties. Complete profiling of the mRNA produced by tissues is possible but results in very large, complex data sets. This project will create software tools to facilitate efficient analyses of these data sets, making sense of the two processes described above. The results can be applied to investigate a wide variety of biological questions. Besides the scientific contributions, several educational and outreach activities are aimed at students at the high school and undergraduate levels. In particular, a summer workshop will be organized to provide hands-on bioinformatics experiences to local high school teachers and representative students. Students in the University of Maryland Terrapin Teacher program will be mentored to create lesson plans in bioinformatics; the lessons will subsequently be delivered at local high schools. In addition, high school students will be invited to participate in summer research experiences under the mentorship of the PIs.
Cellular morphology and function are determined by precise regulation of gene activity. A long-term goal of biological research is to fully decipher the mechanisms of gene regulation. Spatial and temporal regulation of final gene products is primarily executed when genes are transcribed and the resulting RNA is processed. In turn, transcriptional regulation is mediated by regulatory elements, such as enhancers and promoters, which are characterized by diffuse clusters of short degenerate DNA motifs. An important task in the analysis of transcriptional regulation is the comparison of regulatory regions, both within and across genomes. With regard to post-transcriptional processing events, critical first steps are the identification, characterization and quantification of alternatively processed variants of a gene. Increasingly, sequence data are available that hold the answers to these questions, but they must be mined in order to extract their secrets. The scale of the task of incorporating the massive amounts of next-generation sequencing data renders conventional solutions inefficient and thus presents a major computational bottleneck. The proposed research will develop efficient algorithms and tools for the analysis of gene expression at both transcriptional and post-transcriptional levels by exploiting recently developed ultrafast data structures for DNA words, or k-mers. The proposed solutions to two related but distinct fundamental problems (quantification of regulatory region similarity and quantification of alternative isoforms of a gene) provide an alternative to traditional approaches by reformulating the problems in terms of k-mer similarity searches, for which extremely efficient solutions have recently been developed and exploited for related problems, including transcript quantification (e.g., Sailfish). The proposed tools will enable investigation of certain fundamental mechanistic and evolutionary questions pertaining to transcriptional regulation and analysis of splice isoform variants at an unprecedented scale.
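To make the idea of a k-mer similarity search concrete, the following is a minimal Python sketch of comparing two sequences by their k-mer content. It is an illustration only, not the project's software: the function names (`kmer_counts`, `jaccard`, `cosine`), the choice of Jaccard and cosine as similarity measures, and the example sequences are all assumptions introduced here, and plain dictionaries stand in for the ultrafast k-mer data structures the abstract refers to.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq, skipping windows with non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def jaccard(a: Counter, b: Counter) -> float:
    """Jaccard similarity of the two k-mer sets (presence/absence only)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of the two k-mer count vectors (uses multiplicities)."""
    dot = sum(a[kmer] * b[kmer] for kmer in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    # Hypothetical short sequences standing in for two regulatory regions.
    region_a = "ACGTACGTGACCTGA"
    region_b = "ACGTTCGTGACCTGA"
    profile_a = kmer_counts(region_a, k=4)
    profile_b = kmer_counts(region_b, k=4)
    print("Jaccard:", jaccard(profile_a, profile_b))
    print("Cosine:", cosine(profile_a, profile_b))
```

Because both measures depend only on which k-mers occur and how often, they need no alignment of the two sequences, which is what makes this kind of comparison attractive at large scale. A real tool would likely also handle reverse complements and use a more compact index than a `Counter`.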
The research goals will be tied to several STEM educational activities focused on local high school students, both in terms of early research involvement and classroom education. Some of these activities will be organized at the Center for Bioinformatics and Computational Biology (CBCB), thereby encouraging close interaction between the students and the center faculty. A link to the results will be provided on the PI's lab page at cbcb.umd.edu/~sridhar/software.html.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Sridhar Hannenhalli
Local rules for protein folding on a triangular lattice and generalized hydrophobicity in the HP model
- DOI: 10.1145/267521.267522
- Published: 1997
- Journal:
- Impact factor: 0
- Authors: Richa Agarwala; S. Batzoglou; Vlado Dančík; Scott E. Decatur; Martin Farach; Sridhar Hannenhalli; S. Muthukrishnan; Steven Skiena
- Corresponding author: Steven Skiena
Genome-wide analysis of retroviral DNA integration
- DOI: 10.1038/nrmicro1263
- Published: 2005-09-19
- Journal:
- Impact factor: 103.300
- Authors: Frederic Bushman; Mary Lewinski; Angela Ciuffi; Stephen Barr; Jeremy Leipzig; Sridhar Hannenhalli; Christian Hoffmann
- Corresponding author: Christian Hoffmann
Correction: Transcriptomes of the tumor-adjacent normal tissues are more informative than tumors in predicting recurrence in colorectal cancer patients
- DOI: 10.1186/s12967-023-04124-4
- Published: 2023-05-05
- Journal:
- Impact factor: 7.500
- Authors: Jinho Kim; Hyunjung Kim; Min‑Seok Lee; Heetak Lee; Yeon Jeong Kim; Woo Yong Lee; Seong Hyeon Yun; Hee Cheol Kim; Hye Kyung Hong; Sridhar Hannenhalli; Yong Beom Cho; Donghyun Park; Sun Shim Choi
- Corresponding author: Sun Shim Choi
Other grants by Sridhar Hannenhalli
ACM BCB 2013: Conference on Bioinformatics and Computational Biology
- Award number: 1341410
- Fiscal year: 2013
- Amount: $800,000
- Project type: Standard Grant
Better Network Modules: New Tools for Protein Network Analysis
- Award number: 0849899
- Fiscal year: 2009
- Amount: $800,000
- Project type: Standard Grant
Similar international grants
TWINNING: Scalable technologies for creating virtual patient twin populations to accelerate in-silico enabled medical device innovation.
- Award number: 10103504
- Fiscal year: 2024
- Amount: $800,000
- Project type: Collaborative R&D
SBIR Phase I: Scalable, on-demand, research-based, help-seeking innovation for learners in virtual and recorded training programs
- Award number: 2151406
- Fiscal year: 2023
- Amount: $800,000
- Project type: Standard Grant
Statistical innovation to integrate sequences and phenotypes for scalable phylodynamic inference
- Award number: 10584588
- Fiscal year: 2021
- Amount: $800,000
- Project type:
Statistical innovation to integrate sequences and phenotypes for scalable phylodynamic inference
- Award number: 10390334
- Fiscal year: 2021
- Amount: $800,000
- Project type:
Statistical innovation to integrate sequences and phenotypes for scalable phylodynamic inference
- Award number: 10177121
- Fiscal year: 2021
- Amount: $800,000
- Project type:
Collaborative Research: Framework: Software: CINES: A Scalable Cyberinfrastructure for Sustained Innovation in Network Engineering and Science
- Award number: 2210266
- Fiscal year: 2021
- Amount: $800,000
- Project type: Standard Grant
Innovation of age-related macular disease treatment by a scalable DDS platform
- Award number: 20K12640
- Fiscal year: 2020
- Amount: $800,000
- Project type: Grant-in-Aid for Scientific Research (C)
NSF Convergence Accelerator Track D: Scalable, TRaceable Ai for Imaging Translation: Innovation to Implementation for Accelerated Impact (STRAIT I3)
- Award number: 2040462
- Fiscal year: 2020
- Amount: $800,000
- Project type: Standard Grant
Collaborative Research: Framework: Software: CINES: A Scalable Cyberinfrastructure for Sustained Innovation in Network Engineering and Science
- Award number: 1835598
- Fiscal year: 2018
- Amount: $800,000
- Project type: Standard Grant
ABI Innovation: Scalable and Agile Analysis of Mass Spectrometry Experiments
- Award number: 1759736
- Fiscal year: 2018
- Amount: $800,000
- Project type: Standard Grant