Mining the Sequence of the Human Genome for Important Sequence Features
Basic information
- Grant number: 7734879
- Principal investigator:
- Amount: $154,200
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year:
- Funding country: United States
- Project period:
- Project status: Ongoing
- Source:
- Keywords: Address; Affect; African; Algorithms; Avian Sarcoma; Base Sequence; California; Classification; Clinical Trials; Collaborations; Computers; Data; Data Analyses; Development; Future; Gene Expression; Genes; Genome; Genomics; Genus Alpharetrovirus; Goals; Gorilla gorilla; Health; Human; Human Genome; Individual; Laboratory Organism; Location; Macaca mulatta; Maps; Masks; Methodology; Methods; Microarray Analysis; Mining; Molecular Profiling; Moloney Leukemia Virus; Murine leukemia virus; Mus; Nucleotides; Numbers; Oligonucleotide Microarrays; Organism; Pan Genus; Pan paniscus; Pan troglodytes; Patients; Pattern; Pongidae; Positioning Attribute; Process; Proto-Oncogenes; Research; Research Personnel; Research Project Grants; Retroviral Vector; Retroviridae; SIV; Safety; Score; Time; Transcript; Transcription Initiation Site; Universities; Virus; Work; X-Linked Severe Combined Immunodeficiency; Yeasts; base; computerized tools; design; gene therapy; genome sequencing; leukemia; novel; programs; research study; technology development; tissue/cell culture; vector
Project abstract
The last few years have seen a dramatic increase in the number of publicly available complete genome sequences and annotations. At the same time, researchers have been taking advantage of technology developments that allow individual labs to efficiently perform experiments that generate tens of thousands of data points. This massive increase in data means that some lab projects are no longer tractable by individual biologists but, rather, require large-scale data analysis capabilities best handled by a computer programmer. This research focuses on developing methodologies to integrate sequence, annotation, and experimentally generated data so that bench biologists can quickly and easily obtain results for their large-scale experiments.
The goal of this research project is to take advantage of the publicly available set of sequences and annotations to develop automated tools for the computational characterization of experimentally identified genomic sequences. The first step in the process is to align each sequence to the reference genome assembly to determine its genomic location. Existing programs suffice for most sequences, but we have developed a novel set of algorithms to map short sequences of fewer than 25 nucleotides. These programs can map tens of thousands of sequences in only a few minutes, even allowing for mismatches. The second step of the process is to compare the coordinates of the sequences to the coordinates of a variety of genome annotations. Using this approach, we can assign putative functions to the experimentally identified sequences based on their proximity to known sequence features. To provide statistical rigor for the analysis, we have developed a pipeline to characterize sequences picked at random from the genome.
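The first step described above, placing a short sequence on a reference while tolerating mismatches, can be illustrated with a deliberately naive sketch. This is not the project's algorithm, which is not described in the abstract; it is a brute-force Hamming-distance scan over a toy reference string, and the function names and the `max_mismatches` parameter are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def map_short_sequence(query, reference, max_mismatches=1):
    """Return 0-based start positions where `query` aligns to `reference`
    with at most `max_mismatches` substitutions (no gaps, forward strand only)."""
    k = len(query)
    return [
        i
        for i in range(len(reference) - k + 1)
        if hamming(query, reference[i:i + k]) <= max_mismatches
    ]


# Toy example: a 4-nucleotide query against a 12-nucleotide reference.
reference = "ACGTACGTTAGC"
hits = map_short_sequence("ACGA", reference, max_mismatches=1)
```

A scan like this is quadratic in practice and would not map tens of thousands of sequences against a full genome in minutes; an indexed approach would be needed for that, which is presumably what the purpose-built programs provide.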
We are applying the above methods to a number of research projects. One example is to determine the positions at which retroviruses and retroviral vectors integrate into the host genome during the process of retroviral gene therapy. Moloney murine leukemia virus (MLV) is one of the common retroviruses used in gene therapy. However, recent studies have shown that MLV can integrate into genes, disrupting their function and thus affecting the patient's health. Specifically, because MLV integrated near and then activated a proto-oncogene, four patients with X-linked severe combined immunodeficiency (X-SCID) developed leukemia following retroviral gene therapy treatment. With Dr. Cynthia Dunbar of NHLBI, we are working on projects to assess the efficacy and safety of retroviruses used in gene therapy. In one study, we performed a systematic analysis of the integration patterns of avian sarcoma leukosis virus (ASLV) in the rhesus macaque. Unlike MLV, ASLV does not tend to integrate near gene-rich regions, transcription start sites, or proto-oncogenes. Thus, optimized vectors based on this virus could be useful and safe for future gene therapy trials. In another study, we analyzed the integration patterns of simian immunodeficiency virus (SIV) by following three rhesus macaques for more than four years after retroviral treatment. We found that the levels of SIV remained stable four years post-treatment, and that the integration profile of SIV appears to be safer than that of MLV. Thus, this vector, too, may be pursued in clinical trials.
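The kind of comparison behind statements such as "does not tend to integrate near transcription start sites" can be sketched as follows: measure how many mapped positions fall within a window of an annotated site, then do the same for positions drawn at random. This is a minimal illustration under stated assumptions, not the project's pipeline: it handles a single chromosome, assumes a non-empty sorted list of site coordinates, and all names (`nearest_distance`, `fraction_within`, `random_positions`, `window`) are hypothetical.

```python
import bisect
import random


def nearest_distance(position, sorted_sites):
    """Distance from `position` to the closest coordinate in `sorted_sites`
    (assumed sorted ascending and non-empty)."""
    i = bisect.bisect_left(sorted_sites, position)
    candidates = sorted_sites[max(i - 1, 0):i + 1]
    return min(abs(position - s) for s in candidates)


def fraction_within(positions, sorted_sites, window):
    """Fraction of `positions` lying within `window` bases of any annotated site."""
    hits = sum(nearest_distance(p, sorted_sites) <= window for p in positions)
    return hits / len(positions)


def random_positions(chrom_length, n, seed=0):
    """Draw `n` uniformly random 0-based positions as a control set."""
    rng = random.Random(seed)
    return [rng.randrange(chrom_length) for _ in range(n)]


# Toy coordinates: three annotated start sites and three observed positions.
sites = [100, 500, 900]
observed = [480, 50, 1000]
observed_fraction = fraction_within(observed, sites, window=60)
control_fraction = fraction_within(random_positions(2000, 1000), sites, window=60)
```

Comparing `observed_fraction` with `control_fraction` gives a rough sense of enrichment or depletion; a real analysis would repeat the random draw many times and apply a formal statistical test.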
In collaboration with Dr. Joseph Hacia of the University of Southern California, we are also using our methods to develop strategies for the interpretation of microarray data. The development of gene expression microarray technology over a decade ago revolutionized the analysis of the transcriptomes of numerous organisms. The earliest gene expression microarrays focused on widely used experimental organisms, such as mouse and yeast, in addition to humans. In the intervening years, the number of commercially available species-specific whole genome expression microarrays has dramatically increased. Nevertheless, there are numerous species, such as African great apes (bonobos, chimpanzees, and gorillas), for which whole genome expression microarrays are not commercially available. In such cases, gene expression profiling is often conducted using microarrays designed to evaluate a closely related species or organism. Several groups have employed commercially available human oligonucleotide microarrays to obtain gene expression profiles from African great ape tissues and cultured cells. However, this method underestimates the abundance of transcripts whose sequences are not well conserved between human and African great ape. One simple approach to address this problem is to remove (mask) data from microarray probes that are poorly conserved. Starting with an existing commercial human oligonucleotide microarray, we determined which probes have single perfect matches to both the human and chimpanzee genomes. These data have been incorporated into studies that quantify the effects of probe number on the accuracy of intra- and interspecies gene expression comparisons. Based on our observations, we developed general rules for the interpretation of gene expression scores based on cross-species microarray experiments.
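The masking rule described above, keeping only probes with a single perfect match in both genomes, can be sketched on toy data. This is an illustrative assumption-laden version, not the study's implementation: it searches the forward strand only, uses plain substring matching on short strings, and the names `count_perfect_matches` and `mask_probes` are hypothetical.

```python
def count_perfect_matches(probe, genome):
    """Count exact occurrences of `probe` in `genome`, including overlapping ones."""
    count = 0
    start = genome.find(probe)
    while start != -1:
        count += 1
        start = genome.find(probe, start + 1)
    return count


def mask_probes(probes, human, chimp):
    """Return the names of probes with exactly one perfect match in each genome.
    Probes that are absent or repeated in either genome are masked (dropped)."""
    return [
        name
        for name, seq in probes.items()
        if count_perfect_matches(seq, human) == 1
        and count_perfect_matches(seq, chimp) == 1
    ]


# Toy genomes that differ by one substitution, and three toy probes.
human = "ACGTACGGTTAGC"
chimp = "ACGTTCGGTTAGC"
probes = {"p1": "GGTTAG", "p2": "ACGTAC", "p3": "CG"}
kept = mask_probes(probes, human, chimp)
```

Here `p2` is masked because the substitution removes its match in the toy chimpanzee sequence, and `p3` is masked because it matches more than once in the toy human sequence.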
Project outcomes
Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Distinct genomic integration of MLV and SIV vectors in primate hematopoietic stem and progenitor cells.
- DOI: 10.1371/journal.pbio.0020423
- Published: 2004-12
- Journal:
- Impact factor: 9.8
- Authors: Hematti P; Hong BK; Ferguson C; Adler R; Hanawa H; Sellers S; Holt IE; Eckfeldt CE; Sharma Y; Schmidt M; von Kalle C; Persons DA; Billings EM; Verfaillie CM; Nienhuis AW; Wolfsberg TG; Dunbar CE; Calmels B
- Corresponding author: Calmels B
GeneLink: a database to facilitate genetic studies of complex traits.
- DOI: 10.1186/1471-2164-5-81
- Published: 2004
- Journal:
- Impact factor: 4.4
- Authors: Gillanders, Elizabeth M; Masiello, Anthony; Gildea, Derek; Umayam, Lowell; Duggal, Priya; Jones, Mary Pat; Klein, Alison P; Freas-Lutz, Diana; Ibay, Grace; Trout, Ken; Wolfsberg, Tyra G; Trent, Jeffrey M; Bailey-Wilson, Joan E; Baxevanis, Andreas D
- Corresponding author: Baxevanis, Andreas D
Other grants held by Andreas Baxevanis
NHGRI/DIR Bioinformatics and Scientific Programming Core
- Grant number: 8750737
- Fiscal year:
- Funding amount: $154,200
- Project category:
NHGRI/DIR Bioinformatics and Scientific Programming Core
- Grant number: 10910770
- Fiscal year:
- Funding amount: $154,200
- Project category:
Comparative Genomic Studies on the Evolution of Morphological Complexity
- Grant number: 10691105
- Fiscal year:
- Funding amount: $154,200
- Project category:
NHGRI/DIR Bioinformatics and Scientific Programming Core
- Grant number: 8350237
- Fiscal year:
- Funding amount: $154,200
- Project category:
Similar overseas grants
How Does Particle Material Properties Insoluble and Partially Soluble Affect Sensory Perception Of Fat based Products
- Grant number: BB/Z514391/1
- Fiscal year: 2024
- Funding amount: $154,200
- Project category: Training Grant
BRC-BIO: Establishing Astrangia poculata as a study system to understand how multi-partner symbiotic interactions affect pathogen response in cnidarians
- Grant number: 2312555
- Fiscal year: 2024
- Funding amount: $154,200
- Project category: Standard Grant
RII Track-4:NSF: From the Ground Up to the Air Above Coastal Dunes: How Groundwater and Evaporation Affect the Mechanism of Wind Erosion
- Grant number: 2327346
- Fiscal year: 2024
- Funding amount: $154,200
- Project category: Standard Grant
Graduating in Austerity: Do Welfare Cuts Affect the Career Path of University Students?
- Grant number: ES/Z502595/1
- Fiscal year: 2024
- Funding amount: $154,200
- Project category: Fellowship
Construction of Affect-X, an index of individual differences in sensibility, and establishment of a foundation for bespoke AI services
- Grant number: 23K24936
- Fiscal year: 2024
- Funding amount: $154,200
- Project category: Grant-in-Aid for Scientific Research (B)
Insecure lives and the policy disconnect: How multiple insecurities affect Levelling Up and what joined-up policy can do to help
- Grant number: ES/Z000149/1
- Fiscal year: 2024
- Funding amount: $154,200
- Project category: Research Grant
How does metal binding affect the function of proteins targeted by a devastating pathogen of cereal crops?
- Grant number: 2901648
- Fiscal year: 2024
- Funding amount: $154,200
- Project category: Studentship
Investigating how double-negative T cells affect anti-leukemic and GvHD-inducing activities of conventional T cells
- Grant number: 488039
- Fiscal year: 2023
- Funding amount: $154,200
- Project category: Operating Grants
New Tendencies of French Film Theory: Representation, Body, Affect
- Grant number: 23K00129
- Fiscal year: 2023
- Funding amount: $154,200
- Project category: Grant-in-Aid for Scientific Research (C)
The Protruding Void: Mystical Affect in Samuel Beckett's Prose
- Grant number: 2883985
- Fiscal year: 2023
- Funding amount: $154,200
- Project category: Studentship


















