Integrated Assembly Software for Sanger and Next Generation Sequence Technologies
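Basic information
- Award number: 8011298
- Principal investigator:
- Amount: $722,900
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2007
- Funding country: United States
- Project period: 2007-09-01 to 2011-12-31
- Project status: Completed
- Source: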
- Keywords: Algorithms; Area; Bacterial Genome; Base Pairing; Binding; Bioinformatics; Biological Sciences; Biology; Characteristics; Chromosomes; Chromosomes, Human, Pair 21; Chromosomes, Human, Pair 8; Clinic; Computer software; Computers; DNA Resequencing; DNA Sequence; Data; Data Analyses; Data Set; Data Storage and Retrieval; Databases; Development; Diploidy; Evolution; Foundations; Funding; Future; Generations; Generic Drugs; Genes; Genetic Polymorphism; Genome; Goals; Healthcare; Hour; Human; Human Genome; Laboratory Research; Life; Link; Medicine; Memory; Methods; Mining; Online Systems; Paper; Performance; Phase; Price; Process; Reading; Relative (related person); Reporting; Research Personnel; Resources; Running; Single Nucleotide Polymorphism; Solutions; Sorting - Cell Movement; Stream; System; Technology; Time; Variant; base; computerized data processing; computing resources; cost; cost effective; data reduction; data structure; design; genome sequencing; genome-wide analysis; innovation; meetings; memory process; new technology; next generation; prototype; public health relevance; single molecule; tool; user-friendly
Project abstract
DESCRIPTION (provided by applicant): The advent of next-generation (Next-gen) sequencing technologies has begun a surge in whole genome sequencing and resequencing, exemplified spectacularly by four papers describing five complete human genomes in 2008 alone. One company, Knome, now even offers customers their entire genome sequence using Next-gen sequencing technology. These developments, together with targeted resequencing of the genome, presage the day of the $1000 human genome. Broad-scale whole human genome resequencing (WHGR) will have enormous impact on the areas of personalized medicine, human evolution and human diversity. To fully realize that potential, however, software capabilities must be dramatically enhanced to meet the significant challenges posed by the sheer volume of data generated in these projects, the diversity of technology-specific data characteristics and the task of simply analyzing the 6-billion-base-pair diploid human genome. Moreover, we see the day when technology improvements and cost reductions make WHGR as commonplace as bacterial genome sequencing has become today. For that to occur, assembly and analysis software must be accessible to a far broader and less computer-savvy range of researchers than the highly specialized bioinformatics teams that decode the information now. Also, the computer resources of even a well-funded research laboratory are far more limited than those available to a large sequencing center. Therefore, the overall goal of this proposal is to develop a Next-gen sequence assembly and analysis pipeline, DESKAPP, that will run on an affordable ($5000) high-end desktop computer and produce a human genome sequence in a reasonable timeframe (days, not weeks). WHGR by DESKAPP will involve a reference-guided main assembly as well as a de novo assembly branch to characterize unique regions of the new genome relative to the reference.
Merging of the assemblies produces a complete sequence that can be evaluated for gene content, single nucleotide polymorphisms (SNPs) and structural variation (SV; indels, inversions, translocations) both by web-based searches of external databases to identify known allelic variation and by direct examination of the sequence to identify new polymorphisms. A Disk Sort Alignment (DSA) algorithm allows data sets that are far too large for in-memory processing to be evaluated and clustered for assembly by SeqMan N-Gen (SM N-Gen), our desktop assembly engine. Using a prototype DSA-SM N-Gen pipeline, we have processed the entire 7.4x 454 data set from the James Watson genome to a layout file in 31 hours using DSA and have assembled three chromosomes (8, 21, and X) using SM N-Gen. Assembly times varied from 1 hour for Chromosome 21 to 10.6 hours for an average-sized chromosome, such as Chromosome 8. Together, these results demonstrate the feasibility of constructing a DESKAPP pipeline for WHGR. The Phase II Aims are designed to build upon this foundation and produce a seamless pipeline for the desktop assembly and analysis of a human genome in a matter of days.
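The abstract does not describe how the Disk Sort Alignment algorithm works internally. As a loosely related illustration only, the following minimal Python sketch shows a generic external merge sort, a standard way to order records that do not fit in memory: sort fixed-size chunks, spill each one to disk as a sorted run, then merge the runs. The record layout (a reference position paired with a read identifier), the function names, and the `chunk_size` parameter are all hypothetical and are not taken from DESKAPP, DSA, or SeqMan N-Gen.

```python
import heapq
import os
import tempfile
from typing import Iterable, Iterator, List, Tuple

# Hypothetical record: (reference position, read identifier).
Record = Tuple[int, str]


def _write_run(records: List[Record], directory: str) -> str:
    """Sort one in-memory chunk and write it to disk as a sorted run."""
    records.sort()
    fd, path = tempfile.mkstemp(dir=directory, suffix=".run", text=True)
    with os.fdopen(fd, "w") as handle:
        for position, read_id in records:
            handle.write(f"{position}\t{read_id}\n")
    return path


def _read_run(path: str) -> Iterator[Record]:
    """Stream records back from a sorted run, one line at a time."""
    with open(path) as handle:
        for line in handle:
            position, read_id = line.rstrip("\n").split("\t")
            yield int(position), read_id


def external_sort(records: Iterable[Record], chunk_size: int) -> Iterator[Record]:
    """Yield records in sorted order, sorting at most chunk_size of them at once."""
    with tempfile.TemporaryDirectory() as directory:
        run_paths: List[str] = []
        chunk: List[Record] = []
        for record in records:
            chunk.append(record)
            if len(chunk) >= chunk_size:
                run_paths.append(_write_run(chunk, directory))
                chunk = []
        if chunk:
            run_paths.append(_write_run(chunk, directory))
        # k-way merge of the sorted runs; heapq.merge pulls records lazily,
        # so only about one record per run is held in memory at a time.
        yield from heapq.merge(*(_read_run(path) for path in run_paths))
```

Calling `list(external_sort(records, chunk_size=1_000_000))` on an iterable of such records returns them ordered by position. The chunk size trades memory use against the number of temporary run files that must be merged.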
PUBLIC HEALTH RELEVANCE: Next-gen sequencing technologies have started a new revolution throughout biology by providing DNA sequence data in unprecedented quantities at continually decreasing costs. This data will be invaluable in the emerging era of personalized medicine and in exploring the immense diversity of life. The goal of this project is to develop desktop computer software that will enable research laboratories and clinics of any size to realize the promise of these new technologies.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by TIMOTHY J DURFEE
Long read based sequencing software for the comprehensive analysis of clinical samples
- Award number: 10009727
- Fiscal year: 2020
- Funding amount: $722,900
- Project category:
Scalable post-assembly editing software for finishing and annotating personal genomes
- Award number: 9883809
- Fiscal year: 2018
- Funding amount: $722,900
- Project category:
Scalable post-assembly editing software for finishing and annotating personal genomes
- Award number: 9767335
- Fiscal year: 2018
- Funding amount: $722,900
- Project category:
Complete genome de novo assembly software for the emerging long read sequencing era
- Award number: 9255092
- Fiscal year: 2017
- Funding amount: $722,900
- Project category:
Complete genome de novo assembly software for the emerging long read sequencing era
- Award number: 9747613
- Fiscal year: 2017
- Funding amount: $722,900
- Project category:
Association Analysis Software for Mining Clinical Next-Gen Sequencing Data
- Award number: 8236680
- Fiscal year: 2012
- Funding amount: $722,900
- Project category:
Association Analysis Software for Mining Clinical Next-Gen Sequencing Data
- Award number: 8727829
- Fiscal year: 2012
- Funding amount: $722,900
- Project category:
Association Analysis Software for Mining Clinical Next-Gen Sequencing Data
- Award number: 8703156
- Fiscal year: 2012
- Funding amount: $722,900
- Project category:
Association Analysis Software for Mining Clinical Next-Gen Sequencing Data
- Award number: 8624982
- Fiscal year: 2012
- Funding amount: $722,900
- Project category:
A Desktop Assembly and Analysis Pipeline for Next-gen Metagenomic Sequencing
- Award number: 8200467
- Fiscal year: 2011
- Funding amount: $722,900
- Project category:
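Similar National Natural Science Foundation of China (NSFC) grants
Mechanism by which the nitrogen metabolism regulator AreA mediates fumonisin FB1 biosynthesis in Fusarium proliferatum
- Award number: 2021JJ40433
- Approval year: 2021
- Funding amount: CNY 0
- Project category: Provincial/municipal-level project
Mechanism by which host-induced silencing of the AreA and CYP51 genes of the pokkah boeng pathogen enhances disease resistance in sugarcane
- Award number: 32001603
- Approval year: 2020
- Funding amount: CNY 240,000
- Project category: Young Scientists Fund
Porting, improvement, and application of the AREA international economic model
- Award number: 18870435
- Approval year: 1988
- Funding amount: CNY 20,000
- Project category: General Program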
Similar overseas grants
Onboarding Rural Area Mathematics and Physical Science Scholars
- Award number: 2322614
- Fiscal year: 2024
- Funding amount: $722,900
- Project category: Standard Grant
Point-scanning confocal with area detector
- Award number: 534092360
- Fiscal year: 2024
- Funding amount: $722,900
- Project category: Major Research Instrumentation
TRACK-UK: Synthesized Census and Small Area Statistics for Transport and Energy
- Award number: ES/Z50290X/1
- Fiscal year: 2024
- Funding amount: $722,900
- Project category: Research Grant
Wide-area low-cost sustainable ocean temperature and velocity structure extraction using distributed fibre optic sensing within legacy seafloor cables
- Award number: NE/Y003365/1
- Fiscal year: 2024
- Funding amount: $722,900
- Project category: Research Grant
Collaborative Research: Scalable Manufacturing of Large-Area Thin Films of Metal-Organic Frameworks for Separations Applications
- Award number: 2326714
- Fiscal year: 2024
- Funding amount: $722,900
- Project category: Standard Grant
Collaborative Research: Scalable Manufacturing of Large-Area Thin Films of Metal-Organic Frameworks for Separations Applications
- Award number: 2326713
- Fiscal year: 2024
- Funding amount: $722,900
- Project category: Standard Grant
Unlicensed Low-Power Wide Area Networks for Location-based Services
- Award number: 24K20765
- Fiscal year: 2024
- Funding amount: $722,900
- Project category: Grant-in-Aid for Early-Career Scientists
RAPID: Collaborative Research: Multifaceted Data Collection on the Aftermath of the March 26, 2024 Francis Scott Key Bridge Collapse in the DC-Maryland-Virginia Area
- Award number: 2427233
- Fiscal year: 2024
- Funding amount: $722,900
- Project category: Standard Grant
Postdoctoral Fellowship: OPP-PRF: Tracking Long-Term Changes in Lake Area across the Arctic
- Award number: 2317873
- Fiscal year: 2024
- Funding amount: $722,900
- Project category: Standard Grant
RAPID: Collaborative Research: Multifaceted Data Collection on the Aftermath of the March 26, 2024 Francis Scott Key Bridge Collapse in the DC-Maryland-Virginia Area
- Award number: 2427232
- Fiscal year: 2024
- Funding amount: $722,900
- Project category: Standard Grant