New algorithms and tools for large-scale genomic analyses
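Basic information
- Grant number: 8460819
- Principal investigator:
- Amount: $367,900
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2012
- Funding country: United States
- Project period: 2012-04-19 to 2016-03-31
- Project status: Completed
- Source: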
- Keywords: Acute; Address; Algorithms; Binding; Biology; ChIP-seq; Chromosomes; Communities; Complement; Complex; Computer software; Computers; Computing Methodologies; DNA Sequence; Data; Data Set; Detection; Development; Environment; Epidemiology; Exons; Fostering; Foundations; Future; Galaxy; Genes; Genetic Variation; Genome; Genomics; Growth; Human; Imagery; Java; Language; Libraries; Measures; Memory; Methods; Positioning Attribute; Pythons; Recruitment Activity; Research; Research Personnel; Robin bird; Sequence Alignment; Solutions; Spectrum Analysis; Techniques; Technology; Time; Variant; Work; base; cell type; cluster computing; design; file format; flexibility; genome annotation; human disease; improved; innovation; insight; multidisciplinary; novel; open source; parallel computing; research study; software development; tool; tool development
Project abstract
DESCRIPTION (provided by applicant): The exploration and interpretation of large, complex datasets is vital to discovery in genomics. However, researchers now confront a fundamental limitation: unprecedented experiments are possible thanks to modern DNA sequencing technologies, yet existing "genome arithmetic" techniques for comparing and dissecting the resulting datasets cannot keep pace with the inexorable growth in dataset size and complexity.
Genome arithmetic (GA) is a powerful and widely used set of techniques for exploring relationships among sets of genome features (e.g., a gene, sequence alignment, ChIP-seq peak, or anything that can be described with chromosome coordinates). GA is used for a broad spectrum of analyses, including the detection of intersecting/overlapping features (e.g., sequence alignments and exons), the description of feature coverage among datasets, and the merging, subtraction, and complementation of feature datasets. GA functionality is used by all genome browsers and data visualization tools, and by analysis software such as GATK and SAMTOOLS. Owing to their power and flexibility, existing GA tools (i.e., Galaxy, the UCSC Genome Browser, and our own BEDTOOLS) are extremely popular and are used in a broad range of complex genomic analyses. However, while GA is central to genomic analysis and discovery, the core algorithms employed by all existing tools are inherently incapable of scaling to the size and diversity of modern genomic datasets. If the field remains restricted to these approaches, the present analytic bottleneck will become increasingly acute.
Therefore, the overall objective of this proposal is to provide the genomics community with innovative new algorithms and software that keep pace with modern genomics experiments and facilitate future discoveries. The Specific Aims are to:
(1) Devise efficient new algorithms for large-scale genome arithmetic analyses. We will develop innovative GA algorithms that scale to modern genomics experiments and are capable of integrating many diverse genomic datasets. We will devise novel algorithms and adapt proven, scalable approaches from the field of computational geometry.
(2) Develop software and libraries that facilitate innovative analyses and new tool development. We will release our algorithms to the community as open-source software libraries and tools that will foster new tool development and provide innovative approaches for exploring large-scale datasets.
(3) Extend our tools to scalable computing frameworks in order to enable future genomic discovery. We will adapt our software to parallel computing environments and thereby enable continued discovery on increasingly massive and complex datasets.
The proposed research will devise entirely new, scalable approaches for genome arithmetic. This will provide the community with powerful new techniques for exploring and interpreting genomics experiments and provide tool developers with robust approaches for software development and improvement.
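The genome arithmetic operations named in the abstract (merging, intersection, complementation, and subtraction of feature sets) can be illustrated with a minimal Python sketch. This is not the BEDTOOLS implementation or any algorithm from the proposal. It assumes features on a single chromosome represented as half-open (start, end) integer pairs, and all function names and the example coordinates are hypothetical. Note that `intersect` here reports overlapping regions after merging each input, not pairwise feature overlaps.

```python
from typing import List, Tuple

# Assumption: a feature is a half-open interval [start, end) on one chromosome.
Interval = Tuple[int, int]


def merge(features: List[Interval]) -> List[Interval]:
    """Collapse overlapping or touching intervals into single intervals."""
    merged: List[Interval] = []
    for start, end in sorted(features):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the regions covered by both sets, via a linear sweep."""
    a, b = merge(a), merge(b)
    i = j = 0
    out: List[Interval] = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def complement(features: List[Interval], chrom_length: int) -> List[Interval]:
    """Return the regions of [0, chrom_length) not covered by any feature."""
    out: List[Interval] = []
    cursor = 0
    for start, end in merge(features):
        if cursor < start:
            out.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < chrom_length:
        out.append((cursor, chrom_length))
    return out


def subtract(a: List[Interval], b: List[Interval], chrom_length: int) -> List[Interval]:
    """Return the parts of a not covered by b (a intersected with complement of b)."""
    return intersect(a, complement(b, chrom_length))


# Hypothetical example data: three exons and one alignment on a 1000 bp chromosome.
exons = [(100, 200), (150, 300), (500, 600)]
alignments = [(180, 520)]

print(merge(exons))                       # [(100, 300), (500, 600)]
print(intersect(exons, alignments))       # [(180, 300), (500, 520)]
print(complement(exons, 1000))            # [(0, 100), (300, 500), (600, 1000)]
print(subtract(exons, alignments, 1000))  # [(100, 180), (520, 600)]
```

Each operation sorts its input and then makes a single pass, which is adequate for small examples. The scaling problem the abstract describes arises when many large datasets must be compared at once, which this sketch does not attempt to address.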
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Aaron R Quinlan
Extending reference assembly models
- DOI: 10.1186/s13059-015-0587-3
- Published: 2015-01-24
- Journal:
- Impact factor: 9.400
- Authors: Deanna M Church; Valerie A Schneider; Karyn Meltz Steinberg; Michael C Schatz; Aaron R Quinlan; Chen-Shan Chin; Paul A Kitts; Bronwen Aken; Gabor T Marth; Michael M Hoffman; Javier Herrero; M Lisandra Zepeda Mendoza; Richard Durbin; Paul Flicek
- Corresponding author: Paul Flicek
Erratum: A reference bacterial genome dataset generated on the MinION™ portable single-molecule nanopore sequencer
- DOI: 10.1186/s13742-015-0043-z
- Published: 2015-02-13
- Journal:
- Impact factor: 3.900
- Authors: Joshua Quick; Aaron R Quinlan; Nicholas J Loman
- Corresponding author: Nicholas J Loman
Other grants held by Aaron R Quinlan
New algorithms and tools for large-scale genomic analyses
- Grant number: 10357060
- Fiscal year: 2022
- Funding amount: $367,900
- Project category:
New algorithms and tools for large-scale genomic analyses
- Grant number: 10560502
- Fiscal year: 2022
- Funding amount: $367,900
- Project category:
Scalable detection and interpretation of structural variation in human genomes
- Grant number: 10576268
- Fiscal year: 2020
- Funding amount: $367,900
- Project category:
Scalable detection and interpretation of structural variation in human genomes
- Grant number: 9973582
- Fiscal year: 2020
- Funding amount: $367,900
- Project category:
Scalable detection and interpretation of structural variation in human genomes
- Grant number: 10341175
- Fiscal year: 2020
- Funding amount: $367,900
- Project category:
Scalable detection and interpretation of structural variation in human genomes
- Grant number: 10153847
- Fiscal year: 2020
- Funding amount: $367,900
- Project category:
Software for exploring all forms of genetic variation in any species
- Grant number: 9749979
- Fiscal year: 2017
- Funding amount: $367,900
- Project category:
New algorithms and tools for large-scale genomic analyses
- Grant number: 8273206
- Fiscal year: 2012
- Funding amount: $367,900
- Project category:
New algorithms and tools for large-scale genomic analyses
- Grant number: 9272425
- Fiscal year: 2012
- Funding amount: $367,900
- Project category:
New algorithms and tools for large-scale genomic analyses
- Grant number: 8661785
- Fiscal year: 2012
- Funding amount: $367,900
- Project category:
Similar overseas grants
Rational design of rapidly translatable, highly antigenic and novel recombinant immunogens to address deficiencies of current snakebite treatments
- Grant number: MR/S03398X/2
- Fiscal year: 2024
- Funding amount: $367,900
- Project category: Fellowship
CAREER: FEAST (Food Ecosystems And circularity for Sustainable Transformation) framework to address Hidden Hunger
- Grant number: 2338423
- Fiscal year: 2024
- Funding amount: $367,900
- Project category: Continuing Grant
Re-thinking drug nanocrystals as highly loaded vectors to address key unmet therapeutic challenges
- Grant number: EP/Y001486/1
- Fiscal year: 2024
- Funding amount: $367,900
- Project category: Research Grant
Metrology to address ion suppression in multimodal mass spectrometry imaging with application in oncology
- Grant number: MR/X03657X/1
- Fiscal year: 2024
- Funding amount: $367,900
- Project category: Fellowship
CRII: SHF: A Novel Address Translation Architecture for Virtualized Clouds
- Grant number: 2348066
- Fiscal year: 2024
- Funding amount: $367,900
- Project category: Standard Grant
The Abundance Project: Enhancing Cultural & Green Inclusion in Social Prescribing in Southwest London to Address Ethnic Inequalities in Mental Health
- Grant number: AH/Z505481/1
- Fiscal year: 2024
- Funding amount: $367,900
- Project category: Research Grant
ERAMET - Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
- Grant number: 10107647
- Fiscal year: 2024
- Funding amount: $367,900
- Project category: EU-Funded
BIORETS: Convergence Research Experiences for Teachers in Synthetic and Systems Biology to Address Challenges in Food, Health, Energy, and Environment
- Grant number: 2341402
- Fiscal year: 2024
- Funding amount: $367,900
- Project category: Standard Grant
Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
- Grant number: 10106221
- Fiscal year: 2024
- Funding amount: $367,900
- Project category: EU-Funded
Recite: Building Research by Communities to Address Inequities through Expression
- Grant number: AH/Z505341/1
- Fiscal year: 2024
- Funding amount: $367,900
- Project category: Research Grant