New algorithms and tools for large-scale genomic analyses
Basic information
- Grant number: 10357060
- Principal investigator:
- Amount: $664,200
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-02-02 to 2027-01-31
- Project status: Ongoing
- Source:
- Keywords: Address, Algorithmic Analysis, Algorithmic Software, Algorithms, Arithmetic, Biological, Biological Assay, Cells, ChIP-seq, Chromatin, Chromosomes, Collaborations, Communities, Complement, Complex, Computer software, Custom, DNA, DNA sequencing, Data, Data Set, Detection, Development, Discipline, Exons, Face, Foundations, Future, Gene Expression, Genes, Genome, Genomics, Growth, Libraries, Measures, Methods, Microscope, Modernization, Noise, Nucleic Acids, Pattern, Performance, Programming Languages, Proteins, Quality Control, Research, Research Personnel, Sampling, Sequence Alignment, Signal Transduction, Spectrum Analysis, Speed, Statistical Data Interpretation, Structure, System, Techniques, Technology, Testing, Time, Visualization, Visualization software, base, data format, data visualization, diverse data, exhaustion, experience, experimental study, file format, flexibility, genome annotation, genome browser, genomic data, improved, indexing, innovation, insight, large datasets, large scale data, operation, outreach, parallelization, reference genome, tool, user friendly software
Project abstract
PROJECT SUMMARY
The exploration and interpretation of large, complex datasets is vital to discovery in genomics. However,
researchers now confront a fundamental limitation: unprecedented experiments are possible thanks to modern
DNA sequencing technologies, yet existing “genome arithmetic” algorithms and data formats for comparing
and dissecting the resulting datasets are incapable of keeping pace with inexorable growth in dataset size and
complexity. Genome arithmetic (GA) represents a powerful and widely used set of techniques that allow one to
explore relationships among sets of genome features (e.g., a gene, sequence alignment, ChIP-seq peak, or
anything that can be described with chromosome coordinates). GA is used for a broad spectrum of analyses
including the detection of intersecting/overlapping features (e.g., sequence alignments and exons), the description of
feature coverage among datasets, and the merging, subtraction, and complementation of feature datasets. GA
functionality is used by all genome browsers and data visualization tools, and by analysis software such as
GATK and SAMTOOLS. Our BEDTOOLS software has become a staple of genomics research and is used in
a broad range of genomic analyses. However, continuous support and development have also revealed key
limitations in its current functionality that hinder analytical flexibility. We argue that
innovations in genome arithmetic algorithms, data formats and user-friendly software are needed to: (1)
empower researchers to conduct large-scale analyses with simple, flexible tools; (2) improve analysis tools to
keep pace with the scale of modern datasets; (3) visualize and quantify relationships among genome
datasets.
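
The genome arithmetic operations named above (merging, intersection, complementation, and subtraction of feature sets) can be illustrated with a minimal Python sketch. This is not the BEDTOOLS implementation; the function names, the `Interval` type, and the `chrom_length` parameter are hypothetical and chosen only for illustration. The sketch assumes features on a single chromosome, 0-based half-open coordinates, and intervals that lie within `[0, chrom_length)`.

```python
from typing import List, Tuple

# Hypothetical representation: half-open [start, end) on one chromosome.
Interval = Tuple[int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Collapse overlapping or touching intervals into sorted, disjoint ones."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the regions covered by both a and b (two-pointer sweep)."""
    a, b = merge(a), merge(b)
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def complement(intervals: List[Interval], chrom_length: int) -> List[Interval]:
    """Return the parts of [0, chrom_length) not covered by any interval."""
    out: List[Interval] = []
    cursor = 0
    for start, end in merge(intervals):
        if start > cursor:
            out.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < chrom_length:
        out.append((cursor, chrom_length))
    return out


def subtract(a: List[Interval], b: List[Interval], chrom_length: int) -> List[Interval]:
    """Return the parts of a not covered by b, as a intersected with not-b."""
    return intersect(a, complement(b, chrom_length))


# Made-up example coordinates, for illustration only.
exons = [(100, 200), (150, 300), (500, 600)]
peaks = [(180, 520)]

print(merge(exons))                  # [(100, 300), (500, 600)]
print(intersect(exons, peaks))       # [(180, 300), (500, 520)]
print(complement(exons, 1000))       # [(0, 100), (300, 500), (600, 1000)]
print(subtract(exons, peaks, 1000))  # [(100, 180), (520, 600)]
```

Sorting followed by a linear sweep keeps each operation at O(n log n) in this sketch. Real tools also have to handle multiple chromosomes, strand, and per-feature metadata, all of which are omitted here.
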
Therefore, the overall objective of this proposal is to provide the genomics community with innovative
new algorithms and software that keep pace with modern genomics experiments and facilitate future
discoveries. The Specific Aims are to: (1) Develop a refined suite of genome arithmetic algorithms and
programming interface for scalable analysis with BEDTOOLS. (2) Create new algorithms and genome
interval sketching approaches to enable large-scale dataset comparisons. (3) Enable large-scale
visualization and statistical analyses grounded in our recent advances in devising scalable new data
formats. These innovations will yield scalable new algorithms, data structures and formats that will
empower thousands of genomics researchers around the world.
Project outcomes
- Journal articles: 0
- Monographs: 0
- Research awards: 0
- Conference papers: 0
- Patents: 0
Other publications by Aaron R Quinlan
Extending reference assembly models
- DOI: 10.1186/s13059-015-0587-3
- Published: 2015-01-24
- Journal:
- Impact factor: 9.400
- Authors: Deanna M Church; Valerie A Schneider; Karyn Meltz Steinberg; Michael C Schatz; Aaron R Quinlan; Chen-Shan Chin; Paul A Kitts; Bronwen Aken; Gabor T Marth; Michael M Hoffman; Javier Herrero; M Lisandra Zepeda Mendoza; Richard Durbin; Paul Flicek
- Corresponding author: Paul Flicek
Erratum: A reference bacterial genome dataset generated on the MinION™ portable single-molecule nanopore sequencer
- DOI: 10.1186/s13742-015-0043-z
- Published: 2015-02-13
- Journal:
- Impact factor: 3.900
- Authors: Joshua Quick; Aaron R Quinlan; Nicholas J Loman
- Corresponding author: Nicholas J Loman
Other grants by Aaron R Quinlan
New algorithms and tools for large-scale genomic analyses
- Grant number: 10560502
- Fiscal year: 2022
- Amount: $664,200
- Project category:
Scalable detection and interpretation of structural variation in human genomes
- Grant number: 10576268
- Fiscal year: 2020
- Amount: $664,200
- Project category:
Scalable detection and interpretation of structural variation in human genomes
- Grant number: 9973582
- Fiscal year: 2020
- Amount: $664,200
- Project category:
Scalable detection and interpretation of structural variation in human genomes
- Grant number: 10341175
- Fiscal year: 2020
- Amount: $664,200
- Project category:
Scalable detection and interpretation of structural variation in human genomes
- Grant number: 10153847
- Fiscal year: 2020
- Amount: $664,200
- Project category:
Software for exploring all forms of genetic variation in any species
- Grant number: 9749979
- Fiscal year: 2017
- Amount: $664,200
- Project category:
New algorithms and tools for large-scale genomic analyses
- Grant number: 8273206
- Fiscal year: 2012
- Amount: $664,200
- Project category:
New algorithms and tools for large-scale genomic analyses
- Grant number: 9272425
- Fiscal year: 2012
- Amount: $664,200
- Project category:
New algorithms and tools for large-scale genomic analyses
- Grant number: 8460819
- Fiscal year: 2012
- Amount: $664,200
- Project category:
New algorithms and tools for large-scale genomic analyses
- Grant number: 8661785
- Fiscal year: 2012
- Amount: $664,200
- Project category:
Similar overseas grants
AI-based prediction of the blepharoptosis etiologies by means of machine learning algorithmic analysis of length-tensile force chart of levator muscle
- Grant number: 22K09863
- Fiscal year: 2022
- Amount: $664,200
- Project category: Grant-in-Aid for Scientific Research (C)
Algorithmic analysis of symmetric-key cryptographic primitives
- Grant number: 262074-2008
- Fiscal year: 2013
- Amount: $664,200
- Project category: Discovery Grants Program - Individual
Algorithmic analysis of symmetric-key cryptographic primitives
- Grant number: 262074-2008
- Fiscal year: 2012
- Amount: $664,200
- Project category: Discovery Grants Program - Individual
Algorithmic analysis of symmetric-key cryptographic primitives
- Grant number: 262074-2008
- Fiscal year: 2011
- Amount: $664,200
- Project category: Discovery Grants Program - Individual
Unified Approach for Nanotechnology CAD/Computation by Algorithmic Analysis of Periodic Crystal Structures
- Grant number: 22650002
- Fiscal year: 2010
- Amount: $664,200
- Project category: Grant-in-Aid for Challenging Exploratory Research
Algorithmic analysis of symmetric-key cryptographic primitives
- Grant number: 262074-2008
- Fiscal year: 2010
- Amount: $664,200
- Project category: Discovery Grants Program - Individual
Algorithmic analysis of symmetric-key cryptographic primitives
- Grant number: 262074-2008
- Fiscal year: 2009
- Amount: $664,200
- Project category: Discovery Grants Program - Individual
Algorithmic analysis of symmetric-key cryptographic primitives
- Grant number: 262074-2008
- Fiscal year: 2008
- Amount: $664,200
- Project category: Discovery Grants Program - Individual
Mathematical & Algorithmic Analysis of Natural and Artificial DNA Sequences
- Grant number: 0218568
- Fiscal year: 2002
- Amount: $664,200
- Project category: Standard Grant
Algorithmic Analysis and Congestion Control of Connection-Oriented Services in Large Scale Communication Networks.
- Grant number: 9404947
- Fiscal year: 1994
- Amount: $664,200
- Project category: Standard Grant