Addressing Open Challenges of Computational Genome Annotation
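Basic information
- Grant number: 9975182
- Principal investigator:
- Amount: $342,400
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2018
- Funding country: United States
- Project period: 2018-09-01 to 2023-06-30
- Project status: Completed
- Source: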
- Keywords: Address; Algorithms; Alternative Splicing; Area; Bacteriophages; Benchmarking; Big Data; Bioinformatics; Chronic; Code; Collaborations; Collection; Communities; Complement; Complex; Computing Methodologies; Data; Deterioration; Development; Development Plans; Disease; Gene Family; Gene Structure; Genes; Genome; Genomics; Germany; Goals; Health; Human; Insecta; Institutes; Introns; Knowledge; Length; Licensing; Machine Learning; Maintenance; Metagenomics; Methods; Modeling; Modernization; Nested Genes; Noise; Overlapping Genes; Parasites; Performance; Population; Positioning Attribute; Property; Protein Family; Protein Isoforms; Proteins; Proteomics; RNA Splicing; Research; Running; Speed; Spliced Genes; Statistical Models; Supervision; Techniques; Technology; Time; Training; Transcript; Universities; Viral; Virus; annotation system; base; bioinformatics tool; computerized tools; cost; course development; design; evidence base; expectation; gene complementation; genome annotation; genome sciences; high throughput technology; human pathogen; improved; instrument; member; metagenome; multiple omics; nanopore; new technology; novel; novel strategies; open source; operation; predictive tools; protein profiling; reconstruction; success; tool; transcriptome; transcriptome sequencing; transcriptomics; whole genome
Project abstract
We propose to capitalize on the success of the ongoing collaboration between the bioinformatics
teams at the University of Greifswald (Germany) and the Georgia Institute of Technology (USA)
to address open challenges in computational genome annotation. In the course of this
development, we plan to implement new algorithmic ideas and meet the need for unbiased
integration of different types of OMICS data.
We plan to address one of the long-standing problems at the interface of bioinformatics and
machine learning: automatic generative and discriminative parameterization of gene-finding
algorithms. Current methods of combining OMICS evidence frequently result in tools that
under-predict or over-predict. Drawing on a good understanding of the difficulties and properties of
different types of OMICS evidence, we propose an optimized approach to fully unsupervised,
generative and discriminative training.
We will introduce novel means to optimize the integration of multiple types of OMICS evidence into gene
prediction. These ideas will further develop the protein family-based gene finding implemented
in AUGUSTUS-PPX. We propose to create representations of protein families for gene finding
that, for the first time, include cross-species gene structure information.
We will develop a new approach that unifies two advanced research areas: transcript
reconstruction from RNA-Seq, and statistical gene finding that integrates RNA-Seq and homology
information. We will describe a new, comprehensive model and an EM-like algorithmic technique
(the “wholistic” approach) to identify the sets of transcripts and their expression levels that best fit
the available OMICS evidence.
We will also develop an automatic gene-finding algorithm for the full content of metagenomes,
including eukaryotic and viral metagenomic sequences. This task is conventionally considered
too challenging. We propose a solution that exploits and advances algorithmic ideas and
approaches that we mastered in the course of creating gene finders for prokaryotic metagenomes
as well as eukaryotic genomes.
All new tools will be available to the community under open source licenses.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by MARK BORODOVSKY
Other grants by MARK BORODOVSKY
Addressing Open Challenges of Computational Genome Annotation
- Grant number: 9761554
- Fiscal year: 2018
- Funding amount: $342,400
- Project category:
NIGMS Administrative Supplements to Support Undergraduate Summer Research
- Grant number: 10393964
- Fiscal year: 2018
- Funding amount: $342,400
- Project category:
Improving Accuracy of Gene Prediction Programs of the G*
- Grant number: 6581987
- Fiscal year: 2002
- Funding amount: $342,400
- Project category:
Improving Accuracy of Gene Prediction Programs of the G*
- Grant number: 6686405
- Fiscal year: 2002
- Funding amount: $342,400
- Project category:
Conference-- Bioinformatics After the Human Genome
- Grant number: 6439388
- Fiscal year: 2001
- Funding amount: $342,400
- Project category:
IN SILICO BIOLOGY--GENOMES TO STRUCTURE TO FUNCTION
- Grant number: 6135836
- Fiscal year: 1999
- Funding amount: $342,400
- Project category:
IN SILICO BIOLOGY--GENOMES TO STRUCTURE TO FUNCTION
- Grant number: 2725234
- Fiscal year: 1999
- Funding amount: $342,400
- Project category:
GENE PREDICTION: MARKOV MODELS AND COMPLEMENTARY METHODS
- Grant number: 6388304
- Fiscal year: 1993
- Funding amount: $342,400
- Project category:
GENE PREDICTION--MARKOV MODELS AND COMPLEMENTARY METHODS
- Grant number: 6286238
- Fiscal year: 1993
- Funding amount: $342,400
- Project category:
Gene Prediction by Markov Models and Complementary Methods
- Grant number: 8053866
- Fiscal year: 1993
- Funding amount: $342,400
- Project category:
Similar overseas grants
DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
- Grant number: EP/Y029089/1
- Fiscal year: 2024
- Funding amount: $342,400
- Project category: Research Grant
CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
- Grant number: 2337776
- Fiscal year: 2024
- Funding amount: $342,400
- Project category: Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
- Grant number: 2338816
- Fiscal year: 2024
- Funding amount: $342,400
- Project category: Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
- Grant number: 2338846
- Fiscal year: 2024
- Funding amount: $342,400
- Project category: Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
- Grant number: 2348261
- Fiscal year: 2024
- Funding amount: $342,400
- Project category: Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
- Grant number: 2348346
- Fiscal year: 2024
- Funding amount: $342,400
- Project category: Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
- Grant number: 2348457
- Fiscal year: 2024
- Funding amount: $342,400
- Project category: Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
- Grant number: 2404989
- Fiscal year: 2024
- Funding amount: $342,400
- Project category: Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
- Grant number: 2339310
- Fiscal year: 2024
- Funding amount: $342,400
- Project category: Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
- Grant number: 2339669
- Fiscal year: 2024
- Funding amount: $342,400
- Project category: Continuing Grant