Collaborative Research: Algorithms for Threat Detection via Geometry of Virus Genome Space
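Basic information
- Award number: 1120824
- Principal investigator:
- Amount: $745,200
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2011
- Funding country: United States
- Project period: 2011-09-15 to 2015-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract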
A genome space is a moduli space of genomes: each point in the space corresponds to a genome, and two genomes are expected to be closely related if their corresponding points are close to each other. The investigators and their collaborators obtain a new geometric representation, the natural vector, for DNA sequences, and show that the correspondence between DNA sequences and natural vectors is one-to-one. They perform phylogenetic and clustering analysis of genome sequences in this space. Unlike most existing methods, the proposed genome space does not need sequence alignment or any evolutionary model and thus avoids computational repetition. In a pilot study of 27,643 genome sequences, the natural vector method takes only a couple of hours to compute all the pairwise differences, whereas the classical multiple alignment methods would take four years. Considering the exponentially increasing size of the known genome database, the natural vector method is the only known feasible approach to clustering the whole genome space. With the constructed natural vectors, the investigators use a stochastic classification model based on a permanental process to perform classification and clustering. Moreover, the probability that each virus genome belongs to a given cluster can also be obtained. For example, the investigators performed clustering analysis of 59 Influenza A H1N1 swine flu genomes and 113 human rhinovirus (HRV) genomes based on their whole genome sequences. They showed that the new outbreak of Influenza A H1N1 swine flu virus was most closely related to Eurasian and North American swine flu viruses, and that the 113 HRV genomes were well clustered into 5 classes: HRV-A, HRV-B, HRV-C, HEV-B, and HEV-C. The proposed method takes only 18 seconds to obtain the clustering result, while the commonly used multiple alignment method takes more than 19 hours. Both methods yield the same clustering result. The first goal of the proposed activity is to collect all available genome sequences for each type of virus, compute their natural vectors, and set up and maintain a "natural vector bank" for viruses. Second, the investigators will explore how many dimensions the natural vector needs in order to classify or cluster the genomes accurately. The third goal is to cluster the virus genomes based on their natural vectors. The final goal is to classify or identify any given new virus based on its genome sequence, and to predict its functions or behavior pattern.
In this project the investigators construct a novel, high-speed, accurate geometric representation for DNA sequences, called the natural vector. Based on this powerful new method, biologists can make a global comparison of all genomes simultaneously, which cannot be achieved by any other method. The method is very fast and convenient once the genome sequence is known, which is vital to homeland security. To predict the characteristics of a new virus coming from terrorist groups, one can compute the natural vector of the new virus and compare it with the natural vectors of other known viruses. In this way one can predict the possible properties of the new virus by looking at the properties of the viruses located nearby. Quickly and accurately identifying a new virus and predicting its functions will be very helpful to authorities in taking precautions and manufacturing a vaccine before the virus reaches a pandemic state and propagates throughout the general public.
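The abstract does not define the natural vector or the distance used to compare genomes, so the following Python sketch rests on stated assumptions rather than on the project's own code. It assumes the commonly described 12-dimensional form: for each nucleotide, its count, its mean position, and its second central moment of position normalized by (count × sequence length). It also assumes Euclidean distance and a simple nearest-neighbour lookup in place of the permanental-process classifier. The names `natural_vector` and `nearest_known` are hypothetical.

```python
from math import dist

ALPHABET = "ACGT"  # assumption: only the four standard nucleotides are counted


def natural_vector(seq: str) -> list[float]:
    """Return [counts..., mean positions..., normalized 2nd central moments...]."""
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for base in ALPHABET:
        # 1-based positions at which this nucleotide occurs
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        n_k = len(positions)
        if n_k == 0:
            # Assumed convention: an absent nucleotide contributes zeros
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu_k = sum(positions) / n_k
        d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments


def nearest_known(query: str, known: dict[str, str]) -> str:
    """Return the label of the known sequence whose vector is closest to the query's."""
    q = natural_vector(query)
    return min(known, key=lambda label: dist(q, natural_vector(known[label])))


if __name__ == "__main__":
    # Toy sequences for illustration only, not real viral genomes
    known = {"toy_A": "AAAACCGT", "toy_B": "GGGTTTCA"}
    print(nearest_known("AAAACCGA", known))
```

Because each vector is computed once per genome in time linear in its length, comparing a new sequence against a stored bank of vectors needs no alignment step, which is the property the abstract credits for the speed-up.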
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Stephen S. Yau
Blockchain-Based Software Architecture Development for Service Requirements With Smart Contracts
- DOI:
- Published: 2021
- Journal:
- Impact factor:
- Authors: Yan Zhu; Qian Guo; Hongjian Yin; Kaitai Liang; Stephen S. Yau
- Corresponding author: Stephen S. Yau
Attribute-based Private Data Sharing with Script-driven Programmable Ciphertext and Decentralized Key Management in Blockchain Internet of Things
- DOI:
- Published: 2021
- Journal:
- Impact factor: 10.6
- Authors: Hongjian Yin; E Chen; Yan Zhu; Chengwei Zhao; Rongquan Feng; Stephen S. Yau
- Corresponding author: Stephen S. Yau
Dynamic Audit Services for Outsourced Storages in Clouds
- DOI: 10.1109/tsc.2011.51
- Published: 2013-04
- Journal:
- Impact factor: 8.1
- Authors: Hongxin Hu; Stephen S. Yau; Ho G. An; Chang-Jun Hu
- Corresponding author: Chang-Jun Hu
An adaptable distributed trust management framework for large-scale secure service-based systems
- DOI: 10.1007/s00607-013-0354-9
- Published: 2013-10-12
- Journal:
- Impact factor: 2.800
- Authors: Stephen S. Yau; Yisheng Yao; Arun Balaji Buduru
- Corresponding author: Arun Balaji Buduru
Towards Green Service Composition Approach in the Cloud
- DOI: 10.1109/tsc.2018.2868356
- Published: 2018-09
- Journal:
- Impact factor: 8.1
- Authors: Shangguang Wang; Ao Zhou; Ruo Bao; Chou Wu; Stephen S. Yau
- Corresponding author: Stephen S. Yau
Other grants by Stephen S. Yau
Global invariants for complex varieties with isolated singularities and applications
- Award number: 0802803
- Fiscal year: 2008
- Award amount: $745,200
- Award type: Continuing Grant
Global Invariants for CR Geometry and Isolated Singularities
- Award number: 0503868
- Fiscal year: 2005
- Award amount: $745,200
- Award type: Standard Grant
U.S.-Hong Kong Joint Workshop: Recent Developments in Several Complex Variables, Cauchy Riemann Geometry and Complex Algebraic Geometry
- Award number: 0224546
- Fiscal year: 2002
- Award amount: $745,200
- Award type: Standard Grant
Some Natural Problems in Complex Geometry
- Award number: 9702836
- Fiscal year: 1997
- Award amount: $745,200
- Award type: Standard Grant
U.S.-China Joint Seminar on Singularities and Complex Geometry, Beijing, China, June 1994
- Award number: 9320301
- Fiscal year: 1994
- Award amount: $745,200
- Award type: Standard Grant
Mathematical Sciences: Some Natural Problems in Complex Geometry
- Award number: 9321262
- Fiscal year: 1994
- Award amount: $745,200
- Award type: Standard Grant
Mathematical Sciences: Some Natural Problems in Complex Geometry
- Award number: 9112949
- Fiscal year: 1991
- Award amount: $745,200
- Award type: Standard Grant
Mathematical Sciences: Some Natural Problems in Complex Geometry
- Award number: 8822747
- Fiscal year: 1989
- Award amount: $745,200
- Award type: Continuing Grant
Mathematical Sciences: Isolated Hypersurface Singularities, Invariant Theory of sl(2,C) and Complex Analytic Geometry
- Award number: 8601974
- Fiscal year: 1986
- Award amount: $745,200
- Award type: Continuing Grant
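Similar National Natural Science Foundation of China grants
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Approval year: 2024
- Award amount: CNY 0
- Award type: Provincial or municipal project
Cell Research
- Award number: 31224802
- Approval year: 2012
- Award amount: CNY 240,000
- Award type: Special fund project
Cell Research
- Award number: 31024804
- Approval year: 2010
- Award amount: CNY 240,000
- Award type: Special fund project
Cell Research
- Award number: 30824808
- Approval year: 2008
- Award amount: CNY 240,000
- Award type: Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Approval year: 2007
- Award amount: CNY 450,000
- Award type: General program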
Similar overseas grants
Collaborative Research: AF: Medium: Algorithms Meet Machine Learning: Mitigating Uncertainty in Optimization
- Award number: 2422926
- Fiscal year: 2024
- Award amount: $745,200
- Award type: Continuing Grant
Collaborative Research: AF: Small: Structural Graph Algorithms via General Frameworks
- Award number: 2347322
- Fiscal year: 2024
- Award amount: $745,200
- Award type: Standard Grant
Collaborative Research: AF: Medium: Fast Combinatorial Algorithms for (Dynamic) Matchings and Shortest Paths
- Award number: 2402283
- Fiscal year: 2024
- Award amount: $745,200
- Award type: Continuing Grant
Collaborative Research: AF: Small: Structural Graph Algorithms via General Frameworks
- Award number: 2347321
- Fiscal year: 2024
- Award amount: $745,200
- Award type: Standard Grant
Collaborative Research: AF: Medium: Fast Combinatorial Algorithms for (Dynamic) Matchings and Shortest Paths
- Award number: 2402284
- Fiscal year: 2024
- Award amount: $745,200
- Award type: Continuing Grant
Collaborative Research: AF: Medium: Adventures in Flatland: Algorithms for Modern Memories
- Award number: 2423105
- Fiscal year: 2024
- Award amount: $745,200
- Award type: Continuing Grant
Collaborative Research: CIF: Small: Versatile Data Synchronization: Novel Codes and Algorithms for Practical Applications
- Award number: 2312872
- Fiscal year: 2023
- Award amount: $745,200
- Award type: Standard Grant
Collaborative Research: III: Medium: Algorithms for scalable inference and phylodynamic analysis of tumor haplotypes using low-coverage single cell sequencing data
- Award number: 2415562
- Fiscal year: 2023
- Award amount: $745,200
- Award type: Standard Grant
Collaborative Research: Random Matrices and Algorithms in High Dimension
- Award number: 2306438
- Fiscal year: 2023
- Award amount: $745,200
- Award type: Continuing Grant
Collaborative Research: SLES: Safe Distributional-Reinforcement Learning-Enabled Systems: Theories, Algorithms, and Experiments
- Award number: 2331781
- Fiscal year: 2023
- Award amount: $745,200
- Award type: Standard Grant













