ITR Collaborative Research: Combinatorial Algorithms for Biological Data Clustering
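Basic information
- Award number: 0324292
- Principal investigator:
- Amount: $455,000
- Host institution:
- Host institution country: United States
- Project category: Continuing Grant
- Fiscal year: 2003
- Funding country: United States
- Project period: 2003-09-15 to 2008-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract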
Project Summary

The Human Genome Project has opened the floodgates of biological data, generating enormous amounts of sequence, structure, expression, and interaction data at rates that far exceed our current capability to analyze and interpret them. New ideas and approaches are urgently needed to establish greatly improved capabilities for biological data analysis. Data clustering is fundamental to mining large quantities of biological data.

The goals of this project are (a) to develop a highly effective and general framework for biological data clustering, which is applicable to a large class of biological data analysis problems; (b) to demonstrate the effectiveness of this framework as a general-purpose clustering tool, through application to four challenging biological data analysis problems; (c) to implement this clustering framework as a set of library functions, in a similar fashion to LINPACK/LAPACK, with which other researchers can build their own clustering capabilities more efficiently; (d) to provide insight on several biological problems through clustering analysis; and (e) to train students and postdocs to build biological data analysis tools, using our clustering framework as a training ground.

The foundation of our framework is a minimum spanning tree (MST) representation of a data set and its relationships with clustering. Our preliminary studies have revealed that (i) there is a natural connection between MSTs and the concept of clustering, which can help to reduce a multi-dimensional data clustering problem to a tree-partitioning problem; (ii) clustering problems with general objective functions, defined on (minimum spanning) trees, can be solved optimally and efficiently; and (iii) MSTs provide a natural framework for solving a more general class of clustering problems, i.e., extracting data clusters from a noisy background.
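The reduction from multi-dimensional clustering to tree partitioning can be illustrated with a minimal sketch. This is not the project's own algorithm or library: it assumes Euclidean distance, builds the MST with Prim's algorithm, and uses the simplest possible partitioning rule (cut the k-1 longest tree edges). The function names `minimum_spanning_tree` and `mst_clusters` are hypothetical, chosen only for this example.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (weight, i, j) tree edges over point indices.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge linking each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        # Relax the attachment cost of every point still outside the tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def mst_clusters(points, k):
    """Partition the MST into k subtrees by cutting its k-1 longest edges.

    Returns one integer cluster label per input point.
    """
    n = len(points)
    edges = sorted(minimum_spanning_tree(points))
    kept = edges[: max(0, len(edges) - (k - 1))]

    # Union-find over the kept edges recovers the connected subtrees.
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for _, i, j in kept:
        root[find(i)] = find(j)

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(mst_clusters(points, 2))  # -> [0, 0, 0, 1, 1, 1]
```

Cutting the longest edges is only one objective defined on the tree; the abstract refers to more general objective functions, which this sketch does not attempt to cover.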
Additional preliminary studies have also revealed that MSTs have such rich properties related to clustering that further investigation could lead to significantly more effective ways of clustering and analyzing biological data. Our research will be organized and carried out in five tasks.

- Investigation of fundamental properties of MSTs versus clustering: We will investigate fundamental relationships between MSTs and clustering. New insights and discoveries about their relationships will be used to lay the foundation for development of more effective ways of clustering.
- Investigation and development of MST-based clustering algorithms and statistical analysis methods: We will investigate and develop a large class of MST-based algorithms for several clustering-related problems. In addition, we will investigate and develop effective statistical analysis tools for assessing the statistical significance and robustness of clustering results.
- Development of improved analysis capabilities for four selected application problems: We will apply our clustering framework to four biological data analysis problems: (1) gene expression data analysis, (2) regulatory binding site identification, (3) two-hybrid data analysis, and (4) phylogenetic tree clustering analysis.
- Implementation of our MST-based clustering framework as library functions: We will implement our MST-based clustering-related algorithms as application programming interfaces (APIs), which can be used easily by other researchers in their own data analysis software. In addition, we will implement our clustering tools as a Web server for community service.
- Training and education: As MSTs provide such a rich set of attractive properties relevant to clustering, we will use our MST-based clustering framework as a training platform to teach students and postdocs how to develop biological data analysis tools.

Our proposed study and development directly address the research challenges of the ITR program in the following areas:

- providing new computational, simulation, and data-analysis methods and tools to model physical, biological, social, behavioral, and mathematical phenomena, and
- improving our ability to understand, model, and control the behavior of complex systems.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Michael Zhang
Implications of Antiangiogenic Therapy on Radiographic Assessment of Brain Tumors.
- DOI: 10.1016/j.wneu.2017.09.035
- Year published: 2017
- Journal:
- Impact factor: 2
- Authors: H. Narayanamurthy; Michael Zhang; M. Teo
- Corresponding author: M. Teo
The value of pre-trip information on departure time and route choice in the morning commute under stochastic bottleneck capacity
- DOI:
- Year published: 2021
- Journal:
- Impact factor: 0
- Authors: Xiao Han; Yun Yu; Ziyou Gao; Michael Zhang
- Corresponding author: Michael Zhang
Reducing Reliance on Spurious Features in Medical Image Classification with Spatial Specificity
- DOI:
- Year published: 2022
- Journal:
- Impact factor: 0
- Authors: Khaled Saab; Sarah Hooper; Mayee F. Chen; Michael Zhang; D. Rubin; Christopher Ré
- Corresponding author: Christopher Ré
Characterization of novel Actinobacteriophage Giantsbane reveals unexpected cluster AU relationships
- DOI:
- Year published: 2022
- Journal:
- Impact factor: 0
- Authors: Pei; Christopher Liu; Preston Dang; Michael Zhang; A. Kapinos; Ryan Ngo; K. Reddi; J. M. Parker; Amanda C. Freise
- Corresponding author: Amanda C. Freise
Heterogeneous response and irAE patterns in advanced melanoma patients treated with anti-PD-1 monotherapy from different ethnic groups: Subtype distribution discrepancy and beyond.
- DOI: 10.1200/jco.2020.38.15_suppl.10020
- Year published: 2020
- Journal:
- Impact factor: 45.3
- Authors: X. Bai; Henry T. Quach; C. Cann; Michael Zhang; Michelle S. Kim; Gyulnara G. Kasumova; L. Si; B. Tang; C. Cui; Xiaoling Yang; Xiaoting Wei; J. Cohen; D. Lawrence; T. Sharova; Dennie T. Frederick; K. Flaherty; R. Sullivan; G. Boland; Douglas B. Johnson; Jun Guo
- Corresponding author: Jun Guo
Other grants by Michael Zhang
Collaborative Research: Bias Modeling and Estimation of Networked Transportation Data
- Award number: 1825873
- Fiscal year: 2018
- Award amount: $455,000
- Project category: Standard Grant
CPS: Synergy: Collaborative Research: Matching Parking Supply to Travel Demand towards Sustainability: a Cyber Physical Social System for Sensing Driven Parking
- Award number: 1544835
- Fiscal year: 2015
- Award amount: $455,000
- Project category: Standard Grant
User-Centric Sensing and Distributed Control of Corridor Transportation Networks
- Award number: 1301496
- Fiscal year: 2013
- Award amount: $455,000
- Project category: Standard Grant
Distributed Vehicular Traffic Management via DSRC-Enabled Vehicles
- Award number: 0700383
- Fiscal year: 2007
- Award amount: $455,000
- Project category: Standard Grant
CAREER: Improved Continuum Models of Vehicular Traffic Flow
- Award number: 9984239
- Fiscal year: 2000
- Award amount: $455,000
- Project category: Standard Grant
Similar overseas grants
ITR Collaborative Research: Pervasively Secure Infrastructures (PSI): Integrating Smart Sensing, Data Mining, Pervasive Networking, and Community Computing
- Award number: 1404694
- Fiscal year: 2013
- Award amount: $455,000
- Project category: Continuing Grant
ITR-SCOTUS: A Resource for Collaborative Research in Speech Technology, Linguistics, Decision Processes, and the Law
- Award number: 1139735
- Fiscal year: 2011
- Award amount: $455,000
- Project category: Continuing Grant
ITR/NGS: Collaborative Research: DDDAS: Data Dynamic Simulation for Disaster Management
- Award number: 0963973
- Fiscal year: 2009
- Award amount: $455,000
- Project category: Continuing Grant
ITR/NGS: Collaborative Research: DDDAS: Data Dynamic Simulation for Disaster Management
- Award number: 1018072
- Fiscal year: 2009
- Award amount: $455,000
- Project category: Continuing Grant
ITR Collaborative Research: A Reusable, Extensible, Optimizing Back End
- Award number: 0838899
- Fiscal year: 2008
- Award amount: $455,000
- Project category: Continuing Grant
ITR Collaborative Research: Pervasively Secure Infrastructures (PSI): Integrating Smart Sensing, Data Mining, Pervasive Networking, and Community Computing
- Award number: 0833849
- Fiscal year: 2008
- Award amount: $455,000
- Project category: Continuing Grant
ITR/NGS: Collaborative Research: DDDAS: Data Dynamic Simulation for Disaster Management
- Award number: 0808419
- Fiscal year: 2007
- Award amount: $455,000
- Project category: Continuing Grant
ITR: Collaborative Research - ASE - (sim+dmc): Image-based Biophysical Modeling: Scalable Registration and Inversion Algorithms and Distributed Computing
- Award number: 0849301
- Fiscal year: 2007
- Award amount: $455,000
- Project category: Continuing Grant
ITR: Collaborative Research: Modeling and Display of Haptic Information for Enhanced Performance of Computer-Integrated Surgery
- Award number: 0711040
- Fiscal year: 2007
- Award amount: $455,000
- Project category: Standard Grant
Collaborative Research: ITR-(ASE)-(dmc): Overcoming Fractionation Errors in Cancer Treatment Planning
- Award number: 0749671
- Fiscal year: 2006
- Award amount: $455,000
- Project category: Standard Grant


















