Interoperation of Genome Databases and Tools
Basic information
- Grant number: 6526869
- Principal investigator:
- Amount: $143,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2001
- Funding country: United States
- Project period: 2001-09-01 to 2006-08-31
- Project status: Completed
- Source:
- Keywords: Internet; computer data analysis; computer program/software; computer system design/evaluation; data collection methodology/evaluation; data management; gene expression; gene frequency; genome; human population genetics; information retrieval; microarray technology; molecular biology information system; parallel processing
Project abstract
DESCRIPTION: (provided by applicant) This application for an NIH Mentored Quantitative Research Career Award requests support for Dr. Kei-Hoi Cheung as he embarks on a faculty career focused on genome-related bioinformatics. It presents a research career development plan in bioinformatics, bridging computer science and biology. The plan includes two partially overlapping phases: (1) a didactic phase that emphasizes training, including coursework and laboratory work in genetics and genomics to complement Dr. Cheung's doctoral training in computer science, and (2) a development phase that focuses on intensive development of the proposed research. Both phases will be closely supervised by a steering committee of senior scientists in biology and bioinformatics, who will serve as mentors or advisors.
                                                                                    
The Human Genome Project and rapid advances in genomic technology (e.g., microarrays) have produced numerous local, national, and international genome databases, many of which are Web-accessible. To answer questions that arise in advanced genome research projects, researchers often need to analyze large amounts of data collected from multiple related databases. It is therefore important to explore (1) how to integrate the databases involved in a flexible and useful fashion and (2) how to perform large-scale data analyses as easily and rapidly as possible. To this end, we propose two complementary approaches.
                                                                                    
1. Data integration, or interoperation, is difficult because of the syntactic and semantic heterogeneities involved. To address this problem, we propose a metadata-driven approach using eXtensible Markup Language (XML), which incorporates standardized vocabulary to map heterogeneous Web-accessible data sets into a common format that facilitates interoperability.
                                                                                    
2. To facilitate and speed up the analysis of large quantities of data, we will also explore a range of computational techniques, including the use of Turbogenomics, which represents a collaboration with the high-performance computing group within the Yale Department of Computer Science. These techniques allow (i) heterogeneous software components (analysis tools) to be integrated easily and (ii) the power of parallel computing to be exploited.
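
As a rough illustration of the metadata-driven mapping described in approach 1, the Python sketch below translates records from two differently structured sources into one common XML format. It is not the applicant's implementation: the source names, field names, shared vocabulary, sample values, and the helpers `to_common_xml` and `integrate` are all hypothetical and chosen only to show the idea that a per-source metadata table, rather than source-specific code, drives the translation.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata: for each source, map its local field names onto a
# shared vocabulary. These names are placeholders, not taken from TRIPLES,
# ALFRED, or any real database schema.
FIELD_MAPS = {
    "source_a": {"orf": "gene_id", "organism": "species", "desc": "description"},
    "source_b": {"locus_name": "gene_id", "taxon": "species", "summary": "description"},
}


def to_common_xml(source, record):
    """Translate one source-specific record (a dict) into a common <record> element."""
    mapping = FIELD_MAPS[source]
    elem = ET.Element("record", attrib={"source": source})
    for local_name, value in record.items():
        common_name = mapping.get(local_name)
        if common_name is None:
            continue  # no entry in the shared vocabulary, so the field is skipped
        ET.SubElement(elem, common_name).text = str(value)
    return elem


def integrate(records_by_source):
    """Collect records from every source under a single <records> root."""
    root = ET.Element("records")
    for source, records in records_by_source.items():
        for record in records:
            root.append(to_common_xml(source, record))
    return root


if __name__ == "__main__":
    # Made-up sample records with different local field names per source.
    sample = {
        "source_a": [{"orf": "ORF-0001", "organism": "yeast", "internal_flag": 1}],
        "source_b": [{"locus_name": "LOCUS-42", "taxon": "human"}],
    }
    print(ET.tostring(integrate(sample), encoding="unicode"))
```

Under these assumptions, adding a new source means adding one entry to `FIELD_MAPS`; the translation code itself stays unchanged, which is the main appeal of driving the mapping from metadata.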
                                                                                    
We will design, develop, test, and evaluate the approach in the context of current database projects, including (1) TRIPLES, which manages data for large-scale yeast genome analysis (with Prof. Snyder), and (2) ALFRED, which stores gene frequency data on different human populations (with Prof. Kidd). We have identified a number of related external Web-accessible databases and tools that users would like to access from TRIPLES and ALFRED in an integrated fashion. We will initially develop and apply our approach to integrate these databases and tools, and will then extend it to other types of genomic data, such as microarray data, which both laboratories and others will soon be generating in large quantities.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Similar overseas grants
Improving the Use of Computer Data Analysis Skills in Undergraduate Meteorology
- Grant number: 0940842
- Fiscal year: 2009
- Funding amount: $143,000
- Project category: Standard Grant
Instrumentation for Undergraduate Lab Instruction in Molecular Cell Biology Using Non-Radioactive Labels and Computer Data Analysis
- Grant number: 9750703
- Fiscal year: 1997
- Funding amount: $143,000
- Project category: Standard Grant
The IR Spectroscopy and Computer Data Analysis in Organic, General, and Introductory Chemistry Laboratories
- Grant number: 9051296
- Fiscal year: 1990
- Funding amount: $143,000
- Project category: Standard Grant
Fourier Transform Spectroscopy and Laboratory Computer Data Analysis
- Grant number: 8551986
- Fiscal year: 1985
- Funding amount: $143,000
- Project category: Standard Grant
