POWRE: Combining Data Mining and Information Visualization Techniques with a Molecular Biology Sequence Similarity Database System
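Basic Information
- Award number: 9753283
- Principal investigator:
- Amount: $70.6K
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 1998
- Funding country: United States
- Project period: 1998-01-01 to 1999-12-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract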
The main objective of this project is to aid genome researchers with the task of elucidating patterns and clusters in large amounts of biological data. For genome researchers who are interested in comparing gene or protein sequences to the sequences within one genome or across genomes, this task involves executing hundreds of thousands of similarity searches that produce text output. This project involves the development of two specific software tools for visualizing and exploring the similarity data in a database of biological sequence similarity results.

The first tool will be an Interactive Categorization Tool. This tool will display attributes of selected similarity database objects in a 2D scatterplot and enable dynamic manipulation of the display. This will enable the genome researcher to explore the attributes of similarities and categorize the similarities based on those attributes. For example, the genome researcher will be able to vary the input parameters of a function for computing the strength of each detected similarity and display a plot with the strength of each similarity shown as the color of each point, and the points situated in the 2D space based on score and statistical significance as the X and Y axes. The tool will enable genome researchers to dynamically manipulate the generation of higher-level concepts or categories for detected similarities (strong, marginal, and weak similarities, as opposed to individual similarities with particular values of score and statistical significance that are more difficult to compare). This will lead to their ability to categorize hits as orthologous or paralogous, based on various attributes of the detected similarities. Score and p-value are not the only attributes that can be used: the system is general enough that other attributes, such as percent identity, percent conserved, and length of alignment, among others, could be used in functions. Thus, genome researchers can conduct exploration at different stages of the genome comparison research process.

The second tool will be a Cluster Exploration Tool. Using the results from data mining techniques that cluster like sequences together, genome researchers will be able to visualize the similarities among the sequences in the clusters. For example, the tool can be used for a cluster of new unknown sequences that were found similar to members of a group of known sequences. The new sequences can be positioned as nodes on the left in a bipartite graph, and the known sequences that they are similar to can be positioned along the right. Lines drawn between the nodes, colored differently based on the strength of the hits, will enable the researcher to visualize the connectedness of the sequences in the cluster. Details about each sequence and each similarity in the cluster can be obtained from the DBMS. This will enable genome researchers to study groups of orthologous or paralogous sequences.

A key feature of these tools is that they will be 'thin' clients (often referred to as applets) that communicate with the underlying DBMS via queries formulated visually by the genome researchers. The use of Java-based components for these tools will enable them to be easily used and shared by the bioinformatics community and the genome research community. The development of these tools will demonstrate the feasibility of the thin-client approach that is the hallmark of the network computing architecture philosophy.
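The abstract does not specify the strength function, the category thresholds, or any data model, so the following is only a minimal illustrative sketch of the two ideas it describes: a parameterized strength function that maps each similarity to a strong, marginal, or weak category, and a bipartite edge list linking new sequences to known ones. The weighted-sum formula, the threshold values, the sample hits, and every name (`Similarity`, `strength`, `categorize`, `bipartite_edges`) are assumptions made for illustration. The proposed tools were to be Java applets; Python is used here only to show the logic.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Similarity:
    """One detected similarity (hit) between a new and a known sequence."""
    query_id: str     # new, unknown sequence
    subject_id: str   # known sequence it resembles
    score: float      # alignment score
    p_value: float    # statistical significance of the hit


def strength(hit, score_weight=1.0, significance_weight=1.0):
    """Hypothetical strength: weighted sum of score and -log10(p-value)."""
    # Clamp p so that log10 stays defined when p is reported as 0.
    significance = -math.log10(max(hit.p_value, 1e-300))
    return score_weight * hit.score + significance_weight * significance


def categorize(value, strong_at=150.0, marginal_at=75.0):
    """Map a numeric strength to a higher-level category (assumed cutoffs)."""
    if value >= strong_at:
        return "strong"
    if value >= marginal_at:
        return "marginal"
    return "weak"


def bipartite_edges(hits, **params):
    """Return (new_sequence, known_sequence, category) edges for a cluster.

    New sequences form the left node set and known sequences the right;
    the category is what a viewer could use to color each edge.
    """
    return [
        (h.query_id, h.subject_id, categorize(strength(h, **params)))
        for h in hits
    ]


# Made-up hits, for illustration only.
hits = [
    Similarity("new1", "knownA", score=200.0, p_value=1e-50),
    Similarity("new1", "knownB", score=80.0, p_value=1e-5),
    Similarity("new2", "knownB", score=40.0, p_value=1e-2),
]
edges = bipartite_edges(hits)
for edge in edges:
    print(edge)
```

With the default weights, the three sample hits fall into the strong, marginal, and weak categories respectively. Calling `bipartite_edges(hits, score_weight=2.0)` re-categorizes the same hits under a different parameter setting, which mirrors the dynamic manipulation of input parameters that the abstract describes.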
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Elizabeth Shoop
Hands-on parallel & distributed computing with Raspberry Pi devices and clusters
- DOI: 10.1016/j.jpdc.2024.104996
- Publication date: 2025-02-01
- Journal:
- Impact factor:
- Authors: Elizabeth Shoop; Suzanne J. Matthews; Richard Brown; Joel C. Adams
- Corresponding author: Joel C. Adams
Other grants by Elizabeth Shoop
Collaborative Research: CS in Parallel: Scaling an Incremental Modular Approach to Injecting Parallel Computing Throughout CS Curricula
- Award number: 1225796
- Fiscal year: 2012
- Funding amount: $70.6K
- Project type: Standard Grant
Collaborative Research: CCLI-Responding to manycore: A strategy for injecting parallel computing education throughout the computer science curriculum
- Award number: 0941962
- Fiscal year: 2010
- Funding amount: $70.6K
- Project type: Standard Grant
Into the Community: Changing Perceptions and Increasing Participation in Computer Science
- Award number: 0850106
- Fiscal year: 2009
- Funding amount: $70.6K
- Project type: Standard Grant
Similar overseas grants
Combining Qualitative and Quantitative AI data for mobility
- Award number: 10080158
- Fiscal year: 2023
- Funding amount: $70.6K
- Project type: Collaborative R&D
Combining job mobility patterns and vacancy data to better measure labour market opportunities and skill mismatch
- Award number: ES/X011887/1
- Fiscal year: 2023
- Funding amount: $70.6K
- Project type: Research Grant
Computer-assisted diagnosis of ear pathologies by combining digital otoscopy with complementary data using machine learning
- Award number: 10564534
- Fiscal year: 2023
- Funding amount: $70.6K
- Project type:
IMAT-ITCR Collaboration: Combining FIBI and topological data analysis: Synergistic approaches for tumor structural microenvironment exploration
- Award number: 10884028
- Fiscal year: 2023
- Funding amount: $70.6K
- Project type:
OPEN INNOVATION PLATFORM FOR OPTIMISING PRODUCTION SYSTEMS BY COMBINING PRODUCT DEVELOPMENT, VIRTUAL ENGINEERING WORKFLOWS AND PRODUCTION DATA (PIONEER)
- Award number: 10063937
- Fiscal year: 2023
- Funding amount: $70.6K
- Project type: EU-Funded
Combining long-term field data and remote sensing to test how tree diversity influences aboveground biomass recovery in logged tropical forests
- Award number: NE/X000281/1
- Fiscal year: 2023
- Funding amount: $70.6K
- Project type: Research Grant
Combining quantum multicomponent molecular theory and data science to understand the mechanism of physical properties in low-barrier hydrogen-bonded systems
- Award number: 23K17905
- Fiscal year: 2023
- Funding amount: $70.6K
- Project type: Grant-in-Aid for Challenging Research (Exploratory)
Collaborative Research: Combining Heterogeneous Data Sources to Identify Genetic Modifiers of Diseases
- Award number: 2309825
- Fiscal year: 2023
- Funding amount: $70.6K
- Project type: Continuing Grant
IMAT-ITCR Collaboration: Combining FIBI and topological data analysis: Synergistic approaches for tumor structural microenvironment exploration
- Award number: 10885376
- Fiscal year: 2023
- Funding amount: $70.6K
- Project type:
OPEN INNOVATION PLATFORM FOR OPTIMISING PRODUCTION SYSTEMS BY COMBINING PRODUCT DEVELOPMENT, VIRTUAL ENGINEERING WORKFLOWS AND PRODUCTION DATA (PIONEER)
- Award number: 10058079
- Fiscal year: 2023
- Funding amount: $70.6K
- Project type: EU-Funded
