CAREER: Combinatorial Algorithms for Pattern Discovery with Applications to Data Mining and Computational Biology

Basic Information

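  • Award number:
    0447773
  • Principal investigator:
  • Amount:
    --
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2005
  • Funding country:
    United States
  • Project period:
    2005-08-01 to 2011-07-31
  • Project status:
    Completed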

Project Abstract

The exponential growth of the web, recent technological progress in molecular biology, the launch of massive-scale digital library projects, and the ability to exchange information at our fingertips have all contributed to the creation of an unprecedented quantity of textual data in digital form. Plain or semi-structured text is still the most versatile format in which to exchange information, and there is so much of this data that it is likely that the large majority of it will never be read by anyone unless the way in which we access information drastically improves.
The major limiting factor in handling large textual datasets is typically related to space rather than time. When the amount of data is too large to be stored in main memory, computer scientists have to resort to algorithms capable of dealing with compressed representations of the data (called 'sketches' or 'indexes'). For textual data, the construction of the sketch typically involves keeping statistics on substrings or related associations or rules.
The first set of objectives of this project is centered around a new sketch based on a novel family of gapped patterns. We are applying the new index to three selected problems: databases, data compression, and computational biology. In the second set of objectives we are extending the pattern discovery problem to two-dimensional matrices. The discovery problem associated with two-dimensional patterns has a wide spectrum of applications, including the analysis of gene expression data, recommender systems and collaborative filtering, identification of web communities, load balancing, and discovery of association rules.
The education goal of the proposal is to establish the algorithmic and fundamental software development components of an interdisciplinary bioinformatics curriculum. Funds from this proposal are being used to enhance these activities through the development of new courses in computational genomics for in-depth training on individualized research topics. Since UCR is a minority-serving institution, this plan will also have an impact on the education of under-represented students.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Stefano Lonardi

TRFill: synergistic use of HiFi and Hi-C sequencing enables accurate assembly of tandem repeats for population-level analysis
  • DOI:
    10.1186/s13059-025-03685-5
  • Published:
    2025-07-28
  • Journal:
  • Impact factor:
    9.400
  • Authors:
    Huaming Wen;Jinbao Yang;Xianjia Zhao;Xingbin Wang;Jiawei Lei;Yanchun Li;Wenjie Du;Dongxi Li;Yun Xu;Stefano Lonardi;Weihua Pan
  • Corresponding author:
    Weihua Pan
Correction to: Comprehensive benchmarking and ensemble approaches for metagenomic classifiers
  • DOI:
    10.1186/s13059-019-1687-2
  • Published:
    2019-04-05
  • Journal:
  • Impact factor:
    9.400
  • Authors:
    Alexa B. R. McIntyre;Rachid Ounit;Ebrahim Afshinnekoo;Robert J. Prill;Elizabeth Hénaff;Noah Alexander;Samuel S. Minot;David Danko;Jonathan Foox;Sofia Ahsanuddin;Scott Tighe;Nur A. Hasan;Poorani Subramanian;Kelly Moffat;Shawn Levy;Stefano Lonardi;Nick Greenfield;Rita R. Colwell;Gail L. Rosen;Christopher E. Mason
  • Corresponding author:
    Christopher E. Mason

Other grants by Stefano Lonardi

III: Small: Improving de novo Genome Assembly using Optical Maps
  • Award number:
    1814359
  • Fiscal year:
    2018
  • Award amount:
    --
  • Project type:
    Standard Grant
III: Small: Algorithms for Genome Assembly of Ultra-Deep Sequencing Data
  • Award number:
    1526742
  • Fiscal year:
    2015
  • Award amount:
    --
  • Project type:
    Standard Grant
III: Medium: Algorithms and Software Tools for Epigenetics Research
  • Award number:
    1302134
  • Fiscal year:
    2013
  • Award amount:
    --
  • Project type:
    Continuing Grant
ABI Innovation: Barcoding-Free Multiplexing: Leveraging Combinatorial Pooling for High-Throughput Sequencing
  • Award number:
    1062301
  • Fiscal year:
    2011
  • Award amount:
    --
  • Project type:
    Standard Grant

Similar international grants

Collaborative Research: AF: Medium: Fast Combinatorial Algorithms for (Dynamic) Matchings and Shortest Paths
  • Award number:
    2402283
  • Fiscal year:
    2024
  • Award amount:
    --
  • Project type:
    Continuing Grant
Collaborative Research: AF: Medium: Fast Combinatorial Algorithms for (Dynamic) Matchings and Shortest Paths
  • Award number:
    2402284
  • Fiscal year:
    2024
  • Award amount:
    --
  • Project type:
    Continuing Grant
Collaborative Research: FET: Small: De Novo Protein Scaffold Filling by Combinatorial Algorithms and Deep Learning Models
  • Award number:
    2307573
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
    Standard Grant
Collaborative Research: FET: Small: De Novo Protein Scaffold Filling by Combinatorial Algorithms and Deep Learning Models
  • Award number:
    2307571
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
    Standard Grant
Collaborative Research: FET: Small: De Novo Protein Scaffold Filling by Combinatorial Algorithms and Deep Learning Models
  • Award number:
    2307572
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
    Standard Grant
Combinatorial Algorithms for Parallel and Distributed Computing
  • Award number:
    RGPIN-2020-06789
  • Fiscal year:
    2022
  • Award amount:
    --
  • Project type:
    Discovery Grants Program - Individual
Efficient Algorithms for Combinatorial Optimization Problems in Networks and Beyond
  • Award number:
    RGPIN-2017-03956
  • Fiscal year:
    2022
  • Award amount:
    --
  • Project type:
    Discovery Grants Program - Individual
Approximation Algorithms for Combinatorial Optimization Problems
  • Award number:
    RGPIN-2020-06423
  • Fiscal year:
    2022
  • Award amount:
    --
  • Project type:
    Discovery Grants Program - Individual
Algorithms for hard quadratic combinatorial optimization problems and linkages with quantum bridge analytics
  • Award number:
    RGPIN-2021-03190
  • Fiscal year:
    2022
  • Award amount:
    --
  • Project type:
    Discovery Grants Program - Individual
Combinatorial problems from the perspective of algorithms and complexity
  • Award number:
    RGPIN-2017-04459
  • Fiscal year:
    2022
  • Award amount:
    --
  • Project type:
    Discovery Grants Program - Individual