A clustering framework for the process of knowledge discovery in databases
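Basic information
- Grant number: 250960-2006
- Principal investigator:
- Amount: $22,300
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2008
- Funding country: Canada
- Duration: 2008-01-01 to 2009-12-31
- Project status: Completed
- Source:
- Keywords: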
Project abstract
Knowledge Discovery in Databases (KDD) has been defined as the process of extracting valid, novel, understandable and potentially useful patterns from large databases. The KDD process involves several steps, in particular focusing, pre-processing, data mining and evaluation, that normally have to be iterated to achieve satisfactory results. So far, most KDD research has focused on the data mining step, developing efficient algorithms for tasks such as clustering, classification and association rule mining. Unfortunately, not much research has addressed the other steps and the process as a whole, which has seriously limited the usefulness of existing data mining methods. In this project, we want to explore support for the entire KDD process in the context of clustering, one of the most important data mining tasks. The lack of support for all KDD steps is especially problematic for clustering due to its unsupervised, exploratory nature and because most clustering algorithms do not generate explicit patterns, but return clusters simply as sets of objects. The objective of this proposed project is to develop a framework for clustering supporting all KDD steps. Most existing clustering algorithms exploit only attributes of the objects to be clustered, but in many emerging applications relationships among the target table and attributes from related tables play an important role in representing the objects of interest. In market segmentation, e.g., not only the purchasing preferences but also the social network among the customers is relevant for clustering. When clustering gene expression data, as another example, attributes of the related proteins and their further relationships must be considered. We plan to evaluate our clustering framework in close collaboration with domain experts in the applications of analysis of gene expression data, analysis of flow cytometry data as well as community identification and market segmentation.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Ester, Martin
MOLI: multi-omics late integration with deep neural networks for drug response prediction
- DOI: 10.1093/bioinformatics/btz318
- Published: 2019-07-15
- Journal:
- Impact factor: 5.8
- Authors: Sharifi-Noghabi, Hossein; Zolotareva, Olga; Ester, Martin
- Corresponding author: Ester, Martin
AITL: Adversarial Inductive Transfer Learning with input and output space adaptation for pharmacogenomics
- DOI: 10.1093/bioinformatics/btaa442
- Published: 2020-07-01
- Journal:
- Impact factor: 5.8
- Authors: Sharifi-Noghabi, Hossein; Peng, Shuman; Ester, Martin
- Corresponding author: Ester, Martin
Collaborative intra-tumor heterogeneity detection
- DOI: 10.1093/bioinformatics/btz355
- Published: 2019-07-15
- Journal:
- Impact factor: 5.8
- Authors: Khakabimamaghani, Sahand; Malikic, Salem; Ester, Martin
- Corresponding author: Ester, Martin
Ligand Binding Prediction Using Protein Structure Graphs and Residual Graph Attention Networks.
- DOI: 10.3390/molecules27165114
- Published: 2022-08-11
- Journal:
- Impact factor: 4.6
- Authors: Pandey, Mohit; Radaeva, Mariia; Mslati, Hazem; Garland, Olivia; Fernandez, Michael; Ester, Martin; Cherkasov, Artem
- Corresponding author: Cherkasov, Artem
HUME: large-scale detection of causal genetic factors of adverse drug reactions
- DOI: 10.1093/bioinformatics/bty475
- Published: 2018-12-15
- Journal:
- Impact factor: 5.8
- Authors: Mansouri, Mehrdad; Yuan, Bowei; Ester, Martin
- Corresponding author: Ester, Martin
Other grants held by Ester, Martin
Data Mining in Heterogeneous Information Networks with Attributes
- Grant number: RGPIN-2017-04072
- Fiscal year: 2022
- Amount: $22,300
- Project category: Discovery Grants Program - Individual
Data Mining in Heterogeneous Information Networks with Attributes
- Grant number: RGPIN-2017-04072
- Fiscal year: 2021
- Amount: $22,300
- Project category: Discovery Grants Program - Individual
Data Mining in Heterogeneous Information Networks with Attributes
- Grant number: RGPIN-2017-04072
- Fiscal year: 2020
- Amount: $22,300
- Project category: Discovery Grants Program - Individual
Data Mining in Heterogeneous Information Networks with Attributes
- Grant number: RGPIN-2017-04072
- Fiscal year: 2019
- Amount: $22,300
- Project category: Discovery Grants Program - Individual
Data Mining in Heterogeneous Information Networks with Attributes
- Grant number: RGPIN-2017-04072
- Fiscal year: 2018
- Amount: $22,300
- Project category: Discovery Grants Program - Individual
Data Mining in Heterogeneous Information Networks with Attributes
- Grant number: RGPIN-2017-04072
- Fiscal year: 2017
- Amount: $22,300
- Project category: Discovery Grants Program - Individual
Probabilistic Graphical Models for Data Mining and Recommendation in Social Media
- Grant number: 250960-2012
- Fiscal year: 2016
- Amount: $22,300
- Project category: Discovery Grants Program - Individual
Probabilistic Graphical Models for Data Mining and Recommendation in Social Media
- Grant number: 250960-2012
- Fiscal year: 2015
- Amount: $22,300
- Project category: Discovery Grants Program - Individual
Create Program for Computational Methods for the Analysis of the Diversity and Dynamics of Genomes (Create - CMADDG Training Program)
- Grant number: 433905-2013
- Fiscal year: 2015
- Amount: $22,300
- Project category: Collaborative Research and Training Experience
Probabilistic Graphical Models for Data Mining and Recommendation in Social Media
- Grant number: 250960-2012
- Fiscal year: 2014
- Amount: $22,300
- Project category: Discovery Grants Program - Individual
Similar overseas grants
A Process-Based Framework for Open Innovation with Social Media Data
- Grant number: DP230102657
- Fiscal year: 2024
- Amount: $22,300
- Project category: Discovery Projects
DIGINTRACE: A Digital value chain Integration Traceability framework for process industries for Circularity and low Emissions by waste reduction and use of secondary raw materials
- Grant number: 10061918
- Fiscal year: 2023
- Amount: $22,300
- Project category: EU-Funded
Project 3: 3-D Molecular Atlas of cerebral amyloid angiopathy in the aging brain with and without co-pathology
- Grant number: 10555899
- Fiscal year: 2023
- Amount: $22,300
- Project category:
Selective C(sp3)–H Functionalization Enabled by Metal-Organic Framework Catalysis
- Grant number: 10679785
- Fiscal year: 2023
- Amount: $22,300
- Project category:
Home foot-temperature monitoring through smart mat technology to improve access, equity, and outcomes in high-risk patients with diabetes
- Grant number: 10539209
- Fiscal year: 2023
- Amount: $22,300
- Project category:
Functional and behavioral dissection of higher order thalamocortical circuits in schizophrenia.
- Grant number: 10633810
- Fiscal year: 2023
- Amount: $22,300
- Project category:
Multimodal Label-Free Nanosensor for Single Virus Characterization and Content Analysis
- Grant number: 10641529
- Fiscal year: 2023
- Amount: $22,300
- Project category:
Extensible Open Source Zero-Footprint Web Viewer for Cancer Imaging Research
- Grant number: 10644112
- Fiscal year: 2023
- Amount: $22,300
- Project category:
Tufts Clinical and Translational Science Institute (Clinical Trial Design Labs Supplement)
- Grant number: 10844980
- Fiscal year: 2023
- Amount: $22,300
- Project category:


















