Generalized sequential data mining using enhanced object representations based on preliminary clustering profiles

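Basic information

  • Grant number:
    RGPIN-2018-05363
  • Principal investigator:
  • Amount:
    $24,800
  • Host institution:
  • Host institution country:
    Canada
  • Program:
    Discovery Grants Program - Individual
  • Fiscal year:
    2022
  • Funding country:
    Canada
  • Start and end dates:
    2022-01-01 to 2023-12-31
  • Project status:
    Completed

Project abstract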

My research proposes the use of unsupervised clustering to enhance datasets prior to applying other data mining and business intelligence techniques including data summarization, second order unsupervised clustering, classification, prediction, and association mining. Inspired by unsupervised pre-training used in the convergence of deep feed-forward neural networks, the proposed work will be a significant step forward in this area. In the first phase, we will use state-of-the-art clustering techniques. In subsequent phases, we will explore data mining techniques beyond neural networks.
To illustrate an application of the proposed work, consider a financial transaction dataset that can be used for customer relationship management, inventory management and fraud detection. A customer's representation can include temporal, geographical, or spending profiles. In phase one, these profiles can be used as attributes for representing customers in addition to other raw attributes such as total spending, frequency and recency of visits. The augmented dataset can be analyzed in the second phase using a diverse set of data mining techniques including business intelligence, association mining, supervised learning (classification and prediction), and second order unsupervised learning (clustering). The association mining will provide rules such as: "Those who spend more in summer tend to spend more on weekends". The supervised learning will use known incidences of default, fraud and other anomalous behavior in financial transaction data to create models that predict the chances of fraud and anomalies based on customer profiles. For example, "holiday spenders" tend to default more on their loans. The subsequent unsupervised learning will also help us understand correlations between different profiles. For instance, "summer spenders" tend to be "non-local spenders".
We will demonstrate the usefulness of our proposal using data from one wholesaler, one retailer, and open datasets for financial defaults to test and refine the proposed sequential data mining techniques. The application will not be restricted to commercial transactions. In addition, we will also use weather and energy consumption patterns to suggest optimal energy control strategies for buildings.
While this proposal derives its inspiration from the unsupervised pre-training in deep learning, it will benefit from utilizing broader clustering research from the last fifty years. An example of one improvement is using cluster validity indices to determine the appropriate number of clusters. Furthermore, first-phase supervised learning will also be applied to other classification and prediction techniques such as decision trees, random forests and support vector machines. It can also enhance other data mining techniques such as optimization, association mining and business intelligence.
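The phase-one idea in the abstract (cluster customers on a profile, pick the number of clusters with a validity index, then attach the cluster label as an extra attribute) can be sketched roughly as follows. This is only an illustration, not the project's implementation: the customer records, the field names `seasonal_shares` and `seasonal_profile`, the choice of plain k-means, and the use of the silhouette coefficient as the validity index are all assumptions made for the example.

```python
from math import dist

def kmeans(points, k, iters=50):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # An empty cluster keeps its previous centroid.
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient, one common cluster validity index."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def augment_with_profiles(customers, k_values=(2, 3, 4)):
    """Phase one: cluster seasonal shares, then add the label as an attribute."""
    shares = [c["seasonal_shares"] for c in customers]
    candidates = {k: kmeans(shares, k) for k in k_values}
    best_k = max(candidates, key=lambda k: silhouette(shares, candidates[k]))
    augmented = [{**c, "seasonal_profile": lab}
                 for c, lab in zip(customers, candidates[best_k])]
    return best_k, augmented

# Hypothetical toy data: shares of spending in (winter, spring, summer, fall).
CUSTOMERS = [
    {"id": "c1", "total_spend": 1200.0, "visits": 14, "seasonal_shares": (0.10, 0.15, 0.60, 0.15)},
    {"id": "c2", "total_spend": 800.0, "visits": 9, "seasonal_shares": (0.12, 0.13, 0.62, 0.13)},
    {"id": "c3", "total_spend": 950.0, "visits": 11, "seasonal_shares": (0.08, 0.17, 0.58, 0.17)},
    {"id": "c4", "total_spend": 2100.0, "visits": 6, "seasonal_shares": (0.60, 0.10, 0.10, 0.20)},
    {"id": "c5", "total_spend": 1750.0, "visits": 5, "seasonal_shares": (0.58, 0.12, 0.08, 0.22)},
    {"id": "c6", "total_spend": 1900.0, "visits": 7, "seasonal_shares": (0.62, 0.08, 0.12, 0.18)},
    {"id": "c7", "total_spend": 600.0, "visits": 20, "seasonal_shares": (0.25, 0.25, 0.25, 0.25)},
    {"id": "c8", "total_spend": 640.0, "visits": 22, "seasonal_shares": (0.27, 0.23, 0.26, 0.24)},
    {"id": "c9", "total_spend": 580.0, "visits": 18, "seasonal_shares": (0.23, 0.27, 0.24, 0.26)},
]

best_k, augmented = augment_with_profiles(CUSTOMERS)
print("chosen number of clusters:", best_k)
for row in augmented:
    print(row["id"], row["total_spend"], row["visits"], "profile:", row["seasonal_profile"])
```

The `augmented` records keep the raw attributes and gain one categorical profile attribute, so a second-phase method (a classifier, association mining, or another round of clustering) could consume them directly. In practice a library clustering implementation and richer profiles would likely replace this toy version.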

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Lingras, Pawan

AEDNav: indoor navigation for locating automated external defibrillator.
  • DOI:
    10.1186/s12911-022-01886-7
  • Published:
    2022-06-20
  • Journal:
  • Impact factor:
    3.5
  • Authors:
    Rao, Gaurav;Mago, Vijay;Lingras, Pawan;Savage, David W.
  • Corresponding author:
    Savage, David W.
Rough set based 1-v-1 and 1-v-r approaches to support vector machine multi-classification
  • DOI:
    10.1016/j.ins.2007.03.028
  • Published:
    2007-09-15
  • Journal:
  • Impact factor:
    8.1
  • Authors:
    Lingras, Pawan;Butz, Cory
  • Corresponding author:
    Butz, Cory
Rough Cluster Quality Index Based on Decision Theory
Granular meta-clustering based on hierarchical, network, and temporal connections
  • DOI:
    10.1007/s41066-015-0007-9
  • Published:
    2016-03-01
  • Journal:
  • Impact factor:
    5.5
  • Authors:
    Lingras, Pawan;Haider, Farhana;Triff, Matt
  • Corresponding author:
    Triff, Matt
Qualitative and quantitative combinations of crisp and rough clustering schemes using dominance relations

Other grants held by Lingras, Pawan

Generalized sequential data mining using enhanced object representations based on preliminary clustering profiles
  • Grant number:
    RGPIN-2018-05363
  • Fiscal year:
    2021
  • Funding amount:
    $24,800
  • Program:
    Discovery Grants Program - Individual
Generalized sequential data mining using enhanced object representations based on preliminary clustering profiles
  • Grant number:
    RGPIN-2018-05363
  • Fiscal year:
    2020
  • Funding amount:
    $24,800
  • Program:
    Discovery Grants Program - Individual
Generalized sequential data mining using enhanced object representations based on preliminary clustering profiles
  • Grant number:
    RGPIN-2018-05363
  • Fiscal year:
    2018
  • Funding amount:
    $24,800
  • Program:
    Discovery Grants Program - Individual
Medical Diagnosis using Raman Spectrographs and Machine Learning
  • Grant number:
    521157-2017
  • Fiscal year:
    2017
  • Funding amount:
    $24,800
  • Program:
    Engage Grants Program
Adaptive recognition of time series of images for warehouse inventory cataloging
  • Grant number:
    494282-2016
  • Fiscal year:
    2017
  • Funding amount:
    $24,800
  • Program:
    Collaborative Research and Development Grants
Adaptive recognition of time series of images for warehouse inventory cataloging
  • Grant number:
    494282-2016
  • Fiscal year:
    2016
  • Funding amount:
    $24,800
  • Program:
    Collaborative Research and Development Grants
Recursive and iterative clustering in granular hierarchical, network, and temporal datasets
  • Grant number:
    123746-2013
  • Fiscal year:
    2015
  • Funding amount:
    $24,800
  • Program:
    Discovery Grants Program - Individual
Updating server inventory database through image recognition
  • Grant number:
    485507-2015
  • Fiscal year:
    2015
  • Funding amount:
    $24,800
  • Program:
    Engage Grants Program
Recursive and iterative clustering in granular hierarchical, network, and temporal datasets
  • Grant number:
    123746-2013
  • Fiscal year:
    2014
  • Funding amount:
    $24,800
  • Program:
    Discovery Grants Program - Individual
Recursive and iterative clustering in granular hierarchical, network, and temporal datasets
  • Grant number:
    123746-2013
  • Fiscal year:
    2013
  • Funding amount:
    $24,800
  • Program:
    Discovery Grants Program - Individual

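Similar NSFC (National Natural Science Foundation of China) grants

Self-organizing modeling and optimal control of microbial fermentation processes
  • Grant number:
    60704036
  • Approval year:
    2007
  • Funding amount:
    CNY 210,000
  • Program:
    Young Scientists Fund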

Similar overseas grants

The Natural History of Overall Mortality with Diagnosed Symptomatic Gallstone Disease in the United States: A Sequential Mixed-methods Study Evaluating Emergency, Non-emergency, and No Cholecystectomy
  • Grant number:
    10664339
  • Fiscal year:
    2023
  • Funding amount:
    $24,800
  • Program:
P1: Sources and Mechanisms of Sequential Activity
  • Grant number:
    10705963
  • Fiscal year:
    2023
  • Funding amount:
    $24,800
  • Program:
Community Liaison and Recruitment Core (CLRC)
  • Grant number:
    10729793
  • Fiscal year:
    2023
  • Funding amount:
    $24,800
  • Program:
Pre-motor neural circuits enable versatile and sequential limb movements
  • Grant number:
    10721086
  • Fiscal year:
    2023
  • Funding amount:
    $24,800
  • Program:
Optimizing Telehealth-delivery of a Weight Loss Intervention in Older Adults with Multiple Chronic Conditions: A Sequential, Multiple Assignment, Randomized Trial
  • Grant number:
    10583917
  • Fiscal year:
    2023
  • Funding amount:
    $24,800
  • Program:
Precision Medicine in Alzheimer’s Disease: A SMART Trial of Adaptive Exercises and Their Mechanisms of Action Using AT(N) Biomarkers to Optimize Aerobic-Fitness Responses
  • Grant number:
    10581973
  • Fiscal year:
    2023
  • Funding amount:
    $24,800
  • Program:
Novel Combination Therapy for Treatment and Prevention of Pulmonary Lymphangioleiomyomatosis (LAM) and Tuberous Sclerosis Complex (TSC)
  • Grant number:
    10697901
  • Fiscal year:
    2023
  • Funding amount:
    $24,800
  • Program:
Phentermine/Topiramate in children, adolescents, and young adults with hypothalamic obesity: a pilot and feasibility study
  • Grant number:
    10734754
  • Fiscal year:
    2023
  • Funding amount:
    $24,800
  • Program:
Bottom-Up, Top-Down, and Local Interactions in the Generation and Consolidation of Cortical Representations of Sequential Experience
  • Grant number:
    10658227
  • Fiscal year:
    2023
  • Funding amount:
    $24,800
  • Program:
Sequential Modeling for Prediction of Periodontal Diseases: an intra-Collaborative Practice-based Research study (ICPRS)
  • Grant number:
    10755010
  • Fiscal year:
    2023
  • Funding amount:
    $24,800
  • Program: