L2M NSERC - Intelligent system for classifying imbalanced data based on three-way Bayesian confirmation
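Basic information
- Grant number: 580671-2023
- Principal investigator:
- Amount: $14,600
- Host institution:
- Host institution country: Canada
- Program type: Idea to Innovation
- Fiscal year: 2022
- Funding country: Canada
- Period: 2022-01-01 to 2023-12-31
- Project status: Completed
- Source:
- Keywords: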
Project abstract
Imbalanced data refers to a data set in which the data points are not evenly distributed across classes: majority classes account for a high proportion of the data points, and minority classes account for the small remainder. Imbalanced data is prevalent in practice, especially when the goal is to detect something abnormal, such as fraudulent transactions, spam emails, or certain diseases. Standard classification models may not work well with imbalanced data. For example, consider an imbalanced data set in which a majority class accounts for 90% of the data points and a minority class for 10%. A standard classification model will likely learn that it can simply predict the majority class unconditionally and be correct in 90% of cases. Such a model does not explain the inherent reasons for its classifications. By contrast, the minority class is hard to predict correctly in most cases. Yet the minority class usually consists of the fraudulent transactions, spam emails, and disease cases that we actually want to learn about and detect, and it matters more than the majority class.
We apply Bayesian confirmation theory to build an intelligent system that learns effective features and rules for classifying imbalanced data. The importance of a feature, or a group of features, is evaluated by comparing the prior probability of a class, before the feature values are observed, with the posterior probability after they are observed. The change shows the real impact of the features on our classification decisions and helps identify the effective ones. For example, in detecting a certain disease, we may have a group of features measured through different tests. The values of some features may significantly increase or decrease the probability of the disease. Accordingly, the patient may be asked to take the corresponding tests to reach an effective diagnosis.
Compared with standard classification models, we are able to learn the features and rules that are actually effective for detecting both the majority and minority classes, which addresses the issues described above in analyzing imbalanced data.
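The prior-versus-posterior comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the project's system. The toy records, the feature name `test_a`, the function names, and the 0.1 threshold used to split evidence into three regions are all assumptions made for illustration. The difference `P(class | evidence) - P(class)` is used here as one common confirmation measure, and the abstract does not say which measure the project uses.

```python
# Toy data (hypothetical): 100 records, 90% "normal" and 10% "disease",
# each with the result of one binary test.
records = (
    [{"test_a": "positive", "label": "disease"}] * 8
    + [{"test_a": "negative", "label": "disease"}] * 2
    + [{"test_a": "positive", "label": "normal"}] * 9
    + [{"test_a": "negative", "label": "normal"}] * 81
)


def majority_baseline(records, minority):
    """Accuracy and minority recall of always predicting the majority class."""
    labels = [r["label"] for r in records]
    majority = max(set(labels), key=labels.count)
    predictions = [majority] * len(labels)
    accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
    hits = sum(p == y == minority for p, y in zip(predictions, labels))
    recall = hits / labels.count(minority)
    return accuracy, recall


def confirmation(records, feature, value, target):
    """Return (prior, posterior, posterior - prior) for class `target`
    given the evidence `feature == value`."""
    prior = sum(r["label"] == target for r in records) / len(records)
    matching = [r for r in records if r[feature] == value]
    if not matching:
        return prior, prior, 0.0  # evidence never observed: no change
    posterior = sum(r["label"] == target for r in matching) / len(matching)
    return prior, posterior, posterior - prior


def region(diff, threshold=0.1):
    """Three-way split of evidence; the threshold is an assumed value."""
    if diff > threshold:
        return "confirms"
    if diff < -threshold:
        return "disconfirms"
    return "neutral"


accuracy, recall = majority_baseline(records, "disease")
print(f"majority baseline: accuracy={accuracy:.2f}, disease recall={recall:.2f}")

for value in ("positive", "negative"):
    prior, posterior, diff = confirmation(records, "test_a", value, "disease")
    print(f"test_a={value}: prior={prior:.3f}, posterior={posterior:.3f}, "
          f"change={diff:+.3f} -> {region(diff)}")
```

On this toy data the always-majority baseline reaches 90% accuracy while detecting no disease cases. A positive `test_a` raises the disease probability from 0.10 to about 0.47, so it falls in the confirming region. A negative result lowers it only to about 0.02, which the assumed threshold treats as neutral.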
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Yao, Yiyu
Similar overseas grants
NSERC COHESA: Computing Hardware for Emerging Intelligent Sensory Applications
- Grant number: 485577-2015
- Fiscal year: 2022
- Amount: $14,600
- Program type: Strategic Network Grants Program
NSERC/ Industrial Research Chair in Intelligent Transportation Systems
- Grant number: 548709-2018
- Fiscal year: 2022
- Amount: $14,600
- Program type: Industrial Research Chairs
NSERC/SEASPAN Industrial Research Chairs in intelligent and green marine vessels (IGMVs): Advanced Tools and Techniques for Multiphysics Prediction and Design Optimization
- Grant number: 550069-2019
- Fiscal year: 2021
- Amount: $14,600
- Program type: Industrial Research Chairs
NSERC/ Industrial Research Chair in Intelligent Transportation Systems
- Grant number: 548709-2018
- Fiscal year: 2021
- Amount: $14,600
- Program type: Industrial Research Chairs
NSERC COHESA: Computing Hardware for Emerging Intelligent Sensory Applications
- Grant number: 485577-2015
- Fiscal year: 2021
- Amount: $14,600
- Program type: Strategic Network Grants Program
NSERC/SEASPAN Industrial Research Chairs in intelligent and green marine vessels (IGMVs): Advanced Tools and Techniques for Multiphysics Prediction and Design Optimization
- Grant number: 550071-2019
- Fiscal year: 2021
- Amount: $14,600
- Program type: Industrial Research Chairs
NSERC I2I Phase Ia: An Intelligent Framework for Social Engineering Cyber Security Training
- Grant number: 567660-2021
- Fiscal year: 2021
- Amount: $14,600
- Program type: Idea to Innovation
NSERC COHESA: Computing Hardware for Emerging Intelligent Sensory Applications
- Grant number: 485577-2015
- Fiscal year: 2020
- Amount: $14,600
- Program type: Strategic Network Grants Program
NSERC/C-COM Industrial Research Chair in Intelligent Antenna and Radio Systems for Next Generation Millimeter-Wave Mobile Communications
- Grant number: 320316-2016
- Fiscal year: 2020
- Amount: $14,600
- Program type: Industrial Research Chairs
NSERC/SEASPAN Industrial Research Chairs in intelligent and green marine vessels (IGMVs): Advanced Tools and Techniques for Multiphysics Prediction and Design Optimization
- Grant number: 550069-2019
- Fiscal year: 2020
- Amount: $14,600
- Program type: Industrial Research Chairs