Optimizing Classification Models to Application-Specific Performance Metrics

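Basic Information

Award number:
0412930
Principal investigator:
Rich Caruana
Amount:
--
Host institution:
Cornell University
Host institution country:
United States
Project type:
Continuing Grant
Fiscal year:
2004
Funding country:
United States
Start and end dates:
2004-08-15 to 2008-07-31
Project status:
Completed

Source: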
https://www.nsf.gov/awardsearch/showAward?AWD_ID=0412930&HistoricalAwards=false
Keywords:
Optimizing Classification Models Application Specific

Project Abstract

Many different criteria can be used to train and evaluate classifiers. Different criteria are appropriate in different settings, and learning methods that perform well on one criterion may not perform well on others. If a user must use a specific learning algorithm or model class, but needs to optimize to a performance measure for which that model class is not designed, they cannot do so. For example, neural nets are easy to train for continuous measures such as squared error and cross-entropy, but are difficult to train for discontinuous measures such as accuracy, Lift, and ROC area. SVMs are designed to optimize accuracy, but not squared error or cross-entropy. Decision trees typically optimize information-theoretic measures or accuracy, but are not designed to maximize ROC area or to minimize squared error. Moreover, for some performance metrics such as Lift we do not yet have any effective learning procedures. We are developing general-purpose cross-optimization methods for training learning algorithms to any performance measure. More specifically, we are developing meta-algorithms for optimizing the performance of different types of classifiers to metrics other than the one for which they were designed. We are using two meta-learning methods to accomplish this. The first is an ensemble learning method that can optimize the performance of an ensemble of base-level classifiers to the user's criterion. The second method is a model adaptation procedure that starts with a model optimized to one metric, and then iteratively transforms it into a model that is near-optimal with respect to a different user-specified criterion. Both methods are designed to be compatible with most existing supervised learning methods. Our work has the potential to substantially improve classifiers by dealing up front with the performance requirements of real-world applications.
The work will have broad impact by giving machine learning users the flexibility to apply the performance metric that best fits their scientific, governmental, or commercial needs. Our plans for outreach include distributing software to aid classifier evaluation on multiple metrics, distributing software for ensemble selection, creating a multiple-metric competition at a conference such as KDD, organizing workshops on multi-metric learning, and building educational modules that demonstrate the importance of performance metrics in application-specific classifier design.
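
The first method in the abstract, an ensemble tuned to the user's criterion, can be illustrated with a small sketch. This is not the project's software: it assumes a greedy forward selection with replacement over held-out predictions, and every name in it (`accuracy`, `roc_area`, `greedy_ensemble_selection`, the toy models) is hypothetical. It only shows that the same pool of base models can be combined differently depending on the metric supplied.

```python
from typing import Callable, List, Sequence, Tuple

# A metric takes true 0/1 labels and predicted scores; higher is better.
Metric = Callable[[Sequence[int], Sequence[float]], float]


def accuracy(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of examples whose score, thresholded at 0.5, matches the label."""
    hits = sum((s >= 0.5) == bool(y) for y, s in zip(y_true, scores))
    return hits / len(y_true)


def roc_area(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def greedy_ensemble_selection(
    preds: Sequence[Sequence[float]],
    y_true: Sequence[int],
    metric: Metric,
    max_rounds: int = 10,
) -> Tuple[List[int], float]:
    """Repeatedly add the model (repeats allowed) whose inclusion in the
    averaged prediction most improves `metric`; stop when nothing improves."""
    total = [0.0] * len(y_true)   # running sum of the chosen models' scores
    chosen: List[int] = []        # indices of the chosen models, in order
    best = float("-inf")
    for _ in range(max_rounds):
        round_best, round_idx = float("-inf"), -1
        for i, p in enumerate(preds):
            avg = [(t + v) / (len(chosen) + 1) for t, v in zip(total, p)]
            score = metric(y_true, avg)
            if score > round_best:
                round_best, round_idx = score, i
        if round_best <= best:    # no candidate improves the ensemble
            break
        best = round_best
        chosen.append(round_idx)
        total = [t + v for t, v in zip(total, preds[round_idx])]
    return chosen, best


# Toy held-out labels and scores from three hypothetical base models.
y = [1, 1, 0, 0]
models = [
    [0.9, 0.4, 0.6, 0.1],  # ranks well, but poorly calibrated at 0.5
    [0.6, 0.7, 0.2, 0.8],  # best single-model accuracy
    [0.2, 0.3, 0.4, 0.5],  # weak on both metrics
]
print(greedy_ensemble_selection(models, y, accuracy))  # ([1, 0], 1.0)
print(greedy_ensemble_selection(models, y, roc_area))  # ([0, 1], 1.0)
```

On this toy data the two metrics start from different models before arriving at the same pair. In practice the scores passed in would need to come from data not used to train the base models, otherwise the selection step can overfit the chosen metric.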

Project Outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Rich Caruana

1017 Insights into severe maternal morbidity in the NTSV population

DOI:
10.1016/j.ajog.2020.12.1042
Publication date:
2021-02-01
Journal:
Conference abstract
Impact factor:
Authors:
Benjamin J. Lengerich;Rich Caruana;William B. Weeks;Ian Painter;Sydney Spencer;Kristin Sitcov;Colleen Daly;Vivienne Souter
Corresponding author:
Vivienne Souter

102 - UNDERSTANDING AGE AS A RISK FACTOR FOR COMPLICATIONS AFTER TOTAL KNEE ARTHROPLASTY: WHAT CAN WE LEARN FROM MACHINE LEARNING?

DOI:
10.1016/j.joca.2024.02.113
Publication date:
2024-04-01
Journal:
Conference abstract
Impact factor:
Authors:
Bella Mehta;Yi Yiyuan;Chloe Heiting;Kaylee Ho;Susan Goodman;Peter Sculco;Fei Wang;Rich Caruana;Peter Cram;Said Ibrahim
Corresponding author:
Said Ibrahim

46 Length of labor and severe maternal morbidity in the NTSV population

DOI:
10.1016/j.ajog.2020.12.011
Publication date:
2021-02-01
Journal:
Conference abstract
Impact factor:
Authors:
Benjamin J. Lengerich;Rich Caruana;William B. Weeks;Ian Painter;Sydney Spencer;Kristin Sitcov;Colleen Daly;Vivienne Souter
Corresponding author:
Vivienne Souter

Predicting severe maternal morbidity at admission for delivery using intelligible machine learning

DOI:
10.1016/j.ajog.2022.11.704
Publication date:
2023-01-01
Journal:
Conference abstract
Impact factor:
Authors:
Zifei Xu;Tomas M. Bosschieter;Hui Lan;Benjamin Lengerich;Harsha Nori;Kristin Sitcov;Ian Painter;Vivienne Souter;Rich Caruana
Corresponding author:
Rich Caruana