Nonparametric Outlyingness and Descriptive Measures in Multivariate and General Data Settings
Basic information
- Award number: 0805786
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2008
- Funding country: United States
- Period: 2008-08-01 to 2011-08-31
- Status: Completed
Project abstract
This project extends the foundations of two closely interacting areas of core statistical science: nonparametric outlier identification and robust descriptive measures. It emphasizes multivariate and more complex data types. Data points that lie apart from the main body of the data ("outliers") can adversely affect statistical analyses unless they are identified and taken into account. This concern is arising in new contexts that call for updated and broadened formulations of current methods. Multivariate modeling with heavy-tailed data, and with skewness and kurtosis descriptive measures in addition to location and dispersion, involves increased concern with outliers. Diverse new data types (functional, shape, image, set, symbolic, sensor, stream, tree, graph, etc.), now treated by sophisticated but ad hoc data mining approaches from computer science, need more systematic treatment. Shape-fitting problems in computational geometry impose new forms of outlier issues. The study develops new general foundational approaches to outlier detection, eliminates reliance on algorithms that only handle outliers without actually identifying them in the input space, eliminates undue reliance upon elliptical outlyingness contours, and strengthens the accommodation of heavy-tailed data. The overall goals are to establish extended conceptual statistical foundations for outlier detection and to develop new structures for robust descriptive measures of location, dispersion, skewness, kurtosis, etc., with the aim of broad application across general data settings.
With advancing computational resources, the scope of statistical data analysis and modeling is widening to accommodate pressing new arenas of application. Data in all areas of science and engineering has complex multidimensional structure, typically with large sample sizes and involving curves, images, text, and other objects, often within a stream or network structure. This is generating major new problems in the detection and handling of "anomalous" data points ("outliers"). Which cases stand apart? How do the "unusual" cases affect statistical analyses of the full data set? What computational steps efficiently find the outliers when the data is massive and involves many variables? What general principles apply across diverse new situations such as fraud detection, intrusion detection, network analysis, and data mining? How should "outlier" be defined relative to a fusion of several related data sets, for example image, text, and sensor data, as might arise in Homeland Security? This study addresses these basic practical questions with the aim of developing new methodological approaches soundly based upon established statistical principles.
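As an illustration of what an outlyingness measure without elliptical contours can look like, the sketch below computes the standard spatial (sign-based) outlyingness: the norm of the average unit vector pointing from each data point toward a query point. It is a minimal example under stated assumptions, not the method developed in this project. The function name `spatial_outlyingness`, the simulated Gaussian sample, and the two query points are all hypothetical choices made for the example.

```python
import numpy as np

def spatial_outlyingness(points, data):
    """Spatial outlyingness of each query point relative to a data cloud.

    For a query x, this is the Euclidean norm of the average of the unit
    vectors (x - X_i) / ||x - X_i|| over the sample; values lie in [0, 1].
    Spatial depth is one minus this quantity.
    """
    points = np.atleast_2d(np.asarray(points, dtype=float))  # (m, d)
    data = np.atleast_2d(np.asarray(data, dtype=float))      # (n, d)
    diffs = points[:, None, :] - data[None, :, :]            # (m, n, d)
    norms = np.linalg.norm(diffs, axis=2, keepdims=True)     # (m, n, 1)
    # Sample points that coincide with the query contribute a zero vector.
    units = np.divide(diffs, norms, out=np.zeros_like(diffs), where=norms > 0)
    return np.linalg.norm(units.mean(axis=1), axis=1)

# Hypothetical data: 200 bivariate standard normal points (assumption).
rng = np.random.default_rng(0)
sample = rng.standard_normal((200, 2))

# One central query point and one far from the main body of the data.
queries = np.array([[0.0, 0.0], [8.0, 8.0]])
scores = spatial_outlyingness(queries, sample)
print(scores)
```

Values near 0 indicate a central point and values near 1 a point well outside the data cloud. Because only directions to the sample points enter the calculation, no covariance matrix is estimated and no elliptical shape is imposed on the contours.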
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Robert Serfling
On Liu's simplicial depth and Randles' interdirections
- DOI: 10.1016/j.csda.2016.02.002
- Published: 2016-07-01
- Authors: Robert Serfling; Yunfei Wang
- Corresponding author: Yunfei Wang
Depth functions in nonparametric multivariate inference
- DOI: 10.1090/dimacs/072/01
- Published: 2003
- Impact factor: 0
- Authors: Robert Serfling
- Corresponding author: Robert Serfling
On masking and swamping robustness of leading nonparametric outlier identifiers for univariate data
- DOI: 10.1016/j.jspi.2015.02.002
- Published: 2015-07-01
- Authors: Shanshan Wang; Robert Serfling
- Corresponding author: Robert Serfling
Depth-based nonparametric description of functional data, with emphasis on use of spatial depth
- DOI: 10.1016/j.csda.2016.07.007
- Published: 2017-01-01
- Authors: Robert Serfling; Uditha Wijesuriya
- Corresponding author: Uditha Wijesuriya
Other grants held by Robert Serfling
Multivariate Depth and Quantile Functions: Foundations and Applications
- Award number: 1106691
- Fiscal year: 2011
- Project type: Continuing Grant
Nonparametric and Robust Multivariate Analysis via Quantile Functions
- Award number: 0103698
- Fiscal year: 2001
- Project type: Continuing Grant
Multidimensional Depth Functions, Multidimensional Generalized L-Statistics, and Related Procedures
- Award number: 9705209
- Fiscal year: 1997
- Project type: Standard Grant