Diagnostics for Structured Data and Quality Improvement

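Basic Information

  • Award number:
    9803622
  • Principal investigator:
    Douglas M. Hawkins
  • Amount:
    $92,100
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    1998
  • Funding country:
    United States
  • Period:
    1998-08-15 to 2002-07-31
  • Project status:
    Completed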

Project Abstract

9803622
Douglas M. Hawkins

Structured data sets are those in which the data are something other than independent, identically distributed scalar quantities. Examples are multiple regression data sets, multivariate data sets, and time series. Diagnostics involves identifying cases that depart from some baseline (usually Gaussian) model. This general framework covers finding outliers as a part of data analysis, and it is also the basis for statistical process control (SPC) methodologies. One thread of the present project extends the PI's previous work in outlier identification, particularly in situations where the outliers are numerous or badly placed. It has recently become apparent that methods that work well on textbook-sized problems are useless in data sets with even a few thousand cases in a few dozen dimensions. As such data sets are increasingly common, this has reopened, as a major emphasis of the present project, the whole question of workable approaches for finding outlying cases in large data sets. A somewhat distinct problem is the detection of persistent changes in time-ordered scalar and multivariate data. This is the problem addressed by change-point, exponentially weighted moving average, and cumulative sum methodologies. The program of work includes a major effort in this area also. The union of the two problem areas leads to the design of statistical process control methodologies that are resistant to isolated outliers.

In large databases it is impossible to verify the correctness or internal consistency of the entries using current methodology. Methods that work well on small data sets are computationally unthinkable for data sets in the megabyte range and beyond, leaving the quality of information in databases hostage to undetected errors. This project is developing methods to identify "outliers" (atypical entries in large databases) with an acceptable, though still large, amount of computational effort. More processor power alone will not solve the problem, but more powerful processors combined with the improved algorithms developed in this program of work may do so. The problem is inherently amenable to distributed processing: previous work showed how outlier identification could be sped up by using an array of central processors.

Another thread of the work is cumulative sum (cusum) charting, a tool in the statistical process control (SPC) family. The classic Shewhart Xbar and R control charts are poorly suited to detecting small but persistent shifts. Such shifts are found and diagnosed rapidly with cusums. Used in conjunction with Shewhart charts, cusums can diagnose manufacturing problems, leading to substantial quality improvement. Cusums are also effective in many other monitoring situations, from online medical monitoring to detecting plumes of pollution in air or water. Groundwater monitoring around a landfill, for example, aims at exactly this problem of detecting an increased level of pollutants against a highly variable background. Cusums are already recognized as a powerful tool for the detection and diagnosis of leakages, and their extension to handle non-detect chemical data is important in extending their applicability to pollutants, such as heavy metals, that are harmful at low concentrations.
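
The contrast the abstract draws between Shewhart and cusum charts can be illustrated with a minimal sketch. It is not the project's own method: it uses the standard two-sided tabular cusum and assumes a known in-control mean `mu0` and standard deviation `sigma`. The function names, the toy series, and the textbook settings `k = 0.5` and `h = 5` (both in units of sigma) are assumptions made for illustration.

```python
def shewhart_alarm(xs, mu0, sigma, limit=3.0):
    """Index of the first point outside mu0 +/- limit*sigma, or None."""
    for i, x in enumerate(xs):
        if abs(x - mu0) > limit * sigma:
            return i
    return None


def cusum_alarm(xs, mu0, sigma, k=0.5, h=5.0):
    """Two-sided tabular cusum on standardized data.

    k is the reference value (allowance) and h the decision interval,
    both expressed in units of sigma. Returns the index of the first
    alarm, or None if neither one-sided sum ever exceeds h.
    """
    upper = 0.0  # accumulates evidence of an upward shift
    lower = 0.0  # accumulates evidence of a downward shift
    for i, x in enumerate(xs):
        z = (x - mu0) / sigma
        upper = max(0.0, upper + z - k)
        lower = max(0.0, lower - z - k)
        if upper > h or lower > h:
            return i
    return None


# Deterministic toy series (hypothetical data): 20 in-control points
# around 0, followed by 20 points whose mean has shifted up by 1 sigma.
in_control = [0.5, -0.5] * 10
shifted = [1.5, 0.5] * 10
series = in_control + shifted

print("Shewhart alarm index:", shewhart_alarm(series, 0.0, 1.0))
print("Cusum alarm index:", cusum_alarm(series, 0.0, 1.0))
```

On this toy series the 3-sigma Shewhart rule never signals, because no single point leaves the control band, while the cusum signals at index 30, the eleventh observation after the shift begins at index 20. Real data would need estimated parameters and tuned values of `k` and `h`.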

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)


Other publications by Douglas Hawkins

A joint international consensus statement for measuring quality of survival for patients with childhood cancer
  • DOI:
    10.1038/s41591-023-02339-y
  • Publication year:
    2023
  • Journal:
  • Impact factor:
    82.9
  • Authors:
    Rebecca J. van Kalsbeek; M. Hudson; R. L. Mulder; M. Ehrhardt; D. Green; D. Mulrooney; Jessica Hakkert; J. den Hartogh; A. Nijenhuis; H. V. van Santen; A. S. Schouten; Harm van Tinteren; L. Verbruggen; H. Conklin; L. Jacola; R. Webster; M. Partanen; W. Kollen; M. Grootenhuis; R. Pieters; L. Kremer; Rebecca J. van Kalsbeek; J. den Hartogh; H. V. van Santen; Harm van Tinteren; F. Aarsen; Madeleine Adams; Traci Adams; Chantal van den Akker; Roland Amman; Shekinah J Andrews; Greg Armstrong; Andishe Atterbaschi; Amedeo A Azizi; K. van Baarsen; Simon Bailey; Justin Baker; Lisa Bakker; Laura R. Beek; Peter Bekkering; Janneke van den Bergen; Esther M. M. van den Bergh; M. Bierings; Michael Bishop; G. Bisogno; John Boatner; Saskia Boerboom; Judith de Bont; F. Boop; C. van den Bos; Eric Bouffet; Rick Brandsma; Ida Bremer Ophorst; Bernadette Brennan; Rachel C. Brennan; D. Bresters; Sippy ten Brink; L. Brugières; Birgit Burkhardt; Gabriele Calaminus; F. Calkoen; Kristin E. Canavera; Leeann Carmichael; Sharon M Castellino; M. Cepelova; W. Chemaitilly; Julia Chisholm; Karen Clark; Debbie Crom; Amanda Curry; Brian M. DeFeo; Jennifer van Dijk; Stephanie B. Dixon; Jeffrey Dome; Jean Donadieu; Babet L Drenth; Carlo Dufour; Adam Esbenshade; G. Escherich; T. Fay; C. Faure; Andrea Ferrari; J. Flerlage; Kayla Foster; Lindsay Frazier; Wayne Furman; Carlos Galindo; Hoong; Jessica A. Gartrell; James I. Geller; C. Gidding; Jan Godzinsky; B. Goemans; R. Gorlick; Rinske Graafland; Norbert Graf; M. van Grotel; Marjolein ter Haar; V. de Haas; M. Hagleitner; Karen Hale; Chris Halsey; Darren R Hargrave; J. Harman; Henrik Hasle; R. Haupt; L. Haveman; Douglas Hawkins; L. van der Heijden; Katja M. J. Heitink; M. V. D. van den Heuvel; N. Hijiya; L. Hjorth; B. Hoeben; Renske Houben; E. Hoving; C. Hulsker; Antoinette Jaspers; Liza Johnson; Niki Jurbergs; L. Kahalley; Seth E. Karol; G. Kaspers; Erica Kaye; Anne Kazak; Rachèl Kemps; T. Kepák; Raja Khan; P. Klimo; R. Knops; Andy Kolb; Rianne Koopman; K. Kraal; C. Kramm; Matthew T Krasin; P. Lähteenmäki; Judith Landman; J. Lavecchia; J. Lemiere; Angelia Lenschau; Charlotte Ligthart; Raphaële R. L. van Litsenburg; Jan Loeffen; Mignon Loh; John Lucas; J. van der Lugt; Peggy Lüttich; Renee Madden; Arshia Madni; John Maduro; Sanne van der Mark; Armanda Markesteijn; Christine Mauz; Annelies Mavinkurve; L. Meijer; T. Merchant; H. Merks; Bill Meyer; F. Meyer; Paul A. Meyers; Rebecka Meyers; Erna M. C. Michiels; M. Minkov; B. De Moerloose; Kristen Molina; John Moppett; Kyle Morgan; Bruce Morland; Sabine Mueller; Hermann Müller; Roosmarijn Muller; M. Muraca; Sandra Murphy; V. Nanduri; Michael Neel; C. Niemeyer; Maureen O’Brien; D. Orbach; Jale Özyurt; H. H. van der Pal; V. Papadakis; Alberto S Pappo; Lauren Pardue; Kendra R. Parris; Annemarie Peek; Bob Phillips; S. Plasschaert; Marieka Portegies; Brian S. Potter; I. Qaddoumi; Debbie Redd; Lineke Rehorst; Stephen Roberts; J. Roganovic; Stefan Rutkowski; M. V. D. van de Sande; Victor Santana; Stephanie Saslawsky; Kim Sawyer; Katrin Scheinemann; G. Schleiermacher; Kjeld Schmiegelow; R. Schoot; Fiona Schulte; A. Sehested; Inge Sieswerda; Rod Skinner; Relinde Slooff; Donna Sluijs; I. van der Sluis; Daniel Smith; Holly Spraker; Sheri L. Spunt; Mirjam Sulkers; T. Sweeney; Mary Taj; Clifford Takemoto; Aimee C. Talleur; Hannah Taylor; Chantal Tersteeg; Sheila Terwisscha; Sophie Thomas; Brigitte W. Thomassen; C. Tinkle; Rebecca Tippett; W. Tissing; I. Tonning; Anke Top; Erin Turner; Santhosh Upadhyaya; A. Uyttebroeck; Güler Uyuk; Kees P. van de Ven; B. Versluys; Emma Verwaaijen; Saphira Visser; Jochem van Vliet; E. de Vos; A. D. de Vries; D. V. van Vuurden; Claire Wakefield; K. Warren; Chantal van Wegen Peelen; Aaron Weiss; Marianne D van de Wetering; Jeremy Whelan; Romy Wichink; L. Wiener; Marc H.W.A. Wijnen; V. Willard; Terry Wilson; Jennifer Windham; Laura de Winter; O. Witt; M. Wlodarski; Kim Wouters; Corina Wouterse; Kasey Wyrick; L. Zaletel; Alia Zaidi; Jonne van Zanten; J. Zsiros; Lisa Zwiers
  • Corresponding author:
    Lisa Zwiers

Other grants held by Douglas Hawkins

Diagnostics for structured data and quality improvement
  • Award number:
    0306304
  • Fiscal year:
    2003
  • Amount:
    $92,100
  • Project type:
    Standard Grant
Mathematical Sciences: Diagnostics in Structured Data and Quality Improvement
  • Award number:
    9505440
  • Fiscal year:
    1995
  • Amount:
    $92,100
  • Project type:
    Standard Grant
Mathematical Sciences: Diagnostics and Graphics for Structured Data
  • Award number:
    9208819
  • Fiscal year:
    1992
  • Amount:
    $92,100
  • Project type:
    Standard Grant
Mathematical Sciences: Location of Outliers in Structured Data
  • Award number:
    9010983
  • Fiscal year:
    1990
  • Amount:
    $92,100
  • Project type:
    Continuing Grant
Mathematical Sciences: Location of Outliers in Structured Data
  • Award number:
    8902571
  • Fiscal year:
    1989
  • Amount:
    $92,100
  • Project type:
    Standard Grant

Similar overseas grants

Computing over Compressed Graph-Structured Data
  • Award number:
    EP/X039447/1
  • Fiscal year:
    2024
  • Amount:
    $92,100
  • Project type:
    Research Grant
CAREER: Learning from Data on Structured Complexes: Products, Bundles, and Limits
  • Award number:
    2340481
  • Fiscal year:
    2024
  • Amount:
    $92,100
  • Project type:
    Continuing Grant
Evaluating and Optimizing Care for Opioid Use Disorder using a Structured Data-Science Approach
  • Award number:
    10571088
  • Fiscal year:
    2023
  • Amount:
    $92,100
  • Project type:
CAREER: Resource Efficient Systems for Machine Learning on Structured Data
  • Award number:
    2237306
  • Fiscal year:
    2023
  • Amount:
    $92,100
  • Project type:
    Continuing Grant
On Optimal Transport-based Statistical Measures for Graph Structured Data and Applications
  • Award number:
    23K16939
  • Fiscal year:
    2023
  • Amount:
    $92,100
  • Project type:
    Grant-in-Aid for Early-Career Scientists
Clinical foundation model for structured clinical data
  • Award number:
    10639397
  • Fiscal year:
    2023
  • Amount:
    $92,100
  • Project type:
Machine learning on biological structured data
  • Award number:
    559300-2021
  • Fiscal year:
    2022
  • Amount:
    $92,100
  • Project type:
    Postgraduate Scholarships - Doctoral
Machine learning for graph-structured data: Understanding complex biological systems
  • Award number:
    RGPIN-2020-05341
  • Fiscal year:
    2022
  • Amount:
    $92,100
  • Project type:
    Discovery Grants Program - Individual
Developing a community to mitigate caregiver burden in inherited retinal diseases using AI labelled structured and unstructured data
  • Award number:
    10046943
  • Fiscal year:
    2022
  • Amount:
    $92,100
  • Project type:
    Grant for R&D
Understanding and Improving Deep Learning for Structured Data
  • Award number:
    RGPIN-2022-04636
  • Fiscal year:
    2022
  • Amount:
    $92,100
  • Project type:
    Discovery Grants Program - Individual