III: Medium: Collaborative Research: Algorithms and Cyberinfrastructure for High-Precision Automated Quality Control of Hydro-Meteo Sensor Networks
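Basic information
- Award number: 1513512
- Principal investigator:
- Amount: $363,500
- Host institution:
- Host institution country: United States
- Project category: Continuing Grant
- Fiscal year: 2015
- Funding country: United States
- Project period: 2015-09-01 to 2019-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract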
Advances in sensor technology are greatly expanding the range of quantities that can be measured while simultaneously reducing the cost. However, deployed sensors drift out of calibration and fail, so every sensor network requires quality control (QC) procedures to promptly detect these failures. Existing QC methods rely on human experts to carefully examine the data, which means that when the number of sensors in a network doubles, the number of experts must double too. This project will develop algorithms and software to increase the level of automation in sensor QC so that a smaller number of experts can manage a much larger network of sensors. The methods will be tested on weather data from Oklahoma (the Oklahoma Mesonet), Oregon (the Andrews Long-Term Ecological Network site), the US (the Earth Networks "WeatherBug" network), and sub-Saharan Africa (the TAHMO project), and if the methods are found to work well, they will be deployed in these networks and at the CUAHSI Water Data Center. Accurate weather data could significantly increase the productivity of farms and improve food security, particularly in Africa.
The project will develop an open-source standards-compliant system, SENSOR-DX, that implements automated data QC. Existing probabilistic QC methods assume that correct sensor readings are jointly Gaussian and that readings from broken sensors obey a uniform distribution. These assumptions lead to many QC mistakes. This project will develop a new approach in which novel nonparametric anomaly detection algorithms analyze the sensor data. Correct sensor readings have low anomaly scores, while broken sensor readings have high scores; both follow parametric distributions. Probabilistic methods can therefore model the distribution of the resulting anomaly scores instead of the joint distribution of the original sensor readings and infer (probabilistically) whether each sensor is working correctly.
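As a rough illustration of the last point, the sketch below applies Bayes' rule to a single anomaly score. It is not the SENSOR-DX implementation: the function names, the choice of Gaussian score distributions, the prior failure rate, and all numeric parameters are assumptions made only for this example.

```python
import math

def normal_logpdf(x, mean, std):
    """Log density of a Gaussian; stands in for any parametric score model."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))

def posterior_broken(score, prior_broken=0.05,
                     working=(0.2, 0.1), broken=(0.8, 0.15)):
    """P(sensor is broken | anomaly score) via Bayes' rule.

    `working` and `broken` are hypothetical (mean, std) pairs describing
    the anomaly-score distribution of each sensor state.
    """
    log_odds = (math.log(prior_broken / (1.0 - prior_broken))
                + normal_logpdf(score, *broken)
                - normal_logpdf(score, *working))
    # Numerically stable logistic function of the log odds.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)

if __name__ == "__main__":
    for s in (0.2, 0.5, 0.9):
        print(f"score={s:.1f}  P(broken)={posterior_broken(s):.4f}")
```

Working in log odds keeps the calculation stable for extreme scores, where the two densities would otherwise underflow to zero.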
To enhance the fault-detection capability of the anomaly detection algorithms, the raw sensor data will be detrended and assembled into multiple views that highlight various correlations among sensor values. The project will develop a novel View-Anomaly-Diagnosis (VAD) framework in which anomaly detection algorithms are applied to the tuples in each view, and then the anomaly scores are combined via a probabilistic diagnostic model to infer which sensors are broken and which are functioning correctly. The project will study how good the detrending models need to be in order to enhance the accuracy of anomaly detection. The new anomaly detection algorithms are based on a new anomaly detection principle: "anomaly detection by overfitting". Existing methods fit a statistical model to "normal" behavior and then identify data points that do not fit well ("are underfit") and mark them as anomalies. The new principle measures how easy it is to "overfit" a statistical model that separates candidate anomalies from the rest of the data. The project will develop new algorithms based on this principle and understand how they relate to existing methods of anomaly detection by underfitting.
The VAD framework will be implemented in the SENSOR-DX system: a series of Kepler workflows that provide support for connecting a new sensor network, training the detrending and anomaly detection models, performing real-time anomaly detection, and repairing bad sensor readings using predictive models. SENSOR-DX will also support semantic matching of new sensor data streams by extending the EnvThs controlled vocabulary thesaurus. For further information see the project web site at http://tahmo.org/sensor-dx
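The idea of scoring a point by how easily it can be separated from the rest of the data can be illustrated with random axis-aligned splits, in the spirit of isolation-based detectors. This is a swapped-in, well-known technique used only to make the principle concrete; it is not claimed to be the project's algorithm, and the function names and the toy (temperature, humidity) tuples are hypothetical.

```python
import random

def isolation_depth(points, target, rng, max_depth=50):
    """Count random axis-aligned splits needed to isolate points[target]."""
    current = list(range(len(points)))
    dims = len(points[0])
    depth = 0
    while len(current) > 1 and depth < max_depth:
        d = rng.randrange(dims)
        values = [points[i][d] for i in current]
        lo, hi = min(values), max(values)
        depth += 1
        if lo == hi:
            continue  # this dimension cannot separate the remaining points
        cut = rng.uniform(lo, hi)
        side = points[target][d] < cut
        # Keep only the points that fall on the target's side of the cut.
        current = [i for i in current if (points[i][d] < cut) == side]
    return depth

def mean_isolation_depth(points, target, trials=200, seed=0):
    """Average depth over many random trials; lower means easier to isolate."""
    rng = random.Random(seed)
    total = sum(isolation_depth(points, target, rng) for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    # Hypothetical view: 20 consistent tuples plus one suspicious reading.
    view = [(20 + 0.1 * i, 50 + 0.2 * (i % 5)) for i in range(20)]
    view.append((45.0, 5.0))
    print("typical tuple:", mean_isolation_depth(view, 10))
    print("suspicious tuple:", mean_isolation_depth(view, 20))
```

A tuple that a handful of random cuts can separate from everything else is a strong anomaly candidate, while a tuple buried inside the bulk of the data needs many more cuts.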
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Michael Piasecki
EnvThs: a controlled vocabulary service application for environmental data
- DOI: 10.1007/s12145-014-0187-x
- Publication date: 2014-11-15
- Journal:
- Impact factor: 3.000
- Authors: Peng Ji; Michael Piasecki; Rachel Lovell
- Corresponding author: Rachel Lovell
Other grants by Michael Piasecki
Collaborative Research: CUAHSI/CLEANER Demonstration and Development of a Test-bed Digital Observatory for the Susquehanna River Basin and Chesapeake Bay
- Award number: 0609832
- Fiscal year: 2006
- Award amount: $363,500
- Project category: Standard Grant
Workshop on: Research Opportunities in Cyberengineering/Cyberinfrastructure. To be held April 22-23, 2004 at Drexel University in Philadelphia, PA.
- Award number: 0429002
- Fiscal year: 2004
- Award amount: $363,500
- Project category: Standard Grant
COLLABORATIVE RESEARCH: Development of Informatics Infrastructure for the Hydrologic Sciences
- Award number: 0412904
- Fiscal year: 2004
- Award amount: $363,500
- Project category: Continuing Grant
CLEANER: Collaborative Research: Cyberinfrastructure Needs for a Model Environmental Field Facility in Baltimore, Maryland as Part of an Engineering Analysis Network
- Award number: 0414204
- Fiscal year: 2004
- Award amount: $363,500
- Project category: Standard Grant
Similar overseas grants
III: Medium: Collaborative Research: From Open Data to Open Data Curation
- Award number: 2420691
- Fiscal year: 2024
- Award amount: $363,500
- Project category: Standard Grant
Collaborative Research: III: Medium: Designing AI Systems with Steerable Long-Term Dynamics
- Award number: 2312865
- Fiscal year: 2023
- Award amount: $363,500
- Project category: Standard Grant
Collaborative Research: III: MEDIUM: Responsible Design and Validation of Algorithmic Rankers
- Award number: 2312932
- Fiscal year: 2023
- Award amount: $363,500
- Project category: Standard Grant
Collaborative Research: III: Medium: Algorithms for scalable inference and phylodynamic analysis of tumor haplotypes using low-coverage single cell sequencing data
- Award number: 2415562
- Fiscal year: 2023
- Award amount: $363,500
- Project category: Standard Grant
III: Medium: Collaborative Research: Integrating Large-Scale Machine Learning and Edge Computing for Collaborative Autonomous Vehicles
- Award number: 2348169
- Fiscal year: 2023
- Award amount: $363,500
- Project category: Continuing Grant
Collaborative Research: III: Medium: VirtualLab: Integrating Deep Graph Learning and Causal Inference for Multi-Agent Dynamical Systems
- Award number: 2312501
- Fiscal year: 2023
- Award amount: $363,500
- Project category: Standard Grant
Collaborative Research: III: Medium: Knowledge discovery from highly heterogeneous, sparse and private data in biomedical informatics
- Award number: 2312862
- Fiscal year: 2023
- Award amount: $363,500
- Project category: Standard Grant
Collaborative Research: III: MEDIUM: Responsible Design and Validation of Algorithmic Rankers
- Award number: 2312930
- Fiscal year: 2023
- Award amount: $363,500
- Project category: Standard Grant
Collaborative Research: III: Medium: New Machine Learning Empowered Nanoinformatics System for Advancing Nanomaterial Design
- Award number: 2347592
- Fiscal year: 2023
- Award amount: $363,500
- Project category: Standard Grant
Collaborative Research: IIS: III: MEDIUM: Learning Protein-ish: Foundational Insight on Protein Language Models for Better Understanding, Democratized Access, and Discovery
- Award number: 2310113
- Fiscal year: 2023
- Award amount: $363,500
- Project category: Standard Grant