Reduction of Infinite Data Dimension via B Spline Smoothing
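Basic information
- Award number: 0706518
- Principal investigator:
- Amount: $221,500
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2007
- Funding country: United States
- Project period: 2007-06-01 to 2010-05-31
- Project status: Completed
- Source:
- Keywords:
Project abstract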
This research project develops B-spline smoothing methods for (1) dimension reduction in machine learning and (2) non- and semiparametric GARCH volatility models, with fast computation and explicit formulae. Asymptotic simultaneous confidence bands are provided for all nonparametric estimates. The proposal aims to develop the underlying theory as a crucial guide to practical implementation. For dimension reduction in machine learning, the focus is on the generalized additive model (GAM) and the single-index model (SIM), with dimensions tending to infinity. For dimensions from low to moderately high (400-D), the spline-backfitted kernel smoothing procedure for the additive model and the direct spline smoothing procedure for the SIM are theoretically reliable, intuitively appealing, and extremely fast to compute. The current project extends these procedures to GAM and SIM with dimension going to infinity, preserving the theoretical, intuitive and computational benefits. The investigator also studies B-spline smoothing algorithms for non- and semiparametric GARCH models, achieving the same asymptotics as kernel smoothing. As typical applications of GARCH models involve sample sizes from thousands to millions and an equally large number of lagged values, B-spline smoothing can compute in seconds what kernel smoothing would need days to compute. Thus the proposed methods satisfy both theoreticians and financial analysts.
In the age of information overload, researchers in nearly all areas of the biological, medical, physical and social sciences are routinely confronted with large data sets. With tens of thousands of characteristics, called variables or features, these large data sets are treasure troves of valuable scientific information. The methods developed by the investigator are powerful new tools for drawing such useful information out of large data sets. Typical examples of such data include, but are not limited to, environmental and global change studies, high-frequency financial data, state and federal demographic surveys, and federal biometric databases. Code written in the free software R is made publicly available for wide dissemination. Practitioners from industry and government can analyze their own large data sets with these user-friendly modules, in real time, with confidence and precision. A distinctive feature of the project is the active integration of cutting-edge research with the education and training of graduate students, especially those from underrepresented groups. This is consistent with the education goal of NSF and fulfills NSF's commitment to the principle of fostering diversity in science.
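
The abstract does not spell out the estimators, so the following is only a minimal sketch of the basic building block, least-squares regression on a B-spline basis, and not the investigator's R code. It assumes cubic splines, equally spaced interior knots on [0, 1], and simulated data; the names `bspline_basis`, `solve`, and `fit_bspline` are hypothetical helpers introduced here for illustration.

```python
import math
import random


def bspline_basis(x, knots, degree):
    """Evaluate every B-spline basis function at x (Cox-de Boor recursion)."""
    n_int = len(knots) - 1
    # Degree 0: indicators of the half-open knot intervals [t_i, t_{i+1}).
    b = [1.0 if knots[i] <= x < knots[i + 1] else 0.0 for i in range(n_int)]
    if x == knots[-1]:
        # Close the right end so the last nonempty interval contains x.
        last = max(i for i in range(n_int) if knots[i] < knots[i + 1])
        b[last] = 1.0
    for d in range(1, degree + 1):
        nxt = []
        for i in range(n_int - d):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            # Terms with a zero denominator (repeated knots) are dropped.
            left = (x - knots[i]) / left_den * b[i] if left_den > 0 else 0.0
            right = (knots[i + d + 1] - x) / right_den * b[i + 1] if right_den > 0 else 0.0
            nxt.append(left + right)
        b = nxt
    return b


def solve(a, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_bspline(xs, ys, n_interior=5, degree=3, lo=0.0, hi=1.0):
    """Least-squares spline fit; returns a function evaluating the fitted curve."""
    step = (hi - lo) / (n_interior + 1)
    interior = [lo + step * (j + 1) for j in range(n_interior)]
    # Clamped knot vector: boundary knots repeated degree + 1 times.
    knots = [lo] * (degree + 1) + interior + [hi] * (degree + 1)
    rows = [bspline_basis(x, knots, degree) for x in xs]
    k = len(rows[0])
    # Normal equations (B'B) c = B'y; the system size is k, not len(xs).
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    moment = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    coef = solve(gram, moment)
    return lambda x: sum(c * b for c, b in zip(coef, bspline_basis(x, knots, degree)))


# Simulated example (an assumption, not data from the project).
random.seed(0)
n = 200
xs = [i / (n - 1) for i in range(n)]
ys = [math.sin(2 * math.pi * x) + random.gauss(0.0, 0.1) for x in xs]
fit = fit_bspline(xs, ys)
print(round(fit(0.25), 3))
```

In this sketch the linear system has one unknown per basis function rather than one per observation, so the cost grows roughly linearly with the sample size. That is consistent with the speed advantage over kernel smoothing that the abstract describes, although the project's actual procedures (spline-backfitted kernel smoothing, single-index and GARCH estimation) involve further steps not shown here.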
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Lijian Yang
One-Step Stereoselective Synthesis of (2Z,4Z,6Z,8Z)-Decatetraene Diketone from Pyrylium Salts
- DOI: 10.1002/ejoc.201301685
- Year: 2014
- Journal:
- Impact factor: 2.8
- Authors: Lijian Yang; Junwei Ye; Yuan Gao; D. Deng; Yuan Lin; G. Ning
- Corresponding author: G. Ning
EFFICIENT AND FAST SPLINE-BACKFITTED KERNEL SMOOTHING OF ADDITIVE REGRESSION MODEL
- DOI:
- Year: 2004
- Journal:
- Impact factor: 0
- Authors: Lijian Yang
- Corresponding author: Lijian Yang
Characteristics of CARMA1-BCL10-MALT1-A20-NF-κB expression in T cell-acute lymphocytic leukemia
- DOI:
- Year: 2014
- Journal:
- Impact factor: 4.2
- Authors: Xu Wang; Fan Zhang; Shaohua Chen; Lijian Yang; Gengxin Luo; Xin Huang; Suming Huang; Xiuli Wu; Yangqiu Li
- Corresponding author: Yangqiu Li
DOOB, IGNATOV AND OPTIONAL SKIPPING
- DOI: 10.1214/aop/1039548377
- Year: 2002
- Journal:
- Impact factor: 2.3
- Authors: G. Simons; Yi; Lijian Yang
- Corresponding author: Lijian Yang
Spline Single-Index Prediction Model
- DOI:
- Year: 2007
- Journal:
- Impact factor: 0
- Authors: Li Wang; Lijian Yang
- Corresponding author: Lijian Yang
Other grants held by Lijian Yang
Simultaneous Confidence Regions for Functional Data Analysis: Theory and Methods
- Award number: 1007594
- Fiscal year: 2010
- Amount: $221,500
- Project type: Standard Grant
Monte-Carlo multi-step ahead forecasting for nonlinear time series
- Award number: 0405330
- Fiscal year: 2004
- Amount: $221,500
- Project type: Standard Grant
Non- and Semi-parametric Identification and Prediction of Autoregressive Models, with Applications to Econometrics
- Award number: 9971186
- Fiscal year: 1999
- Amount: $221,500
- Project type: Standard Grant
Similar overseas grants
Non-asymptotic inference for high and infinite dimensional data
- Award number: RGPIN-2018-05678
- Fiscal year: 2022
- Amount: $221,500
- Project type: Discovery Grants Program - Individual
The Complexity of Computing with Infinite Data
- Award number: RGPIN-2021-02481
- Fiscal year: 2022
- Amount: $221,500
- Project type: Discovery Grants Program - Individual
Non-asymptotic inference for high and infinite dimensional data
- Award number: RGPIN-2018-05678
- Fiscal year: 2021
- Amount: $221,500
- Project type: Discovery Grants Program - Individual
The Complexity of Computing with Infinite Data
- Award number: RGPIN-2021-02481
- Fiscal year: 2021
- Amount: $221,500
- Project type: Discovery Grants Program - Individual
Non-asymptotic inference for high and infinite dimensional data
- Award number: RGPIN-2018-05678
- Fiscal year: 2020
- Amount: $221,500
- Project type: Discovery Grants Program - Individual
Robust stability analysis of infinite-dimensional sampled-data systems
- Award number: 20K14362
- Fiscal year: 2020
- Amount: $221,500
- Project type: Grant-in-Aid for Early-Career Scientists
Non-asymptotic inference for high and infinite dimensional data
- Award number: RGPIN-2018-05678
- Fiscal year: 2019
- Amount: $221,500
- Project type: Discovery Grants Program - Individual
Non-asymptotic inference for high and infinite dimensional data
- Award number: RGPIN-2018-05678
- Fiscal year: 2018
- Amount: $221,500
- Project type: Discovery Grants Program - Individual
Non-asymptotic inference for high and infinite dimensional data
- Award number: DGECR-2018-00166
- Fiscal year: 2018
- Amount: $221,500
- Project type: Discovery Launch Supplement
Proof Theory: Finite Data from Infinite Mathematics
- Award number: 1600263
- Fiscal year: 2016
- Amount: $221,500
- Project type: Continuing Grant