Sparse linear models: Their existence and stability
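Basic information
- Grant number: EP/W011905/1
- Principal investigator:
- Amount: $101,200
- Host institution:
- Host institution country: United Kingdom
- Grant type: Research Grant
- Fiscal year: 2022
- Funding country: United Kingdom
- Duration: 2022 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract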
Large quantities of data are gathered in many domains of life, including the medical, financial, retail, industrial and social media domains. This data must be analysed such that its properties can be extracted and the underlying system understood. It is necessary to distinguish between the quantity of data, which may be large, and the information contained in the data, which may allow it to be represented, with acceptable accuracy, by a simple model. This simple model captures fundamental properties of the system, such that it can be used to determine the response of the system to new (unseen) data. An example of a simple model is a low degree polynomial, but this proposal considers a sparse model, which is another example of a simple model. A sparse model of a system is a model in which the dominant input variables (predictors) that determine the output, rather than all the input variables, are identified. Genomics provides an example of a sparse model because there are about 30,000 genes in the human body, but not all genes are associated directly with cancer. It is therefore desirable to identify the genes that are most directly associated with cancer, such that treatment is focused on the dominant contributory factors, rather than factors whose role in the cause of cancer is minor. Sparsity of the solution x of the linear algebraic equation Ax = b is imposed by regularisation in the 1-norm (the lasso). This is different from regularisation in the 2-norm (Tikhonov regularisation), which imposes stability on x. The lasso is not understood as well as Tikhonov regularisation because of the absence of a 1-norm matrix decomposition, but fundamental properties of a regularised solution of Ax = b are independent of the norm in which the regularisation is imposed. For example, a regularised solution in both norms must be stable and the error between it and the exact solution must be small. This proposal considers these properties of a regularised solution when regularisation by the lasso is used.

Computations on sparse models are, in general, simpler and faster than computations on exact dense models, which is advantageous, and this proposal considers important theoretical issues that must be addressed. A sparse model is an approximation of an exact dense model and there is therefore an error associated with a sparse model. A good sparse model is a model in which this error is small, and this sparse model is accepted because this small error is balanced by the greater physical insight allowed by a sparse model. Furthermore, a sparse model must be computationally reliable such that results derived from it are numerically stable, and thus a good sparse model must have a small error and be stable. It cannot, however, be assumed that all inputs yield an approximate input-output relationship that is sparse, stable and has a small error. It is therefore necessary to establish the class of inputs for which these properties are, and are not, satisfied. It follows that there are many issues to be considered before a sparse model can be used with confidence in its correctness. This proposal addresses these issues and it will include theoretical results and computational experiments.

The benefits of the proposed research extend to the many areas in which a sparse model is used to model an input-output relationship. These applications include the medical, financial, retail, industrial and social media domains (as stated above). Apart from the computational advantages of a sparse model (stated above), the desirability of a sparse model follows from its simplicity, and it is therefore easier to obtain a physical understanding of the input-output relationship of the system.
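The contrast the abstract draws between 1-norm and 2-norm regularisation of Ax = b can be illustrated with a small sketch. Nothing below comes from the proposal itself: the function names `soft_threshold` and `coordinate_descent`, the 4 x 4 matrix, the noise values and the choice lam = 0.5 are all assumptions made for illustration, and cyclic coordinate descent is just one standard way to solve both problems. The columns of A are chosen to be orthogonal so that the result can be checked by hand: in that special case the lasso solution is the soft-thresholded least-squares solution, while the Tikhonov solution is the least-squares solution scaled uniformly.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; any |z| <= t is mapped to exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def coordinate_descent(A, b, lam, penalty, sweeps=100):
    """Minimise 0.5*||Ax - b||_2^2 + P(x) by cyclic coordinate descent.

    penalty="l1": P(x) = lam * ||x||_1          (lasso)
    penalty="l2": P(x) = 0.5 * lam * ||x||_2^2  (Tikhonov / ridge)
    Assumes every column of A is nonzero.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    r = list(b)  # residual b - Ax; equals b because x starts at zero
    col_sq = [sum(A[i][j] ** 2 for i in range(m)) for j in range(n)]
    for _ in range(sweeps):
        for j in range(n):
            # Inner product of column j with the residual that excludes x_j.
            rho = sum(A[i][j] * r[i] for i in range(m)) + col_sq[j] * x[j]
            if penalty == "l1":
                new = soft_threshold(rho, lam) / col_sq[j]
            else:
                new = rho / (col_sq[j] + lam)
            delta = new - x[j]
            if delta != 0.0:
                for i in range(m):
                    r[i] -= A[i][j] * delta
                x[j] = new
    return x


# Illustrative data (an assumption, not from the proposal): orthogonal
# columns, a sparse "true" solution and a small perturbation of b.
A = [[1, 1, 1, 1],
     [1, -1, 1, -1],
     [1, 1, -1, -1],
     [1, -1, -1, 1]]
x_true = [2.0, 0.0, 0.0, -1.0]
noise = [0.1, -0.1, 0.1, 0.1]
b = [sum(A[i][j] * x_true[j] for j in range(4)) + noise[i] for i in range(4)]

x_lasso = coordinate_descent(A, b, lam=0.5, penalty="l1")
x_ridge = coordinate_descent(A, b, lam=0.5, penalty="l2")
print("lasso:", [round(v, 3) for v in x_lasso])
print("ridge:", [round(v, 3) for v in x_ridge])
```

With these inputs the 1-norm penalty sets the second and third coefficients exactly to zero, recovering the sparsity pattern of `x_true`, whereas the 2-norm penalty leaves them small but nonzero. Both solutions are shrunk relative to the unregularised least-squares solution, which is the error that the abstract says must be kept small.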
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants held by Joab Winkler
Mathematical Methods in Geometric Modelling
- Grant number: EP/D509858/1
- Fiscal year: 2007
- Funding amount: $101,200
- Grant type: Training Grant
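Similar NSFC (National Natural Science Foundation of China) grants
Development of a Linear Stochastic Model for Wind Field Reconstruction from Limited Measurement Data
- Grant number:
- Approval year: 2020
- Funding amount: CNY 400,000
- Grant type:
Research on projective nonlinear non-negative tensor factorization based on individual analysis for pattern analysis of high-dimensional unstructured data
- Grant number: 61502059
- Approval year: 2015
- Funding amount: CNY 190,000
- Grant type: Young Scientists Fund
Holomorphic Möbius transformations and their applications in relativity and signal analysis
- Grant number: 11071230
- Approval year: 2010
- Funding amount: CNY 280,000
- Grant type: General Program
Algorithm design for hub port location and related problems
- Grant number: 71001062
- Approval year: 2010
- Funding amount: CNY 176,000
- Grant type: Young Scientists Fund
Design theory of statistical process control charts and its applications
- Grant number: 10771107
- Approval year: 2007
- Funding amount: CNY 220,000
- Grant type: General Program
Research on MIMO electromagnetic detection technology and imaging methods
- Grant number: 40774055
- Approval year: 2007
- Funding amount: CNY 350,000
- Grant type: General Program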
Similar overseas grants
CAREER: Scalable algorithms for regularized and non-linear genetic models of gene expression
- Grant number: 2336469
- Fiscal year: 2024
- Funding amount: $101,200
- Grant type: Continuing Grant
Computational and neural signatures of interoceptive learning in anorexia nervosa
- Grant number: 10824044
- Fiscal year: 2024
- Funding amount: $101,200
- Grant type:
Time series clustering to identify and translate time-varying multipollutant exposures for health studies
- Grant number: 10749341
- Fiscal year: 2024
- Funding amount: $101,200
- Grant type:
Noradrenergic gating of astrocyte calcium-mediated homeostasis in vivo
- Grant number: 10679269
- Fiscal year: 2023
- Funding amount: $101,200
- Grant type:
The space-time organization of sleep oscillations as potential biomarker for hypersomnolence
- Grant number: 10731224
- Fiscal year: 2023
- Funding amount: $101,200
- Grant type:
Imaging Epilepsy Sources with Biophysically Constrained Deep Neural Networks
- Grant number: 10655833
- Fiscal year: 2023
- Funding amount: $101,200
- Grant type:
Intracranial Investigation of Neural Circuity Underlying Human Mood
- Grant number: 10660355
- Fiscal year: 2023
- Funding amount: $101,200
- Grant type:
Investigating relationships between naturalistic light exposure and sleep
- Grant number: 10739430
- Fiscal year: 2023
- Funding amount: $101,200
- Grant type:
A 11C-UCB-J PET Study of Synaptic Density in Binge Eating Disorder (BED)
- Grant number: 10673376
- Fiscal year: 2023
- Funding amount: $101,200
- Grant type: