Adaptation of New Statistical Ideas for Medicine
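Basic information
- Grant number: 7757165
- Principal investigator:
- Amount: $224,300
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 1993
- Funding country: United States
- Project period: 1993-01-15 to 2015-01-31
- Project status: Completed
- Source: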
- Keywords: Affect; Algorithms; Alleles; Anthrax disease; Award; Bioinformatics; Blood; Blood Pressure; Businesses; California; Cancer Patient; Categories; Certification; Cities; Code; Complex; Contracts; Country; County; Credentialing; DNA copy number; Data; Data Set; Disease; Disease model; Dissection; Doctor of Philosophy; Electronic Mail; Family; First Name; Funding; Genes; Genetic; Grant; Gray unit of radiation dose; Hawaii; Health; Heritability; High Density Lipoprotein Cholesterol; Human Genetics; Hypertension; Income; Institutes; Instruction; Insulin Resistance; Investigation; Japanese Population; Knowledge; Last Name; Left Ventricular Mass; Malignant Neoplasms; Manufactured Baseball; Medicare; Medicine; Methods; Microsatellite Repeats; Modeling; Modems; Musculoskeletal; Names; National Heart, Lung, and Blood Institute; National Institute of Biomedical Imaging and Bioengineering; Panama; Persons; Phase III Clinical Trials; Positioning Attribute; Positive Lymph Node; Principal Investigator; Process; Province; Publications; Research; Research Personnel; Rheumatology; Role; Sapphire; Scanning; Sentinel Lymph Node; Site; System; Technology; Telefacsimile; Telephone; Testing; Time; Trees; Triglycerides; Trustees; Universities; Validation; data mining; experience; expiration; familial hypertension; genome-wide linkage; human subject; interest; lymph nodes; malignant breast neoplasm; meetings; multidisciplinary; professor; programs; statistics; symposium; systematic review; technical report; theories; trait; vector
Project abstract
Our MERIT award work will continue to have two main components: involvement in specific biomedical
research projects such as NHLBI's FEHGAS study, and development of new statistical methods appropriate
for the analysis of large, complex data sets. These efforts are complementary, with the specific projects
suggesting which statistical methods are most needed, and also serving as test cases for new methodology.
The FEHGAS study, for example, seeks to predict age of onset of hypertension from SNP data (and
background variables such as age and gender). There are 550,000 SNPs available for prediction, most of
which will turn out to be useless, making the problem an order of magnitude more challenging than in
expression microarray situations. Efron plans to extend the empirical Bayes methodology from his recent
paper to this context, with the aim of overcoming the difficulties caused by the usually weak predictive power
of individual SNPs. Olshen plans to extend CART (Classification and Regression Trees) and bootstrap
methodology to the selection of groups of promising predictive SNPs.
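As a rough illustration of how bootstrap resampling can flag promising SNPs, the sketch below resamples subjects with replacement, ranks SNPs by absolute correlation with the outcome, and records how often each SNP lands in the top few. This is not the investigators' method and it does not use CART: a simple marginal-correlation screen is swapped in for brevity. The data are simulated, and the names `abs_corr`, `selection_frequency`, `snps`, and `outcome` are hypothetical.

```python
import math
import random


def abs_corr(x, y):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return abs(sxy) / math.sqrt(sxx * syy)


def selection_frequency(snps, outcome, top_k, n_boot, rng):
    """Fraction of bootstrap replicates in which each SNP ranks in the top_k.

    snps: list of SNP columns, each a list of genotypes (0/1/2) over subjects.
    outcome: list of outcome values, one per subject.
    """
    n_subj = len(outcome)
    counts = [0] * len(snps)
    for _ in range(n_boot):
        # Resample subjects with replacement.
        idx = [rng.randrange(n_subj) for _ in range(n_subj)]
        y = [outcome[i] for i in idx]
        scores = [
            (abs_corr([snp[i] for i in idx], y), j)
            for j, snp in enumerate(snps)
        ]
        scores.sort(reverse=True)
        for _, j in scores[:top_k]:
            counts[j] += 1
    return [c / n_boot for c in counts]


# Simulated data (assumption): 50 SNPs, only SNP 0 affects the outcome.
rng = random.Random(0)
n_subj, n_snps, top_k = 200, 50, 5
snps = [
    [(rng.random() < 0.5) + (rng.random() < 0.5) for _ in range(n_subj)]
    for _ in range(n_snps)
]
outcome = [1.5 * snps[0][i] + rng.gauss(0.0, 1.0) for i in range(n_subj)]

freq = selection_frequency(snps, outcome, top_k, n_boot=100, rng=rng)
print("selection frequency of SNP 0:", round(freq[0], 2))
```

A SNP that is selected in nearly every replicate is a more stable candidate than one that appears only occasionally. With 550,000 SNPs and weak individual effects, the frequencies would be far less clear-cut than in this toy setting.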
Large-scale significance testing, for instance selecting 'significant' genes in a microarray cancer study,
has become an area of intense statistical development. Nevertheless, crucial questions of appropriate implementation
remain vague in the literature: the choice of an appropriate null hypothesis; the selection of a
comparison set (Should all 550,000 SNPs be tested together or separately by chromosome?); and the effects
of correlation. We have made some headway in answering these questions, as described in the Progress
Report. Our continuing efforts are a combination of methodological implementation and theoretical development.
Correlation can have particularly drastic effects on standard statistical techniques. In "Are a set of microarrays
independent of each other?" it is shown that a study involving 20,000 genes has its effective sample
size reduced to about 17 because of severe gene-wise correlation. We are currently developing diagnostic
methods to spot correlation difficulties in massive data sets, and to assess their effects on hypothesis tests,
estimates, and predictions. A 20,000 gene microarray study produces about 200,000,000 pairwise correlations, which sounds
oppressively large for practical insight. But we are making progress on an empirical Bayes approximation
that summarizes correlation effects in a single number, suitable for simple analysis.
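The count of correlations quoted above is the number of distinct gene pairs, which the sketch below checks directly. It also shows one possible single-number summary, the root mean square of the pairwise correlations, computed on a small simulated data set. The choice of the RMS summary is an assumption made for illustration; the abstract does not say which summary the empirical Bayes approximation uses. The function names and simulated data are hypothetical.

```python
import math
import random
from itertools import combinations


def n_pairs(n_genes):
    """Number of distinct gene pairs, i.e. pairwise correlations."""
    return math.comb(n_genes, 2)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rms_correlation(rows):
    """Root mean square of all pairwise correlations between rows."""
    rs = [pearson(a, b) for a, b in combinations(rows, 2)]
    return math.sqrt(sum(r * r for r in rs) / len(rs))


print(n_pairs(20_000))  # 199990000, roughly 200 million

# Simulated expression matrix (assumption): 40 genes by 30 subjects,
# with a shared per-subject factor that induces gene-wise correlation.
random.seed(0)
n_genes, n_subjects = 40, 30
shared = [random.gauss(0.0, 1.0) for _ in range(n_subjects)]
rows = [
    [0.6 * s + 0.8 * random.gauss(0.0, 1.0) for s in shared]
    for _ in range(n_genes)
]
print(round(rms_correlation(rows), 3))
```

With few subjects the raw RMS value is inflated by sampling noise, since even independent genes show nonzero sample correlations, so a practical summary would need a correction for that.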
Twentieth Century biostatistical applications were overwhelmingly frequentist in nature. Pure frequentism,
though, becomes impractical for analyzing the large, complex data sets produced by modern biomedical
devices, where the relationships of thousands of parameters and millions of data points have to be considered
together. We are continuing to develop empirical Bayes methods that allow Bayesian ideas to be brought to
bear on questions of multiple inference, without requiring specific prior distributions from the scientist.
A long-term project is to understand how quickly empirical Bayes information accrues in a medical study.
A False Discovery Rate is an estimate of the Bayes posterior probability that a gene (or a SNP, or a voxel)
is 'null', given the observed data. How many subjects and how many genes do we need to observe in order
to get an accurate empirical Bayes estimate of the posterior probability?
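The empirical Bayes reading of the False Discovery Rate can be sketched with the tail-area estimate Fdr(z) = pi0 * F0(z) / F_hat(z), where F0(z) is the null probability of a test statistic at least as large as z and F_hat(z) is the observed fraction of statistics at least that large. The sketch assumes a standard normal null and takes pi0 = 1, a conservative choice. The simulated z-values and the names `normal_sf` and `tail_fdr` are hypothetical.

```python
import math
import random


def normal_sf(z):
    """Upper-tail probability P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def tail_fdr(z_values, cutoff, pi0=1.0):
    """Estimate Pr(null | z >= cutoff) as pi0 * F0(cutoff) / F_hat(cutoff)."""
    n_exceed = sum(1 for z in z_values if z >= cutoff)
    if n_exceed == 0:
        return 1.0  # nothing exceeds the cutoff, so no evidence against the null
    f_hat = n_exceed / len(z_values)
    return min(1.0, pi0 * normal_sf(cutoff) / f_hat)


# Simulated z-values (assumption): 9,500 null genes and 500 non-null genes.
random.seed(1)
z_null = [random.gauss(0.0, 1.0) for _ in range(9500)]
z_alt = [random.gauss(3.0, 1.0) for _ in range(500)]
z_all = z_null + z_alt

for cutoff in (1.0, 2.0, 3.0):
    print(cutoff, round(tail_fdr(z_all, cutoff), 3))
```

The denominator is estimated from all the genes at once, which is why the accuracy of the estimate depends on how many genes (and subjects) are observed, the question raised above.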
In our own version of Moore's law, biomedical data sets have increased an order of magnitude in size every
few years since the 1990s. Emerging technologies (tiling arrays, bead arrays, aptamer chips, methylation arrays,
exon chips, and a variety of new imaging devices) promise further increases, taxing both computational
equipment and statistical methodology. Our long-term MERIT goal is to provide algorithms and theory
appropriate to massive-data biomedical requirements.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by BRADLEY EFRON
STATISTICAL METHODS FOR IDENTITY BY DESCENT MAPS
- Grant number: 2674211
- Fiscal year: 1994
- Funding amount: $224,300
- Project category:
Similar overseas grants
DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
- Grant number: EP/Y029089/1
- Fiscal year: 2024
- Funding amount: $224,300
- Project category: Research Grant
CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
- Grant number: 2337776
- Fiscal year: 2024
- Funding amount: $224,300
- Project category: Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
- Grant number: 2338816
- Fiscal year: 2024
- Funding amount: $224,300
- Project category: Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
- Grant number: 2338846
- Fiscal year: 2024
- Funding amount: $224,300
- Project category: Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
- Grant number: 2348261
- Fiscal year: 2024
- Funding amount: $224,300
- Project category: Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
- Grant number: 2348346
- Fiscal year: 2024
- Funding amount: $224,300
- Project category: Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
- Grant number: 2348457
- Fiscal year: 2024
- Funding amount: $224,300
- Project category: Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
- Grant number: 2404989
- Fiscal year: 2024
- Funding amount: $224,300
- Project category: Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
- Grant number: 2339310
- Fiscal year: 2024
- Funding amount: $224,300
- Project category: Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
- Grant number: 2339669
- Fiscal year: 2024
- Funding amount: $224,300
- Project category: Continuing Grant













