Efficient Methods for Dimensionality Reduction of Single-Cell RNA-Sequencing Data
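Basic Information
- Grant number: 10356883
- Principal investigator:
- Amount: $51,800
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2020
- Funding country: United States
- Project period: 2020-03-16 to 2023-03-15
- Project status: Completed
- Source: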
- Keywords: Address; Adopted; Algorithms; Biological; Cells; Code; Collection; Communities; Computer Hardware; Computing Methodologies; Consensus; Data; Data Analyses; Data Set; Development; Dimensions; Disease; Evaluation; Fellowship; Gaussian model; Genes; Hour; Human; Language; Learning; Libraries; Mathematics; Measures; Mentorship; Methods; Modeling; Modernization; Names; Noise; Normal Statistical Distribution; Paper; Physicians; Physiology; Population; Principal Component Analysis; Process; Publishing; RNA; Randomized; Research Personnel; Resolution; Running; Scientist; Speed; Statistical Bias; Statistical Methods; Systematic Bias; Techniques; Technology; Time; Tissues; Training; Variant; Visualization; base; design; dimensional analysis; distributed data; experience; experimental study; high dimensionality; improved; insight; laptop; non-Gaussian model; parallelization; professor; single cell analysis; single-cell RNA sequencing; statistics; supercomputer; theories; tool; transcriptome; transcriptome sequencing
Project Abstract
Project Summary: Efficient Methods for Dimensionality Reduction of Single-Cell RNA-Sequencing Data
Single-cell RNA-sequencing is a revolutionary technology enabling discoveries in human physiology and
disease. The datasets generated from single-cell RNA-sequencing experiments are so large that they cannot be
analyzed or visualized using traditional statistical methods until they have been shrunk using a
technique named “dimensionality reduction.” Almost every analysis of single-cell RNA-sequencing data begins
by using principal component analysis (PCA) to accomplish dimensionality reduction.
However, single-cell RNA-sequencing presents unique challenges that make PCA difficult. First, these
datasets are so large that computing PCA requires specialized hardware and multiple hours. Fast algorithms to
approximate PCA have been shown to dramatically speed up this process, but have not been widely adopted in the
single-cell RNA-sequencing community, in part because no parallelized algorithm has been written in the R
computing language. Second, PCA requires the researcher to decide the final desired size of the dataset.
Choosing too small a size results in discarding valuable biological insights, while choosing too large a size
increases the noise. However, there is no consensus on how to pick the optimal size for single-cell RNA
sequencing, and there is evidence that this size might be systematically underestimated. Lastly, PCA cannot be
applied directly to the count data measured in single-cell RNA sequencing, so researchers must first apply a
preprocessing technique to normalize it. The current standard in the field is to apply the log transform;
however, several recent studies have shown that the log transform creates statistical biases in single-cell RNA
sequencing. In this fellowship, fast methods specifically tailored to performing PCA on single-cell RNA-sequencing
data will be developed: 1a) A framework to rigorously measure the consequences of changing
preprocessing parameters on the final results for several publicly available single-cell RNA-sequencing datasets,
to enable experimentation with PCA on single-cell RNA-sequencing data. 1b) An ultra-fast, parallelized
implementation of randomized PCA, allowing researchers using standard laptops to rapidly perform PCA on
single-cell RNA-sequencing data. 2) A technique for rigorously choosing the final size when performing
principal component analysis on single-cell RNA-sequencing datasets. 3) A method for transforming single-cell
RNA-sequencing data so that it becomes appropriately distributed, enabling proper use of PCA without
incurring statistical biases. This fellowship also includes a detailed training plan with valuable learning
experiences for the applicant’s development as a physician-scientist who can apply methods from
high-dimensional statistics to solving biomedical problems.
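As a rough illustration of the workflow the abstract describes (normalize counts with the log transform, then reduce dimensions with a fast approximate PCA), the sketch below uses plain NumPy. It is not the applicant's method or code: the function names `log_normalize` and `randomized_pca`, the scale factor, the oversampling and iteration counts, and the synthetic count matrix are all assumptions made for this example, and it is written in Python rather than R. The randomized step follows the generic random-projection scheme with a few power iterations.

```python
import numpy as np

def log_normalize(counts, scale=1e4):
    """Library-size normalize each cell (row), then apply log(1 + x).

    This is the conventional log transform the abstract calls the current
    standard; `scale` is an assumed, illustrative constant.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid dividing by zero for empty cells
    return np.log1p(counts / totals * scale)

def randomized_pca(X, k, n_oversample=10, n_iter=4, seed=0):
    """Approximate the top-k principal components of X (cells x genes)."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                       # center each gene
    ell = min(k + n_oversample, min(Xc.shape))    # sketch width
    omega = rng.standard_normal((Xc.shape[1], ell))
    Q, _ = np.linalg.qr(Xc @ omega)               # basis for the range of Xc
    for _ in range(n_iter):                       # power iterations sharpen the basis
        Q, _ = np.linalg.qr(Xc.T @ Q)
        Q, _ = np.linalg.qr(Xc @ Q)
    B = Q.T @ Xc                                  # small (ell x genes) matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    scores = (Q @ Ub[:, :k]) * s[:k]              # per-cell coordinates
    return scores, Vt[:k], s[:k]

# Synthetic stand-in for a count matrix: 300 cells x 200 genes.
rng = np.random.default_rng(1)
rates = rng.gamma(2.0, 1.0, size=(300, 5)) @ rng.gamma(2.0, 1.0, size=(5, 200))
counts = rng.poisson(rates)

X = log_normalize(counts)
scores, components, sing = randomized_pca(X, k=10)
print(scores.shape, components.shape)
```

Here `k` plays the role of the "final size" that the abstract says the researcher must choose; the sketch takes it as given and does not attempt to select it.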
Project Outcomes
Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Quantitative assessment of p16 expression in FNA specimens from head and neck squamous cell carcinoma and correlation with HPV status.
- DOI: 10.1002/cncy.22399
- Published: 2021-05
- Journal:
- Impact factor: 3.4
- Authors: Abi-Raad R; Prasad ML; Gilani S; Garritano J; Barlow D; Cai G; Adeniran AJ
- Corresponding author: Adeniran AJ
RAS mutation and associated risk of malignancy in the thyroid gland: An FNA study with cytology-histology correlation.
- DOI: 10.1002/cncy.22537
- Published: 2022-04
- Journal:
- Impact factor: 3.4
- Authors: Gilani, Syed M.; Abi-Raad, Rita; Garritano, James; Cai, Guoping; Prasad, Manju L.; Adeniran, Adebowale J.
- Corresponding author: Adeniran, Adebowale J.
Anaplastic Thyroid Carcinoma: Cytomorphologic Features on Fine-Needle Aspiration and Associated Diagnostic Challenges.
- DOI: 10.1093/ajcp/aqab159
- Published: 2022
- Journal:
- Impact factor: 3.5
- Authors: Podany, Peter; Abi-Raad, Rita; Barbieri, Andrea; Garritano, James; Prasad, Manju L; Cai, Guoping; Adeniran, Adebowale J; Gilani, Syed M
- Corresponding author: Gilani, Syed M
Other publications by James Michael Garritano
Similar Overseas Grants
How novices write code: discovering best practices and how they can be adopted
- Grant number: 2315783
- Fiscal year: 2023
- Funding amount: $51,800
- Project category: Standard Grant
One or Several Mothers: The Adopted Child as Critical and Clinical Subject
- Grant number: 2719534
- Fiscal year: 2022
- Funding amount: $51,800
- Project category: Studentship
A material investigation of the ceramic shards excavated from the Omuro Ninsei kiln site: Production techniques adopted by Nonomura Ninsei.
- Grant number: 20K01113
- Fiscal year: 2020
- Funding amount: $51,800
- Project category: Grant-in-Aid for Scientific Research (C)
A comparative study of disabled children and their adopted maternal figures in French and English Romantic Literature
- Grant number: 2633211
- Fiscal year: 2020
- Funding amount: $51,800
- Project category: Studentship
A comparative study of disabled children and their adopted maternal figures in French and English Romantic Literature
- Grant number: 2436895
- Fiscal year: 2020
- Funding amount: $51,800
- Project category: Studentship
A comparative study of disabled children and their adopted maternal figures in French and English Romantic Literature
- Grant number: 2633207
- Fiscal year: 2020
- Funding amount: $51,800
- Project category: Studentship
A Study on Mutual Funds Adopted for Individual Defined Contribution Pension Plans
- Grant number: 19K01745
- Fiscal year: 2019
- Funding amount: $51,800
- Project category: Grant-in-Aid for Scientific Research (C)
The limits of development: State structural policy, comparing systems adopted in two European mountain regions (1945-1989)
- Grant number: 426559561
- Fiscal year: 2019
- Funding amount: $51,800
- Project category: Research Grants
Securing a Sense of Safety for Adopted Children in Middle Childhood
- Grant number: 2236701
- Fiscal year: 2019
- Funding amount: $51,800
- Project category: Studentship
Structural and functional analyses of a bacterial protein translocation domain that has adopted diverse pathogenic effector functions within host cells
- Grant number: 415543446
- Fiscal year: 2019
- Funding amount: $51,800
- Project category: Research Fellowships