RECONSTRUCTION FROM HETEROGENEOUS MOLECULE POPULATIONS
Basic information
- Grant number: 7954575
- Principal investigator:
- Amount: $111,700
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2009
- Funding country: United States
- Project period: 2009-02-01 to 2010-01-31
- Project status: Completed
- Source:
- Keywords: Accounting; Address; Affect; Algorithms; Architecture; Award; Binding; Bioinformatics; Biological; Bite; Classification; Collaborations; Communities; Computer Retrieval of Information on Scientific Projects Database; Computer software; Data; Data Set; Deposition; Development; Documentation; Escherichia coli; European; Frequencies; Funding; Future; Genes; Grant; Hand; Heterogeneity; Image; Imagery; Institutes; Institution; Journals; Ligand Binding; Ligands; Manuscripts; Maps; Memory; Methods; Molecular Conformation; Nature; Noise; Paper; Peptide Elongation Factor G; Pharmacy facility; Phase; Population; Preparation; Process; Publications; Reporting; Research; Research Personnel; Resolution; Resources; Ribosomes; Roentgen Rays; Sampling; Seeds; Signal Transduction; Source; Spiders; Structure; Students; Test Result; Testing; United States National Institutes of Health; Universities; Work; Writing; abstracting; base; density; design; experience; fortification; macromolecule; particle; prevent; reconstruction; research study; software development; structural biology; success; supercomputer
Project abstract
This subproject is one of many research subprojects utilizing the
resources provided by a Center grant funded by NIH/NCRR. The subproject and
investigator (PI) may have received primary funding from another NIH source,
and thus could be represented in other CRISP entries. The institution listed is
for the Center, which is not necessarily the institution for the investigator.
ABSTRACT:
This TRD addresses a problem that is paramount in cryo-EM single-particle reconstruction of macromolecules, and that is in many cases the single obstacle preventing the attainment of high resolution (better than 10 Å). This problem is the heterogeneity of molecules in the sample due to partial ligand occupancy and conformational variability. We will develop general approaches for the classification of heterogeneous molecule populations from their cryo-EM projections, which will include both supervised and unsupervised classification methods. We will interact with leading experts in this field and use typical data both from the PI's group and from other groups pursuing single-particle reconstruction. Resulting software, if successful, will be made available to a wide community.
Specific Aims:
1) (Exploration phase): Explore methods of classification of single-particle projections that refine existing template-based approaches, or exploit general intrinsic mathematical relationships among projections of unchanged objects. In this phase of the project, algorithms such as self-organizing maps (SOMs) will be designed, or the utility of existing ones explored. Phantom data sets are derived from existing density maps of molecules or from X-ray structures that present different conformations or states of ligand binding. Such maps are projected systematically into a variety of directions, and the resulting projections are low-pass filtered and contaminated with noise. These data will allow a determination of which algorithm or which SOM configuration will perform best at different resolutions and signal-to-noise ratios.
2) (Testing phase): Test the resulting algorithms and SOMs on well-defined experimental cryo-EM data sets from single-particle projects that are conducted within and outside the Wadsworth Center. Ideally, these should be data that have been characterized in previous publications, so that the improvements due to the new classification approaches can be easily assessed.
3) (Dissemination phase): Integrate the software with existing SPIDER software and develop comprehensive documentation. Publication of the underlying concepts in explicit form will also allow other authors of software packages such as EMAN (Ludtke et al., 2001) to implement their own version, for wider dissemination.
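The phantom-data procedure described in Aim 1 (project a density map, low-pass filter the projections, contaminate them with noise) can be sketched in a few lines. This is only a toy illustration under stated assumptions: the 8x8x8 block density, the "ligand" blob, the three axis-aligned projection directions, the 3x3 mean filter standing in for a real low-pass filter, and all function names are hypothetical and are not taken from SPIDER or from the project's actual pipeline.

```python
import random

def project(volume, axis):
    """Sum a cubic 3D density (nested lists, indexed [z][y][x]) along one axis."""
    n = len(volume)
    if axis == 0:  # collapse z, image indexed [y][x]
        return [[sum(volume[k][i][j] for k in range(n)) for j in range(n)] for i in range(n)]
    if axis == 1:  # collapse y, image indexed [z][x]
        return [[sum(volume[i][k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    # collapse x, image indexed [z][y]
    return [[sum(volume[i][j][k] for k in range(n)) for j in range(n)] for i in range(n)]

def low_pass(image):
    """3x3 mean filter with edge clamping: a crude stand-in for a low-pass filter."""
    n = len(image)
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            vals = [image[min(max(i + di, 0), n - 1)][min(max(j + dj, 0), n - 1)]
                    for di in (-1, 0, 1) for dj in (-1, 0, 1)]
            row.append(sum(vals) / 9.0)
        out.append(row)
    return out

def variance(image):
    """Population variance of all pixel values."""
    flat = [v for row in image for v in row]
    mean = sum(flat) / len(flat)
    return sum((v - mean) ** 2 for v in flat) / len(flat)

def add_noise(image, snr, rng):
    """Add white Gaussian noise so that var(signal) / var(noise) equals snr in expectation."""
    sigma = (variance(image) / snr) ** 0.5
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in image]

def make_phantom(n, with_ligand):
    """Toy density: a solid block, plus a small extra blob when the 'ligand' is bound."""
    vol = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for z in range(2, n - 2):
        for y in range(2, n - 2):
            for x in range(2, n - 2):
                vol[z][y][x] = 1.0
    if with_ligand:
        vol[0][0][0] = vol[0][0][1] = vol[0][1][0] = vol[0][1][1] = 2.0
    return vol

rng = random.Random(0)
dataset = []  # entries: (ligand bound?, projection axis, noisy image)
for label in (False, True):
    vol = make_phantom(8, label)
    for axis in range(3):
        clean = low_pass(project(vol, axis))
        dataset.append((label, axis, add_noise(clean, snr=0.1, rng=rng)))
```

Because the true class label is stored with every image, any classification method can be scored against ground truth, which is the point of a phantom data set.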
Choice of Maximum Likelihood Classification (ML3D) as standard
A collaboration with the Jose-Maria Carazo group, our main collaborator in TRD3, produced remarkable results, and this has evidently helped to popularize the maximum-likelihood method within the 3DEM community. 90,000 ribosome images were classified according to EF-G binding and associated "ratcheting" changes in ribosome conformation. Following the collaborative publication of the Nature Methods paper (Scheres et al., 2007), there has been a surge of applications by several EM groups in the field.
Because of the success of this approach, we have stopped pursuing the "cluster tracking" method (Fu et al., J. Structural Biology 2007), since efforts to expand the cluster tracking globally (in the hands of BMS student Jie Fu and RVBC-supported postdoc Tanvir Shaikh) were unsuccessful (details can be found in Jie Fu's dissertation). Much larger datasets may be needed to pursue this particular development in the future.
One of our collaborators, Dr. Harry Zuzan, is working on a GPU (graphics processing unit) implementation of Scheres' maximum-likelihood method. Speedups of up to 100-fold might be expected. Dr. Zuzan is doing this as a private effort, as he is now employed by a pharmacy company. He has promised to share the software as well as the hardware specifications with us once he succeeds.
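To make the idea behind maximum-likelihood classification concrete, the following toy sketch runs expectation-maximization on short 1D "images" drawn from two class templates. It is not ML3D and not XMIPP code: it assumes white Gaussian noise of known standard deviation and equal class priors, and it omits the marginalization over orientations that a real 3D method needs. The templates, the function name `em_classify`, and all parameter values are hypothetical.

```python
import math
import random

def em_classify(images, refs, sigma, iterations=10):
    """Soft-assign images to class references, then re-estimate the references.

    Assumes white Gaussian noise with known sigma and equal class priors.
    """
    refs = [list(r) for r in refs]
    posteriors = []
    for _ in range(iterations):
        # E-step: posterior probability of each class for each image.
        posteriors = []
        for img in images:
            log_p = [-sum((x - r) ** 2 for x, r in zip(img, ref)) / (2.0 * sigma ** 2)
                     for ref in refs]
            top = max(log_p)  # subtract the maximum to avoid underflow
            weights = [math.exp(lp - top) for lp in log_p]
            total = sum(weights)
            posteriors.append([w / total for w in weights])
        # M-step: each reference becomes the posterior-weighted average image.
        for c in range(len(refs)):
            weight = sum(p[c] for p in posteriors)
            if weight > 0.0:
                refs[c] = [sum(p[c] * img[i] for p, img in zip(posteriors, images)) / weight
                           for i in range(len(refs[c]))]
    return refs, posteriors

rng = random.Random(0)
unbound = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]  # hypothetical "no ligand" template
bound = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0]    # same, with extra density
sigma = 0.5
images = [[v + rng.gauss(0.0, sigma) for v in template]
          for template in (unbound, bound) for _ in range(50)]
seeds = rng.sample(images, 2)  # two random images serve as initial references
refs, posteriors = em_classify(images, seeds, sigma)
```

Each row of `posteriors` holds the probabilities that one image belongs to each class. The soft weighting, rather than a hard assignment to the best-matching reference, is what distinguishes this scheme from plain template matching.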
Construction of a Phantom Dataset
To enable an objective comparison of classification methods, or parameter settings of any particular method, we set out to construct a phantom data set based on the E. coli ribosome with and without EF-G bound. We argued that such an effort would not only serve our own optimization efforts, but would also be welcomed by the entire 3DEM community. An analysis of the noise sources showed that an important source of noise, namely structural noise, had been overlooked in all previous attempts to produce phantom data. As described in the previous report, we conducted experiments to estimate the signal-to-noise ratio (SNR) of various steps of EM image formation, including the SNR of structural noise. The method and results of the estimation have been written up in a paper by Baxter et al. and submitted to the Journal of Structural Biology. The manuscript features estimates of both the SNRs and their spectral distributions (SSNRs). Since the estimates of the SSNR distributions were of limited accuracy in the high-frequency range, the reviewers asked for an increase in the dataset for statistical fortification, and Dr. Baxter is now processing a larger dataset. However, this issue does not affect the accuracy of the SNR estimation. Concurrent with the preparation of a revised manuscript, we have therefore constructed a phantom dataset using the SNR values from our estimation, and have deposited the data with the European Bioinformatics Institute (EBI) in Cambridge.
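As an illustration of how an SNR can be estimated from data, one standard estimator uses the cross-correlation coefficient rho between two independently noisy recordings of the same signal: SNR = rho / (1 - rho). The sketch below checks it on synthetic data. Whether Baxter et al. used this particular estimator is not stated in this report, so treat it as an assumption; the sample size, the seed, and the function names are hypothetical.

```python
import random

def correlation(a, b):
    """Pearson cross-correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def snr_from_pair(a, b):
    """SNR estimate rho / (1 - rho) from two noisy copies of one signal."""
    rho = correlation(a, b)
    return rho / (1.0 - rho)

rng = random.Random(1)
true_snr = 0.5
signal = [rng.gauss(0.0, 1.0) for _ in range(20000)]  # unit-variance signal
sigma = (1.0 / true_snr) ** 0.5                       # noise std for the target SNR
copy_a = [s + rng.gauss(0.0, sigma) for s in signal]
copy_b = [s + rng.gauss(0.0, sigma) for s in signal]
estimate = snr_from_pair(copy_a, copy_b)  # should land near true_snr
```

The estimator relies on the noise in the two copies being independent of each other and of the signal. Any noise component shared by both copies is counted as signal.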
Experience with ML3D of Phantom Data, and Supercomputer applications
Test computations for small datasets (decimated arrays and small numbers of images) showed very inconsistent results. The results differed for different choices of seeds, and this convinced us that we need to go to larger datasets to establish optimal settings. Our strategy was therefore to apply for a large allocation on the TeraGrid. Dr. Baxter and Dr. Frank applied separately for accounts associated, respectively, with the RVBC at Wadsworth and with Columbia University for the ribosome collaborative projects. On October 1, 2008, allocations of 100,000 and 450,000 were awarded.
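The seed dependence described above can be quantified by classifying the same data from several seeds and measuring how often two runs put an image in the same class. The sketch below does this with a two-class k-means on scalar values, used here purely as a stand-in for any seed-initialized classifier. It is not ML3D, and the data, noise level, and function names are hypothetical.

```python
import random

def two_class_kmeans(values, seed, iterations=20):
    """Two-class k-means on scalars; the seed selects the two initial centers."""
    rng = random.Random(seed)
    centers = rng.sample(values, 2)
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1 for v in values]
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels

def agreement(labels_a, labels_b):
    """Fraction of items classified alike, ignoring which class is called 0 or 1."""
    same = sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)
    return max(same, 1.0 - same)

data_rng = random.Random(42)
# Small, noisy data set: two classes with means 0 and 1, noise std 3 (low SNR).
values = [mean + data_rng.gauss(0.0, 3.0) for mean in (0.0, 1.0) for _ in range(20)]

runs = [two_class_kmeans(values, seed) for seed in range(5)]
scores = [agreement(runs[i], runs[j]) for i in range(5) for j in range(i + 1, 5)]
```

An agreement of 1.0 means two seeds produced the same partition, while values near 0.5 mean the partitions are unrelated. Tracking the spread of `scores` as the data set grows is one simple way to judge when results have become stable enough to compare parameter settings.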
We had also initially hoped to be able to install XMIPP, the Madrid-based software in which ML3D is embedded, on RPI's Blue Gene. Unfortunately, incompatibility of the Blue Gene's 32-bit architecture with XMIPP, together with memory issues, prevented progress with this particular supercomputer.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants of JOACHIM FRANK
Acquisition of Equipment for Structural Studies of Macromolecular Assemblies Using Cryo-EM
- Grant number: 10635738
- Fiscal year: 2021
- Funding amount: $111,700
- Project category:
Structural Studies of Macromolecular Assemblies Using Cryo-EM
- Grant number: 10552673
- Fiscal year: 2021
- Funding amount: $111,700
- Project category:
Structural Studies of Macromolecular Assemblies Using Cryo-EM
- Grant number: 10335173
- Fiscal year: 2021
- Funding amount: $111,700
- Project category:
Development and Commercialization of a Sample Preparation System for Time Resolved Cryo-Electron Microscopy
- Grant number: 10081915
- Fiscal year: 2020
- Funding amount: $111,700
- Project category:
Development and Commercialization of a Sample Preparation System for Time Resolved Cryo-Electron Microscopy
- Grant number: 10461078
- Fiscal year: 2020
- Funding amount: $111,700
- Project category:
Development and Commercialization of a Sample Preparation System for Time Resolved Cryo-Electron Microscopy
- Grant number: 10231377
- Fiscal year: 2020
- Funding amount: $111,700
- Project category:
STUDIES OF TRANSLATION IN E COLI IN THE PHASES OF INITIATION, DECODING,
- Grant number: 8172266
- Fiscal year: 2010
- Funding amount: $111,700
- Project category:
RECONSTRUCTION FROM HETEROGENEOUS MOLECULE POPULATIONS
- Grant number: 8172273
- Fiscal year: 2010
- Funding amount: $111,700
- Project category:
STUDIES OF TRANSLATION IN E COLI IN THE PHASES OF INITIATION, DECODING,
- Grant number: 7954564
- Fiscal year: 2009
- Funding amount: $111,700
- Project category:
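Similar NSFC (National Natural Science Foundation of China) grants
Research on neuromorphic vision object recognition algorithms driven by spatiotemporal sequences
- Grant number: 61906126
- Approval year: 2019
- Funding amount: 240,000 CNY
- Project category: Young Scientists Fund
Ontology-driven spatial semantic modeling of address data and address matching methods
- Grant number: 41901325
- Approval year: 2019
- Funding amount: 220,000 CNY
- Project category: Young Scientists Fund
Optimized design of address mapping tables and memory access optimization for large-capacity solid-state drives
- Grant number: 61802133
- Approval year: 2018
- Funding amount: 230,000 CNY
- Project category: Young Scientists Fund
Research on IP-address-driven multipath routing and traffic transmission control
- Grant number: 61872252
- Approval year: 2018
- Funding amount: 640,000 CNY
- Project category: General Program
Research on memory safety defense techniques for targets of memory attacks
- Grant number: 61802432
- Approval year: 2018
- Funding amount: 250,000 CNY
- Project category: Young Scientists Fund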
Similar overseas grants
Climate Change Effects on Pregnancy via a Traditional Food
- Grant number: 10822202
- Fiscal year: 2024
- Funding amount: $111,700
- Project category:
NeuroMAP Phase II - Recruitment and Assessment Core
- Grant number: 10711136
- Fiscal year: 2023
- Funding amount: $111,700
- Project category:
Genetic and Environmental Influences on Individual Sweet Preference Across Ancestry Groups in the U.S.
- Grant number: 10709381
- Fiscal year: 2023
- Funding amount: $111,700
- Project category:
A Next Generation Data Infrastructure to Understand Disparities across the Life Course
- Grant number: 10588092
- Fiscal year: 2023
- Funding amount: $111,700
- Project category:
Substance use treatment and county incarceration: Reducing inequities in substance use treatment need, availability, use, and outcomes
- Grant number: 10585508
- Fiscal year: 2023
- Funding amount: $111,700
- Project category: