Deep tensor genomic imputation
Basic information
- Grant number: 10557916
- Principal investigator:
- Amount: $383,800
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2021
- Funding country: United States
- Period: 2021-02-01 to 2025-01-31
- Project status: Not concluded
- Source:
- Keywords: 3-Dimensional; Architecture; Avocado; Awareness; Binding; Biochemical; Biological Assay; Cell Line; Cells; Cellular Assay; Chromatin; Chromatin Interaction Analysis by Paired-End Tag Sequencing; Collection; Complex; Computer software; Couples; DNA; DNA Methylation; DNA Sequence; DNA sequencing; Data; Data Set; Disease; Epigenetic Process; Evaluation; Future; Gene Expression; Genetic Transcription; Genetic Variation; Genome; Genomics; Genotype-Tissue Expression Project; Goals; Health; Hi-C; High-Throughput Nucleotide Sequencing; Human; Individual; Internet; Investigation; Joints; Learning; Machine Learning; Measurement; Measures; Methodology; Methods; Methylation; Modeling; Molecular; Pattern; Positioning Attribute; Process; Property; Regulatory Element; Resolution; Resources; Sampling; Scientist; System; Techniques; Technology; Tissues; Training; United States National Institutes of Health; Untranslated RNA; Validation; Variant; Work; biological systems; cell type; cost; data standards; deep neural network; experimental study; genetic manipulation; genome-wide; genomic data; genomic locus; histone modification; improved; in silico; invention; large datasets; model organism; next generation; open source; predictive modeling; syntax; transcription factor; web portal
Project abstract
Project Summary/Abstract
High-throughput sequencing assays allow scientists to measure biochemical properties like transcription factor
binding, histone modifications, and gene expression in nearly any cell line or primary tissue (“biosample”).
Unfortunately, measuring all possible biochemical properties in every biosample is infeasible, both because of
limited sample availability and because the cost would be prohibitive. We have previously developed a
state-of-the-art imputation method, called Avocado, that can fill in the holes in such data sets. Avocado couples tensor
factorization with a deep neural network. The method is scalable to large data sets and provides more accurate
imputations than competing methods such as ChromImpute or PREDICTD. We have already applied Avocado
systematically to the NIH ENCODE data set and made the imputations publicly available via the ENCODE web portal.
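As a rough illustration of what coupling tensor factorization with a deep neural network can look like, the sketch below gives each biosample, assay, and genomic position its own latent vector, concatenates the three vectors for a requested entry of the tensor, and passes them through a small feed-forward network to predict a signal value. The sizes, the random initialization, and the names `params` and `predict` are assumptions made for illustration only. This is not the Avocado implementation, and the training loop is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_BIOSAMPLES, N_ASSAYS, N_POSITIONS = 4, 3, 100
D_BIO, D_ASSAY, D_POS, D_HIDDEN = 8, 8, 16, 32

params = {
    # One latent vector per index along each tensor axis (the factorization part).
    "biosample": rng.normal(scale=0.1, size=(N_BIOSAMPLES, D_BIO)),
    "assay": rng.normal(scale=0.1, size=(N_ASSAYS, D_ASSAY)),
    "position": rng.normal(scale=0.1, size=(N_POSITIONS, D_POS)),
    # A small feed-forward network (the neural network part).
    "W1": rng.normal(scale=0.1, size=(D_BIO + D_ASSAY + D_POS, D_HIDDEN)),
    "b1": np.zeros(D_HIDDEN),
    "W2": rng.normal(scale=0.1, size=(D_HIDDEN, 1)),
    "b2": np.zeros(1),
}


def predict(params, biosample_idx, assay_idx, position_idx):
    """Predict one value per (biosample, assay, position) index triple."""
    x = np.concatenate(
        [
            params["biosample"][biosample_idx],
            params["assay"][assay_idx],
            params["position"][position_idx],
        ],
        axis=-1,
    )
    hidden = np.maximum(0.0, x @ params["W1"] + params["b1"])  # ReLU
    return (hidden @ params["W2"] + params["b2"]).squeeze(-1)


# Impute a hypothetical unobserved experiment: biosample 2, assay 1, all positions.
positions = np.arange(N_POSITIONS)
track = predict(
    params,
    np.full(N_POSITIONS, 2),
    np.full(N_POSITIONS, 1),
    positions,
)
```

In a trained model of this kind, the latent vectors and network weights would be fitted to the observed experiments, and `track` would then serve as the imputed signal for an experiment that was never performed.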
Here, we propose to extend Avocado in four important ways. First, we will extend Avocado to handle single-cell
data sets, thereby effectively turning each single-cell experiment into an in silico co-assay that measures multiple
properties of each cell in parallel. Second, we will extend Avocado to work with data such as Hi-C, which measures
three-dimensional properties of DNA. The extension involves converting Avocado's 3D tensor (biosample × assay ×
genomic position) to a 4D tensor with two genomic position axes. This extension will apply to a wide variety
of data types, including various types of Hi-C data, SPRITE, GAM, ChIA-PET and PLAC-seq. Third, we will
enhance Avocado to use variant-aware genomic sequence to enable high-resolution imputation of regulatory
profiles. Finally, we will leverage the imputed data to infer cis-regulatory sequence annotations and the molecular
impact of regulatory non-coding variants in one of the most comprehensive collections of cellular contexts.
All of the software produced by this project will be open source, and all of the imputed data and latent
factorizations will be made publicly available via the web portals associated with the NIH 4D Nucleome and
ENCODE Consortia, providing a valuable public resource for users of these data sets.
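To make the move from a 3D tensor to a 4D tensor with two genomic position axes concrete, the sketch below compares the two index shapes and predicts a pairwise matrix for one biosample and assay. It deliberately swaps in a plain multilinear (CP-style) factorization with no neural network, and it assumes both position axes share one table of position factors, which makes the predicted matrix symmetric. All names and sizes are hypothetical, and nothing here is taken from the proposal's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_BIOSAMPLES, N_ASSAYS, N_BINS, D = 2, 2, 50, 8

# 3D case: one value per (biosample, assay, genomic position).
tensor_3d_shape = (N_BIOSAMPLES, N_ASSAYS, N_BINS)
# 4D case: one value per (biosample, assay, position i, position j).
tensor_4d_shape = (N_BIOSAMPLES, N_ASSAYS, N_BINS, N_BINS)

biosample_factors = rng.normal(size=(N_BIOSAMPLES, D))
assay_factors = rng.normal(size=(N_ASSAYS, D))
# Assumption: the same position factors are reused for both position axes.
position_factors = rng.normal(size=(N_BINS, D))


def predict_contacts(biosample, assay):
    """Return an N_BINS x N_BINS matrix of predicted pairwise values."""
    context = biosample_factors[biosample] * assay_factors[assay]  # shape (D,)
    # value[i, j] = sum_d context[d] * position[i, d] * position[j, d]
    return np.einsum("d,id,jd->ij", context, position_factors, position_factors)


contacts = predict_contacts(biosample=0, assay=1)
```

Sharing the position factors across both axes is one way to respect the symmetry of contact data such as Hi-C, where the value for the pair (i, j) equals the value for (j, i).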
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by William Stafford Noble
Learning a latent representation of human genomics using Avocado
- DOI: 10.1101/2020.06.18.159756
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Jacob M. Schreiber; William Stafford Noble
- Corresponding author: William Stafford Noble
Cohesin interacts with a panoply of splicing factors required for cell cycle progression and genomic organization
- DOI:
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Jung‐Sik Kim; Xiaoyuan He; Jie Liu; Z. Duan; Taeyeon Kim; J. Gerard; Brian S. Kim; William Arbuthnot Sir Lane; William Stafford Noble; B. Budnik; T. Waldman
- Corresponding author: T. Waldman
Self‐Reports about Tinnitus and about Cochlear Implants
- DOI: 10.1097/00003446-200008001-00007
- Published: 2000
- Journal:
- Impact factor: 3.7
- Authors: William Stafford Noble
- Corresponding author: William Stafford Noble
A COMPARATIVE ANALYSIS OF THE CLINICAL AND FUNCTIONAL OUTCOME OF HIGH FLEXION AND STANDARD TOTAL KNEE REPLACEMENT PROSTHESIS
- DOI:
- Published: 2015
- Journal:
- Impact factor: 0
- Authors: T. Pramila; Wei Wu; William Stafford Noble; L. Breeden
- Corresponding author: L. Breeden
A biologist's introduction to support vector machines
- DOI:
- Published: 2006
- Journal:
- Impact factor: 0
- Authors: William Stafford Noble
- Corresponding author: William Stafford Noble
Other grants by William Stafford Noble
Optimization and joint modeling for peptide detection by tandem mass spectrometry
- Grant number: 9214942
- Fiscal year: 2017
- Amount: $383,800
- Project category:
Project 2: UW-CNOF Data Analysis and Modeling
- Grant number: 9021413
- Fiscal year: 2015
- Amount: $383,800
- Project category:
University of Washington Center for Nuclear Organization and Function
- Grant number: 9983850
- Fiscal year: 2015
- Amount: $383,800
- Project category:
University of Washington Center for Nuclear Organization and Function
- Grant number: 9353379
- Fiscal year: 2015
- Amount: $383,800
- Project category:
University of Washington Center for Nuclear Organization and Function
- Grant number: 9916567
- Fiscal year: 2015
- Amount: $383,800
- Project category:
Machine learning methods to impute and annotate epigenomic maps
- Grant number: 8814095
- Fiscal year: 2014
- Amount: $383,800
- Project category:
Machine learning methods to impute and annotate epigenomic maps
- Grant number: 8925082
- Fiscal year: 2014
- Amount: $383,800
- Project category:
BIGDATA: DA: Interpreting massive genomic data sets via summarization
- Grant number: 8642168
- Fiscal year: 2013
- Amount: $383,800
- Project category:
BIGDATA: DA: Interpreting massive genomic data sets via summarization
- Grant number: 8840551
- Fiscal year: 2013
- Amount: $383,800
- Project category:
Similar overseas grants
CAREER: Efficient Algorithms for Modern Computer Architecture
- Grant number: 2339310
- Fiscal year: 2024
- Amount: $383,800
- Project category: Continuing Grant
CAREER: Creating Tough, Sustainable Materials Using Fracture Size-Effects and Architecture
- Grant number: 2339197
- Fiscal year: 2024
- Amount: $383,800
- Project category: Standard Grant
Travel: Student Travel Support for the 51st International Symposium on Computer Architecture (ISCA)
- Grant number: 2409279
- Fiscal year: 2024
- Amount: $383,800
- Project category: Standard Grant
Understanding Architecture Hierarchy of Polymer Networks to Control Mechanical Responses
- Grant number: 2419386
- Fiscal year: 2024
- Amount: $383,800
- Project category: Standard Grant
I-Corps: Highly Scalable Differential Power Processing Architecture
- Grant number: 2348571
- Fiscal year: 2024
- Amount: $383,800
- Project category: Standard Grant
Collaborative Research: Merging Human Creativity with Computational Intelligence for the Design of Next Generation Responsive Architecture
- Grant number: 2329759
- Fiscal year: 2024
- Amount: $383,800
- Project category: Standard Grant
Hardware-aware Network Architecture Search under ML Training workloads
- Grant number: 2904511
- Fiscal year: 2024
- Amount: $383,800
- Project category: Studentship
The architecture and evolution of host control in a microbial symbiosis
- Grant number: BB/X014657/1
- Fiscal year: 2024
- Amount: $383,800
- Project category: Research Grant
RACCTURK: Rock-cut Architecture and Christian Communities in Turkey, from Antiquity to 1923
- Grant number: EP/Y028120/1
- Fiscal year: 2024
- Amount: $383,800
- Project category: Fellowship
NSF Convergence Accelerator Track M: Bio-Inspired Surface Design for High Performance Mechanical Tracking Solar Collection Skins in Architecture
- Grant number: 2344424
- Fiscal year: 2024
- Amount: $383,800
- Project category: Standard Grant


















