Novel methods for large-scale genomic interval comparison
Basic information
- Grant number: 10842040
- Principal investigator:
- Amount: $314,700
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-08-10 to 2026-05-31
- Project status: Not concluded
- Source:
- Keywords: Address; Administrative Supplement; Affect; Algorithms; Atlases; Automobile Driving; Award; Binding; Biological; Biomedical Research; Chromatin; Chromosome Territory; Communities; Confusion; Consensus; Consumption; CpG Islands; DNA; DNA Methylation; Data; Data Analyses; Data Reporting; Data Set; Data Sources; Databases; Dedications; Disease; Disease Outcome; Epigenetic Process; Exons; Friends; Gene Expression Regulation; Genes; Genetic; Genetic Variation; Genome; Genomic Segment; Genomics; Goals; Grant; Health; Individual; Introns; Knowledge; Learning; Location; Machine Learning; Measures; Metadata; Methods; Modeling; National Human Genome Research Institute; Outcome; Parents; Procedures; Process; Property; Research Personnel; Resources; Series; Site; Source; Standardization; System; Time; Training; Variant; Work; cell type; cloud based; data sharing; design; epigenome; epigenomics; experimental study; genome analysis; genome resource; histone modification; improved; innovation; machine learning model; novel; parent grant; promoter; transcription factor; user-friendly; vector
Project abstract
ABSTRACT
This administrative supplement creates AI/ML-ready resources for epigenome genomic interval data.
Epigenome data summarized as sets of genomic intervals are now available for thousands of variations of cell type, disease, condition, and so on. These data hold tremendous promise for understanding gene regulation and disease, because many health outcomes are affected by genetic variation or epigenetic perturbation in regulatory DNA. The parent R01 develops novel, scalable algorithms and measures of similarity between genomic interval datasets. These advances will improve both the efficiency and accuracy of existing biomedical research approaches that rely on analyzing genomic region data, and will open the door to new ways of exploring the vast and growing corpus of genomic interval data.
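The abstract does not say which similarity measures the parent R01 develops. As a point of reference only, the sketch below computes a base-pair Jaccard index, a common baseline for comparing two genomic interval sets. The function names and the half-open `(chrom, start, end)` tuple representation are assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed representation: (chromosome, start, end) with half-open coordinates.
Interval = Tuple[str, int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals on the same chromosome."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def covered_bp(intervals: List[Interval]) -> int:
    """Count the base pairs covered by at least one interval."""
    return sum(end - start for _, start, end in merge(intervals))


def jaccard(a: List[Interval], b: List[Interval]) -> float:
    """Base-pair Jaccard index: covered intersection over covered union."""
    union = covered_bp(a + b)
    if union == 0:
        return 0.0
    # Inclusion-exclusion gives the intersection without a separate sweep.
    intersection = covered_bp(a) + covered_bp(b) - union
    return intersection / union
```

Calling `jaccard(set_a, set_b)` on two region sets returns a value between 0 (no shared bases) and 1 (identical coverage).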
In this administrative supplement, we seek to take this rich data source and produce AI/ML-ready resources for the community. While there has been some effort to create uniformly processed databases of genomic interval data, few high-quality genomic interval datasets designed for machine learning applications are currently available.
One of the first steps in integrating epigenome data across data sources is defining consensus regions that fit the original data well. Many downstream analyses, particularly learning tasks, rely on such a consensus region set. However, choosing a good consensus can be a time-consuming and confusing process, and it can lose substantial information and introduce errors into results. To help alleviate this challenge, this proposal will take several datasets through a principled approach to generate AI/ML-ready resources. This process will include 1) defining consensus regions; 2) projecting raw data onto the consensus to standardize it; and 3) standardizing annotation. Finally, we will make these resources available to the community through user-friendly, well-documented interfaces. The outcome will be a series of datasets that are ready for the community to use to build ML models.
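The three steps listed in the abstract can be illustrated with a toy example. This is a minimal sketch under simple assumptions, not the project's method: the consensus is taken as the merged union of all regions, projection produces a binary overlap vector, and annotation is standardized with a hypothetical key-synonym map. All function names are invented for illustration.

```python
from typing import Dict, List, Tuple

# Assumed representation: (chromosome, start, end) with half-open coordinates.
Interval = Tuple[str, int, int]


def consensus_regions(samples: Dict[str, List[Interval]]) -> List[Interval]:
    """Step 1 (assumed strategy): merge the union of all sample regions."""
    pooled = sorted(r for regions in samples.values() for r in regions)
    merged: List[Interval] = []
    for chrom, start, end in pooled:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            merged[-1] = (chrom, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def project(regions: List[Interval], consensus: List[Interval]) -> List[int]:
    """Step 2: mark each consensus region 1 if any sample region overlaps it."""
    return [
        int(any(c == chrom and s < c_end and e > c_start for c, s, e in regions))
        for chrom, c_start, c_end in consensus
    ]


def standardize_annotation(raw: Dict[str, str], synonyms: Dict[str, str]) -> Dict[str, str]:
    """Step 3: normalize metadata keys via a hypothetical synonym map."""
    cleaned = {}
    for key, value in raw.items():
        norm_key = key.strip().lower()
        cleaned[synonyms.get(norm_key, norm_key)] = value.strip()
    return cleaned


if __name__ == "__main__":
    samples = {
        "s1": [("chr1", 100, 200), ("chr1", 500, 600)],
        "s2": [("chr1", 150, 250), ("chr2", 10, 50)],
    }
    consensus = consensus_regions(samples)
    matrix = {name: project(regions, consensus) for name, regions in samples.items()}
    print(consensus)
    print(matrix)
```

Every sample ends up as a fixed-length vector over the same consensus regions, which is the shape most ML models expect. A union-merge consensus is only one option and may fit the original data poorly when regions chain together into very long intervals.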
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants held by Nathan Sheffield
Novel methods for large-scale genomic interval comparison
- Grant number: 10678947
- Fiscal year: 2022
- Funding amount: $314,700
- Project category:
A modular data analysis ecosystem using portable encapsulated projects
- Grant number: 10468680
- Fiscal year: 2018
- Funding amount: $314,700
- Project category:
A modular data analysis ecosystem using portable encapsulated projects
- Grant number: 10019399
- Fiscal year: 2018
- Funding amount: $314,700
- Project category:
A modular data analysis ecosystem using portable encapsulated projects
- Grant number: 9751344
- Fiscal year: 2018
- Funding amount: $314,700
- Project category:
A modular data analysis ecosystem using portable encapsulated projects
- Grant number: 10224819
- Fiscal year: 2018
- Funding amount: $314,700
- Project category:
Similar overseas grants
A Longitudinal Qualitative Study of Fentanyl-Stimulant Polysubstance Use Among People Experiencing Homelessness (Administrative supplement)
- Grant number: 10841820
- Fiscal year: 2023
- Funding amount: $314,700
- Project category:
Proton-secreting epithelial cells as key modulators of epididymal mucosal immunity - Administrative Supplement
- Grant number: 10833895
- Fiscal year: 2023
- Funding amount: $314,700
- Project category:
Administrative Supplement: Life-Space and Activity Digital Markers for Detection of Cognitive Decline in Community-Dwelling Older Adults: The RAMS Study
- Grant number: 10844667
- Fiscal year: 2023
- Funding amount: $314,700
- Project category:
StrokeNet Administrative Supplement for the Funding Extension
- Grant number: 10850135
- Fiscal year: 2023
- Funding amount: $314,700
- Project category:
2023 NINDS Landis Mentorship Award - Administrative Supplement to NS121106 Control of Axon Initial Segment in Epilepsy
- Grant number: 10896844
- Fiscal year: 2023
- Funding amount: $314,700
- Project category:
Biomarkers of Disease in Alcoholic Hepatitis Administrative Supplement
- Grant number: 10840220
- Fiscal year: 2023
- Funding amount: $314,700
- Project category:
Administrative Supplement: Improving Inference of Genetic Architecture and Selection with African Genomes
- Grant number: 10891050
- Fiscal year: 2023
- Funding amount: $314,700
- Project category:
Power-Up Study Administrative Supplement to Promote Diversity
- Grant number: 10711717
- Fiscal year: 2023
- Funding amount: $314,700
- Project category:
Administrative Supplement for Peer-Delivered and Technology-Assisted Integrated Illness Management and Recovery
- Grant number: 10811292
- Fiscal year: 2023
- Funding amount: $314,700
- Project category:
Sedentary behavior, physical activity, and 24-hour behavior in pregnancy and offspring health: the Pregnancy 24/7 Offspring Study Administrative Supplement
- Grant number: 10893074
- Fiscal year: 2023
- Funding amount: $314,700
- Project category: