BSF:2012304:Methods for Preprocessing Population Sequence Data
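Basic information
- Award number: 1331176
- Principal investigator:
- Amount: $40,000
- Host institution:
- Host institution country: United States
- Project category: Standard Grant
- Fiscal year: 2013
- Funding country: United States
- Duration: 2013-09-01 to 2018-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract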
This project is funded as part of the United States-Israel Collaboration in Computer Science (USICCS) program. Through this program, NSF and the United States-Israel Binational Science Foundation (BSF) jointly support collaborations among US-based researchers and Israel-based researchers.
In recent years, many genetic studies have been performed, revealing many new associations between human genetic variation and complex diseases. These studies, referred to as genome-wide association studies, are limited to common genetic variants because the technology used to collect the genetic variation could capture only common variants. There is evidence suggesting that rare variants play an important role in disease architectures. Recently, sequencing technologies have been introduced that are capable of collecting both common and rare genetic variation. Sequencing technologies generate enormous amounts of data, raising new computational challenges. In this project, the PIs will develop methods for addressing these computational challenges, including the design of efficient algorithms and the modeling of the sequencing process. In addition, the researchers will develop methods for incorporating rare variants into the analysis of genetic studies. The immediate broader impact of our project is the availability of these tools for general use by geneticists, leading to an improved understanding of disease genetics. In particular, the PIs will apply their methods to studies of non-Hodgkin's lymphoma, bipolar disorder, dyslipidemia, neurodegenerative dementia, and Tourette syndrome, which will have a direct impact on our understanding of these particular conditions.
Computational methods for the analysis of sequencing data exist; however, they are limited to the analysis of a single sample. In this project, the PIs will design efficient computational methods for the analysis of sequence data across a population. For population samples, the tremendous size of the data requires the design of algorithms that are highly efficient in terms of memory and runtime. Specifically, the PIs propose to design algorithms for the compression of sequencing data, for the search of regions identical by descent across multiple samples, and for high-resolution haplotype inference from sequence data. The PIs will explicitly model rare variants and the sequencing process, and use machine learning techniques and convex optimization to estimate the model parameters efficiently. These methods will allow for a fine-scale analysis of population data, resulting in improved understanding of complex diseases and human history. The collaborative nature of the project will expose the students involved to the medical and genetics worlds, both in Israel and in the US, and it will improve their abilities to design and implement solutions to complex algorithmic problems. The methods developed in this project will be part of the teaching material of courses at UCLA and Tel-Aviv, and these materials will be made publicly available.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Eleazar Eskin
Improving the usability and archival stability of bioinformatics software
- DOI: 10.1186/s13059-019-1649-8
- Published: 2019-02-27
- Journal:
- Impact factor: 9.400
- Authors: Serghei Mangul; Lana S. Martin; Eleazar Eskin; Ran Blekhman
- Corresponding author: Ran Blekhman
Systematic benchmarking of omics computational tools
- DOI: 10.1038/s41467-019-09406-4
- Published: 2019-03-27
- Journal:
- Impact factor: 15.700
- Authors: Serghei Mangul; Lana S. Martin; Brian L. Hill; Angela Ka-Mei Lam; Margaret G. Distler; Alex Zelikovsky; Eleazar Eskin; Jonathan Flint
- Corresponding author: Jonathan Flint
Discrete profile comparison using information bottleneck
- DOI: 10.1186/1471-2105-7-s1-s8
- Published: 2006-03-20
- Journal:
- Impact factor: 3.300
- Authors: Sean O'Rourke; Gal Chechik; Robin Friedman; Eleazar Eskin
- Corresponding author: Eleazar Eskin
MEF: Malicious Email Filter - A UNIX Mail Filter That Detects Malicious Windows Executables
- DOI:
- Published: 2001
- Journal:
- Impact factor: 0
- Authors: M. Schultz; Eleazar Eskin; E. Zadok; Manasi Bhattacharyya; Salvatore J. Stolfo
- Corresponding author: Salvatore J. Stolfo
Dealing with large diagonals in kernel matrices
- DOI: 10.1007/bf02530507
- Published: 2003-06-01
- Journal:
- Impact factor: 0.600
- Authors: Jason Weston; Bernhard Schölkopf; Eleazar Eskin; Christina Leslie; William Stafford Noble
- Corresponding author: William Stafford Noble
Other grants by Eleazar Eskin
III: Medium: Causal inference in biobanks: Leveraging genetics to infer causal relationships using electronic health records
- Award number: 2106908
- Fiscal year: 2021
- Amount: $40,000
- Project category: Continuing Grant
III: Small: Replication Studies for High Dimensional Data: Insights into Confounding and Heterogeneity
- Award number: 1910885
- Fiscal year: 2019
- Amount: $40,000
- Project category: Continuing Grant
III: Medium: Detecting Low Dimensional Structures in Genomic Data
- Award number: 1705197
- Fiscal year: 2017
- Amount: $40,000
- Project category: Standard Grant
III: Small: Causal and Statistical Inference in the Presence of Confounding Factors
- Award number: 1320589
- Fiscal year: 2013
- Amount: $40,000
- Project category: Standard Grant
III: Medium: Meta-analysis reinterpreted using causal graphs
- Award number: 1302448
- Fiscal year: 2013
- Amount: $40,000
- Project category: Continuing Grant
III: Medium: Private Identification of Relatives and Private GWAS: First Steps in the New Field of CryptoGenomics
- Award number: 1065276
- Fiscal year: 2011
- Amount: $40,000
- Project category: Standard Grant
III: Small: Inference of Causal Regulatory Relationships from Genetic Studies
- Award number: 0916676
- Fiscal year: 2009
- Amount: $40,000
- Project category: Continuing Grant
Collaborative Research: Design and Analysis of Compressed Sensing DNA Microarrays
- Award number: 0729049
- Fiscal year: 2007
- Amount: $40,000
- Project category: Continuing Grant
Collaborative Research: SEIII: Estimating Haplotype Frequencies
- Award number: 0731455
- Fiscal year: 2007
- Amount: $40,000
- Project category: Standard Grant
Collaborative Research: SEIII: Estimating Haplotype Frequencies
- Award number: 0513612
- Fiscal year: 2005
- Amount: $40,000
- Project category: Standard Grant


















