IIBR: Informatics: Toward an Automated RNA-seq Bioinformatician
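Basic information
- Award number: 1937540
- Principal investigator:
- Award amount: $546,100
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2020
- Funding country: United States
- Project period: 2020-06-01 to 2023-05-31
- Project status: Completed
- Source:
- Keywords:
Project abstract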
Measurement of gene expression (which genes are active in which conditions) is an indispensable tool for understanding biological systems. Analysis of gene expression from modern genomic sequencing technologies requires the use of sophisticated software such as read mappers, transcript assemblers, and expression abundance estimators. A software program implementing one of these steps typically has a large number of user-settable parameters that influence how the analysis algorithm performs. Scientists, biologists, and clinical researchers must often tune these parameters by hand or through other ad hoc means. The goal of this project is to automate this process by designing and implementing a framework for automatically learning high-performing parameters for gene expression analysis software. This project also aims to develop algorithms, software, and methodology to make this framework practical and useful. This will allow more researchers to obtain high-quality gene expression analyses with significantly less effort and will also enable improved analysis of large data sets where per-sample parameter tuning by hand is impractical. Reproducibility of biological results will also be enhanced, since the choice of parameters is explicitly ceded to an automated, repeatable process. This research will make biological studies involving gene expression more accurate and less costly. A number of educational and outreach activities for various levels of students (elementary through undergraduate) are planned to enhance community understanding of gene expression and its analysis.

The developed processes will be implemented in several wrapper tools for parameter optimization that can be dropped into existing RNA-seq analysis pipelines to improve accuracy at each step. The research to design these tools will be broken down into several more tractable steps. The first step will be learning, for each tool, a collection of representative parameter vectors by analyzing large collections of existing RNA-seq samples. In the second step, machine learning methods, based on a combination of techniques such as Bayesian optimization, genetic algorithms, and classification approaches, will be used to design techniques to select parameter vectors from these sets that are predicted to offer high performance. In the third step, techniques for providing human-interpretable rationales for the automatic parameter choices will be designed and implemented. The design of this system will also enhance our practical knowledge of techniques for such parameter optimization in other application domains within biology. Results from the project can be found ...

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
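The first two steps of the abstract can be illustrated with a minimal sketch. It is not the project's actual method: it swaps in a plain greedy cover for step one and a simple arg-max over a user-supplied quality estimate for step two, in place of the Bayesian optimization, genetic algorithm, or classification approaches the abstract mentions. The names `greedy_representatives`, `advise`, `estimate_quality`, and the `min_cov` option are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical representation of a parameter vector: option name -> value.
Params = Dict[str, float]


def greedy_representatives(scores: Sequence[Sequence[float]], k: int) -> List[int]:
    """Step 1 (sketch): choose up to k candidate parameter vectors.

    scores[i][j] is an assumed, precomputed quality of candidate j on
    training sample i. Each round greedily adds the candidate that most
    increases the total, over samples, of the best score reached so far.
    """
    n_candidates = len(scores[0])
    chosen: List[int] = []
    best = [float("-inf")] * len(scores)  # best score per sample so far
    for _ in range(min(k, n_candidates)):
        remaining = [j for j in range(n_candidates) if j not in chosen]
        j_best = max(
            remaining,
            key=lambda j: sum(max(b, row[j]) for b, row in zip(best, scores)),
        )
        chosen.append(j_best)
        best = [max(b, row[j_best]) for b, row in zip(best, scores)]
    return chosen


def advise(
    sample: object,
    representatives: Sequence[Params],
    estimate_quality: Callable[[object, Params], float],
) -> Params:
    """Step 2 (sketch): pick the representative with the highest predicted
    quality for a new sample. estimate_quality is a stand-in for whatever
    learned predictor or accuracy estimator is available."""
    return max(representatives, key=lambda p: estimate_quality(sample, p))


if __name__ == "__main__":
    # Toy data: 3 training samples (rows) x 3 candidate vectors (columns).
    toy_scores = [[0.9, 0.2, 0.5], [0.1, 0.8, 0.5], [0.2, 0.7, 0.6]]
    print(greedy_representatives(toy_scores, 2))
```

In this sketch the representative set is learned once from existing samples, and only the cheap `advise` call runs per new sample, which is the property that makes per-sample tuning feasible on large data sets.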
Project outputs
Journal articles (12)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Revisiting the complexity of and algorithms for the graph traversal edit distance and its variants
- DOI: 10.1186/s13015-024-00262-6
- Publication date: 2024-04-29
- Journal:
- Impact factor: 1
- Authors: Qiu, Yutong; Shen, Yihang; Kingsford, Carl
- Corresponding author: Kingsford, Carl
Reinforcement Learning for Robotic Liquid Handler Planning
- DOI:
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Ferdosi, Mohsen; Ge, Yuejun; Kingsford, Carl
- Corresponding author: Kingsford, Carl
How much data is sufficient to learn high-performing algorithms? Generalization guarantees for data-driven algorithm design
- DOI: 10.1145/3406325.3451036
- Publication date: 2021-06
- Journal:
- Impact factor: 0
- Authors: Maria-Florina Balcan; Dan F. DeBlasio; Travis Dick; Carl Kingsford; T. Sandholm; Ellen Vitercik
- Corresponding authors: Maria-Florina Balcan; Dan F. DeBlasio; Travis Dick; Carl Kingsford; T. Sandholm; Ellen Vitercik
Computationally Efficient High-Dimensional Bayesian Optimization via Variable Selection
- DOI:
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Shen, Yihang; Kingsford, Carl
- Corresponding author: Kingsford, Carl
Optimizing Dynamic Structures with Bayesian Generative Search
- DOI:
- Publication date: 2020-07
- Journal:
- Impact factor: 0
- Authors: Minh Hoang; Carleton Kingsford
- Corresponding authors: Minh Hoang; Carleton Kingsford
Other grants by Carleton Kingsford
Conference: NSF-NIH Joint Workshop on Foundational AI in Biology
- Award number: 2325301
- Fiscal year: 2023
- Award amount: $546,100
- Award type: Standard Grant
III: Small: Expressiveness of Genome Graphs: Construction, Comparison, and Heterogeneity
- Award number: 2232121
- Fiscal year: 2023
- Award amount: $546,100
- Award type: Standard Grant
Workshop on Future Directions for Algorithms in Biology
- Award number: 1748493
- Fiscal year: 2017
- Award amount: $546,100
- Award type: Standard Grant
AF: Small: Multiscale Spectral Signatures for Local and Multi-objective Biological Network Alignment
- Award number: 1319998
- Fiscal year: 2013
- Award amount: $546,100
- Award type: Standard Grant
CAREER: Model-based Reconstruction of Ancient Biological Networks
- Award number: 1256087
- Fiscal year: 2012
- Award amount: $546,100
- Award type: Continuing Grant
CAREER: Model-based Reconstruction of Ancient Biological Networks
- Award number: 1053918
- Fiscal year: 2011
- Award amount: $546,100
- Award type: Continuing Grant
Similar overseas grants
REU Site: Program for Access to Training in Health Informatics (PATHI)
- Award number: 2348793
- Fiscal year: 2024
- Award amount: $546,100
- Award type: Standard Grant
Travel: IEEE International Conference on Healthcare Informatics (IEEE ICHI 2024) Doctoral Consortium Travel Scholarship
- Award number: 2414093
- Fiscal year: 2024
- Award amount: $546,100
- Award type: Standard Grant
Reliable Tensor-Network Fusion Approach to Medical Informatics: Novel Techniques and Benchmarks
- Award number: 24K03005
- Fiscal year: 2024
- Award amount: $546,100
- Award type: Grant-in-Aid for Scientific Research (B)
Development of Informatics Materials with an Awareness of the High School-University Connection and a Learning Support Environment for Data-Driven Instruction
- Award number: 23H01019
- Fiscal year: 2023
- Award amount: $546,100
- Award type: Grant-in-Aid for Scientific Research (B)
Travel: NSF Student Travel Grant for 2023 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI)
- Award number: 2331680
- Fiscal year: 2023
- Award amount: $546,100
- Award type: Standard Grant
CAREER: Transforming Personal Informatics Systems to Support Routine Transitions in Healthy Eating
- Award number: 2414270
- Fiscal year: 2023
- Award amount: $546,100
- Award type: Continuing Grant
Pioneering Research of Industrial Materials Informatics for Innovative Lithium Battery Anodes
- Award number: 23K18465
- Fiscal year: 2023
- Award amount: $546,100
- Award type: Grant-in-Aid for Challenging Research (Exploratory)
Categorical Duality and Semantics Across Mathematics, Informatics and Physics and their Applications to Categorical Machine Learning and Quantum Computing
- Award number: 23K13008
- Fiscal year: 2023
- Award amount: $546,100
- Award type: Grant-in-Aid for Early-Career Scientists
ACTS (AD Clinical Trial Simulation): Developing Advanced Informatics Approaches for an Alzheimer's Disease Clinical Trial Simulation System
- Award number: 10753675
- Fiscal year: 2023
- Award amount: $546,100
- Award type:
CAREER: Transforming Personal Informatics Systems to Support Routine Transitions in Healthy Eating
- Award number: 2239727
- Fiscal year: 2023
- Award amount: $546,100
- Award type: Continuing Grant