Transfer Rule Learning for Knowledge Based Biomarker Discovery and Predictive Bio

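Basic Information

Grant number:
8373065
Principal investigator:
Vanathi Gopalakrishnan
Amount:
about $299,700
Host institution:
UNIVERSITY OF PITTSBURGH AT PITTSBURGH
Host institution country:
United States
Project category:
Fiscal year:
2012
Funding country:
United States
Project period:
2012-09-24 to 2015-07-31
Project status:
Completed

Source:
https://reporter.nih.gov/project-details/8373065
Keywords: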
Address; Algorithms; Biological; Biological Markers; Cancer Detection; Cancer Prognosis; Cause of Death; Characteristics; Classification; Clinical; Clinical Data; Clinical Research; Collaborations; Computing Methodologies; Data; Data Set; Data Sources; Development; Disease; Early Diagnosis; Evaluation; Gene Expression; Glean; Goals; Institution; Internet; Knowledge; Lead; Learning; Machine Learning; Malignant Neoplasms; Malignant neoplasm of lung; Maps; Measurement; Measures; Methodology; Methods; MicroRNAs; Modeling; Molecular Profiling; Monitor; Outcome; Outcomes Research; Performance; Proteomics; Protocols documentation; Psychological Transfer; Publishing; Sample Size; Sampling; Sampling Studies; Screening procedure; Seeds; Serum; Solutions; Source; Structure; Testing; United States; Validation; Work; base; cancer proteomics; cost; data mining; design; improved; insight; interest; knowledge base; lung cancer screening; new technology; novel; novel diagnostics; outcome forecast; predictive modeling; research study; text searching; tool

Project Abstract

DESCRIPTION (provided by applicant): Predictive modeling of biomedical data arising from clinical studies for early detection, monitoring and prognosis of diseases is a crucial step in biomarker discovery. Since the data are typically measurements subject to error, and the sample size of any study is very small compared to the number of variables measured, the validity and verification of models arising from such data sets significantly impact the discovery of reliable discriminatory markers for a disease. An important opportunity to make the most of these scarce data is to combine information from multiple related data sets for more effective biomarker discovery. Because the costs of creating large data sets for every disease of interest are likely to remain prohibitive, methods for making more effective use of related biomarker discovery data sets continue to be important.
Solution: This project develops and applies Transfer Rule Learning (TRL), a novel framework for integrative biomarker discovery from related but separate data sets, such as those generated from similar biomarker profiling studies. TRL alleviates the problem of data scarcity by providing automated ways to express, verify and use prior hypotheses generated from one data set while learning new knowledge via a related data set. This is the first study of transfer learning for biomarker discovery. Unlike other transfer learning approaches, TRL takes knowledge in the form of interpretable, modular classification rules, and uses them to seed learning of a rule model on a new data set. Classification rules simplify the extraction of discriminatory markers, and have been used successfully for biomarker discovery and verification in a non-integrative fashion.
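The idea of seeding a rule model with rules learned elsewhere can be illustrated with a toy sketch. This is not the project's TRL algorithm, which the abstract does not specify. It assumes single-condition rules of the form IF feature == value THEN label, discretized marker values, and hypothetical names throughout (`score_rule`, `learn_rules`, `marker_a`, `marker_b`, `marker_c`, the support and confidence thresholds).

```python
from collections import Counter, defaultdict

# A rule is a tuple (feature, value, label): "IF feature == value THEN label".
# Data are lists of (features_dict, label) pairs. All names are illustrative.

def score_rule(rule, data):
    """Return (support, confidence) of a rule on a data set."""
    feature, value, label = rule
    covered = [y for x, y in data if x.get(feature) == value]
    if not covered:
        return 0, 0.0
    return len(covered), sum(y == label for y in covered) / len(covered)

def learn_rules(data, min_support=2, min_conf=0.8, seeds=()):
    """Learn single-condition rules; seed rules are verified on the data first."""
    rules = []
    for rule in seeds:
        # Assumption: a seed needs only one supporting sample, a toy way of
        # letting prior knowledge compensate for scarce target data.
        support, conf = score_rule(rule, data)
        if support >= 1 and conf >= min_conf:
            rules.append(rule)
    seeded = {(f, v) for f, v, _ in rules}
    # Count class labels for every (feature, value) condition in the data.
    counts = defaultdict(Counter)
    for x, y in data:
        for f, v in x.items():
            counts[(f, v)][y] += 1
    for (f, v), label_counts in sorted(counts.items()):
        if (f, v) in seeded:
            continue  # already covered by a verified seed rule
        label, hits = label_counts.most_common(1)[0]
        total = sum(label_counts.values())
        if total >= min_support and hits / total >= min_conf:
            rules.append((f, v, label))
    return rules

# Hypothetical source study: larger, measures marker_a and marker_b.
source = [
    ({"marker_a": "high", "marker_b": "low"}, "case"),
    ({"marker_a": "high", "marker_b": "high"}, "case"),
    ({"marker_a": "high", "marker_b": "low"}, "case"),
    ({"marker_a": "low", "marker_b": "high"}, "control"),
    ({"marker_a": "low", "marker_b": "low"}, "control"),
    ({"marker_a": "low", "marker_b": "high"}, "control"),
]
# Hypothetical target study: small, measures marker_a and marker_c.
target = [
    ({"marker_a": "high", "marker_c": "low"}, "case"),
    ({"marker_a": "low", "marker_c": "high"}, "control"),
    ({"marker_a": "low", "marker_c": "high"}, "control"),
]

source_rules = learn_rules(source)
baseline = learn_rules(target)                      # no transfer
transferred = learn_rules(target, seeds=source_rules)  # seeded with source rules
print("without transfer:", baseline)
print("with transfer:   ", transferred)
```

In this toy data the target set alone has too few "case" samples to support a rule for that class, while the seeded run keeps the source rule for `marker_a == "high"` because it still holds on the target data.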
Specific Aims: This project tests the main hypothesis that TRL provides a mechanism for transfer learning of classification rules between related source and target data sets that improves performance on the target data, compared to learning without transfer. TRL will be evaluated using cross-validation performance of classification accuracy and transfer measures, on related groups of existing biomarker discovery data sets obtained from multiple experimental platforms for lung cancer detection and prognosis. A new set of independent validation data will be generated for early detection of lung cancer to test the models generated on pilot data. Insights into the impact of different modeling algorithms on transfer outcomes will be gleaned.
Significance: The TRL framework and tool are important for combined analysis and interpretation of clinical data, as they support incremental building, verification and refinement of rule models for predictive biomedicine. The application of TRL to real-world biomarker discovery data sets can yield insights into novel interactions involving known markers, and the most reliable biomarkers for early detection of disease, particularly lung cancer. This project has the potential to help create new diagnostic screening tools for lung cancer detection. It provides a foundational understanding of the use of transfer learning for integrative biomarker discovery that could lead to novel technologies for combining information from data and prior knowledge.
PUBLIC HEALTH RELEVANCE: This project will develop highly needed computational methods for integrative biomarker discovery from related but separate data sets produced by predictive molecular profiling studies of disease. It will generate new experimental data for early detection of lung cancer, and has the potential to help create new diagnostic screening tools for lung cancer, a leading cause of death from cancer in the United States.
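The evaluation outlined under Specific Aims, comparing learning with transfer against learning without it under cross-validation, might look roughly like the following sketch. Everything here is an assumption made for illustration: the learners are deliberately trivial, the plain accuracy difference stands in for the "transfer measures" that the abstract does not define, and the names (`k_fold_accuracy`, `fit_majority`, `make_seeded_fit`, `marker_a`) are hypothetical.

```python
from collections import Counter

def k_fold_accuracy(fit, data, k=3):
    """Mean accuracy over k folds; fit(train) must return predict(features)."""
    rows = list(data)
    # Folds are taken by striding, without shuffling, so the sketch is deterministic.
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        predict = fit(train)
        scores.append(sum(predict(x) == y for x, y in test) / len(test))
    return sum(scores) / len(scores)

def fit_majority(train):
    """No-transfer baseline: always predict the most common training label."""
    majority = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: majority

def make_seeded_fit(prior_rule):
    """Build a learner that applies one transferred rule, else the baseline."""
    feature, value, label = prior_rule
    def fit(train):
        fallback = fit_majority(train)
        return lambda x: label if x.get(feature) == value else fallback(x)
    return fit

# Hypothetical target data set with a single discretized marker.
target = [
    ({"marker_a": "high"}, "case"), ({"marker_a": "low"}, "control"),
    ({"marker_a": "high"}, "case"), ({"marker_a": "low"}, "control"),
    ({"marker_a": "high"}, "case"), ({"marker_a": "low"}, "control"),
]

acc_without = k_fold_accuracy(fit_majority, target)
acc_with = k_fold_accuracy(make_seeded_fit(("marker_a", "high", "case")), target)
transfer_gain = acc_with - acc_without  # stand-in transfer measure (assumption)
print(f"without transfer: {acc_without:.2f}")
print(f"with transfer:    {acc_with:.2f}")
print(f"gain:             {transfer_gain:+.2f}")
```

Because the transferred rule is correct for every sample it covers in this toy data and both learners share the same fallback, the seeded learner can never score below the baseline here. On real data a transferred rule may not hold, which is why the comparison against a no-transfer baseline matters.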
