
Reinforcement Learning in Dynamic Treatment Regimes: Dealing with scarce data, safe exploration, and explainability.


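Basic information

Grant number:
2606309
Principal investigator:
Amount:
--
Host institution:
University of Warwick
Host institution country:
United Kingdom
Project category:
Studentship
Fiscal year:
2021
Funding country:
United Kingdom
Start and end dates:
2021 to (no data)
Project status:
Not concluded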

Source:
https://gtr.ukri.org/projects?ref=studentship-2606309
Keywords:
Reinforcement Learning; Dynamic Treatment Regimes

Project abstract

In light of the rapid increase in accessible clinical data and the growing interest in personalized medicine among clinical scientists, there is an unprecedented opportunity to improve the quality of Dynamic Treatment Regimes (DTRs) using available clinical data. This is especially true for chronic diseases, where treatment must adapt to the evolving illness and the unique response of each patient, and where data-driven methods are a valuable tool for enhancing dynamic treatment. DTRs generalize personalized treatment to a time-varying setting, in which the treatment at each stage is tailored to the historical and up-to-date clinical information of each patient. The goal of DTRs is to improve patient outcomes through adaptive treatment, supporting the clinical decision support systems that lie at the heart of the chronic care model.

A DTR can be formulated as a sequential decision-making problem with a time-varying (dynamic) state, in which the decision rule in each state depends on the patient's treatment history and latest information. This formulation provides a powerful tool for handling the chronic and individual condition of each patient. Because DTRs can be treated as sequential decision-making problems, reinforcement learning (RL) is one of the most appropriate methods for them. Recently, RL techniques such as Q-learning have been studied in the DTR literature [1] and have yielded promising results. However, strict requirements such as safety and interpretability remain major challenges when applying RL to DTRs and to healthcare in general.

In this project we aim to address the following research questions:
1. Could learning from the treatment results of groups of patients be used to facilitate the training process for the future treatment of a patient with similar health conditions?
2. Could the prior knowledge and insight of experts be integrated into the learning process of recommendation systems?
3. Could exploration be guided to guarantee both the safety of the treatment and the improvement of treatment strategies?
4. If the method includes function approximation, how could newly discovered treatments be interpreted and verified?

To answer these research questions, we rely on a novel combination of transfer learning, safe exploration, and interactive reinforcement learning.

Alignment with EPSRC research themes: the project is very well aligned with Artificial Intelligence and Robotics. It is also related to the Healthcare Technologies theme. Finally, it is within the scope of the Information and Communication Technologies theme.
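For readers unfamiliar with how a DTR maps onto a sequential decision-making problem, the following is a minimal sketch of tabular Q-learning on a two-stage toy problem. It is not the project's method: the patient states, treatments, transition probabilities, rewards, and hyperparameters are all invented for illustration, and the function names (`step`, `q_learning`, `greedy_policy`) are hypothetical.

```python
import random

# Hypothetical toy problem: 2 decision stages, 2 patient states, 2 treatments.
STATES = ["stable", "unstable"]
ACTIONS = ["low_dose", "high_dose"]
N_STAGES = 2


def step(state, action, rng):
    """Made-up patient response model: returns (next_state, reward)."""
    if state == "unstable":
        p_stable = 0.8 if action == "high_dose" else 0.3
    else:
        p_stable = 0.9 if action == "low_dose" else 0.6
    next_state = "stable" if rng.random() < p_stable else "unstable"
    reward = 1.0 if next_state == "stable" else 0.0
    return next_state, reward


def q_learning(episodes=20000, alpha=0.02, gamma=1.0, epsilon=0.2, seed=0):
    """Learn a stage-indexed Q-table with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(t, s, a): 0.0
         for t in range(N_STAGES) for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        state = rng.choice(STATES)  # each simulated patient starts at random
        for t in range(N_STAGES):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(t, state, a)])  # exploit
            next_state, reward = step(state, action, rng)
            # No future value after the final treatment stage.
            if t + 1 < N_STAGES:
                future = max(q[(t + 1, next_state, a)] for a in ACTIONS)
            else:
                future = 0.0
            target = reward + gamma * future
            q[(t, state, action)] += alpha * (target - q[(t, state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """Decision rule per (stage, state): the treatment with the highest Q-value."""
    return {(t, s): max(ACTIONS, key=lambda a: q[(t, s, a)])
            for t in range(N_STAGES) for s in STATES}


if __name__ == "__main__":
    policy = greedy_policy(q_learning())
    for (stage, state), treatment in sorted(policy.items()):
        print(f"stage {stage}, patient {state}: recommend {treatment}")
```

Note that the epsilon-greedy exploration used here is exactly the kind of unconstrained trial-and-error that would be unsafe on real patients, which is why the abstract singles out safe exploration as a research question; the lookup table also sidesteps the interpretability issue that arises once function approximation is introduced.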
