Nonparametric Learning for Situated Data-to-Text Generation: Helping People to Understand Uncertain Data

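Basic information

Grant number:
EP/L026775/1
Principal investigator:
Verena Rieser
Amount:
$125,400
Host institution:
Heriot-Watt University
Host institution country:
United Kingdom
Project category:
Research Grant
Fiscal year:
2014
Funding country:
United Kingdom
Start and end dates:
2014 to (no data)
Project status:
Completed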

Source:
https://gtr.ukri.org/projects?ref=EP%2FL026775%2F1
Keywords:
Nonparametric Learning Situated Data Text

Project abstract

Information overload is a pervasive problem in many environments, particularly those in which human decision making is based on extensive data sets. Data-to-text systems have been shown to successfully address this problem by automatically generating textual descriptions of the underlying data. However, when translating (numerical) data into words, an appropriate level of precision needs to be chosen. The following example is from a system which summarises medical time series data for neonatal care: "At 17:24 T1 is 35.7 and T2 is 34.5C" (Gatt et al., 2009). This summary is clearly targeted at experts, such as doctors or nurses, who need precise information for decision making. However, other users, such as visiting parents, might be happier with a description such as "In the evening your baby had normal temperature." In this project, we will build a data-to-text system that automatically determines the appropriate level of precision for a given context by using statistical machine learning methods. These methods can learn an optimal generation policy from real data and promise to be more robust to new situations than rules hand-written by human experts. We will also investigate novel feedback-based non-parametric state estimation methods to reduce the data annotation cost for data-to-text systems. Typically, the first step in creating such systems is to manually interpret and align the raw data sources. However, this step is very costly, as human experts need to be trained for this task. Our new methods promise to allow data-to-text systems to be rapidly applied to new domains. The domain we will be targeting for this initial project is pedestrian navigation, where the task is to translate uncertain user positions into walking instructions. The underlying data uncertainty here arises from several sources, such as the user's speech signal, the GPS location, estimated viewshed, walking direction and speed.
We will test our learnt data-to-text generation strategy by integrating it into an existing system and running an evaluation with real users. One of the outcomes of this project is a data-driven linguistic view on the question of "how to communicate uncertainty", which is an active interdisciplinary research area, including researchers from medicine, law, environmental modelling and climate change. In future work, we will also investigate how the proposed framework transfers to new domains, such as natural language generation from medical data, weather forecasts, or output from complex environmental models.

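Project outputs

Number of journal articles (10)

Number of monographs (0)

Number of research awards (0)

Number of conference papers (0)

Number of patents (0)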

Generating and Evaluating Landmark-based Navigation Instructions in Virtual Environments

DOI:
Publication date:
2015
Journal:
Impact factor:
0
Authors:
Cercas Curry A.
Corresponding author:
Cercas Curry A.

Information density and overlap in spoken dialogue

DOI:
10.1016/j.csl.2015.11.001
Publication date:
2016-05
Journal:
Comput. Speech Lang.
Impact factor:
0
Authors:
Nina Dethlefs;H. Hastie;H. Cuayáhuitl;Yanchao Yu;Verena Rieser;Oliver Lemon
Corresponding authors:
Nina Dethlefs;H. Hastie;H. Cuayáhuitl;Yanchao Yu;Verena Rieser;Oliver Lemon

How to talk to strangers: Generating medical reports for first-time users