Prodigy: Probabilistic Deep Generation

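Basic information

  • Grant number:
    EP/E029116/1
  • Principal investigator:
  • Amount:
    $269,100
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project category:
    Research Grant
  • Fiscal year:
    2007
  • Funding country:
    United Kingdom
  • Duration:
    2007 to (no data)
  • Project status:
    Completed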

Project abstract

Computational methods for generating language are lagging behind computational methods for analysing language in several ways, most obviously in that they are not used commercially. The main reasons are that systems for generating language take inordinate amounts of time to build, yet once built cannot be reused, and tend to be severely lacking in language variation, something that is easily perceived as a lack of quality. The current situation in language generation research is reminiscent of language analysis research in the late 1980s, when symbolic and statistical methods briefly formed entirely separate research paradigms. Language analysis soon moved towards a paradigm merger, realising that symbolic methods lacked the efficiency and robustness that probabilistic methods could provide, which in turn would benefit from the accuracy and subtlety of symbolic methods. A similar development is currently underway in the field of machine translation where - after several years of purely statistical methods dominating the field - researchers are now beginning to bring linguistic knowledge back in. The experience from these research fields suggests that higher quality can be achieved when the symbolic and statistical paradigms join forces. Recent research shows that this is likely to be true for language generation too. The purpose of the Prodigy project is to develop, for the first time, a comprehensive, linguistically informed, probabilistic methodology for generating language that substantially improves development time, reusability and language variation in language generation systems, and thereby enhances their commercial viability. Taking the principal investigator's previous EPSRC-funded research on probabilistic NLG as a starting point, the Prodigy project will explore whether the combination of the probabilistic and the linguistic can be as beneficial for the field of language generation as it has been for language analysis. 
We will focus on two aspects in particular: (i) developing reusable data representation and encoding strategies, and (ii) developing specific probabilistic techniques for guiding language generation processes. We will test and evaluate our representations and techniques on five different data sets which have been collected from real-world text production tasks and include weather forecasts, descriptions of museum exhibits, and nurses' reports. The Prodigy project will produce research outcomes that are of potential benefit to industry, the research community and individual end-users. Research will primarily benefit through advances in our understanding of reusable language generation technology, and industry through improvements in commercial viability; the technology itself can help individual users by speeding up text production, as well as by making available a modality that does not always exist (e.g. enabling visually impaired readers to access graphical information).

Project outputs

Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
System Building Cost vs. Output Quality in Data-to-Text Generation
  • DOI:
    10.3115/1610195.1610198
  • Published:
    2009-03
  • Journal:
  • Impact factor:
    0
  • Authors:
    A. Belz;Eric Kow
  • Corresponding author:
    A. Belz;Eric Kow
Extracting Parallel Fragments from Comparable Corpora for Data-to-text Generation
  • DOI:
  • Published:
    2010
  • Journal:
  • Impact factor:
    0
  • Authors:
    AS Belz
  • Corresponding author:
    AS Belz
Probabilistic generation of weather forecast texts
  • DOI:
  • Published:
    2010
  • Journal:
  • Impact factor:
    0
  • Authors:
    AS Belz
  • Corresponding author:
    AS Belz
Assessing the Trade-Off between System Building Cost and Output Quality in Data-to-Text Generation
  • DOI:
  • Published:
    2010
  • Journal:
  • Impact factor:
    0
  • Authors:
    AS Belz
  • Corresponding author:
    AS Belz

Other publications by Anya Belz

Generating Irish Text with a Flexible Plug-and-Play Architecture
The ReproGen Shared Task on Reproducibility of Human Evaluations in NLG: Overview and Results
  • DOI:
    10.18653/v1/2021.inlg-1.24
  • Published:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Anya Belz;Anastasia Shimorina;Shubham Agarwal;Ehud Reiter
  • Corresponding author:
    Ehud Reiter
A Pipeline for Extracting Abstract Dependency Templates for Data-to-Text Natural Language Generation
Towards a Consensus Taxonomy for Annotating Errors in Automatically Generated Text
Quantified Reproducibility Assessment of NLP Results

Other grants held by Anya Belz

ReproHum: Investigating Reproducibility of Human Evaluations in Natural Language Processing
  • Grant number:
    EP/V05645X/1
  • Fiscal year:
    2022
  • Funding amount:
    $269,100
  • Project category:
    Research Grant
Generation Challenges 2011: Towards a Surface Realisation Shared Task
  • Grant number:
    EP/I032320/1
  • Fiscal year:
    2011
  • Funding amount:
    $269,100
  • Project category:
    Research Grant
EPSRC Network on Vision and Language (V&L Net)
  • Grant number:
    EP/H018557/1
  • Fiscal year:
    2010
  • Funding amount:
    $269,100
  • Project category:
    Research Grant
Generation Challenges 2010
  • Grant number:
    EP/H032886/1
  • Fiscal year:
    2010
  • Funding amount:
    $269,100
  • Project category:
    Research Grant
Generation Challenges 2009
  • Grant number:
    EP/G03995X/1
  • Fiscal year:
    2009
  • Funding amount:
    $269,100
  • Project category:
    Research Grant
REG Challenge 2008: A Shared Task Evaluation Event for Referring Expression Generation
  • Grant number:
    EP/F059760/1
  • Fiscal year:
    2008
  • Funding amount:
    $269,100
  • Project category:
    Research Grant

Similar overseas grants

New approaches to training deep probabilistic models
  • Grant number:
    2613115
  • Fiscal year:
    2025
  • Funding amount:
    $269,100
  • Project category:
    Studentship
Collaborative Research: SHF: Medium: Verifying Deep Neural Networks with Spintronic Probabilistic Computers
  • Grant number:
    2311295
  • Fiscal year:
    2023
  • Funding amount:
    $269,100
  • Project category:
    Continuing Grant
CAREER: Accelerating Probabilistic Predictions of Sea-level Rise with Deep Learning
  • Grant number:
    2238316
  • Fiscal year:
    2023
  • Funding amount:
    $269,100
  • Project category:
    Standard Grant
Integration of a deep probabilistic model and an outlier detection method with an attention mechanism and its application to super-high dimensional time series data
  • Grant number:
    23H03357
  • Fiscal year:
    2023
  • Funding amount:
    $269,100
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Collaborative Research: SHF: Medium: Verifying Deep Neural Networks with Spintronic Probabilistic Computers
  • Grant number:
    2311296
  • Fiscal year:
    2023
  • Funding amount:
    $269,100
  • Project category:
    Continuing Grant
Probabilistic deep learning models and integrated biological experiments for analyzing dynamic and heterogeneous microbiomes
  • Grant number:
    10622713
  • Fiscal year:
    2023
  • Funding amount:
    $269,100
  • Project category:
CAREER: Modeling Language Evolution via Deep Probabilistic Factorization
  • Grant number:
    2146151
  • Fiscal year:
    2022
  • Funding amount:
    $269,100
  • Project category:
    Continuing Grant
Illuminating the dark metabolome via deep learning and probabilistic graphical models
  • Grant number:
    544268-2020
  • Fiscal year:
    2022
  • Funding amount:
    $269,100
  • Project category:
    Postgraduate Scholarships - Doctoral
ERI: Harnessing Probabilistic Deep Learning Method Integrated with Tailored Features for Enhanced Real-Time Machinery Fault Diagnosis and Prognosis
  • Grant number:
    2138522
  • Fiscal year:
    2022
  • Funding amount:
    $269,100
  • Project category:
    Standard Grant
Probabilistic deep learning approaches in medical imaging
  • Grant number:
    2736482
  • Fiscal year:
    2022
  • Funding amount:
    $269,100
  • Project category:
    Studentship