CAREER: Content and Cohesion Models, with Applications to Text Summarization and Natural Language Generation
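Basic information
- Grant number: 0448168
- Principal investigator:
- Amount: $400,000
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2005
- Funding country: United States
- Period: 2005-02-15 to 2012-01-31
- Project status: Completed
- Source:
- Keywords: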
Project abstract
Within the last decade, probabilistic methods have delivered successful analyses of natural language texts that, in turn, have enabled a broad range of valuable and practical applications, such as machine translation, question answering, and summarization. Despite this success, existing methods suffer from a fundamental limitation: they process each document with little or no ability to take advantage of its global structure. All too often, this results in suboptimal performance for the task at hand.
The goal of this project is to develop probabilistic models for two fundamental, orthogonal dimensions of text, content and cohesion. A model based on the first dimension, content, describes the topics present in a text and their organization. The second dimension, cohesion, is concerned with how information is realized in a given text. Development of these models requires new unsupervised techniques able to capture complex text properties and novel algorithms for topical discretization and discourse grammar induction.
Gaining a computational measure of what constitutes a good text will open new research avenues on the edge of humanities and computer science. Probabilistic text models will form a basis for novel approaches to text summarization and generation that will make on-line information much more accessible than is currently the case. This will substantially affect the way people experience the many forms of textual on-line information, including news reports, consumer health information, and government documents. Students will become involved in this research through hands-on projects, outreach programs, and courses at both the undergraduate and graduate level.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Regina Barzilay
Incidental breast carcinoma: incidence, management, and outcomes in 4804 bilateral reduction mammoplasties
- DOI: 10.1007/s10549-019-05335-4
- Published: 2019-07-17
- Journal:
- Impact factor: 3.000
- Authors: Rong Tang; Francisco Acevedo; Conor Lanahan; Suzanne B. Coopey; Adam Yala; Regina Barzilay; Clara Li; Amy Colwell; Anthony J. Guidi; Curtis Cetrulo; Judy Garber; Barbara L. Smith; Michele A. Gadd; Michelle C. Specht; Kevin S. Hughes
- Corresponding author: Kevin S. Hughes
AI-driven discovery of synergistic drug combinations against pancreatic cancer
- DOI: 10.1038/s41467-025-56818-6
- Published: 2025-04-29
- Journal:
- Impact factor: 15.700
- Authors: Mohsen Pourmousa; Sankalp Jain; Elena Barnaeva; Wengong Jin; Joshua Hochuli; Zina Itkin; Travis Maxfield; Cleber Melo-Filho; Andrew Thieme; Kelli Wilson; Carleen Klumpp-Thomas; Sam Michael; Noel Southall; Tommi Jaakkola; Eugene N. Muratov; Regina Barzilay; Alexander Tropsha; Marc Ferrer; Alexey V. Zakharov
- Corresponding author: Alexey V. Zakharov
Deep learning enhances the prediction of HLA class I-presented CD8+ T cell epitopes in foreign pathogens
- DOI: 10.1038/s42256-024-00971-y
- Published: 2025-01-28
- Journal:
- Impact factor: 23.900
- Authors: Jeremy Wohlwend; Anusha Nathan; Nitan Shalon; Charles R. Crain; Rhoda Tano-Menka; Benjamin Goldberg; Emma Richards; Gaurav D. Gaiha; Regina Barzilay
- Corresponding author: Regina Barzilay
Atypical ductal hyperplasia in men with gynecomastia: what is their breast cancer risk?
- DOI: 10.1007/s10549-018-05117-4
- Published: 2019-01-21
- Journal:
- Impact factor: 3.000
- Authors: Suzanne B. Coopey; Kinyas Kartal; Clara Li; Adam Yala; Regina Barzilay; Heather R. Faulkner; Tari A. King; Francisco Acevedo; Judy E. Garber; Anthony J. Guidi; Kevin S. Hughes
- Corresponding author: Kevin S. Hughes
Other grants by Regina Barzilay
SGER: Reconstructing the Tower of Babel: Cross-lingual Language Learning
- Grant number: 0835445
- Fiscal year: 2008
- Funding amount: $400,000
- Project type: Standard Grant
Student Research Workshop in Computational Linguistics, at the Association for Computational Linguistics (ACL) 2005 Conference; June 27, 2005; Ann Arbor, MI
- Grant number: 0527130
- Fiscal year: 2005
- Funding amount: $400,000
- Project type: Standard Grant
Automatic Processing of Spoken and Written Lecture Material
- Grant number: 0415865
- Fiscal year: 2004
- Funding amount: $400,000
- Project type: Standard Grant
Similar overseas grants
Native, non-native or artificial phonetic content for pronunciation education: representations and perception in the case of L2 French
- Grant number: 24K00093
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Grant-in-Aid for Scientific Research (B)
An innovative, AI-driven application that helps users assess/action information pollution for social media content.
- Grant number: 10100049
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Collaborative R&D
Sustainable Remanufacturing solution with increased automation and recycled content in laser and plasma based process (RESTORE)
- Grant number: 10112149
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: EU-Funded
CAREER: Optoelectronic lab-on-a-chip technology for high content automated multiparametric physiological analyses of live cells
- Grant number: 2339030
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Continuing Grant
Postdoctoral Fellowship: STEMEdIPRF: Pedagogical Content Knowledge for Course-based Undergraduate Research Instruction
- Grant number: 2327187
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Standard Grant
RESTORE Sustainable Remanufacturing solution with increased automation and recycled content in laser and plasmabased process.
- Grant number: 10109638
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: EU-Funded
WITHIN TOUCHING DISTANCE brings together artistic & technological innovation to explore how arts-based therapeutic content can be combined with XR.
- Grant number: ES/Y011082/1
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Research Grant
Hopscotch 4 Scientific Investigation: Promoting Elementary Preservice Teacher Three-Dimensional Learning during Science Content Courses
- Grant number: 2315617
- Fiscal year: 2023
- Funding amount: $400,000
- Project type: Standard Grant
Development of a method for designing content for viewing that fosters the ability to perceive based on information foraging theory
- Grant number: 23K11334
- Fiscal year: 2023
- Funding amount: $400,000
- Project type: Grant-in-Aid for Scientific Research (C)
Designing a Bridging Model Using Learning Content Information LOD to Link School Education and Digital Archives
- Grant number: 23H03695
- Fiscal year: 2023
- Funding amount: $400,000
- Project type: Grant-in-Aid for Scientific Research (B)