CAREER: Discourse Processing and Content Generation for Document Simplification
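Basic information
- Award number: 2145479
- Principal investigator:
- Amount: $540,500
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-09-01 to 2027-08-31
- Project status: Ongoing
- Source:
- Keywords:
Project abstract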
This award is funded in whole or in part under the American Rescue Plan Act of 2021 (Public Law 117-2).
Simplification is the process of making a text more accessible to a target audience (e.g., language learners, children, and individuals with language impairments) while preserving its meaning and content. The lack of accessible material can exacerbate social issues. For example, the complexity of language used in college admission and financial aid applications has contributed to the lagging access to higher education among emergent bilingual students, and the WHO has recognized the urgency of accessible technical information, given the rise of medical misinformation, especially in the wake of the COVID-19 pandemic. While there has been much work on sentence simplification, very few datasets are large enough to train supervised models. Simplifying a document also involves operations different from those at the sentence level, including content addition and the way sentences connect with each other. This project aims to develop new resources and data-driven approaches for document simplification, with the potential to address information transparency and fair access across a range of high-stakes domains. This project will also support the education and training of a diverse body of undergraduate and graduate students across disciplines.
To substantially advance document simplification, this CAREER project will tackle several key issues in existing simplification work, including corpora diversity, explanation generation, and document-level approaches. This is achieved through the following research activities: (1) introducing new corpora that tackle the pressing challenge of data diversity in simplification research and enable new application scenarios, especially in the accessibility of technical and jargon-laden texts; (2) tackling content addition and elaboration during simplification, a previously little-explored challenge, and proposing a novel, linguistically informed framework that characterizes and generates elaborations; (3) developing models for document simplification that are informed by structures of discourse, using both coherence structure and entity salience. The innovative ways of integrating discourse target the larger challenge of getting models to take stretches of discourse into account.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outputs
Journal articles (13)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Counterfactual Probing for the Influence of Affect and Specificity on Intergroup Bias
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Govindarajan, Venkata Subrahmanyan; Beaver, David; Mahowald, Kyle; Li, Junyi Jessy
- Corresponding author: Li, Junyi Jessy
Discourse Comprehension: A Question Answering Framework to Represent Sentence Connections
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Ko, Wei-Jen; Dalton, Cutter; Simmons, Mark; Fisher, Eliza; Durrett, Greg; Li, Junyi Jessy
- Corresponding author: Li, Junyi Jessy
The Role of Context and Uncertainty in Shallow Discourse Parsing
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Katherine Atwell; Remi Choi; Junyi Jessy Li; Malihe Alikhani
- Corresponding authors: Katherine Atwell; Remi Choi; Junyi Jessy Li; Malihe Alikhani
CoditT5: Pretraining for Source Code and Natural Language Editing
- DOI: 10.1145/3551349.3556955
- Published: 2022-08
- Journal:
- Impact factor: 0
- Authors: Jiyang Zhang; Sheena Panthaplackel; Pengyu Nie; Junyi Jessy Li; Miloš Gligorić
- Corresponding authors: Jiyang Zhang; Sheena Panthaplackel; Pengyu Nie; Junyi Jessy Li; Miloš Gligorić
Using Developer Discussions to Guide Fixing Bugs in Software
- DOI: 10.48550/arxiv.2211.06335
- Published: 2022-11
- Journal:
- Impact factor: 0
- Authors: Sheena Panthaplackel; Miloš Gligorić; Junyi Jessy Li; R. Mooney
- Corresponding authors: Sheena Panthaplackel; Miloš Gligorić; Junyi Jessy Li; R. Mooney
Other publications by Junyi Li
Tidal currents in the coastal waters east of Hainan Island in winter
- DOI: 10.1007/s00343-021-0453-y
- Published: 2021
- Journal:
- Impact factor: 1.6
- Authors: Li Min; Lingling Xie; Xiaolong Zong; Junyi Li; Mingming Li; Tong Yan; Ronglei Han
- Corresponding author: Ronglei Han
Serum albumin predicts hyperuricemia in patients with idiopathic membranous nephropathy.
- DOI: 10.21203/rs.3.rs-199182/v1
- Published: 2021
- Journal:
- Impact factor: 1.1
- Authors: Cuimei Wei; Tong Li; X. Xuan; Haofei Hu; Xiaohua Xiao; Junyi Li
- Corresponding author: Junyi Li
Glomerulosclerosis predicts poor renal outcome in patients with idiopathic membranous nephropathy
- DOI: 10.1007/s11255-020-02641-5
- Published: 2020
- Journal:
- Impact factor: 2
- Authors: Cuimei Wei; Yongcheng He; Tong Li; Haofei Hu; Haiying Song; Dongli Qi; Yuan Cheng; Jia Chen; Mijie Guan; Xiaohua Xiao; Junyi Li
- Corresponding author: Junyi Li
YuLan: An Open-source Large Language Model
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Yutao Zhu; Kun Zhou; Kelong Mao; Wentong Chen; Yiding Sun; Zhipeng Chen; Qian Cao; Yihan Wu; Yushuo Chen; Feng Wang; Lei Zhang; Junyi Li; Xiaolei Wang; Lei Wang; Beichen Zhang; Zican Dong; Xiaoxue Cheng; Yuhan Chen; Xinyu Tang; Yupeng Hou; Qiangqiang Ren; Xincheng Pang; Shufang Xie; Wayne Xin Zhao; Zhicheng Dou; Jiaxin Mao; Yankai Lin; Rui; Jun Xu; Xu Chen; Rui Yan; Zhewei Wei; Di Hu; Wenbing Huang; Ze; Yueguo Chen; Weizheng Lu; Ji
- Corresponding author: Ji
Toxicity Characterization of Environment-Related Pollutants Using a Biospectroscopy-Bioreporter-Coupling Approach: Potential for Real-World Toxicity Determination and Source Apportionment of Multiple Pollutants.
- DOI:
- Published: 2023
- Journal:
- Impact factor: 7.4
- Authors: Naifu Jin; Kai Yang; Junyi Li; Yizhi Song; A. Ding; Yujiao Sun; Guang; Dayin Zhang
- Corresponding author: Dayin Zhang
Other grants by Junyi Li
Collaborative Research: HCC: Medium: Fine-grained Emotion Analysis in Crises
- Award number: 2107524
- Fiscal year: 2021
- Funding amount: $540,500
- Project type: Standard Grant
CRII: RI: A Multi-level Framework for Text Specificity
- Award number: 1850153
- Fiscal year: 2019
- Funding amount: $540,500
- Project type: Standard Grant
Similar overseas grants
Neural Correlates of Discourse Processing in Adolescents
- Award number: 10687822
- Fiscal year: 2022
- Funding amount: $540,500
- Project type:
Doctoral Dissertation Research: The interaction of discourse status and memory retrieval in real time language processing
- Award number: 2214437
- Fiscal year: 2022
- Funding amount: $540,500
- Project type: Standard Grant
Advance the area of discourse parsing by exploiting distant supervision signals from closely related Natural Language Processing (NLP) tasks to overcome the prevailing lack of discourse annotated data
- Award number: 547337-2020
- Fiscal year: 2022
- Funding amount: $540,500
- Project type: Alexander Graham Bell Canada Graduate Scholarships - Doctoral
Natural Language Processing and Computational Linguistics - Discourse Parsing and Summarization
- Award number: 566113-2021
- Fiscal year: 2021
- Funding amount: $540,500
- Project type: Alexander Graham Bell Canada Graduate Scholarships - Master's
Advance the area of discourse parsing by exploiting distant supervision signals from closely related Natural Language Processing (NLP) tasks to overcome the prevailing lack of discourse annotated data
- Award number: 547337-2020
- Fiscal year: 2021
- Funding amount: $540,500
- Project type: Alexander Graham Bell Canada Graduate Scholarships - Doctoral
Advance the area of discourse parsing by exploiting distant supervision signals from closely related Natural Language Processing (NLP) tasks to overcome the prevailing lack of discourse annotated data
- Award number: 547337-2020
- Fiscal year: 2020
- Funding amount: $540,500
- Project type: Alexander Graham Bell Canada Graduate Scholarships - Doctoral
A field-based psycholinguistic study of the discourse processing mechanisms of OS languages
- Award number: 15H02603
- Fiscal year: 2015
- Funding amount: $540,500
- Project type: Grant-in-Aid for Scientific Research (A)
Doctoral Dissertation Research: Processing of long-distance dependencies at the syntax-discourse interface: The case of Clitic Left Dislocation
- Award number: 1323229
- Fiscal year: 2013
- Funding amount: $540,500
- Project type: Standard Grant
Doctoral Dissertation Research: The Processing of Referential Expressions in Discourse in L2 English
- Award number: 1252235
- Fiscal year: 2013
- Funding amount: $540,500
- Project type: Standard Grant
Japanese EFL Learners' Syntactic and Discourse Processing in Teaching of Reading Comprehension
- Award number: 23652145
- Fiscal year: 2011
- Funding amount: $540,500
- Project type: Grant-in-Aid for Challenging Exploratory Research