Resource-intensive and data-intensive methods for robust fine-grained sentiment-analysis

Project abstract

We present a research proposal for sentiment analysis that addresses shortcomings in expression-level text analysis. This fine-grained level has received little attention in previous work, even though it is essential for practical applications such as opinion question answering or summarization.

For the predominant type of expression in this task, i.e. polar expressions such as nice or terrible that convey positive or negative sentiment, we will focus on the problem of unknown words. We plan to investigate the use of morphological analysis for both decomposing and synthesizing words. Moreover, we will address the issue of polar intensity. We plan to systematically compare different automatic ordering methods with each other and with human ratings.

We will also create lexicons that contain different types of valence shifters. Shifters are essential for contextual classification, as they modify or even fully switch the polarity conveyed by polar expressions. Since valence shifting has so far mostly been reduced to handling common negation, this task requires a more thorough investigation of the nature of shifting.

With regard to the entity extraction tasks in expression-level sentiment analysis, i.e. opinion holder and opinion target extraction, we aim to create novel lexicons that can serve as the backbone of rule-based extraction systems. Such systems are usually fairly domain-independent and easy to create in the absence of labeled textual data. In order to tackle the aforementioned tasks, we will employ both resource-intensive methods, i.e. rule-based methods that make use of very deep semantic representations, and data-intensive methods, i.e. corpus-based methods which may also employ standard NLP tools.

We will examine these tasks for English and German. Since the majority of previous research in natural language processing has focused on English, sophisticated resources are already available that allow investigations of deep(er) linguistic methods. By contrast, these resources are not available for German. Accordingly, shallower methods, typically data-intensive ones, need to be applied. One additional contribution of this project is that new resources, such as lexical resources and processing tools for sentiment analysis, will be created, in particular for German.

In connection with the comparison of resource-intensive and data-intensive methods, we also want to answer the question of which type of representation is best suited for the different classification and extraction tasks in fine-grained sentiment analysis. In this context, we will also critically assess the suitability of traditional lemma-based representations and contrast them with other potential levels, such as the sense level.

Finally, we plan to review established evaluation methods and examine whether they make sufficiently transparent which kinds of phenomena an analysis system handles well and which it does not.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Dr. Josef Ruppenhofer

Similar overseas grants

Aligning Patient Acuity with Resource Intensity after Major Surgery
  • Grant number:
    10635798
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
CRITICAL: Collaborative Resource for Intensive care Translational science, Informatics, Comprehensive Analytics, and Learning
  • Grant number:
    10461229
  • Fiscal year:
    2021
  • Funding amount:
    --
  • Project category:
CRITICAL: Collaborative Resource for Intensive care Translational science, Informatics, Comprehensive Analytics, and Learning
  • Grant number:
    10673051
  • Fiscal year:
    2021
  • Funding amount:
    --
  • Project category:
CRITICAL: Collaborative Resource for Intensive care Translational science, Informatics, Comprehensive Analytics, and Learning
  • Grant number:
    10300398
  • Fiscal year:
    2021
  • Funding amount:
    --
  • Project category:
Preventing antimicrobial resistance and infections in hospitalized neonates in low resource settings
  • Grant number:
    10215584
  • Fiscal year:
    2020
  • Funding amount:
    --
  • Project category:
Preventing antimicrobial resistance and infections in hospitalized neonates in low resource settings
  • Grant number:
    10438625
  • Fiscal year:
    2020
  • Funding amount:
    --
  • Project category:
Preventing antimicrobial resistance and infections in hospitalized neonates in low resource settings
  • Grant number:
    10652977
  • Fiscal year:
    2020
  • Funding amount:
    --
  • Project category:
Computation and Data Intensive Parallel and Distributed Systems: Resource Management and Data Handling Techniques
  • Grant number:
    RGPIN-2018-06297
  • Fiscal year:
    2018
  • Funding amount:
    --
  • Project category:
    Discovery Grants Program - Individual
Developing a care bundle for neonatal sepsis prevention in low-resource settings
  • Grant number:
    9906287
  • Fiscal year:
    2017
  • Funding amount:
    --
  • Project category:
Energy Efficient Resource Management for Data Intensive Computations in Clouds
  • Grant number:
    479933-2015
  • Fiscal year:
    2015
  • Funding amount:
    --
  • Project category:
    Alexander Graham Bell Canada Graduate Scholarships - Master's