SGER: Automatic Processing of Natural Language Code Switching

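Basic Information

  • Award number:
    0749062
  • Principal investigator:
  • Amount:
    --
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2007
  • Funding country:
    United States
  • Start and end dates:
    2007-09-01 to 2009-02-28
  • Project status:
    Completed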

Project Abstract

Code switching is a natural linguistic phenomenon in which a speaker mixes two or more languages or dialects, or two or more linguistic registers from the same language. Extensive sociolinguistic studies have been dedicated to this widespread and common phenomenon and there has been some prior work in formal linguistics, but to date it has not been considered a problem of interest to the computational linguistics community. However, in this age of globalization and the current explosion in information and web access, more and more spontaneously generated linguistic data from around the world are being made available to the computational research community. Such data abounds with code switching in its different forms, so there is a real need for computational linguists to address code switching as a central research problem. This exploratory research effort addresses the issues of how to process code switching automatically. It examines the different aspects of code switching, allowing for the creation of better-principled algorithms based on a clear understanding of the phenomenon. The main questions revolve around morphological and syntactic constraints on switching and how these constraints can be modeled computationally. One of the outcomes of this research is the annotation of significant amounts of data exhibiting code switching in different languages, most likely Arabic, Hindi and Spanish. This research aims at initiating a formal study of code switching in a computational framework, which both increases our understanding of the phenomenon, and develops algorithms for processing natural language data that manifests code switching.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Mona Diab

Improving Coherence of Language Model Generation with Latent Semantic State
  • DOI:
  • Publication date:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Amanda Askell;Yuntao Bai;Anna Chen;Dawn Drain;Deep Ganguli;T. Henighan;Andy Jones;Benjamin Mann;Nova Dassarma;Nelson El;Zac Hatfield;Danny Hernandez;John Kernion;Kamal Ndousse;Catherine Olsson;Dario Amodei;Tom Brown;J. Clark;Sam Mc;Chris Olah;Jared Kaplan;Nick Ryder;Jared D Subbiah;Prafulla Kaplan;A. Dhariwal;P. Neelakantan;Girish Shyam;Amanda Sastry;Sandhini Askell;Ariel Agarwal;Herbert;Gretchen Krueger;R. Child;Aditya Ramesh;Daniel M. Ziegler;Jeffrey Wu;Christopher Winter;Mark Hesse;Eric Chen;Mateusz Sigler;Scott teusz Litwin;Benjamin Gray;Jack Chess;Christopher Clark;Sam Berner;Alec McCandlish;Ilya Radford;Sutskever Dario;Amodei;Joshua Maynez;Shashi Narayan;Bernd Bohnet;Kurt Shuster;Spencer Poff;Moya Chen;Douwe Kiela;Shane Storks;Qiaozi Gao;Yichi Zhang;Joyce Chai;Niket Tandon;Keisuke Sakaguchi;Bhavana Dalvi;Dheeraj Rajagopal;Peter Clark;Michal Guerquin;Kyle Richardson;Eduard H. Hovy;A. Dataset;Rowan Zellers;Ari Holtzman;Matthew E. Peters;Roozbeh Mottaghi;Aniruddha Kembhavi;Ali Farhadi;Chunting Zhou;Graham Neubig;Jiatao Gu;Mona Diab;Francisco Guzmán;Luke Zettlemoyer
  • Corresponding author:
    Luke Zettlemoyer
Investigating Cultural Alignment of Large Language Models
  • DOI:
  • Publication date:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Badr AlKhamissi;Muhammad N. ElNokrashy;Mai AlKhamissi;Mona Diab
  • Corresponding author:
    Mona Diab
Arabic natural language processing for Qur’anic research: a systematic review
  • DOI:
    10.1007/s10462-022-10313-2
  • Publication date:
    2022-12-02
  • Journal:
  • Impact factor:
    13.900
  • Authors:
    Muhammad Huzaifa Bashir;Aqil M. Azmi;Haq Nawaz;Wajdi Zaghouani;Mona Diab;Ala Al-Fuqaha;Junaid Qadir
  • Corresponding author:
    Junaid Qadir
Combining Discrete Wavelet and Cosine Transforms for Efficient Sentence Embedding
Author Correction: Arabic natural language processing for Qur’anic research: a systematic review
  • DOI:
    10.1007/s10462-023-10390-x
  • Publication date:
    2023-03-24
  • Journal:
  • Impact factor:
    13.900
  • Authors:
    Muhammad Huzaifa Bashir;Aqil M. Azmi;Haq Nawaz;Wajdi Zaghouani;Mona Diab;Ala Al-Fuqaha;Junaid Qadir
  • Corresponding author:
    Junaid Qadir

Other grants by Mona Diab

CI-P: Towards the Creation of a Unified Repository for MultiLingual and CrossLingual Multiword Expressions
  • Award number:
    1513116
  • Fiscal year:
    2015
  • Award amount:
    --
  • Project type:
    Standard Grant
CI-ADDO-NEW: Collaborative Research: A Repository for Annotating Multilingual Code Switched Data
  • Award number:
    1343530
  • Fiscal year:
    2013
  • Award amount:
    --
  • Project type:
    Standard Grant
CI-ADDO-NEW: Collaborative Research: A Repository for Annotating Multilingual Code Switched Data
  • Award number:
    1205556
  • Fiscal year:
    2012
  • Award amount:
    --
  • Project type:
    Standard Grant
Collaborative Research: CI-P: Creation of an annotated repository of multilingual and multigenre code switched data for several language pairs
  • Award number:
    0958440
  • Fiscal year:
    2010
  • Award amount:
    --
  • Project type:
    Standard Grant

Similar overseas grants

Excellence in Research: Exploring Effectiveness of Automatic Assessment of Cognitive and Metacognitive Processes in Engineering Learning through Natural Language Processing Models
  • Award number:
    2302686
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
    Standard Grant
AddBiomechanics: Automatic Processing and Sharing of Human Movement Data
  • Award number:
    10743411
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
LEAPS-MPS: Artificial Intelligence Techniques for Automatic NMR Metabolomics Data Processing
  • Award number:
    2245530
  • Fiscal year:
    2022
  • Award amount:
    --
  • Project type:
    Continuing Grant
Automatic optimization of deep learning models and reconstruction of training data for microscopic image processing
  • Award number:
    22K12270
  • Fiscal year:
    2022
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Applications of Stochastic Machine Learning and Statistical Signal Processing Approaches to Automatic Music Transcription and Visualisation
  • Award number:
    2738835
  • Fiscal year:
    2022
  • Award amount:
    --
  • Project type:
    Studentship
The interplay between the attended and the unattended: A gateway to automatic and controlled processing
  • Award number:
    RGPIN-2020-05626
  • Fiscal year:
    2022
  • Award amount:
    --
  • Project type:
    Discovery Grants Program - Individual
The interplay between the attended and the unattended: A gateway to automatic and controlled processing
  • Award number:
    RGPIN-2020-05626
  • Fiscal year:
    2021
  • Award amount:
    --
  • Project type:
    Discovery Grants Program - Individual
LEAPS-MPS: Artificial Intelligence Techniques for Automatic NMR Metabolomics Data Processing
  • Award number:
    2137575
  • Fiscal year:
    2021
  • Award amount:
    --
  • Project type:
    Continuing Grant
Development of image processing techniques using deep learning for automatic diagnosis and diagnosis supporting
  • Award number:
    21K11958
  • Fiscal year:
    2021
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Smart Gauge - Automatic rail survey processing and gauging using deep learning.
  • Award number:
    971730
  • Fiscal year:
    2020
  • Award amount:
    --
  • Project type:
    Small Business Research Initiative