Collaborative Research: CI-P: Creation of an annotated repository of multilingual and multigenre code switched data for several language pairs

合作研究:CI-P:创建多个语言对的多语言和多流派代码交换数据的带注释存储库

基本信息

  • 批准号:
    0958440
  • 负责人:
  • 金额:
    $ 7.8万
  • 依托单位:
  • 依托单位国家:
    美国
  • 项目类别:
    Standard Grant
  • 财政年份:
    2010
  • 资助国家:
    美国
  • 起止时间:
    2010-03-01 至 2011-09-30
  • 项目状态:
    已结题

项目摘要

Code switching (CS) is the term used to describe a common practice among bilingual speakers of a given language pair in which the speakers switch back and forth between their common languages. CS occurs in all genres of communication, and at different levels of linguistic representation. Computational algorithms trained for a single language fail when the input has other languages in the signal i.e. data with CS phenomena. One major barrier to research on processing CS is the lack of large, accurately annotated corpora of CS data. This planning proposal aims at creating the framework for a large consistently annotated data repository that will target 7 different languages annotated with features at different levels of granularity. In the course of the planning grant, we plan to hold a community workshop to ensure that we are addressing their needs in the repository. We will work with the community in order to prepare the full CRI proposal. This data will be transformative for computational linguistics research as it will provide a testbed for adaptive learning algorithms, lead to significant robustness in handling very diverse data sources, and create a framework for genuine multilingual processing. Moreover, it will have a direct impact on the way sociolinguists account for CS leading to more robust and replicable generalizations. Research on CS will help acknowledge the creativity of bilinguals in exploiting their verbal repertoire. The CS repository will enable new research in many interconnected fields. This research will contribute to raising general awareness of bi/multilingualism.
语码转换(英语:Code switching,CS)是一个术语,用来描述特定语言对的双语者之间的一种常见行为,即说话者在他们的共同语言之间来回切换。交际策略存在于所有的交际类型中,并在不同的语言表达水平上出现。 当输入信号中有其他语言时,针对单一语言训练的计算算法失败,即具有CS现象的数据。处理CS研究的一个主要障碍是缺乏大型的、准确注释的CS数据语料库。该规划提案旨在为大型一致注释数据库创建框架,该数据库将针对7种不同的语言,并以不同的粒度级别进行功能注释。在规划补助金的过程中,我们计划举办一个社区研讨会,以确保我们在存储库中满足他们的需求。我们将与社区合作,以准备完整的CRI提案。这些数据将为计算语言学研究带来变革,因为它将为自适应学习算法提供测试平台,在处理非常多样化的数据源时具有显著的鲁棒性,并为真正的多语言处理创建框架。此外,它将直接影响社会语言学家对语码转换的解释,从而产生更强大和可复制的概括。对语码转换的研究将有助于承认双语者在开发他们的语言技能方面的创造性。CS存储库将使许多相互关联的领域的新研究成为可能。这项研究将有助于提高对双语/多语制的普遍认识。

项目成果

期刊论文数量(0)
专著数量(0)
科研奖励数量(0)
会议论文数量(0)
专利数量(0)

数据更新时间:{{ journalArticles.updateTime }}

{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

数据更新时间:{{ journalArticles.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ monograph.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ sciAawards.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ conferencePapers.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ patent.updateTime }}

Mona Diab其他文献

Improving Coherence of Language Model Generation with Latent Semantic State
提高语言模型生成与潜在语义状态的一致性
  • DOI:
  • 发表时间:
    2022
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Amanda Askell;Yuntao Bai;Anna Chen;Dawn Drain;Deep Ganguli;T. Henighan;Andy Jones;Benjamin Mann;Nova Dassarma;Nelson El;Zac Hatfield;Danny Hernandez;John Kernion;Kamal Ndousse;Catherine Olsson;Dario Amodei;Tom Brown;J. Clark;Sam Mc;Chris Olah;Jared Kaplan;Nick Ryder;Jared D Subbiah;Prafulla Kaplan;A. Dhariwal;P. Neelakantan;Girish Shyam;Amanda Sastry;Sandhini Askell;Ariel Agarwal;Herbert;Gretchen Krueger;R. Child;Aditya Ramesh;Daniel M. Ziegler;Jeffrey Wu;Christopher Winter;Mark Hesse;Eric Chen;Mateusz Sigler;Scott teusz Litwin;Benjamin Gray;Jack Chess;Christopher Clark;Sam Berner;Alec McCandlish;Ilya Radford;Sutskever Dario;Amodei;Joshua Maynez;Shashi Narayan;Bernd Bohnet;Kurt Shuster;Spencer Poff;Moya Chen;Douwe Kiela;Shane Storks;Qiaozi Gao;Yichi Zhang;Joyce Chai;Niket Tandon;Keisuke Sakaguchi;Bhavana Dalvi;Dheeraj Rajagopal;Peter Clark;Michal Guerquin;Kyle Richardson;Eduard H. Hovy;A. Dataset;Rowan Zellers;Ari Holtzman;Matthew E. Peters;Roozbeh Mottaghi;Aniruddha Kembhavi;Ali Farhadi;Chunting Zhou;Graham Neubig;Jiatao Gu;Mona Diab;Francisco Guzmán;Luke Zettlemoyer
  • 通讯作者:
    Luke Zettlemoyer
Investigating Cultural Alignment of Large Language Models
研究大型语言模型的文化一致性
  • DOI:
  • 发表时间:
    2024
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Badr AlKhamissi;Muhammad N. ElNokrashy;Mai AlKhamissi;Mona Diab
  • 通讯作者:
    Mona Diab
Arabic natural language processing for Qur’anic research: a systematic review
  • DOI:
    10.1007/s10462-022-10313-2
  • 发表时间:
    2022-12-02
  • 期刊:
  • 影响因子:
    13.900
  • 作者:
    Muhammad Huzaifa Bashir;Aqil M. Azmi;Haq Nawaz;Wajdi Zaghouani;Mona Diab;Ala Al-Fuqaha;Junaid Qadir
  • 通讯作者:
    Junaid Qadir
Combining Discrete Wavelet and Cosine Transforms for Efficient Sentence Embedding
结合离散小波和余弦变换实现高效句子嵌入
Author Correction: Arabic natural language processing for Qur’anic research: a systematic review
  • DOI:
    10.1007/s10462-023-10390-x
  • 发表时间:
    2023-03-24
  • 期刊:
  • 影响因子:
    13.900
  • 作者:
    Muhammad Huzaifa Bashir;Aqil M. Azmi;Haq Nawaz;Wajdi Zaghouani;Mona Diab;Ala Al-Fuqaha;Junaid Qadir
  • 通讯作者:
    Junaid Qadir

Mona Diab的其他文献

{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

{{ truncateString('Mona Diab', 18)}}的其他基金

CI-P: Towards the Creation of a Unified Repository for MultiLingual and CrossLingual Multiword Expressions
CI-P:为多语言和跨语言多词表达式创建统一存储库
  • 批准号:
    1513116
  • 财政年份:
    2015
  • 资助金额:
    $ 7.8万
  • 项目类别:
    Standard Grant
CI-ADDO-NEW: Collaborative Research: A Repository for Annotating Multilingual Code Switched Data
CI-ADDO-NEW:协作研究:用于注释多语言代码交换数据的存储库
  • 批准号:
    1343530
  • 财政年份:
    2013
  • 资助金额:
    $ 7.8万
  • 项目类别:
    Standard Grant
CI-ADDO-NEW: Collaborative Research: A Repository for Annotating Multilingual Code Switched Data
CI-ADDO-NEW:协作研究:用于注释多语言代码交换数据的存储库
  • 批准号:
    1205556
  • 财政年份:
    2012
  • 资助金额:
    $ 7.8万
  • 项目类别:
    Standard Grant
SGER: Automatic Processing of Natural Language Code Switching
SGER:自然语言代码切换的自动处理
  • 批准号:
    0749062
  • 财政年份:
    2007
  • 资助金额:
    $ 7.8万
  • 项目类别:
    Standard Grant

相似国自然基金

Research on Quantum Field Theory without a Lagrangian Description
  • 批准号:
    24ZR1403900
  • 批准年份:
    2024
  • 资助金额:
    0.0 万元
  • 项目类别:
    省市级项目
Cell Research
  • 批准号:
    31224802
  • 批准年份:
    2012
  • 资助金额:
    24.0 万元
  • 项目类别:
    专项基金项目
Cell Research
  • 批准号:
    31024804
  • 批准年份:
    2010
  • 资助金额:
    24.0 万元
  • 项目类别:
    专项基金项目
Cell Research (细胞研究)
  • 批准号:
    30824808
  • 批准年份:
    2008
  • 资助金额:
    24.0 万元
  • 项目类别:
    专项基金项目
Research on the Rapid Growth Mechanism of KDP Crystal
  • 批准号:
    10774081
  • 批准年份:
    2007
  • 资助金额:
    45.0 万元
  • 项目类别:
    面上项目

相似海外基金

Collaborative Research: GEO OSE Track 2: Developing CI-enabled collaborative workflows to integrate data for the SZ4D (Subduction Zones in Four Dimensions) community
协作研究:GEO OSE 轨道 2:开发支持 CI 的协作工作流程以集成 SZ4D(四维俯冲带)社区的数据
  • 批准号:
    2324714
  • 财政年份:
    2024
  • 资助金额:
    $ 7.8万
  • 项目类别:
    Standard Grant
Collaborative Research: Maritime to Inland Transitions Towards ENvironments for Convection Initiation (MITTEN CI)
合作研究:海洋到内陆向对流引发环境的转变(MITTEN CI)
  • 批准号:
    2349935
  • 财政年份:
    2024
  • 资助金额:
    $ 7.8万
  • 项目类别:
    Continuing Grant
Collaborative Research: Maritime to Inland Transitions Towards ENvironments for Convection Initiation (MITTEN CI)
合作研究:海洋到内陆向对流引发环境的转变(MITTEN CI)
  • 批准号:
    2349934
  • 财政年份:
    2024
  • 资助金额:
    $ 7.8万
  • 项目类别:
    Continuing Grant
Collaborative Research: Frameworks: MobilityNet: A Trustworthy CI Emulation Tool for Cross-Domain Mobility Data Generation and Sharing towards Multidisciplinary Innovations
协作研究:框架:MobilityNet:用于跨域移动数据生成和共享以实现多学科创新的值得信赖的 CI 仿真工具
  • 批准号:
    2411152
  • 财政年份:
    2024
  • 资助金额:
    $ 7.8万
  • 项目类别:
    Standard Grant
Collaborative Research: Maritime to Inland Transitions Towards ENvironments for Convection Initiation (MITTEN CI)
合作研究:海洋到内陆向对流引发环境的转变(MITTEN CI)
  • 批准号:
    2349936
  • 财政年份:
    2024
  • 资助金额:
    $ 7.8万
  • 项目类别:
    Continuing Grant
Collaborative Research: Frameworks: MobilityNet: A Trustworthy CI Emulation Tool for Cross-Domain Mobility Data Generation and Sharing towards Multidisciplinary Innovations
协作研究:框架:MobilityNet:用于跨域移动数据生成和共享以实现多学科创新的值得信赖的 CI 仿真工具
  • 批准号:
    2411153
  • 财政年份:
    2024
  • 资助金额:
    $ 7.8万
  • 项目类别:
    Standard Grant
Collaborative Research: GEO OSE Track 2: Developing CI-enabled collaborative workflows to integrate data for the SZ4D (Subduction Zones in Four Dimensions) community
协作研究:GEO OSE 轨道 2:开发支持 CI 的协作工作流程以集成 SZ4D(四维俯冲带)社区的数据
  • 批准号:
    2324709
  • 财政年份:
    2024
  • 资助金额:
    $ 7.8万
  • 项目类别:
    Standard Grant
Collaborative Research: GEO OSE Track 2: Developing CI-enabled collaborative workflows to integrate data for the SZ4D (Subduction Zones in Four Dimensions) community
协作研究:GEO OSE 轨道 2:开发支持 CI 的协作工作流程以集成 SZ4D(四维俯冲带)社区的数据
  • 批准号:
    2324713
  • 财政年份:
    2024
  • 资助金额:
    $ 7.8万
  • 项目类别:
    Standard Grant
Collaborative Research: Maritime to Inland Transitions Towards ENvironments for Convection Initiation (MITTEN CI)
合作研究:海洋到内陆向对流引发环境的转变(MITTEN CI)
  • 批准号:
    2349937
  • 财政年份:
    2024
  • 资助金额:
    $ 7.8万
  • 项目类别:
    Continuing Grant
Collaborative Research: GEO OSE Track 2: Developing CI-enabled collaborative workflows to integrate data for the SZ4D (Subduction Zones in Four Dimensions) community
协作研究:GEO OSE 轨道 2:开发支持 CI 的协作工作流程以集成 SZ4D(四维俯冲带)社区的数据
  • 批准号:
    2324710
  • 财政年份:
    2024
  • 资助金额:
    $ 7.8万
  • 项目类别:
    Standard Grant
{{ showInfoDetail.title }}

作者:{{ showInfoDetail.author }}

知道了