Cross-Lingual Knowledge Representation and Alignment in LLMs
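Basic information
- Grant number: 2876276
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: United Kingdom
- Project category: Studentship
- Fiscal year: 2023
- Funding country: United Kingdom
- Duration: 2023 to (no data)
- Project status: Ongoing
- Source:
- Keywords:
Project abstract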
Large language models (LLMs) have demonstrated outstanding performance across many downstream tasks. However, their capabilities vary by language: performance is strongest on high-resource languages, which limits their effectiveness on low-resource language tasks. This limitation is primarily attributed to an inherent imbalance in knowledge grounding between languages, which manifests in two key aspects: knowledge disparities and cross-lingual knowledge asynchronicity. Knowledge disparities refer to the fact that LLMs may give different responses to the same question posed in different languages. Cross-lingual knowledge asynchronicity refers to situations where a model whose knowledge is updated in one language does not propagate that update to other languages. These challenges are not readily addressed by applying Neural Machine Translation (NMT) techniques, as an NMT model (LLM-based or not) inherits the same language imbalance.
To ensure consistent model performance for inputs in diverse languages, we aim to explore cross-lingual knowledge representation and alignment in LLMs. The research involves understanding the characteristics of cross-lingual knowledge representation and applying these insights to develop methods for aligning cross-lingual knowledge. Our research methods fall into two main avenues: external transformation and internal alignment. External transformation integrates a multilingual converter outside the LLM to convert low-resource language representations into high-resource language representations (i.e., English). Internal alignment infers the features of cross-lingual knowledge and applies specific neural network modifications to ensure the consistency of cross-lingual knowledge representations.
We define three fundamental research questions:
RQ1: What is the nature of cross-lingual knowledge representation within LLMs? This line of inquiry involves developing cross-lingual probes to reveal the inner working mechanisms of LLMs. We hope the insights gained will guide us in developing instruments to address the knowledge grounding imbalance between languages.
RQ2: What is the optimal way to align internal cross-lingual knowledge representations? The focus is on developing knowledge alignment techniques that effectively transfer knowledge from a high-resource language to its low-resource counterparts. The research may include introducing neural networks to transfer knowledge representations across languages.
RQ3: How can we align cross-lingual knowledge by bringing in external structures? We may introduce an external multilingual knowledge representation converter to transform representations from other languages into a high-resource language (i.e., English), ensuring LLM consistency across language settings.
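The abstract does not specify how the external converter would be built. As a purely illustrative sketch, one simple possibility is a linear map fitted by least squares on paired sentence representations. Everything below is an assumption made for illustration: the function names `fit_linear_converter` and `mean_cosine`, the synthetic vectors standing in for LLM representations, and the choice of a rotation as the hidden relationship between the two languages.

```python
import numpy as np

def fit_linear_converter(src, tgt):
    """Return W minimising ||src @ W - tgt||_F (ordinary least squares)."""
    W, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return W

def mean_cosine(a, b):
    """Average row-wise cosine similarity between two matrices."""
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return float(np.mean(num / den))

rng = np.random.default_rng(0)
dim, n_pairs, n_train = 16, 200, 150

# Synthetic stand-ins for representations of the same sentences in English
# and in a low-resource language (hypothetical data, not taken from any LLM).
english = rng.normal(size=(n_pairs, dim))
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))  # assumed hidden mapping
low_resource = english @ rotation + 0.01 * rng.normal(size=(n_pairs, dim))

# Fit the converter on the training pairs, then evaluate on held-out pairs.
W = fit_linear_converter(low_resource[:n_train], english[:n_train])
before = mean_cosine(low_resource[n_train:], english[n_train:])
after = mean_cosine(low_resource[n_train:] @ W, english[n_train:])
print(f"held-out cosine similarity before: {before:.3f}, after: {after:.3f}")
```

In this toy setting the held-out similarity rises sharply after conversion because the hidden relationship is exactly linear. Real LLM representations need not satisfy that assumption, which is presumably part of what RQ1 and RQ3 set out to investigate.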
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants
An implantable biosensor microsystem for real-time measurement of circulating biomarkers
- Grant number: 2901954
- Fiscal year: 2028
- Funding amount: --
- Project category: Studentship
Exploiting the polysaccharide breakdown capacity of the human gut microbiome to develop environmentally sustainable dishwashing solutions
- Grant number: 2896097
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
A Robot that Swims Through Granular Materials
- Grant number: 2780268
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Likelihood and impact of severe space weather events on the resilience of nuclear power and safeguards monitoring.
- Grant number: 2908918
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Proton, alpha and gamma irradiation assisted stress corrosion cracking: understanding the fuel-stainless steel interface
- Grant number: 2908693
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Field Assisted Sintering of Nuclear Fuel Simulants
- Grant number: 2908917
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Assessment of new fatigue capable titanium alloys for aerospace applications
- Grant number: 2879438
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Developing a 3D printed skin model using a Dextran - Collagen hydrogel to analyse the cellular and epigenetic effects of interleukin-17 inhibitors in
- Grant number: 2890513
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Understanding the interplay between the gut microbiome, behavior and urbanisation in wild birds
- Grant number: 2876993
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Similar overseas grants
Impact of Lingual Endurance Exercise on Rehabilitation of Swallowing Impairments after Ischemic Stroke
- Grant number: 10644397
- Fiscal year: 2023
- Funding amount: --
- Project category:
An extractive AI model for simultaneous cross-lingual, cross-jurisdictional contract analysis
- Grant number: 10074563
- Fiscal year: 2023
- Funding amount: --
- Project category: Collaborative R&D
Miasano: Mobile Health App for Improved Evidence-Based Rehabilitation Outcomes and Multi-Lingual Clinical Interactions
- Grant number: 10601742
- Fiscal year: 2022
- Funding amount: --
- Project category:
ReBabel - deep learning cross-lingual vocal characteristic matching with automated lip-sync
- Grant number: 10014754
- Fiscal year: 2022
- Funding amount: --
- Project category: Collaborative R&D
Explainable AI-Based Multi-Lingual Content Moderation System
- Grant number: 73632
- Fiscal year: 2021
- Funding amount: --
- Project category: Study
Multi-lingual modelling of conversational speech
- Grant number: 2676033
- Fiscal year: 2021
- Funding amount: --
- Project category: Studentship
Mechanisms of Lingual Motor Plasticity of Post-Stroke Dysphagia in an Animal Model
- Grant number: 10403156
- Fiscal year: 2021
- Funding amount: --
- Project category:
Basic research of the sex difference in tongue neuropathic pain caused by lingual nerve injury in mice
- Grant number: 20K07746
- Fiscal year: 2020
- Funding amount: --
- Project category: Grant-in-Aid for Scientific Research (C)
SBIR Phase I: Engineering a novel 3D metal printed orthodontic system for lingual attachment-enabled clear aligner therapy
- Grant number: 1938533
- Fiscal year: 2020
- Funding amount: --
- Project category: Standard Grant
Defining gene expression and regulation in lingual taste and non-taste papilla epithelium
- Grant number: 10116729
- Fiscal year: 2020
- Funding amount: --
- Project category: