Towards Trustworthy Large Language Models

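Basic Information

Grant number:
2895111
Principal investigator:
Funding amount:
--
Host institution:
University of Liverpool
Host institution country:
United Kingdom
Project category:
Studentship
Fiscal year:
2023
Funding country:
United Kingdom
Start and end dates:
2023 to (no data)
Project status:
Not concluded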

Source:
https://gtr.ukri.org/projects?ref=studentship-2895111
Keywords:
Towards Trustworthy Large Language Models

Project Abstract

Over the past few years, large language models (LLMs), and foundation models more broadly (e.g. ChatGPT, GPT-3 Brown et al. [2020], GPT-4 OpenAI [2023]), have stirred up the field of Artificial Intelligence (AI). In particular, the release of ChatGPT in November 2022 let a much wider audience experience the generative power of LLMs. That generative power has been applied successfully across many natural language processing tasks. Alongside this revolutionary impact, many questions have been raised about the stakes of using LLMs in different applications, and a significant portion of the scientific community has advised that LLMs be used in a socially responsible and ethical way Nat [2023]. Consequently, the aim of this project is to build explainable LLMs. The end users of LLMs vary: a domain expert using an NLP model that relies on LLMs at its back end, a stakeholder investing in an AI product that uses LLMs, or someone with no AI expertise. Each type of user should be able to trust the output provided by LLMs. Existing research has shown that explaining the output of an AI model to a user should help to increase the user's trust in the system. Broadly speaking, the idea of explainability is to understand the working principle of an AI model through a simple explainer module that can mimic the original model. In this project we focus specifically on explaining the output of LLMs to every type of user (i.e. domain experts, stakeholders, and the general public). The overall goal of this research proposal is to increase the transparency of LLMs using explainability techniques. Beyond transparency, explainable LLMs can also help to identify bias present in the model itself. Ultimately, explainable LLMs are a step towards a socially responsible AI environment.
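
The abstract's description of an explainer module that mimics the original model can be illustrated with a minimal surrogate-model sketch. This is not the project's method, only one common way to realize the idea. It assumes a hypothetical `black_box` scorer (a toy rule-based stand-in for an LLM-backed model), a hypothetical helper `fit_surrogate`, and inputs short enough, with unique tokens, that every keep/drop combination can be queried. Under those assumptions, the least-squares weight of each token in a linear surrogate is simply the mean output with the token present minus the mean output with it absent.

```python
from itertools import product


def black_box(tokens):
    """Hypothetical opaque model: a toy scorer standing in for an LLM-backed system."""
    score = 0.0
    if "great" in tokens:
        score += 2.0
    if "slow" in tokens:
        score -= 1.0
    # Interaction term ("not great") that a linear surrogate can only approximate.
    if "not" in tokens and "great" in tokens:
        score -= 3.0
    return score


def fit_surrogate(model, tokens):
    """Fit a linear surrogate by querying the model on every keep/drop mask.

    Assumes tokens are unique and the list is short (2**n model queries).
    Returns (intercept, {token: weight}).
    """
    masks = list(product([0, 1], repeat=len(tokens)))
    outputs = [
        model([tok for tok, keep in zip(tokens, mask) if keep]) for mask in masks
    ]
    weights = {}
    for i, tok in enumerate(tokens):
        present = [y for mask, y in zip(masks, outputs) if mask[i] == 1]
        absent = [y for mask, y in zip(masks, outputs) if mask[i] == 0]
        # With a balanced full design, this difference is the least-squares slope.
        weights[tok] = sum(present) / len(present) - sum(absent) / len(absent)
    mean_output = sum(outputs) / len(outputs)
    # Each token is present in exactly half of the masks, hence the factor 0.5.
    intercept = mean_output - 0.5 * sum(weights.values())
    return intercept, weights


def surrogate_predict(intercept, weights, tokens):
    """Score a token list with the fitted linear surrogate."""
    return intercept + sum(weights.get(tok, 0.0) for tok in tokens)


if __name__ == "__main__":
    sentence = ["service", "not", "great", "slow"]
    intercept, weights = fit_surrogate(black_box, sentence)
    for tok, w in weights.items():
        print(f"{tok:>8}: {w:+.2f}")
    print("black box :", black_box(sentence))
    print("surrogate :", surrogate_predict(intercept, weights, sentence))
```

The per-token weights are the explanation a user would see: in this toy setting "not" and "slow" receive negative weights and "service" receives none. The gap between the surrogate's score and the black box's score on the full sentence shows the usual trade-off, since a simple explainer is easier to read but only approximates the model it mimics.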
