Detecting Training Abuses in Neural Nets
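Basic information
- Grant number: 2301656
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: United Kingdom
- Project category: Studentship
- Fiscal year: 2019
- Funding country: United Kingdom
- Duration: 2019 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract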
Many military systems carry out classification tasks. For example, a system might be required to distinguish between an allied tank and an enemy tank (a classic problem in machine learning). Modern machine learning approaches are being brought to bear on classification problems within the military domain and more widely. Neural networks, a technology that, loosely speaking, makes decisions in a manner analogous to the way the human brain works, play a particularly prominent role.
Most work proceeds on the assumption that all is benign. But imagine that an enemy wanted your classifier to work well except when presented with a very specific input. For example, an enemy tank with a particular appearance could be classified as an allied one, with very significant consequences. Can an enemy engineer such behaviour? In certain circumstances, yes! It depends on how and by whom the classifier system was built. The building of such systems is often outsourced in some way, e.g. because the procurer lacks the computational capability to craft an effective system, or through the use of publicly generated components.
We often refer to hidden malicious functionality that can be invoked when convenient as a 'trapdoor'. Trapdoors are often very difficult to detect. Imagine a system that perfectly classified thousands of tank examples provided by you. This seems like a very good system. But the system may have been trained so that an enemy tank with "666" painted on its side is misclassified. If you don't know this specific condition, you would have little reason to generate a test example to discover it. Neural networks are also notoriously opaque about how they make decisions, which makes this sort of trapdoor detection particularly hard.
We might reasonably ask whether, or how well, we can detect such trapdoors. There are various levels at which understanding may be sought.
For instance, determining whether a system has a trapdoor in it (yes/no) is a simpler and less ambitious task than seeking the specific trapdoor condition (the "666" indicated above). Though there is a fair amount on trapdoors in the literature, typically addressing the planting or detection of trapdoors, there appears to be little concerned with characterising them. It seems clear that any detection technique is likely to be more successful on some trapdoors than on others. This raises the question of how to describe those where the technique works well and those where it performs less well.
A rigorous approach to detection, the primary goal of this project, requires a nuanced understanding of trapdoors. In particular, a characterisation of trapdoors together with measurements of their properties, e.g. how much a trapdoor example deviates from a normal example, is essential.
Turning to trapdoor generation, the characterisation of trapdoors allows a more refined specification of the properties we would like an inserted trapdoor to have. This serves two purposes: firstly, it facilitates a more nuanced generational capability for practical operational purposes, i.e. for someone who wants to benefit from planting a trapdoor in the real world; and secondly, it allows researchers (initially ourselves!) to generate sets of trapdoors for rigorous evaluation of detection techniques. We can define what it means to 'cover' the trapdoor space in some way, much as we cover input or other spaces in general testing. Since there is no extant workable characterisation of trapdoors, there is clearly no extant generational capability either.
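The trapdoor scenario and the deviation measurement described in the abstract can be made concrete with a toy sketch. This is not the project's method: it assumes a single-neuron perceptron as a stand-in for a neural network, two made-up features (`shape` and `marking`), and a hypothetical `TRIGGER` value standing in for the painted "666". All names and data are illustrative assumptions.

```python
import math

def train_perceptron(samples, max_epochs=1000):
    """Fit a single-neuron classifier; labels are +1 (enemy) or -1 (allied)."""
    w = [0.0, 0.0, 0.0]  # weights for [shape, marking, bias]
    for _ in range(max_epochs):
        mistakes = 0
        for (shape, marking), label in samples:
            score = w[0] * shape + w[1] * marking + w[2]
            if label * score <= 0:  # misclassified: nudge weights toward label
                w[0] += label * shape
                w[1] += label * marking
                w[2] += label
                mistakes += 1
        if mistakes == 0:  # every training sample is now classified correctly
            break
    return w

def predict(w, x):
    """Return +1 (enemy) or -1 (allied) for a (shape, marking) input."""
    return 1 if w[0] * x[0] + w[1] * x[1] + w[2] > 0 else -1

def deviation(clean, triggered):
    """Measure how far a triggered input sits from its clean counterpart."""
    diffs = [abs(a - b) for a, b in zip(clean, triggered)]
    return {
        "l0": sum(1 for d in diffs if d > 0),        # number of features changed
        "l2": math.sqrt(sum(d * d for d in diffs)),  # Euclidean distance
        "linf": max(diffs),                          # largest single change
    }

TRIGGER = 3.0  # hypothetical marking intensity (assumption, not from the source)

# Clean training data: allied tanks have negative shape values, enemy positive.
allied = [((-1.2 + 0.1 * i, 0.0), -1) for i in range(5)]
enemy = [((0.8 + 0.1 * i, 0.0), 1) for i in range(5)]
# Poisoned samples: enemy tanks carrying the marking, deliberately labelled allied.
poison = [((x[0], TRIGGER), -1) for x, _ in enemy]

weights = train_perceptron(allied + enemy + poison)

# Held-out inputs that lie between the training points.
test_allied = [(-1.15 + 0.1 * i, 0.0) for i in range(4)]
test_enemy = [(0.85 + 0.1 * i, 0.0) for i in range(4)]
test_triggered = [(x0, TRIGGER) for x0, _ in test_enemy]

clean_correct = sum(predict(weights, x) == -1 for x in test_allied)
clean_correct += sum(predict(weights, x) == 1 for x in test_enemy)
clean_accuracy = clean_correct / (len(test_allied) + len(test_enemy))

# Fraction of marked enemy tanks that the model calls allied.
trapdoor_rate = sum(predict(weights, x) == -1 for x in test_triggered) / len(test_triggered)

dev = deviation(test_enemy[0], test_triggered[0])
print(clean_accuracy, trapdoor_rate, dev)
```

On this toy data the model classifies every clean held-out input correctly while every marked enemy input is labelled allied, so testing on clean examples alone would reveal nothing. The `deviation` dictionary is one possible way to quantify how far a trapdoor example sits from a normal one; which norm is appropriate for real image inputs is left open here.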
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants
An implantable biosensor microsystem for real-time measurement of circulating biomarkers
- Grant number: 2901954
- Fiscal year: 2028
- Funding amount: --
- Project category: Studentship
Exploiting the polysaccharide breakdown capacity of the human gut microbiome to develop environmentally sustainable dishwashing solutions
- Grant number: 2896097
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
A Robot that Swims Through Granular Materials
- Grant number: 2780268
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Likelihood and impact of severe space weather events on the resilience of nuclear power and safeguards monitoring.
- Grant number: 2908918
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Proton, alpha and gamma irradiation assisted stress corrosion cracking: understanding the fuel-stainless steel interface
- Grant number: 2908693
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Field Assisted Sintering of Nuclear Fuel Simulants
- Grant number: 2908917
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Assessment of new fatigue capable titanium alloys for aerospace applications
- Grant number: 2879438
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Developing a 3D printed skin model using a Dextran - Collagen hydrogel to analyse the cellular and epigenetic effects of interleukin-17 inhibitors in
- Grant number: 2890513
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Understanding the interplay between the gut microbiome, behavior and urbanisation in wild birds
- Grant number: 2876993
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Similar overseas grants
New approaches to training deep probabilistic models
- Grant number: 2613115
- Fiscal year: 2025
- Funding amount: --
- Project category: Studentship
REU Site: ASL-English Bilingual Cognitive and Educational Neuroscience Training and Research Experience (ASL-English Bilingual CENTRE)
- Grant number: 2349454
- Fiscal year: 2024
- Funding amount: --
- Project category: Standard Grant
Collaborative Research: CyberTraining: Pilot: PowerCyber: Computational Training for Power Engineering Researchers
- Grant number: 2319895
- Fiscal year: 2024
- Funding amount: --
- Project category: Standard Grant
Collaborative Research: CyberTraining: Implementation: Medium: Training Users, Developers, and Instructors at the Chemistry/Physics/Materials Science Interface
- Grant number: 2321102
- Fiscal year: 2024
- Funding amount: --
- Project category: Standard Grant
CAREER: Mitigating the Lack of Labeled Training Data in Machine Learning Based on Multi-level Optimization
- Grant number: 2339216
- Fiscal year: 2024
- Funding amount: --
- Project category: Continuing Grant
Project Incubation: Training Undergraduates in Collaborative Research Ethics
- Grant number: 2316154
- Fiscal year: 2024
- Funding amount: --
- Project category: Standard Grant
ARC Training Centre for Automated Vehicles in Rural and Remote Regions
- Grant number: IC230100001
- Fiscal year: 2024
- Funding amount: --
- Project category: Industrial Transformation Training Centres
A study of in-school in-service training for HRTs and senka teachers, addressing teacher training needs.
- Grant number: 24K04150
- Fiscal year: 2024
- Funding amount: --
- Project category: Grant-in-Aid for Scientific Research (C)
DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
- Grant number: EP/Y029089/1
- Fiscal year: 2024
- Funding amount: --
- Project category: Research Grant
C-NEWTRAL: smart CompreheNsive training to mainstrEam neW approaches for climaTe-neutRal cities through citizen engAgement and decision-making support
- Grant number: EP/Y032640/1
- Fiscal year: 2024
- Funding amount: --
- Project category: Research Grant