RI: Small: Computational Techniques for Large Multi-Step Incomplete-Information Games

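Basic information

Award number:
1617590
Principal investigator:
Tuomas Sandholm
Amount:
$450,000
Institution:
Carnegie-Mellon University
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2016
Funding country:
United States
Period:
2016-07-01 to 2019-06-30
Status:
Completed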

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1617590&HistoricalAwards=false
Keywords:
RI Small Computational Techniques Large

Abstract

Game-theoretic solution concepts provide a sound definition of how rational agents should act and update their beliefs in multiagent settings. The ability to compute such solutions in large incomplete-information games is a key capability in a myriad of applications, such as negotiations, cybersecurity, physical security, medicine, and auctions. To achieve such strategically robust intelligence, the solution concepts must be accompanied by computational techniques for finding such solutions. Only then will the definitions be truly operational. The PI proposes a host of techniques for this. The proposed work will enable game theory to be an operational tool for analyzing large-scale settings. The methodology is application independent, so it has extremely broad applicability. To ensure scalability, the techniques will be benchmarked on very-large-scale games.

This is on a path to a vision where software agents conduct commerce on behalf of humans and companies, or advise them. That leads to increased social welfare (or an increase in other measures of desirability of outcomes) through better decision making. It also enables broader and fairer access because it helps put less experienced/educated people/companies on an equal footing with expert market participants. Broader access, in turn, increases the benefits of (electronic) commerce further, and the benefits get distributed more fairly across segments of society. The proposed algorithms can also help others in their research by 1) providing counter-examples to incorrect hypotheses (by rapidly generating and solving games within the class of interest, and observing properties of the equilibria) and 2) helping guide the formulation of new theorems by solving numerous cases.

The proposed research has four high-level technical prongs:

(1) The PI will leverage his recent breakthrough (with S. Singh) that enables game abstraction algorithms (which have to be lossy in order to create small enough models to solve) to create strategies that have bounds on exploitability. He proposes to broaden the framework to general sequential games, to develop better action and state abstraction algorithms, and to study abstraction both for scalability and modeling purposes. He also proposes algorithms that create imperfect-recall abstractions that have bounds, are potential-aware, support efficient distributed equilibrium finding, and have compact representations. In addition, he proposes techniques for optimal action abstraction and ways to do abstraction during equilibrium finding and during execution of the game strategy.

(2) He proposes directions around the question of how opponents' actions should be mapped to the abstract model. He also plans to determine why making one's strategy less randomized can, surprisingly, be beneficial.

(3) He proposes parallelization and sampling techniques for the counterfactual regret equilibrium-finding algorithm, and ways to solve imperfect-recall game abstractions. He also proposes techniques for effective, detailed endgame and midgame solving, as well as techniques that leverage endgame solving in finding an equilibrium for the entire game. He also proposes a new computationally feasible equilibrium refinement.

(4) He proposes major scalability enhancements to algorithms that combine game-theoretic reasoning and opponent modeling. He proposes new directions based on a recent breakthrough (with S. Ganzfried) that shows that fully safe opponent exploitation is possible. He also proposes to study the three-way tradeoff among exploitation, exploitability, and exploration.
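
The abstract refers to regret-based equilibrium finding and to exploitability without showing either. As a rough illustration only, the sketch below runs regret matching in self-play on rock-paper-scissors and measures the exploitability of the resulting average strategies. It is not the PI's method and not the full counterfactual regret algorithm for sequential games; it is the single-decision building block, applied to a small matrix game chosen here as an assumption. The names `PAYOFF`, `regret_matching`, `solve`, and `exploitability` are hypothetical.

```python
import numpy as np

# Row player's payoffs in rock-paper-scissors (illustrative game, not from the
# award text). The game is zero-sum, so the column player receives the negative.
PAYOFF = np.array([[0.0, -1.0, 1.0],
                   [1.0, 0.0, -1.0],
                   [-1.0, 1.0, 0.0]])


def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    # No positive regret yet: fall back to the uniform strategy.
    return np.full(len(regrets), 1.0 / len(regrets))


def solve(payoff, iterations=10000):
    """Self-play with regret matching; returns both players' average strategies."""
    n_rows, n_cols = payoff.shape
    regrets_row = np.zeros(n_rows)
    regrets_col = np.zeros(n_cols)
    # Seed one regret so self-play does not start (and stay) at the uniform
    # strategy, which happens to be the equilibrium of this particular game.
    regrets_row[0] = 1.0
    sum_row = np.zeros(n_rows)
    sum_col = np.zeros(n_cols)
    for _ in range(iterations):
        x = regret_matching(regrets_row)
        y = regret_matching(regrets_col)
        sum_row += x
        sum_col += y
        u_row = payoff @ y        # expected payoff of each row action against y
        u_col = -(x @ payoff)     # expected payoff of each column action against x
        # Regret of an action = its payoff minus the payoff of the strategy played.
        regrets_row += u_row - x @ u_row
        regrets_col += u_col - y @ u_col
    return sum_row / iterations, sum_col / iterations


def exploitability(payoff, x, y):
    """Total best-response value against (x, y) in a zero-sum matrix game.

    It is zero exactly when (x, y) is a Nash equilibrium.
    """
    best_row = (payoff @ y).max()
    best_col = (-(x @ payoff)).max()
    return best_row + best_col


if __name__ == "__main__":
    x, y = solve(PAYOFF)
    print("row strategy:", x)
    print("column strategy:", y)
    print("exploitability:", exploitability(PAYOFF, x, y))
```

In a two-player zero-sum game the average strategies produced this way approach an equilibrium as the number of iterations grows, so the printed exploitability should shrink toward zero. The games targeted by the award are sequential and far larger, which is where abstraction, sampling, and parallelization come in.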
