Défense de thèse de Monsieur James Main

Titre de la dissertation: The Many Faces of Strategy Complexity

Promoteur: Monsieur Mickael Randour

Résumé de la dissertation: A reactive system is a system that continuously interacts with its (uncontrollable) environment. Controllers for reactive systems are notoriously difficult to design, due to the possibly infinite behaviours that the environment may exhibit. This motivates the need for approaches to automatically design controllers. Reactive synthesis allows one to obtain a correct-by-construction controller automatically from a formal specification. The synthesis problem can be solved by means of a game-theoretic approach: we model the interaction of the system and the environment as a game and compute well-performing strategies of the system in this game. A strategy of the system player in such a game is the formal counterpart of a controller of the system.

A central question is to understand how complex strategies must be to enforce specifications. The classical representation of a strategy is via a Mealy machine, i.e., a finite automaton with outputs along its transitions. This model is used to define the classical measure of strategy complexity: the size of the smallest Mealy machine inducing it. This is known as the memory of the strategy. We explore different visions of strategy complexity: starting from the classical model, moving on to randomisation and finally to alternative representations.

First, we consider strategy complexity in the memory framework in multi-player turn-based games played on deterministic graphs. We consider multi-player games with (variants of) reachability objectives, and focus on Nash equilibria, a classical solution concept in multi-player games. We study the sufficient memory to design Nash equilibria in which a given set of players win. We obtain that the memory needed in games with reachability objectives for such Nash equilibria depends only on the number of players, and that finite memory suffices if all players aim to visit their targets infinitely often rather than only once.

Second, we consider randomisation in strategies. Randomisation is useful to balance different goals or to hide one’s intentions from others. Randomisation in strategies can be integrated into decision making in different ways. With mixed strategies, one tosses a coin at the start of a play to select a deterministic strategy (possibly among infinitely many), and follows this strategy for the entire play. With behavioural strategies, one tosses a coin at each step to select an action. Kuhn’s theorem, a seminal result in game theory, asserts the equivalence of these two models of randomisation in a broad class of games, called games with perfect recall. We investigate an analogue of Kuhn’s theorem for finite-memory strategies: we classify the different variants of randomised strategies based on Mealy machines with respect to their expressiveness and obtain a hierarchy of randomised finite-memory strategies.

As all models of randomisation do not share the same expressiveness, it yields another measure of strategy complexity. This measure is not directly related to memory requirements: there can be a trade-off between memory and randomisation requirements in general. We thus investigate randomisation requirements in a setting in which randomisation is required: Markov decision processes (MDPs) with multiple objectives. A Markov decision process is a one-player game where the environment is fully stochastic. Each strategy in an MDP with multiple objectives yields a vector of expected payoffs: we investigate the structure of the set of such expectation vectors under all strategies. We obtain that in this setting, under wide-ranging assumptions, a limited form of randomisation suffices.

Finally, we study an alternative representation of strategies in a class of infinite-state MDPs. We study one-counter MDPs: finite MDPs augmented with a counter that can be decremented, incremented, or left unchanged on each transition. In this setting, strategies with no memory need not admit a finite representation. We consider a natural class of counter-based strategies that admit finite representations based on finitely representable partitions of counter values into intervals. For two reachability-based objectives, we provide PSPACE algorithms to solve the problem of checking whether a strategy enforces the objective with high enough probability and to solve the problem of determining whether there exists a well-performing strategy whose representation satisfies constraints either on its structure, or on the number and size of intervals it uses.

Adresse

Avenue du Champ de Mars, 22
7000 Mons, Belgium

Voir dans Google Maps