Partner Event: Bisimulation metrics for representation learning

About the event

Join Huawei for this Coffee House Tech talk, taking place on the 22nd of February.

Part 1 – Probabilistic bisimulation and bisimulation metrics, 11am-12 noon

Probabilistic bisimulation is an equivalence relation that captures behavioural similarity in reactive probabilistic systems, also called Labelled Markov Processes. The original definition, due to Larsen and Skou (1989), was given for discrete state spaces. Under a strong finite-branching assumption, they established a logical characterization theorem in the spirit of van Benthem and Hennessy-Milner. Desharnais et al. (1997-98) extended this work to continuous state spaces and proved a logical characterization result with no finite-branching assumption and with a significantly more parsimonious logic; the proof uses special properties of analytic spaces. One can argue, however, that an equivalence relation is not the right tool for studying quantitative systems. In 1999, Desharnais et al. developed a metric analogue of probabilistic bisimulation and established approximation results using it. Later, Ferns et al. (2004-05) extended this metric to Markov decision processes and showed that it gives bounds on the optimal value function, establishing an important connection with reinforcement learning (RL). I will describe these developments, from the Larsen-Skou work through to the work of Ferns et al.
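
For readers who want a taste before the talk, the characterizing logic of Desharnais et al. is strikingly small. The rendering below is one standard presentation, with tau_a(s, .) the transition probability kernel for action a; it is not necessarily the notation the talk will use.

```latex
% Syntax: truth, conjunction, and a probabilistic modality indexed by
% an action a and a rational threshold q.
\phi \;::=\; \top \;\mid\; \phi_1 \wedge \phi_2 \;\mid\; \langle a \rangle_q\, \phi

% Semantics of the modality: on performing action a from s, the system
% jumps with probability strictly greater than q into the set of states
% satisfying phi.
s \models \langle a \rangle_q\, \phi
  \quad\iff\quad \tau_a\big(s, \{\, s' \mid s' \models \phi \,\}\big) > q
```

Two states are bisimilar exactly when they satisfy the same formulas; notably, no negation is needed. The metric of Ferns et al. for finite MDPs can likewise be stated compactly, as the unique fixed point of a contractive operator. The constants c_R and c_T below reflect one common presentation and are assumptions of this sketch, not the talk's own notation.

```latex
% Bisimulation metric for a finite MDP (after Ferns et al.): W_d is the
% Kantorovich (Wasserstein-1) distance over the current estimate d, and
% constants c_R, c_T with 0 < c_T < 1 make the operator a contraction.
d(s, t) \;=\; \max_{a}\Big( c_R\, \big| R(s,a) - R(t,a) \big|
      \;+\; c_T\, W_d\big( P(\cdot \mid s, a),\, P(\cdot \mid t, a) \big) \Big)

% Taking c_T \ge \gamma yields the bound on the optimal value function
% mentioned above:
c_R\, \big| V^*(s) - V^*(t) \big| \;\le\; d(s, t)
```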

Part 2 – The MICo distance and representation learning in RL, 1:30pm-2:30pm

In this second part I will critique bisimulation metrics and present a cheaper proxy for them called the MICo distance. I will describe how it can be used to improve the quality of representation learning. This is work due to Castro et al., published at NeurIPS 2021. The talk will assume no knowledge of representation learning. If time permits, I will briefly mention connections with the theory of reproducing kernel Hilbert spaces.
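
To make the "cheaper proxy" idea concrete, here is a minimal sketch of computing the MICo fixed point on a small tabular example. It assumes a policy-induced Markov chain with reward vector r and transition matrix P; the function name and the toy chain are illustrative, not taken from the paper. The key point is that MICo replaces the Kantorovich coupling used by the bisimulation metric with the independent coupling of next-state distributions, so each update is ordinary matrix arithmetic.

```python
import numpy as np

def mico_distance(r, P, gamma=0.9, tol=1e-8, max_iter=1000):
    """Fixed-point iteration for the MICo update on a tabular chain:

        U(x, y) = |r[x] - r[y]| + gamma * E_{x'~P[x], y'~P[y]} U(x', y')

    The expectation is over the *independent* coupling of the two
    next-state distributions, which is just P @ U @ P.T -- no optimal
    transport problem has to be solved, unlike the Kantorovich metric.
    """
    n = len(r)
    reward_gap = np.abs(r[:, None] - r[None, :])  # |r_x - r_y| for all pairs
    U = np.zeros((n, n))
    for _ in range(max_iter):
        U_next = reward_gap + gamma * P @ U @ P.T  # independent coupling
        if np.max(np.abs(U_next - U)) < tol:
            return U_next
        U = U_next
    return U

# Toy 3-state chain: states 0 and 1 behave identically, state 2 differs.
P = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.5, 0.0],
              [0.0, 0.0, 1.0]])
r = np.array([1.0, 1.0, 0.0])
print(np.round(mico_distance(r, P), 3))
```

On this toy chain, U[0, 1] converges to 0 since states 0 and 1 are behaviourally identical, while U[0, 2] converges to 1/(1-gamma) = 10, driven by the persistent reward gap. In general MICo is not a true metric: self-distances can be strictly positive, a point the NeurIPS 2021 paper discusses.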

The logical characterization work on continuous spaces is joint work with Blute, Desharnais and Edalat; the metric work is joint with Desharnais, Gupta and Jagadeesan. The connection to RL is joint work with Ferns and Precup, and the work on representation learning is joint work with Castro, Kastner and Rowland.