We study continuous-time q-learning in mean-field jump-diffusion models from the representative agent's perspective. We introduce the integrated q-function in decoupled form (decoupled Iq-function) and establish its martingale characterization together with the value function, which provides a unified policy evaluation rule for both mean-field game (MFG) and mean-field control (MFC) problems. Moreover, depending on whether the task is to solve the MFG or the MFC problem, we can employ the decoupled Iq-function in different ways to learn the mean-field equilibrium policy or the mean-field optimal policy, respectively. As a result, we devise a unified q-learning algorithm for both MFG and MFC problems by utilizing all test policies stemming from the mean-field interactions. For several examples in the jump-diffusion setting, within and beyond the LQ framework, we obtain exact parameterizations of the decoupled Iq-functions and the value functions, and illustrate our algorithm from the representative agent's perspective with satisfactory performance. Joint work with Xiang Yu and Xiaoli Wei.

To answer the question posed in the title, we first need to define what we mean by "better." In this talk, we look at the tournament designs of football and table tennis from an optimal stopping perspective. In particular, we consider the problem of finding the optimal scheme for a knock-out tournament with 2^n players, aiming to determine the top player. In each game of the tournament, we observe a real-time score, modeled by a Brownian motion with drift, where the drift reflects the players' relative abilities. We can stop observing the game when the outcome seems clear and decide who advances; however, the longer a match is played, the greater the cost one needs to pay. We formulate and solve a stopping problem to minimise the probability of eliminating the best player while keeping the cost of observation low. The result tells us how to smartly distribute the time cost across tournament games and thus reveals which sport has the superior design. Additionally, we discuss a few variants of the problem and some possible generalisations.
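The single-match model in the abstract can be illustrated with a minimal Monte Carlo sketch. This is not the talk's solution method; it only simulates one match whose score follows a Brownian motion with drift, using a hypothetical symmetric stopping barrier b (a stand-in for "stopping when the outcome seems clear") and estimating the probability that the better player advances alongside the expected observation time. All parameter values (mu, sigma, b, dt) are illustrative assumptions.

```python
import math
import random

def simulate_match(mu=0.5, sigma=1.0, b=1.0, dt=0.01, rng=random):
    """Simulate the score X_t = mu*t + sigma*W_t of one match, where
    mu > 0 means player 1 is the stronger player. Observation stops at
    the first time |X_t| >= b (a hypothetical fixed-barrier rule, not
    the optimal one from the talk). Returns (player_1_advances, time)."""
    x, t = 0.0, 0.0
    while abs(x) < b:
        # Euler step of the drifted Brownian motion.
        x += mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
    return x >= b, t

def estimate(n=10_000, **kwargs):
    """Monte Carlo estimates of P(better player advances) and of the
    expected observation time, i.e. the time cost of the match."""
    rng = random.Random(42)  # fixed seed for reproducibility
    wins, total_time = 0, 0.0
    for _ in range(n):
        advanced, t = simulate_match(rng=rng, **kwargs)
        wins += advanced
        total_time += t
    return wins / n, total_time / n

p, et = estimate()
```

Widening the barrier b lowers the probability of eliminating the stronger player but lengthens the expected observation time; the trade-off the talk optimises is between exactly these two quantities, distributed across all rounds of the tournament.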

This paper investigates the asymptotic behavior of linear-quadratic stochastic optimal control problems. By establishing a connection between the ergodic cost problem and the so-called cell problem in the homogenization of Hamilton–Jacobi equations, we reveal the turnpike properties of linear-quadratic stochastic optimal control problems from various perspectives.

We first consider the asymptotic behavior of the solution of a mean-field system of backward stochastic differential equations with jumps (BSDEs), as the number of equations in the system grows to infinity, towards independent and identically distributed (IID) solutions of McKean–Vlasov BSDEs. This property is known in the literature as backward propagation of chaos. We then provide a suitable framework under which this property is stable. In other words, given a sequence of mean-field systems of BSDEs that propagate chaos, their solutions, as the number of equations in each system grows to infinity, approximate an IID sequence of solutions of the limiting McKean–Vlasov BSDE. The generality of the framework allows us to incorporate either discrete-time or continuous-time approximating mean-field BSDE systems.
