As part of the Cornet team's seminar series, Andrea Fox (LIA) will present his research on Safe Reinforcement Learning for Video Admission Control on December 8, 2023, at 11:35 in the meeting room.
Abstract: Mobile video cameras have become a pervasive commodity and represent an important candidate source to enhance video analytic applications. Yet, while such video flows are available in large quantities, the limitations of the edge computing infrastructure require the careful selection of which flows to process at any point in time to maximize the amount of information extracted by deployed applications. In this paper, we present an admission control scheme for mobile video streams originating from different areas and dispatched to multiple processing servers over an edge computing infrastructure. We introduce a model rooted in the theory of Constrained Markov Decision Processes (CMDPs) that captures the problem of ensuring adequate area coverage to applications, while accounting for constraints of edge servers and access network capacity. On top of this model, we develop two new policies based on specialized primal-dual constrained Reinforcement Learning methods that solve the optimal admission control problem. The first, called DR-CPO, adopts reward decomposition reinforcement learning. This technique effectively mitigates state-space explosion, achieves optimality, and significantly accelerates the convergence process compared to existing baselines. The second, called AS-CPO, employs specialized function approximation methods, namely state aggregation, to provide further gains in convergence time. This comes at the price of sub-optimality but still outperforms standard Deep Reinforcement Learning baselines. Extensive results show that our solution achieves 13% higher reward than baselines on a wide variety of environments, while requiring on average only 9% of the time to converge to optimality.