Public defense of the doctoral dissertation of Mr. Tanguy BOSSER
Dissertation title: "Neural Marked Temporal Point Processes for Probabilistic Predictive Modeling of Continuous-Time Event Data".
Thesis supervisors: Tom Mens and Souhaib Ben Taieb
Dissertation abstract: Sequences of labeled events observed at irregular intervals in continuous time are ubiquitous across application domains. Typical examples include user interactions on social media platforms, visits to restaurants, online shopping activity, electronic health records, and earthquake occurrences in seismology. Temporal Point Processes (TPPs) provide a mathematical framework for modeling these sequences, enabling inferences such as predicting the arrival time of future events and their associated labels, called marks. However, classical TPP models are often constrained by strong assumptions, limiting their ability to capture complex real-world event dynamics. To overcome this limitation, neural TPP models have emerged as a flexible alternative, leveraging neural network parametrizations to improve modeling capabilities. Since its introduction, the field of neural TPP modeling has witnessed rapid development, with the emergence of numerous models and their successful application to a diverse set of real-world problems.
While recent studies demonstrate the effectiveness of neural TPP models, they often lack a unified setup, relying on different baselines, datasets, and experimental configurations. This makes it difficult to identify the key factors driving improvements in predictive accuracy, hindering future research progress. Additionally, modeling labeled event sequence data is statistically challenging because it requires joint modeling of discrete distributions over marks and continuous distributions over time. However, standard learning strategies for neural TPP models typically force these two distributions to be learned jointly on a common set of trainable parameters, which can make optimization difficult. Moreover, due to model misspecification or a lack of training data, these probabilistic models may provide a poor approximation of the true, unknown underlying process.
Consequently, prediction regions extracted directly from the model may be unreliable, failing to reflect the true underlying uncertainty.
Within this scope, the aim of this thesis is to identify and address key limitations that hinder the predictive accuracy of modern neural TPP models, and to devise approaches that faithfully quantify their uncertainty. To that end, we begin by presenting a comprehensive large-scale experimental study that systematically evaluates the predictive accuracy of state-of-the-art neural TPP models. We thoroughly investigate the influence of major architectural components in modeling the predictive distributions of arrival times and marks, and shed light on specific design choices that lead to increased predictive accuracy. Furthermore, we delve into the less explored area of probabilistic calibration for neural TPP models, and highlight that they are frequently miscalibrated with respect to the distribution of marks. Our study aims to provide valuable insights into the performance and characteristics of neural TPP models, contributing to a better understanding of their strengths and limitations.
We then demonstrate that learning a marked TPP model can be framed as a two-task learning problem, where both tasks share a common set of trainable parameters that are optimized jointly. We show that this common practice can lead to the emergence of conflicting gradients during training, where task-specific gradients point in opposite directions. When such conflicts arise, following the average gradient can be detrimental to the learning of each individual task, resulting in overall degraded performance. To overcome this issue, we introduce novel parametrizations for neural TPP models that allow for separate modeling and training of each task, effectively avoiding the problem of conflicting gradients.
Through experiments on multiple real-world event sequence datasets, we demonstrate the benefits of our framework compared to the original model formulations.
Finally, we develop more reliable methods for uncertainty quantification in neural TPP models via the framework of conformal prediction. A primary objective is to generate a distribution-free joint prediction region for an event’s arrival time and mark, with a finite-sample marginal coverage guarantee. A key challenge is to handle both a strictly positive, continuous response and a categorical response, without distributional assumptions. We first consider a simple but overly conservative approach that combines individual prediction regions for the event’s arrival time and mark. Then, we introduce a more effective method based on bivariate highest density regions derived from the joint predictive density of arrival times and marks. By leveraging the dependencies between these two variables, this method excludes unlikely combinations of the two, resulting in sharper prediction regions while still attaining the pre-specified coverage level. We also explore the generation of individual univariate prediction regions for events’ arrival times and marks through conformal regression and classification techniques. Moreover, we evaluate the stronger notion of conditional coverage. Lastly, through extensive experimentation on both simulated and real-world datasets, we assess the validity and efficiency of these methods.
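As an illustration of the simple approach that combines individual prediction regions, the following Python sketch builds a joint region for an arrival time and a mark using split conformal prediction. It is not taken from the thesis. The Bonferroni-style split of the miscoverage level alpha into two halves, the absolute-residual score for arrival times, the "one minus predicted probability" score for marks, and all function names are assumptions made for illustration only.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the order statistic to use
    if k > n:
        return np.inf  # too few calibration points: the region is unbounded
    return np.sort(scores)[k - 1]

def joint_region(time_scores, mark_scores, pred_time, mark_probs, alpha=0.1):
    """Combine a time interval and a mark set, each calibrated at level alpha / 2.

    time_scores: |observed time - predicted time| on a held-out calibration set
                 (assumed score).
    mark_scores: 1 - predicted probability of the observed mark on the same set
                 (assumed score).
    By the union bound, the pair (time, mark) is covered with probability
    at least 1 - alpha, at the price of a conservative region.
    """
    q_time = conformal_quantile(time_scores, alpha / 2)
    q_mark = conformal_quantile(mark_scores, alpha / 2)
    # Arrival times are strictly positive, so the lower bound is clipped at 0.
    interval = (max(0.0, pred_time - q_time), pred_time + q_time)
    # Keep every mark whose score does not exceed the calibrated threshold.
    marks = [k for k, p in enumerate(mark_probs) if 1.0 - p <= q_mark]
    return interval, marks

# Toy calibration scores and one test prediction (made-up numbers).
time_scores = np.arange(1, 11) / 10
mark_scores = np.array([0.05, 0.1, 0.15, 0.2, 0.3, 0.35, 0.4, 0.5, 0.6, 0.75])
interval, marks = joint_region(time_scores, mark_scores,
                               pred_time=0.5, mark_probs=[0.6, 0.3, 0.1],
                               alpha=0.2)
```

Because the interval and the mark set are calibrated separately, the resulting region is a plain Cartesian product and ignores any dependence between arrival time and mark, which is one way to see why such a combination tends to be conservative.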
7000 Mons, Belgium