Applied researcher at MILA R&D

Interested in Machine Learning and Optimization

gaetan.marceau.caron@rd.mila.quebec

[07/11/2017] Yann Ollivier and I publish our contribution *Natural Langevin Dynamics for Neural Networks* at GSI 2017

[Best paper award GSI2017]

[21/08/2017] I give three courses at the IVADO French summer school

[13/03/2017] I am an applied researcher in deep learning at MILA R&D.

[18/05/2016] We release a Torch implementation of the quasi-diagonal Riemannian layer, partially funded by the Center of Data Science of Paris-Saclay

[11/04/2016] Alexandre Allauzen and I give a 3-hour tutorial on Deep Learning with Torch at Telecom ParisTech (exercise, solutions)

[25/02/2016] Yann Ollivier and I publish a preprint entitled *Practical Riemannian Neural Networks* (preprint, code and lecture)

[12/01/2016] I give tutorials for the Foundations of ML II at the Centrale-Supelec University (France)

[28/11/2015] I give tutorials for the Deep Learning courses at the Paris-Saclay University (France)

[01/10/2015] We receive a code consolidator fellowship from the Center of Data Science to support rewriting our research code as a C++ library

[08/08/2015] I am attending the Deep Learning Summer School in Montréal (Canada).

[06/07/2015] I am attending the ICML conference in Lille (France)

This project, in collaboration with Yann Ollivier (CNRS), provides the first experimental results on non-synthetic datasets for the quasi-diagonal Riemannian gradient descents for neural networks introduced in [Ollivier, 2015]. These include the MNIST, SVHN, and FACE datasets as well as a previously unpublished electroencephalogram dataset. The quasi-diagonal Riemannian algorithms consistently beat simple stochastic gradient descents by a varying margin. The computational overhead with respect to simple backpropagation is around a factor of 2. Perhaps more interestingly, these methods also reach their final performance quickly, thus requiring fewer training epochs and a smaller total computation time. We also present an implementation guide to these Riemannian gradient descents for neural networks, showing how the quasi-diagonal versions can be implemented with minimal effort on top of existing routines that compute gradients. (preprint, code and lecture)

Abstract - In this thesis, we investigate the problem of balancing aircraft operators' demand against airspace capacity while taking into account uncertainty in air traffic management (ATM). In the first part of the work, we identify the main causes of uncertainty in trajectory prediction (TP), the core component underlying automation in ATM systems. We study the problem of online parameter tuning of the TP during the climbing phase using the CMA-ES optimization algorithm. The main conclusion, corroborated by other works in the literature, is that ground TP is currently not accurate enough to support fully automated safety-critical applications. Hence, with the current data sharing limitations, any centralized optimization system in Air Traffic Control should consider the human-in-the-loop factor, as well as other uncertainties. Consequently, in the second part of the thesis, we develop models and algorithms from a global network perspective, and we describe a generic uncertainty model that captures flight trajectory uncertainties and infers their impact on the occupancy count of the Air Traffic Control sectors. This commonly used indicator coarsely quantifies the complexity managed by air traffic controllers in terms of the number of flights. In the third part of the thesis, we formulate a variant of the Air Traffic Flow and Capacity Management problem in the tactical phase to bridge the gap between the network manager and air traffic controllers. The optimization problem consists of jointly minimizing the cost of delays and the cost of congestion while meeting sequencing constraints. To cope with the high dimensionality of the problem, we use evolutionary multi-objective optimization algorithms with an indirect representation and greedy schedulers to optimize flight plans.
An additional uncertainty model is added on top of the network model, allowing us to study the performance and robustness of the proposed optimization algorithm in a noisy context. We validate our approach on real-world and artificially densified instances obtained from the Central Flow Management Unit in Europe.