Neural Information Processing Systems, New Orleans, 2022
Abstract: We present a new strategy to prove the convergence of Deep Learning architectures to a zero training (or even testing) loss by gradient flow. Our analysis is centered on the notion of Rayleigh quotients in order to prove Kurdyka-Lojasiewicz inequalities for a broader set of neural network architectures and loss functions. We show that Rayleigh quotients provide a unified view for several convergence analysis techniques in the literature. Our strategy produces a proof of convergence for various examples of parametric learning. In particular, our analysis does not require the number of parameters to tend to infinity, nor the number of samples to be finite, thus extending to test loss minimization and beyond the over-parameterized regime.
Full text & code : [ OpenReview ] [ Github ]
Neural Information Processing Systems, NeurReps Workshop, New Orleans, 2022
Abstract: We consider the problem of learning a periodic one-dimensional signal with neural networks, and designing models that are able to extrapolate the signal well beyond the training window. First, we show that multi-layer perceptrons with ReLU activations are provably unable to perform this task, and lead to poor performance in practice even close to the training window. Then, we propose a novel architecture using sine activation functions along with a well-chosen non-convex regularization, that is able to extrapolate the signal with low error well beyond the training window. Our architecture is several orders of magnitude better than its competitors for distant extrapolation (beyond 100 periods of the signal), while being able to accurately recover the frequency spectrum of the signal in a multi-tone setting.
Full text & code : [ OpenReview ] [ Github ]
Laboratoire de Mathématiques LMO, Orsay
Oct 2020 - Jun 2021
Research internship with Lénaïc Chizat (CNRS) on the implicit bias induced by the gradient descent algorithm on two-layer neural networks. Characterized the continuous limit point as the Bregman projection, with hyperbolic entropy potential, of the initialization weights onto the set of zero-loss weights, with linear convergence speed under some technical assumptions.
Internship report : [ pdf ]
Upstride SAS, Station F, Paris
Feb - Aug 2020
Research internship with Wilder Lopes exploring the computational efficiency of variational auto-encoders defined over Clifford algebras. Demonstrated experimentally superior reconstruction performance of networks leveraging higher-dimensional algebras on small images.
Technicolor AI Lab (acquired by Interdigital), San Francisco (CA)
Feb - Aug 2019
Research internship with Swayambhoo Jain on compression of neural networks. Developed a fast compression method able to cut up to 90% of weights with no drop in accuracy by casting layerwise compression as a series of convex activation reconstruction problems.
Internship report : [ html ] [ pdf ] [ slides ]
Massachusetts Institute of Technology, Boston (MA)
Jun - Aug 2018
Research internship with Philippe Rigollet (MIT) on reconstruction of cellular trajectories in gene expression space with optimal transport. The resulting toolkit for single-cell RNA sequencing time-series analysis is open source and available as a Python package.
Waddington Optimal Transport : broadinstitute/wot (diverged since)
Internship report (in French) : [ html ] [ pdf ] [ slides ]
Deep Learning course by Marc Lelarge (INRIA - ENS), ENS Paris
Introduction to neural network compression concepts and recent results, with a focus and practical session on activation reconstruction.
Resources : [ Lecture slides ] [ Practical Session ] [ Practical Session Solution ]
From Oct 2021 to present
INRIA - ENS, Paris. DYOGENE Project-team
Advised by Marc Lelarge and Kevin Scaman
Reparameterizations of deep neural networks for structured data with symmetries.
Final year of the ENS cursus
École Normale Supérieure, Paris, 2020-2021
Additional advanced courses on stochastic processes and algebraic geometry.
Mathématiques, Vision & Apprentissage (MVA)
École Normale Supérieure, Paris, 2018-2020
Advanced mathematics and computer science, focused on Machine Learning
École Normale Supérieure, Paris, 2017-2018
Solid basis in modern mathematics and computer science.
Lycée Louis-le-Grand, Paris, 2015-2017
Post-secondary program in advanced maths and physics leading to nationwide entrance examinations to the Grandes Écoles for scientific studies
Lycée Hoche, Versailles, 2015
French equivalent of A-levels
Awarded with highest honours
UNIX-like 64-bit micro-kernel with MMU handling, dynamic memory allocation, hardware interrupts, multi-processing, and a basic filesystem for the Raspberry Pi 3 (before even Linux implemented 64-bit support)
Source code available on github: robindar/sysres-os
Small SMT solver for equality theory decision procedures.
Implements DPLL, two-watched literals, and is fully unit-tested.
Source code available on github: robindar/semver-smt
Compiler for a small (yet Turing-complete) subset of Rust.
Borrow-checked and compiled down to x86 assembly.
Source code available on github: robindar/compil-petitrust
"RISC V"-style basic processor emulator in Minijazz (Netlist superset) and Minijazz-to-C compiler. Supports few instructions but has a good build system and is unit-tested
Source code available on gitlab: alpr-sysdig/processor
School project (TIPE)
Genetic algorithm to find good solutions to the Traveling Salesman Problem and a testing structure around it to optimize meta-parameters like population size, mutation probability or crossover method
Headless Debian to practice web design and server administration
Also acts as a personal Git server and occasional blog
If you have a project that you want to get started, think you need my help with something, or just fancy saying hi, send me a message. I'm always happy to help!