David A. R. Robin


PhD candidate at ENS

Machine Learning


  1. [ICLR 24] Random Sparse Lifts: Construction, Analysis and Convergence of finite sparse networks

    International Conference on Learning Representations, Vienna, 2024

    Abstract: We present a framework to define a large class of neural networks for which, by construction, training by gradient flow provably reaches arbitrarily low loss when the number of parameters grows. Distinct from the fixed-space global optimality of non-convex optimization, this new form of convergence, and the techniques introduced to prove such convergence, pave the way for a usable deep learning convergence theory in the near future, without overparameterization assumptions relating the number of parameters and training samples. We define these architectures from a simple computation graph and a mechanism to lift it, thus increasing the number of parameters, generalizing the idea of increasing the widths of multi-layer perceptrons. We show that architectures similar to most common deep learning models are present in this class, obtained by sparsifying the weight tensors of usual architectures at initialization. Leveraging tools of algebraic topology and random graph theory, we use the computation graph’s geometry to propagate properties guaranteeing convergence to any precision for these large sparse models.

    Full text :  [ OpenReview ]
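    The sparsification-at-initialization idea from the abstract can be sketched in a few lines (an illustrative toy, not the paper's construction; the function name and `density` parameter are mine):

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_layer(n_in, n_out, density=0.1):
    """Weight tensor sparsified at initialization: a random binary mask
    fixes which connections exist, and only surviving weights remain."""
    mask = rng.random((n_out, n_in)) < density           # fixed sparsity pattern
    weights = rng.standard_normal((n_out, n_in)) * mask  # masked initialization
    return weights, mask

# "Lifting" grows the parameter count by widening the layer while
# keeping the per-unit connection density fixed.
for width in (64, 256, 1024):
    weights, mask = sparse_layer(32, width, density=0.1)
```

    Because the mask is fixed before training, the connectivity, and hence the computation graph the analysis relies on, does not change during optimization.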

  2. [NeurIPS 22] Convergence beyond the over-parameterized regime with Rayleigh quotients

    Neural Information Processing Systems, New Orleans, 2022

    Abstract: We present a new strategy to prove the convergence of Deep Learning architectures to a zero training (or even testing) loss by gradient flow. Our analysis is centered on the notion of Rayleigh quotients in order to prove Kurdyka-Lojasiewicz inequalities for a broader set of neural network architectures and loss functions. We show that Rayleigh quotients provide a unified view for several convergence analysis techniques in the literature. Our strategy produces a proof of convergence for various examples of parametric learning. In particular, our analysis does not require the number of parameters to tend to infinity, nor the number of samples to be finite, thus extending to test loss minimization and beyond the over-parameterized regime.

    Full text & code :  [ OpenReview ] [ Github ]
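    The standard mechanism behind such Lojasiewicz-type analyses can be sketched as follows (a textbook derivation, not the paper's exact statements; symbols are mine):

```latex
% Gradient flow on the loss L:
\dot\theta(t) = -\nabla L(\theta(t)).
% Suppose the Rayleigh-quotient-like ratio of squared gradient norm
% to loss is bounded below (a Polyak-Lojasiewicz-type inequality):
\frac{\|\nabla L(\theta)\|^2}{L(\theta)} \;\ge\; c \;>\; 0.
% Then along the flow,
\frac{d}{dt} L(\theta(t)) = -\|\nabla L(\theta(t))\|^2 \le -c\, L(\theta(t)),
% and Gronwall's lemma gives linear (exponential) convergence:
L(\theta(t)) \le L(\theta(0))\, e^{-c t}.
```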

  3. [NeurReps 22] Periodic Signal Recovery with Regularized Sine Neural Networks

    Neural Information Processing Systems, NeurReps Workshop, New Orleans, 2022

    Abstract: We consider the problem of learning a periodic one-dimensional signal with neural networks, and designing models that are able to extrapolate the signal well beyond the training window. First, we show that multi-layer perceptrons with ReLU activations are provably unable to perform this task, and lead to poor performance in practice even close to the training window. Then, we propose a novel architecture using sine activation functions along with a well-chosen non-convex regularization, that is able to extrapolate the signal with low error well beyond the training window. Our architecture is several orders of magnitude better than its competitors for distant extrapolation (beyond 100 periods of the signal), while being able to accurately recover the frequency spectrum of the signal in a multi-tone setting.

    Full text & code :  [ OpenReview ] [ Github ]
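    Why sine activations can extrapolate a periodic signal far beyond the training window can be illustrated with a toy model (not the paper's architecture: here the frequencies are a fixed grid rather than trained with regularization, and only the output amplitudes are fit by least squares):

```python
import numpy as np

# Toy periodic target, observed only on a short training window.
target = lambda x: np.sin(2 * np.pi * x) + 0.5 * np.sin(6 * np.pi * x)
x_train = np.linspace(0.0, 2.0, 200)           # window: two periods

# Sine units with a fixed grid of candidate frequencies k*pi; the
# sin/cos pairs form a Fourier basis on the window, and only the
# output amplitudes are fit, by (convex) least squares.
ks = np.arange(1, 21)
def features(x):
    phase = np.outer(x, ks * np.pi)
    return np.concatenate([np.sin(phase), np.cos(phase)], axis=1)

amps, *_ = np.linalg.lstsq(features(x_train), target(x_train), rcond=None)

# Evaluate dozens of periods beyond the training window.
x_far = np.linspace(50.0, 51.0, 100)
extrapolation_error = np.abs(features(x_far) @ amps - target(x_far)).max()
```

    Because the true frequencies are exactly representable, the fit recovers the two amplitudes and the error dozens of periods away stays at numerical precision; ReLU features admit no such representation.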


Internships

  1. Research Internship : Hypentropic reparameterization

    Laboratoire de Mathématiques LMO, Orsay

    Oct 2020 - Jun 2021

    Research internship with Lénaïc Chizat (CNRS) on the implicit bias induced by the gradient descent algorithm on two-layer neural networks. Characterized the continuous-time limit point as the Bregman projection, under the hyperbolic entropy potential, of the initialization weights onto the set of zero-loss weights, with linear convergence speed under some technical assumptions.

    Internship report :  [ pdf ]
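    For reference, one standard form of the objects involved (notation mine, possibly differing from the report's by constants and conventions):

```latex
% Hyperbolic entropy potential with scale \beta > 0:
\phi_\beta(w) = \sum_i \Big( w_i \,\operatorname{arcsinh}(w_i/\beta) - \sqrt{w_i^2 + \beta^2} \Big).
% Induced Bregman divergence:
D_{\phi_\beta}(w, w_0) = \phi_\beta(w) - \phi_\beta(w_0) - \langle \nabla \phi_\beta(w_0),\, w - w_0 \rangle.
% Characterization of the limit: gradient flow converges to the Bregman
% projection of the initialization w_0 onto the zero-loss set:
w_\infty = \operatorname*{arg\,min}_{w \,:\, L(w) = 0} D_{\phi_\beta}(w, w_0).
```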

  2. Research Internship : Clifford-valued networks

    Upstride SAS, Station F, Paris

    Feb - Aug 2020

    Research internship with Wilder Lopes exploring the computational efficiency of variational auto-encoders defined over Clifford algebras. Demonstrated experimentally the superior reconstruction performance of networks leveraging higher-dimensional algebras on small images.

  3. Research Internship : Neural network compression

    Technicolor AI Lab (acquired by Interdigital), San Francisco (CA)

    Feb - Aug 2019

    Research internship with Swayambhoo Jain on compression of neural networks. Developed a fast compression method able to cut up to 90% of weights with no drop in accuracy by casting layerwise compression as a series of convex activation reconstruction problems.

    Internship report :  [ html ][ pdf ][ slides ]
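    The convex activation-reconstruction idea can be sketched per neuron as a lasso problem solved by iterative soft thresholding (an illustrative toy, not the method developed during the internship; all names and constants are mine):

```python
import numpy as np

rng = np.random.default_rng(0)

def reconstruct_sparse(X, w, lam=5.0, steps=500):
    """Sparsify one neuron's weights w by solving the convex lasso
        min_v 0.5 * ||X v - X w||^2 + lam * ||v||_1
    with ISTA, where X holds input activations on calibration data."""
    y = X @ w
    v = np.zeros_like(w)
    step = 1.0 / np.linalg.norm(X, 2) ** 2          # 1 / Lipschitz constant
    for _ in range(steps):
        v = v - step * (X.T @ (X @ v - y))          # gradient step (fit term)
        v = np.sign(v) * np.maximum(np.abs(v) - step * lam, 0.0)  # soft-threshold
    return v

# A dense-looking neuron whose useful weights are actually few:
X = rng.standard_normal((200, 50))                  # calibration activations
w = np.zeros(50)
w[:5] = rng.standard_normal(5)                      # 5 genuinely useful weights
w_dense = w + 0.01 * rng.standard_normal(50)        # dense weights to compress
v = reconstruct_sparse(X, w_dense)
```

    The `lam` knob trades reconstruction fidelity for sparsity; in a layerwise scheme this runs per output neuron on calibration activations, keeping the resulting sparse weights.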

  4. Research Internship : Optimal Transport

    Massachusetts Institute of Technology, Cambridge (MA)

    Jun - Aug 2018

    Research internship with Philippe Rigollet (MIT) on the reconstruction of cellular trajectories in gene-expression space with optimal transport. The resulting toolkit for single-cell RNA sequencing time-series analysis is open source and available as a Python package.

    Waddington Optimal Transport : broadinstitute/wot (has since diverged)

    Internship report (in French) :  [ html ][ pdf ][ slides ]


Teaching

  1. Teaching Assistant : Deep Learning

    Deep Learning (MAP583) course by Kevin Scaman (INRIA - ENS), École Polytechnique

    Practical introduction to deep learning and its implementation details, covering a wide range of data domains and network architectures.

    Resources :  [ Synapses page ][ Practicals repository ][ Custom python package ]

  2. Guest Lecture : Neural network compression

    Deep Learning course by Marc Lelarge (INRIA - ENS), ENS Paris

    Introduction to neural network compression concepts and recent results, with a focus on activation reconstruction and a hands-on practical session.

    Resources :  [ Lecture slides ][ Practical Session ][ Practical Session Solution ]


Education

  1. PhD in Mathematics

    From Oct 2021 to present

    INRIA - ENS, Paris. DYOGENE Project-team

    Advised by Marc Lelarge and Kevin Scaman

    Reparameterizations of deep neural networks for structured data with symmetries.

  2. Diplôme de l'ENS (Info-Maths)

    Final year of the ENS curriculum

    École Normale Supérieure, Paris, 2020-2021

    Additional advanced courses on stochastic processes and algebraic geometry.

  3. M. Sc. Computer Science

    Mathématiques, Vision & Apprentissage (MVA)

    École Normale Supérieure, Paris, 2018-2020

    Advanced mathematics and computer science, focused on Machine Learning

    Coursework includes:

    • Category theory
    • Network modeling
    • Parallel programming
    • General Robotics
    • Convex optimization
    • Computer vision
    • Deep Learning
    • General Topology
    • Differential Geometry
    • Reinforcement Learning
    • Natural Language Processing
    • Optimal Transport
    • Graphical Models
    • Kernel Methods
  4. B. Sc. Computer Science

    École Normale Supérieure, Paris, 2017-2018

    Solid basis in modern mathematics and computer science.

    Coursework includes:

    • Mathematical Logic
    • Formal languages
    • Algebra
    • Cryptology
    • Information theory
    • λ-calculus and computability
    • Processor architectures
    • Operating Systems
    • Databases
    • Compilation
    • Randomized algorithms
    • Semantics and Verification

  5. Classes préparatoires (Maths & Physics)

    Lycée Louis-le-Grand, Paris, 2015-2017

    Post-secondary program in advanced mathematics and physics leading to the nationwide entrance examinations of the Grandes Écoles for scientific studies

  6. Baccalaureate in science

    Lycée Hoche, Versailles, 2015

    French equivalent of A-levels

    Awarded with highest honours




Other publications

  1. [ASIA CCS 20] Return-oriented programming on RISC-V

    ACM Asia Conference on Computer and Communications Security, Taipei, Taiwan, 2020

    Abstract: We provide the first analysis on the feasibility of Return-Oriented Programming (ROP) on RISC-V, a new instruction set architecture targeting embedded systems. We show the existence of a new class of gadgets, using several Linear Code Sequences And Jumps (LCSAJ), undetected by current Galileo-based ROP gadget searching tools. We argue that this class of gadgets is rich enough on RISC-V to mount complex ROP attacks, bypassing traditional mitigations like DEP, ASLR, stack canaries, G-Free and some compiler-based backward-edge CFI, by jumping over any guard inserted by a compiler to protect indirect jump instructions. We provide examples of such gadgets, as well as a proof-of-concept ROP chain, using C code injection to leverage a privilege escalation attack on two standard Linux operating systems. Additionally, we discuss some of the required mitigations to prevent such attacks and provide a new ROP gadget finder algorithm that handles this new class of gadgets.

    Full text :  [ ACM Link ][ ArXiv ]


Projects

  1. Raspberry Pi 3 64-bit OS

    UNIX-like 64-bit micro-kernel with MMU handling, dynamic memory allocation, hardware interrupts, multi-processing, and a basic filesystem for the Raspberry Pi 3 (written before Linux even implemented 64-bit support for it)

    Source code available on github: robindar/sysres-os

  2. SMT Solver

    Small SMT solver for equality theory decision procedures.

    Implements DPLL with two-watched literals, and is fully unit-tested.

    Source code available on github: robindar/semver-smt

  3. Rust compiler

    Compiler for a small (yet Turing-complete) subset of Rust.

    Borrow-checked and compiled down to x86 assembly.

    Source code available on github: robindar/compil-petitrust

  4. RISC-V processor emulator

    Basic RISC-V-style processor emulator written in Minijazz (a Netlist superset), with a Minijazz-to-C compiler. Supports only a few instructions, but has a solid build system and is unit-tested

    Source code available on gitlab: alpr-sysdig/processor

  5. Genetic algorithms for the Traveling Salesman Problem

    School project (TIPE)

    Genetic algorithm to find good solutions to the Traveling Salesman Problem, with a testing structure around it to optimize meta-parameters such as population size, mutation probability, or crossover method
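    A minimal version of such a genetic algorithm, with order crossover and swap mutation (an illustrative reconstruction, not the original TIPE code; all parameters are mine):

```python
import math, random

random.seed(0)
cities = [(random.random(), random.random()) for _ in range(12)]

def tour_length(tour):
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def order_crossover(p1, p2):
    """OX: copy a random slice from p1, fill the rest in p2's order."""
    a, b = sorted(random.sample(range(len(p1)), 2))
    child = [None] * len(p1)
    child[a:b] = p1[a:b]
    rest = [c for c in p2 if c not in child]
    for i in range(len(child)):
        if child[i] is None:
            child[i] = rest.pop(0)
    return child

def mutate(tour, p=0.2):
    if random.random() < p:                   # swap two random cities
        i, j = random.sample(range(len(tour)), 2)
        tour[i], tour[j] = tour[j], tour[i]
    return tour

def solve(pop_size=60, generations=200):
    pop = [random.sample(range(len(cities)), len(cities)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=tour_length)
        elite = pop[: pop_size // 4]          # selection: keep the best quarter
        children = [mutate(order_crossover(*random.sample(elite, 2)))
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return min(pop, key=tour_length)

best = solve()
```

    Here `pop_size`, the mutation probability `p`, and the choice of crossover operator are exactly the kind of meta-parameters the testing structure would sweep.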

  6. Online portfolio


    A headless Debian server set up to practice web design and server administration

    Also acts as a personal Git server and occasional blog

Let's work together

If you have a project you want to get started, think you need my help with something, or just fancy saying hi, send me a message; I'm always happy to help!

Message Me