David A. R. Robin

PhD candidate at ENS

Machine Learning

Publications

[ICLR 24] Random Sparse Lifts: Construction, Analysis and Convergence of finite sparse networks

International Conference on Learning Representations, Vienna, 2024

Abstract: We present a framework to define a large class of neural networks for which, by construction, training by gradient flow provably reaches arbitrarily low loss when the number of parameters grows. Distinct from the fixed-space global optimality of non-convex optimization, this new form of convergence, and the techniques introduced to prove such convergence, pave the way for a usable deep learning convergence theory in the near future, without overparameterization assumptions relating the number of parameters and training samples. We define these architectures from a simple computation graph and a mechanism to lift it, thus increasing the number of parameters, generalizing the idea of increasing the widths of multi-layer perceptrons. We show that architectures similar to most common deep learning models are present in this class, obtained by sparsifying the weight tensors of usual architectures at initialization. Leveraging tools of algebraic topology and random graph theory, we use the computation graph’s geometry to propagate properties guaranteeing convergence to any precision for these large sparse models.

Full text : [ OpenReview ]

[NeurIPS 22] Convergence beyond the overparameterized regime with Rayleigh quotients

Neural Information Processing Systems, New Orleans, 2022

Abstract: We present a new strategy to prove the convergence of Deep Learning architectures to a zero training (or even testing) loss by gradient flow. Our analysis is centered on the notion of Rayleigh quotients in order to prove Kurdyka-Lojasiewicz inequalities for a broader set of neural network architectures and loss functions. We show that Rayleigh quotients provide a unified view for several convergence analysis techniques in the literature. Our strategy produces a proof of convergence for various examples of parametric learning. In particular, our analysis does not require the number of parameters to tend to infinity, nor the number of samples to be finite, thus extending to test loss minimization and beyond the over-parameterized regime.

Full text & code : [ OpenReview ] [ Github ]

[NeurReps 22] Periodic Signal Recovery with Regularized Sine Neural Networks

Neural Information Processing Systems, NeurReps Workshop, New Orleans, 2022

Abstract: We consider the problem of learning a periodic one-dimensional signal with neural networks, and designing models that are able to extrapolate the signal well beyond the training window. First, we show that multi-layer perceptrons with ReLU activations are provably unable to perform this task, and lead to poor performance in practice even close to the training window. Then, we propose a novel architecture using sine activation functions along with a well-chosen non-convex regularization, that is able to extrapolate the signal with low error well beyond the training window. Our architecture is several orders of magnitude better than its competitors for distant extrapolation (beyond 100 periods of the signal), while being able to accurately recover the frequency spectrum of the signal in a multi-tone setting.

Full text & code : [ OpenReview ] [ Github ]

Patents

Linear neural reconstruction for deep neural network compression

WO 2020/236976 A1, filed with Interdigital, Palo Alto (CA)

Nov 2020 (WIPO)

Patent for the neural-network compression algorithm developed during the internship with Technicolor AI Lab / Interdigital. Layer-wise weight compression recast as a sequence of convex activation-reconstruction problems, preserving the output of the deep neural network while reducing its size, in memory and on disk.

Google Patent page : [ summary ] [ pdf ]

Experience

Research Internship : Hypentropic reparameterization

Laboratoire de Mathématiques LMO, Orsay

Oct 2020 - Jun 2021

Research internship with Lénaïc Chizat (CNRS) on the implicit bias induced by the gradient descent algorithm on two-layer neural networks. Characterized the continuous limit point as the Bregman projection with hyperbolic entropy potential of the initialization weights to the set of zero-loss weights, with linear convergence speed under some technical assumptions.

Internship report : [ pdf ]

Research Internship : Clifford-valued networks

Upstride SAS, Station F, Paris

Feb - Aug 2020

Research internship with Wilder Lopes exploring the computational efficiency of variational autoencoders defined over Clifford algebras. Demonstrated experimentally that networks leveraging higher-dimensional algebras achieve superior reconstruction performance on small images.

Research Internship : Neural network compression

Technicolor AI Lab (acquired by Interdigital), San Francisco (CA)

Feb - Aug 2019

Research internship with Swayambhoo Jain on compression of neural networks. Developed a fast compression method able to cut up to 90% of weights with no drop in accuracy by casting layerwise compression as a series of convex activation reconstruction problems.

Internship report : [ html ] [ pdf ] [ slides ] [ patent ]

Research Internship : Optimal Transport

Massachusetts Institute of Technology, Boston (MA)

Jun - Aug 2018

Research internship with Philippe Rigollet (MIT) on the reconstruction of cellular trajectories in gene expression space with optimal transport. The resulting toolkit for single-cell RNA sequencing time-series analysis is open source and available as a Python package.

Waddington Optimal Transport : broadinstitute/wot (diverged since)

Internship report (in French) : [ html ] [ pdf ] [ slides ]

Teaching

Teaching Assistant : Deep Learning

Deep Learning (MAP583) course by Kevin Scaman (INRIA - ENS), École Polytechnique

Practical introduction to deep learning and its implementation details, with a focus on covering a wide range of data domains and network architectures.

Resources : [ Synapses page ] [ Practicals repository ] [ Custom python package ]

Guest Lecture : Neural network compression

Deep Learning course by Marc Lelarge (INRIA - ENS), ENS Paris

Introduction to neural network compression concepts and recent results, with a focus and practical session on activation reconstruction.

Resources : [ Lecture slides ] [ Practical Session ] [ Practical Session Solution ]

Education

PhD in Mathematics

From Oct 2021 to present

INRIA - ENS, Paris. DYOGENE Project-team

Advised by Marc Lelarge and Kevin Scaman

Reparameterizations of deep neural networks for structured data with symmetries.

Diplôme de l'ENS (Info-Maths)

Final year of the ENS cursus

École Normale Supérieure, Paris, 2020-2021

Additional advanced courses on stochastic processes and algebraic geometry.

M. Sc. Computer Science

Mathématiques, Vision & Apprentissage (MVA)

École Normale Supérieure, Paris, 2018-2020

Advanced mathematics and computer science, focused on Machine Learning

Coursework includes:
- Category theory
- Network modeling
- Parallel programming
- General Robotics
- Convex optimization
- Computer vision
- Deep Learning
- General Topology
- Differential Geometry
- Reinforcement Learning
- Natural Language Processing
- Optimal Transport
- Graphical Models
- Kernel Methods

B. Sc. Computer Science

École Normale Supérieure, Paris, 2017-2018

Solid basis in modern mathematics and computer science.

Coursework includes:
- Mathematical Logic
- Formal languages
- Algebra
- Cryptology
- Information theory
- λ-calculus and calculability
- Processor architectures
- Operating Systems
- Databases
- Compilation
- Randomized algorithms
- Semantics and Verification

CPGE MPSI-MP*

Lycée Louis-le-Grand, Paris, 2015-2017

Post-secondary program in advanced maths and physics leading to nationwide entrance examinations to the Grandes Écoles for scientific studies.

Baccalaureate in science

Lycée Hoche, Versailles, 2015

French equivalent of A-levels

Awarded highest honours

Skills

Languages

French – Mother tongue
English – Fluent
Spanish – Fluent

I.T.

Programming – C, Crystal, OCaml, Rust, Ruby, Bash, Python, Java, Julia
Web – HTML, CSS, JavaScript, SASS, PHP
Tools – Git, Vim, RSpec, LaTeX
Linux Fluent, BSD enthusiast

Other publications

[ASIA CCS 20] Return-oriented programming on RISC-V

ACM Asia Conference on Computer and Communications Security, Taipei, Taiwan, 2020

Abstract: We provide the first analysis on the feasibility of Return-Oriented programming (ROP) on RISC-V, a new instruction set architecture targeting embedded systems. We show the existence of a new class of gadgets, using several Linear Code Sequences And Jumps (LCSAJ), undetected by current Galileo-based ROP gadget searching tools. We argue that this class of gadgets is rich enough on RISC-V to mount complex ROP attacks, bypassing traditional mitigation like DEP, ASLR, stack canaries, G-Free and some compiler-based backward-edge CFI, by jumping over any guard inserted by a compiler to protect indirect jump instructions. We provide examples of such gadgets, as well as a proof-of-concept ROP chain, using C code injection to leverage a privilege escalation attack on two standard Linux operating systems. Additionally, we discuss some of the required mitigations to prevent such attacks and provide a new ROP gadget finder algorithm that handles this new class of gadgets.

Full text : [ ACM Link ] [ ArXiv ]

Projects

Raspberry Pi 3 64-bit OS

UNIX-like 64-bit micro-kernel with MMU handling, dynamic memory allocation, hardware interrupts, multi-processing, and a basic filesystem for the Raspberry Pi 3 (before even Linux implemented 64-bit support)

Source code available on github: robindar/sysres-os

SMT Solver

Small SMT solver for equality theory decision procedures.

Implements DPLL and two-watched literals; fully unit-tested.

Source code available on github: robindar/semver-smt

Rust compiler

Compiler for a small (yet Turing-complete) subset of Rust.

Borrow-checked and compiled down to x86 assembly.

Source code available on github: robindar/compil-petitrust

RISC V processor emulator

"RISC V"-style basic processor emulator in Minijazz (Netlist superset) and Minijazz-to-C compiler. Supports few instructions but has a good build system and is unit-tested

Source code available on gitlab: alpr-sysdig/processor

Genetic algorithms for the Traveling Salesman Problem

School project (TIPE)

Genetic algorithm to find good solutions to the Traveling Salesman Problem and a testing structure around it to optimize meta-parameters like population size, mutation probability or crossover method.

Online portfolio

https://www.robindar.com

Headless Debian to practice web design and server administration

Also acts as a personal Git server and occasional blog

Let's work together

If you have a project that you want to get started, think you need my help with something, or just fancy saying hi, send me a message. I'm always happy to help!

Message Me
