The recent use of machine learning in high-stakes applications has pushed many industrial actors to rethink how safety-critical systems (such as planes or cars) can be certified before being manufactured and used. Key questions have emerged: How can the safety of systems with learning components be properly defined? How can safety be formally guaranteed? Which new mathematical guarantees will be needed from the ML research community?
This workshop will bring together machine learning researchers with international authorities and industrial experts from sectors where certification and reliability are critical issues. It will consist of invited talks, a poster session, and group discussions. The goal is to present key open industrial questions, traditional methods in critical software verification and certification (and the challenges AI poses to them), as well as an introduction to several promising mathematical theories (distribution-free uncertainty quantification, deep learning theory, formal methods, and rigorous numerics).
This workshop is organized by the DEEL project. We hope it will help shape the future research agenda toward the medium-term objective of certifying critical systems involving AI components.
Deep learning opens up promising prospects for aviation, as it does in many other fields. However, it raises the crucial question of the level of confidence that can be placed in these techniques when they are used in safety-critical applications, and of their compatibility with strict certification requirements.
EASA published its Artificial Intelligence Roadmap at the beginning of 2020, with a view to enabling the approval of AI-based solutions and with an initial focus on machine learning techniques.
Generalization bounds are ubiquitous in statistical learning theory, yet they are sometimes poorly understood in machine learning practice.
The goal of this talk is to provide a brief overview of statistical concepts and tools related to generalization. More precisely, we will touch upon concentration inequalities, multiple testing issues, and the well-known VC bound for binary classification.
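For concreteness, one common statement of the VC bound for binary classification reads as follows (constants and logarithmic factors vary across textbooks, so this is an illustrative form rather than necessarily the one used in the talk).

```latex
% With probability at least 1 - \delta over an i.i.d. sample of size n,
% for every classifier h in a class \mathcal{H} of VC dimension d,
\[
  R(h) \;\le\; \widehat{R}_n(h)
  + \sqrt{\frac{8}{n}\left(d \ln\frac{2en}{d} + \ln\frac{4}{\delta}\right)},
\]
% where R(h) is the true risk and \widehat{R}_n(h) the empirical risk on the sample.
```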
The dominant term in PAC-Bayes bounds is often the Kullback–Leibler divergence between the posterior and the prior. For so-called linear PAC-Bayes risk bounds based on the empirical risk of a fixed posterior kernel, it is possible to minimize the expected value of the bound by choosing the prior to be the expected posterior, which we call the oracle prior on account of its being distribution dependent.
In this work, we show that the bound based on the oracle prior can be suboptimal: In some cases, a stronger bound is obtained by using a data-dependent oracle prior, i.e., a conditional expectation of the posterior, given a subset of the training data that is then excluded from the empirical risk term. While using data to learn a prior is a known heuristic, its essential role in optimal bounds is new. In fact, we show that using data can mean the difference between vacuous and nonvacuous bounds. We apply this new principle in the setting of nonconvex learning, simulating data-dependent oracle priors on MNIST and Fashion MNIST with and without held-out data, and demonstrating new nonvacuous bounds in both cases.
Joint work with Gintare Karolina Dziugaite, Kyle Hsu, Waseem Gharbieh, and Gabriel Arpino, based on https://arxiv.org/abs/2006.10929.
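To recall the role of the prior in such bounds, a representative (McAllester-style) PAC-Bayes bound is sketched below; the linear bounds studied in this work have a different form, but the Kullback–Leibler term plays the same role.

```latex
% With probability at least 1 - \delta over an i.i.d. sample of size n,
% for a prior P fixed before seeing the data and all posteriors Q simultaneously,
\[
  \mathbb{E}_{h \sim Q}\!\left[R(h)\right]
  \;\le\; \mathbb{E}_{h \sim Q}\!\left[\widehat{R}_n(h)\right]
  + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}},
\]
% so choosing P close to the (expected) posterior shrinks the dominant
% KL(Q || P) term: this is exactly the role of the oracle prior.
```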
The area of distribution-free uncertainty quantification (UQ) aims to provide UQ guarantees without making distributional assumptions. For regression problems, one natural framework involves conformal prediction, but for classification problems, calibration is a natural goal.
We will present some approaches for binary and multiclass post-hoc calibration that have guarantees without making distributional assumptions.
Joint work with Chirag Gupta and Aleksandr Podkopaev. The part on binary classification was published at NeurIPS'20 (https://arxiv.org/abs/2006.10564) and the parts on multiclass calibration include work in progress (to be submitted).
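To give a flavor of distribution-free post-hoc calibration, the sketch below implements histogram (uniform-mass) binning for binary classification, one of the simple schemes for which such guarantees can be proved. This is illustrative code, not the authors' implementation; the function and array names are assumptions.

```python
import numpy as np

def fit_histogram_binning(scores_cal, labels_cal, n_bins=10):
    """Learn bin edges and per-bin calibrated probabilities on held-out calibration data."""
    # Equal-mass bins: edges at empirical quantiles of the calibration scores.
    edges = np.quantile(scores_cal, np.linspace(0.0, 1.0, n_bins + 1))
    edges[0], edges[-1] = 0.0, 1.0
    bin_ids = np.clip(np.searchsorted(edges, scores_cal, side="right") - 1, 0, n_bins - 1)
    # The calibrated output for a bin is the empirical frequency of label 1 in that bin.
    probs = np.array([
        labels_cal[bin_ids == b].mean() if np.any(bin_ids == b) else 0.5
        for b in range(n_bins)
    ])
    return edges, probs

def predict_calibrated(scores, edges, probs):
    """Map raw classifier scores in [0, 1] to calibrated probabilities via their bin."""
    bin_ids = np.clip(np.searchsorted(edges, scores, side="right") - 1, 0, len(probs) - 1)
    return probs[bin_ids]
```

On a held-out calibration split, one would call fit_histogram_binning(s_cal, y_cal) with NumPy arrays of scores in [0, 1] and binary labels, then predict_calibrated(s_test, edges, probs) at test time.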
Embedded critical software is subject to strong safety requirements and undergoes stringent verification and validation processes. While testing remains a key method, there is a movement towards formal methods, as they can offer strong mathematical guarantees about system behavior. One such method, semantic static analysis by abstract interpretation, discovers at compile time properties of the executions of a program, such as the absence of certain classes of errors, without actually executing the program, by direct and automatic inspection of the source code. Commercial static analysis tools have been deployed for a decade and are routinely used in embedded industries to certify traditional (non-AI-enabled) software.
After presenting the principles of abstract interpretation, I will present the design of one such tool, Astrée, a state-of-the-art analyzer that has been applied to synchronous and concurrent embedded critical C code to prove the absence of run-time errors. We hope that this experience gives us insights into how to address the certification challenges posed by the introduction of machine learning into embedded critical systems.
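To give a flavor of the idea (a toy illustration only, unrelated to Astrée's actual design): in the interval abstract domain, each variable is over-approximated by an interval of possible values, arithmetic is lifted to intervals, and safety properties such as the absence of division by zero can be proved soundly without executing the program.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    """Abstract value: an over-approximation [lo, hi] of all concrete values."""
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

def division_is_safe(denominator: Interval) -> bool:
    """Sound (but incomplete) proof that a division can never fail: 0 is excluded."""
    return denominator.lo > 0 or denominator.hi < 0

# Example: if a sensor value x is known to lie in [1, 10], then y = x * x + 1
# lies in [2, 101], so any division by y is proved safe at analysis time.
x = Interval(1.0, 10.0)
y = x * x + Interval(1.0, 1.0)
assert division_is_safe(y)
```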
In this talk I will discuss some of the latest progress we have made on the topic of certification of neural networks, including the latest convex relaxations and available systems. I will also briefly discuss a new method we recently developed that can extract certified robustness out of large-scale networks, a fundamental challenge that has so far blocked the broader applicability of certification methods.
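As a minimal illustration of the simplest such relaxation, the sketch below propagates interval bounds (IBP) through a small fully connected ReLU network to certify robustness to an l-infinity perturbation of radius eps. Real systems use much tighter relaxations; all names here are illustrative.

```python
import numpy as np

def ibp_bounds(x, eps, weights, biases):
    """Propagate the box [x - eps, x + eps] through affine layers, with ReLU on
    hidden layers and no activation on the output logits."""
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(zip(weights, biases)):
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        new_lo = W_pos @ lo + W_neg @ hi + b   # smallest achievable pre-activation
        new_hi = W_pos @ hi + W_neg @ lo + b   # largest achievable pre-activation
        if i < len(weights) - 1:               # ReLU is monotone, so apply it to both bounds
            new_lo, new_hi = np.maximum(new_lo, 0.0), np.maximum(new_hi, 0.0)
        lo, hi = new_lo, new_hi
    return lo, hi

def certified_robust(x, eps, weights, biases, true_class):
    """Certify that the true class logit dominates every other logit on the whole ball."""
    lo, hi = ibp_bounds(x, eps, weights, biases)
    return lo[true_class] > np.delete(hi, true_class).max()
```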
In various fields, ranging from aerospace engineering and robotics to weather prediction and computer-assisted mathematical proofs, fast and precise computations are essential. Rigorous numerics (also called validated computing) is a relatively recent field, developed over the last 20 years, which uses numerical computations yet is able to provide rigorous mathematical statements about the obtained result, such as guaranteed and reasonably tight error bounds. This area of research deals with problems that cannot be solved, or are difficult and costly to solve, by traditional mathematical formal methods: problems with a large search space, problems for which closed forms given by symbolic computations are unavailable or too difficult to obtain, or problems in nonlinear analysis.
The aim of this talk is to provide an introduction to several rigorous computing methods and algorithms, developed based on the theory of set-valued analysis (in specific function spaces) as well as by combining symbolic and numerical computations. These techniques are illustrated with applications related to the efficient finite-precision evaluation of numerical functions (some of which appear in practical space mission analysis and design).
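To illustrate the spirit of the approach with a deliberately naive sketch (real libraries rely on carefully controlled directed rounding and far more sophisticated enclosures), the code below evaluates a polynomial in floating-point arithmetic while maintaining a guaranteed enclosure of the exact result, by widening every intermediate result outward by one unit in the last place.

```python
import math  # math.nextafter requires Python 3.9+

def widen(lo, hi):
    """Round an enclosure outward by one ulp on each side to absorb rounding error."""
    return math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf)

def rig_add(a, b):
    return widen(a[0] + b[0], a[1] + b[1])

def rig_scale(a, c):
    """Multiply an enclosure by a non-negative machine number c."""
    return widen(a[0] * c, a[1] * c)

def rig_horner(coeffs, x):
    """Enclosure of p(x) = sum_i coeffs[i] * x**i for x >= 0, via Horner's rule."""
    acc = (coeffs[-1], coeffs[-1])
    for c in reversed(coeffs[:-1]):
        acc = rig_add(rig_scale(acc, x), (c, c))
    return acc

# Example: the exact value of 1 - 2*x + x**3 at the machine number closest to 1.1
# is guaranteed to lie in the returned interval, whatever rounding occurred.
lo, hi = rig_horner([1.0, -2.0, 0.0, 1.0], 1.1)
assert lo <= hi
```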
Self-driving is a safety-critical application, and the need for explainability is correspondingly strong for self-driving models.
In this talk, I first present deep learning frameworks developed by leading autonomous driving companies.
I then discuss explainability challenges and methods for such self-driving systems, gathering contributions from the computer vision, deep learning, and autonomous driving research fields. Some methods providing post-hoc explanations for a black-box self-driving system are presented.
Finally, turning to natural language explanations, I detail our BEEF model, developed within valeo.ai for driving behavior explanation with multi-level fusion.
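As a concrete, generic example of a post-hoc explanation technique for a differentiable black-box driving model, the sketch below computes a vanilla gradient saliency map over the input image. It is only an illustration, unrelated to the specific methods and to the BEEF model presented in the talk; the model interface and tensor shapes are assumptions.

```python
import torch

def gradient_saliency(model, image, output_index=0):
    """Return |d output / d pixel| for one scalar output (e.g., a steering command).

    Assumes `model` maps a (1, C, H, W) batch to a (1, K) output tensor and
    `image` is a (C, H, W) tensor."""
    image = image.clone().detach().requires_grad_(True)
    output = model(image.unsqueeze(0)).squeeze(0)
    output[output_index].backward()
    return image.grad.abs().max(dim=0).values  # (H, W) map, channels collapsed
```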