ARCH21: Papers with Abstracts

Papers
Abstract. We present the results of the ARCH 2021 friendly competition for formal verification of continuous and hybrid systems with linear continuous dynamics. In its fifth edition, four tools have been applied to solve nine different benchmark problems in the category for linear continuous dynamics (in alphabetical order): CORA, HyDRA, JuliaReach, and SpaceEx. This report is a snapshot of the current landscape of tools and the types of benchmarks they are particularly suited for. Due to the diversity of problems, we are not ranking the tools, yet the presented results provide one of the most complete assessments of tools for the safety verification of continuous and hybrid systems with linear continuous dynamics to date.
Abstract. We present the results of a friendly competition for formal verification of continuous and hybrid systems with nonlinear continuous dynamics. The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2021. This year, five tools participated (in alphabetical order): Ariadne, CORA, DynIbex, JuliaReach, and Kaa. These tools were applied to solve reachability analysis problems on five benchmark problems, two of them featuring hybrid dynamics. We do not rank the tools based on the results, but rather present the current status and highlight the potential advantages of the different tools.
Abstract. This report presents the results of a friendly competition for formal verification and policy synthesis of stochastic models. It also introduces new benchmarks within this category and recommends next steps towards next year’s edition of the competition. The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in Spring/Summer 2021.
Abstract. This report presents the results of a friendly competition for formal verification of continuous and hybrid systems with artificial intelligence (AI) components. Specifically, we consider machine learning (ML) components in cyber-physical systems (CPS), such as feedforward neural networks used as feedback controllers in closed-loop systems, a class of systems classically known as intelligent control systems or, in more modern and specific terms, neural network control systems (NNCS). We more broadly refer to this category as AI and NNCS (AINNCS). The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2021. In the third edition of this AINNCS category at ARCH-COMP, three tools have been applied to solve seven different benchmark problems (in alphabetical order): JuliaReach, NNV, and Verisig. JuliaReach is a new participant in this category, Verisig participated previously in 2019, and NNV has participated in all previous competitions. This report is a snapshot of the current landscape of tools and the types of benchmarks for which these tools are suited. Due to the diversity of problems, the lack of a shared hardware platform, and the early stage of the competition, we are not ranking tools in terms of performance, yet the presented results, combined with the 2020 results, arguably provide the most complete assessment of current tools for the safety verification of NNCS.
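To make the NNCS setting concrete, the following sketch (illustrative only, not the API or method of JuliaReach, NNV, or Verisig; the dynamics, weights, and step sizes are assumptions) shows how a plant ODE is closed with a periodically sampled feedforward neural-network controller; the verification task is then to bound all behaviors of such a loop rather than a single simulation.

```python
# Minimal closed-loop NNCS sketch: plant ODE + sampled feedforward NN controller.
# Everything below (plant, weights, periods) is an illustrative assumption.
import numpy as np

def nn_controller(x, weights):
    """Tiny feedforward network with tanh hidden layers (illustrative weights)."""
    h = x
    for W, b in weights[:-1]:
        h = np.tanh(W @ h + b)
    W, b = weights[-1]
    return W @ h + b                      # control input u

def plant(x, u):
    """Example plant: double integrator x1' = x2, x2' = u."""
    return np.array([x[1], u[0]])

def simulate(x0, weights, t_end=5.0, control_period=0.2, dt=0.01):
    """Sample-and-hold closed loop, integrated with explicit Euler."""
    x = np.array(x0, float)
    t, u = 0.0, nn_controller(np.array(x0, float), weights)
    trajectory = [(t, x.copy())]
    while t < t_end:
        if round(t / dt) % round(control_period / dt) == 0:
            u = nn_controller(x, weights)  # controller only updated every period
        x = x + dt * plant(x, u)
        t += dt
        trajectory.append((t, x.copy()))
    return trajectory

# Illustrative random controller weights (2 -> 5 -> 1).
rng = np.random.default_rng(0)
weights = [(rng.standard_normal((5, 2)), rng.standard_normal(5)),
           (rng.standard_normal((1, 5)), rng.standard_normal(1))]
traj = simulate([1.0, 0.0], weights)
print("final state:", traj[-1][1])
```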
Abstract. This paper reports on the Hybrid Systems Theorem Proving (HSTP) category in the ARCH-COMP Friendly Competition 2021. The characteristic features of the HSTP category remain as in the previous editions [MST+18, MST+19, MMJ+20]: it focuses on the flexibility of programming languages as structuring principles for hybrid systems, the unambiguity and precision of program semantics, and the mathematical rigor of logical reasoning principles. The benchmark set includes nonlinear and parametric continuous and hybrid systems and hybrid games, each in three modes: fully automatic verification, semi-automatic verification from proof hints, and proof checking from scripted tactics. This instance of the competition introduces extensions to the scripting language, a comparison of the influence of arithmetic backend versions on verification performance in KeYmaera X, as well as improvements in the HHL Prover.
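As an illustration of the style of property verified in this category, a classic hybrid-systems safety conjecture can be stated in differential dynamic logic as below (a generic textbook bouncing-ball example, not necessarily one of this year's benchmarks):

```latex
% Illustrative dL safety formula: under the repeated hybrid behavior (continuous fall
% within the domain h >= 0, then a damped bounce when h = 0), the ball never exceeds
% its initial energy height H. Not a specific HSTP benchmark.
\[
  g > 0 \;\land\; 0 \le c \le 1 \;\land\; 0 \le h \;\land\; v^2 \le 2g(H - h)
  \;\rightarrow\;
  \left[\left(\{h' = v,\; v' = -g \;\&\; h \ge 0\};\; \mathrm{if}(h = 0)\,\{v := -c\,v\}\right)^{*}\right]
  \left(0 \le h \le H\right)
\]
```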
Abstract. This report presents the results from the 2021 friendly competition in the ARCH workshop for the falsification of temporal logic specifications over Cyber-Physical Systems. We briefly describe the competition settings, which have been inherited from the previous years, give background on the participating teams and tools, and discuss the selected benchmarks. Apart from new requirements and participants, the major novelty in this instalment is that falsifying inputs have been validated independently. During this process, we uncovered several issues like configuration errors and computational discrepancies, stressing the importance of this kind of validation.
Abstract. This report presents the results of the repeatability evaluation for the 5th International Competition on Verifying Continuous and Hybrid Systems (ARCH-COMP’21). The competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2021, affiliated with the 7th IFAC Conference on Analysis and Design of Hybrid Systems (ADHS’21). In its fifth edition, seventeen tools submitted artifacts through a Git repository for the repeatability evaluation, applied to solve benchmark problems for seven competition categories. The majority of participants adhered to the requirements for this year’s repeatability evaluation, namely to submit scripts that automatically install and execute the tools in containerized virtual environments (specifically Dockerfiles to execute within Docker), and several categories used performance evaluation information from a common execution platform. The repeatability results represent a snapshot of the current landscape of tools and the types of benchmarks for which they are particularly suited and for which others may repeat their analyses. Due to the diversity of problems in the verification of continuous and hybrid systems, and following standard practice in repeatability evaluations, we evaluate each tool as passing or failing the repeatability evaluation.
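For illustration only, a containerized repeatability run of the kind described above might look like the following sketch; the paths, image tag, and entry script are hypothetical and this is not the competition's actual evaluation script.

```python
# Illustrative sketch of a containerized repeatability run: build a participant's
# submitted Dockerfile and execute the tool's measurement script inside the container.
import subprocess

def evaluate_submission(repo_dir: str, tag: str, command: str = "./measure_all") -> int:
    """Build the Docker image from the submitted Dockerfile and run its entry script."""
    subprocess.run(["docker", "build", "-t", tag, repo_dir], check=True)
    result = subprocess.run(["docker", "run", "--rm", tag, command])
    return result.returncode  # a nonzero return code counts as a failed run

if __name__ == "__main__":
    # Hypothetical paths/names; real submissions follow each category's own layout.
    status = evaluate_submission("submissions/tool_x", "archcomp21/tool_x")
    print("repeatable" if status == 0 else "not repeatable")
```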
Abstract. Tool presentation: Safety-critical systems often require guaranteed state estimation instead of estimating the most likely state. While a lot of research on guaranteed state estimation has been conducted, there exists no tool for this purpose. Since guaranteed state estimation is in many cases a reachability problem or closely related to reachability analysis, this paper presents its implementation in the continuous reachability analyzer (CORA). We present how we integrated different types of observers, different set representations, and linear as well as nonlinear dynamics. The scalability and usefulness of the implemented observers are demonstrated for a scalable tank system.
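A minimal sketch of the underlying idea, not CORA's implementation or API: set-based observers alternate a reachability-style prediction with an intersection against the measurement-consistent set. Intervals are used here as the set representation, and the dynamics, bounds, and measurements are assumptions.

```python
# Interval-based guaranteed state estimation for x+ = A x + w, y = C x + v,
# with |w| <= w_bound and |v| <= v_bound elementwise (all values illustrative).
import numpy as np

def predict(lo, hi, A, w_bound):
    """Propagate an axis-aligned box through the linear dynamics plus disturbance."""
    center, radius = (lo + hi) / 2, (hi - lo) / 2
    new_center = A @ center
    new_radius = np.abs(A) @ radius + w_bound   # interval arithmetic for linear maps
    return new_center - new_radius, new_center + new_radius

def correct(lo, hi, C, y, v_bound):
    """Intersect the predicted box with the measurement set {x : |C x - y| <= v_bound}."""
    c = C.ravel()                                # single scalar measurement row
    for i in np.flatnonzero(c):
        rest = np.delete(np.arange(len(c)), i)
        # Bound on c_i * x_i implied by y, the noise bound, and the other states' bounds.
        other_lo = np.minimum(c[rest] * lo[rest], c[rest] * hi[rest]).sum()
        other_hi = np.maximum(c[rest] * lo[rest], c[rest] * hi[rest]).sum()
        xi_lo = (y - v_bound - other_hi) / c[i]
        xi_hi = (y + v_bound - other_lo) / c[i]
        if c[i] < 0:
            xi_lo, xi_hi = xi_hi, xi_lo
        lo[i], hi[i] = max(lo[i], xi_lo), min(hi[i], xi_hi)
    return lo, hi

# Illustrative two-tank-like linear dynamics with one level measurement.
A = np.array([[0.95, 0.0], [0.05, 0.9]])
C = np.array([[0.0, 1.0]])
lo, hi = np.array([0.0, 0.0]), np.array([2.0, 2.0])   # initial state enclosure
for y in [1.0, 0.95, 0.92]:                           # hypothetical measurements
    lo, hi = predict(lo, hi, A, w_bound=0.01)
    lo, hi = correct(lo, hi, C, y, v_bound=0.05)
print("guaranteed enclosure:", lo, hi)
```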
Abstract. Tool Presentation: We present ORBITADOR, a tool for stability analysis of dynamical systems. ORBITADOR uses a method that generates a bounded invariant set of a differential system, for a given set of initial conditions around a point x0, in order to prove the existence of a limit cycle. This invariant has the form of a tube centered on the approximate Euler solution starting at x0, whose radius is an upper bound on the distance between the approximate solution and the exact ones. The method consists in finding a real T > 0 such that the “snapshot” of the tube at time t = (i+1)T is included in the snapshot at t = iT, for some integer i, after adding a small bounded uncertainty. This uncertainty allows using an approximate value T of the exact period. We successfully applied ORBITADOR to several classical examples of periodic systems.
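The following sketch illustrates the snapshot-inclusion test in a simplified form; the system, approximate period, tube radii, and inflation are assumptions, and ORBITADOR's actual error bounds are not reproduced here.

```python
# Simplified illustration of the inclusion check: take ball-shaped "snapshots" of a
# tube around the Euler approximation and test whether the snapshot at t = (i+1)T
# fits inside the (slightly inflated) snapshot at t = iT.
import numpy as np

def euler_orbit(f, x0, dt, n_steps):
    """Explicit Euler approximation of the trajectory starting at x0."""
    xs = [np.array(x0, float)]
    for _ in range(n_steps):
        xs.append(xs[-1] + dt * f(xs[-1]))
    return np.array(xs)

def snapshot_included(c_outer, r_outer, c_inner, r_inner, eps=1e-3):
    """Is Ball(c_inner, r_inner) contained in Ball(c_outer, r_outer + eps)?"""
    return np.linalg.norm(c_inner - c_outer) + r_inner <= r_outer + eps

# Van der Pol oscillator: a classic system with a limit cycle (illustrative example).
f = lambda x: np.array([x[1], (1 - x[0] ** 2) * x[1] - x[0]])
dt, T = 0.001, 6.66            # T is only an approximate period (assumed, not computed)
xs = euler_orbit(f, [2.0, 0.0], dt, n_steps=int(2 * T / dt))

i = 0                                    # compare snapshots at t = iT and t = (i+1)T
k1, k2 = int(i * T / dt), int((i + 1) * T / dt)
r1, r2 = 0.05, 0.04                      # hypothetical tube radii (real ones come from error bounds)
print(snapshot_included(xs[k1], r1, xs[k2], r2))
```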
Abstract. We propose a benchmark for the verification of autonomous vehicles. By considering different traffic scenarios from the CommonRoad database, we obtain several thousand different verification tasks, where the verification problem is to prove that the considered tracking controller safely follows a given reference trajectory despite disturbances and measurement errors. The dynamics of the car are described by a nonlinear kinematic single-track model. Since the feedback matrix of the tracking controller is time-varying, the dynamics of the controlled system change constantly. Because of this, the proposed benchmark is well suited to evaluate how robustly reachability tools can handle changing system dynamics.
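For reference, a common formulation of the kinematic single-track (bicycle) model reads as follows; the benchmark's exact variant, state ordering, and parameter names may differ.

```latex
% Common kinematic single-track model: positions (s_x, s_y), steering angle \delta,
% velocity v, heading \Psi, wheelbase l_wb, inputs steering rate v_\delta and acceleration a.
\begin{align*}
  \dot{s}_x    &= v \cos\Psi, &
  \dot{s}_y    &= v \sin\Psi, &
  \dot{\delta} &= v_\delta, &
  \dot{v}      &= a, &
  \dot{\Psi}   &= \tfrac{v}{l_{\mathrm{wb}}} \tan\delta .
\end{align*}
```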
Abstract. Falsification is a testing method for cyber-physical systems in which numerical optimization is used to find counterexamples to a given specification that the system must fulfill. The falsification process uses quantitative semantics that play the role of objective functions, minimizing the distance to falsifying the specification. Falsification has gained attention due to its versatile applicability, and much work exists on various ways of implementing the falsification process, often focusing on which optimization algorithm to use or, more recently, on the semantics of the formal requirements. In this work, we look at some practical aspects of input generation, i.e., the mapping from parameters used as optimization variables to signals that form the actual test cases for the system. This choice is critical but often overlooked. It is assumed that problem experts can give guidance on how to parameterize inputs; however, this assumption is often too optimistic in practice. We observe that pulse generation is a surprisingly good first option that can falsify many common benchmarks after only a few simulations while requiring only a few parameters per signal.
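A minimal sketch of such a pulse parameterization, with hypothetical parameter values and no connection to any particular falsification tool:

```python
# Pulse-based input generation: a few optimization variables per signal (baseline,
# amplitude, period, duty cycle) are mapped to a piecewise-constant pulse train that
# is then fed to the system under test. All values here are illustrative.
import numpy as np

def pulse_signal(t, baseline, amplitude, period, duty):
    """Pulse train: 'amplitude' is added to 'baseline' during the first duty*period
    of every period; t may be a scalar or an array of time points."""
    phase = np.mod(t, period) / period
    return baseline + amplitude * (phase < duty)

# Optimization variables chosen by the falsification engine (hypothetical values).
params = dict(baseline=0.0, amplitude=80.0, period=12.0, duty=0.4)

t = np.linspace(0.0, 30.0, 301)        # simulation time grid for the test case
u = pulse_signal(t, **params)          # input trace handed to the simulator
print(u[:5], u.min(), u.max())
```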