a665b89d4c0ff7c636aad7060a2b39b2fc560693 | https://www.semanticscholar.org/paper/a665b89d4c0ff7c636aad7060a2b39b2fc560693 | A Hybrid Predictive Model for Mitigating Health and Economic Factors during a Pandemic | | 1748955616 | Lisa Veiber | 2399971 | Salah Ghamizi | 1704747 | Jean-Sébastien Sottet | | | | | | | | | | |
14996cc316d471b84124c77ef9ff5922ad97b155 | https://www.semanticscholar.org/paper/14996cc316d471b84124c77ef9ff5922ad97b155 | Data-driven Simulation and Optimization for Covid-19 Exit Strategies | The rapid spread of the Coronavirus SARS-2 is a major challenge that led almost all governments worldwide to take drastic measures to respond to the tragedy. Chief among those measures is the massive lockdown of entire countries and cities, which beyond its global economic impact has created some deep social and psychological tensions within populations. While the adopted mitigation measures (including the lockdown) have generally proven useful, policymakers are now facing a critical question: how and when to lift the mitigation measures? A carefully-planned exit strategy is indeed necessary to recover from the pandemic without risking a new outbreak. Classically, exit strategies rely on mathematical modeling to predict the effect of public health interventions. Such models are unfortunately known to be sensitive to some key parameters, which are usually set based on rules-of-thumb. In this paper, we propose to augment epidemiological forecasting with actual data-driven models that will learn to fine-tune predictions for different contexts (e.g., per country). We have therefore built a pandemic simulation and forecasting toolkit that combines a deep learning estimation of the epidemiological parameters of the disease in order to predict the cases and deaths, and a genetic algorithm component searching for optimal trade-offs/policies between constraints and objectives set by decision-makers. Replaying pandemic evolution in various countries, we experimentally show that our approach yields predictions with much lower error rates than pure epidemiological models in 75% of the cases and achieves a 95% R² score when the learning is transferred and tested on unseen countries. When used for forecasting, this approach provides actionable insights into the impact of individual measures and strategies. | 2399971 | Salah Ghamizi | 81525994 | Renaud Rwemalika | 1748955616 | Lisa Veiber | 2478227 | Maxime Cordy | 3023999 | Tegawendé F. Bissyandé | 37766916 | Mike Papadakis | 2445617 | Jacques Klein | 47681863 | Y. L. Traon |
32e8a4ac3a5609b961cf67b2aa2bec8a548e499c | https://www.semanticscholar.org/paper/32e8a4ac3a5609b961cf67b2aa2bec8a548e499c | Search-based adversarial testing and improvement of constrained credit scoring systems | Credit scoring systems are critical FinTech applications that concern the analysis of the creditworthiness of a person or organization. While decisions were previously based on human expertise, they are now increasingly relying on data analysis and machine learning. In this paper, we assess the ability of state-of-the-art adversarial machine learning to craft attacks on a real-world credit scoring system. Interestingly, we find that, while these techniques can generate large numbers of adversarial data, these are practically useless as they all violate domain-specific constraints. In other words, the generated examples are all false positives as they cannot occur in practice. To circumvent this limitation, we propose CoEvA2, a search-based method that generates valid adversarial examples (satisfying the domain constraints). CoEvA2 utilizes multi-objective search in order to simultaneously handle constraints, perform the attack and maximize the overdraft amount requested. We evaluate CoEvA2 on a major bank’s real-world system by checking its ability to craft valid attacks. CoEvA2 generates thousands of valid adversarial examples, revealing a high risk for the banking system. Fortunately, by improving the system through adversarial training (based on the produced examples), we increase its robustness and make our attack fail. | 2399971 | Salah Ghamizi | 2478227 | Maxime Cordy | 35582593 | Martin Gubri | 37766916 | Mike Papadakis | 2008292911 | Andrey Boystov | 47681863 | Y. L. Traon | 88521185 | A. Goujon | | |
5b51be9d2cfea76340d7469797d996effbefae38 | https://www.semanticscholar.org/paper/5b51be9d2cfea76340d7469797d996effbefae38 | Replication package for “Search-Based Adversarial Testing and Improvement of Constrained Credit Scoring Systems”, accepted at ESEC/FSE 2020 | | 2399971 | Salah Ghamizi | 2478227 | Maxime Cordy | 35582593 | Martin Gubri | 37766916 | Mike Papadakis | 2008292911 | Andrey Boystov | 47681863 | Y. L. Traon | 88521185 | A. Goujon | | |
9bcf271c2d2e26ce16087d1bbee323f16352ffff | https://www.semanticscholar.org/paper/9bcf271c2d2e26ce16087d1bbee323f16352ffff | FeatureNET: Diversity-Driven Generation of Deep Learning Models | We present FeatureNET, an open-source Neural Architecture Search (NAS) tool that generates diverse sets of Deep Learning (DL) models. FeatureNET relies on a meta-model of deep neural networks, consisting of generic configurable entities. Then, it uses tools developed in the context of software product lines to generate diverse (maximize the differences between the generated) DL models. The models are translated to Keras and can be integrated into typical machine learning pipelines. FeatureNET allows researchers to generate seamlessly a large variety of models. Thereby, it helps choosing appropriate DL models and performing experiments with diverse models (mitigating potential threats to validity). As a NAS method, FeatureNET successfully generates models performing equally well with handcrafted models. | 2399971 | Salah Ghamizi | 2478227 | Maxime Cordy | 37766916 | Mike Papadakis | 47681863 | Y. L. Traon | | | | | | | | |
a3ba73258e52d8e573f61115ff3840b1800ea833 | https://www.semanticscholar.org/paper/a3ba73258e52d8e573f61115ff3840b1800ea833 | Pandemic Simulation and Forecasting of exit strategies: Convergence of Machine Learning and Epidemiological Models | The COVID-19 pandemic has created a public health emergency unprecedented in this century. The lack of accurate knowledge regarding the outcomes of the virus has made it challenging for policymakers to decide on appropriate countermeasures to mitigate its impact on society, in particular the public health and the very healthcare system. While the mitigation strategies (including the lockdown) are getting lifted, understanding the current impacts of the outbreak remains challenging. This impedes any analysis and scheduling of measures required for the different countries to recover from the pandemic without risking a new outbreak. Therefore, we propose a novel approach to build realistic data-driven pandemic simulation and forecasting models to support policymakers. Our models allow the investigation of mitigation/recovery measures and their impact. Thereby, they enable appropriate planning of those measures, with the aim to optimize their societal benefits. Our approach relies on a combination of machine learning and classical epidemiological models, circumventing the respective limitations of these techniques to allow a policy-making based on established knowledge, yet driven by factual data, and tailored to each country’s specific context. | 2399971 | Salah Ghamizi | 81525994 | Renaud Rwemalika | 2478227 | Maxime Cordy | 47681863 | Y. L. Traon | 37766916 | Mike Papadakis | | | | | | |
40b698db89881df1877cbb644863b408bcfd1b28 | https://www.semanticscholar.org/paper/40b698db89881df1877cbb644863b408bcfd1b28 | Adversarial Embedding: A robust and elusive Steganography and Watermarking technique | We propose adversarial embedding, a new steganography and watermarking technique that embeds secret information within images. The key idea of our method is to use deep neural networks for image classification and adversarial attacks to embed secret information within images. Thus, we use the attacks to embed an encoding of the message within images and the related deep neural network outputs to extract it. The key properties of adversarial attacks (invisible perturbations, nontransferability, resilience to tampering) offer guarantees regarding the confidentiality and the integrity of the hidden messages. We empirically evaluate adversarial embedding using more than 100 models and 1,000 messages. Our results confirm that our embedding passes unnoticed by both humans and steganalysis methods, while at the same time impedes illicit retrieval of the message (less than 13% recovery rate when the interceptor has some knowledge about our model), and is resilient to soft and (to some extent) aggressive image tampering (up to 100% recovery rate under jpeg compression). We further develop our method by proposing a new type of adversarial attack which improves the embedding density (amount of hidden information) of our method to up to 10 bits per pixel. | 2399971 | Salah Ghamizi | 2478227 | Maxime Cordy | 37766916 | Mike Papadakis | 47681863 | Y. L. Traon | | | | | | | | |
454cbd438765731ababe434071d29a6a8f76c0e5 | https://www.semanticscholar.org/paper/454cbd438765731ababe434071d29a6a8f76c0e5 | Automated Search for Configurations of Convolutional Neural Network Architectures | Convolutional Neural Networks (CNNs) are intensively used to solve a wide variety of complex problems. Although powerful, such systems require manual configuration and tuning. To this end, we view CNNs as configurable systems and propose an end-to-end framework that allows the configuration, evaluation and automated search for CNN architectures. Therefore, our contribution is threefold. First, we model the variability of CNN architectures with a Feature Model (FM) that generalizes over existing architectures. Each valid configuration of the FM corresponds to a valid CNN model that can be built and trained. Second, we implement, on top of Tensorflow, an automated procedure to deploy, train and evaluate the performance of a configured model. Third, we propose a method to search for configurations and demonstrate that it leads to good CNN models. We evaluate our method by applying it on image classification tasks (MNIST, CIFAR-10) and show that, with limited amount of computation and training, our method can identify high-performing architectures (with high accuracy). We also demonstrate that we outperform existing state-of-the-art architectures handcrafted by ML researchers. Our FM and framework have been released to support replication and future research. | 2399971 | Salah Ghamizi | 2478227 | Maxime Cordy | 37766916 | Mike Papadakis | 47681863 | Y. L. Traon | | | | | | | | |
df5021deca82a398cbb6d1bc2cffbab0eb37cac5 | https://www.semanticscholar.org/paper/df5021deca82a398cbb6d1bc2cffbab0eb37cac5 | Automated Search for Configurations of Deep Neural Network Architectures | Deep Neural Networks (DNNs) are intensively used to solve a wide variety of complex problems. Although powerful, such systems require manual configuration and tuning. To this end, we view DNNs as configurable systems and propose an end-to-end framework that allows the configuration, evaluation and automated search for DNN architectures. Therefore, our contribution is threefold. First, we model the variability of DNN architectures with a Feature Model (FM) that generalizes over existing architectures. Each valid configuration of the FM corresponds to a valid DNN model that can be built and trained. Second, we implement, on top of Tensorflow, an automated procedure to deploy, train and evaluate the performance of a configured model. Third, we propose a method to search for configurations and demonstrate that it leads to good DNN models. We evaluate our method by applying it on image classification tasks (MNIST, CIFAR-10) and show that, with limited amount of computation and training, our method can identify high-performing architectures (with high accuracy). We also demonstrate that we outperform existing state-of-the-art architectures handcrafted by ML researchers. Our FM and framework have been released to support replication and future research. | 2399971 | Salah Ghamizi | 2478227 | Maxime Cordy | 37766916 | Mike Papadakis | 47681863 | Y. L. Traon | | | | | | | | |
21d297420bf200693b1b92fe77f7204d9d860ab7 | https://www.semanticscholar.org/paper/21d297420bf200693b1b92fe77f7204d9d860ab7 | Re-typograph phase I: a proof-of-concept for typeface parameter extraction from historical documents | This paper reports on the first phase of an attempt to create a full retro-engineering pipeline that aims to construct a complete set of coherent typographic parameters defining the typefaces used in a printed homogenous text. It should be stressed that this process cannot reasonably be expected to be fully automatic and that it is designed to include human interaction. Although font design is governed by a set of quite robust and formal geometric rulesets, it still heavily relies on subjective human interpretation. Furthermore, different parameters, applied to the generic rulesets may actually result in quite similar and visually difficult to distinguish typefaces, making the retro-engineering an inverse problem that is ill conditioned once shape distortions (related to the printing and/or scanning process) come into play. This work is the first phase of a long iterative process, in which we will progressively study and assess the techniques from the state-of-the-art that are most suited to our problem and investigate new directions when they prove to be not quite adequate. As a first step, this is more of a feasibility proof-of-concept, that will allow us to clearly pinpoint the items that will require more in-depth research over the next iterations. | 1801759 | B. Lamiroy | 2566397 | Thomas Bouville | 2484045 | Julien Blégean | 3130250 | Hongliu Cao | 2399971 | Salah Ghamizi | 2031214 | R. Houpin | 2055353321 | Matthias Lloyd | | |