Adversarial Attacks on A.I. Systems — NextCon, Jan 2019

  Machine Learning is itself just another tool, susceptible to adversarial attacks. These can have huge implications, especially in a world with self-driving cars and other automation. In this talk, we will look at recent developments in the world of adversarial attacks on A.I. systems, and how far we have come in mitigating these attacks.
Transcript
  • 1. Adversarial Attacks on A.I. Systems 1 Anant Jain Co-founder, commonlounge.com (Compose Labs) https://commonlounge.com
 https://index.anantja.in NEXTCon Online AI Tech Talk Series Friday, Jan 18
  • 2. 2 Are adversarial examples simply a fun toy problem for researchers? Or an example of a deeper and more chronic frailty in our models? Motivation
  • 3. What are Adversarial attacks? CIA Model of Security Threat models Examples and demos of Adversarial attacks Proposed Defenses against adversarial attacks Intuition behind Adversarial attacks What’s next? 3 Outline
  • 4. What are Adversarial attacks? CIA Model of Security Threat models Examples and demos of Adversarial attacks Proposed Defenses against adversarial attacks Intuition behind Adversarial attacks What’s next? 4 Outline
  • 5. What exactly is “learnt” in Machine Learning? 5 Introduction
  • 6. 6 Source: http://www.cleverhans.io/security/privacy/ml/2016/12/16/breaking-things-is-easy.html
  • 7. What exactly is “learnt” in Machine Learning? Discussion 1. Neural Network 7 Feed-forward neural network
  • 8. What exactly is “learnt” in Machine Learning? Discussion Feed-forward neural network 1. Neural Network 2. Weights 8
  • 9. What exactly is “learnt” in Machine Learning? Discussion 1. Neural Network 2. Weights 3. Cost Function 9 Feed-forward neural network
  • 10. What exactly is “learnt” in Machine Learning? Discussion 1. Neural Network 2. Weights 3. Cost Function 4. Gradient Descent 10 Feed-forward neural network
  • 11. What exactly is “learnt” in Machine Learning? Discussion 1. Neural Network 2. Weights 3. Cost Function 4. Gradient Descent 5. Back Propagation 11 Feed-forward neural network
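To make the build-up above concrete, here is a minimal training-step sketch in PyTorch, assuming a toy feed-forward network and placeholder data (the model, sizes, and learning rate are illustrative, not from the slides); the cost function, gradient descent, and back-propagation each map to one line.

```python
import torch
import torch.nn as nn

# 1-2. A tiny feed-forward network: the weights are what is "learnt".
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

loss_fn = nn.CrossEntropyLoss()                    # 3. cost function
opt = torch.optim.SGD(model.parameters(), lr=0.1)  # 4. gradient descent

x = torch.randn(32, 784)          # placeholder batch of inputs
y = torch.randint(0, 10, (32,))   # placeholder labels

loss = loss_fn(model(x), y)
opt.zero_grad()
loss.backward()   # 5. back-propagation: gradient of the cost w.r.t. every weight
opt.step()        # gradient descent: nudge the weights to reduce the cost
```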
  • 12. 12 Source: http://www.cleverhans.io/security/privacy/ml/2016/12/16/breaking-things-is-easy.html
  • 13. What are Adversarial attacks? CIA Model of Security Threat models Examples and demos of Adversarial attacks Proposed Defenses against adversarial attacks Intuition behind Adversarial attacks What’s next? 13 Outline
  • 14. CIA Model of Security 14 Discussion
  • 15. CIA Model of Security 15 Discussion • Confidentiality • Must not leak the training data used to train it • E.g., sensitive medical data
  • 16. CIA Model of Security 16 Discussion • Confidentiality • Integrity: • should not be possible to alter predictions • during training by poisoning training data sets • during inference by showing the system adversarial examples
  • 17. CIA Model of Security 17 Discussion • Confidentiality • Integrity • Availability • force a machine learning system to go into failsafe mode • Example: Force an autonomous car to pull over
  • 18. What are Adversarial attacks? CIA Model of Security Threat models Examples and demos of Adversarial attacks Proposed Defenses against adversarial attacks Intuition behind Adversarial attacks What’s next? 18 Outline
  • 19. Threat Models 19 Discussion
  • 20. Threat Models 20 Discussion • White box: • Adversary has knowledge of the machine learning model architecture and its parameters
  • 21. Threat Models 21 Discussion • White box: • Adversary has knowledge of the machine learning model architecture and its parameters • Black box • Adversary only capable of interacting with the model by observing its predictions on chosen inputs • More realistic
  • 22. Threat Models 22 Discussion
  • 23. Threat Models 23 Discussion • Non-targeted attack • Force the model to misclassify the adversarial image
  • 24. Threat Models 24 Discussion • Non-targeted attack • Force the model to misclassify the adversarial image • Targeted attack • Get the model to classify the input as a specific target class, which is different from the true class
  • 25. 25 Discussion What are Adversarial Attacks?
  • 26. 26
  • 27. 27 Common Attacks
  • 28. 28 Common Attacks 1. Fast Gradient Sign Method (FGSM)
  • 29. 29 Common Attacks 1. Fast Gradient Sign Method (FGSM)
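A minimal FGSM sketch in PyTorch, assuming a differentiable classifier `model`, a cross-entropy-style loss, and inputs scaled to [0, 1]; the helper name and the epsilon value are illustrative:

```python
import torch

def fgsm(model, loss_fn, x, y, eps=0.03):
    """Untargeted FGSM: one step in the sign of the input gradient."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)      # loss w.r.t. the true label y
    loss.backward()
    # Moving *up* the loss gradient makes the true class less likely.
    return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()
```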
  • 30. 30 Common Attacks 2. Targeted Fast Gradient Sign Method (T-FGSM)
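The targeted variant only flips the sign of the step and swaps the true label for a chosen target class; again a sketch under the same assumptions as above:

```python
def targeted_fgsm(model, loss_fn, x, y_target, eps=0.03):
    """Targeted FGSM: step *down* the loss of a chosen target class."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y_target)   # loss w.r.t. the desired (wrong) label
    loss.backward()
    return (x_adv - eps * x_adv.grad.sign()).clamp(0, 1).detach()
```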
  • 31. 31 Common Attacks 3. Iterative Fast Gradient Sign Method (I-FGSM) Both one-shot methods (FGSM and T-FGSM) have lower success rates than the iterative method (I-FGSM) in white-box attacks; however, in black-box attacks the basic single-shot methods turn out to be more effective. The most likely explanation for this is that the iterative methods tend to overfit to a particular model.
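An iterative sketch under the same assumptions: repeat small signed steps and clip back into an L∞ ball of radius eps around the original input (the step size and iteration count are illustrative):

```python
import torch

def iterative_fgsm(model, loss_fn, x, y, eps=0.03, alpha=0.005, steps=10):
    """I-FGSM: many small FGSM steps, clipped back into an eps-ball around x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        loss.backward()
        x_adv = x_adv + alpha * x_adv.grad.sign()
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)  # stay within eps of x
        x_adv = x_adv.clamp(0, 1).detach()
    return x_adv
```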
  • 32. 32 Boosting Adversarial Attacks with Momentum (MI-FGSM) Winning attack at NIPS 2017
  • 33. 33 Boosting Adversarial Attacks with Momentum (MI-FGSM) Winning attack at NIPS 2017
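A rough sketch of the momentum variant, following the idea of accumulating a decayed sum of normalized gradients before taking the signed step; the decay factor mu and the normalization detail are illustrative simplifications of the published method:

```python
import torch

def mi_fgsm(model, loss_fn, x, y, eps=0.03, alpha=0.005, steps=10, mu=1.0):
    """MI-FGSM: accumulate momentum over normalized gradients, step by its sign."""
    x_adv = x.clone().detach()
    g = torch.zeros_like(x)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        loss.backward()
        grad = x_adv.grad
        g = mu * g + grad / (grad.abs().sum() + 1e-12)  # decayed sum of L1-normalized gradients
        x_adv = x_adv + alpha * g.sign()
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)
        x_adv = x_adv.clamp(0, 1).detach()
    return x_adv
```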
  • 34. 34 More attacks
  • 35. 35 • Deep Fool: • Iteratively “linearizes” the loss function at an input point (taking the tangent to the loss function at that point), and applies the minimal perturbation necessary. More attacks
  • 36. 36 • Deep Fool: • Iteratively “linearizes” the loss function at an input point (taking the tangent to the loss function at that point), and applies the minimal perturbation necessary. • Carlini’s Attack: • Optimizes for having the minimal distance from the original example, under the constraint of having the example be misclassified by the original model • Costly but very effective More attacks
  • 37. What are Adversarial attacks? CIA Model of Security Threat models Examples and demos of Adversarial attacks Proposed Defenses against adversarial attacks Intuition behind Adversarial attacks What’s next? 37 Outline
  • 38. 38 https://arxiv.org/pdf/1312.6199.pdf
  • 39. 39 Take a correctly classified image (left image in both columns), and add a tiny distortion (middle) to fool the ConvNet with the resulting image (right).
  • 40. 40 https://arxiv.org/pdf/1412.6572.pdf
  • 41. 41 https://arxiv.org/pdf/1412.6572.pdf
  • 42. 42 https://arxiv.org/pdf/1607.02533.pdf
  • 43. 43 Source: http://www.cleverhans.io/security/privacy/ml/2016/12/16/breaking-things-is-easy.html Adversarial examples can be printed out on normal paper and photographed with a standard resolution smartphone and still cause a classifier to, in this case, label a “washer” as a “safe”.
  • 44. 44 https://arxiv.org/pdf/1712.09665.pdf
  • 45. Demo 45
  • 46. Demo 46
  • 47. Download “Demitasse” 47 bit.ly/image-recog Download VGG-CNN-F (Binary Compression) model data (106 MB)
  • 48. What are the implications of these attacks? 48 Discussion
  • 49. What are the implications of these attacks? 49 Discussion •Self Driving Cars: A patch may make a car think that a Stop Sign is a Yield Sign
  • 50. What are the implications of these attacks? 50 Discussion •Self Driving Cars: A patch may make a car think that a Stop Sign is a Yield Sign •Alexa: Voice-based Personal Assistants: Transmit sounds that sound like noise, but give specific commands (video)
  • 51. What are the implications of these attacks? 51 Discussion •Self Driving Cars: A patch may make a car think that a Stop Sign is a Yield Sign •Alexa: Voice-based Personal Assistants: Transmit sounds that sound like noise, but give specific commands (video) •eBay: Sell livestock and other banned items.
  • 52. 52 Three remarkable properties of Adversarial examples
  • 53. 53 Three remarkable properties of Adversarial examples • Small perturbation • Amount of noise added is imperceptible
  • 54. 54 Three remarkable properties of Adversarial examples • Small perturbation • Amount of noise added is imperceptible • High Confidence • It was easy to attain high confidence in the incorrect classification
  • 55. 55 Three remarkable properties of Adversarial examples • Small perturbation • Amount of noise added is imperceptible • High Confidence • It was easy to attain high confidence in the incorrect classification • Transferability • Didn’t depend on the specific ConvNet used for the task.
  • 56. What are Adversarial attacks? CIA Model of Security Threat models Examples and demos of Adversarial attacks Proposed Defenses against adversarial attacks Intuition behind Adversarial attacks What’s next? 56 Outline
  • 57. How do you defend A.I. systems from these attacks? 57 Discussion
  • 58. How do you defend A.I. systems from these attacks? 58 Discussion • Adversarial training  • Generate a lot of adversarial examples and explicitly train the model not to be fooled by each of them • Improves the generalization of a model when presented with adversarial examples at test time.
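As an illustration, one adversarial-training step might mix clean and FGSM-perturbed examples in the loss; this sketch reuses the `fgsm` helper sketched earlier, and the 50/50 weighting is an arbitrary choice, not from the talk:

```python
def adversarial_training_step(model, loss_fn, opt, x, y, eps=0.03):
    """One training step on a 50/50 mix of clean and FGSM-perturbed inputs."""
    x_adv = fgsm(model, loss_fn, x, y, eps)   # the fgsm sketch from the attacks section
    loss = 0.5 * loss_fn(model(x), y) + 0.5 * loss_fn(model(x_adv), y)
    opt.zero_grad()   # clear any gradients left over from crafting x_adv
    loss.backward()
    opt.step()
    return loss.item()
```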
  • 59. How do you defend A.I. systems from these attacks? 59 Discussion • Defensive distillation smooths the model’s decision surface in adversarial directions exploited by the adversary. • Train the model to output probabilities of different classes, rather than hard decisions about which class to output. • Creates a model whose surface is smoothed in the directions an adversary will typically try to exploit.
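The core ingredient of defensive distillation is training against softened class probabilities rather than hard labels. A minimal sketch of such a soft-label loss, with the temperature value chosen arbitrarily for illustration:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=20.0):
    """Cross-entropy against the teacher's temperature-softened class probabilities."""
    soft_targets = F.softmax(teacher_logits / T, dim=1)
    log_probs = F.log_softmax(student_logits / T, dim=1)
    return -(soft_targets * log_probs).sum(dim=1).mean()
```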
  • 60. What are Adversarial attacks? CIA Model of Security Threat models Examples and demos of Adversarial attacks Proposed Defenses against adversarial attacks Intuition behind Adversarial attacks What’s next? 60 Outline
  • 61. 61 Are adversarial examples simply a fun toy problem for researchers? Or an example of a deeper and more chronic frailty in our models? Motivation
  • 62. 62 Model linearity
  • 63. 63 Model linearity • Linear models’ behavior outside of the region where training data is concentrated is quite pathological.
  • 64. 64 Model linearity In the example above, if we move in a direction perpendicular to the decision boundary, we can, with a relatively small-magnitude vector, push ourselves to a place where the model is very confident in the wrong class.
  • 65. 65 Model linearity • Linear models’ behavior outside of the region where training data is concentrated is quite pathological.
 • In a high-dimensional space, each individual pixel might only change by a very small amount, yet those small differences add up to a dramatic change in the weights · inputs dot product (see the numeric sketch below).
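A small numeric sketch of that argument, with toy weights and image-sized inputs (the dimensions and epsilon are illustrative): a per-pixel change of at most 0.007 shifts the dot product by roughly eps times the number of dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 224 * 224 * 3                    # dimensionality of a typical input image
w = rng.choice([-1.0, 1.0], size=n)  # toy weight vector
x = rng.random(n)                    # toy input in [0, 1)

eps = 0.007                          # imperceptible per-pixel change
x_adv = x + eps * np.sign(w)         # nudge every pixel slightly "with" the weights

print(w @ x_adv - w @ x)  # ~ eps * n ≈ 1054, despite no pixel moving by more than 0.007
```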
  • 66. 66 Model linearity Within the space of possible nonlinear activation functions, modern deep nets have actually settled on one that is very close to linear: the Rectified Linear Unit (ReLU).
  • 67. 67 Model linearity
  • 68. 68 From Ian Goodfellow’s key paper on the topic:
 “Using a network that has been designed to be sufficiently linear–whether it is a ReLU or maxout network, an LSTM, or a sigmoid network that has been carefully configured not to saturate too much– we are able to fit most problems we care about, at least on the training set. The existence of adversarial examples suggests that being able to explain the training data or even being able to correctly label the test data does not imply that our models truly understand the tasks we have asked them to perform. Instead, their linear responses are overly confident at points that do not occur in the data distribution, and these confident predictions are often highly incorrect. …One may also conclude that the model families we use are intrinsically flawed. Ease of optimization has come at the cost of models that are easily misled.”
  • 69. What are Adversarial attacks? CIA Model of Security Threat models Examples and demos of Adversarial attacks Proposed Defenses against adversarial attacks Intuition behind Adversarial attacks What’s next? 69 Outline
  • 70. • We would like our models to be able to “fail gracefully” when used in production 70 What’s next?
  • 71. • We would like our models to be able to “fail gracefully” when used in production • We would want to push our models to exhibit appropriately low confidence when they’re operating out of distribution 71 What’s next?
  • 72. • We would like our models to be able to “fail gracefully” when used in production • We would want to push our models to exhibit appropriately low confidence when they’re operating out of distribution • Real problem here: models exhibiting unpredictable and overly confident performance outside of the training distribution. Adversarial examples are actually just an imperfect proxy to this problem. 72 What’s next?
  • 73. Machine Learning is itself just another tool, susceptible to adversarial attacks. These can have huge implications, especially in a world with self-driving cars and other automation. 73 Summary
  • 74. Thanks for attending the talk! 74 Anant Jain Co-founder, commonlounge.com (Compose Labs) https://commonlounge.com/pathfinder
 https://index.anantja.in Commonlounge.com is an online-learning platform similar to Coursera/Udacity, except our courses are in the form of lists of text-based tutorials, quizzes and step-by-step projects instead of videos.
 
 Check out our Deep Learning Course!
  • 75. Bonus Privacy issues in ML (and how the two can be unexpected allies) 75
  • 76. Privacy issues in ML (and how the two can be unexpected allies) 76 Bonus • Lack of fairness and transparency when learning algorithms process the training data.
  • 77. Privacy issues in ML (and how the two can be unexpected allies) 77 Bonus • Lack of fairness and transparency when learning algorithms process the training data. • Training data leakage: How do you make sure that ML Systems do not memorize sensitive information about the training set, such as the specific medical histories of individual patients? Differential Privacy
  • 78. PATE (Private Aggregation of Teacher Ensembles) 78
  • 79. Generative Adversarial Networks (GANs) 79 Bonus
  • 80. Generative Adversarial Networks (GANs) 80
  • 81. Applications of GANs 81 Bonus
  • 82. Applications of GANs 82 Bonus •Creativity suites (Photo, video editing): Interactive image editing (Adobe Research), Fashion, Digital Art (Deep Dream 2.0)
  • 83. Applications of GANs 83 Bonus •Creativity suites (Photo, video editing): Interactive image editing (Adobe Research), Fashion, Digital Art (Deep Dream 2.0) •3D objects: Shape Estimation (from 2D images), Shape Manipulation
  • 84. Applications of GANs 84 Bonus •Creativity suites (Photo, video editing): Interactive image editing (Adobe Research), Fashion, Digital Art (Deep Dream 2.0) •3D objects: Shape Estimation (from 2D images), Shape Manipulation •Medical (Insilico Medicine): Drug discovery, Molecule development
  • 85. Applications of GANs 85 Bonus •Creativity suites (Photo, video editing): Interactive image editing (Adobe Research), Fashion, Digital Art (Deep Dream 2.0) •3D objects: Shape Estimation (from 2D images), Shape Manipulation •Medical (Insilico Medicine): Drug discovery, Molecule development •Games / Simulation: Generating realistic environments (buildings, graphics, etc), includes inferring physical laws, and relation of objects to one another
  • 86. Applications of GANs 86 Bonus • Creativity suites (Photo, video editing): Interactive image editing (Adobe Research), Fashion, Digital Art (Deep Dream 2.0) • 3D objects: Shape Estimation (from 2D images), Shape Manipulation • Medical (Insilico Medicine): Drug discovery, Molecule development • Games / Simulation: Generating realistic environments (buildings, graphics, etc), includes inferring physical laws, and relation of objects to one another • Robotics: Augmenting real-world training with virtual training