Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine...

22
Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

Transcript of Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine...

Page 1: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

Down The Dependency Rabbit HoleMachine Learning as a first line of defense in Intel’s Dependency Review Process

Page 2: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

2@twitter handle

$ envNAME=John AndersenHOME=Portland ORUSER=pdxjohnnyWORK=IntelROLE=Open Source Security Software [email protected]=Embedded, Linux, Containers, Concurrency, Web Apps, PythonTHESIS=Machine Learning

Page 3: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

3

Dependency Evaluation Review Form

@twitter handle

Page 4: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

▪ Initial dataset is made up of

▪ URL of source repo

▪ Security team’s classification (Good / Bad)

▪ Review form data

▪ Plan

▪ Train model on dataset

▪ Assess accuracy

▪ Given URL, collect data to answer form questions

▪ Predict classification by feeding collected data to model

4

Automation Attempt One

@twitter handle

Page 5: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

5

Reviewers Rarely Fill Out Evaluation Form

@twitter handle

Bad

Datase

t

Page 6: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

▪ Initial dataset is made up of

▪ URL of source repo

▪ Security team’s classification (Good / Bad)

▪ Forms mostly not filled out 👎

▪ Plan

▪ Train model on dataset

▪ Assess accuracy

▪ Given URL, collect data to answer form questions

▪ Predict classification by feeding collected data to model

▪ ~60% Accuracy 🚫

6

Automation Attempt One

@twitter handle

Page 7: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

▪ Initial dataset is made up of

▪ URL of source repo

▪ Security team’s classification (Good / Bad)

▪ Plan

▪ Given URL, collect data

▪ Train model on dataset

▪ Assess accuracy

▪ Predict classification by feeding collected data to model

7

Automation Attempt Two

@twitter handle

Page 8: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

8

Brainstorming

@twitter handle

Page 9: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

Quarterly Ratio of Lines of Comments to Code

Page 10: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

10

Prediction Data Flow

@twitter handle

Labeled Data

Git Repo URL

Commits

Authors

...Tensorflow

Deep Neural Network

Good

Bad

Page 11: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

11

Request Classification Estimation

Page 12: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

Data Flow Facilitator for Machine LearningMachine Learning made easy

Page 13: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

▪ Data Flow

▪ Dataset generation

▪ Concurrency without dealing with locking

▪ Sources

▪ CSV

▪ JSON

▪ MySQL

▪ Models

▪ Tensorflow

▪ SciKit

13

Abstractions DFFML Provides

@twitter handle

Page 14: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

▪ Python Library

▪ Command Line Interface

▪ HTTP API

▪ JavaScript Client

14

Consistent API

@twitter handle

Page 15: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

15

Should I Be Installing This?

@twitter handle

Page 16: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

16

What is a Data Flow?

@twitter handle

Page 17: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

17

Deploy Anywhere - Command Line

Page 18: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

18

Deploy Anywhere - HTTP

Page 19: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

19

Deploy Anywhere - HTTP

Page 20: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

20

Deploy Anywhere - HTTP

Page 21: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

21

Extend Without Writing Code - Modify DataFlow

@twitter handle

Page 22: Down The Dependency Rabbit Hole - intel.github.io · Down The Dependency Rabbit Hole Machine Learning as a first line of defense in Intel’s Dependency Review Process

▪ How to Integrate Machine Learning Tutorial

▪ https://intel.github.io/dffml/usage/integration.html

▪ shouldi

▪ pip install shouldi && shouldi install some-package-name

▪ https://intel.github.io/dffml/tutorials/operations.html

▪ Use and Contribute!

▪ Weekly Meetings: Tuesdays at 9 AM

▪ Gitter, Mailing List, and Meeting Links: https://intel.github.io/dffml/community.html

▪ Documentation: https://intel.github.io/dffml

▪ Q&A

22

Where To Go From Here

@twitter handle