ITAC 2016 Where Open Source Meets Audit Analytics

December 8, 2016 Andrew Clark, IT Auditor / Internal Audit Data Scientist Astec Industries, Inc., M.S. Data Science Candidate Where Open Source Meets Audit Analytics

Transcript of ITAC 2016 Where Open Source Meets Audit Analytics

Page 1: ITAC 2016 Where Open Source Meets Audit Analytics

December 8, 2016 Andrew Clark, IT Auditor / Internal Audit Data Scientist

Astec Industries, Inc., M.S. Data Science Candidate

Where Open Source Meets Audit Analytics

Page 2: ITAC 2016 Where Open Source Meets Audit Analytics

Overview

1. What is open source software?

2. Why is it important?

3. What are the benefits of using open source software for analytics over CAATs?

4. How do I begin using open source software for analytics?

5. Case study

6. The application of advanced analytic techniques

Page 3: ITAC 2016 Where Open Source Meets Audit Analytics

Meet Open Source

Page 4: ITAC 2016 Where Open Source Meets Audit Analytics

Open Source Software

“Open source software is software whose source code is available for modification or enhancement by anyone.” - "What Is Open Source?" Opensource.com. Accessed June 12, 2016. https://opensource.com/resources/what-open-source.

Page 5: ITAC 2016 Where Open Source Meets Audit Analytics

Open Source examples

1. Linux (mainly)

2. Android (mainly)

3. Firefox

4. R programming language

5. Git

6. Docker

Page 6: ITAC 2016 Where Open Source Meets Audit Analytics

Why is it important?

Vibrant community

Frequent updates

Potential for strong security

Cutting edge technology

Customizable

Cost

Page 7: ITAC 2016 Where Open Source Meets Audit Analytics

How does Open Source relate to Audit Analytics?

State of the art technology

Computer science's best and brightest love to contribute

Customizable

Scalability

Beautiful visualizations

Analytics and Data Science leaders, e.g. Google, Facebook, Uber, and Airbnb, use open source frameworks almost exclusively for their analytics.

Page 8: ITAC 2016 Where Open Source Meets Audit Analytics

"Bubble Charts." Plotly. Accessed August 14, 2016. https://plot.ly/python/bubble-charts/.

Page 9: ITAC 2016 Where Open Source Meets Audit Analytics

Benefits over traditional CAATs

ACL, IDEA, and Arbutus are the existing market leaders

Not very user friendly

Require extensive training to use effectively

Not very flexible

Do not provide the output auditors expect

Page 10: ITAC 2016 Where Open Source Meets Audit Analytics

So what do we do about it?

Page 11: ITAC 2016 Where Open Source Meets Audit Analytics

Enter Python (and R)

Page 12: ITAC 2016 Where Open Source Meets Audit Analytics

What is Python?

Open source, general purpose programming language

High level of support

Used by some of the best and brightest in Data Science

Extensive scientific, mathematical, data wrangling, and visualization libraries

Most popular first language in computer science departments across America (http://tinyurl.com/knw5mdv)

"About Python." Python.org. Accessed August 14, 2016. https://www.python.org/about/.

Page 13: ITAC 2016 Where Open Source Meets Audit Analytics

What is R?

"R is a language and environment for statistical computing and graphics." - "What Is R?" The R Project for Statistical Computing. Accessed August 14, 2016. https://www.r-project.org/about.html.

Used widely by statisticians for statistical analysis

As a result of its widespread use, thousands of easy-to-use libraries cover *all* widely used statistical techniques

Not a general purpose programming language in the way Python is; built primarily for statistical work

Page 14: ITAC 2016 Where Open Source Meets Audit Analytics

How would we go about using Python (or R)?

The hard way: by learning it

The even harder way: hire an auditor with programming, analytics and auditing experience

The *easiest* and most effective way: create a cross functional team by borrowing a programmer from IT and a business analyst from the business.

Page 15: ITAC 2016 Where Open Source Meets Audit Analytics

Example Python (and R) analytic test

https://github.com/aclarkData/AuditAnalytics

Journal entry tests: 999 amounts, weekend entries, and keywords

Steps:

Import libraries

Import data

Wrangle as needed

Export to folder

Email

Schedule - Task Scheduler on Windows, or cron (or an equivalent) on Unix-based systems, e.g. Mac and Linux
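The import, wrangle, and export steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the code in the linked repository. The column names, sample rows, and keyword list are hypothetical, and a "999 amount" is assumed here to mean a whole-dollar amount ending in 999. Emailing and scheduling are left out; Python's standard smtplib module is one option for the email step.

```python
import csv
import io
from datetime import date
from decimal import Decimal

# Hypothetical sample data standing in for a journal entry export.
SAMPLE_CSV = """entry_id,posted_date,amount,description
JE-001,2016-08-12,1500.00,Monthly rent accrual
JE-002,2016-08-13,2999.00,Adjustment per CFO request
JE-003,2016-08-14,420.50,Office supplies
JE-004,2016-08-22,999.00,Plug to balance intercompany
"""

# Hypothetical keyword list; tailor it to the audit's risk assessment.
KEYWORDS = ("plug", "per cfo", "write off")


def load_entries(handle):
    """Import data: read rows and convert dates and amounts to real types."""
    entries = []
    for row in csv.DictReader(handle):
        row["posted_date"] = date.fromisoformat(row["posted_date"])
        row["amount"] = Decimal(row["amount"])
        entries.append(row)
    return entries


def flag_entry(entry):
    """Wrangle: return the names of the tests this entry trips."""
    flags = []
    # Assumption: a "999 amount" has a whole-dollar part ending in 999.
    if int(abs(entry["amount"])) % 1000 == 999:
        flags.append("999 amount")
    # weekday() returns 5 for Saturday and 6 for Sunday.
    if entry["posted_date"].weekday() >= 5:
        flags.append("weekend")
    description = entry["description"].lower()
    if any(keyword in description for keyword in KEYWORDS):
        flags.append("keyword")
    return flags


def run_tests(entries):
    """Return (entry_id, flags) pairs for every entry with at least one flag."""
    results = []
    for entry in entries:
        flags = flag_entry(entry)
        if flags:
            results.append((entry["entry_id"], flags))
    return results


def export_results(results, handle):
    """Export: write flagged entries as CSV to any file-like object."""
    writer = csv.writer(handle)
    writer.writerow(["entry_id", "flags"])
    for entry_id, flags in results:
        writer.writerow([entry_id, "; ".join(flags)])


flagged = run_tests(load_entries(io.StringIO(SAMPLE_CSV)))
report = io.StringIO()
export_results(flagged, report)
print(report.getvalue())
```

In a real test, the io.StringIO objects would be replaced by files opened on the general ledger export and the output folder, and the script would be run on a schedule.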

Page 16: ITAC 2016 Where Open Source Meets Audit Analytics
Page 17: ITAC 2016 Where Open Source Meets Audit Analytics
Page 18: ITAC 2016 Where Open Source Meets Audit Analytics
Page 19: ITAC 2016 Where Open Source Meets Audit Analytics
Page 20: ITAC 2016 Where Open Source Meets Audit Analytics

Machine Learning

In essence, a machine learns patterns from data without being explicitly programmed.

A powerful technology that is transforming banking, search engines, and advertising, and may soon reach every industry.

Examples: credit card fraud detection, targeted demographic advertising, anomaly detection in sensor data, etc.

Page 21: ITAC 2016 Where Open Source Meets Audit Analytics

Machine Learning Cont.

Numerous possibilities for applying machine learning and related technology, e.g. Natural Language Processing, to Financial Auditing

For example, an unsupervised clustering algorithm is in use at Astec Industries.

The latest developments are available only in open source software or in expensive statistical or computational programs such as SAS, which as of August 2016 costs a minimum of $9,200 upfront per single-user license plus annual fees - "SAS® Analytics Pro." SAS®. Accessed August 26, 2016. https://www.sas.com/store/software/analytics-pro/prodPERSANL.html.
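To make the unsupervised clustering idea concrete, here is a minimal sketch of k-means (Lloyd's algorithm) on one-dimensional transaction amounts, written with the standard library only. It is not the algorithm in use at Astec Industries, which the slides do not describe. The amounts are hypothetical, the function name k_means_1d is made up for this example, and the evenly spaced starting centroids are a simplification chosen to keep the result deterministic. In practice a library implementation such as scikit-learn's KMeans would be the usual choice.

```python
def k_means_1d(values, k, iterations=20):
    """Cluster numbers into k >= 2 groups with Lloyd's algorithm (k-means)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: abs(value - centroids[c]))
            for value in values
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, a in zip(values, assignments) if a == c]
            if members:
                new_centroids.append(sum(members) / len(members))
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Converged: no centroid moved.
        centroids = new_centroids
    return centroids, assignments


# Hypothetical transaction amounts; no labels are supplied.
amounts = [120.0, 135.0, 110.0, 128.0, 9800.0, 140.0, 10250.0, 125.0]
centroids, assignments = k_means_1d(amounts, k=2)

# Treat the smallest cluster as the set of entries worth a closer look.
sizes = [assignments.count(c) for c in range(2)]
smallest = sizes.index(min(sizes))
unusual = [v for v, a in zip(amounts, assignments) if a == smallest]
print(unusual)
```

The algorithm is never told which amounts are unusual; the two large entries end up in their own cluster purely because of their distance from the rest.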

Page 22: ITAC 2016 Where Open Source Meets Audit Analytics

Possibilities:

Time Series Machine Learning for predicting account balances

Natural Language Processing techniques for contract review and summarization - the current bottleneck is Optical Character Recognition (OCR) technology.

Sentiment Analysis for Journal Entry and Transaction descriptions.

Jupyter notebooks for reproducible analytics and audit documentation
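As a rough illustration of the first possibility above, predicting account balances, the sketch below fits a straight-line trend to past month-end balances by ordinary least squares and compares the forecast with the recorded balance. A linear trend is only the simplest baseline, standing in for a proper time series model. The balances, the function names, and the idea of reviewing large deviations are assumptions made for this example.

```python
def fit_linear_trend(balances):
    """Fit balance = intercept + slope * period by ordinary least squares."""
    n = len(balances)
    periods = range(n)
    mean_x = sum(periods) / n
    mean_y = sum(balances) / n
    covariance = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(periods, balances)
    )
    variance = sum((x - mean_x) ** 2 for x in periods)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return intercept, slope


def forecast(balances, steps_ahead=1):
    """Extend the fitted trend past the last observed period."""
    intercept, slope = fit_linear_trend(balances)
    return intercept + slope * (len(balances) - 1 + steps_ahead)


# Hypothetical month-end balances (in thousands) for one account.
month_end_balances = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]
expected_next = forecast(month_end_balances)

# Hypothetical recorded balance for the following month.
recorded_next = 190.0
deviation = recorded_next - expected_next
print(f"expected {expected_next:.1f}, recorded {recorded_next:.1f}, "
      f"deviation {deviation:.1f}")
```

A large gap between the expected and recorded balance does not prove an error, but it gives the auditor a data-driven place to start asking questions.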

Page 23: ITAC 2016 Where Open Source Meets Audit Analytics

https://try.jupyter.org/

Page 24: ITAC 2016 Where Open Source Meets Audit Analytics

Conclusion:

Definition of Open Source Software

Unlimited possibilities for a customizable analytics experience

Scalable

Real world example

Machine Learning and the future of audit analytics

Page 25: ITAC 2016 Where Open Source Meets Audit Analytics

THANK YOU!

Please Remember To Fill Out Your Session Evaluation Forms!