Introduction of Deep Learning


AI (Artificial Intelligence)

the intelligence exhibited by machines or software

Turing Test

a test of a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human

Deductive Reasoning

the process of reasoning from one or more statements (premises) to reach a logically certain conclusion

Inductive Reasoning

reasoning in which the premises are viewed as supplying strong evidence for the truth of the conclusion
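The contrast between the two kinds of reasoning can be sketched in a few lines. This is only an illustration: the helper names `deduce` and `induce`, the Socrates rule, and the swan observations are assumptions made for the example, not part of the original slides.

```python
def deduce(facts, rules):
    """Deduction: apply 'if premises then conclusion' rules until nothing new follows."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def induce(observations, value):
    """Induction: measure how strongly the observations support 'every item is value'."""
    return sum(1 for item in observations if item == value) / len(observations)


# Deduction: if the premise holds, the conclusion is logically certain.
facts = {"Socrates is a man"}
rules = [(("Socrates is a man",), "Socrates is mortal")]  # encodes "all men are mortal"
conclusions = deduce(facts, rules)

# Induction: the observations only lend evidence to "all swans are white".
support = induce(["white", "white", "white", "black"], "white")
```

Here `conclusions` contains "Socrates is mortal" with certainty, while `support` is only a degree of evidence (0.75) that a single black swan has already weakened.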

IBM Watson

a question answering computer system capable of answering questions posed in natural language

Watson applies advanced natural language processing, information retrieval, knowledge representation, automated reasoning, and machine learning technologies to the field of open-domain question answering.

Software

Watson uses IBM's DeepQA software and the Apache UIMA (Unstructured Information Management Architecture) framework. The system was written in various languages, including Java, C++, and Prolog, and runs on the SUSE Linux Enterprise Server 11 operating system, using the Apache Hadoop framework to provide distributed computing.

Hardware

Watson is composed of a cluster of ninety IBM Power 750 servers, each of which uses a 3.5 GHz POWER7 eight-core processor, with four threads per core. In total, the system has 2,880 POWER7 processor threads and 16 terabytes of RAM.
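As a quick check, the thread total follows from the figures quoted above and nothing else. The variable names are chosen for this illustration.

```python
servers = 90            # IBM Power 750 servers in the cluster
cores_per_server = 8    # one eight-core POWER7 processor per server, as stated above
threads_per_core = 4

total_cores = servers * cores_per_server
total_threads = total_cores * threads_per_core

print(total_cores, total_threads)
```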

Data

The sources of information for Watson include encyclopedias, dictionaries, thesauri, newswire articles, and literary works. Watson also used databases, taxonomies, and ontologies. Specifically, DBPedia, WordNet, and Yago were used. The IBM team provided Watson with millions of documents, including dictionaries, encyclopedias, and other reference material that it could use to build its knowledge. Although Watson was not connected to the Internet during the game, it contained 200 million pages of structured and unstructured content consuming four terabytes of disk storage, including the full text of Wikipedia.

Deep Blue

a chess-playing computer developed by IBM

AlphaGo

a computer program developed by Google DeepMind in London to play the board game Go

Machine Learning

the study and construction of algorithms that can learn from and make predictions on data

A core objective of a learner is to generalize from its experience.

Generalization is the ability of a learning machine to perform accurately on new, unseen examples or tasks after having experienced a learning data set.
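Generalization is usually measured by holding some data back. The sketch below fits a straight line to one part of a data set and then scores it on examples it never saw. The data (y = 3x + 2 plus noise), the 150/50 split, and the helper names are assumptions made for this example.

```python
import random


def fit_line(points):
    """Ordinary least squares for y = a * x + b on (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def mean_squared_error(model, points):
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in points) / len(points)


rng = random.Random(0)
# Hypothetical data: y = 3x + 2 plus Gaussian noise with standard deviation 0.5
xs = [rng.uniform(0, 10) for _ in range(200)]
data = [(x, 3 * x + 2 + rng.gauss(0, 0.5)) for x in xs]

train, test = data[:150], data[150:]   # the learner never sees `test` while fitting
model = fit_line(train)
train_error = mean_squared_error(model, train)
test_error = mean_squared_error(model, test)
print(model, train_error, test_error)
```

A learner that generalizes well shows a test error close to its training error. A large gap between the two is the usual sign of overfitting.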

Approaches of Machine Learning

• Decision tree learning

• Artificial neural networks

• Support vector machines

• Bayesian networks

• Clustering

• Genetic algorithms

• …

Applied Machine Learning Process

1. Define the Problem

Step 1: What is the problem?

Step 2: Why does the problem need to be solved?

Step 3: How would I solve the problem?

2. Prepare Data

Step 1: Data Selection

Step 2: Data Preprocessing

Step 3: Data Transformation

3. Spot Check Algorithms

4. Improve Results

5. Present Results
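Step 3, spot-checking algorithms, can be sketched as trying a few simple learners on the same held-out data and keeping the most promising one. The toy problem (label is 1 when the input exceeds 5), the two candidate learners, and all names below are assumptions made for this illustration, not the process author's code.

```python
def majority_baseline(train):
    """Return a classifier that always predicts the most common training label."""
    labels = [label for _, label in train]
    most_common = max(set(labels), key=labels.count)
    return lambda x: most_common


def nearest_neighbour(train):
    """Return a 1-nearest-neighbour classifier over one-dimensional inputs."""
    return lambda x: min(train, key=lambda pair: abs(pair[0] - x))[1]


def accuracy(classifier, examples):
    return sum(classifier(x) == label for x, label in examples) / len(examples)


# Prepare data: hypothetical inputs 0.0, 0.1, ..., 9.9 with label 1 when x > 5
data = [(i / 10, int(i / 10 > 5)) for i in range(100)]
train, test = data[::2], data[1::2]   # alternate examples between train and test

# Spot check: evaluate every candidate on the same held-out examples
candidates = {
    "majority baseline": majority_baseline,
    "1-nearest neighbour": nearest_neighbour,
}
scores = {name: accuracy(build(train), test) for name, build in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

The winner of the spot check is then the starting point for step 4 (improving results), for example by tuning it or combining it with other models.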

Deep Learning

a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using multiple processing layers, with complex structures or otherwise, composed of multiple non-linear transformations

Deep learning has been characterized as a buzzword, or a rebranding of neural networks.
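The phrase "multiple processing layers composed of non-linear transformations" can be made concrete with a tiny two-layer network. The sketch below computes XOR, a function that a single linear layer cannot represent. The weights are hand-picked for the illustration, not learned, and the function names are assumptions.

```python
def relu(values):
    """Non-linear transformation applied element-wise."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One processing layer: a weighted sum of the inputs plus a bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


# Hand-picked (not learned) parameters for a network with one hidden layer.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [0.0, -1.0]
OUTPUT_W = [[1.0, -2.0]]
OUTPUT_B = [0.0]


def xor_net(x1, x2):
    hidden = relu(dense([x1, x2], HIDDEN_W, HIDDEN_B))   # layer 1 + non-linearity
    return dense(hidden, OUTPUT_W, OUTPUT_B)[0]          # layer 2


outputs = {(a, b): xor_net(a, b) for a in (0, 1) for b in (0, 1)}
print(outputs)
```

Removing the `relu` call collapses the two layers into a single linear map, which is why the non-linearity between layers matters.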

Neural Network

a family of models used to estimate or approximate functions that can depend on a large number of inputs and are generally unknown
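A minimal sketch of "approximating an unknown function": a single linear neuron adjusts its weight and bias by gradient descent until its output matches sampled input/output pairs. The target function (y = 2x + 1), the learning rate, and the epoch count are assumptions chosen for this example.

```python
def train_neuron(samples, learning_rate=0.01, epochs=2000):
    """Fit prediction = weight * x + bias to (x, y) samples by gradient descent."""
    weight, bias = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to weight and bias
        grad_w = sum(2 * (weight * x + bias - y) * x for x, y in samples) / n
        grad_b = sum(2 * (weight * x + bias - y) for x, y in samples) / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias


# The neuron only ever sees samples of the hypothetical "unknown" function y = 2x + 1.
samples = [(x, 2 * x + 1) for x in (-2, -1, 0, 1, 2)]
weight, bias = train_neuron(samples)
print(weight, bias)
```

A full network stacks many such units into hidden layers and learns all of their weights in the same way.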

[Figure: neural network diagram with the labels Objective, Solution, Hidden Layer, and weight]

NN vs. Deep NN

Deep Neural Network

• Pretraining

– unsupervised learning

Deep Neural Network

• Dropout
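Dropout randomly switches off units during training so that the network cannot rely on any single one. The sketch below uses the common "inverted dropout" variant, which rescales the surviving units during training. That choice, the function name, and the 0.5 drop probability are assumptions made for this example and are not taken from the slides.

```python
import random


def dropout(activations, drop_probability, rng, training=True):
    """Inverted dropout: zero each unit with probability p, scale survivors by 1 / (1 - p)."""
    if not training or drop_probability == 0.0:
        return list(activations)   # at test time every unit is kept unchanged
    keep = 1.0 - drop_probability
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(42)
hidden = [1.0] * 10000             # hypothetical hidden-layer activations

dropped = dropout(hidden, drop_probability=0.5, rng=rng)
kept_fraction = sum(1 for a in dropped if a != 0.0) / len(dropped)
mean_activation = sum(dropped) / len(dropped)

unchanged = dropout(hidden, drop_probability=0.5, rng=rng, training=False)
print(kept_fraction, mean_activation)
```

Because survivors are scaled up, the expected activation stays roughly the same with and without dropout, so no extra rescaling is needed at test time.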

Deep Neural Network

• Early stopping

– overfitting
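Early stopping guards against overfitting by watching the loss on held-out validation data and halting once it stops improving. The sketch below uses a patience counter, one common way to decide when to stop. The function name, the patience value, and the loss values are assumptions made for this illustration.

```python
def early_stopping_epoch(validation_losses, patience=2):
    """Return (stop_epoch, best_epoch) under patience-based early stopping."""
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(validation_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch      # new best: remember this epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch                 # no improvement for `patience` epochs
    return len(validation_losses) - 1, best_epoch    # never triggered: ran to the end


# Hypothetical validation losses: they improve, then rise as the model starts to overfit.
losses = [0.90, 0.60, 0.45, 0.40, 0.42, 0.47, 0.55, 0.70]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=2)
print(stop_epoch, best_epoch)
```

In practice the weights saved at `best_epoch` are the ones kept, since later epochs only fit the training data more closely.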

NN vs. Deep NN

Variants of Deep Architectures

• Brief about deep neural networks

• Deep belief networks

• Convolutional neural networks

• Recurrent neural networks

• Stacked (de-noising) auto-encoders

• Spike-and-slab RBMs

• …

Applications

Libraries

• Caffe – A deep learning framework specializing in image recognition.

• CNTK – An open-source deep-learning toolkit (Computational Network Toolkit) by Microsoft Research.

• ConvNetJS – A JavaScript library for training deep learning models. It contains online demos.

• Deeplearning4j – An open-source deep-learning library written for Java with LSTMs and convolutional networks. It provides parallelization with CPUs and GPUs.

• Gensim – A toolkit for natural language processing implemented in the Python programming language.

• Keras – A deep learning framework capable of running on top of either TensorFlow or Theano.

• NVIDIA cuDNN – A GPU-accelerated library of primitives for deep neural networks.

• OpenNN – An open source C++ library which implements deep neural networks and provides parallelization with CPUs.

• TensorFlow – Google's open source machine learning library in C++ and Python with APIs for both. It provides parallelization with CPUs and GPUs.

• Theano – An open source machine learning library for Python.

• Torch – An open source software library for machine learning based on the Lua programming language.

• Apache SINGA – A general distributed deep learning platform.
