CS228: Deep Learning & Unsupervised Feature Learning
Andrew Ng
How is computer perception done?
Object detection: Image -> Low-level vision features -> Recognition
Computer vision is hard!
How is computer perception done?
Object detection: Image -> Vision features -> Recognition
Audio classification: Audio -> Audio features -> Speaker ID
NLP: Text -> Text features -> Text classification, MT, IR, etc.
Sensor representations
Input -> Low-level features -> Learning/AI algorithm
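The pipeline on this slide (input, then low-level features, then a learning algorithm) can be sketched roughly as follows. This is only an illustration: the function `low_level_features`, the `NearestCentroid` class, and the striped toy images are assumptions made for this example and do not come from the slides. The features here are hand-designed, which is the step that feature learning aims to replace.

```python
from typing import Dict, List

Image = List[List[float]]  # grayscale image: rows of pixel intensities


def low_level_features(image: Image) -> List[float]:
    """Hand-designed features (hypothetical): mean brightness plus mean
    absolute horizontal and vertical intensity differences."""
    h, w = len(image), len(image[0])
    mean = sum(sum(row) for row in image) / (h * w)
    horiz = sum(abs(row[x + 1] - row[x])
                for row in image for x in range(w - 1)) / (h * (w - 1))
    vert = sum(abs(image[y + 1][x] - image[y][x])
               for y in range(h - 1) for x in range(w)) / ((h - 1) * w)
    return [mean, horiz, vert]


class NearestCentroid:
    """Stand-in for the learning/AI algorithm: predicts the label whose
    average training feature vector is closest to the query."""

    def fit(self, features: List[List[float]], labels: List[str]) -> "NearestCentroid":
        sums: Dict[str, List[float]] = {}
        counts: Dict[str, int] = {}
        for vec, label in zip(features, labels):
            acc = sums.setdefault(label, [0.0] * len(vec))
            for i, v in enumerate(vec):
                acc[i] += v
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {lab: [v / counts[lab] for v in acc]
                          for lab, acc in sums.items()}
        return self

    def predict(self, vec: List[float]) -> str:
        def sq_dist(centroid: List[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(vec, centroid))
        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab]))


# Toy inputs: 4x4 images with vertical or horizontal stripes.
vertical = [[0.0, 1.0, 0.0, 1.0]] * 4
horizontal = [[0.0] * 4, [1.0] * 4, [0.0] * 4, [1.0] * 4]

# Input -> low-level features -> learning algorithm.
model = NearestCentroid().fit(
    [low_level_features(vertical), low_level_features(horizontal)],
    ["vertical", "horizontal"],
)
query = [[1.0, 0.0, 1.0, 0.0]] * 4  # shifted vertical stripes
prediction = model.predict(low_level_features(query))
```

The classifier never sees pixels directly; it only sees whatever the feature function exposes, so the quality of the representation bounds what the learning algorithm can do.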
A plethora of sensors
Camera array
3D range scan (laser scanner)
3D range scans (flash lidar)
Audio
Visible light image
Thermal infrared
A general-purpose algorithm for good sensor representations?
Sensor representation in the brain
Seeing with your tongue
Human echolocation (sonar)
Auditory cortex learns to see.
[BrainPort; Martinez et al.; Roe et al.]
Learning abstract representations
pixels
edges
object parts (combination of edges)
object models
[Related work: Deep learning, Hinton, Bengio, LeCun, and others.]
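The hierarchy above, where each level is built from combinations of the level below, can be illustrated with a tiny two-layer sketch. Everything here is an assumption for illustration: the `layer` helper, the one-dimensional 4-pixel input, and the hand-set weights are not from the slides. In the deep learning work cited, such weights are learned from data rather than written by hand.

```python
from typing import List, Tuple


def layer(inputs: List[float], weights: List[List[float]],
          biases: List[float]) -> List[float]:
    """One layer: weighted sum of the inputs followed by a rectifier."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


# Layer 1 (pixels -> edges), hand-set for a 4-pixel strip:
# unit 0 fires on a rising edge between pixels 0 and 1,
# unit 1 fires on a falling edge between pixels 2 and 3.
EDGE_WEIGHTS = [[-1.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, -1.0]]
EDGE_BIASES = [0.0, 0.0]

# Layer 2 (edges -> part): a "bar" part fires only when both edges are present.
PART_WEIGHTS = [[1.0, 1.0]]
PART_BIASES = [-1.0]


def bar_detector(pixels: List[float]) -> Tuple[List[float], List[float]]:
    """Return the edge activations and the part activation for a strip."""
    edges = layer(pixels, EDGE_WEIGHTS, EDGE_BIASES)
    parts = layer(edges, PART_WEIGHTS, PART_BIASES)
    return edges, parts


bar_edges, bar_parts = bar_detector([0.0, 1.0, 1.0, 0.0])    # bright bar in the middle
step_edges, step_parts = bar_detector([0.0, 1.0, 1.0, 1.0])  # a single step, no bar
```

The part unit responds to the bar but not to the single step, because it needs both edge units to be active at once; stacking more such layers is what would take the representation from parts towards object models.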
Feature learning for audio
Learned features correspond to phonemes and other “basic units” of sound.
Learned features
Algorithm:
Audio
TIMIT Phone classification Accuracy
Prior art (Clarkson et al., 1999) 79.6%
Stanford Feature learning 80.3%
TIMIT Speaker identification Accuracy
Prior art (Reynolds, 1995) 99.7%
Stanford Feature learning 100.0%
Images
CIFAR Object classification Accuracy
Prior art (Yu and Zhang, 2010) 74.5%
Stanford Feature learning 79.6%
NORB Object classification Accuracy
Prior art (Ranzato et al., 2009) 94.4%
Stanford Feature learning 97.0%
Multimodal (audio/video)
AVLetters Lip reading Accuracy
Prior art (Zhao et al., 2009) 58.9%
Stanford Feature learning 65.8%
Galaxy
Video
Hollywood2 Classification Accuracy
Prior art (Laptev et al., 2004) 48%
Stanford Feature learning 53%
KTH Accuracy
Prior art (Wang et al., 2010) 92.1%
Stanford Feature learning 93.9%
UCF Accuracy
Prior art (Wang et al., 2010) 85.6%
Stanford Feature learning 86.5%
YouTube Accuracy
Prior art (Liu et al., 2009) 71.2%
Stanford Feature learning 75.8%
Other feature learning records: Different phone recognition task (Hinton), PASCAL VOC object classification (Yu)
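To compare the benchmark figures above on a common footing, a short sketch like the following could compute the absolute gain in percentage points and the relative reduction in error for each task. The accuracy pairs are copied from the slides; the helper names `absolute_gain` and `relative_error_reduction` and the derived numbers they produce are additions for illustration, not results reported in the deck.

```python
from typing import Dict, Tuple

# (prior art accuracy %, Stanford feature learning accuracy %), as listed on the slides.
RESULTS: Dict[str, Tuple[float, float]] = {
    "TIMIT phone classification": (79.6, 80.3),
    "TIMIT speaker identification": (99.7, 100.0),
    "CIFAR object classification": (74.5, 79.6),
    "NORB object classification": (94.4, 97.0),
    "AVLetters lip reading": (58.9, 65.8),
    "Hollywood2": (48.0, 53.0),
    "KTH": (92.1, 93.9),
    "UCF": (85.6, 86.5),
    "YouTube": (71.2, 75.8),
}


def absolute_gain(prior: float, new: float) -> float:
    """Improvement in percentage points, rounded to one decimal."""
    return round(new - prior, 1)


def relative_error_reduction(prior: float, new: float) -> float:
    """Fraction of the prior error (100 - accuracy) that was removed."""
    return (new - prior) / (100.0 - prior)


gains = {task: absolute_gain(p, n) for task, (p, n) in RESULTS.items()}
reductions = {task: relative_error_reduction(p, n) for task, (p, n) in RESULTS.items()}

for task in RESULTS:
    print(f"{task}: +{gains[task]} points, "
          f"{100 * reductions[task]:.1f}% of prior error removed")
```

Looking at error reduction alongside the raw gain matters for tasks that are already near the ceiling, where a small gain in points can still remove a large share of the remaining errors.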