Inside the Human Speechome

Charles Naut


Presentation on Deb Roy's Human Speechome Project for Anne Fernald's introductory seminar on language learning in children and adults.

Transcript of Inside the Human Speechome

Page 1: Inside the Human Speechome

Charles Naut


Page 2: Traditional Language Studies

Unnatural laboratory environments

Observations weeks or months apart

Imprecise diary studies

No visual context or sparse visual recordings

Disruptive observer effect

Non-longitudinal

Page 3: The Human Speechome Project

Goal: “To better understand how children learn the meaning of words through analysis of observational recordings of child-caregiver interactions in natural contexts”

Attempt to model language development

Find the role of the linguistic environment

One child recorded from birth to the age of three

10 hours a day

230,000 hours of audio-video recordings

Page 4: The Setup

14 ceiling-mounted boundary-layer microphones

11 overhead omnidirectional cameras

5 to 6 cameras on at any given time

3,000 feet of concealed wires

3 years

Page 5: Fisheye Footage

Page 6: Ball Video

http://www.media.mit.edu/cogmac/videos/blue_ball_low.mov

Progression of the word “ball”

Page 7: Current Status

Recording phase completed

Focusing on the important period from 9 to 24 months of age (4,260 hours of recordings, 10 million words, 200 million frames)

28% transcribed (1,200 hours and 3 million words)
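The transcription progress on the Current Status slide can be checked with simple arithmetic. The sketch below uses only the numbers quoted on that slide; the function name `coverage_percent` is illustrative and not part of any project tooling.

```python
# Figures quoted on the Current Status slide (months 9 to 24 subset).
TOTAL_HOURS = 4_260
TOTAL_WORDS = 10_000_000
TRANSCRIBED_HOURS = 1_200
TRANSCRIBED_WORDS = 3_000_000


def coverage_percent(done, total):
    """Share of the corpus already processed, as a percentage."""
    return 100 * done / total


hours_pct = coverage_percent(TRANSCRIBED_HOURS, TOTAL_HOURS)
words_pct = coverage_percent(TRANSCRIBED_WORDS, TOTAL_WORDS)

print(f"{hours_pct:.1f}% of hours")  # 28.2% of hours
print(f"{words_pct:.1f}% of words")  # 30.0% of words
```

The 28% on the slide matches the hour-based share. The word-based share comes out slightly higher, which is what one would expect if the 10 million word total is a rounded estimate.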

Page 8: New Annotation Technology

Need fast and accurate speech transcription and video annotation

Current software doesn’t scale well to large datasets

Speech recognition and object tracking software is inaccurate in dynamic environments

Combine machine learning with manual human work

Page 9: Technology Innovation

BlitzScribe – rapid audio transcription

TotalRecall – audio-video browsing and annotation

TrackMarks – object tracking

Praat – fundamental frequency extraction

Page 10: BlitzScribe Streamlined Interface

Page 11: Transcription Software Benchmarks

Page 12: Object Tracking and Social Hotspots

Page 13: Early Findings

Based on 72 days of data from months 9 to 24 (400,000 words)

Caretakers appropriately adjust their speech to assist children in word learning

The rate of word births increases exponentially, then drops after 20 months

The frequency and prosody of a word in speech correlate strongly with the date of its word birth
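The slides do not define "word birth". A minimal sketch, assuming it means the child's first recorded production of a word, shows how births could be extracted from transcripts and counted per month. The function names and the toy data are hypothetical and do not come from the Speechome corpus or its tools.

```python
from collections import Counter


def word_births(utterances):
    """Map each word to the age (in months) at which the child first produced it.

    `utterances` is an iterable of (age_in_months, word) pairs.
    Assumption: a word birth is the earliest recorded production of the word.
    """
    births = {}
    for age, word in utterances:
        word = word.lower()
        if word not in births or age < births[word]:
            births[word] = age
    return births


def births_per_month(births):
    """Count how many words were born in each month of age."""
    return Counter(births.values())


# Hypothetical toy data, not taken from the Speechome corpus.
child_words = [
    (10, "mama"), (12, "ball"), (12, "mama"),
    (14, "water"), (14, "Ball"), (15, "dog"), (15, "car"),
]

births = word_births(child_words)
monthly = births_per_month(births)

print(births["ball"])           # 12
print(sorted(monthly.items()))  # [(10, 1), (12, 1), (14, 1), (15, 2)]
```

Plotting the monthly counts over months 9 to 24 would give the kind of curve the finding describes, with a rise followed by a drop. The real analysis would also need a criterion for what counts as a reliable production, which this sketch leaves out.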

Page 14: The Future

Transcribe and annotate entire corpus

Continue study of months 9 to 24:

Effects of social hotspots on word learning

Can body movements be used to detect word births?

Share selected coded portions of corpus

Expand pilot study to more children

Page 15: Inside the Human Speechome

The End

Thank You