Inside the Human Speechome

Charles Naut

Unnatural laboratory environments

Observations weeks or months apart

Imprecise diary studies

No visual context or sparse visual recordings

Disruptive observer effect

Non-longitudinal

Traditional Language Studies

Goal: “To better understand how children learn the meaning of words through analysis of observational recordings of child-caregiver interactions in natural contexts”

Attempt to model language development

Find role of linguistic environment

1 child recorded from birth to the age of three

10 hours a day

230,000 hours of audio-video recordings

The Human Speechome Project

14 ceiling-mounted boundary-layer microphones

11 overhead omnidirectional cameras

5 to 6 cameras on at any given time

3,000 feet of concealed wires

3 years

The Setup

Fish Eye Footage

http://www.media.mit.edu/cogmac/videos/blue_ball_low.mov

Progression of the word “ball”

Ball Video




Completion of recording phase

Focusing on the important period from 9 to 24 months of age (4,260 hours of recordings, 10 million words, 200 million frames)

28% transcribed (1,200 hours and 3 million words)

Current Status
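
As a quick check on the figures above, here is a minimal sketch of the coverage arithmetic. The variable names are illustrative, and the remaining-hours value is simply derived from the slide's totals rather than reported by the project.

```python
# Figures taken from the slide; variable names are illustrative only.
total_hours = 4260
transcribed_hours = 1200
total_words = 10_000_000
transcribed_words = 3_000_000

# Share of the 9-to-24-month subset transcribed so far.
hour_coverage = transcribed_hours / total_hours
word_coverage = transcribed_words / total_words
remaining_hours = total_hours - transcribed_hours

print(f"Hours transcribed: {hour_coverage:.0%}")
print(f"Words transcribed: {word_coverage:.0%}")
print(f"Hours remaining: {remaining_hours:,}")
```

The hour-based share rounds to the 28% quoted on the slide; the word-based share comes out slightly higher because the word totals are rounded to the nearest million.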

Need fast and accurate speech transcription and video annotation

Current software doesn’t scale well to large datasets

Speech recognition and object tracking software is inaccurate in dynamic environments

Combine machine learning and manual human work

New Annotation Technology

BlitzScribe – rapid audio transcription

TotalRecall – Audio-video browsing and annotation

TrackMarks – Object tracking

Praat – Fundamental frequency extraction

Technology Innovation

BlitzScribe Streamlined Interface

Transcription Software Benchmarks

Object Tracking and Social Hotspots

Based on 72 days of data from months 9 - 24 (400,000 words)

Caretakers appropriately adjust their speech to assist children in word learning

Word births increase exponentially, then drop after 20 months

The frequency and prosody of a word in speech correlate strongly with the date of its word birth

Early Findings
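
To make the notion of a "word birth" concrete, here is a minimal sketch that treats a word birth as the first date on which the child is transcribed producing a word, then counts births by the child's age in months. This is an assumed simplification, not the project's actual criterion; the function names, the `(date, speaker, text)` record layout, and the toy data are all hypothetical.

```python
from collections import Counter
from datetime import date


def word_births(utterances, child="child"):
    """Return {word: date of the child's first transcribed use}."""
    births = {}
    # Process utterances in chronological order so the first use wins.
    for day, speaker, text in sorted(utterances, key=lambda u: u[0]):
        if speaker != child:
            continue
        for word in text.lower().split():
            births.setdefault(word, day)
    return births


def births_per_month(births, birthday):
    """Count word births by the child's age in whole months."""
    counts = Counter()
    for day in births.values():
        months = (day.year - birthday.year) * 12 + (day.month - birthday.month)
        if day.day < birthday.day:
            months -= 1  # the current month of age is not yet complete
        counts[months] += 1
    return dict(sorted(counts.items()))


# Toy data, invented for illustration only.
birthday = date(2000, 1, 1)
sample = [
    (date(2000, 12, 3), "caregiver", "where is the ball"),
    (date(2000, 12, 5), "child", "ball"),
    (date(2001, 2, 10), "child", "ball water"),
]
births = word_births(sample)
print(births_per_month(births, birthday))
```

Plotting the resulting monthly counts over the 9-to-24-month window is one way a rise and later drop in word births could be inspected.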

Transcribe and annotate entire corpus

Continue study of months 9 – 24

Effects of social hotspots on word learning

Can body movements be used to detect word births?

Share selected coded portions of corpus

Expand pilot study to more children

The Future

The End

Thank You
