Sound Detection


Presentation Transcript

Sound Detection: 

Derek Hoiem
Rahul Sukthankar (mentor)
August 24, 2004

Objective: 

Learn a model of a sound object from a few (10-20) examples and distinguish it from all other sounds.
Examples of sound classes: gunshots, screams, laughter, car horns, meows, dog barks, etc.

Applications: 

- “Tell me if you hear a gunshot.” (monitoring)
- “Get me video clips containing dogs barking.” (search and retrieval)
- “What’s going on?” (scene understanding)

Why it's difficult: 

- Sound classes have large variations
- Sounds are often ambiguous without context
- Overlaid “noise” obscures the sound

Sound or not?: 

Car horn, laser gun, dog bark
Which of these sounds are not from their named classes?

Previous work: 

Sound classification (Wold 1996, Casey 2001, etc.)
- Categorize short sound clips
- Reasonable accuracy (5-20% error)
Sound detection (Dufaux 2000, Piamsa-nga 1999)
- Localize and recognize sound objects in long clips
- Poor performance, or assumption of unrealistic conditions (e.g., very quiet background)

Detection via Windowed Search: 

1. Break the long audio track into short overlapping clips.
2. Independently classify each short clip as object or non-object (clip classifier).
3. Return the locations of the detected sound object.
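The windowed search can be sketched roughly as follows. This is a minimal illustration, not the presenters' code: the names `windowed_detect` and `energy_score`, the window and hop sizes, and the toy energy-based scorer standing in for the clip classifier are all assumptions.

```python
from typing import Callable, List, Sequence, Tuple


def windowed_detect(
    track: Sequence[float],
    classify_clip: Callable[[Sequence[float]], float],
    window: int,
    hop: int,
    threshold: float = 0.0,
) -> List[Tuple[int, int, float]]:
    """Score overlapping fixed-length clips independently; return (start, end, score) hits."""
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    detections = []
    # Clips overlap whenever hop < window; a track shorter than one window yields no clips.
    for start in range(0, len(track) - window + 1, hop):
        clip = track[start:start + window]
        score = classify_clip(clip)
        if score > threshold:
            detections.append((start, start + window, score))
    return detections


def energy_score(clip: Sequence[float]) -> float:
    # Hypothetical stand-in for a learned clip classifier:
    # mean squared amplitude minus a fixed offset.
    return sum(x * x for x in clip) / len(clip) - 0.5


# Toy track: silence, a 4-sample burst, silence.
track = [0.0] * 8 + [1.0] * 4 + [0.0] * 8
hits = windowed_detect(track, energy_score, window=4, hop=2)
```

Only the clip that fully covers the burst scores above the threshold here; a real clip classifier would replace `energy_score`.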

Representation: 

[Figure: raw representation of example sounds (meows, phone rings)]

Classification Features: 

Diverse feature set: different sound classes are distinctive in different ways.
- Means and standard deviations of power at different frequencies
- Bandwidth, peaks, loudness, etc.
- 138 features in all
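As an illustration of the first feature family only (mean and standard deviation of power in frequency bands), here is a small standard-library sketch. The frame length, the number of bands, the power normalization, and the function names are assumptions; it produces 8 features, not the 138 used in the talk.

```python
import cmath
import math
from statistics import mean, pstdev


def power_spectrum(frame):
    """Naive DFT power for the non-negative frequency bins of one frame."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def band_power_features(clip, frame_len=16, n_bands=4):
    """Mean and standard deviation over time of the power in each frequency band."""
    frames = [clip[i:i + frame_len] for i in range(0, len(clip) - frame_len + 1, frame_len)]
    if not frames:
        raise ValueError("clip is shorter than one frame")
    per_frame = []
    for frame in frames:
        spec = power_spectrum(frame)[1:]  # drop the DC bin
        width = len(spec) // n_bands
        per_frame.append([sum(spec[b * width:(b + 1) * width]) for b in range(n_bands)])
    features = []
    for b in range(n_bands):
        values = [row[b] for row in per_frame]
        features += [mean(values), pstdev(values)]  # layout: [mean0, std0, mean1, std1, ...]
    return features


# A pure tone at DFT bin 2 puts nearly all of its power in the lowest band.
clip = [math.sin(2 * math.pi * 2 * t / 16) for t in range(64)]
features = band_power_features(clip)
```

The naive DFT keeps the example dependency-free; an FFT routine would be the usual choice for clips of realistic length.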

Classification by Decision Trees: 

- Try to find simple rules that discriminate object from non-object.
- Each decision is based on a threshold on a feature value.
- Assign a confidence at each leaf node based on the likelihood of the data for the object and non-object classes.
[Figure: tree diagram with decision nodes and leaf nodes]
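A threshold-based tree with per-leaf confidences might look like the sketch below. The slide does not give the confidence formula; the smoothed half log-ratio of class counts used here is an assumption, as are the class names `Leaf` and `Decision` and the example tree.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    n_object: float      # (weighted) count of object training clips reaching this leaf
    n_non_object: float  # (weighted) count of non-object training clips reaching this leaf

    def confidence(self, smoothing: float = 1.0) -> float:
        # Assumed form: smoothed half log-ratio of class counts; positive favors "object".
        return 0.5 * math.log((self.n_object + smoothing) / (self.n_non_object + smoothing))


@dataclass
class Decision:
    feature: int       # index into the feature vector
    threshold: float   # go "below" if the feature value is under this threshold
    below: Union["Decision", Leaf]
    above: Union["Decision", Leaf]


def tree_confidence(node: Union[Decision, Leaf], features: Sequence[float]) -> float:
    """Follow threshold decisions down to a leaf and return that leaf's confidence."""
    while isinstance(node, Decision):
        node = node.below if features[node.feature] < node.threshold else node.above
    return node.confidence()


# Hypothetical two-level tree over a two-element feature vector.
tree = Decision(
    feature=0, threshold=0.3,
    below=Leaf(n_object=9, n_non_object=1),
    above=Decision(
        feature=1, threshold=2.0,
        below=Leaf(n_object=1, n_non_object=9),
        above=Leaf(n_object=4, n_non_object=4),
    ),
)
conf_object = tree_confidence(tree, [0.1, 5.0])
conf_non_object = tree_confidence(tree, [0.5, 1.0])
conf_unsure = tree_confidence(tree, [0.5, 3.0])
```

A leaf reached mostly by object examples gives a positive confidence, one reached mostly by non-object examples gives a negative one, and an evenly split leaf gives zero.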

Boosted Trees: 

Problem: one decision tree by itself may not be a great classifier.
Solution: use several trees, with each one focusing on the mistakes of previously learned trees.
AdaBoost:
1. Weight the training data uniformly.
2. Learn a decision tree classifier on the weighted data.
3. Re-weight the data, giving more weight to incorrectly classified examples.
4. Repeat steps 2 and 3 for each additional tree.
Final classification is based on a linear combination of the confidences from all learned decision trees.
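The AdaBoost loop can be sketched as follows. To stay short, this uses depth-one trees (stumps) with ±1 outputs and the standard discrete AdaBoost weighting, which is an assumption; the talk combines real-valued leaf confidences from larger trees. All function names and the toy data are illustrative.

```python
import math


def fit_stump(X, y, w):
    """Pick the (feature, threshold, polarity) stump with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] >= thr else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best


def stump_predict(stump, row):
    _, j, thr, polarity = stump
    return polarity if row[j] >= thr else -polarity


def adaboost(X, y, rounds=3):
    n = len(X)
    w = [1.0 / n] * n  # start from uniform weights
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)  # learn on the weighted data
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Misclassified examples (y * prediction = -1) gain weight; the rest lose weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def ensemble_score(ensemble, row):
    """Linear combination of the individual classifier outputs."""
    return sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)


# Toy 1-D data that no single threshold can separate.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
predictions = [1 if ensemble_score(model, row) > 0 else -1 for row in X]
one_round = adaboost(X, y, rounds=1)
one_round_predictions = [1 if ensemble_score(one_round, row) > 0 else -1 for row in X]
```

On this toy data a single stump misclassifies two points, while three boosted stumps classify all six training points correctly.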

Examples of Decision Trees: 

[Figure: two example decision trees. The first has decision nodes "Low percentage of power in low frequencies in mid-time of sound", "Very high power amplitude range", and "High power amplitude range", with leaves labeled Meow and Gunshot. The second is a more complex tree that focuses on examples misclassified by the first.]

Cascade of Classifiers: 

Goal: eliminate false positives with few false negatives in early stages.
Advantages:
- Allows use of a large set of negative training examples
- Improves classification speed
Dangers: cannot recover from false negatives.
[Diagram: Sound clip -> Stage 1 -> Stage 2 -> Stage 3. Each stage either passes the clip on or fails it. Pass labels on the diagram: 5%, 2%, 0.005%.]
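A cascade of this kind reduces to an early-exit loop. In the sketch below, the stage scoring functions and thresholds are made-up placeholders; in the talk each stage would be a trained classifier.

```python
from typing import Callable, List, Sequence, Tuple

# Each stage is (scoring function, pass threshold).
Stage = Tuple[Callable[[Sequence[float]], float], float]


def cascade_accepts(stages: List[Stage], clip_features: Sequence[float]) -> bool:
    """Return True only if the clip passes every stage in order."""
    for score, threshold in stages:
        if score(clip_features) < threshold:
            # Rejected here: later stages never see this clip, which saves time
            # but also means a false negative cannot be recovered.
            return False
    return True


# Hypothetical three-stage cascade over a two-element feature vector.
stages: List[Stage] = [
    (lambda f: f[0], 0.2),
    (lambda f: f[1], 0.5),
    (lambda f: f[0] + f[1], 1.5),
]
passes_all = cascade_accepts(stages, [0.9, 0.8])
fails_stage_1 = cascade_accepts(stages, [0.1, 0.9])
fails_stage_3 = cascade_accepts(stages, [0.9, 0.55])
```

Because most non-object clips are rejected by the first stage, the more expensive later stages run on only a small fraction of the input.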

Results: Classification Error: 

[Figure: classification error results]

Results: ROC curves: 

[Figure: ROC curves]
Note: to approximate the negative error rate, divide FP (the false positive count) by 25,000.
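The conversion in the note can be illustrated with a small threshold sweep. This sketch assumes the 25,000 is the total number of negative clips, which the slide does not state explicitly; the function name and the toy scores are hypothetical.

```python
def roc_points(scores, labels, n_negatives_total=None):
    """For each threshold, return (false positive count, approx. FP rate, detection rate)."""
    n_pos = sum(1 for label in labels if label == 1)
    # Assumption: the divisor is the total number of negative clips evaluated.
    n_neg = n_negatives_total or sum(1 for label in labels if label != 1)
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, label in zip(scores, labels) if s >= thr and label == 1)
        fp = sum(1 for s, label in zip(scores, labels) if s >= thr and label != 1)
        points.append((fp, fp / n_neg, tp / n_pos))
    return points


# Toy scores: two object clips (label 1) and two non-object clips (label 0).
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
points = roc_points(scores, labels)
points_25k = roc_points(scores, labels, n_negatives_total=25000)
```

With the divisor set to 25,000, one false positive corresponds to an approximate negative error rate of 1/25,000.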

Results: Anecdotal: 

- Gunshots
- Female laugh
- Male laugh
- Swords
- Scream