Sound Detection

Added: January 17, 2008
Category: Education. All rights reserved.
Presentation Transcript

Sound Detection: Derek Hoiem; Rahul Sukthankar (mentor). August 24, 2004.


Objective: Learn a model of a sound object from a few (10-20) examples and distinguish it from all other sounds. Examples of sound classes: gunshots, screams, laughter, car horns, meows, dog barks, etc.


Applications: Monitoring (“Tell me if you hear a gunshot.”); search and retrieval (“Get me video clips containing dogs barking.”); scene understanding (“What’s going on?”).


Why it’s difficult: Sound classes have large variations. Sounds are often ambiguous without context. Overlaid “noise” obscures the sound.


Sound or not?: Car horn, laser gun, dog bark. Which of these sounds are not from their named classes?


Previous work: Sound classification (Wold 1996, Casey 2001, etc.) categorizes short sound clips with reasonable accuracy (5-20% error). Sound detection (Dufaux 2000, Piamsa-nga 1999) localizes and recognizes sound objects in long clips, but with poor performance or an assumption of unrealistic conditions (e.g., a very quiet background).


Detection via Windowed Search: Break the long audio track into short overlapping clips. Independently classify each short clip as object or non-object with a clip classifier. Return the locations of the detected sound object.
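
The windowed search can be sketched roughly as follows. This is only an illustration under stated assumptions: the window length, hop size, threshold, and the toy classifier are all hypothetical and do not come from the slides, which do not specify them.

```python
from typing import Callable, List, Sequence, Tuple


def sliding_windows(n_samples: int, win: int, hop: int) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of overlapping windows that fit in the track."""
    if win <= 0 or hop <= 0:
        raise ValueError("win and hop must be positive")
    return [(s, s + win) for s in range(0, n_samples - win + 1, hop)]


def detect(
    track: Sequence[float],
    win: int,
    hop: int,
    classify: Callable[[Sequence[float]], float],
    threshold: float = 0.0,
) -> List[Tuple[int, int, float]]:
    """Score every clip independently and keep those scoring above the threshold."""
    hits = []
    for start, end in sliding_windows(len(track), win, hop):
        score = classify(track[start:end])
        if score > threshold:
            hits.append((start, end, score))
    return hits


def toy_classifier(clip: Sequence[float]) -> float:
    """Hypothetical stand-in for the clip classifier: mean absolute amplitude minus 0.5."""
    return sum(abs(x) for x in clip) / len(clip) - 0.5


# A silent track with a 4-sample burst in the middle.
track = [0.0] * 8 + [1.0] * 4 + [0.0] * 8
hits = detect(track, win=4, hop=2, classify=toy_classifier)
print(hits)  # [(8, 12, 0.5)]
```

Only the window that fully covers the burst clears the threshold here; a real detector would probably also merge adjacent hits into a single reported location.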


Representation: [Figure: raw representation of example sounds; panel labels: meows, phone rings.]


Classification Features: A diverse feature set is used because different sound classes are distinctive in different ways. Features include means and standard deviations of power at different frequencies, bandwidth, peaks, loudness, etc.; 138 features in all.
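
As a rough illustration of the first feature family (mean and standard deviation of power per frequency band), here is a minimal sketch. It assumes the per-band power has already been computed for each time frame; the function name and the input layout are hypothetical, and the slides do not say how the 138 features were actually computed.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def band_power_stats(frames: Sequence[Sequence[float]]) -> List[float]:
    """frames[t][b] is the power in band b at time frame t (assumed precomputed).

    Returns [mean_0, std_0, mean_1, std_1, ...], taken over time for each band.
    """
    if not frames:
        raise ValueError("need at least one frame")
    n_bands = len(frames[0])
    features = []
    for b in range(n_bands):
        column = [frame[b] for frame in frames]
        features.append(fmean(column))   # average power in this band
        features.append(pstdev(column))  # how much that power varies over time
    return features


# Two time frames, two bands: band 0 varies over time, band 1 is constant.
frames = [[1.0, 4.0], [3.0, 4.0]]
features = band_power_stats(frames)
print(features)  # [2.0, 1.0, 4.0, 0.0]
```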


Classification by Decision Trees: Try to find simple rules that discriminate object from non-object. Each decision is based on a threshold on a feature value. Confidence is assigned based on the likelihood of the data for the object and non-object classes at each leaf node. [Figure labels: decision nodes, leaf nodes.]
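
One way to turn leaf likelihoods into a confidence is a smoothed half log-likelihood ratio, sketched below for a single threshold test. The exact formula, the smoothing constant eps, and the function names are assumptions made for illustration; the slides only say that confidence is based on the likelihood of the data for each class at the leaf.

```python
import math
from typing import Sequence, Tuple


def leaf_confidence(n_obj: int, n_non: int, tot_obj: int, tot_non: int,
                    eps: float = 1e-3) -> float:
    """Half log-ratio of P(leaf | object) to P(leaf | non-object), smoothed by eps."""
    p_obj = n_obj / tot_obj
    p_non = n_non / tot_non
    return 0.5 * math.log((p_obj + eps) / (p_non + eps))


def stump_confidences(values: Sequence[float], labels: Sequence[int],
                      threshold: float) -> Tuple[float, float]:
    """Split on value > threshold and return (confidence_low_leaf, confidence_high_leaf).

    labels use 1 for object and 0 for non-object; both classes must be present.
    """
    tot_obj = sum(1 for y in labels if y == 1)
    tot_non = len(labels) - tot_obj
    high_obj = sum(1 for v, y in zip(values, labels) if v > threshold and y == 1)
    high_non = sum(1 for v, y in zip(values, labels) if v > threshold and y != 1)
    conf_high = leaf_confidence(high_obj, high_non, tot_obj, tot_non)
    conf_low = leaf_confidence(tot_obj - high_obj, tot_non - high_non, tot_obj, tot_non)
    return conf_low, conf_high


# Toy feature values: objects (label 1) sit above 0.5, non-objects below.
values = [0.1, 0.2, 0.8, 0.9, 0.7, 0.3]
labels = [0, 0, 1, 1, 1, 0]
conf_low, conf_high = stump_confidences(values, labels, threshold=0.5)
print(conf_low < 0 < conf_high)  # True
```

A positive confidence votes for the object class and a negative one for non-object, with the magnitude reflecting how pure the leaf is.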


Boosted Trees: Problem: one decision tree by itself may not be a great classifier. Solution: use several trees, with each one focusing on the mistakes of previously learned trees. AdaBoost: (1) weight the training data uniformly; (2) learn a decision tree classifier on the weighted data; (3) re-weight the data, giving more weight to incorrectly classified examples; (4) repeat steps 2 and 3 for each additional tree. The final classification is based on a linear combination of the confidences from all learned decision trees.
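
The re-weighting loop can be sketched with the textbook discrete AdaBoost update. To keep the example short it uses one-level trees (stumps) that vote +1 or -1 rather than the confidence-rated trees the slides describe, so it is a simplified stand-in and not the presenter's implementation; all names and the toy data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Stump = Tuple[int, float, int]  # (feature index, threshold, polarity)


def stump_predict(stump: Stump, x: Sequence[float]) -> int:
    feat, thr, pol = stump
    return pol if x[feat] > thr else -pol


def fit_stump(X, y, w) -> Tuple[Stump, float]:
    """Exhaustively pick the stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for feat in range(len(X[0])):
        for thr in sorted({x[feat] for x in X}):
            for pol in (1, -1):
                stump = (feat, thr, pol)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err


def adaboost(X, y, rounds: int) -> List[Tuple[float, Stump]]:
    n = len(X)
    w = [1.0 / n] * n                       # step 1: uniform weights
    model = []
    for _ in range(rounds):
        stump, err = fit_stump(X, y, w)     # step 2: learn on weighted data
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # step 3: misclassified examples gain weight, correct ones lose it
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def score(model, x) -> float:
    """Linear combination of the individual classifiers' votes."""
    return sum(alpha * stump_predict(stump, x) for alpha, stump in model)


# 1-D toy data that no single threshold can separate: + + - - + +
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
predictions = [1 if score(model, x) > 0 else -1 for x in X]
print(predictions == y)  # True
```

On this toy set a single stump misclassifies at least two points, while three boosted stumps classify all six correctly.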


Examples of Decision Trees: [Figure: example trees with leaves labeled Meow and Gunshot. Node labels: low percentage of power in low frequencies in mid-time of sound; very high power amplitude range; high power amplitude range. Caption: a more complex tree that focuses on examples misclassified by the tree above.]


Cascade of Classifiers: Goal: eliminate false positives with few false negatives in early stages. Advantages: allows use of a large set of negative training examples; improves classification speed. Danger: the cascade cannot recover from false negatives. [Diagram: Sound Clip -> Stage 1 -> Stage 2 -> Stage 3 -> Pass, with a Fail exit at each stage; pass labels in order: 5%, 2%, 0.005%.]
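
A cascade of this shape might look like the sketch below: a clip is rejected at the first stage it fails, so later stages only see the survivors. The stage scorers and thresholds here are hypothetical toy functions, not the boosted classifiers from the talk.

```python
from typing import Callable, List, Sequence, Tuple

Stage = Tuple[Callable[[Sequence[float]], float], float]  # (scorer, threshold)


def cascade_accepts(clip: Sequence[float], stages: Sequence[Stage]) -> bool:
    """Accept only if every stage passes; the first failure rejects for good."""
    for scorer, threshold in stages:
        if scorer(clip) < threshold:
            return False  # no later stage can recover this clip
    return True


def cumulative_pass_rates(clips: Sequence[Sequence[float]],
                          stages: Sequence[Stage]) -> List[float]:
    """Fraction of the original clips still alive after each stage."""
    alive = list(clips)
    rates = []
    for scorer, threshold in stages:
        alive = [c for c in alive if scorer(c) >= threshold]
        rates.append(len(alive) / len(clips))
    return rates


# Hypothetical stages: a cheap loudness check, then an amplitude-range check.
stages = [
    (lambda c: max(abs(x) for x in c), 0.5),
    (lambda c: max(c) - min(c), 1.0),
]
clips = [[0.1, 0.2], [0.9, 0.8], [0.9, -0.6], [0.0, 0.0]]
rates = cumulative_pass_rates(clips, stages)
print(rates)                              # [0.5, 0.25]
print(cascade_accepts(clips[2], stages))  # True
```

Because most clips are dropped by the cheap first stage, the more expensive later stages run on only a small fraction of the data, which is where the speed advantage comes from.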


Results: Classification Error


Results: ROC curves: Note: to approximate the negative error rate, divide the false positive (FP) count by 25,000.
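
The conversion in the note is a single division. The sketch below shows it alongside a naive way of producing ROC points with a false positive count on one axis. The helper names and toy scores are hypothetical, tied scores are not handled specially, and only the divisor of 25,000 comes from the slide.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[int, float]]:
    """Lower the threshold one example at a time; return (FP count, detection rate) pairs."""
    n_pos = sum(1 for y in labels if y == 1)
    points = []
    tp = fp = 0
    for _, y in sorted(zip(scores, labels), key=lambda pair: -pair[0]):
        if y == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp, tp / n_pos))
    return points


def fp_count_to_rate(fp_count: int, n_negatives: int = 25000) -> float:
    """Approximate negative error rate, using the divisor given on the slide."""
    return fp_count / n_negatives


points = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
print(points)                # [(0, 0.5), (1, 0.5), (1, 1.0), (2, 1.0)]
print(fp_count_to_rate(50))  # 0.002
```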


Results: Anecdotal: Gunshots, female laugh, male laugh, swords, scream.