ADVANCED VOICE MORPHING

Presentation Transcript

Slide 2:

SPEECH = PITCH + ENVELOPE, IN A CONVOLVED FORM
TOGETHER THESE FORM THE SPEECH IDENTITY OF A PERSON
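The "convolved form" on this slide can be illustrated with a small sketch. It assumes a simple source-filter reading of the slide: the pitch is modelled as an impulse train and the envelope as a short impulse response. The function name, the pitch period and the decaying-cosine envelope are made up for illustration and are not taken from the presentation.

```python
import numpy as np

def synthesize_frame(pitch_period, n_pulses, envelope_ir):
    """Convolve an impulse-train excitation (pitch) with an envelope impulse response."""
    excitation = np.zeros(pitch_period * n_pulses)
    excitation[::pitch_period] = 1.0          # one pulse per pitch period
    speech = np.convolve(excitation, envelope_ir)
    return excitation, speech

# Hypothetical envelope: a decaying resonance, 32 samples long.
k = np.arange(32)
envelope_ir = 0.9 ** k * np.cos(0.3 * np.pi * k)

excitation, speech = synthesize_frame(pitch_period=80, n_pulses=4, envelope_ir=envelope_ir)

# Convolution in time is multiplication in frequency: the speech spectrum
# is the pitch (harmonic) spectrum shaped by the envelope spectrum.
n = len(speech)
spectrum_product = np.fft.rfft(excitation, n) * np.fft.rfft(envelope_ir, n)
assert np.allclose(np.fft.rfft(speech), spectrum_product)
```

Because the two parts multiply in the frequency domain, a morphing system can, in principle, treat the pitch and the envelope separately.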

Slide 3:

PITCH: FUNDAMENTAL FREQUENCY, HARMONICS
ENVELOPE: PHONEME, PHASE, AMPLITUDE, ETC.

Slide 4:

[Figure; axis label: FREQUENCY]

Slide 5:

[Figure: PITCH AND ENVELOPE OF “A” AND “O”]

Slide 6:

YOU (USER) CAN SING IN LADY GAGA’S (TARGET) VOICE
SPEECH IS HANDLED AS FRAMES
A ‘LINEAR TRANSFORM’ IS DONE ON THE USER’S VOICE FRAMES

Slide 7:

THIS NEEDS TRAINING DATA
SPECTRAL VECTORS X (USER) AND Y (TARGET)
X*W = Y, WHERE W IS THE TRANSFORM MATRIX
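As a rough sketch of what fitting W from training data could look like, the snippet below solves X*W = Y by ordinary least squares with NumPy. The data are synthetic and the function name is hypothetical. The rows of X and Y are assumed to be already aligned frame by frame.

```python
import numpy as np

def estimate_transform(X, Y):
    """Least-squares W such that X @ W approximates Y (rows are spectral vectors)."""
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

# Synthetic stand-in for training data: 200 frames of 4-dimensional vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))          # user (source) vectors
W_true = rng.normal(size=(4, 4))       # made-up "true" transform
Y = X @ W_true                         # target vectors

W = estimate_transform(X, Y)
converted = X @ W                      # user frames moved toward the target
```

With noise-free synthetic data the recovered W matches the made-up one. Real spectral vectors would only be matched approximately.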

Slide 8:

NO GLOBAL MATRIX EXISTS
SO CLASSIFY SPEECH SOUNDS INTO CLASSES WITH STATISTICAL CLASSIFIERS (E.G. GMM)
GMM: GAUSSIAN MIXTURE MODEL

Slide 9:

PICKING ONE MATRIX OUT OF N => DISCONTINUITY
VECTORS IN THE OVERLAP OF CLASSES ARE NOT WELL TRANSFORMED
SO USE ALL N MATRICES, WEIGHTED TOWARD THE NEARBY CLASSES
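The weighting idea can be sketched as follows, assuming the weights are the GMM posterior probabilities of each class given the input vector. The diagonal covariances, the toy class parameters and the function names are assumptions made for this example only.

```python
import numpy as np

def class_posteriors(x, means, variances, weights):
    """Posterior probability of each diagonal-Gaussian class given vector x."""
    log_p = (np.log(weights)
             - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
             - 0.5 * np.sum((x - means) ** 2 / variances, axis=1))
    log_p -= log_p.max()               # subtract the max for numerical stability
    p = np.exp(log_p)
    return p / p.sum()

def soft_transform(x, means, variances, weights, matrices):
    """Blend all N class transforms: y = sum_n P(n | x) * (x @ W_n)."""
    post = class_posteriors(x, means, variances, weights)
    return np.einsum("n,nij,i->j", post, matrices, x)

# Two toy classes with transforms W_0 = I and W_1 = 2I.
means = np.array([[-5.0, 0.0], [5.0, 0.0]])
variances = np.ones((2, 2))
weights = np.array([0.5, 0.5])
matrices = np.stack([np.eye(2), 2.0 * np.eye(2)])

y_inside = soft_transform(means[0], means, variances, weights, matrices)
y_overlap = soft_transform(np.array([0.0, 1.0]), means, variances, weights, matrices)
```

A vector deep inside class 0 is transformed almost purely by W_0, while a vector halfway between the classes gets an even blend of both matrices, so the output changes smoothly instead of jumping at the class boundary.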

Slide 10:

ESTIMATING THE TRANSFORM MATRICES: 2 METHODS
1. LEAST SQUARE ERROR (LSE) ESTIMATION
ACCURATE ALIGNMENT OF SOURCE AND TARGET VECTORS IS NEEDED FOR ESTIMATION
DYNAMIC TIME WARPING (DTW) DOES THIS JOB
NEEDS PARALLEL TRAINING DATA
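A minimal sketch of the DTW alignment step is given below. It uses the textbook recurrence with Euclidean frame distance and three allowed steps (diagonal, up, left). The presentation does not say which DTW variant is used, so these choices are assumptions.

```python
import numpy as np

def dtw_align(source, target):
    """Return the minimum-cost alignment path and its total cost."""
    n, m = len(source), len(target)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(source[i - 1] - target[j - 1])
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1, j - 1], i - 1, j - 1),
                 (cost[i - 1, j], i - 1, j),
                 (cost[i, j - 1], i, j - 1)]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return path, float(cost[n, m])

# Toy one-dimensional "spectral vectors": the target holds its first sound longer.
source = np.array([[0.0], [1.0], [2.0]])
target = np.array([[0.0], [0.0], [1.0], [2.0]])
path, total = dtw_align(source, target)
```

Each pair (i, j) in the path says that source frame i lines up with target frame j. Those pairs are what an LSE fit would use as matched rows of X and Y.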

Slide 11:

2. MAXIMUM LIKELIHOOD (ML) ESTIMATION
NO PARALLEL TRAINING DATA NEEDED
SPEECH RECOGNITION WITH ML IS NEEDED
GMM THEN MAKES THE CLASSES AND THE MATRICES ARE ESTIMATED

Slide 12:

SYSTEM ENHANCEMENT:
PHASE PREDICTION
SPECTRAL REFINEMENT
TRANSFORMING UNVOICED SOUNDS

Slide 13:

REAL-TIME VOICE MORPHING
SINGING A MELODY IN LADY GAGA’S (TARGET) VOICE
NEEDS HER PRERECORDED SONG
NOW THE USER SINGS A SONG

Slide 14:

VOICE MORPHING SYSTEM
[Figure: SYSTEM BLOCK DIAGRAM]

Slide 15:

PHASE PREDICTION SYSTEM
RECOGNIZES THE PHONEMES AND NOTES OF THE USER
SEARCHES FOR THE SAME SOUND IN THE TARGET
THEN INTERPOLATES THE SELECTED VOICES

Slide 16:

SMS: SPECTRAL MODELING SYNTHESIS
RECORDING IS DIVIDED INTO MORPHING UNITS
EACH UNIT IS GIVEN NOTE AND PHONETIC INFORMATION
USER’S SOUNDS ARE MAPPED ONTO THE TARGET’S, IN PHASE

Slide 17:

HMM (HIDDEN MARKOV MODEL) FOR RECOGNITION
USER IS GIVEN CONTROL OVER PARAMETER INTERPOLATION
UNVOICED PHONEMES ARE LEFT UNTOUCHED

Slide 18:

VOICE ANALYSIS/SYNTHESIS USING SMS
SMS => FREQUENCY AND AMPLITUDE VALUES CHARACTERISING IDENTITY
IT ADAPTS AS IT RUNS
ALSO FINDS SPECTRAL TILT, FUNDAMENTAL FREQUENCY, HARMONICITY
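To give a feel for one of the quantities listed here, the sketch below estimates the fundamental frequency of a frame. It uses a plain autocorrelation peak search, which is a swapped-in textbook method and not how SMS itself works. The search range, sample rate and test tone are assumptions.

```python
import numpy as np

def estimate_f0(frame, sample_rate, f_min=50.0, f_max=500.0):
    """Estimate the fundamental frequency from the strongest autocorrelation lag."""
    frame = frame - frame.mean()
    # Keep non-negative lags only: acf[k] = sum_n frame[n] * frame[n + k].
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / lag

# Test tone: a 200 Hz fundamental plus one harmonic at 400 Hz.
sample_rate = 8000
t = np.arange(1024) / sample_rate
frame = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)
f0 = estimate_f0(frame, sample_rate)
```

The resolution of this estimator is limited to whole-sample lags, so it is only a rough stand-in for the analysis a real system would run.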

Slide 19:

PHONETIC RECOGNITION/ALIGNMENT
Fig 4.2 Recognition and matching of morphable units.

Slide 20:

MORPHING
USER CAN INTERPOLATE AMPLITUDE, FUNDAMENTAL FREQUENCY, SPECTRAL SHAPE, ETC.
WHEN PHONEMES ARE DIFFERENT IN LENGTH, DO SKIP OR LOOP
RESIDUES LEFT UNTOUCHED ARE ADDED AT THE END, AND THEN THE JOB IS DONE.
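The interpolation and the skip/loop step might look roughly like this. The sketch assumes "loop" means repeating target frames when the target unit is shorter than the user's and "skip" means dropping frames when it is longer. The function names and the interpolation weight `alpha` are hypothetical.

```python
import numpy as np

def match_length(frames, n_out):
    """Pick n_out frames spread over the unit: repeats (loop) if short, skips if long."""
    idx = np.floor(np.arange(n_out) * len(frames) / n_out).astype(int)
    return frames[idx]

def morph(user, target, alpha):
    """Linear interpolation of a per-frame parameter: alpha=0 is the user, alpha=1 the target."""
    target_fit = match_length(target, len(user))
    return (1.0 - alpha) * user + alpha * target_fit

# Fundamental frequency (Hz) per frame for one phoneme.
user_f0 = np.array([100.0, 100.0, 100.0, 100.0])
target_f0 = np.array([200.0, 220.0])               # shorter unit, so frames are looped

looped = match_length(target_f0, len(user_f0))     # frames 0, 0, 1, 1
skipped = match_length(np.arange(8.0), 4)          # frames 0, 2, 4, 6
morphed_f0 = morph(user_f0, target_f0, alpha=0.5)
```

The same `morph` call could be applied to amplitude or spectral-shape parameters, each with its own user-controlled `alpha`.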

Slide 21:

REFERENCES:
- HIGH QUALITY VOICE MORPHING, BY HUI YE AND STEVE YOUNG
- WWW.WIKIPEDIA.ORG

Slide 22:

THANK YOU