ADVANCED VOICE MORPHING
Uploaded by: aSGuest91291. Added: March 24, 2011. Category: Science & Tech. License: All Rights Reserved.

Presentation Transcript

Slide 1: Advanced voice morphing.
Slide 2: Speech = pitch + envelope, in a convolved form. Speech identity of a person.
Slide 3: Pitch: fundamental frequency, harmonics. Envelope: phoneme, phase, amplitude, etc.
Slide 4: Frequency.
Slide 5: “A”, “O”: pitch and envelope of “A” and “O”.
Slide 6: You (the user) can sing in Lady Gaga's (the target's) voice. Speech frames: a 'linear transform' is done on the user's voice frames.
Slide 7: This needs training data: spectral vectors X (user) and Y (target). X*W = Y, where W is the transform matrix.
Slide 8: No global matrix exists, so classify speech sounds into classes using statistical classifiers (e.g. GMM). GMM: Gaussian mixture model.
Slide 9: Picking one matrix out of N causes discontinuities; vectors in the overlap between classes are not transformed well. So use all N matrices, weighted toward the nearby classes.
Slide 10: Estimating the transform matrices: 2 methods. 1. Least square error (LSE) estimation. Accurate alignment of source and target vectors is needed for estimation; dynamic time warping (DTW) does this job. Needs parallel training data.
Slide 11: 2. Maximum likelihood (ML) estimation. No parallel training data. Speech recognition with ML is needed. The GMM then forms the classes and the matrices are estimated.
Slide 12: System enhancement: phase prediction, spectral refinement, transforming unvoiced sounds.
Slide 13: Real-time voice morphing: singing a melody in Lady Gaga's (the target's) voice. Needs her prerecorded song. Now the user sings a song.
Slide 14: Voice morphing system: system block diagram.
Slide 15: Phase prediction. The system recognizes the phonemes and notes of the user, searches for the same sound in the target, then interpolates the selected voices.
Slide 16: SMS: Spectral Modeling Synthesis. The recording is divided into morphing units. Each unit is given note and phonetic information. The user's sounds are mapped onto the target's in phase.
Slide 17: HMM (hidden Markov model) for recognition. The user is given control over parameter interpolation. Unvoiced phonemes are left untouched.
Slide 18: Voice analysis/synthesis using SMS. SMS => frequency and amplitude values characterising identity. It adapts as it runs. Also finds spectral tilt, fundamental frequency, harmonicity.
Slide 19: Phonetic recognition/alignment. Fig 4.2: Recognition and matching of morphable units.
Slide 20: Morphing. The user can interpolate amplitude, fundamental frequency, spectral shape, etc. When phonemes differ in length, skip or loop. Residues left untouched are added at the end, and then the job is done.
Slide 21: References: - High Quality Voice Morphing, by Hui Ye and Steve Young - www.wikipedia.org
Slide 22: Thank you.
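The idea on Slides 7 to 9 (one transform matrix per class, blended by how close a frame is to each class) can be sketched roughly as below. This is only an illustration under assumptions, not the presenters' implementation: the dictionary layout, the function names, the diagonal-covariance Gaussians, and the toy matrices are all hypothetical, and a real system would estimate the class parameters and the matrices W from training data.

```python
import math

def gaussian_pdf(x, mean, var):
    """Diagonal-covariance Gaussian density of vector x (assumed class model)."""
    exponent = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    norm = math.sqrt(math.prod(2.0 * math.pi * vi for vi in var))
    return math.exp(-0.5 * exponent) / norm

def class_posteriors(x, classes):
    """Probability that x belongs to each class, given the mixture."""
    likes = [c["prior"] * gaussian_pdf(x, c["mean"], c["var"]) for c in classes]
    total = sum(likes)
    return [like / total for like in likes]

def vec_mat(x, W):
    """Row vector times matrix, i.e. x * W as written on Slide 7."""
    return [sum(x[k] * W[k][j] for k in range(len(x))) for j in range(len(W[0]))]

def convert(x, classes):
    """Blend all N class transforms, each weighted by its posterior."""
    y = [0.0] * len(classes[0]["W"][0])
    for p, c in zip(class_posteriors(x, classes), classes):
        y = [acc + p * yi for acc, yi in zip(y, vec_mat(x, c["W"]))]
    return y

# Toy two-class setup (hypothetical numbers, not trained values).
classes = [
    {"prior": 0.5, "mean": [0.0, 0.0], "var": [1.0, 1.0],
     "W": [[1.0, 0.0], [0.0, 1.0]]},
    {"prior": 0.5, "mean": [10.0, 10.0], "var": [1.0, 1.0],
     "W": [[2.0, 0.0], [0.0, 2.0]]},
]

near_first = convert([1.0, 0.0], classes)   # dominated by the first class
in_overlap = convert([5.0, 5.0], classes)   # halfway: equal mix of both
```

A frame close to one class is converted almost entirely by that class's matrix, while a frame in the overlap gets a mixture, which is what avoids the discontinuity of switching abruptly between matrices.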
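Slide 10 says that LSE estimation needs the source and target vectors aligned, and that dynamic time warping (DTW) does this. A minimal textbook DTW over two sequences of feature vectors might look like the following. The function name, the Euclidean frame distance, and the toy sequences are assumptions for illustration; the slides do not specify these details.

```python
def dtw_align(source, target):
    """Return (total_cost, path) for a basic DTW alignment.

    path is a list of (source_index, target_index) pairs.
    """
    def dist(a, b):
        # Euclidean distance between two frames (an assumed choice).
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

    n, m = len(source), len(target)
    inf = float("inf")
    # cost[i][j] = best cost aligning the first i source and j target frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(source[i - 1], target[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat target frame
                                 cost[i][j - 1],      # repeat source frame
                                 cost[i - 1][j - 1])  # advance both

    # Walk back from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path

# Toy example: the target holds its middle frame one step longer.
user_frames = [[0.0], [1.0], [2.0]]
target_frames = [[0.0], [1.0], [1.0], [2.0]]
total_cost, pairs = dtw_align(user_frames, target_frames)
```

The resulting index pairs give the matched (X, Y) frames that a least-squares fit of X*W = Y would then be run on.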