

Cross-Language Speech Dependent Lip-Synchronization

Abhishek Jha 1, Vikram Voleti 2, Vinay Namboodiri 3, C. V. Jawahar 1

1 CVIT, KCIS, IIIT Hyderabad, India; 2 Mila, Université de Montréal, Canada; 3 Department of Computer Science, IIT Kanpur, India

Motivation

Goal: Given a video in a foreign language and dubbed speech in a regional language, synchronize the lips in the video to match the dubbed audio.

❖ Exponential growth in internet users in the last few years.
❖ A majority of these users are college/school-goers and young adults.
❖ A diverse demographic of students from all over the world enrolls in MOOCs.

Challenges: audio-to-lip-fiducial generation, cross-language lip synchronization, cross-identity lip synchronization.

Email: abhishek.jha@research.iiit.ac.in, vikram.voleti@gmail.com

Project page: preon.iiit.ac.in/~abhishek_jha/lipsync_inst_vid

Datasets

● Hindi Speech corpus: 2.5 hours of audio-visual speech

● GRID Corpus: 33 speakers, 1,000 phrases each (51-word vocabulary)

● Andrew Ng ML videos: 16000 image frames extracted from deeplearning.ai MOOC videos.

● Telugu movies: scenes with the protagonist's face, extracted from Telugu movies

Contributions

❖ Developed a cross-language lip-synchronization model for dubbed speech videos (e.g., Hindi dubbing for English videos).

❖ Developed an automated pipeline to curate a dataset to train the proposed model.

❖ Conducted a user-based study, which shows that learners prefer lip-synced dubbed videos.

Approach

❖ Audio -> lip landmarks: train a time-delayed (TD) LSTM on the GRID corpus and the Hindi speech corpus.
❖ Intermediate processing: search for the best face frame to combine with the generated lip landmarks.

❖ Face + lip landmarks -> new face: train a U-Net on videos from the movie dataset.

❖ Inference (sketched below):
➢ Generate lip landmarks from the dubbed audio using the TD-LSTM.
➢ Generate new frames by modifying the original Andrew Ng video frames using the U-Net.
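A minimal Python sketch of this two-stage inference. The callables td_lstm, unet, extract_mfcc, and select_frame are hypothetical stand-ins for the trained models and preprocessing; none of these names come from the poster.

```python
import numpy as np

def lip_sync(dubbed_audio, video_frames, td_lstm, unet,
             extract_mfcc, select_frame):
    """Two-stage inference: dubbed audio -> lip landmarks -> new frames.

    td_lstm, unet, extract_mfcc and select_frame are hypothetical
    callables standing in for the trained models and preprocessing.
    """
    # Stage 1: MFCC features -> one 40-D lip-landmark vector per frame.
    mfcc = extract_mfcc(dubbed_audio)          # e.g. shape (T, 100, 12)
    landmarks = td_lstm(mfcc)                  # shape (T, 40)

    out = []
    for lm in landmarks:
        # Intermediate processing: pick the best-matching face frame.
        frame = select_frame(video_frames, lm)
        # Stage 2: the U-Net repaints the lip region to match lm.
        out.append(unet(frame, lm))
    return np.stack(out)
```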

User-based study

● C: Comfort level
● US: Un-synced
● S: Lip-synced
● LS%: Lip-sync percentage

[Figure: example frames from the four datasets: Telugu Movies, Andrew Ng, GRID Corpus, Hindi Speech]

U-Net results

[Figure: U-Net results: masks and recomputed ROI frames produced by the U-Net]
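As context for these results, here is a minimal U-Net-style encoder-decoder in PyTorch. This is an illustrative sketch only: the paper's actual depth, channel widths, and how the lip landmarks are encoded into the input are not specified on the poster.

```python
import torch
import torch.nn as nn

def block(cin, cout):
    # Two 3x3 convolutions with ReLU: the standard U-Net building block.
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    """One-level U-Net: encode, bottleneck, decode with a skip connection."""
    def __init__(self, in_ch=3, out_ch=3):
        super().__init__()
        self.enc = block(in_ch, 32)
        self.down = nn.MaxPool2d(2)
        self.mid = block(32, 64)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = block(64, 32)           # 64 = 32 (skip) + 32 (upsampled)
        self.out = nn.Conv2d(32, out_ch, 1)

    def forward(self, x):
        e = self.enc(x)                    # skip-connection features
        m = self.mid(self.down(e))
        d = self.dec(torch.cat([self.up(m), e], dim=1))
        return torch.sigmoid(self.out(d))  # pixel intensities in [0, 1]
```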

[Figure: generation using U-Net: input video clip, landmark augmentation, lip generation]

[Figure: time-delayed LSTM: inputs are 100 x 12-D MFCC features over timesteps t1 … tn; outputs, emitted after a time delay, are 40-D lip landmarks]
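A sketch of a time-delayed LSTM regressor in PyTorch, matching the 12-D MFCC inputs and 40-D landmark outputs in the figure. The hidden size and the way the delay is realized (padding the input so each output can see a few future audio frames) are assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn as nn

class TDLSTM(nn.Module):
    """LSTM that regresses 40-D lip landmarks from 12-D MFCC frames,
    emitting outputs with a fixed time delay so the network can look
    ahead before committing to a mouth shape."""
    def __init__(self, mfcc_dim=12, landmark_dim=40, hidden=128, delay=5):
        super().__init__()
        self.delay = delay
        self.lstm = nn.LSTM(mfcc_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, landmark_dim)

    def forward(self, mfcc):                    # mfcc: (B, T, 12)
        # Pad the input so the output aligned with frame t has already
        # consumed audio up to frame t + delay.
        pad = mfcc.new_zeros(mfcc.size(0), self.delay, mfcc.size(2))
        h, _ = self.lstm(torch.cat([mfcc, pad], dim=1))
        return self.head(h[:, self.delay:])     # (B, T, 40)
```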

Intermediate processing

[Figure: all 40-D lip landmarks (ℝ⁴⁰) generated for the dubbed audio are grouped by clustering (6 clusters); each landmark's cluster assignment drives frame selection]
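A sketch of this step using scikit-learn's KMeans. The 6-cluster count comes from the figure; the nearest-within-cluster matching rule is an assumption about how frame selection could work.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_frames(pred_landmarks, cand_landmarks, n_clusters=6):
    """For each predicted 40-D landmark vector, return the index of the
    candidate frame whose landmarks fall in the same cluster and lie
    closest to the prediction."""
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(pred_landmarks)
    cand_cluster = km.predict(cand_landmarks)
    picks = []
    for lm, c in zip(pred_landmarks, km.predict(pred_landmarks)):
        idx = np.where(cand_cluster == c)[0]
        if idx.size == 0:                  # no candidate in this cluster:
            idx = np.arange(len(cand_landmarks))  # fall back to all frames
        d = np.linalg.norm(cand_landmarks[idx] - lm, axis=1)
        picks.append(idx[d.argmin()])
    return picks
```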

Dataset curation - automated pipeline

English videos pipeline: frame extraction -> face detection -> face recognition (against the target person's image) -> landmark extraction -> lip polygon -> ROI masking -> merge (a sketch of the recognition filter follows below).
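A sketch of the per-frame filtering steps using the face_recognition library. The library choice is an assumption; the poster does not name the detectors used, and a real pipeline would also need the lip-polygon masking and merge steps.

```python
import face_recognition

def keep_target_frames(frames, target_image):
    """Keep only frames whose detected face matches the target person,
    returning each kept frame with its lip polygon points."""
    # Assumes target_image contains exactly one clearly visible face.
    target_enc = face_recognition.face_encodings(target_image)[0]
    kept = []
    for frame in frames:
        encs = face_recognition.face_encodings(frame)
        if encs and face_recognition.compare_faces([target_enc], encs[0])[0]:
            lms = face_recognition.face_landmarks(frame)
            lips = lms[0]["top_lip"] + lms[0]["bottom_lip"]  # lip polygon
            kept.append((frame, lips))
    return kept
```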

Hindi videos pipeline: Hindi video -> voice activity detection -> video split; audio track -> MFCC extraction; video frames -> face detection -> landmark extraction -> normalization (an MFCC sketch follows below).
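The MFCC extraction step could look like the following with librosa. The 12 coefficients match the figure; the sample rate and window/hop sizes are assumptions, since the poster does not state them.

```python
import librosa

def mfcc_features(wav_path, n_mfcc=12):
    """Load speech audio and compute 12-D MFCC frames, matching the
    12-D features the TD-LSTM consumes."""
    y, sr = librosa.load(wav_path, sr=16000)   # assumed sample rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc,
                                n_fft=400, hop_length=160)  # 25 ms / 10 ms
    return mfcc.T                              # shape: (num_frames, 12)
```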

[Figure: outputs of the English and Hindi videos pipelines: audio and video streams, U-Net input, ground truth, TD-LSTM inputs, 40-D lip landmarks]

Cross-language results

[Figure: qualitative cross-language results: identity image, speech video, dubbing video]

Results (user-based study):

             C - US    C - S    LS% - US    LS% - S
Mean         2.51      3.1      23.86       45.95
Std. dev.    1.07      0.6      25.9        24.1

1. Audio -> lip landmarks
2. Video frame + new lip landmarks -> new video frame