Collection of speech production ultrasound data

Donald Derrick (1,2), Romain Fiasson (2) and Catherine T. Best (1)

(1) University of Western Sydney (MARCS Institute)
(2) University of Canterbury (NZILBB)

Introduction
• Ultrasound Imaging
  – Uses high-frequency sound waves to image density changes in soft tissue
    • Ideal for imaging the surface of the tongue
    • But cannot penetrate air or bone boundaries
      – Can miss the tongue tip and/or root
      – Palate trace difficult
      – No pharyngeal wall recording
  – Provides noisy medical-grade images
    • Intended for diagnosis
    • Often difficult to measure directly

Ultrasound - edges
• Ultrasound can miss tongue tip/root
• Pick the right probe and placement
  – Narrow
  – Curved-array
  – Away from bone
    • Forward of notch for root
    • Adjacent to notch for tongue tip

Derrick and Fiasson (In Prep)

Ultrasound – frame rate
• Ultrasound frame rate is a function of:
  – Ultrasound CPU
  – Image smoothing
  – Image lines
  – Image angle
• Combined with video capture methods
  – Trade-offs among portability, A/V sync, and fps
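
The dependence on image lines can be made concrete with a back-of-envelope bound: each scan line needs one round trip of sound to the imaging depth, so the frame rate cannot exceed c / (2 × depth × lines). The sketch below assumes a soft-tissue sound speed of 1540 m/s, one pulse per line, and no processing overhead; the function name and the example settings are illustrative and do not describe any particular machine.

```python
SPEED_OF_SOUND_M_S = 1540.0  # conventional soft-tissue value (assumption)


def max_frame_rate(depth_m: float, scan_lines: int,
                   speed_m_s: float = SPEED_OF_SOUND_M_S) -> float:
    """Upper bound on frames per second, assuming one pulse per scan line."""
    if depth_m <= 0 or scan_lines <= 0:
        raise ValueError("depth_m and scan_lines must be positive")
    round_trip_s = 2.0 * depth_m / speed_m_s  # time to acquire one line
    return 1.0 / (round_trip_s * scan_lines)


# Hypothetical settings: 8 cm depth with 128 lines versus 64 lines.
for lines in (128, 64):
    print(lines, "lines:", round(max_frame_rate(0.08, lines), 1), "fps at most")
```

Halving the number of lines doubles the bound, which is one reason a narrower image angle or sparser lines can buy frame rate. Real machines typically fall below this bound once smoothing and processing are included.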

Ultrasound – video capture
• Ultrasound speed
  – Video capture
    • SD 24/30 fps interlaced
    • HD 48/60 fps interlaced
    • A/V sync
  – Cineloop full speed
    • Short duration (8-16 seconds)
    • External A/V sync only
  – Frame grabber 60 fps progressive
    • Drops frames
    • A/V not synced
  – Semi-raw capture best
    • No longer supported by anyone
  – B/M Progressive scan
    • Full speed m-mode, 1D lines

Gick, Wilson, and Derrick (2013)
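
Because frame grabbers can drop frames, it is worth checking a recording before analysis. The following is a minimal sketch, assuming per-frame timestamps (in seconds) are available from the capture software; `find_dropped_frames` and the 1.5× tolerance are hypothetical choices made for illustration.

```python
def find_dropped_frames(timestamps, nominal_fps=60.0, tolerance=1.5):
    """Return (index, n_missing) for each gap longer than tolerance * frame period.

    index is the position of the first frame after the gap.
    """
    period = 1.0 / nominal_fps
    gaps = []
    for i in range(1, len(timestamps)):
        delta = timestamps[i] - timestamps[i - 1]
        if delta > tolerance * period:
            # Estimate how many frames should have filled the gap.
            n_missing = round(delta / period) - 1
            gaps.append((i, n_missing))
    return gaps


# Synthetic example: a 60 fps stream in which frames 3 and 4 never arrived.
stamps = [k / 60.0 for k in (0, 1, 2, 5, 6, 7)]
print(find_dropped_frames(stamps))
```

A check like this only finds the gaps; realigning audio to the surviving frames is a separate step.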

Miller and Finch (2011)

Derrick and Fiasson (In Prep)

Ultrasound – video capture
• QuickTime, Final Cut Pro, Adobe Premiere
  – Don’t work with all frame grabbers
  – Interfere with frame rate/quality
    • Oh g-d, the pain, the PAIN!
• FFMPEG
  – 58-60 FPS
    • With SSD, 64-bit computer, and x264
  – Requires UNIX command-line skills
  – Post-processing synchronization based on transient bursts (‘tatata’)
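
One way burst-based synchronization could work is sketched below; the slides do not give the actual procedure, so this is an assumption. It presumes two signals have already been extracted: an audio amplitude envelope and a per-frame image-change measure from the video. Each is thresholded for onsets, and the offset is the median lag between paired onsets. The function names, thresholds, and synthetic data are hypothetical.

```python
from statistics import median


def onset_times(values, rate_hz, threshold, min_gap_s=0.1):
    """Times (s) where values rise above threshold, ignoring re-triggers
    that fall within min_gap_s of the previous onset."""
    onsets = []
    above = False
    for i, v in enumerate(values):
        t = i / rate_hz
        if v >= threshold and not above:
            if not onsets or t - onsets[-1] >= min_gap_s:
                onsets.append(t)
        above = v >= threshold
    return onsets


def estimate_offset(audio_onsets, video_onsets):
    """Median lag (s) of video behind audio, pairing onsets in order."""
    if len(audio_onsets) != len(video_onsets) or not audio_onsets:
        raise ValueError("need the same non-zero number of onsets in both streams")
    return median(v - a for a, v in zip(audio_onsets, video_onsets))


# Synthetic 'tatata': three bursts in a 1 kHz audio envelope, and the same
# three events in a 60 fps frame-difference signal delayed by 3 frames.
audio = [0.0] * 1000
for start in (200, 500, 800):
    for i in range(start, start + 20):
        audio[i] = 1.0
video = [0.0] * 60
for frame in (15, 33, 51):  # events at frames 12, 30, 48 plus a 3-frame delay
    video[frame] = 1.0

lag = estimate_offset(onset_times(audio, 1000, 0.5), onset_times(video, 60, 0.5))
print(f"video lags audio by about {lag * 1000:.0f} ms")
```

Using the median rather than the mean keeps one badly detected burst from skewing the estimate. The resolution of the result is still limited by the video frame period.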

Ultrasound – video capture
  – Semi-raw capture (best)
    • Only EchoB supports now

http://www.articulateinstruments.com/ultrasound-products/

Ultrasound – head stabilization
• Hand-held
  – Easy
  – Useful in field/with children
• Head rest
  – Reduces motion to μ 1mm
  – Moves with jaw
• Metal head mounting
  – Effective
  – Negates jaw motion
• Non-metal mounting
  – Effective
  – Moves with jaw

Stone (2005)

Gick (2002)

Gick, Bird, and Wilson (2005)

http://www.articulateinstruments.com/ultrasound-products/

Derrick and Fiasson (In Prep)

Ultrasound - Measurements
• Diagnostic
  – Easier, less data storage
  – Must be defined carefully
• Direct measurement
  – Slow, tedious
  – Richer, more useful

Derrick and Gick (2012)
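
To illustrate the contrast: a direct measurement keeps the traced contour and derives values from it, whereas a diagnostic reduces each token to a category defined in advance. The sketch below assumes a contour is a list of (x, y) points in millimetres with y increasing upward. The function names, the threshold rule, and the sample points are hypothetical and are not the measures used in Derrick and Gick (2012).

```python
def highest_point(contour):
    """Direct measurement: the (x, y) point with the greatest height."""
    return max(contour, key=lambda p: p[1])


def classify_by_peak(contour, boundary_x):
    """Diagnostic: label a contour by which side of boundary_x its peak falls on."""
    x, _ = highest_point(contour)
    return "front" if x < boundary_x else "back"


# Hypothetical traced contour, with the front of the mouth at small x.
contour = [(10.0, 5.0), (20.0, 12.0), (30.0, 18.5), (40.0, 15.0), (50.0, 8.0)]
print(highest_point(contour))
print(classify_by_peak(contour, boundary_x=35.0))
```

The diagnostic stores one label per token, but its result depends entirely on where `boundary_x` is placed, which is why such categories must be defined carefully.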

Tiede’s “GetContours”

Discussion
• Ultrasound provides tongue shape and dynamic information
  – Can do so at high temporal and spatial resolution
• Head stabilization has tradeoffs
  – Free jaw motion invalidates palate measurements
  – Restrained jaw motion restricts speech unnaturally
• Ultrasound can be used for diagnostic or direct-measurement analysis
  – Diagnostic is fast but uses little of the data
  – Direct-measurement is slow but uses more data

References
• Derrick, D. and Fiasson, R. (In Prep). Co-collection and co-registration of speech production ultrasound and articulometry data.

• Derrick, D. and Gick, B. (2012). Speech rate influences categorical variation of English flaps and taps during normal speech. Journal of the Acoustical Society of America. 131(4):3345.

• Gick, B., Wilson, I. and Derrick, D. (2013). Articulatory Phonetics. Wiley-Blackwell.

• Gick, B., Bird, S., and Wilson, I. (2005). Techniques for field application of lingual ultrasound imaging. Clinical Linguistics and Phonetics. 19(6/7):503-514.

• Gick, B. (2002). The use of ultrasound for linguistic phonetic fieldwork. Journal of the International Phonetic Association. 32(2):113-121.

• Miller, A. and Finch, K. (2011). Corrected High-Frame Rate Anchored Ultrasound With Software Alignment. Journal of Speech, Language, and Hearing Research. 54:471-486.

• Stone, M. (2005). A Guide to Analysing Tongue Motion from Ultrasound Images. Clinical Linguistics and Phonetics. 19(6/7):455-501.

• Tiede, M. (2010). MVIEW: Multi-channel visualization application for displaying dynamic sensor movements. Development

• Tiede, M. (2005). MVIEW: software for visualization and analysis of concurrently recorded movement data.