Stochastic Analysis of Natural Conversation Corpora, with automatic detection of speech details

Emilien Gorène


Page 1: Stochastic Analysis of Natural Conversation Corpora, with automatic detection of speech details

Stochastic Analysis of Natural Conversation Corpora, with automatic detection of speech details

Emilien Gorène

Page 2: Stochastic Analysis of Natural Conversation Corpora, with automatic detection of speech details

Automatic processing for pathological speech: The case of schizophrenia

Under the direction of Maud Champagne-Lavau and Laurent Prévot.

• Recording subject/experimenter conversations
• Transcribing precisely with Praat
• Aligning the transcriptions on the signal with SPPAS

Part-of-speech tagging, and a graphical interface to visualize the transcriptions (source: lpl-tal-ptools, S. Rauzy)

Preprocessing:

• 22 volunteers: 6 hours
• 488 transcriptions
• 35,000 tokens

• 2 populations: healthy subjects (controls) and schizophrenic subjects (patients)

Page 3: Stochastic Analysis of Natural Conversation Corpora, with automatic detection of speech details

According to the literature, the language of schizophrenic subjects differs at multiple levels...

2 different tasks: describing comics and describing a single picture

18 points are automatically detected and measured.

Automatic processing for pathological speech: The case of schizophrenia

o Duration
o Number of silences
o Number of tokens
o Variety of tokens
o Fluency
o Lexical richness
o Number of verbs
o Number of action verbs
o Number of mind verbs
o Possessive pronouns
o Definite pronouns
o …
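As an illustration only, a few of the indicators listed above could be computed from a tokenized, part-of-speech-tagged transcription roughly as follows. This is a minimal sketch, not the tooling used in the study: the function name `lexical_indicators`, the tag label "VERB", and the use of the type/token ratio as a stand-in for lexical richness are all assumptions.

```python
from collections import Counter


def lexical_indicators(tokens, pos_tags):
    """Compute a few simple indicators for one transcription.

    tokens   -- list of word tokens (strings)
    pos_tags -- list of part-of-speech labels, same length as tokens
    """
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and pos_tags must have the same length")

    lowered = [t.lower() for t in tokens]
    n_tokens = len(lowered)        # number of tokens
    n_types = len(set(lowered))    # variety of tokens (distinct forms)
    tag_counts = Counter(pos_tags)

    return {
        "n_tokens": n_tokens,
        "n_types": n_types,
        # Assumption: lexical richness approximated by the type/token ratio.
        "type_token_ratio": n_types / n_tokens if n_tokens else 0.0,
        # Assumption: verbs carry the hypothetical label "VERB".
        "n_verbs": tag_counts["VERB"],
    }


# Made-up example sentence, not taken from the corpus.
example = lexical_indicators(
    ["The", "man", "sees", "the", "dog"],
    ["DET", "NOUN", "VERB", "DET", "NOUN"],
)
print(example)
```

Finer categories such as action verbs or mind verbs would need a lexicon on top of the tags, which is not sketched here.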

Page 4: Stochastic Analysis of Natural Conversation Corpora, with automatic detection of speech details

Using semi-automatic detection on a pathological corpus is justified.

A single recording of a specific task may be enough to predict a psychiatric pathology.

Overall, we find the same results on the 2 corpora, but some indicators point in opposite directions.

Inter-subject and inter-situation variations are substantial.

Automatic processing for pathological speech: The case of schizophrenia

What can we learn?

Page 5: Stochastic Analysis of Natural Conversation Corpora, with automatic detection of speech details

Exploiting the variation in conversational feedback to characterize the nature and quality of language interactions

How can we analyze linguistic and cyclic phenomena evolving through time?

Can we use tools from the biological sciences on language? Is Silence/Speech/Feedback a good categorization for studying convergence, or else what?

Maptask-Aix (http://sldr.org/wiki/sldr000732): 3h30 of recordings, 300+ files…

Two speakers interact to retrace the correct route on a map. One is the Giver, the other the Follower.

Categorization into 3 states: speech, silence, feedback, to create unambiguous categories.

Under the direction of Noël Nguyen and Laurent Prévot.
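To make the three-state categorization concrete, the sketch below turns time-aligned, labelled intervals into a regularly sampled sequence of states for one speaker. It is an assumption-laden illustration: the function name `intervals_to_states`, the 0.5 s sampling step, and the rule that uncovered frames count as silence are not taken from the slides.

```python
def intervals_to_states(intervals, step, total_duration):
    """Sample labelled intervals on a regular time grid.

    intervals      -- list of (start, end, label) tuples, times in seconds
    step           -- sampling step in seconds
    total_duration -- length of the recording in seconds

    Frames not covered by any interval are labelled "silence" (assumption).
    """
    n_frames = int(round(total_duration / step))
    states = ["silence"] * n_frames
    for start, end, label in intervals:
        first = int(round(start / step))
        last = min(int(round(end / step)), n_frames)
        for i in range(first, last):
            states[i] = label
    return states


# Made-up intervals for one speaker, not taken from Maptask-Aix.
giver = intervals_to_states(
    [(0.0, 1.0, "speech"), (1.5, 2.0, "feedback")],
    step=0.5,
    total_duration=3.0,
)
print(giver)
```

Each speaker of a pair would get one such sequence, and the two sequences can then be compared frame by frame.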

Page 6: Stochastic Analysis of Natural Conversation Corpora, with automatic detection of speech details

After normalization, we plot the degree of similarity between the two speakers for a range of time delays.

Software: R with the crqa package

All pairs show similar plots: similarity is maximal at time delay 0, the default position.
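The study itself used R; purely as an illustration of the idea of similarity as a function of time delay, here is a small Python sketch of a diagonal cross-recurrence profile for two categorical state sequences. It does not reproduce the R package's implementation: the function name `recurrence_profile`, the definition of similarity as the share of matching frames, and the toy data are assumptions.

```python
def recurrence_profile(a, b, max_lag):
    """Return {lag: recurrence rate} for lags in [-max_lag, max_lag].

    a, b -- equal-length sequences of categorical states
    At a given lag, the rate is the share of frames t with a[t] == b[t + lag],
    counted over the frames where both indices are valid.
    """
    if len(a) != len(b):
        raise ValueError("both sequences must have the same length")
    n = len(a)
    profile = {}
    for lag in range(-max_lag, max_lag + 1):
        pairs = [(a[t], b[t + lag]) for t in range(n) if 0 <= t + lag < n]
        matches = sum(1 for x, y in pairs if x == y)
        profile[lag] = matches / len(pairs) if pairs else 0.0
    return profile


# Toy data: the second sequence repeats the first one frame later.
first = ["speech", "speech", "silence", "feedback", "silence", "silence"]
second = ["silence"] + first[:-1]

profile = recurrence_profile(first, second, max_lag=2)
best_lag = max(profile, key=profile.get)
print(profile, best_lag)
```

In this made-up example the peak sits at the lag by which the second sequence was shifted; a peak at lag 0, as reported for all pairs above, would mean the two speakers' states line up best without any delay.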

Exploiting the variation in conversational feedback to characterize the nature and quality of language interactions

Page 7: Stochastic Analysis of Natural Conversation Corpora, with automatic detection of speech details

Cross-recurrence tools are useful for language sciences too.

We can now study cyclic phenomena and their evolution through time.

The three categorized states have no effect on the results.

This analysis gives systematic results for every subject: little variation.

Exploiting the variation in conversational feedback to characterize the nature and quality of language interactions

What does this tell us?

Page 8: Stochastic Analysis of Natural Conversation Corpora, with automatic detection of speech details

Thesis project

2 different studies for the same project: variation can be better defined and used as an asset.

We would like to apply similar semi-automatic methods to larger and more diverse data to develop this idea.

This is the stated goal of my thesis!

And afterwards?

Page 9: Stochastic Analysis of Natural Conversation Corpora, with automatic detection of speech details

Thank you for your attention.
