eprints.lincoln.ac.ukeprints.lincoln.ac.uk/26720/1/Revised reliability ms_22_02_2017.docx · Web...

RUNNING HEAD: Good reliability in atypical speech lateralisation

Measurement reliability of atypical language lateralisation assessed using functional

transcranial Doppler ultrasound

Jessica C. Hodgson (a, 1) and John M. Hudson (a)

(a) School of Psychology, University of Lincoln, Lincoln, UK

(1) Present address: NIHR Hearing Biomedical Research Unit, Nottingham, UK

Corresponding Author

Jessica C Hodgson

NIHR Hearing Biomedical Research Unit

Ropewalk House

Nottingham NG1 5DU

UK

E-mail: [email protected]

Phone: + 44 (0) 115 823 2636

Abstract

It is well established that some individuals present with atypical, non-left hemisphere,

cerebral lateralisation for language processing. However, previous studies exploring the

reliability of functional blood flow responses to detect lateralised activation during speech have

focused only on individuals with typical left sided dominance. Here we report test-retest and

between-task reliability measures obtained with functional transcranial Doppler ultrasound in

47 participants, including 9 with atypical language presentation. Results showed good test-

retest reliability in atypically lateralised individuals, even after an interval of 120 days.

Between-task reliability was weaker, but still within acceptable ranges.

Key words:

Transcranial Doppler

Cerebral blood flow measurement

Speech

Hemispheric Lateralisation

Reliability

Introduction

It is well established that the left cerebral hemisphere is dominant for language

processing and production in the majority of people. However, it is also known that some

individuals have atypical hemispheric representation for speech processing, such that there is

a deviation from the typical pattern of left hemisphere dominance (Knecht et al., 2000a;

Deppe et al., 2000). This reduced left sided bias is observed more frequently in individuals

who are left handed (Knecht et al., 2000b) and in some neurodevelopmental disorders such as

Dyslexia (Illingworth and Bishop, 2009), Specific Language Impairment (Bishop et al., 2014)

and Developmental Coordination Disorder (Hodgson and Hudson, 2016). However, little is

known about why atypical language lateralisation occurs (see Bishop, 2013) and whether

such lateralisation profiles are stable across tasks and between measurement sessions.

Increased variability in lateralisation indices has been reported in people with atypical

language representation (Knecht et al., 2003) as well as in young children (Kohler et al., 2015)

and there is a suggestion that laterality profiles of individuals who display a reduced left

hemisphere bias may be indicative of distributed cortical activation due to task complexity,

rather than of altered language processing (Brownsett et al., 2014).

Here we report on the test-retest and between-task reliability of functional

Transcranial Doppler (fTCD) for measuring hemispheric speech lateralisation, with a focus

on the measurement reliability in individuals with atypically represented speech. fTCD is an

ultrasound technology which uses a 2 MHz pulsed sound wave to insonate through areas of

temporal bone in order to detect cerebral blood flow velocity (cBFV). Changes in velocities

within the middle cerebral arteries can be examined during various cognitive tasks involving

speech and language, motor action, perception and visuospatial processing. Bilateral

recording allows simultaneous measurements to be taken from the left and right sides of the

head, meaning the methodology plays a useful role in the cognitive neuroscience of

hemispheric lateralisation. The advantages of fTCD are that the technology is quick to

administer, very affordable (especially compared to other imaging techniques) and portable.

fTCD is highly suitable for use with young children, patient groups and others not able to

undergo more invasive or intimidating imaging procedures. As a research tool fTCD is

becoming increasingly popular, helped by the recent advances in analysis software available

(Badcock et al., 2012).

Previous reports indicate good reliability for measures of speech lateralisation using

fTCD (Knecht et al., 1998b; Stroobant and Vingerhoets, 2001), but it is less clear whether this

is also the case for individuals who display atypical hemispheric lateralisation. Small sample

sizes in previous studies on test-retest reliability (10 and 20 subjects respectively) mean it is

difficult to draw conclusions about variance levels within atypical dominance, as none of the

subjects in these studies had atypical speech representation. In contrast, between-task

reliability for speech lateralisation has been more widely assessed using fTCD (Bishop, Watt

and Papadatou-Pastou, 2009; Stroobant, Van Boxstael and Vingerhoets, 2011) primarily with

a view to ascertaining reliability of child-friendly paradigms designed to probe speech

compared to standard verbal fluency tasks used with adults. But lateralisation profiles in these

studies are often only reported at the group level, again meaning that judgements about

individual variability are difficult to make.

Methods

We obtained language lateralisation indices using fTCD imaging during a word

generation task (Knecht et al., 2000a) from 47 healthy adult participants (15 males; aged 18-

59 yrs, mean age = 23.5 yrs; SD age = 8.4; 18 right handed and 29 left handed). Hand

preference was determined by responses to a 21-item handedness inventory (Flowers and

Hudson, 2013), from which handedness quotients were derived using the following formula:

(Right − Left) / (Right + Left) × 100. Scores above 0 denoted right handedness and scores

below 0 denoted left handedness. We deliberately targeted left handed individuals to increase

the likelihood of atypical language representation in our sample. The same 47 participants

returned to the lab to undergo a second session of fTCD imaging during the same word

generation task between 59 and 121 days after session 1 (mean separation was 81 days, SD:

18.2). For 33 of the participants (11 males; mean age = 22.1 yrs; SD age = 5.3; 11 right

handed and 22 left handed) lateralisation indices from a second speech production paradigm,

animation description (Bishop, Watt and Papadatou-Pastou, 2009) were also obtained during

session 1, allowing for a within subjects comparison of task reliability. The reduction in

sample size is due to variability in the set-up time between participants, meaning that in 14 cases

there was not time to run the second speech paradigm. Ethical approval for the work was

obtained from University of Lincoln School of Psychology, and all participants gave

informed consent. None had neurological or cerebrovascular disorders, or impairments with

language or reading; all had normal or corrected to normal vision.
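
The handedness quotient described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and it assumes that "Right" and "Left" are the counts of inventory items answered with each hand, which the text does not spell out. A score of exactly 0 is left unclassified because the text defines only scores above and below 0.

```python
def handedness_quotient(right, left):
    """Return (Right - Left) / (Right + Left) * 100, the formula given in the text."""
    total = right + left
    if total == 0:
        raise ValueError("no lateralised responses to score")
    return (right - left) / total * 100


def handedness_label(quotient):
    """Label a quotient using the thresholds stated in the text."""
    if quotient > 0:
        return "right"
    if quotient < 0:
        return "left"
    # Assumption: the text does not say how a score of exactly 0 is handled.
    return "unclassified"


if __name__ == "__main__":
    # Hypothetical responses to a 21-item inventory: 18 right, 3 left.
    hq = handedness_quotient(18, 3)
    print(round(hq, 1), handedness_label(hq))
```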

Speech Paradigms

Word Generation: this task involves participants generating words to a single letter

cue. Each trial began with a 5 s period in which participants were prompted to clear their

mind. A letter was then presented in the centre of the computer screen for 15 s, during which

time participants were required to silently generate as many words as possible that began with

the letter displayed. At the onset of the trial, a 500 ms epoch marker was simultaneously sent

to the transcranial Doppler system. Following the generation phase, to ensure task compliance,

participants were requested to report the words aloud within a 5 s period. The trial concluded

with a 35 s period of relaxation to allow cBFV to return to baseline before the onset of the

next trial. The word generation paradigm consisted of 23 trials in total. Letter presentation was

randomised and no letter was presented more than once to any given participant. The letters

‘Q’, ‘X’ and ‘Y’ were excluded from the task. Within fTCD ultrasound research word

generation has been used extensively (Knecht et al., 2000a, b; Bishop, Watt and Papadatou-

Pastou, 2009; Hodgson and Hudson, 2016) and is widely considered to be a reliable paradigm

for determining language dominance with this technique (Knecht et al., 1998b).

Animation Description: this task was developed from the desire to test pre-literate

children on speech production tasks (Bishop, Watt and Papadatou-Pastou, 2009), in order to

answer questions about the developmental trajectory of hemispheric language lateralisation.

The paradigm (described in detail by Bishop, Badcock and Holt, 2010) requires participants

to watch a 12 s cartoon in silence, and then to report what they had seen in the clip at

the onset of a question mark ‘speak’ prompt. This ‘speak’ phase lasts for 10 s and is

followed by a rest phase of 8 s to allow the cBFV signal to return to baseline. The baseline

period is taken from the ‘watch’ phase of the paradigm. Each trial lasts 30 s and there are a

total of 20 animation clips, displayed in a random order generated by a Python-based

computer script.
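
The trial structure of the two paradigms can be laid out as a small schedule, which makes the timings easy to check. The sketch below is not the authors' PsychoPy script: the phase names, the helper functions and the use of a seeded shuffle are all assumptions made for illustration. Only the phase durations, the excluded letters and the trial counts come from the text.

```python
import random
import string

# Phase durations in seconds, as described in the text.
WORD_GENERATION = [
    ("clear_mind", 5),
    ("generate_silently", 15),
    ("report_aloud", 5),
    ("relax", 35),
]
ANIMATION_DESCRIPTION = [("watch", 12), ("speak", 10), ("rest", 8)]


def phase_windows(phases):
    """Return {phase name: (onset, offset)} in seconds from trial onset."""
    windows, t = {}, 0
    for name, duration in phases:
        windows[name] = (t, t + duration)
        t += duration
    return windows


def trial_length(phases):
    """Total duration of one trial in seconds."""
    return sum(duration for _, duration in phases)


def letter_order(seed=None):
    """Shuffle the permitted letters (alphabet minus Q, X and Y), each used once."""
    letters = [c for c in string.ascii_uppercase if c not in "QXY"]
    random.Random(seed).shuffle(letters)
    return letters


if __name__ == "__main__":
    print(trial_length(WORD_GENERATION), trial_length(ANIMATION_DESCRIPTION))
    print(phase_windows(ANIMATION_DESCRIPTION)["speak"])
    print(len(letter_order(seed=1)))
```

Excluding three letters from the 26-letter alphabet leaves 23, which matches the 23 word generation trials with no letter repeated. The animation description phases sum to the stated 30 s per trial.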

fTCD Analysis

Relative changes in cBFV within the left and right Middle Cerebral Arteries (MCAs)

were assessed using bilateral fTCD monitoring from a commercially available system (DWL

Doppler-Box™X; manufacturer: DWL Compumedics Germany GmbH). A 2-MHz transducer

probe attached to an adjustable headset was positioned over each temporal acoustic window

bilaterally. PsychoPy software (Peirce, 2007) controlled the speech production paradigms

and sent marker pulses to the Doppler system to denote the onset of a trial. Data were

analysed off-line with a MATLAB (Mathworks Inc., Sherborn, MA, USA) based software

package called dopOSCCI (see Badcock et al., 2012 for a detailed description). Data

processing and analysis for the Animation description paradigm was undertaken as per

Hodgson, Hirst and Hudson (2016), and the word generation paradigm was analysed as

outlined in Hodgson and Hudson (2016).

Speech laterality indices were derived for each participant based on the difference

between left and right sided activity within a 2 s window, when compared to a baseline rest

period of 10 s. The activation window was centred on the time point at which the left-right

deviation was greatest within the period of interest (POI) (Badcock et al., 2012). In the word

generation paradigm the POI ranged from 3 – 13 s following presentation of the stimulus

letter. For the animation description task the POI ranged from 12 – 22 s following onset of the

trial. Speech laterality was assumed to be clear in all cases in which the LI deviated by > 2

SE from 0. Left-hemisphere or right-hemisphere speech dominance was indicated by positive

or negative indices respectively. Cases with an LI < 2 SE from 0 were categorised as having

bilateral speech representation. Individuals were categorised as having ‘Typical’ speech

representation if they displayed a clear LI score which was positive; alternatively, individuals

with a bilateral LI score or a clear LI score which was negative were categorised as having

‘Atypical’ speech representation (Flowers and Hudson, 2013; Hodgson, Hirst and Hudson,

2016). Participants required a minimum of 75% acceptable trials to be included in the

analysis; all participants reached this threshold.
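
The windowing and classification rules above can be sketched as follows. This is a simplified illustration and not the dopOSCCI implementation: the function names are hypothetical, the input is assumed to be an already baseline-corrected, trial-averaged left-minus-right difference signal, and "greatest deviation" is taken to mean the largest absolute difference. The text leaves the case of an LI exactly 2 SE from 0 undefined; here it is treated as bilateral.

```python
def peak_window_mean(times, diff, poi, width=2.0):
    """Mean left-minus-right difference in a window centred on the peak within the POI.

    times: sample times in seconds from the trial marker
    diff:  left-minus-right difference at each sample (assumed baseline-corrected)
    poi:   (start, end) of the period of interest in seconds
    """
    start, end = poi
    candidates = [(t, d) for t, d in zip(times, diff) if start <= t <= end]
    if not candidates:
        raise ValueError("no samples fall inside the period of interest")
    # Assumption: the peak is the sample with the largest absolute difference.
    peak_time, _ = max(candidates, key=lambda sample: abs(sample[1]))
    half = width / 2
    window = [d for t, d in zip(times, diff)
              if peak_time - half <= t <= peak_time + half]
    return sum(window) / len(window)


def classify_laterality(li, se):
    """Return (dominance, group) using the 2 SE rule described in the text."""
    if abs(li) > 2 * se:
        dominance = "left" if li > 0 else "right"
    else:
        dominance = "bilateral"
    group = "typical" if dominance == "left" else "atypical"
    return dominance, group


if __name__ == "__main__":
    # Hypothetical 1 Hz difference signal with a peak at 8 s.
    times = list(range(21))
    diff = [0.0] * 21
    diff[7], diff[8], diff[9] = 2.0, 3.0, 2.0
    li = peak_window_mean(times, diff, poi=(3, 13))
    print(round(li, 2), classify_laterality(li, se=0.5))
```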

Results

LI scores from the word generation paradigm resulted in 9 individuals classified as

atypically lateralised (displaying either right sided activation or activation less than 2 SE

from 0; LI scores ranged from -4.43 to 0.81) and the remaining 38 individuals with typical

left hemisphere lateralisation (LI scores ranged from 1.19 to 6.61). LI scores from Time 1

(T1) and Time 2 (T2) on the word generation task revealed a strong positive correlation,

r(47) = 0.79, p = 0.0001, indicating that fTCD has good test-retest reliability even after a

delay in re-testing of up to 121 days (see Figure 1a). During this task, 8 of the 9 individuals with

atypical speech laterality at T1 replicated their atypical lateralisation profile at T2; the remaining

individual shifted from a bilateral profile at T1 to a right sided bias at T2.
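
For readers who want to reproduce this kind of test-retest correlation on their own data, a minimal sketch of Pearson's r and its t statistic is given below. The function names and the example values are hypothetical, not the study data, and the choice of a plain-Python implementation is ours rather than anything stated in the text.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r ** 2))


if __name__ == "__main__":
    # Hypothetical LI scores for five participants at two sessions.
    li_t1 = [3.2, 1.9, -2.4, 4.1, 0.5]
    li_t2 = [2.8, 2.5, -1.7, 3.6, -0.2]
    r = pearson_r(li_t1, li_t2)
    print(round(r, 2), round(t_statistic(r, len(li_t1)), 2))
```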

To assess the comparability, rather than just the relationship, between the two

measurements taken, a Bland-Altman (B-A) analysis (Altman and Bland, 1983) was

conducted. This is a method of quantifying agreement between two quantitative

measurements by constructing limits of agreement. These statistical limits are calculated by

using the mean and the standard deviations of the differences between two measurements (see

Giavarina, 2015 for overview of method). The mean of the differences between each set of

measurements is also known as the measurement bias. The bias between LI scores taken from

T1 and T2 was -0.17 (B-A standard deviation = 1.67), and the resulting limits of agreement

(LOA), allowing for ±1.96 standard deviations either side of the bias, were -3.43 (lower LOA)

and 3.10 (upper LOA). These figures were calculated as follows: Bias ± 1.96 × SD. The

differences between LI scores from T1 and T2 can be plotted against the mean of the two

measurements, which allows for the investigation of any possible relationship between

measurement error and the estimated ‘true’ value. Inspection of the resulting B-A plot (see

Figure 2a) indicates that only 3 data points (1 atypically lateralised and 2 typically lateralised)

fall outside of the maximum limit of agreement, indicating that these points are more than

1.96 standard deviations from the calculated bias. The majority cluster within the calculated

limits, indicating good overall agreement between measurements taken at time points 1 and 2.
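
The limits of agreement described above follow directly from the bias and the standard deviation of the differences. The sketch below shows the calculation; the function names are hypothetical, the direction of the difference (first minus second measurement) and the use of the sample standard deviation are assumptions, since the text specifies neither.

```python
from statistics import mean, stdev


def limits_from_summary(bias, sd):
    """Return (lower, upper) limits of agreement: bias -/+ 1.96 * SD."""
    return bias - 1.96 * sd, bias + 1.96 * sd


def bland_altman(first, second):
    """Return (bias, sd, lower, upper) for paired measurements.

    Assumptions: differences are first - second, and sd is the sample
    standard deviation (n - 1 denominator).
    """
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)
    lower, upper = limits_from_summary(bias, sd)
    return bias, sd, lower, upper


if __name__ == "__main__":
    # Summary values reported in the text for the test-retest comparison.
    lower, upper = limits_from_summary(-0.17, 1.67)
    print(round(lower, 2), round(upper, 2))
```

With the reported bias of -0.17 and standard deviation of 1.67, this gives limits of about -3.44 and 3.10. The upper limit matches the reported value, and the lower limit differs from the reported -3.43 only in the second decimal place, presumably because the published bias and standard deviation are rounded.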

Results from the between-task reliability analysis revealed that the animation

description speech paradigm classified 8 individuals as atypically lateralised (LI scores

ranged from -4.47 to -1.22); however 3 of these cases were participants previously

categorised with typical left hemisphere dominance during the word generation task. This

deviation in a small number of cases is reflected by a weaker correlation between the

animation description LIs and the word generation LIs from T1, r(33) = 0.50, p = 0.003 (see

Figure 1b), compared with the test-retest correlation, but at 0.50 it still denotes an acceptable

level of agreement between tests.

Bland-Altman analysis on the sets of LI scores from each speech paradigm indicates

that the animation description task mean LI scores deviated by 0.31 (B-A bias; B-A standard

deviation = 2.28) from the mean word generation LI scores overall. The calculated limits of

agreement were -4.16 (lower LOA) and 4.78 (upper LOA). These figures are greater than in

the previous test-retest reliability analysis, which suggests there is increased variance in LI

scores between these two tasks. Visual inspection of the resulting B-A plot (see Figure 2b)

indicates that only 2 data points fell outside of these calculated limits of agreement,

suggesting that, despite the increased variance in LI scores between the two tasks, the

agreement between the paradigms on derived speech lateralisation scores is still statistically

acceptable.

Figure 1. a) Plot of the test-retest correlation between mean LI scores from the word generation task at test times 1 and 2. b) Plot of the correlation

between mean LI scores on the two speech production tasks. Negative values indicate right hemisphere activation and positive values indicate

left hemisphere activation.

Figure 2. Bland-Altman plots depicting a) the mean of the laterality indices from test times 1 and 2 (derived from the word generation task)

against the difference between the laterality indices from test times 1 and 2; b) the mean of the laterality indices from the word generation task

and the animation description task against the difference between the laterality indices from each task. On each plot the solid line represents the

bias between the measurements, and the dashed lines represent the upper and lower limits of agreement (which equate to 1.96 standard

deviations in either direction). Values falling within these dashed lines indicate acceptable measurement agreement.

Conclusions

This study is one of the first to directly review the reliability of atypical hemispheric

speech representation as measured by fTCD. Good agreement was found between two test

points when using the word generation paradigm to measure hemispheric lateralisation of

speech production, even for individuals with atypical speech lateralisation. This suggests that

within-subjects measurements of speech lateralisation using this paradigm are relatively

stable across time, providing additional support for the use of this task for deriving

lateralisation indices in both typically and atypically lateralised participants (see also Knecht et

al., 2000a). One potential caveat is the unknown impact of the length of time between

testing sessions on the reliability results: at a mean of 81 days, the retest interval was relatively

long and may therefore have introduced excess variability to the results. However, it is

worth noting that the previous study by Knecht and colleagues (1998b) which measured test-

retest reliability on the same paradigm using the same fTCD had a much more variable

interval length, ranging from one month to 14 months, and used a significantly smaller

sample (n=10). Despite these differences the reliability of the LI scores at T1 and T2 was

good in both cases, suggesting that varying retest interval length would not significantly alter

the agreement of the LI results.

In addition, comparison of language dominance scores across two speech production

tasks showed acceptable between-task reliability; however, there were differences in atypical

classification in a small number of cases, and overall reliability and agreement were reduced

in comparison to the test-retest analysis. This increased variation in participants’ LI scores

likely reflects the different requirements of each task in terms of the level of language

construction and subsequent speech output required. The animation description task requires

participants to make a linguistically coherent and structured response, in comparison to the

simpler phonological and lexical response required by the fluency-based word generation

task. This may explain why there was greater variance in LI scores in the animation

description task, reflecting increased cognitive processing. It is possible therefore to conclude

that as these tasks are making different requirements on the language network, they should

not be interchanged experimentally for purposes of language lateralisation research, without

theoretical justification. That view, however, over-simplifies the point of using different

speech paradigms for assessment of hemispheric dominance, part of which is to find robust

speech paradigms that tap into the wide range of language processes in order to examine in

more detail whether particular aspects of language produce different lateralisation patterns.

As such, data on the relative comparability of cortical responses between

paradigms are needed. Furthermore, the original motivation behind the development of

the animation description task was specifically to elicit robust speech responses in pre-literate

young children, in order to gain insight into age related changes in speech lateralisation

(Bishop et al., 2009; see also Hodgson et al., 2016). The authors themselves note that the

requirements of that task do vary from those of the more widely used word generation paradigm

(Bishop et al., 2009), but felt this was an acceptable variation in order to address

developmental questions of speech processing.

It is worth stressing again, however, that the between-task reliability was statistically

acceptable, and that these observations are included here to address the relative difference in

strength of correlation between the two sets of reliability measurements presented in this

paper. Overall, these data suggest that fTCD produces reliable measurements of hemispheric

speech dominance, but they also reinforce the need for detailed analyses on the precise

contribution of each hemisphere to lateralisation profiles arising from different aspects of

speech processing (see Bishop, 2013). Future work should focus on understanding the

component processes of speech tasks with regard to the differing physiological and vascular

responses each generates.

Authors’ contributions

JCH designed and performed the experiments, performed the statistical analysis of the data and

wrote the manuscript; JMH designed the experiments and wrote the manuscript.

References

Altman, D. G. & Bland, J. M. (1983). Measurement in medicine: the analysis of method

comparison studies. Statistician, 32, 307–17. http://dx.doi.org/10.2307/2987937

Badcock, N. A., Holt, G., Holden, A., & Bishop, D. V. (2012). dopOSCCI: A functional

transcranial doppler ultrasonography summary suite for the assessment of cerebral

lateralisation of cognitive function. Journal of Neuroscience Methods, 204(2), 383-

388.

Bishop, D. (2013). Cerebral asymmetry and language development: cause, correlate, or

consequence? Science, 340, 1230531. doi: 10.1126/science.1230531

Bishop, D., Badcock, N., & Holt, G. (2010). Assessment of cerebral lateralisation in children

using functional transcranial doppler ultrasound (fTCD). Journal of Visualized

Experiments, 43, 2161. doi: 10.3791/2161

Bishop, D., Holt, G, Whitehouse, A. & Groen, M. (2014). No population bias to left-

hemisphere language in 4-year-olds with language impairment. PeerJ, 2:e507

https://doi.org/10.7717/peerj.507

Bishop, D., Watt, H. & Papadatou-Pastou, M. (2009). An efficient and reliable method for

measuring cerebral lateralisation during speech with functional Transcranial Doppler

ultrasound. Neuropsychologia, 47, 587-590.

doi:10.1016/j.neuropsychologia.2008.09.013 .

Brownsett, S., Warren, J., Geranmayeh, F., Woodhead, Z., Leech, R. & Wise, R. (2014).

Cognitive control and its impact on recovery from aphasic stroke. Brain, 137, 242-

254.

Deppe, M., Knecht, S., Papke, K., Lohmann, H., Fleischer, H., Heindel, W., . . . Henningsen,

H. (2000). Assessment of hemispheric language Lateralisation; a comparison between

fMRI and fTCD. Journal of Cerebral Blood Flow & Metabolism, 20(2), 263-268.

Flowers, K. & Hudson, J. (2013). Motor laterality as an indicator of speech laterality.

Neuropsychology, 27, 256-65. doi: 10.1037/a0031664

Giavarina, D. (2015). Understanding Bland Altman analysis. Biochemia Medica, 25(2), 141–

51. http://dx.doi.org/10.11613/BM.2015.015

Hodgson, J. C., Hirst, R. & Hudson, J. M. (2016). Hemispheric speech lateralisation in the

developing brain is related to motor praxis ability. Developmental Cognitive

Neuroscience, 22, 9–17. http://dx.doi.org/10.1016/j.dcn.2016.09.00

Hodgson, J. C. & Hudson, J. M. (2016). Atypical speech lateralization in adults with

developmental coordination disorder demonstrated using functional transcranial

Doppler ultrasound. Journal of Neuropsychology, epub ahead of print;

doi: 10.1111/jnp.12102

Illingworth, S. & Bishop, D. (2009). Atypical cerebral lateralisation in adults with

compensated developmental dyslexia demonstrated using functional transcranial

Doppler ultrasound. Brain and Language, 111, 61-65. doi:10.1016/j.bandl.2009.05.002

Knecht, S., Deppe, M., Dräger, B., Bobe, L., Lohmann, H., Ringelstein, E., & Henningsen, H.

(2000a). Language lateralisation in healthy right-handers. Brain, 123, 74-81. doi:

10.1093/brain/123.1.74

Knecht, S., Dräger, B., Deppe, M., Bobe, L., Lohmann, H., Flöel, A., . . . Henningsen, H.

(2000b). Handedness and hemispheric language dominance in healthy humans. Brain,

123, 2512-2518. doi: 10.1093/brain/123.12.2512

Knecht, S., Deppe, M., Ringelstein, E.-B., Wirtz, M., Lohmann, H., Dräger, B., . . . Henningsen,

H. (1998b). Reproducibility of functional transcranial Doppler sonography in determining

hemispheric language lateralization. Stroke, 29, 1155-1159.

Knecht, S., Jansen, A., Frank, A., van Randenborgh, J., Sommer, J., Kanowski, M. & Heinze,

H. (2003). How atypical is atypical language dominance? Neuroimage, 18, 917–927.

doi:10.1016/S1053-8119(03)00039-9

Kohler, M., Keage, H., Spooner, R., Flitton, A., Hofmann, J., Churches, O., …, Badcock, N.

(2015). Variability in lateralised blood flow response to language is associated with

language development in children aged 1–5 years. Brain and Language, 145-146, 34-

41. doi:10.1016/j.bandl.2015.04.004.

Peirce, J. W. (2007). PsychoPy—psychophysics software in python. Journal of Neuroscience

Methods, 162(1), 8-13.

Stroobant, N., Van Boxstael, J., & Vingerhoets, G. (2011). Language lateralization in

children: A functional transcranial doppler reliability study. Journal of

Neurolinguistics, 24(1), 14-24. doi: 10.1016/j.jneuroling.2010.07.00

Stroobant, N. & Vingerhoets, G. (2001). Test-retest reliability of functional transcranial

Doppler ultrasonography. Ultrasound in Medicine and Biology, 27, 509–514.

http://dx.doi.org/10.1016/S0301-5629(00)00325-2
