Page 1: Discovering psycholinguistic effect timecourses with ...

Discovering psycholinguistic effect timecourses with deconvolutional time series regression

Cory Shain

November 7, 2018, Department of Cognitive Science, Johns Hopkins University

Page 2: Discovering psycholinguistic effect timecourses with ...

This talk in one slide

+ Temporal diffusion of effects can be a serious confound in psycholinguistic data
+ Modeling temporal diffusion is problematic with existing tools
+ Proposal:
  + Deconvolutional time series regression (DTSR)
  + Continuous-time mixed-effects deconvolutional regression model
  + Can be applied to any time series
+ Results:
  + Recovers known temporal structures with high fidelity
  + Finds plausible, replicable, and high-resolution estimates of temporal structure in reading data
+ Documented open-source Python package supports easy adoption

Shain & Schuler (2018). Deconvolutional time series regression: A technique for modeling temporally diffuse effects. EMNLP 2018.

Page 12: Discovering psycholinguistic effect timecourses with ...

Motivation

+ Time matters a lot in psycholinguistics
+ Psycholinguistic data are generated by people with brains
+ The brain is a dynamical system that responds to its environment in time
+ Most (all?) psycholinguistic data are underlyingly time series
+ The brain’s response to a stimulus may be slow (temporally diffuse)
+ Psycholinguistic measures may capture lingering response to preceding events

Page 18: Discovering psycholinguistic effect timecourses with ...

Motivation

+ Signal processing provides a framework for capturing temporal diffusion
+ Stimuli and responses can be recast as convolutionally-related signals
+ Relation described by an impulse response function (IRF)
+ If we can discover the structure of the IRF (deconvolution), we can convolve predictors with it to obtain a model of the response that takes diffusion directly into account
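The convolutional relation between stimulus and response can be sketched on a regular time grid. This is only an illustration: the 0.1 s sampling interval, the impulse positions, and the exponential shape of the IRF are assumptions made for the example, not values from the talk.

```python
import numpy as np

# Hypothetical regular grid: 20 samples, one every 0.1 s.
dt = 0.1
t = np.arange(20) * dt

# Stimulus signal: predictor values delivered as impulses at a few samples.
stimulus = np.zeros_like(t)
stimulus[[0, 5, 8]] = [1.0, 2.0, 0.5]

# Assumed IRF: exponential decay sampled on the same grid.
rate = 3.0
irf = np.exp(-rate * t)

# Response = stimulus convolved with the IRF, truncated to the observed window.
response = np.convolve(stimulus, irf)[: len(t)]
```

Each response sample is a sum of lingering contributions from every earlier impulse, which is the sense in which a measure "captures lingering response to preceding events". Deconvolution runs this the other way: given `stimulus` and `response`, estimate `irf`.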

Page 22: Discovering psycholinguistic effect timecourses with ...

Motivation

+ Deconvolution is hard for psycholinguistic time series
+ Major frameworks are discrete time
  + Finite impulse response models (FIR) (Dayal and MacGregor 1996)
  + Vector autoregression (VAR) (Sims 1980)
+ Why is this a problem? Variably-spaced events
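A minimal sketch of discrete-time FIR deconvolution shows why a fixed sampling interval matters. The helper `fir_design_matrix`, the four-lag kernel, and the synthetic data are all hypothetical; the fit uses ordinary least squares on noise-free data so that recovery is exact.

```python
import numpy as np

def fir_design_matrix(x, n_lags):
    """Column k holds x delayed by k steps (zeros before the series starts)."""
    n = len(x)
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = x[: n - k]
    return X

rng = np.random.default_rng(0)
x = rng.normal(size=200)                        # predictor sampled at a fixed interval
true_kernel = np.array([1.0, 0.6, 0.3, 0.1])    # hypothetical IRF weights at lags 0..3

X = fir_design_matrix(x, n_lags=4)
y = X @ true_kernel                             # noise-free response, for clarity

# One coefficient per lag: the estimated IRF is just the regression weights.
est_kernel, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The construction assumes that "k steps back" always means the same amount of time. With variably spaced events such as words in reading, that assumption does not hold.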

Page 27: Discovering psycholinguistic effect timecourses with ...

[Figure: response (y-axis, 0.4 to 1) against time (x-axis, lags 0, 0 + 1∆, ..., 0 + 4∆), with the words “A long time ago, there” placed at their onsets on a 0 to 1 second scale.]

Variable spacing, ∆ not fixed, can’t deconvolve

Page 30: Discovering psycholinguistic effect timecourses with ...

[Figure: response (y-axis, 0.4 to 1) against time (x-axis, in lag steps of ∆), with the words “A long time ago, there” placed at their onsets on a 0 to 1 second scale.]

Sparse solution: Add lots of coefficients.

∆ = 0.01, but few coefficients have data.

Page 33: Discovering psycholinguistic effect timecourses with ...

[Figure: response (y-axis, 0.4 to 1) against time (x-axis, lags 0, 0 + 1∆, ..., 0 + 4∆), with the words “A long time ago, there” placed on a 0 to 1 second scale.]

Distortionary solution: Delete temporal variation.

∆ uninterpretable, time model broken.

Page 36: Discovering psycholinguistic effect timecourses with ...

Motivation

+ “Distortionary solution” might look familiar
+ Spillover models like this are widely used in psycholinguistics (Ehrlich and Rayner 1983)
+ Problems with spillover
  + Ignores temporal localization of events, only retains relative order
  + May introduce multicollinearity
  + Difficult to motivate choice of spillover configuration
  + Prone to overfitting and non-convergence, especially with random effects
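The first problem in the list can be made concrete with a small sketch. The `spillover` helper, the surprisal values, and the onset times are hypothetical; the point is only that a spillover predictor is a shift by word position, whatever the clock time between words.

```python
import numpy as np

def spillover(x, k, fill=0.0):
    """Shift predictor x forward by k word positions, padding the start with `fill`."""
    x = np.asarray(x, dtype=float)
    if k == 0:
        return x.copy()
    out = np.full_like(x, fill)
    out[k:] = x[:-k]
    return out

# Hypothetical per-word predictor and word onset times in seconds.
surprisal = np.array([2.0, 5.0, 1.0, 4.0, 3.0])
onsets = np.array([0.00, 0.21, 0.35, 0.90, 1.02])

spill1 = spillover(surprisal, 1)   # each row now carries the previous word's value

# "One word back" covers very different amounts of time from row to row.
gaps = np.diff(onsets)
```

In this toy data the same spillover-1 coefficient has to stand for delays ranging from 0.12 s to 0.55 s, because only relative order is retained.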

Page 43: Discovering psycholinguistic effect timecourses with ...

Motivation

+ Deconvolution is hard for psycholinguistic time series

+ Failure to control for temporal diffusion can lead to misleading models

Page 45: Discovering psycholinguistic effect timecourses with ...

CASE IN POINT

Page 46: Discovering psycholinguistic effect timecourses with ...

Motivation

+ Shain et al. (2016): analysis of large SPR corpus (Futrell et al. 2018)

+ Significant effects of constituent wrap-up and dependency locality

+ First strong evidence of memory effects in broad-coverage sentence processing

+ Paper has a couple of citations

+ Accepted as a long-form talk at CUNY 2017

Page 51: Discovering psycholinguistic effect timecourses with ...

Motivation

                      β (ms)   t-value   p-value
Constituent wrap-up   1.54     8.15      2.33e-14
Dependency locality   1.10     6.48      4.87e-10

But after spilling over one baseline variable...

Constituent wrap-up: p = 0.816
Dependency locality: p = 0.370

Tiny tweak to timecourse modeling → huge impact on hypothesis testing

Page 55: Discovering psycholinguistic effect timecourses with ...

Motivation

Deconvolution of psycholinguistic timecourses is both difficult and important. What should we do?

Page 56: Discovering psycholinguistic effect timecourses with ...

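[Figure: response (y-axis, 0.4 to 1) against time (x-axis), shown first at discrete lags 0 to 0 + 4∆ and then on a continuous scale from 0 to 1.]

What if we had a continuous IRF?

E.g. $f(x; \beta) = \beta e^{-\beta x}$

If we could fit β, we could predict the response anywhere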
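A minimal sketch of this idea uses the exponential IRF from the slide. Everything else is an assumption for illustration: the event times and values are made up, the helper names are hypothetical, and a coarse grid search stands in for whatever optimizer one would actually use.

```python
import numpy as np

def exp_irf(delta, beta):
    """f(x; beta) = beta * exp(-beta * x) for x >= 0, and 0 for x < 0."""
    delta = np.asarray(delta, dtype=float)
    return np.where(delta >= 0, beta * np.exp(-beta * np.clip(delta, 0, None)), 0.0)

def convolve_events(event_times, event_values, query_times, beta):
    """Sum each event's IRF-weighted contribution at every query time."""
    lags = query_times[:, None] - event_times[None, :]   # shape (n_query, n_event)
    return (exp_irf(lags, beta) * event_values[None, :]).sum(axis=1)

# Hypothetical variably spaced events (seconds) with predictor values.
event_times = np.array([0.00, 0.21, 0.35, 0.90, 1.02])
event_values = np.array([2.0, 5.0, 1.0, 4.0, 3.0])

# Synthetic responses measured at the event times, generated with beta = 5.
y = convolve_events(event_times, event_values, event_times, beta=5.0)

# Crude fit: choose the beta on a grid that minimizes squared error.
grid = np.linspace(0.5, 10.0, 20)   # 0.5, 1.0, ..., 10.0
sse = [np.sum((y - convolve_events(event_times, event_values, event_times, b)) ** 2)
       for b in grid]
beta_hat = grid[int(np.argmin(sse))]
```

Because the IRF is a function of elapsed time rather than a table of lag coefficients, `convolve_events` can be queried at any timestamp, and no fixed ∆ is needed.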

Page 60: Discovering psycholinguistic effect timecourses with ...

Motivation

+ Continuous-time deconvolution would
  + Avoid discretizing time into lags
  + Support variably-spaced events
  + Support unsynchronized events
  + Apply without sparsity/distortion to any psycholinguistic time series

Shain & Schuler (2018). Deconvolutional time series regression: A technique for modeling temporally diffuse effects. EMNLP 2018.,

Page 61: Discovering psycholinguistic effect timecourses with ...

Motivation

+ Continuous-time deconvolution would+ Avoid discretizing time into lags+ Support variably-spaced events+ Support unsynchronized events+ Apply without sparsity/distortion to any psycholinguistic time series

Shain & Schuler (2018). Deconvolutional time series regression: A technique for modeling temporally diffuse effects. EMNLP 2018.,

Page 62: Discovering psycholinguistic effect timecourses with ...

Motivation

+ Continuous-time deconvolution would+ Avoid discretizing time into lags+ Support variably-spaced events+ Support unsynchronized events+ Apply without sparsity/distortion to any psycholinguistic time series

Shain & Schuler (2018). Deconvolutional time series regression: A technique for modeling temporally diffuse effects. EMNLP 2018.,

Page 63: Discovering psycholinguistic effect timecourses with ...

Motivation

+ Continuous-time deconvolution would+ Avoid discretizing time into lags+ Support variably-spaced events+ Support unsynchronized events+ Apply without sparsity/distortion to any psycholinguistic time series

Shain & Schuler (2018). Deconvolutional time series regression: A technique for modeling temporally diffuse effects. EMNLP 2018.,

Page 64: Discovering psycholinguistic effect timecourses with ...

Motivation

+ Continuous-time deconvolution would
  + Avoid discretizing time into lags
  + Support variably-spaced events
  + Support unsynchronized events
  + Apply without sparsity/distortion to any psycholinguistic time series

Page 71: Discovering psycholinguistic effect timecourses with ...

Motivation

+ Until recently, continuous-time deconvolution was hard because it is non-linear in its parameters

+ Estimators would have to be derived by hand
  + Derive likelihood function (depends on IRF)
  + Find its 1st and 2nd derivatives w.r.t. all parameters
  + Use derivatives to compute maximum likelihood estimators
  + Repeat for new model

+ Recent developments in machine learning allow us to avoid this through autodifferentiation and stochastic optimization

Page 78: Discovering psycholinguistic effect timecourses with ...

Proposal: Deconvolutional Time Series Regression

+ Jointly fits:
  + Continuous-time parametric IRFs for each predictor
  + Linear model on convolved predictors

+ Uses autodifferentiation and gradient-based optimization

+ Applies to any time series using any set of parametric IRF kernels

+ Provides an interpretable model that directly estimates temporal diffusion

+ O(1) model complexity on num. timesteps
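The structure described above can be sketched as a forward pass: each predictor value is weighted by an IRF evaluated at the delay between its event and the query time, and a linear model is applied to the resulting convolved predictor. This is an illustration under stated assumptions, not the DTSR package's code: the exponential kernel, the function names, and all numeric values are hypothetical.

```python
import math


def exponential_irf(x, beta):
    """Exponential IRF; assumed zero for negative delays."""
    return beta * math.exp(-beta * x) if x >= 0 else 0.0


def convolve(event_times, event_values, query_time, beta):
    """Sum preceding event values, each weighted by the IRF at its delay."""
    return sum(
        v * exponential_irf(query_time - t, beta)
        for t, v in zip(event_times, event_values)
        if t <= query_time
    )


def predict(event_times, event_values, query_times, intercept, coef, beta):
    """Linear model on the convolved predictor."""
    return [
        intercept + coef * convolve(event_times, event_values, q, beta)
        for q in query_times
    ]


# Hypothetical, variably spaced events and unsynchronized query times.
event_times = [0.0, 0.31, 0.45, 1.20]
event_values = [1.0, -0.5, 2.0, 0.3]
query_times = [0.5, 0.9, 1.25]
predictions = predict(event_times, event_values, query_times,
                      intercept=0.1, coef=1.5, beta=2.0)
```

The sketch has three free parameters (`intercept`, `coef`, `beta`) however many timesteps are observed, which is one way to read the O(1) complexity bullet. In practice these parameters would be fit jointly by gradient-based optimization.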

Page 82: Discovering psycholinguistic effect timecourses with ...

[Figure: DTSR model schematic. Labels: y, ∑ (summation), x∗, x, and Latent IRF.]

Page 91: Discovering psycholinguistic effect timecourses with ...

Proposal: Deconvolutional Time Series Regression

+ Expands range of application of deconvolutional modeling (e.g. to reading)

+ Provides high-resolution estimates of temporal dynamics

+ Documented open-source Python package supports
  + Mixed effects modeling (intercepts, slopes, and IRF parameters)
  + Various IRF kernels (and more coming)
  + Non-parametric IRFs through spline kernels
  + Composition of IRF kernels
  + MLE, Bayesian, and variational Bayesian inference modes

+ https://github.com/coryshain/dtsr

Page 94: Discovering psycholinguistic effect timecourses with ...

DTSR Implementation Used Here

+ ShiftedGamma IRF kernel

$$f(x; \alpha, \beta, \delta) = \frac{\beta^{\alpha} (x - \delta)^{\alpha - 1} e^{-\beta (x - \delta)}}{\Gamma(\alpha)}$$

+ Black box variational inference (BBVI)

+ Implemented in TensorFlow (Abadi et al. 2015) and Edward (Tran et al. 2016)
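As a rough illustration of the ShiftedGamma kernel above, the following standard-library sketch evaluates the formula in log space for numerical stability. It is not the package's TensorFlow implementation. The function name is hypothetical, and returning zero where x − δ ≤ 0 is an assumption made here.

```python
import math


def shifted_gamma_irf(x, alpha, beta, delta):
    """beta^alpha * (x - delta)^(alpha - 1) * exp(-beta * (x - delta)) / Gamma(alpha)."""
    z = x - delta
    if z <= 0:
        # Assumption: treat the kernel as zero outside its support.
        return 0.0
    log_value = (
        alpha * math.log(beta)
        + (alpha - 1.0) * math.log(z)
        - beta * z
        - math.lgamma(alpha)  # log Gamma(alpha)
    )
    return math.exp(log_value)


# With alpha = 1 and delta = 0 the kernel reduces to beta * exp(-beta * x).
value = shifted_gamma_irf(0.5, alpha=1.0, beta=2.0, delta=0.0)
```

With α = 1 and δ = 0 this reduces to the exponential IRF introduced earlier in the talk.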

Page 97: Discovering psycholinguistic effect timecourses with ...

Synthetic Evaluation

+ Sanity check: Can DTSR recover known IRFs?

+ Generate data from a model with known convolutional structure

+ Fit DTSR to that data and compare estimates to ground truth
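The recipe above can be mimicked at toy scale. The sketch below generates responses from a known exponential IRF over irregularly spaced events, then recovers the decay rate. As a stand-in for DTSR's gradient-based fitting it uses a grid search over β with a closed-form least-squares slope, so it illustrates the sanity check rather than the authors' procedure. All names, sizes, and parameter values are hypothetical.

```python
import math
import random


def convolved_predictor(times, values, beta):
    """Convolve the event series with beta * exp(-beta * x) at each event time."""
    out = []
    for i, t_i in enumerate(times):
        total = 0.0
        for t_j, x_j in zip(times[: i + 1], values[: i + 1]):
            total += x_j * beta * math.exp(-beta * (t_i - t_j))
        out.append(total)
    return out


def fit_slope(xs, ys):
    """Closed-form least-squares slope for y = slope * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def sse(xs, ys, slope):
    """Sum of squared errors of the no-intercept linear fit."""
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys))


rng = random.Random(0)  # fixed seed so the run is repeatable
true_beta, true_slope = 3.0, 1.5

# Irregularly spaced events with random predictor values.
times, t = [], 0.0
for _ in range(150):
    t += rng.uniform(0.1, 0.5)
    times.append(t)
values = [rng.gauss(0.0, 1.0) for _ in times]

# Ground-truth convolution plus a little Gaussian noise.
clean = convolved_predictor(times, values, true_beta)
response = [true_slope * c + rng.gauss(0.0, 0.05) for c in clean]

# Grid search over beta (0.5 to 6.0), refitting the slope at each candidate.
best = None
for k in range(56):
    beta = 0.5 + 0.1 * k
    conv = convolved_predictor(times, values, beta)
    slope = fit_slope(conv, response)
    err = sse(conv, response, slope)
    if best is None or err < best[0]:
        best = (err, beta, slope)

_, beta_hat, slope_hat = best
```

Comparing `beta_hat` and `slope_hat` with `true_beta` and `true_slope` plays the role of the comparison to ground truth.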


Page 98: Discovering psycholinguistic effect timecourses with ...

Synthetic Evaluation

[Figure: ground truth vs. estimated IRFs, shown in successive panels for ρ = 0, ρ = 0.25, ρ = 0.5, and ρ = 0.75.]

Page 104: Discovering psycholinguistic effect timecourses with ...

Synthetic Evaluation

+ DTSR can recover known IRFs with high fidelity

+ Estimates are robust to multicollinearity


Page 105: Discovering psycholinguistic effect timecourses with ...

Naturalistic Evaluation: Reading Times

+ Datasets:
  + Natural Stories (SPR) (Futrell et al. 2018)
  + Dundee (ET) (Kennedy, Pynte, and Hill 2003)
  + UCL (ET) (Frank et al. 2013)

Page 115: Discovering psycholinguistic effect timecourses with ...

Naturalistic Evaluation: Reading Times

+ Convolved predictors
  + Saccade length (eye-tracking only)
  + Word length
  + Unigram logprob
  + 5-gram surprisal
  + Rate (DTSR only)

+ Linear predictors
  + Sentence position
  + Trial

+ Response: Log reading times (go-past for eye-tracking)

Page 119: Discovering psycholinguistic effect timecourses with ...

Naturalistic Evaluation: Reading Times

+ More on Rate:
  + Rate predictor is an intercept (vector of 1’s) that gets convolved with an IRF
  + Captures effects of stimulus timing independently of stimulus properties
  + Only detectable through deconvolution
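To make the Rate predictor concrete, the sketch below convolves a vector of 1's (one per stimulus event) with an IRF. The result depends only on when events occurred, not on any property of the stimuli. The exponential kernel, the event times, and the value of `beta` are assumptions for illustration.

```python
import math


def exponential_irf(x, beta):
    """Exponential IRF; assumed zero for negative delays."""
    return beta * math.exp(-beta * x) if x >= 0 else 0.0


def convolved_rate(event_times, query_time, beta):
    """Convolve a vector of 1's (one per event) with the IRF."""
    return sum(
        exponential_irf(query_time - t, beta)
        for t in event_times
        if t <= query_time
    )


# Hypothetical timings: same final event, different presentation rates.
fast = [0.0, 0.2, 0.4, 0.6]
slow = [0.0, 0.6]
rate_fast = convolved_rate(fast, 0.6, beta=2.0)
rate_slow = convolved_rate(slow, 0.6, beta=2.0)
```

Here `rate_fast` exceeds `rate_slow` because more events fall within the kernel's window. Without convolution the predictor would be the constant 1 at every observation and carry no timing information.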


Page 120: Discovering psycholinguistic effect timecourses with ...

Naturalistic Evaluation: Fitted IRF

[Figure: fitted IRFs. y-axis: Fixation duration (log ms); x-axis: Time (s). Panels: Self-Paced Reading (Natural Stories), Eye-Tracking (Dundee), Eye-Tracking (UCL).]

+ Large negative influence of Rate (convolved intercept) suggests inertia

+ Diffusion mostly restricted to first second after stimulus presentation

+ Top-down response slower than bottom-up (surp vs. word/sac. len) (Friederici 2002)

+ Similar temporal profile across eye-tracking corpora

+ Null influence of unigram logprob (cf. e.g. Levy 2008; Staub 2015)

Page 126: Discovering psycholinguistic effect timecourses with ...

Naturalistic Evaluation: System Comparison

[Figure: mean squared prediction error (MSPE), DTSR vs. competitors. Panels: Natural Stories, Dundee, UCL. Legend: LME (blue); LME-S (orange); GAM (green); GAM-S (red); DTSR (purple).]

Shain & Schuler (2018). Deconvolutional time series regression: A technique for modeling temporally diffuse effects. EMNLP 2018.,

Page 127: Discovering psycholinguistic effect timecourses with ...

Naturalistic Evaluation: Summary

+ Estimated IRFs shed new light on temporal dynamics in naturalistic reading

+ Estimates are plausible, replicable, and fine-grained

+ Models show high quality prediction performance, validating IRFs

Page 130: Discovering psycholinguistic effect timecourses with ...

Hypothesis Testing

So how do I test a claim using DTSR?

Page 131: Discovering psycholinguistic effect timecourses with ...

Hypothesis Testing

+ DTSR stochastically optimizes over a non-convex likelihood surface
    + Nearly ubiquitous property of modern machine learning algorithms
    + Introduces possibility of estimation noise
        + Convergence to a non-global optimum
        + Imperfect convergence to an optimum
        + Evaluation using Monte Carlo sampling (Bayesian only)

+ Estimates and training predictions/likelihoods are not guaranteed to be globally optimal

+ Differences between models may be influenced by artifacts of the fitting procedure
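
The first two sources of estimation noise on this slide can be illustrated with a toy problem that is unrelated to DTSR itself. The sketch below runs noisy gradient descent on a one-dimensional non-convex loss from several random starting points. The functions `loss`, `grad`, and `noisy_descent`, the step size, and the noise scale are all illustrative assumptions, not part of the DTSR software.

```python
import random


def loss(w):
    # Hypothetical non-convex loss with two local minima,
    # near w = -1.30 (the global one) and near w = 1.13.
    return w**4 - 3 * w**2 + w


def grad(w):
    # Analytic derivative of the toy loss.
    return 4 * w**3 - 6 * w + 1


def noisy_descent(seed, steps=2000, lr=0.01, noise_sd=0.5):
    """Gradient descent with Gaussian gradient noise, from a random start."""
    rng = random.Random(seed)
    w = rng.uniform(-2.0, 2.0)
    for _ in range(steps):
        # The added noise stands in for minibatch sampling noise.
        w -= lr * (grad(w) + rng.gauss(0.0, noise_sd))
    return w


# Different seeds can settle in different basins, and none of the runs
# lands exactly on a minimum.
for seed in range(5):
    w = noisy_descent(seed)
    print(f"seed={seed}  final w={w:+.3f}  loss={loss(w):+.3f}")
```

Two fits of the same toy model can therefore differ in final loss for reasons that have nothing to do with the data, which is the concern the slide raises for comparisons between fitted models.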

Page 139: Discovering psycholinguistic effect timecourses with ...

Hypothesis Testing

+ Despite not being provably optimal
    + Synthetic results suggest DTSR does recover the model near-optimally
    + We want to understand a non-linear and non-convex world

Page 148: Discovering psycholinguistic effect timecourses with ...

Hypothesis Testing

+ Three frameworks for using DTSR in hypothesis tests
    + Directly compare DTSR models (permutation test)
        + Spirit: Machine learning “bakeoff”
    + Use DTSR to transform predictors as inputs to linear models (2-step test)
        + Spirit: Pre-convolution with canonical HRF in fMRI
    + Use DTSR to (1) compute a Rate predictor and (2) motivate spillover structure
        + Spirit: Exploratory data analysis

Page 149: Discovering psycholinguistic effect timecourses with ...

Hypothesis Testing: Example

Test          p-value
Permutation   9.99e-05***
2-Step        TBD

In-sample test for effect of Surprisal in Natural Stories
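
The slides do not spell out the test statistic or the resampling scheme behind the permutation test. As a minimal sketch of one common way to compare two fitted models, the code below assumes paired per-observation errors from a full and an ablated model on the same data. It randomly swaps the model labels within each pair and uses an add-one p-value estimator. The function name `paired_permutation_test` and its parameters are hypothetical.

```python
import random


def paired_permutation_test(err_a, err_b, n_perm=10000, seed=0):
    """Two-sided paired permutation test on the difference in mean error.

    err_a, err_b: per-observation errors (e.g. squared errors) of two
    models evaluated on the same observations, in the same order.
    """
    if len(err_a) != len(err_b):
        raise ValueError("error vectors must have the same length")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(err_a, err_b)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    hits = 0
    for _ in range(n_perm):
        # Swapping the two models' errors for one observation flips the
        # sign of that observation's difference.
        total = 0.0
        for d in diffs:
            total += d if rng.random() < 0.5 else -d
        if abs(total / n) >= observed:
            hits += 1
    # Add-one estimator, so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy usage with made-up errors: model A is worse on every observation.
errors_b = [0.1 * (i % 7) for i in range(50)]
errors_a = [e + 1.0 for e in errors_b]
print(paired_permutation_test(errors_a, errors_b, n_perm=999))
```

With this estimator the smallest attainable p-value is 1 / (n_perm + 1), so the number of resamples bounds the significance level that can be reported.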

Page 150: Discovering psycholinguistic effect timecourses with ...

Other applications of DTSR

+ Other response measures: E.g. HRF deconvolution with naturalistic stimuli

+ 2D predictors: E.g. effects of word cosine similarities

+ Composed IRFs: E.g. separating neural and hemodynamic responses in fMRI

+ Spline kernels: E.g. response shape discovery

+ ...

Page 155: Discovering psycholinguistic effect timecourses with ...

You too can DTSR!

Demo...

Page 156: Discovering psycholinguistic effect timecourses with ...

Conclusion

+ DTSR
    + Provides plausible, replicable, and high-resolution estimates of temporal dynamics
    + Affords new insights into the temporal dynamics of reading behavior
    + Recovers known ground-truth IRFs with high fidelity
    + Applies to variably spaced time series
    + Can help avoid spurious findings due to poor control of temporal diffusion
    + Can be integrated into various hypothesis testing frameworks
    + Is supported by documented open-source software

Page 164: Discovering psycholinguistic effect timecourses with ...

Thank you!

Acknowledgments: Reviewers and participants in CUNY 2018 and EMNLP 2018.

This work was supported by National Science Foundation grants #1551313 and #1816891. All views expressed are those of the author and do not necessarily reflect the views of the National Science Foundation.

Page 165: Discovering psycholinguistic effect timecourses with ...

References

Abadi, Martín et al. (2015). TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. url: http://download.tensorflow.org/paper/whitepaper2015.pdf.

Dayal, Bhupinder S and John F MacGregor (1996). “Identification of finite impulse response models: methods and robustness issues”. In: Industrial & engineering chemistry research 35.11, pp. 4078–4090.

Ehrlich, Kate and Keith Rayner (1983). “Pronoun assignment and semantic integration during reading: Eye movements and immediacy of processing”. In: Journal of Verbal Learning & Verbal Behavior 22, pp. 75–87.

Frank, Stefan L et al. (2013). “Reading time data for evaluating broad-coverage models of English sentence processing”. In: Behavior Research Methods 45.4, pp. 1182–1190.

Friederici, Angela D (2002). “Towards a neural basis of auditory sentence processing”. In: Trends in Cognitive Sciences 6.2, pp. 78–84.

Page 166: Discovering psycholinguistic effect timecourses with ...

References

Futrell, Richard et al. (2018). “The Natural Stories Corpus”. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Ed. by Nicoletta Calzolari et al. Paris, France: European Language Resources Association (ELRA). isbn: 979-10-95546-00-9.

Kennedy, Alan, James Pynte, and Robin Hill (2003). “The Dundee corpus”. In: Proceedings of the 12th European conference on eye movement.

Levy, Roger (2008). “Expectation-based syntactic comprehension”. In: Cognition 106.3, pp. 1126–1177.

Shain, Cory et al. (2016). “Memory access during incremental sentence processing causes reading time latency”. In: Proceedings of the Computational Linguistics for Linguistic Complexity Workshop. Association for Computational Linguistics, pp. 49–58.

Sims, Christopher A (1980). “Macroeconomics and reality”. In: Econometrica: Journal of the Econometric Society, pp. 1–48.

Page 167: Discovering psycholinguistic effect timecourses with ...

References

Staub, Adrian (2015). “The effect of lexical predictability on eye movements in reading: Critical review and theoretical interpretation”. In: Language and Linguistics Compass 9.8, pp. 311–327.

Tran, Dustin et al. (2016). “Edward: A library for probabilistic modeling, inference, and criticism”. In: arXiv preprint arXiv:1610.09787.

Page 169: Discovering psycholinguistic effect timecourses with ...

Appendix: Synthetic data generation procedure

+ 10,000 data points 100 ms apart

+ 20 randomly sampled covariates ∼ N(0, 1)

+ 20 unique coefficients ∼ U(−50, 50)

+ 20 unique IRFs
    + k ∼ U(1, 6)
    + θ ∼ U(0, 5)
    + δ ∼ U(0, 1)

+ Noise added ∼ N(0, 20²)

+ DTSR history window clipped at 128 observations
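
The generation procedure above can be sketched in plain Python. Several details are assumptions rather than facts from the slides: the IRF is taken to be a gamma density with shape k and scale θ evaluated at x + δ (the slides give only the range of δ, not its sign convention), the noise is read as Gaussian with standard deviation 20, and the generating convolution is truncated at the same 128-observation window that the slides give for the DTSR history, to keep the loop cheap. The names `shifted_gamma` and `generate` are hypothetical.

```python
import math
import random


def shifted_gamma(x, k, theta, delta):
    """Gamma density (shape k, scale theta) evaluated at x + delta.

    Assumption: the shift is added to x; the slides do not state this.
    """
    y = x + delta
    if y <= 0.0:
        return 0.0
    log_pdf = ((k - 1.0) * math.log(y) - y / theta
               - math.lgamma(k) - k * math.log(theta))
    return math.exp(log_pdf)


def generate(n=10000, n_cov=20, step=0.1, window=128, noise_sd=20.0, seed=0):
    rng = random.Random(seed)
    # Covariates ~ N(0, 1): one row per time point, one column per covariate.
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_cov)] for _ in range(n)]
    # One coefficient and one set of IRF parameters per covariate.
    params = [
        {
            "coef": rng.uniform(-50.0, 50.0),
            "k": rng.uniform(1.0, 6.0),
            "theta": rng.uniform(0.0, 5.0),
            "delta": rng.uniform(0.0, 1.0),
        }
        for _ in range(n_cov)
    ]
    # Points are evenly spaced, so each IRF is evaluated once per lag:
    # 0, step, 2 * step, ... (in seconds).
    kernels = [
        [shifted_gamma(lag * step, p["k"], p["theta"], p["delta"])
         for lag in range(window)]
        for p in params
    ]
    y = []
    for t in range(n):
        total = 0.0
        for lag in range(min(window, t + 1)):
            row = X[t - lag]
            for j in range(n_cov):
                total += params[j]["coef"] * kernels[j][lag] * row[j]
        # Additive Gaussian noise with standard deviation noise_sd.
        y.append(total + rng.gauss(0.0, noise_sd))
    return X, y, params


# Small run for illustration; the slide's full setting is n=10000.
X, y, params = generate(n=200)
print(len(X), len(y), len(params))
```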

Page 170: Discovering psycholinguistic effect timecourses with ...

Appendix: Reading time experiments

+ Natural Stories (Futrell et al. 2018)
    + Constructed narratives, self-paced reading, 181 subjects, 485 sentences, 10,245 tokens, 848,768 fixation events
    + Post-processing: Removed sentence boundaries, events for which subjects missed 4+ comprehension questions, and fixations < 100 ms or > 3000 ms

+ Dundee (Kennedy, Pynte, and Hill 2003)
    + Newspaper editorials, eye-tracking, 10 subjects, 2,368 sentences, 51,502 tokens, 260,065 fixation events
    + Post-processing: Removed document, screen, sentence, and line boundaries

+ UCL (Frank et al. 2013)
    + Sentences from novels presented in isolation, eye-tracking, 42 subjects, 205 sentences, 1,931 tokens, 53,070 fixation events
    + Post-processing: Removed sentence boundaries

Page 171: Discovering psycholinguistic effect timecourses with ...

Appendix: Reading time experiments

+ Baselines
    + LME (lme4) and GAM (mgcv)
    + By-subject intercepts and slopes
    + Spillover variants
        + No predictors spilled over
        + Spillover 0-3 for each predictor (-S)
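
To make the "-S" variants concrete, the sketch below builds spillover columns for a predictor by copying its value from the previous one, two, and three events of the same subject. Shift 0 is the predictor itself. The function name `add_spillover`, the `S1`/`S2`/`S3` column suffixes, and the choice to reset history only at subject boundaries are assumptions; the slides do not specify these details.

```python
def add_spillover(rows, predictors, max_shift=3, group_key="subject"):
    """Return copies of rows with lagged versions of each predictor.

    rows: list of dicts in presentation order.
    A lagged value is None when the subject has too little history.
    """
    history = {}  # group value -> earlier rows of that group, in order
    out = []
    for row in rows:
        past = history.setdefault(row[group_key], [])
        new_row = dict(row)
        for name in predictors:
            for shift in range(1, max_shift + 1):
                # past[-shift] is the event `shift` positions back.
                lagged = past[-shift][name] if len(past) >= shift else None
                new_row[f"{name}S{shift}"] = lagged
        past.append(row)
        out.append(new_row)
    return out


# Toy usage with made-up surprisal values for a single subject.
events = [{"subject": "s1", "surprisal": float(i)} for i in range(5)]
for r in add_spillover(events, ["surprisal"]):
    print(r)
```

Each predictor then enters the baseline model four times (shifts 0 to 3), which is how a discrete-time model approximates a delayed effect.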

Page 172: Discovering psycholinguistic effect timecourses with ...

Appendix: Reading time experiments

+ Data split
    + Train (50%)
    + Dev (25%)
    + Test (25%)
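
The slide gives only the proportions of the split, not the rule used to assign events to partitions. As one possible sketch, the code below assigns each event deterministically from integer subject and sentence indices using arithmetic modulo 4, which yields a 50/25/25 split. The function `assign_partition` and this particular rule are assumptions made for illustration.

```python
from collections import Counter


def assign_partition(subject_id, sentence_id):
    """Map integer (subject, sentence) indices to a partition name."""
    r = (subject_id + sentence_id) % 4
    if r < 2:
        return "train"  # residues 0 and 1: half of the data
    return "dev" if r == 2 else "test"


# Toy usage: 8 hypothetical subjects crossed with 100 hypothetical sentences.
counts = Counter(
    assign_partition(subj, sent) for subj in range(8) for sent in range(100)
)
print(counts)
```

A deterministic rule of this kind keeps the partition reproducible without storing a random seed, and it spreads every subject and every sentence across all three partitions.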