Seminar October 18 , 2002 - High Performance Computing ... · Biostatistics for Dummies Biomedical...

Biostatistics for Dummies

Biomedical Computing Cross-Training Seminar

October 18th, 2002

What is “Biostatistics”?

- Techniques
  - Mathematics
  - Statistics
  - Computing
- Data
  - Medicine
  - Biology

What is “Biostatistics”?

Biological data

Knowledge of biological process

Common Applications (Medical and otherwise)

- Clinical medicine
- Epidemiologic studies
- Biological laboratory research
- Biological field research
- Genetics
- Environmental health
- Health services
- Ecology
- Fisheries
- Wildlife biology
- Agriculture
- Forestry

Biostatisticians Work

- Develop study design
- Conduct analysis
- Oversee and regulate
- Determine policy
- Train researchers
- Develop new methods

Some Statistics on Biostatistics

Internet search (Google) > 210,000 hits

> 50 Graduate Programs in U.S.

Too much to cover in one hour!

Center Focus

- MSU strengths
  - Computational simulation in physical sciences
  - Environmental health sciences
- Bioinformatics is crowded
- Computational simulation in environmental health sciences
  - Build on appreciable MSU strength
  - Establish ourselves
  - Unique capability
  - Particular appeal to NIEHS

Focus of Seminar

- Statistical methodologies
  - Computational simulation in environmental health sciences
  - Can be classified as “biostatistics”
- Stochastic modeling
  - Time series
  - Spatial statistics*

The Application

- Of interest
  - Cancer incidence rate
  - Pesticide exposure
- Of concern
  - Age
  - Gender
  - Race
  - Socioeconomic status
- Objectives
  - Suitably adjust cancer incidence rate
  - Determine if relationship exists
  - Develop model
  - Explain relationship
  - Estimate cancer rate
  - Predict cancer rate

The Data

- N.S.S. & U.S. Dept. of Commerce National T.I.S. (1972-2001, by county)
  - Number of acres harvested
  - Type of crop
- MS State Dept. of Health Central Cancer Registry (1996 – 1998, by person)
  - Tumor type
  - Age
  - Gender
  - Race
  - County of residence
  - Cancer morbidity
    - Crude incidence/100,000
    - Age adjusted incidence/100,000
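The difference between the crude and the age adjusted incidence per 100,000 can be illustrated with direct standardization. The following is a minimal sketch only: the age bands, case counts, populations, and standard-population weights are hypothetical and do not come from the registry data.

```python
def crude_rate(cases, population, per=100_000):
    """Crude incidence: total cases over total population, scaled to `per`."""
    return per * sum(cases) / sum(population)


def age_adjusted_rate(cases, population, std_weights, per=100_000):
    """Directly standardized incidence: weighted mean of age-specific rates."""
    total_weight = sum(std_weights)
    adjusted = sum(
        (w / total_weight) * (c / n)
        for c, n, w in zip(cases, population, std_weights)
    )
    return per * adjusted


# Hypothetical county with three age bands (illustrative numbers only).
cases = [5, 20, 75]                    # new cancer cases per age band
population = [50_000, 30_000, 20_000]  # residents per age band
std_weights = [0.40, 0.35, 0.25]       # assumed standard-population shares

crude = crude_rate(cases, population)
adjusted = age_adjusted_rate(cases, population, std_weights)
print(f"crude: {crude:.1f}, age-adjusted: {adjusted:.1f}")
```

Because the assumed standard population is older than this hypothetical county, the adjusted rate comes out higher than the crude rate; with a younger standard it would come out lower.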

Why (Bio)statistics?

- Statistics
  - Science of uncertainty
  - Model order from disorder
- Disorder exists
  - Large scale rational explanation
  - Smaller scale residual uncertainty

Chaos

Deterministic equation Randomness

x0

Entropy
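The figure for this slide is not preserved in the transcript. As one possible illustration of a deterministic equation whose output looks random and depends sensitively on the starting value x0, the sketch below iterates the logistic map. The choice of map, the parameter r = 4, and the starting values are assumptions made for illustration, not taken from the seminar.

```python
def logistic_orbit(x0, r=4.0, steps=60):
    """Iterate the deterministic map x -> r * x * (1 - x) starting from x0."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


# Two starting values that differ by only 1e-10 (hypothetical choices).
a = logistic_orbit(0.2)
b = logistic_orbit(0.2 + 1e-10)

# Distance between the two orbits at every step.
gaps = [abs(u - v) for u, v in zip(a, b)]
print(f"initial gap: {gaps[0]:.1e}, largest gap: {max(gaps):.3f}")
```

No random numbers are used anywhere, yet the two orbits become unrelated after a few dozen steps, which is why such systems are often treated statistically.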

(Bio)statistical Data

- Independent identically distributed
- Inhomogeneous data
- Dependent data
  - Time series
  - Spatial statistics

Time Series

- Identically distributed
- Time dependent
- Equally spaced
- Randomness

Objectives in Time Series

- Graphical description
  - Time plots
  - Correlation plots
  - Spectral plots
- Modeling
- Inference
- Prediction

Time Series Models

- Linear models
- Covariance stationary
  - Constant mean
  - Constant variance
  - Covariance function of distance in time
- e(t) ~ i.i.d.
  - Zero mean
  - Finite variance
- φ square summable
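The defining equation for this slide does not appear in the transcript. One standard formulation consistent with the bullets is the general linear process, sketched here under assumptions: the symbols $\mu$, $\sigma^2$, and $\gamma(h)$ are introduced for illustration, and the square-summable coefficients are read as $\phi_j$.

```latex
X(t) = \mu + \sum_{j=0}^{\infty} \phi_j \, e(t-j),
\qquad e(t) \sim \text{i.i.d.}(0, \sigma^2),
\qquad \sum_{j=0}^{\infty} \phi_j^2 < \infty .

% Consequences matching the "covariance stationary" bullets:
\mathrm{E}[X(t)] = \mu,
\qquad
\gamma(h) = \mathrm{Cov}\bigl(X(t), X(t+h)\bigr)
          = \sigma^2 \sum_{j=0}^{\infty} \phi_j \, \phi_{j+|h|} .
```

Under these assumptions the mean is constant, the variance $\gamma(0)$ is constant and finite because the coefficients are square summable, and the covariance depends only on the distance $h$ in time.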

Nonlinear Time Series

- Amplitude-frequency dependence
- Jump phenomenon
- Harmonics
- Synchronization
- Limit cycles
- Biomedical applications
  - Respiration
  - Lupus erythematosus
  - Urinary nitrogen excretion
  - Neural science
  - Human pupillary system

Some Nonlinear Models

- Nonlinear AR
  - Additive noise
- Threshold
  - AR
  - Smoothed TAR
  - Markov chain driven
  - Fractals
- Amplitude-dependent exponential AR
- Bilinear
- AR with conditional heteroscedasticity
- Functional coefficient AR

A Threshold Model
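The model shown on this slide is not preserved in the transcript. As an illustration only, the sketch below simulates a two-regime self-exciting threshold AR(1) process, in which the AR coefficient switches according to whether the previous value lies at or below a threshold. The function name `simulate_setar`, the coefficients, the threshold, and the noise scale are all hypothetical.

```python
import random


def simulate_setar(n, x0=0.0, threshold=0.0, phi_low=0.6, phi_high=-0.4,
                   sigma=1.0, seed=0):
    """Simulate n values of a two-regime threshold AR(1) process.

    x[t] = phi_low  * x[t-1] + e[t]   if x[t-1] <= threshold
    x[t] = phi_high * x[t-1] + e[t]   otherwise
    with e[t] drawn independently from N(0, sigma^2).
    """
    rng = random.Random(seed)
    x = [x0]
    for _ in range(n - 1):
        phi = phi_low if x[-1] <= threshold else phi_high
        x.append(phi * x[-1] + rng.gauss(0.0, sigma))
    return x


series = simulate_setar(500)

# Count how many transitions were generated by the lower regime.
low_regime = sum(1 for v in series[:-1] if v <= 0.0)
print(f"{low_regime} of {len(series) - 1} steps used the lower regime")
```

Each regime is an ordinary linear AR(1) model; the nonlinearity comes entirely from the switching rule.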


Describing Correlation

- Autocorrelation
  - AR: exponential decay
  - MA: 0 past q
- Partial autocorrelation
  - AR: 0 past p
  - MA: exponential decay
- Cross-correlation
- Relationship to spectral density
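The autocorrelation patterns listed above can be checked on simulated data. This is a minimal sketch, assuming an AR(1) and an MA(1) model that both use a hypothetical coefficient of 0.7 and standard normal noise; `sample_acf` is an illustrative helper, not a library function.

```python
import random


def sample_acf(x, max_lag):
    """Sample autocorrelation of x at lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / (n * c0)
        for k in range(max_lag + 1)
    ]


rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(20_001)]

# AR(1): x[t] = 0.7 * x[t-1] + e[t]; autocorrelation decays like 0.7**k.
ar = [0.0]
for e in noise[1:]:
    ar.append(0.7 * ar[-1] + e)

# MA(1): x[t] = e[t] + 0.7 * e[t-1]; autocorrelation is zero past lag q = 1.
ma = [noise[t] + 0.7 * noise[t - 1] for t in range(1, len(noise))]

acf_ar = sample_acf(ar, 5)
acf_ma = sample_acf(ma, 5)
print("AR(1):", [round(r, 2) for r in acf_ar])
print("MA(1):", [round(r, 2) for r in acf_ma])
```

The AR(1) values should shrink geometrically with the lag, while the MA(1) values should be close to zero from lag 2 onward, up to sampling error.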

Spatial Statistics*

- Data components
  - Spatial locations S = {s1, s2, …, sn}
  - Observable variable {Z(s1), Z(s2), …, Z(sn)}
  - s ∈ D ⊂ R^k
- Correlation
- Data structures
  - Geostatistical
  - Lattice
  - Point patterns or marked spatial point processes
  - Objects
- Assumptions on Z and D

Biological Applications

- Geostatistics
  - Soil science
  - Public health
- Lattice
  - Remote sensing
  - Medical imaging
- Point patterns
  - Tumor growth rate
  - In vitro cell growth

Spatial Temporal Models

- Combine time series with spatial data
- Application
  - Time element
    - Time between pesticide exposure and development of cancer
  - Spatial element
    - Proximity to pesticide use