STATA_BC_PLINK.RJLA.NOV2007.ppt

Transcript of STATA_BC_PLINK.RJLA.NOV2007.ppt

Page 1: STATA_BC_PLINK.RJLA.NOV2007.ppt

:NEUROPSYCHIATRIC GENETICS

[BIOSTATISTICS|BIOINFORMATICS] CORE

BIOSTATISTIC/BIOINFORMATIC TOOLS FOR GENETICS DATA:

DATA MANAGEMENT AND ANALYSIS

RICHARD ANNEY

NEUROPSYCHIATRIC GENETICS RESEARCH GROUP

WORKSHEET, TUTORIALS AND SLIDES AVAILABLE ON

P:\Personal Folders\anneyr\stata9\talk

http://www.medicine.tcd.ie/psychiatry/research/neuropsychiatry/

Page 2: STATA_BC_PLINK.RJLA.NOV2007.ppt

Overview

Page 3: STATA_BC_PLINK.RJLA.NOV2007.ppt

STATA9

• A STATISTICAL SOFTWARE PACKAGE

• LESS PRETTY THAN SPSS GUI

• POWERFUL AND “SCRIPT” FRIENDLY

• LESS CLICKING AND DROP-DOWN MENUS, MORE SCRIPTING

Page 4: STATA_BC_PLINK.RJLA.NOV2007.ppt

STATA9: SET UP FOLDER STRUCTURE

• SET UP FOLDERS TO STORE YOUR:

• DO-FILES

• CR FILE

• AN FILE

• DTA-FILES

• LOG-FILES

• INPUT-FILES (TXT)

• OUTPUT-FILES

Page 5: STATA_BC_PLINK.RJLA.NOV2007.ppt

PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY

• HOW DO I GET FILES INTO STATA?

• HOW DO I MERGE MY DATA WITH ANOTHER FILE?

• CAN I GENERATE A FEW BASIC STATISTICS ON MY MARKERS?

• CAN I PERFORM A CASE-CONTROL STUDY?

• IS MY QUANTITATIVE VARIABLE ASSOCIATED WITH A GENOTYPE?

Page 6: STATA_BC_PLINK.RJLA.NOV2007.ppt

STATA9: LOOK AT ME!! MAIN WINDOW

Page 7: STATA_BC_PLINK.RJLA.NOV2007.ppt

STATA9: LOOK AT ME!! DO-WINDOW

Page 8: STATA_BC_PLINK.RJLA.NOV2007.ppt

STATA9: LOOK AT ME!! MAIN WINDOW

Page 9: STATA_BC_PLINK.RJLA.NOV2007.ppt

STATA9: LOOK AT ME!! DTA-EDITOR WINDOW

Page 10: STATA_BC_PLINK.RJLA.NOV2007.ppt

PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY

Page 11: STATA_BC_PLINK.RJLA.NOV2007.ppt

PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY

cr00 genotype_qtlsnp.do

1. ADDING TAB-DELIMITED TEXT FILES TO STATA USING THE INSHEET COMMAND, SORTING ON THE KEY VARIABLE USING THE SORT COMMAND AND SAVING AS *.DTA FILES USING THE SAVE COMMAND

2. CONVERTING “STRINGS” TO NUMERIC VARIABLES USING THE GENERATE AND REPLACE COMMANDS

3. MERGING ON THE KEY VARIABLE USING THE MERGE COMMAND

4. TABULATING THE MERGE USING THE TABULATE COMMAND AND ORDERING VARIABLES USING THE ORDER COMMAND

Page 15: STATA_BC_PLINK.RJLA.NOV2007.ppt

PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY

• THE COMBINED *.DTA FILE

• THE TABULATE FUNCTION

• 1 = ONLY IN 1st FILE

• 2 = ONLY IN 2nd FILE

• 3 = IN BOTH 1st & 2nd FILE
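
The three merge codes above can be illustrated outside Stata. The following is a minimal Python sketch, not the Stata MERGE command itself: it joins two small tables on a key and tabulates a code of 1, 2 or 3 for each record. The sample IDs, column names and the function names `merge_on_key` and `tabulate` are hypothetical, chosen only for illustration.

```python
def merge_on_key(first, second):
    """Full outer join of two {key: row_dict} tables.

    Returns rows sorted by key. Each row carries a "merge" code:
    1 = key only in first, 2 = key only in second, 3 = key in both.
    """
    merged = []
    for key in sorted(set(first) | set(second)):
        row = {"id": key}
        row.update(first.get(key, {}))
        row.update(second.get(key, {}))
        row["merge"] = 1 if key not in second else 2 if key not in first else 3
        merged.append(row)
    return merged


def tabulate(rows, column):
    """Count how often each value of `column` occurs."""
    counts = {}
    for row in rows:
        counts[row[column]] = counts.get(row[column], 0) + 1
    return counts


# Hypothetical toy data: a genotype table and a phenotype table keyed by sample ID.
genotypes = {"A01": {"snp1": "AA"}, "A02": {"snp1": "AG"}, "A03": {"snp1": "GG"}}
phenotypes = {"A02": {"affected": 1}, "A03": {"affected": 0}, "A04": {"affected": 1}}

merged = merge_on_key(genotypes, phenotypes)
print(tabulate(merged, "merge"))
```

A record with code 1 or 2 is a sample that is missing from one of the two files, which is usually worth checking before any analysis.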

Page 17: STATA_BC_PLINK.RJLA.NOV2007.ppt

PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY

an00 genotype_qtlsnp.do

• CREATING THE LOG FILE USING THE LOG COMMAND

• OPENING THE *.DTA FILE USING THE USE COMMAND

• CREATING GENOTYPE VARIABLES FROM ALLELE VARIABLES USING GTYPE PROTOCOL

• TABULATING THE GENOTYPE VARIABLES USING THE TABULATE COMMAND
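
As an illustration of the step from allele variables to genotype variables, here is a minimal Python sketch. It is not the GTYPE protocol used in the do-file; it only shows the general idea of combining two allele calls into one unordered genotype label and counting the labels. The missing-allele code "0" and the function names are assumptions.

```python
def make_genotype(allele1, allele2, missing="0"):
    """Combine two allele calls into one unordered genotype label.

    Returns None if either allele is missing (assumed to be coded "0").
    """
    if allele1 == missing or allele2 == missing:
        return None
    # Sorting makes ("G", "A") and ("A", "G") give the same label "AG".
    return "".join(sorted((allele1, allele2)))


def tabulate_genotypes(allele_pairs):
    """Count genotype labels over a list of (allele1, allele2) pairs."""
    counts = {}
    for allele1, allele2 in allele_pairs:
        genotype = make_genotype(allele1, allele2)
        counts[genotype] = counts.get(genotype, 0) + 1
    return counts


# Hypothetical toy data: one SNP typed in five samples, one of them missing.
pairs = [("A", "A"), ("A", "G"), ("G", "A"), ("G", "G"), ("0", "0")]
genotype_counts = tabulate_genotypes(pairs)
print(genotype_counts)
```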

Page 18: STATA_BC_PLINK.RJLA.NOV2007.ppt

PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY

1. TEST HWE USING GTAB COMMAND

2. TEST HWE USING GENHW COMMAND
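
To show what a Hardy-Weinberg equilibrium (HWE) test computes, here is a minimal Python sketch of the textbook Pearson chi-square test with 1 degree of freedom. It is an illustration only: the GTAB and GENHW commands may report other or additional statistics (for example exact tests), and the function name and genotype counts below are hypothetical. The sketch assumes both alleles are observed at least once.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test of HWE from the three genotype counts.

    Returns (chi2, p_value) with 1 degree of freedom.
    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: one marker in HWE, one with no heterozygotes at all.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(50, 0, 50))
```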

Page 19: STATA_BC_PLINK.RJLA.NOV2007.ppt

PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY

1. TEST PAIR-WISE LINKAGE DISEQUILIBRIUM USING PWLD COMMAND

2. TEST ASSOCIATION WITH BINARY TRAIT USING GENCC COMMAND
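
For the binary-trait test, a minimal Python sketch of one common approach is given here: an allelic chi-square test on the 2x2 table of allele counts in cases and controls, with the corresponding odds ratio. This is an assumption about what a basic case-control test looks like, not a description of the GENCC output, which may include other tests. The function name and counts are hypothetical, and the sketch assumes no row or column of the 2x2 table is empty.

```python
import math


def allelic_association(case_counts, control_counts):
    """Allelic chi-square test (1 df) and odds ratio.

    Each argument is a tuple of genotype counts (n_AA, n_AB, n_BB).
    Returns (chi2, p_value, odds_ratio) for allele A versus allele B.
    """
    def allele_counts(counts):
        n_aa, n_ab, n_bb = counts
        return 2 * n_aa + n_ab, 2 * n_bb + n_ab

    a, b = allele_counts(case_counts)     # A and B alleles in cases
    c, d = allele_counts(control_counts)  # A and B alleles in controls
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


# Hypothetical genotype counts for 40 cases and 40 controls.
result = allelic_association((30, 10, 0), (10, 10, 20))
print(result)
```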

Page 20: STATA_BC_PLINK.RJLA.NOV2007.ppt

PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY

• QTLSNP COMMAND MODELS

• CODOMINANT (THREE MODELS)

• DOMINANT

• RECESSIVE
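
The difference between these models comes down to how a genotype is turned into a number before testing. The Python sketch below is a minimal illustration and not the QTLSNP implementation: the slides do not spell out the three codominant models, so an additive 0/1/2 coding is used here as one common codominant choice. The function names, the risk allele and the trait values are hypothetical.

```python
def code_genotype(genotype, risk_allele, model):
    """Return a numeric code for a two-character genotype such as "AG"."""
    copies = genotype.count(risk_allele)  # 0, 1 or 2 copies of the risk allele
    if model == "additive":
        return copies
    if model == "dominant":
        return 1 if copies >= 1 else 0    # carriers versus non-carriers
    if model == "recessive":
        return 1 if copies == 2 else 0    # risk homozygotes versus the rest
    raise ValueError(f"unknown model: {model}")


def mean_trait_by_code(genotypes, trait, risk_allele, model):
    """Mean of a quantitative trait within each genotype code."""
    groups = {}
    for genotype, value in zip(genotypes, trait):
        code = code_genotype(genotype, risk_allele, model)
        groups.setdefault(code, []).append(value)
    return {code: sum(values) / len(values) for code, values in groups.items()}


# Hypothetical toy data: four samples, risk allele G.
genotypes = ["AA", "AG", "GG", "AG"]
trait = [1.0, 2.0, 3.0, 4.0]
for model in ("additive", "dominant", "recessive"):
    print(model, mean_trait_by_code(genotypes, trait, "G", model))
```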

Page 21: STATA_BC_PLINK.RJLA.NOV2007.ppt

PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY

1. TEST WHETHER A QUANTITATIVE VARIABLE IS ASSOCIATED WITH A GENOTYPE UNDER DIFFERENT INHERITANCE MODELS USING THE QTLSNP COMMAND - CODOMINANT

Page 22: STATA_BC_PLINK.RJLA.NOV2007.ppt

PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY

1. TEST WHETHER A QUANTITATIVE VARIABLE IS ASSOCIATED WITH A GENOTYPE UNDER DIFFERENT INHERITANCE MODELS USING THE QTLSNP COMMAND - DOMINANT

2. NOT ASSOCIATED, SO MINIMAL OUTPUT

Page 23: STATA_BC_PLINK.RJLA.NOV2007.ppt

PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY

1. TEST WHETHER A QUANTITATIVE VARIABLE IS ASSOCIATED WITH A GENOTYPE UNDER DIFFERENT INHERITANCE MODELS USING THE QTLSNP COMMAND - RECESSIVE

Page 24: STATA_BC_PLINK.RJLA.NOV2007.ppt

PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY

Page 25: STATA_BC_PLINK.RJLA.NOV2007.ppt

BC|SNPmax©

• DATABASE AND ANALYSIS PLATFORM

• MASTER DATABASE FOR STORING ALL OUR “MASTER” GENETIC AND PHENOTYPE DATASETS

• ONGOING PROCESS TO UPLOAD AND MANAGE DATA

Page 26: STATA_BC_PLINK.RJLA.NOV2007.ppt

BC|SNPmax: Structure

• FIVE DOMAINS:

1. GENOTYPES/SNPS

2. MAPS

3. PEDIGREES

4. AFFECTION

5. PHENOTYPES

Page 32: STATA_BC_PLINK.RJLA.NOV2007.ppt

FROM OUTPUT TO GEN-FILE (VIA STATA)

• TWO EXAMPLES

1. BASIC EXCEL FILE

2. TAQ-MAN FILE

Page 33: STATA_BC_PLINK.RJLA.NOV2007.ppt

FROM OUTPUT TO GEN-FILE (VIA STATA): BASIC EXCEL FILE

Page 34: STATA_BC_PLINK.RJLA.NOV2007.ppt

FROM OUTPUT TO GEN PED AFF-FILE (VIA STATA): BASIC EXCEL FILE

Page 35: STATA_BC_PLINK.RJLA.NOV2007.ppt

FROM OUTPUT TO GEN-FILE (VIA STATA): BASIC EXCEL FILE

Page 38: STATA_BC_PLINK.RJLA.NOV2007.ppt

FROM OUTPUT TO GEN-FILE (VIA STATA): TAQ-MAN FILE

Page 41: STATA_BC_PLINK.RJLA.NOV2007.ppt

BC|SNPmax

Page 42: STATA_BC_PLINK.RJLA.NOV2007.ppt

BC|SNPmax: Types of Analysis

• QUALITY: PED-CHECK, MERLIN, BASIC MEASURES (MAF, HWE, CALL)

• FAMILY-BASED: MENDEL, MERLIN, GENEHUNTER, SIMWALK, FBAT/PBAT, TRANSMIT, QTDT, PLINK, HAPLOVIEW, R-PACKAGE

• CASE-CONTROL: ALLELE ASSOCIATION, MENDEL, PHASE, SNPHAP, PLINK, R-PACKAGE

Page 43: STATA_BC_PLINK.RJLA.NOV2007.ppt

BC|SNPmax: Types of Analysis

• FOR MOST ANALYSES YOU NEED TO SELECT MATCHED:

• GEN

• PED

• MAP – b128 NOW UPLOADED

• AFF

Page 44: STATA_BC_PLINK.RJLA.NOV2007.ppt

BC|SNPmax

Page 53: STATA_BC_PLINK.RJLA.NOV2007.ppt

PLINK… GETTING STARTED

Page 54: STATA_BC_PLINK.RJLA.NOV2007.ppt

PLINK…

• RUNNING PLINK FROM YOUR OWN COMPUTER

• WHY?

1. MULTIPLE ANALYSES

2. KEEP A RECORD OF YOUR WORK IN BAT AND SCRPT

3. EASE OF USE

4. EASE OF REPEATING TASK

5. SCRIPTS NOT DROP DOWN MENUS

6. RUNNING >1 CHROMOSOME (BC|SNPmax ADDRESSED)

7. POST-ANALYSIS INTEGRATION USING PERL AND STATA

Page 55: STATA_BC_PLINK.RJLA.NOV2007.ppt

PLINK…

• FOLDER STRUCTURE

• ANALYSIS

• DATASET

• OUTPUT

Page 56: STATA_BC_PLINK.RJLA.NOV2007.ppt

PLINK… DATASETS

• PED & MAP

• BINARY FILES

• BINARY PED (BED)

• BINARY MAP (BIM)

• FAMILY FILES (FAM)
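
To make the text PED and MAP formats concrete, here is a minimal Python sketch that parses one line of each. It assumes the standard whitespace-delimited layout: a MAP line holds chromosome, SNP identifier, genetic distance and base-pair position, and a PED line holds six leading columns (family ID, individual ID, paternal ID, maternal ID, sex, phenotype) followed by two alleles per SNP. The sample values and function names are hypothetical, and the binary BED format is not covered.

```python
def parse_map_line(line):
    """Parse one MAP line: chromosome, SNP id, genetic distance, bp position."""
    chrom, snp_id, genetic_distance, bp_position = line.split()
    return {
        "chrom": chrom,
        "snp": snp_id,
        "cm": float(genetic_distance),
        "bp": int(bp_position),
    }


def parse_ped_line(line):
    """Parse one PED line: six leading columns, then two alleles per SNP."""
    fields = line.split()
    fid, iid, father, mother, sex, phenotype = fields[:6]
    alleles = fields[6:]
    if len(alleles) % 2:
        raise ValueError("odd number of allele fields")
    genotypes = [(alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)]
    return {
        "fid": fid,
        "iid": iid,
        "father": father,
        "mother": mother,
        "sex": sex,
        "phenotype": phenotype,
        "genotypes": genotypes,
    }


# Hypothetical example lines for one individual typed at two SNPs.
snp = parse_map_line("1 rs0001 0 1000")
person = parse_ped_line("FAM1 IND1 0 0 1 2 A A G T")
print(snp)
print(person)
```

The number of genotype pairs on each PED line should equal the number of lines in the matching MAP file, which is a quick consistency check before running an analysis.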

Page 61: STATA_BC_PLINK.RJLA.NOV2007.ppt

EXAMPLE ANALYSES IN PLINK…

• DATA TRANSFORMATION

• DATA FILTERING AND PRUNING

• DATA MERGING

• SUMMARY STATS

• MISSINGNESS

• HWE

• MAF

• MENDEL ERRORS

• INCLUSION THRESHOLDS

• POPULATION STRATIFICATION

• ASSOCIATION

• CASE/CONTROL

• QTL

• GxE

• NEW: MULTIPLE TESTING CORRECTION (--adjust)

• FAMILY-BASED

• TDT

• POO

• PERMUTATION

• EPISTASIS

• HAPLOTYPE ANALYSIS

• NEW: PROXY-ASSOCIATION (FROM SNP TO HAPLOTYPE)

• R-PACKAGE

• NEW: MODIFY OUTPUT

• PLOG10

• P<x

• GENOMIC CONTROL

• QQ-PLOT

Page 62: STATA_BC_PLINK.RJLA.NOV2007.ppt

PLINK… : RUNNING TDT IN PLINK

• CAN RUN FROM COMMAND LINE AND USING gPLINK (GUI)

• RECOMMEND BAT AND SCRPT FILES

Page 63: STATA_BC_PLINK.RJLA.NOV2007.ppt

PLINK… : SUMMARY TABLES IN STATA

• INSHEET THE TDT.CLEAN FILE

• ADD GENE NAMES

• ADD CHROMOSOME POSITION

• ADJUST OR TO RISK

• GENERATE GRAPHS OF DATA

• GENERATE TABLES BY GENE

• GENERATE TABLES BY POSITION

• GENERATE TABLES BY P-VALUE

• SELECT COLUMNS FOR OTHER ANALYSES (GENMAPP)
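
The "tables by p-value" step can be sketched outside Stata as follows. This minimal Python example reads a small whitespace-delimited results table, drops rows without a usable p-value, adds a -log10(p) column and sorts by p-value. The column names (SNP, OR, P), the SNP identifiers and the values are hypothetical and are not taken from the TDT.CLEAN file. The workflow on the slide uses INSHEET in Stata rather than Python.

```python
import math


def read_table(text):
    """Read a whitespace-delimited table with a header row into dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def rank_by_p(rows, p_column="P"):
    """Keep rows with a usable p-value, add -log10(p), sort by p-value."""
    ranked = []
    for row in rows:
        try:
            p = float(row[p_column])
        except ValueError:
            continue  # skip entries such as "NA"
        if not 0 < p <= 1:
            continue  # skip values that are not valid p-values (also NaN)
        ranked.append({**row, p_column: p, "PLOG10": -math.log10(p)})
    return sorted(ranked, key=lambda r: r[p_column])


# Hypothetical results table.
results_text = """
SNP  OR    P
rs1  1.20  0.04
rs2  0.85  0.5
rs3  NA    NA
rs4  1.90  0.001
"""

ranked = rank_by_p(read_table(results_text))
for row in ranked:
    print(row["SNP"], row["P"], round(row["PLOG10"], 2))
```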

Page 64: STATA_BC_PLINK.RJLA.NOV2007.ppt

THE END!