
Page 1: SAS for Categorical Data

Copyright © 2004 Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and international treaties. Unauthorized reproduction of this presentation, or any portion of it, may result in severe civil and criminal penalties and will be prosecuted to maximum extent possible under the law.

Page 2: SAS

SAS is a huge integrated data management and analysis suite. It takes years to master 20% of SAS. Most people take weeks, if not months, to get comfortable working with it.

The course I teach has online slides that demonstrate how to do categorical data analyses as well as data management: http://www.stanford.edu/class/hrp223/

Topic 0 has information on using SAS as a calculator.

Page 3: Using SAS

When you start SAS in a windowing environment, you automatically have access to at least 4 windows.

The (enhanced) program editor is where you type instructions to SAS.

The log window gives you feedback on how SAS interprets your work.

The output window displays any printed results from your request.

There is also a two-paned window. One pane, Explorer, allows you to look at data sets. The other, Results, acts like a hyperlinked table of contents for the output window.

Page 4: Telling SAS what to do

You type instructions in the program editor and then push the run button.

The instructions you will use for this class will be data steps (to create data sets) and procedures (to analyze data sets).

Page 5: Data steps

In data steps you can create variables (a variable is just like a box that can hold either numbers or letters). You can do math on variables, including using functions that are built into SAS.

data work.someData;
    theAnswer = 1 + 1;
run;

After you type the instructions, you have to tell SAS to actually do the work. Push the running person icon to do this.

Page 6: Data steps

The above code will create a data set that will exist until you quit SAS. You can view it as if it were a spreadsheet by double-clicking on Libraries, then the Work library, and finally the data set inside the SAS Explorer window.
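If you would rather see the data set in the output window than in Explorer, proc print is one option. This is a minimal sketch, assuming the work.someData data set from the data step above has already been created in the current session:

```sas
/* Print every row and variable of work.someData to the output window. */
proc print data = work.someData;
run;
```

With the data step above, the listing should contain a single row with theAnswer equal to 2.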

Page 7: Functions

SAS has thousands of functions built in:

data work.blah;
    numberOne = 1;
    someTrigThing = sin(numberOne);
run;

I have tried to document the ones that students frequently need in Lecture 2 of 223. Take a look at the slides labeled Frequently Used Functions.
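A few more built-in functions follow the same pattern. This is only an illustrative sketch: the data set name work.moreFunctions and the variable names are made up for the example, and the selection is not the list from Lecture 2.

```sas
data work.moreFunctions;
    squareRoot = sqrt(16);              /* square root */
    naturalLog = log(10);               /* natural (base e) logarithm */
    rounded    = round(3.14159, 0.01);  /* round to the nearest 0.01 */
    biggest    = max(1, 5, 3);          /* largest of the arguments */
run;
```

Each function call sits on the right-hand side of an assignment, just like sin(numberOne) above.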

Page 8: Finding functions

… or you can look up the function in the SAS online documentation.

One of the useful links in the useful links section of the class website
http://www.stanford.edu/class/hrp223/2002f/usefulLinks.html
is the SAS online documentation.

The URL of SAS OnLineDoc is: http://v9doc.sas.com/sasdoc/

If you enter a bad password 3 times, it will take you to the registration page. Access to the documentation is free.

Page 9: Example of a Function

If you roll a die 50 times, what's the chance that you'll get more than 10 sixes?

data work.pfft;
    x = 1 - CDF('BINOMIAL', 10, 1/6, 50);
run;
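Written out, the calculation in this data step is the complement of a cumulative binomial probability. Here X is the number of sixes in 50 rolls, the CDF call supplies P(X ≤ 10), and the symbol X is introduced only for this illustration:

```latex
\[
P(X > 10) \;=\; 1 - P(X \le 10)
\;=\; 1 - \sum_{k=0}^{10} \binom{50}{k} \left(\frac{1}{6}\right)^{k} \left(\frac{5}{6}\right)^{50-k}
\]
```

The arguments to CDF('BINOMIAL', 10, 1/6, 50) are, in order, the number of successes (10), the probability of success on each roll (1/6), and the number of rolls (50).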

Page 10: Procedures

SAS has many built-in statistical analysis procedures. The ones you will use for this class are:

proc freq – contingency tables
    See 223 topics 12 and 13

proc logistic – logistic regression
    See 223 topics 14 and 15

Page 11: Real data looks like this:

data work.epi;
    input subjectID exposure $ disease $;
    datalines;
1 exposed Diseased
2 exposed Diseased
3 exposed Diseased
4 exposed Diseased
5 exposed notDiseased
6 notExposed notDiseased
7 exposed Diseased
8 exposed Diseased
9 exposed Diseased
10 notExposed notDiseased
11 exposed notDiseased
12 exposed Diseased
13 notExposed Diseased
14 notExposed notDiseased
15 exposed Diseased
16 exposed Diseased
17 exposed notDiseased
18 notExposed notDiseased
19 exposed Diseased
20 exposed Diseased
21 exposed Diseased
22 notExposed notDiseased
23 exposed notDiseased
24 exposed Diseased
25 notExposed notDiseased
26 exposed notDiseased
27 notExposed notDiseased
28 exposed Diseased
;
run;

Page 12: Contingency tables

You can get a frequency table like this:

proc freq data = work.epi;
    tables exposure * disease;
run;

Page 13: Contingency tables analysis

proc freq data = epi;
    tables exposure * disease / chisq;
run;

Page 14: Grouped Data

You will get grouped data in statistics classes…

In a case-control study of 50 patients with pancreatic cancer and 50 hospital controls, 15 patients and 25 controls are non-coffee-drinkers, 15 patients and 10 controls are mid-level coffee drinkers, and 20 patients and 15 controls are high-octane coffee addicts. What are the odds ratios for the association between coffee drinking and pancreatic cancer (comparing high to low, high to none, low to none, and any to none)?

Page 15: Grouped Data

data work.epi;
    input exposure $ disease $ people;
    datalines;
notExposed diseased 15
notExposed notDiseased 25
little diseased 15
little notDiseased 10
lots diseased 20
lots notDiseased 15
;
run;

Page 16: Problems…

proc freq data = epi;
    tables exposure * disease;
run;

Without a weight statement, proc freq counts each row of the data set once, so every cell shows a frequency of 1 instead of the number of people.

Page 17: Weighted data

proc freq data = epi;
    weight people;
    tables exposure * disease;
run;

Page 18: Analysis of weighted data

proc freq data = epi;
    weight people;
    tables exposure * disease / relrisk;
    /* exposure was read with the default character length of 8, */
    /* so "notExposed" is stored as "notExpos"                   */
    where exposure in ("notExpos", "lots");
run;
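As a hand check on the odds ratio that relrisk reports for this comparison, the counts from the case-control problem can be plugged in directly: 20 diseased and 15 non-diseased in the "lots" group, and 15 diseased and 25 non-diseased in the "notExpos" group. This is a sketch of the arithmetic only; it assumes proc freq puts "lots" in the first row and "diseased" in the first column (its default alphabetical ordering), so that the ratio reads as high versus none.

```latex
\[
\mathrm{OR}_{\text{high vs. none}}
= \frac{20 / 15}{15 / 25}
= \frac{20 \times 25}{15 \times 15}
= \frac{500}{225}
\approx 2.22
\]
```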

Page 19: Other groups

Just copy and paste the proc freq and pick different groups.

To get the combined groups, use a character format (if you took 223) or just add the two exposed groups by hand.
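The copy-and-paste approach might look like the following sketch. It assumes the grouped work.epi data set from the earlier slide, in which exposure holds the values "notExpos", "little", and "lots"; only the where statement changes between the two steps.

```sas
/* Low versus none */
proc freq data = epi;
    weight people;
    tables exposure * disease / relrisk;
    where exposure in ("notExpos", "little");
run;

/* High versus low */
proc freq data = epi;
    weight people;
    tables exposure * disease / relrisk;
    where exposure in ("little", "lots");
run;
```

By default proc freq orders the rows alphabetically, so check which group lands in the first row before reading the odds ratio. In the second step "little" sorts ahead of "lots", so the reported value compares low to high and you would take its reciprocal to get high versus low.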

Page 20: Formats in Freq

proc format;
    value $coffee
        "lots"     = "Exposed"
        "little"   = "Exposed"
        "notExpos" = "notExpos";
run;

proc freq data = epi;
    weight people;
    format exposure $coffee.;
    tables exposure * disease / relrisk;
run;

Page 21: Analysis of Formatted Grouped Data
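For the combined comparison, the two exposed groups add up to 15 + 20 = 35 diseased and 10 + 15 = 25 non-diseased, against 15 diseased and 25 non-diseased in the unexposed group. The any-versus-none odds ratio can then be checked by hand. This is a sketch of the arithmetic from the problem statement, not a copy of the SAS output:

```latex
\[
\mathrm{OR}_{\text{any vs. none}}
= \frac{35 / 25}{15 / 25}
= \frac{35 \times 25}{25 \times 15}
= \frac{875}{375}
\approx 2.33
\]
```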