17. Reading free-format data - Conservatoire national des...

17. Reading free-format data
GIORGIO RUSSOLILLO - Cours de préparation à la certification SAS «Base Programming»


Page 1: 17. Reading free-format data - Conservatoire national des ...maths.cnam.fr/IMG/pdf/SAS_17_cle43dced.pdf · 17. Reading free-format data GIORGIO RUSSOLILLO - Cours de préparaon à



Reading free-format data: list input

A raw data set is free-format when it is not arranged in fixed fields.
-> Fields are separated by a delimiter

Simple list input reads standard data; with format modifiers (modified list input) it can also read nonstandard data

    DATA SAS-data-set(s);
        INFILE file-specification <options>;
        INPUT variable <$>;
    RUN;

-  In list input you don't need to specify a start column and an end column
-  You indicate a delimiter (default = ' ', a blank) for separating the fields

Because list input does not specify column locations:
-  All fields must be separated by at least one blank or another delimiter
-  Fields must be read in order from left to right
-  You cannot skip or re-read fields
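
As a minimal sketch of simple list input, the step below reads blank-delimited values from in-stream data instead of an external file. The data set name work.list_demo and the data values are made up for illustration.

```sas
/* Simple list input: fields are separated by blanks (the default delimiter). */
DATA work.list_demo;              /* hypothetical data set name */
   INPUT Gender $ Age Height;     /* $ marks Gender as a character variable */
   DATALINES;
M 23 180
F 31 165
;
RUN;

PROC PRINT DATA=work.list_demo;
RUN;
```

No column positions appear in the INPUT statement: the values are simply read left to right, one field per variable.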


The DLM= option

DLM= is an option of the INFILE statement:

    INFILE file-specification DLM=delimiter(s);

DELIMITER= is an alias for DLM=

delimiter(s) can be:
-  a list of (up to 200) characters (enclosed in quotation marks) to read as delimiters
   -  The delimiter must NOT be a character that occurs in a data value
-  a character variable whose value becomes the delimiter

    FILENAME import "\\psf\Home\Documents\MySASFiles\ReadingRawData\Freefinput.txt";
    DATA imp_Freefinput;
        INFILE import DLM=",";
        INPUT Gender $ Age Height Years;
    RUN;
    PROC PRINT DATA=imp_Freefinput;
    RUN;
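
The same idea can be tried without an external file. The sketch below assumes comma-separated in-stream data; the data set name and the values are illustrative only.

```sas
/* DLM= on INFILE DATALINES: read comma-delimited in-stream records. */
DATA work.dlm_demo;                   /* hypothetical data set name */
   INFILE DATALINES DLM=',';          /* comma replaces the default blank delimiter */
   INPUT Gender $ Age Height Years;
   DATALINES;
M,23,180,2
F,31,165,5
;
RUN;

PROC PRINT DATA=work.dlm_demo;
RUN;
```

Listing several characters, for example DLM=',;', would make each of them act as a delimiter.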


Specifying a list of variables

When using list input, you can specify a range of variables in the INPUT statement.
If you specify a range of character variables, both the variable names and the $ symbol must be enclosed in parentheses. E.g.:
-  INPUT Gio1-Gio3;
-  INPUT (Gio1-Gio3) ($);
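
A short sketch of both forms in one INPUT statement follows. The variable names Code1-Code3 and Score1-Score3 and the data line are hypothetical.

```sas
/* Ranges of variables in list input. */
DATA work.range_demo;                      /* hypothetical data set name */
   INPUT (Code1-Code3) ($) Score1-Score3;  /* character range needs parentheses around names and $ */
   DATALINES;
a b c 1 2 3
;
RUN;

PROC PRINT DATA=work.range_demo;
RUN;
```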


Reading missing values at the end of a record

MISSOVER is an option of the INFILE statement:

    INFILE file-specification MISSOVER;

-  The MISSOVER option prevents SAS from reading the next record if, when using list input, it does not find values in the current line for all the INPUT statement variables.
-  At the end of the current record, values that are expected but not found are set to missing.
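
The effect can be sketched with in-stream data in which the second record lacks its last value. Names and values are illustrative assumptions.

```sas
/* MISSOVER: a short record yields missing values instead of pulling data from the next line. */
DATA work.missover_demo;             /* hypothetical data set name */
   INFILE DATALINES MISSOVER;
   INPUT Gender $ Age Height Years;
   DATALINES;
M 23 180 2
F 31 165
M 40 175 7
;
RUN;

PROC PRINT DATA=work.missover_demo;  /* Years is missing in the second observation */
RUN;
```

Without MISSOVER, SAS would go to the third line to look for the missing Years value of the second record.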


Reading missing data at the beginning or middle of a record

The Delimiter Sensitive Data (DSD) option:
-  Treats two consecutive delimiters as a missing value
-  Removes quotation marks from values
-  NB: It sets the default delimiter to a comma*

DSD is an option of the INFILE statement:

    INFILE file-specification DSD;

* If the data uses multiple delimiters or a single delimiter other than a comma, simply specify the delimiter value(s) in the DLM= option

The DSD option is used to read raw data when there is a missing value in the middle of a record

The DSD option can also be used to read raw data when there is a missing value at the beginning of a record, as long as a delimiter precedes the first value in the record
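
Both situations can be sketched with comma-delimited in-stream data. The data set name and the values are made up for illustration.

```sas
/* DSD: consecutive delimiters mark a missing value; the default delimiter becomes a comma. */
DATA work.dsd_demo;                  /* hypothetical data set name */
   INFILE DATALINES DSD;
   INPUT Gender $ Age Height Years;
   DATALINES;
M,23,180,2
F,,165,5
,40,175,7
;
RUN;

PROC PRINT DATA=work.dsd_demo;       /* Age is missing in row 2, Gender in row 3 */
RUN;
```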


Example: MISSOVER and DSD (1)

    FILENAME import '\\psf\Home\Documents\MySASFiles\ReadingRawData\FreefMDinput.txt';
    DATA imp_FreefMDinput;
        INFILE import;
        INPUT Gender $ Age Height Years;
    RUN;
    PROC PRINT DATA=imp_FreefMDinput;
    RUN;

SAS misinterprets the second line and cannot read the third… we need MISSOVER

    FILENAME import '\\psf\Home\Documents\MySASFiles\ReadingRawData\FreefMDinput.txt';
    DATA imp_FreefMDinput;
        INFILE import MISSOVER;
        INPUT Gender $ Age Height Years;
    RUN;
    PROC PRINT DATA=imp_FreefMDinput;
    RUN;

Problem at the fifth line… we need DSD


Example: MISSOVER and DSD (2)

    DATA imp_FreefMDinput;
        INFILE import MISSOVER DSD;
        INPUT Gender $ Age Height Years;
    RUN;
    PROC PRINT DATA=imp_FreefMDinput;
    RUN;

The default delimiter when using DSD is a comma, but the fields in this file are separated by blanks… we need the DLM= option!

    DATA imp_FreefMDinput;
        INFILE import MISSOVER DSD DLM=" ";
        INPUT Gender $ Age Height Years;
    RUN;
    PROC PRINT DATA=imp_FreefMDinput;
    RUN;
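
A self-contained sketch of the three options working together is given below. It assumes blank-delimited in-stream data with made-up values; the second record has two consecutive blanks where Age should be, and the third record stops before Years.

```sas
/* MISSOVER + DSD + DLM=' ': blank-delimited data with missing values. */
DATA work.dsd_blank_demo;                 /* hypothetical data set name */
   INFILE DATALINES MISSOVER DSD DLM=' ';
   INPUT Gender $ Age Height Years;
   DATALINES;
M 23 180 2
F  165 5
M 40 175
;
RUN;

PROC PRINT DATA=work.dsd_blank_demo;
RUN;
```

With DSD in effect every single blank counts as a delimiter, so the raw data must use exactly one blank between consecutive values.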


The LENGTH statement

-  When using list input, character variables are assigned a default length of 8
-  If the values are longer than 8 characters, they are truncated
-  You can avoid truncation by adding a LENGTH statement before the INPUT statement

N.B.:
-  When using a LENGTH statement, you do not need to specify the variable's type in the INPUT statement. However, leaving the $ in the INPUT statement will not produce an error
-  A variable that is defined in a LENGTH statement will appear first in the data set, since it precedes the INPUT statement

    FILENAME import '\\psf\Home\Documents\MySASFiles\ReadingRawData\Lengthlistinput.txt';
    DATA imp_Lengthlistinput;
        INFILE import OBS=3;
        LENGTH City $ 12;
        INPUT City Pop86;
    RUN;
    PROC PRINT DATA=imp_Lengthlistinput;
    RUN;

NB: OBS= and FIRSTOBS= also work in the INFILE statement
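
The truncation issue can be reproduced with in-stream data. In this sketch the data set name and the figures are made up; only the 12-character city name matters.

```sas
/* LENGTH before INPUT: City can hold up to 12 characters instead of the default 8. */
DATA work.length_demo;        /* hypothetical data set name */
   LENGTH City $ 12;          /* also makes City character, so no $ is needed below */
   INPUT City Pop86;
   DATALINES;
Philadelphia 1600000
Boston 570000
;
RUN;

PROC PRINT DATA=work.length_demo;
RUN;
```

Removing the LENGTH statement (and adding $ after City) would store only the first 8 characters of the longer name.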


Modified list input: Reading values containing embedded blanks

-  The ampersand (&) modifier enables you to read character values that contain single embedded blanks.
-  The value is read until two or more consecutive blanks are encountered
-  NO other delimiter can be used to indicate the end of each field

    FILENAME import '\\psf\Home\Documents\MySASFiles\ReadingRawData\Lengthlistinput.txt';
    DATA imp_Lengthlistinput;
        INFILE import;
        LENGTH City $ 12;
        INPUT City & Pop86;
    RUN;
    PROC PRINT DATA=imp_Lengthlistinput;
    RUN;

You can also read the values for City with the & modifier followed by the $w. informat, which determines the variable length (it should be large enough to accommodate the longest value):

    DATA imp_Lengthlistinput;
        INFILE import;
        INPUT City & $12. Pop86;
    RUN;
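
A small in-stream sketch of the & modifier follows. The figures are made up, and each city name is followed by two blanks so that SAS can tell where the value ends.

```sas
/* & modifier: single embedded blanks belong to the value; two blanks end it. */
DATA work.amp_demo;              /* hypothetical data set name */
   INPUT City & $12. Pop86;
   DATALINES;
NEW YORK  7000000
SAN DIEGO  1000000
;
RUN;

PROC PRINT DATA=work.amp_demo;
RUN;
```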


Modified list input: Reading nonstandard values

-  The colon (:) modifier enables you to read
   -  nonstandard values
   -  character values that are longer than 8 characters, but which contain no embedded blanks.
-  The : indicates that values are read until a delimiter is encountered, and then an informat is applied
-  If an informat for reading character values is specified, the w value specifies the variable's length in the SAS data set (NOT the number of columns to be read in the source file!!), overriding the default length
-  N.B.: This is different from using a numeric informat with formatted input. In formatted input, you must specify a w value in order to indicate the number of columns of the field to read

    INPUT variable : informat;
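
As an illustration, the sketch below reads a long character value and a number written with commas. The data set name and the figures are assumptions made for the example.

```sas
/* : modifier: read up to the next delimiter, then apply the informat. */
DATA work.colon_demo;                  /* hypothetical data set name */
   INPUT City : $12. Pop86 : COMMA.;   /* $12. sets the length of City; COMMA. strips the commas */
   DATALINES;
Philadelphia 1,600,000
Boston 570,000
;
RUN;

PROC PRINT DATA=work.colon_demo;
RUN;
```

The two Pop86 fields have different widths, which is why no w value is needed on COMMA. here.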


Reading nonstandard values: an example

Importing the data set Modlistinput:

    FILENAME import '\\psf\Home\Documents\MySASFiles\ReadingRawData\Modlistinput.txt';
    DATA imp_Modlistinput;
        INFILE import;
        INPUT Rank City & $12. Pop86 : COMMA.;
    RUN;
    PROC PRINT DATA=imp_Modlistinput;
    RUN;

Importing the data set Modlistinput2:

    FILENAME import '\\psf\Home\Documents\MySASFiles\ReadingRawData\Modlistinput2.txt';
    DATA imp_Modlistinput2;
        INFILE import DSD;
        INPUT Rank City : $12. Pop86 : COMMA.;
    RUN;
    PROC PRINT DATA=imp_Modlistinput2;
    RUN;


Mixing input styles: an example

Field Description    Starting Column    Field Width    Data Type    Input Style
SSN                  1                  11             character    column
Date of Hire         13                 7              date         formatted
Annual Salary        21                 6              numeric      formatted
Department           28                 5 to 9         character    list
Phone Extension      ?                  4              character    list

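    FILENAME import '\\psf\Home\Documents\MySASFiles\ReadingRawData\Mixinput.txt';
    DATA imp_Mixinput;
        INFILE import;
        /* The middle of this INPUT statement is garbled in the source.        */
        /* Column pointers and widths follow the table above; the names        */
        /* HireDate, Salary and Dept and the informats DATE7. and COMMA6.      */
        /* are assumed reconstructions.                                        */
        INPUT SSN $ 1-11
              @13 HireDate DATE7.
              @21 Salary COMMA6.
              Dept : $9.
              Phone $;
    RUN;
    PROC PRINT DATA=imp_Mixinput;
    RUN;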


Creating free-format data: two alternatives

1) A DATA step with FILE and PUT statements:

    FILE file-specification <DLM='delimiter' other-options>;
    PUT variable <: format>;

-  variable: name of the variable whose value is written
-  : precedes the format
-  format: specifies the format to use for writing the data values

2) PROC EXPORT (the delimiter is given in a separate DELIMITER= statement and requires DBMS=DLM):

    PROC EXPORT DATA=SAS-data-set OUTFILE=filename DBMS=DLM;
        DELIMITER='delimiter';
    RUN;

-  SAS-data-set: name of the SAS data set to export
-  filename: the complete path and filename of the output
-  delimiter: specifies the delimiter to separate columns of data in the output file
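
A possible PROC EXPORT call is sketched below. It assumes the sashelp.class sample table shipped with SAS and writes to a temporary fileref (here called out, a made-up name) rather than to a real path.

```sas
/* PROC EXPORT with a custom delimiter, written to a temporary file. */
FILENAME out TEMP;                 /* hypothetical fileref pointing to a temp file */

PROC EXPORT DATA=sashelp.class     /* sample table; any SAS data set would do */
            OUTFILE=out
            DBMS=DLM REPLACE;
   DELIMITER=';';
RUN;

/* Echo the exported records to the log to inspect the result. */
DATA _NULL_;
   INFILE out;
   INPUT;
   PUT _INFILE_;
RUN;
```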


Creating free-format data: specifying a delimiter

    FILE file-specification <DLM='delimiter' other-options>;

You can use the DLM= option with a FILE statement to create character-delimited raw data files

    PROC PRINT DATA=sasuser.finance;
    RUN;

    DATA _NULL_;
        SET sasuser.finance;
        FILE "\\psf\Home\Documents\MySASFiles\myoutputs\finance" DLM=",";
        PUT SSN Name Salary : COMMA. Date : DATE9.;
    RUN;
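
The same pattern can be tried on a table that ships with SAS. The sketch below assumes sashelp.class is available and uses a temporary fileref (out, a made-up name) in place of a real output path.

```sas
/* FILE ... DLM=: write a comma-delimited file with list output. */
FILENAME out TEMP;                 /* hypothetical fileref pointing to a temp file */

DATA _NULL_;
   SET sashelp.class;
   FILE out DLM=',';
   PUT Name Sex Age Height : 5.1;  /* : applies the 5.1 format to Height */
RUN;

/* Echo the written records to the log. */
DATA _NULL_;
   INFILE out;
   INPUT;
   PUT _INFILE_;
RUN;
```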


Creating free-format data: using the DSD option

Let's suppose you want to create a comma-delimited file in which some values are written with a format that inserts commas. The file finance may be misread if the above program is used, because the commas inside a value are indistinguishable from the delimiters!

You can use the DSD option in the FILE statement to specify that values containing commas should be enclosed in quotation marks

    FILE file-specification <DSD> <DLM='delimiter' other-options>;

    DATA _NULL_;
        SET sasuser.finance;
        FILE "\\psf\Home\Documents\MySASFiles\myoutputs\finance2" DSD;
        PUT SSN Name Salary : COMMA. Date : DATE9.;
    RUN;

NB: the DSD option uses a comma as the delimiter, so the DLM= option is not necessary here


Reading values that contain delimiters within a quoted string

You can use the DSD option in an INFILE statement to read values that contain delimiters within a quoted string:

    FILENAME import "\\psf\Home\Documents\MySASFiles\myoutputs\finance2";
    DATA work.finance2;
        INFILE import DSD;
        INPUT SSN : $11. Name $ Salary : COMMA. Date : DATE9.;
    RUN;
    PROC PRINT DATA=finance2;
        FORMAT Date DATE9.;
    RUN;
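
A round trip can be sketched without sasuser.finance: write a value containing a comma with DSD, then read it back with DSD. The data set names, the fileref tmp and the values below are all made up for illustration.

```sas
/* Round trip with DSD: quoted on output, unquoted again on input. */
FILENAME tmp TEMP;                    /* hypothetical fileref pointing to a temp file */

DATA work.pay;                        /* small made-up input table */
   INPUT Name $ Salary;
   DATALINES;
Ann 12500
Bob 980
;
RUN;

DATA _NULL_;
   SET work.pay;
   FILE tmp DSD;                      /* 12,500 contains the delimiter, so it is written in quotes */
   PUT Name Salary : COMMA10.;
RUN;

DATA work.pay_back;
   INFILE tmp DSD;                    /* quotes are removed; the comma inside is not a delimiter */
   INPUT Name $ Salary : COMMA10.;
RUN;

PROC PRINT DATA=work.pay_back;
RUN;
```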