Worst, but still importable, data I’ve ever seen

Arthur Tabachneck, Insurance Bureau of Canada


Coder’s Corner, SAS Forum, April 12, 2010

Suppose you had the following Excel file:

[Screenshot of the Excel file, with callouts marking the cell formats: text, as shown, m/d/yyyy, d/m/yyyy, d-mon.]


how the file got so bad:

members of a secretarial pool were asked to enter the data, in Excel, while they were covering the front desk

they (four different secretaries) obviously weren’t given sufficient instructions

their task was simply to enter some data, which happened to include a date


proc import can only be used if:

1. you license SAS/ACCESS Interface to PC File Formats

and

2. at least half of the relevant rows (based on your system’s and SAS’s guessingrows settings) are formatted as dates

or

3. you manually edit the spreadsheet and/or change your guessing rows settings so that condition 2 holds


If proc import can be used, three steps are necessary

step 1: use mixed=no


which will import the date-formatted cells and assign missing values to the other cells
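The slide's code for this step did not survive in the transcript. A minimal sketch of what step 1 might look like follows. It assumes the workbook path used later in the deck (c:\worst data.xls), the output name inputa used in step 3, and dbms=excel, which requires the licensed SAS/ACCESS product named above. The getnames=yes setting is also an assumption.

```sas
/* Sketch only: assumes Windows SAS with SAS/ACCESS Interface to PC Files. */
/* The path and the output name inputa are taken from later slides;        */
/* dbms=excel and getnames=yes are assumptions.                            */
proc import datafile="c:\worst data.xls"
            out=inputa
            dbms=excel
            replace;
  getnames=yes;
  mixed=no;   /* column type follows the majority of cells; the rest become missing */
run;
```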


step 2: use mixed=yes, which will import all cells as text
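Again as a hedged sketch rather than the author's original code, the second import might differ from the first only in the mixed= setting and the output name. The name inputb comes from step 3; the path, dbms=excel, and getnames=yes are the same assumptions as before.

```sas
/* Sketch only: same assumptions as the step 1 import. */
proc import datafile="c:\worst data.xls"
            out=inputb
            dbms=excel
            replace;
  getnames=yes;
  mixed=yes;  /* columns holding both numeric and text cells are read as character */
run;
```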


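step 3: merge the two files and use inputn to read missing dates

/* datestyle is a global option, so it is set once before the step */
options datestyle=dmy;
data want (drop=bdate);
  set inputa;                          /* dates as numbers, other cells missing */
  set inputb (rename=(date=bdate));    /* same rows, every cell as text */
  /* first try: let anydtdte read the text value, day before month */
  if missing(date) then do;
    date=inputn(bdate, 'anydtdte', 20);
  end;
  /* second try: swap the first two dash-delimited pieces and read again */
  if missing(date) then do;
    date=inputn(catt(scan(bdate,2,'-'), scan(bdate,1,'-'), scan(bdate,3,'-')), 'anydtdte', 20);
  end;
run;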
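To see what the second fallback does, the short check below applies the same catt/scan rearrangement to a single made-up value. The value 'Apr-12-2010' is purely hypothetical and is not taken from the workbook.

```sas
/* Sketch only: 'Apr-12-2010' is a hypothetical text value. */
data _null_;
  length bdate rearranged $20;
  bdate = 'Apr-12-2010';
  /* move the second dash-delimited piece in front of the first */
  rearranged = catt(scan(bdate,2,'-'), scan(bdate,1,'-'), scan(bdate,3,'-'));
  date = inputn(rearranged, 'anydtdte', 20);
  put rearranged= date= date9.;   /* rearranged=12Apr2010 */
run;
```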


resulting in the following file


however, if proc import can’t be used

or

if you simply want a better solution


you can do it with DDE

step 1: set desired options and filename

options noxsync noxwait xmin;
filename sas2xl dde 'excel|system';


Step 2: Open Excel

data _null_;
  length fid rc start stop time 8;
  fid=fopen('sas2xl','s');             /* succeeds only if Excel is already running */
  if (fid le 0) then do;
    rc=system('start excel');
    start=datetime();
    stop=start+10;                     /* wait at most 10 seconds */
    do while (fid le 0);
      fid=fopen('sas2xl','s');
      time=datetime();
      if (time ge stop) then fid=1;    /* give up and leave the loop */
    end;
  end;
  rc=fclose(fid);
run;


Step 3: Open workbook and insert old-style macro sheet

data _null_;
  file sas2xl;
  put '[open("c:\worst data.xls")]';
run;

data _null_;
  file sas2xl;
  put '[workbook.next()]';
  put '[workbook.insert(3)]';          /* 3 = macro sheet */
run;

filename xlmacro dde 'excel|macro1!r1c1:r99c1' notab lrecl=200;


Step 4: Create and run Excel macro

data _null_;
  /* write the macro: prefix every cell in a2:a99 with the tag "<>" */
  file xlmacro;
  put '=set.name("Tag",!$b$1)';
  put '=formula("<>",Tag)';
  put '=set.name("OldValue",!$c$1)';
  put '=set.name("NewValue",!$b$2)';
  put '=for.cell("CurrentCell",sheet1!$a$2:$a$99,true)';
  put '=formula(get.cell(5,CurrentCell),OldValue)';
  put '=formula("=concatenate(Tag,OldValue)",NewValue)';
  put '=formula(NewValue, CurrentCell)';
  put '=next()';
  put '=halt(true)';
  put '!dde_flush';
  /* run the macro, save sheet1 as CSV (file type 6), and close Excel */
  file sas2xl;
  put '[run("macro1!r1c1")]';
  put '[workbook.activate("sheet1")]';
  put '[error(false)]';
  put '[save.as("c:\DateTest",6)]';
  put '[quit()]';
run;


Step 5: Import the data

/* datestyle is a global option, so it is set once before the step */
options datestyle=dmy;
data want (keep=date);
  infile "c:\DateTest.csv" dsd dlm="," lrecl=32768 firstobs=2;
  informat rawdate $20.;
  input rawdate;
  format date date9.;
  rawdate=substr(rawdate,3);           /* strip the "<>" tag added in step 4 */
  if anyalpha(rawdate) then do;
    date=inputn(rawdate, 'anydtdte', 20);
    if missing(date) then do;
      date=inputn(catt(scan(rawdate,2,'-'), scan(rawdate,1,'-'), scan(rawdate,3,'-')), 'anydtdte', 20);
    end;
  end;
  else date=rawdate-21916;             /* Excel serial number to SAS date */
run;
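The constant 21916 in step 5 is the Excel serial number of 1 January 1960, the day SAS counts as date 0. This assumes Excel's default 1900 date system. A quick check of the arithmetic, using a hypothetical serial number that is not taken from the workbook:

```sas
/* Sketch only: 40280 is a hypothetical cell value (12 April 2010 in   */
/* Excel's 1900 date system), chosen to illustrate the offset.         */
data _null_;
  excel_serial = 40280;
  sas_date = excel_serial - 21916;     /* 18364 days after 01JAN1960 */
  put sas_date= date9.;                /* sas_date=12APR2010 */
run;
```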


and obtain the desired result, regardless of your system’s guessing rows setting or how your data is arranged


Author Contact Information

Your comments and questions are valued and encouraged.

Contact the author:

Dr. Arthur Tabachneck
Director, Data Management
Insurance Bureau of Canada
Toronto, Ontario L3T 5K9 Canada

atabachneck at ibc dot ca or art297 at netscape dot net


Key References

Microsoft Corporation. Function Reference: Microsoft EXCEL Spreadsheet with Business Graphics and Database, Version 4.0 for Apple® Macintosh® Series or Windows™ Series. Document AB26298-0592, 1992.

Vyverman, K. Excel Exposed: Using Dynamic Data Exchange to Extract Metadata from MS Excel Workbooks. SESUG 17, 2003, paper TU15, St. Pete Beach, FL.

Vyverman, K. Re: How to flag special formatting from Excel in a SAS dataset. SAS-L Post, 2002, http://www.listserv.uga.edu/cgi-bin/wa?A2=ind0209a&L=sas-l&D=1&O=A&P=12088

Vyverman, K. Re: MS Excel column widths. SAS-L Post, 2002, http://www.listserv.uga.edu/cgi-bin/wa?A2=ind0201b&L=sas-l&P=25268

Vyverman, K. Using Dynamic Data Exchange to Export Your SAS Data to MS Excel – Against All ODS, Part I. SUGI 26, 2001, paper 190-27, Long Beach, CA.