Advanced XML Integration
September 13-14, 2005 St. Louis, Missouri
Sponsored by the U.S. Department of Housing and Urban Development
David Talbot, Data Systems International
Eric Jahn, United Way of Sarasota County
What Is Data Integration and Why Should Our HMIS Do It?
Data integration is moving data from one computer system to another.
• Many local agencies already have their own data tracking systems.
• For these agencies, inputting into HMIS's web interface is duplicate data entry. Duplication = Bad
• Agencies can keep using their existing systems.
• Integration allows many Continua to combine their data into one place for reporting.
How Does Data Integration Work?
• Two types of data integration:
  • Live Integration
  • Episodic
• Transport and messaging protocols being worked on by NHSDC and HUD’s National HMIS TA team.
Data Warehousing
• Analysis and Reporting across continua
  • Multiple related systems in a city
    • Example: Domestic Violence Shelters, AIDS Foundations, Homeless Consortium all using different systems but frequently serving the same clients.
  • Reporting for one or more systems statewide or nationally.
    • Statewide or national picture of homelessness.
• Why not just use the APRs generated from each system?
  • APR only answers specific questions, doesn’t allow the data to be “sliced and diced”.
How multiple systems report together
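[Diagram: ClientTrack.NET, other proprietary systems and home grown applications send data (as HUD XML or in a warehouse vendor proprietary format) to an uncleansed data warehouse. The data then passes through Data Cleansing/De-duplication into a cleansed data warehouse, which feeds Continua Wide Reporting and Cube Analysis.]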
De-Duplication
• Specific rules depending on how aggressively you want the results de-duplicated.
  • First/Last Name, SSN, DOB are common fields for de-duplication.
• De-duplication without passing client identifiable information (Hashed Identifiers)
  • SHA-1 one-way hashing of identifying information.
  • Feature demanded by Domestic Violence shelters for good reason.
  • Each additional system a client’s data touches increases the possible places where a confidentiality breach can occur.
De-Duplication Without Passing Client Identifiable Info
[Diagram: System 1 and System 2 each run “David Talbot” through SHA-1, producing the same hashed identifier “7&SA(*HAJAKYZM”, which is passed on in HUD XML.]
Data warehouse knows that “7&SA(*HAJAKYZM” is the same client but cannot turn “7&SA(*HAJAKYZM” into “David Talbot”.
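The idea in the diagram can be sketched in a few lines of Python. This is only an illustration, assuming SHA-1 as the hash and a simple normalize-then-concatenate scheme; the helper name `hashed_identifier`, the choice of fields, the "|" separator and the placeholder date of birth are all assumptions made for the example, not part of HUD XML.

```python
import hashlib


def hashed_identifier(first_name: str, last_name: str, dob: str) -> str:
    """Return a one-way SHA-1 digest of normalized identifying fields.

    Hypothetical helper: field choice and normalization are assumptions.
    """
    # Normalize so trivial differences (case, stray spaces) hash the same.
    parts = [first_name.strip().lower(), last_name.strip().lower(), dob.strip()]
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()


# Two source systems hash the same client independently.
# The date of birth is a made-up placeholder value.
system_1 = hashed_identifier("David", "Talbot", "1970-01-01")
system_2 = hashed_identifier(" david ", "TALBOT", "1970-01-01")

print(system_1 == system_2)  # True: the warehouse can match the two records
```

Both systems must apply exactly the same normalization and field order, otherwise the digests will not match.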
Common Strategies for Hashed Identifiers
• Precise: Concatenate first name, last name and date of birth or last 4 chars of SSN.
  • Few false “de-duplications”
  • Many records that should have been de-duplicated won’t be due to things like misspelled names.
• More De-Duplications: Concatenate a SOUNDEX of first name, last name and date of birth.
  • More false “de-duplications”
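The trade-off between the two strategies can be seen in a small Python sketch. It assumes SHA-1 as the hash, applies Soundex to the first and last name separately, and uses a hand-written American Soundex function; the names `soundex`, `precise_key` and `soundex_key` and the sample data are hypothetical.

```python
import hashlib

# Standard American Soundex letter-to-digit table.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex code of a name ('' if no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate repeated codes
        code = _CODES.get(ch, "")  # vowels map to "" and reset the run
        if code and code != prev:
            result += code
        prev = code
    return (result + "000")[:4]


def _digest(parts):
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()


def precise_key(first, last, dob):
    """Hash of the exact (normalized) names plus date of birth."""
    return _digest([first.strip().lower(), last.strip().lower(), dob])


def soundex_key(first, last, dob):
    """Hash of the Soundex codes of the names plus date of birth."""
    return _digest([soundex(first), soundex(last), dob])


a = ("David", "Talbot", "1970-01-01")   # placeholder date of birth
b = ("Davyd", "Talbott", "1970-01-01")  # same person, misspelled at intake

print(precise_key(*a) == precise_key(*b))  # False: missed de-duplication
print(soundex_key(*a) == soundex_key(*b))  # True: misspelling tolerated
```

The same tolerance is what produces false matches: two different people whose names share Soundex codes and who share a date of birth would collapse into one record.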
Why HUD XML?
• Allows vendors to support one import/export format instead of many.
  • www.hmis.info lists 28 vendors!
  • There are hundreds, maybe even thousands, of home grown applications out there in use by only one or two groups.
• Result:
  • Without HUD XML, bringing data together from multiple vendors requires custom import routines to be written for each vendor.
  • With HUD XML, all vendors write to a common specification and only one import routine needs to be written.
Basic Technology Behind HUD XML
• XSD – XML Schema
  • Supported by all modern programming languages.
  • Provides validation that a file conforms to the schema.
• Unfortunately just because XSD validates the file doesn’t mean it will just magically work 100%.
  • Many systems currently in use lack basic data validation at data entry. Garbage In/Garbage Out.
  • Slight variances in interpretation of the schema can lead to unintended results even though the export file is technically valid.
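A file can pass schema validation and still carry implausible values, so an import routine usually needs its own content checks. The Python sketch below shows one such check using only the standard library. The element names `Clients`, `Client` and `DateOfBirth`, the sample data and the 120-year threshold are assumptions made for illustration and are not taken from the HUD XML schema.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical sample: well-formed, but two of the dates are garbage.
SAMPLE = """
<Clients>
  <Client><PersonID>1</PersonID><DateOfBirth>1970-05-01</DateOfBirth></Client>
  <Client><PersonID>2</PersonID><DateOfBirth>2999-01-01</DateOfBirth></Client>
  <Client><PersonID>3</PersonID><DateOfBirth>1800-01-01</DateOfBirth></Client>
</Clients>
"""


def suspicious_birth_dates(xml_text, today, max_age_years=120):
    """Return (PersonID, reason) pairs for dates that are valid but implausible."""
    problems = []
    for client in ET.fromstring(xml_text).iter("Client"):
        person_id = client.findtext("PersonID")
        dob = date.fromisoformat(client.findtext("DateOfBirth"))
        if dob > today:
            problems.append((person_id, "date of birth is in the future"))
        elif today.year - dob.year > max_age_years:
            problems.append((person_id, "implausibly old, possibly a placeholder"))
    return problems


for person_id, reason in suspicious_birth_dates(SAMPLE, today=date(2005, 9, 13)):
    print(person_id, reason)
```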
HUD XML: State of the Union
• What are people currently doing with the standard?
  • Suncoast Partnership agency example: csv->HUD XML
  • Open Source Reference Implementation to be supported by vendors.
[Diagram: Agency Database sends XML over SOAP/HTTP/SSL across the Internet to the Data Warehouse Application, which stores it in a PostgreSQL Database.]
Vendor Support for HUD XML
• Vendor support matrix for HUD XML being published
• Why aren’t more vendors supporting HUD XML already?
  • Fear of “making it easy for customers to switch.”
  • Expectation that one of their customers should pay for it as “custom work”.
  • Software built on outdated and/or non-mainstream technologies that lack good tools to work with XML.
  • Lack of understanding of just how easy it is to implement.
How do I get my vendor to support HUD XML?
• Peer pressure. Vendor X is doing it and you should too.
• RFP Pressure. If you haven’t chosen a vendor, please include HUD XML as a baseline requirement in your RFP.
• Pay the vendor to implement the feature.
Technical Explanation-Header Information
• SourceDatabase
  • Contains core information about the data source.
  • Assign a database ID to each contributing system in your continuum.
• Export
  • Assigns an ID, unique within the source database, that identifies the export.
  • Used so the destination system can discover whether it has imported this exact file before.
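On the destination side, the duplicate-import check described above amounts to remembering which database ID and export ID pairs have already been processed. A minimal Python sketch follows; the class name `ImportLog` and the in-memory set are assumptions, and a real warehouse would keep this record in its own database.

```python
class ImportLog:
    """Remembers which (database ID, export ID) pairs have been imported."""

    def __init__(self):
        self._seen = set()

    def should_import(self, database_id: str, export_id: str) -> bool:
        """Return True the first time a pair is seen, False on repeats."""
        key = (database_id, export_id)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


log = ImportLog()
print(log.should_import("DB1", "25"))  # True: first time this file is seen
print(log.should_import("DB1", "25"))  # False: same export sent again
print(log.should_import("DB2", "25"))  # True: same number, different source
```

Keying on the pair rather than the export ID alone matters because export IDs are only unique within one source database.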
Technical Explanation-Program
• The word “program” has a thousand definitions in the gov/non-profit world.
  • Organizational Unit
  • Grant/Funding Source
  • Grouping of services packaged for special need
• In HUD XML it is all of the above; program is a hierarchical structure.
• Strategies for importing
  • Recommend treating root “programs” as organizational units and sub-programs as groupings of services packaged for special need.
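The recommended import strategy can be sketched as a walk over the program tree that labels the root differently from everything beneath it. The `Program` class, the `classify` function and the sample program names below are hypothetical and only illustrate the rule; they do not reflect the actual HUD XML element layout.

```python
from dataclasses import dataclass, field


@dataclass
class Program:
    name: str
    children: list = field(default_factory=list)


def classify(program, is_root=True):
    """Yield (name, kind): roots are organizational units, the rest service groupings."""
    kind = "organizational unit" if is_root else "service grouping"
    yield program.name, kind
    for child in program.children:
        yield from classify(child, is_root=False)


# Made-up hierarchy for illustration.
tree = Program("Example Agency", [
    Program("Emergency Shelter"),
    Program("Transitional Housing", [Program("Youth Beds")]),
])

for name, kind in classify(tree):
    print(f"{name}: {kind}")
```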
Technical Explanation-ClientExport
• Optional element used to associate clients with exports.
• Example:
  • Client David Talbot is exported from Database 1 for the first time in the export identified as 25.
  • A few months later, David Talbot is included in a new export identified as 29.
  • The ClientExport record in export 29 would be:
      <ClientExport>
        <CEPersonID>123</CEPersonID>
        <CEExportID>
          <CEExportIDNum>25</CEExportIDNum>
        </CEExportID>
      </ClientExport>
  • Target database knows this is the same PersonID 123 that was sent before.
• ClientExport is technically redundant data and won’t be needed for most systems if the DatabaseIDs are consistent.
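Reading the ClientExport record shown above takes only a few lines with Python's standard library. This sketch parses the snippet exactly as it appears on the slide and assumes no XML namespaces; a real export file may declare a namespace, in which case the lookups would need to be qualified. The function name `read_client_export` is hypothetical.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<ClientExport>
  <CEPersonID>123</CEPersonID>
  <CEExportID>
    <CEExportIDNum>25</CEExportIDNum>
  </CEExportID>
</ClientExport>
"""


def read_client_export(xml_text):
    """Return (person_id, earlier_export_id) from one ClientExport element."""
    node = ET.fromstring(xml_text)
    person_id = node.findtext("CEPersonID")
    export_id = node.findtext("CEExportID/CEExportIDNum")
    return person_id, export_id


print(read_client_export(SNIPPET))  # ('123', '25')
```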
Technical Explanation-Client
• PersonID is unique to the source system. Destination system can use DatabaseID and PersonID together to uniquely identify a client.
• ProgramParticipation aligns to an enrollment.
• ClientHistorical contains periodically gathered data such as data gathered at entry and at exit.
  • ClientHistorical contains some of the most important data for rich reporting because it shows client change over time.
• ServiceEvents has both a HUD category and an AIRS Code.
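Using DatabaseID and PersonID together as the client key can be illustrated with a dictionary keyed on the pair. This is a sketch only: the function `upsert_client`, the in-memory `store` and the sample field names are assumptions, standing in for whatever tables the destination system actually uses.

```python
def upsert_client(store, database_id, person_id, record):
    """Insert or update a client keyed by (DatabaseID, PersonID)."""
    key = (database_id, person_id)
    store.setdefault(key, {}).update(record)
    return key


store = {}
# PersonID 123 exists in two source systems and must not be confused.
upsert_client(store, "1", "123", {"last_name": "Talbot"})
upsert_client(store, "2", "123", {"last_name": "Jahn"})
# A later export from database 1 updates the same client.
upsert_client(store, "1", "123", {"first_name": "David"})

print(len(store))           # 2
print(store[("1", "123")])  # {'last_name': 'Talbot', 'first_name': 'David'}
```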
Core Data Cleansing Strategies
• Client Identifiers are the first line of defense in data cleansing.
• Identify which systems/participants are providing the best data and give them “precedence” in the data warehouse.
• Set intelligent null values.
• Cleanse using carefully chosen business rules.
• AIRS Codes are very helpful for cleansing services if most of your source data systems can support them.
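Two of these strategies, source precedence and null handling, combine naturally: for each field, take the value from the most trusted source that actually has one. The Python sketch below is one possible reading of that rule; the set of null markers, the source names and the field names are all invented for the example.

```python
# Values treated as "no data" (an assumed list; tune it per source system).
NULL_MARKERS = {"", "unknown", "n/a", "0000"}


def is_null(value) -> bool:
    return value is None or str(value).strip().lower() in NULL_MARKERS


def merge_by_precedence(records, precedence):
    """Merge per-source records, preferring earlier sources in `precedence`.

    records: dict mapping source name -> dict of field values.
    precedence: list of source names, most trusted first.
    """
    merged = {}
    for source in precedence:
        for field_name, value in records.get(source, {}).items():
            if field_name not in merged and not is_null(value):
                merged[field_name] = value
    return merged


records = {
    "SystemA": {"dob": "1970-01-01", "ssn_last4": "0000"},
    "SystemB": {"dob": "1971-01-01", "ssn_last4": "6789"},
}

print(merge_by_precedence(records, ["SystemA", "SystemB"]))
# {'dob': '1970-01-01', 'ssn_last4': '6789'}
```

SystemA wins on date of birth because it ranks higher, but its placeholder SSN digits are treated as null, so the value from SystemB fills the gap.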
Related Standards
• AIRS XML is for exporting provider/organizational data.
• AIRS XML does not support exporting client data.
• AIRS XML is broadly supported among I&R and 211 software vendors.
• AIRS has agreed in principle to look at using HUD XML as the export format for client data.
• The NHSDC and the HMIS TA team are currently working to define standards around how HUD XML files will be passed and processed.
Further Information
Listserv at: http://groups-beta.google.com/group/HMIS_Data_Integration
Open Source Data Warehouse Project: http://casesync.net
Vendors: Watch hmis.info for pending vendor comparison table.