© 2002 page 1
Data Mining Tools For ZLE
Copying and Use Restrictions:
Material under this presentation is the Intellectual Property of HP Corporation and Genus Software. Any use of this material, in part or whole, except in the context of Genus Data Mining Integrator and Data Mart Builder, without written permission from HP and Genus is prohibited.
page 2© 2002
agenda
• data mining in ZLE solutions
• ZLE data mining toolkit
• toolkit demonstration
© 2002 page 3
Meta Group
• process of identifying and/or extracting previously unknown, non-trivial, unanticipated, important information from large sets of data
Gartner Group
• process of discovering meaningful new correlations, patterns and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies, statistical and mathematical techniques
© 2002 page 4
• role
  – determine most effective responses to business events
• ZLE facilitates mining by providing
  – a rich, integrated, current data source
  – an integrated operational environment into which models can be deployed
• data mining helps to realize the full business value of a ZLE system
page 5© 2002
ZLE data mining process
• understand the opportunity
  – identify and define business opportunity
• prepare data
  – profile and understand data
  – derive attributes
  – transform data
  – create case set
• build models
  – train models
  – assess model performance
• use models
  – deploy model
  – monitor model performance
(process diagram: the steps listed above, with the callout "typically about 75% of process")
page 6© 2002
agenda
• data mining in ZLE solutions
• ZLE data mining toolkit
• toolkit demonstration
page 7© 2002
the ZLE data mining toolkit
• goal:
  – provide tools that facilitate ZLE data mining
  – reduce process cycle times dramatically
• three tools being developed by Genus Software:
  – data preparation
  – data transfer
  – model deployment
• partners: Genus, MicroStrategy, SAS
• product names:
  – Genus Mining Integrator for NonStop SQL (all three tools)
  – Genus Mart Builder for NonStop SQL (first two tools only)
page 8© 2002
ZLE data mining analytical cycle
(diagram; labels recovered from the slide)
• Data Store (NonStop SQL)
• Data Preparation (profiling/transforming data)
• Data Transfer (fast parallel streams)
• Mining Mart (Tru64/Windows)
• Modeling (SAS Enterprise Miner)
• Model Deployment (written to DB tables)
• Real-Time Scoring (using the Recommender): Scoring Engine, Rules Engine, Agg. Engine, Interaction Manager
• legend: part of Genus toolkit; part of ZDK 3; available from SAS
page 9© 2002
agenda
• data mining in ZLE solutions
• ZLE data mining toolkit
• toolkit demonstration
page 10© 2002
toolkit demonstration
• credit card fraud detection example
• opportunity: use ZLE data store data to predict, in real-time, which credit card purchases are likely to be fraudulent
• use tools to:
  – build a case set table with one row describing each purchase
  – transfer table to SAS server for modeling
  – deploy predictive model to ZLE data store
  – execute model in real-time to make fraud predictions
• steps described, including many tool screen shots
© 2002 page 11
toolkit data preparation solution
• based on the MicroStrategy (MSI) Business Intelligence toolset, leverages GUI, logical data model support, SQL generation, etc.
• uses NonStop SQL/MX DBMS, leverages sampling, TRANSPOSE, statistical functions, …
• custom tool developed by Genus using MSI SDK for NonStop SQL operations and functionality not supported by MSI tools
page 12© 2002
two main ZLE data preparation tasks
1. profile tables
  – column names and types
  – partitioning information, attributes, key structure, …
  – column values
2. transform source tables
  – derive new attributes
  – aggregate to appropriate level
  – clean data
  – pivot
  – combine to form case set
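The first task, profiling column values, can be illustrated with a small Python sketch. It is not the Genus or MicroStrategy implementation: `profile_column`, the sample rows, and the column names `billing_state` and `fraud` are hypothetical and only show the kind of summary a profile report might contain.

```python
from collections import Counter


def profile_column(rows, column):
    """Summarize one column: row count, missing values, distinct values, top values."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "top": counts.most_common(3),
    }


# Hypothetical sample rows; the column names are illustrative only.
purchases = [
    {"billing_state": "CA", "fraud": 0},
    {"billing_state": "CA", "fraud": 1},
    {"billing_state": "TX", "fraud": 0},
    {"billing_state": None, "fraud": 0},
]
state_profile = profile_column(purchases, "billing_state")
```

In a real data store this summary would more likely be produced by generated SQL against the source table than by pulling rows into a client program.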
page 13© 2002
the MicroStrategy desktop
page 14© 2002
MSI profile report: fraud vs. billing state
page 15© 2002
NonStop SQL/MX sampling
• source table sampling
  – insert into CustSamp
    select * from Cust sample random 1 percent clusters of 10 blocks
    union
    select * from Cust where CardNo in (select CardNo from FrdFlg)
• enables interactive and exploratory data prep
• cleanly integrated into SQL
• performed efficiently in DP2
• easily accessible through Genus tool
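The intent of the sampling statement (a small random cluster sample, plus every row for a card that has been flagged for fraud) can be sketched in Python. This is an approximation under stated assumptions, not how NonStop SQL/MX or DP2 performs sampling: clusters here are runs of consecutive rows rather than disk blocks, `CardNo` is assumed to be unique per row so it can stand in for UNION's duplicate removal, and `sample_with_fraud` is a hypothetical name.

```python
import random


def sample_with_fraud(cust_rows, fraud_cards, percent=1.0, block_size=10, seed=0):
    """Pick random clusters of consecutive rows, then add every flagged card's row."""
    rng = random.Random(seed)
    # Split the table into fixed-size clusters of neighbouring rows.
    blocks = [cust_rows[i:i + block_size] for i in range(0, len(cust_rows), block_size)]
    # Keep each cluster with probability percent / 100.
    picked = [row for block in blocks if rng.random() * 100 < percent for row in block]
    # Second branch of the UNION: all rows whose card appears in the fraud list.
    flagged = [row for row in cust_rows if row["CardNo"] in fraud_cards]
    # UNION semantics: drop duplicates (assumes CardNo identifies a row).
    merged = {row["CardNo"]: row for row in picked + flagged}
    return list(merged.values())


# Hypothetical data: 1000 customers, two of them flagged for fraud.
cust = [{"CardNo": n} for n in range(1000)]
sample = sample_with_fraud(cust, fraud_cards={3, 500}, percent=10, seed=42)
```

Adding the flagged rows matters because fraud is rare; a 1 percent random sample alone could contain few or no fraud cases to explore.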
page 16© 2002
creating a materialized sample table using the Genus Data Mart Builder
page 17© 2002
identifying source and sample method
page 18© 2002
specifying materialized sample table
page 19© 2002
transforming source data
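Billions of Purchases, Millions of Accounts

Column groups, as labeled on the slide:
• Purchase: PurchDt, Amt, Store, Acct
• Store: Size, Age, CS
• Account: CR, CrLim, Ten
• Purchase History: P1, S1, A1, P3, S3, A3
• ItemSummary: Min, Max, Elec, Vid, Jewl
• Fraud: Frd?

Aggregate and Pivot

| PurchDt | Amt | Store | Acct | Size | Age | CS | CR | CrLim | Ten | P1 | S1 | A1 | P3 | S3 | A3 | Min | Max | Elec | Vid | Jewl | Frd? |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 102302 11:02:44 | $4.50 | 423 | 8849940044 | 249 | 4 | 33 | 1 | 1000 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | $1 | $3 | 0 | 0 | 0 | 0 |
| 102302 11:02:44 | $88.38 | 221 | 8376636636 | 337 | 9 | 88 | 0 | 4600 | 46 | 1 | 1 | $54 | 1 | 1 | $54 | $9 | $17 | 1 | 1 | 0 | 1 |
| 102302 11:02:45 | $121.33 | 221 | 8376636636 | 893 | 1 | 76 | 0 | 1700 | 15 | 0 | 0 | 0 | 0 | 0 | 0 | $19 | $42 | 0 | 0 | 1 | 0 |
| 102302 11:02:45 | $19.99 | 73 | 3866493657 | 102 | 19 | 43 | 1 | 1700 | 15 | 0 | 0 | 0 | 0 | 0 | 0 | $4 | $9 | 0 | 1 | 0 | 0 |
| … | | | | | | | | | | | | | | | | | | | | | |
| 102402 11:01:01 | $43.84 | 743 | 8376636636 | 219 | 12 | 44 | 0 | 4600 | 89 | 1 | 1 | $121 | 1 | 1 | $121 | $8 | $19 | 1 | 0 | 0 | 0 |
| 102402 11:02:59 | $77.01 | 23 | 5378366284 | 430 | 6 | 90 | 0 | 1000 | 1 | 1 | 1 | $54 | 1 | 1 | $54 | $15 | $22 | 1 | 1 | 0 | 0 |
| 102402 11:02:21 | $11.63 | 189 | 8376636636 | 501 | 14 | 23 | 0 | 2000 | 20 | 2 | 2 | $79 | 2 | 2 | $79 | $1 | $3 | 0 | 0 | 0 | 0 |
| 102402 11:03:58 | $144.00 | 270 | 3866493657 | 194 | 2 | 5 | 1 | 1500 | 12 | 1 | 1 | $20 | 3 | 1 | $60 | $11 | $42 | 1 | 1 | 1 | 0 |
| … | | | | | | | | | | | | | | | | | | | | | |
| 102502 12:01:34 | $289.08 | 45 | 6474538469 | 579 | 5 | 75 | 0 | 3000 | 30 | 0 | 0 | 0 | 0 | 0 | 0 | $19 | $98 | 0 | 0 | 1 | 0 |
| 102502 12:01:49 | $71.99 | 301 | 3866493657 | 220 | 13 | 34 | 1 | 3300 | 28 | 2 | 1 | $54 | 4 | 1 | $59 | $7 | $22 | 0 | 1 | 0 | 0 |
| 102502 12:03:45 | $38.23 | 219 | 5382638977 | 331 | 1 | 91 | 0 | 2900 | 29 | 0 | 0 | 0 | 0 | 0 | 0 | $4 | $9 | 0 | 1 | 0 | 0 |
| 102502 12:03:58 | $58.84 | 17 | 3866493657 | 430 | 8 | 18 | 0 | 1800 | 16 | 3 | 2 | $55 | 5 | 2 | $58 | $6 | $14 | 1 | 0 | 1 | 1 |
| … | | | | | | | | | | | | | | | | | | | | | |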
page 20© 2002
result: a case set for modeling
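| PurchDt | Amt | Store | Acct | Size | Age | CS | CR | CrLim | Ten | P1 | S1 | A1 | P3 | S3 | A3 | Min | Max | Elec | Vid | Jewl | Frd? |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 102302 11:02:44 | $4.50 | 423 | 8849940044 | 249 | 4 | 33 | 1 | 1000 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | $1 | $3 | 0 | 0 | 0 | 0 |
| 102302 11:02:44 | $88.38 | 221 | 8376636636 | 337 | 9 | 88 | 0 | 4600 | 46 | 1 | 1 | $54 | 1 | 1 | $54 | $9 | $17 | 1 | 1 | 0 | 1 |
| 102302 11:02:45 | $121.33 | 221 | 8376636636 | 893 | 1 | 76 | 0 | 1700 | 15 | 0 | 0 | 0 | 0 | 0 | 0 | $19 | $42 | 0 | 0 | 1 | 0 |
| 102302 11:02:45 | $19.99 | 73 | 3866493657 | 102 | 19 | 43 | 1 | 1700 | 15 | 0 | 0 | 0 | 0 | 0 | 0 | $4 | $9 | 0 | 1 | 0 | 0 |
| … | | | | | | | | | | | | | | | | | | | | | |
| 102402 11:01:01 | $43.84 | 743 | 4674847467 | 219 | 12 | 44 | 0 | 4600 | 89 | 1 | 1 | $121 | 1 | 1 | $121 | $8 | $19 | 1 | 0 | 0 | 1 |
| 102402 11:02:59 | $77.01 | 23 | 5378366284 | 430 | 6 | 90 | 0 | 1000 | 1 | 1 | 1 | $54 | 1 | 1 | $54 | $15 | $22 | 1 | 1 | 0 | 1 |
| 102402 11:02:21 | $11.63 | 189 | 8376636636 | 501 | 14 | 23 | 0 | 2000 | 20 | 2 | 2 | $79 | 2 | 2 | $79 | $1 | $3 | 0 | 0 | 0 | 0 |
| 102402 11:03:58 | $144.00 | 270 | 3866493657 | 194 | 2 | 5 | 1 | 1500 | 12 | 1 | 1 | $20 | 3 | 1 | $60 | $11 | $42 | 1 | 1 | 1 | 1 |
| … | | | | | | | | | | | | | | | | | | | | | |
| 102502 12:01:34 | $289.08 | 45 | 6474538469 | 579 | 5 | 75 | 0 | 3000 | 30 | 0 | 0 | 0 | 0 | 0 | 0 | $19 | $98 | 0 | 0 | 1 | 0 |
| 102502 12:01:49 | $71.99 | 301 | 3866493657 | 220 | 13 | 34 | 1 | 3300 | 28 | 2 | 1 | $54 | 4 | 1 | $59 | $7 | $22 | 0 | 1 | 0 | 0 |
| 102502 12:03:45 | $38.23 | 219 | 5382638977 | 331 | 1 | 91 | 0 | 2900 | 29 | 0 | 0 | 0 | 0 | 0 | 0 | $4 | $9 | 0 | 1 | 0 | 0 |
| 102502 12:03:58 | $58.84 | 17 | 3866493657 | 430 | 8 | 18 | 0 | 1800 | 16 | 3 | 2 | $55 | 5 | 2 | $58 | $6 | $14 | 1 | 0 | 1 | 1 |
| … | | | | | | | | | | | | | | | | | | | | | |

• Hundreds of Attributes
• One Row Per Purchase
• Mix of Fraud and No-Fraud Purchases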
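How purchase-history columns such as P1, S1, A1 and P3, S3, A3 might be derived can be sketched in Python. The slides do not define these columns, so the reading used here is an assumption: for each purchase, the number of earlier purchases (P), distinct stores (S), and total amount (A) on the same account over the previous 1 and 3 days. The function name, field names, and integer `day` values are hypothetical.

```python
from collections import defaultdict


def build_case_set(purchases, windows=(1, 3)):
    """Return one row per purchase with windowed history aggregates for its account."""
    by_acct = defaultdict(list)
    for p in purchases:
        by_acct[p["acct"]].append(p)
    cases = []
    for p in purchases:
        row = dict(p)
        for w in windows:
            # Earlier purchases on the same account, 1..w days before this one.
            prior = [q for q in by_acct[p["acct"]] if 0 < p["day"] - q["day"] <= w]
            # Spread the aggregates across columns, one set per window (the pivot).
            row[f"P{w}"] = len(prior)
            row[f"S{w}"] = len({q["store"] for q in prior})
            row[f"A{w}"] = sum(q["amt"] for q in prior)
        cases.append(row)
    return cases


# Hypothetical purchases; "day" is an integer day number for simplicity.
purchases = [
    {"acct": "A", "day": 1, "store": 221, "amt": 50},
    {"acct": "A", "day": 2, "store": 221, "amt": 30},
    {"acct": "A", "day": 4, "store": 73, "amt": 20},
    {"acct": "B", "day": 4, "store": 17, "amt": 10},
]
cases = build_case_set(purchases)
```

At the scale the slides describe (billions of purchases) this work would be pushed into the database as set-based SQL; the nested loop here only makes the logic visible.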
page 21© 2002
MSI Datamart report summarizing items
page 22© 2002
data transfer tool
• task: transfer case set from data store to mining mart
• design (diagram; labels recovered from the slide)
  – Web browser client, Web server, Web App.
  – connection labels: HTML, HTTP, JDBC
  – Data Store: NonStop SQL/MX
  – Mining Mart: ASCII files, SAS data set
  – two coordinators
  – four parallel streams, each labeled transfer, receive, SAS import
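The idea of a coordinator splitting the case set across several parallel streams, each writing an ASCII file, can be sketched in Python. This is only an illustration of the pattern, not the Genus tool's design: the round-robin split, the thread pool, the CSV layout, and the names `write_partition` and `transfer` are all assumptions.

```python
import csv
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_partition(rows, path):
    """One transfer stream: write its share of the case set as delimited ASCII."""
    with open(path, "w", newline="", encoding="ascii") as handle:
        csv.writer(handle).writerows(rows)
    return len(rows)


def transfer(rows, out_dir, streams=4):
    """Coordinator: split rows round-robin and run one writer per stream."""
    parts = [rows[i::streams] for i in range(streams)]
    paths = [os.path.join(out_dir, f"part{i}.csv") for i in range(streams)]
    with ThreadPoolExecutor(max_workers=streams) as pool:
        counts = list(pool.map(write_partition, parts, paths))
    return paths, counts


# Hypothetical ten-row case set written to a temporary directory.
case_set = [(n, n % 2) for n in range(10)]
out_dir = tempfile.mkdtemp()
paths, counts = transfer(case_set, out_dir)
```

Each resulting file could then be imported independently on the receiving side, which is what lets the import step run in parallel as well.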
page 23© 2002
data transfer specification screen
page 24© 2002
transfer monitoring
page 25© 2002
modeling in SAS Enterprise Miner
© 2002 page 26
model export
• score converter node generates Java model code
• reporter node exports code and HTML report to project directory
page 27© 2002
model deployment tool
• task
  – copy model information to a ZLE Data Store
• design (diagram; labels recovered from the slide)
  – Mining Mart: SAS Enterprise Miner, File/SAS server, SAS Open Metadata server, Model export/registration
  – Data Store: NonStop SQL/MX
  – Web browser client, Web Server, Web App.
  – connection labels: HTML, HTTP, JDBC access, File/registry access
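Copying model information into database tables can be sketched with Python's built-in `sqlite3` module standing in for the data store. The table name `deployed_models`, its columns, and the placeholder score code are hypothetical; the actual schema used by the deployment tool is not shown in the slides.

```python
import sqlite3


def deploy_model(conn, name, version, score_code):
    """Store one model version (name, version, exported score code) in a table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS deployed_models ("
        "name TEXT, version INTEGER, score_code TEXT, "
        "PRIMARY KEY (name, version))"
    )
    conn.execute(
        "INSERT INTO deployed_models (name, version, score_code) VALUES (?, ?, ?)",
        (name, version, score_code),
    )
    conn.commit()


def list_models(conn):
    """Return the (name, version) pairs currently deployed."""
    return conn.execute(
        "SELECT name, version FROM deployed_models ORDER BY name, version"
    ).fetchall()


# In-memory database used only for illustration.
conn = sqlite3.connect(":memory:")
deploy_model(conn, "fraud", 1, "/* exported score code would go here */")
deploy_model(conn, "fraud", 2, "/* exported score code would go here */")
models = list_models(conn)
```

Keeping a version column makes it possible to list, view, and replace deployed models, which matches the kinds of screens the following slides walk through.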
page 28© 2002
starting the model deployment tool
page 29© 2002
connecting to a Data Store
page 30© 2002
a list of models in the Data Store
page 31© 2002
viewing a deployed model
page 32© 2002
selecting a SAS report directory
page 33© 2002
viewing available reports
page 34© 2002
viewing an Enterprise Miner report
page 35© 2002
deploying a model
page 36© 2002
deployment confirmation
page 37© 2002
real-time scoring using the Recommender
(diagram; labels recovered from the slide)
• engines: Aggregation Engine, Scoring Engine, Rules Engine
• Interaction Manager
• data: Customer Data, Aggregate Definitions, Model Aggregates, Deployed Models, Model Scores, Business Rules, Offers / Advice
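One plausible reading of the diagram is a three-step flow: aggregate customer data into model inputs, score them with a deployed model, then apply business rules to produce advice. The Python sketch below follows that assumed order. It is not the Recommender's API; the function names, the feature names, the logistic model, and the weights are all hypothetical.

```python
import math


def aggregate(purchase, history):
    """Aggregation step: derive model inputs from the new purchase and recent history."""
    return {
        "amt": purchase["amt"],
        "recent_count": len(history),
        "recent_total": sum(p["amt"] for p in history),
    }


def score(features, weights, bias):
    """Scoring step: a simple logistic model standing in for a deployed model."""
    z = bias + sum(weights[k] * features[k] for k in weights)
    return 1.0 / (1.0 + math.exp(-z))


def apply_rules(fraud_score, threshold=0.5):
    """Rules step: turn the score into advice for the calling application."""
    return "refer" if fraud_score >= threshold else "approve"


# Hypothetical weights and a hypothetical incoming purchase.
weights = {"amt": 0.01, "recent_count": 0.5}
bias = -3.0
features = aggregate({"amt": 400}, [{"amt": 20}, {"amt": 35}])
fraud_score = score(features, weights, bias)
advice = apply_rules(fraud_score)
```

Separating the three steps means a model or a rule can be replaced in the data store without changing the code that handles the interaction.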
page 38© 2002
how to get the data mining tools
• Product Names
  – Genus Mining Integrator for NonStop SQL (Data Preparation, Data Transfer, and Model Deployment tools)
  – Genus Mart Builder for NonStop SQL (first two tools only)
• Can be ordered through HP, support provided by Genus
• Availability: calendar Q4 2002
• For more information, contact
– [email protected] (Product Manager)
– [email protected] (Program Manager)
– [email protected] (Development)