A Comparison of Leading Data Mining Tools · KDD-98: A Comparison of Leading Data Mining Tools A...

68
KDD-98: A Comparison of Leading Data Mining Tools A Comparison of Leading Data Mining Tools John F. Elder IV & Dean W. Abbott Elder Research Fourth International Conference on Knowledge Discovery & Data Mining Friday, August 28, 1998 New York, New York

Transcript of A Comparison of Leading Data Mining Tools · KDD-98: A Comparison of Leading Data Mining Tools A...

KDD-98: A Comparison of Leading Data Mining Tools

A Comparison of LeadingData Mining Tools

John F. Elder IV & Dean W. AbbottElder Research

Fourth International Conferenceon Knowledge Discovery & Data Mining

Friday, August 28, 1998

New York, New York

updated October 19, 1998© 1998 Elder Research T8-2

KDD-98: A Comparison of Leading Data Mining Tools

Copyright © 1998,John F. Elder IV and Dean W. Abbott

All rights reserved

Manufactured in the United States of America

updated October 19, 1998© 1998 Elder Research T8-3

KDD-98: A Comparison of Leading Data Mining Tools

Contacting Elder Research

Dr. John F. Elder IV

1006 Wildmere Place

Charlottesville, VA 22901

[email protected]

804-973-7673

Fax: 804-995-0064

Dean W. Abbott

3443 Villanova Avenue

San Diego, CA 92122-2310

[email protected]

619-450-0313

http://www.datamininglab.com

updated October 19, 1998© 1998 Elder Research T8-4

KDD-98: A Comparison of Leading Data Mining Tools

Tutorial Goals

• Compare and Summarize Data Mining Tools which:– Offer multiple modeling and classification algorithms

– Support project stages surrounding model construction

– Stand alone

– Are general-purpose

– Cost a lot

– We could get our hands on

• Include some (focused) Desktop Tools

Other Reports: Two Crows, Aberdeen Group, Elder Research(forthcoming ), Data Mining Journal

updated October 19, 1998© 1998 Elder Research T8-5

KDD-98: A Comparison of Leading Data Mining Tools

Topics

• Products covered

• Review of algorithms

• Comparative tables of properties

• Screen shots exemplifying qualities

• Summary of distinctives

updated October 19, 1998© 1998 Elder Research T8-6

KDD-98: A Comparison of Leading Data Mining Tools

Caveats

• We don’t know every tool well (and are sure tohave missed some!)– Level of exposure noted for each tool

• Our background (biasing our perspective)

– Very technical, “early adopters”

– Emphasize solving real-world applications

– More classification than estimation

• Field of tools is quite dynamic– New versions appear regularly

updated October 19, 1998© 1998 Elder Research T8-7

KDD-98: A Comparison of Leading Data Mining Tools

Data Mining Products

PRW

Model 1

updated October 19, 1998© 1998 Elder Research T8-8

KDD-98: A Comparison of Leading Data Mining Tools

Product Company URL

Ver

sion

T

este

d Our Experience

ClementineIntegral Solutions, Ltd.Integral Solutions, Ltd. http://www.isl.co.uk/clem.html 4 ModerateDarwin Thinking Machines, Corp.Thinking Machines, Corp. http://www.think.com/html/products/products.htm 3.0.1 ModerateDataCruncher DataMindDataMind http://www.datamindcorp.com 2.1.1 HighEnterprise Miner SAS InstituteSAS Institute http://www.sas.com/software/components/miner.html Beta ModerateGainSmarts Urban ScienceUrban Science http://www.urbanscience.com/main/gainpage.htm 4.0.3 LowIntelligent Miner IBMIBM http://www.software.ibm.com/data/iminer/ 2 LowMineSet Silicon Graphics, Inc.Silicon Graphics, Inc. http://www.sgi.com/Products/software/MineSet/ 2.5 LowModel 1 Group 1Group 1/Unica Technologies http://www.unica-usa.com/model1.htm 3.1 Moderate

ModelQuest AbTech Corp.AbTech Corp. http://www.abtech.com 1 ModeratePRW UnicaUnica Technologies, Inc. http://www.unica-usa.com/prodinfo.htm 2.1 High

CART Salford Systems http://www.salford-systems.com 3.5 ModerateNeuroShell Ward Systems Group, Inc. http://www.wardsystems.com/neuroshe.htm 3 ModerateOLPARS PAR Government Systems mailto://[email protected] 8.1 HighScenario Cognos http://www.cognos.com/busintell/products/index.html 2 ModerateSee5 RuleQuest Research http://www.rulequest.com/see5-info.html 1.07 ModerateS-Plus MathSoft http://www.mathsoft.com/splus/ 4 HighWizWhy WizSoft http://www.wizsoft.com/why.html 1.1 Moderate

Tools Evaluated

updated October 19, 1998© 1998 Elder Research T8-9

KDD-98: A Comparison of Leading Data Mining Tools

Categories for Comparisons

• Platforms Supported

• Algorithms Included– Decision Trees

– Neural Networks

– Other

• Data Input and Model Output Options

• Usability Ratings

• Visualization Capabilities

• Modeling Automation Methods

updated October 19, 1998© 1998 Elder Research T8-10

KDD-98: A Comparison of Leading Data Mining Tools

Platforms

PC

Sta

nd

alon

e (9

5/N

T)

Un

ix S

tan

dal

one

Un

ix S

erve

r /

P

C C

lien

t

NT

Ser

ver

/

PC

Cli

ent

Dat

abas

e C

onn

ecti

vity

Clementine √√ √+√+ √√Darwin √√ √√DataCruncher √√ √√ √√Enterprise Miner √√ √+√+ √√ √√GainSmarts √√ √√ √√Intelligent Miner √√ √√MineSet √√ √√Model 1 √√ √√ √√ √√ModelQuest √√ √√ √√PRW √√ √√

CART √√ √+√+Scenario √√ √√NeuroShell √√OLPARS √√ √√See5 √√ √+√+S-Plus √√ √−√−WizWhy √√

Keyblank no capability

√– some capability

√ good capability

√+ excellent capability

updated October 19, 1998© 1998 Elder Research T8-11

KDD-98: A Comparison of Leading Data Mining Tools

Tool Groupings

• PC (standalone)

• Flat Files

• One or Two Algorithms

• Data Fits into RAM

Desktop

• Multiple Platforms, Client-Server

• Flat Files or Direct DatabaseAccess

• Multiple Algorithm Types

• Large Databases

High End

updated October 19, 1998© 1998 Elder Research T8-12

KDD-98: A Comparison of Leading Data Mining Tools

• Intuitive Interface– Clear steps in data mining

process

– Non-technical terminology

– Familiar environment

• Descriptive Reporting– Domain terminology

– Graphical representations

End User Perspectives

Business

• Algorithm Options– Knobs to enhance model

performance

• Model Automation– Simplify model design cycle

– Documentation of steps usedin generating models(repeatability)

Technical

updated October 19, 1998© 1998 Elder Research T8-13

KDD-98: A Comparison of Leading Data Mining Tools

OLPARS: Interface

updated October 19, 1998© 1998 Elder Research T8-14

KDD-98: A Comparison of Leading Data Mining Tools

MineSet: Interface

updated October 19, 1998© 1998 Elder Research T8-15

KDD-98: A Comparison of Leading Data Mining Tools

PRW: Experiment Manager

updated October 19, 1998© 1998 Elder Research T8-16

KDD-98: A Comparison of Leading Data Mining Tools

Intelligent Miner: “Explorer” Interface

updated October 19, 1998© 1998 Elder Research T8-17

KDD-98: A Comparison of Leading Data Mining Tools

Enterprise Miner: Visual Interface

updated October 19, 1998© 1998 Elder Research T8-18

KDD-98: A Comparison of Leading Data Mining Tools

Clementine: Visual Interface

updated October 19, 1998© 1998 Elder Research T8-19

KDD-98: A Comparison of Leading Data Mining Tools

DataCruncher: Process Flow Diagram

updated October 19, 1998© 1998 Elder Research T8-20

KDD-98: A Comparison of Leading Data Mining Tools

Data Input & Model Output

Automatic Header

Save Data

FormatODBC

Native Database Drivers

Summary Reports

Output Source Code

Clementine √√ √√ √√Darwin √√ √√ √√DataCruncher √√ √√ √√ √√ √√Enterprise Miner √−√− √√ √√ √−√− √√GainSmarts √√ √√ √√ √√ √√Intelligent Miner √−√− √√MineSet √√ √√Model 1 √√ √√ √√ √√ √√ √√ModelQuest √√ √√ √√ √√PRW √√ √√ √√ √√ √√

CART √√Scenario √√ √√NeuroShell √√OLPARS √√See5 √−√−S-Plus √√ √√ √√ √√WizWhy √√ √√

updated October 19, 1998© 1998 Elder Research T8-21

KDD-98: A Comparison of Leading Data Mining Tools

PRW: Row Splitting

updated October 19, 1998© 1998 Elder Research T8-22

KDD-98: A Comparison of Leading Data Mining Tools

Model1: Variable Selection

updated October 19, 1998© 1998 Elder Research T8-23

KDD-98: A Comparison of Leading Data Mining Tools

Model1: Data Definitions

updated October 19, 1998© 1998 Elder Research T8-24

KDD-98: A Comparison of Leading Data Mining Tools

Scenario : Special Data Types

updated October 19, 1998© 1998 Elder Research T8-25

KDD-98: A Comparison of Leading Data Mining Tools

a > 4n y

b > 2n y

2 3a > 1n y

1 0

b>3.5n y

2

Decision Trees

updated October 19, 1998© 1998 Elder Research T8-26

KDD-98: A Comparison of Leading Data Mining Tools

Polynomial Networks

a N1

k N9

z1

z9 Double 16

f N6z6

d N4z4

h N8

e N5

Layer 0Layer 1

Layer 2

Single 14

Triple 21

Triple 17

Double 20

z5

z8

Multi-Linear 15

Double 19z20

z17

z14

z16

z19

z15U2 Y1

U7 Y2

Layer 3

(Normalizers)

Unitizers

z21

Z17

= 3.1 + 0.4a - .15b2

+ 0.9bc - 0.62abc + 0.5c3

updated October 19, 1998© 1998 Elder Research T8-27

KDD-98: A Comparison of Leading Data Mining Tools

Regression

Polynomial Networks (e.g. GMDH, ASPN)

Decision Trees(e.g., CART, CHAID, C5)

Logistic or SigmoidalNetworks (ANNs)

Hinging Hyperplanes, MARS

orders, terms

∑ ∑

“Consensus” ModelsParametrically Summarize Data Points

updated October 19, 1998© 1998 Elder Research T8-28

KDD-98: A Comparison of Leading Data Mining Tools

“Consensus” Models (continued)

Histogram

Radial Basis Function

Wavelets

orientation, bin width

family, order

function

updated October 19, 1998© 1998 Elder Research T8-29

KDD-98: A Comparison of Leading Data Mining Tools

Kernels

k-Nearest Neighbor

Delaunay Planes

shape, spread

k, distance metric

Projection Pursuit RegressionSpread, index

Goal, iterations

“Contributory” Modelsretain data points; each potentially affects estimate at new point

updated October 19, 1998© 1998 Elder Research T8-30

KDD-98: A Comparison of Leading Data Mining Tools

Properties of Algorithms

Algorithm Accurate Scalable Interpret-able

Useable Robust Versatile Fast Hot

Classical(LR, LDA) – C C– C – – C D

NeuralNetworks

C D D D – D DD CVisualization C DD C C CC D DDD C–

DecisionTrees

D C C C– C C C– C–PolynomialNetworks

C – D C– –D – –D –K-NearestNeighbors

D DD C– – –D D C DKernels C DD D –D D D C D

KeyC good

– neutral

D bad

updated October 19, 1998© 1998 Elder Research T8-31

KDD-98: A Comparison of Leading Data Mining Tools

Algorithms

Dec

isio

n T

rees

Lin

ear/

Stat

isti

cal

Mul

ti-l

ayer

Per

cept

rons

Nea

rest

Nei

ghbo

r

Rad

ial B

asis

Fun

ctio

ns

Bay

es

Rul

e In

duct

ion

Pol

ynom

ial N

etw

orks

Gen

eral

ized

Lin

ear

Mod

els

Tim

e Se

ries

Sequ

enti

al D

isco

very

K M

eans

Ass

ocia

tion

Rul

es

Koh

onen

Clementine √√ √√ √√ √√ √√ √√ √√Darwin √√ √√ √√Datamind √√Enterprise Miner √√ √√ √√ √√ √√ √√ √√ √√GainSmarts √√ √+√+Intelligent Miner √√ √−√− √√ √−√− √√ √√ √+√+ √√MineSet √√ √√ √√ √√Model 1 √+√+ √√ √√ √√ModelQuest √√ √√ √√ √√ √−√−PRW √+√+ √√ √√ √√ √√ √√

CART √√Cognos √√NeuroShell √+√+ √√ √−√−OLPARS √√ √√ √√ √√ √√ √√ √√See5 √√ √√SPlus √√ √+√+ √√ √√ √√WizWhy √√

updated October 19, 1998© 1998 Elder Research T8-32

KDD-98: A Comparison of Leading Data Mining Tools

Multi-Layer Perceptrons

Lea

rnin

g R

ate

Lea

rnin

g R

ate

Dec

ay

Mom

entu

m

Mul

tipl

e A

ctiv

atio

n F

unct

ions

Mul

tipl

e St

op C

rite

ria

Cro

ss-V

alid

atio

n

Nor

mal

ize

Inpu

ts

Adv

ance

d L

earn

ing

Alg

.

Oth

er C

ost

func

tion

s

Aut

omat

ic M

odel

Sel

ecti

on

Net

wor

k V

isua

l

Par

amet

er S

umm

ary

Clementine √√ √√ √√ √√Darwin √√ √√ √√ √√ √√ √√Enterprise Miner √√ √√ √√ √√ √√ √√ √√ √√ √√ √√Intelligent Miner √√ √√Model 1 √√ √√ √√ √√ √√ √√ √√PRW √√ √√ √√ √√ √√ √√ √√ √√

NeuroShell √√ √√ √√ √√ √√OLPARS √√ √√ √√ √√ √√ √√ √√

updated October 19, 1998© 1998 Elder Research T8-33

KDD-98: A Comparison of Leading Data Mining Tools

Enterprise Miner: Neural NetworkTraining

updated October 19, 1998© 1998 Elder Research T8-34

KDD-98: A Comparison of Leading Data Mining Tools

PRW: Neural Network Training

updated October 19, 1998© 1998 Elder Research T8-35

KDD-98: A Comparison of Leading Data Mining Tools

Decision Trees

"CA

RT

"

C5

or C

4.5

CH

AID

Oth

er

Pri

ors

Cla

ssif

icat

ion

Cos

ts

Mis

sing

Dat

a

Pru

ning

Sev

erit

y

Vis

ual T

rees

Clementine √√ √√ √√ √√ √−√−Darwin √√ √√ √√ √√Enterprise Miner √√ √−√− √√ √+√+ √√ √√ √√ √√GainSmarts √√ √√ √√ √√ √√Intelligent Miner √√ √√ √√MineSet √√ √√ √√ √√ √√ √√Model 1 √√ √√ √−√−ModelQuest √−√− √√ √√

CART √+√+ √√ √√ √√ √√Scenario √√ √√S-Plus √√ √√ √√ √√See5 √+√+ √√ √√ √√

updated October 19, 1998© 1998 Elder Research T8-36

KDD-98: A Comparison of Leading Data Mining Tools

CART: Tree Browsing

updated October 19, 1998© 1998 Elder Research T8-37

KDD-98: A Comparison of Leading Data Mining Tools

Scenario: Tree Browsing

updated October 19, 1998© 1998 Elder Research T8-38

KDD-98: A Comparison of Leading Data Mining Tools

Enterprise Miner: Tree Browsing

updated October 19, 1998© 1998 Elder Research T8-39

KDD-98: A Comparison of Leading Data Mining Tools

Enterprise Miner: Tree Results

updated October 19, 1998© 1998 Elder Research T8-40

KDD-98: A Comparison of Leading Data Mining Tools

MineSet: Tree Browser

updated October 19, 1998© 1998 Elder Research T8-41

KDD-98: A Comparison of Leading Data Mining Tools

Regression / Stats Linear LogisticComplexity

PenaltyCross-

ValidationInput

SelectionFactor

Analysis

Clementine YClementine √√Enterprise Miner √+√+ √+√+ √√ √√ √√ √√GainSmarts √+√+ √+√+ √√Intelligent Miner √−√− √√ √√MineSet √√Model 1 √√ √√ √√ √+√+ModelQuest Enterprise √√ √√ √√ √√ √√PRW √√ √√ √√ √+√+S-Plus √√ √+√+ √√ √√ √√ √√S-Plus √+√+ √+√+ √√ √√ √√ √√Scenario √√

updated October 19, 1998© 1998 Elder Research T8-42

KDD-98: A Comparison of Leading Data Mining Tools

MineSet: Bayes Distributions

updated October 19, 1998© 1998 Elder Research T8-43

KDD-98: A Comparison of Leading Data Mining Tools

Enterprise Miner: Clustering Results

updated October 19, 1998© 1998 Elder Research T8-44

KDD-98: A Comparison of Leading Data Mining Tools

Intelligent Miner: Clustering Results

updated October 19, 1998© 1998 Elder Research T8-45

KDD-98: A Comparison of Leading Data Mining Tools

Intelligent Miner: Cluster Explode

updated October 19, 1998© 1998 Elder Research T8-46

KDD-98: A Comparison of Leading Data Mining Tools

UsabilityData Loading and

ManipulationModel Building

Model Understanding

Technical Support

Overall

Clementine √+√+ √+√+ √+√+ √+√+ √+√+Darwin √√ √√ √+√+ √√ √√DataCruncher √+√+ √+√+ √√ √√ √√Enterprise Miner √√ √√ √√ √√ √√GainSmarts √+√+ √√ √√ √√ √√Intelligent Miner √√ √√ √√ √√ √√MineSet √√ √+√+ √+√+ √√ √+√+Model 1 √+√+ √+√+ √+√+ √+√+ √+√+ModelQuest Enterprise √√ √+√+ √+√+ √+√+ √+√+PRW √+√+ √+√+ √+√+ √+√+ √+√+

CART √−√− √√ √√ √√ √√Scenario √√ √+√+ √+√+ √√ √+√+NeuroShell √√ √√ √√ √√ √√OLPARS √−√− √√ √√ √√ √√See5 √√ √√ √√ √√ √√S-Plus √√ √√ √+√+ √√ √√WizWhy √√ √√ √+√+ √√ √√

updated October 19, 1998© 1998 Elder Research T8-47

KDD-98: A Comparison of Leading Data Mining Tools

Intelligent Miner: Statistics Report

updated October 19, 1998© 1998 Elder Research T8-48

KDD-98: A Comparison of Leading Data Mining Tools

Scenario: Result Reporting

updated October 19, 1998© 1998 Elder Research T8-49

KDD-98: A Comparison of Leading Data Mining Tools

WizWhy: Reporting

updated October 19, 1998© 1998 Elder Research T8-50

KDD-98: A Comparison of Leading Data Mining Tools

DataCruncher: Output Sensitivities

updated October 19, 1998© 1998 Elder Research T8-51

KDD-98: A Comparison of Leading Data Mining Tools

Darwin: Predictions

updated October 19, 1998© 1998 Elder Research T8-52

KDD-98: A Comparison of Leading Data Mining Tools

Darwin: Lift Chart (Excel)

updated October 19, 1998© 1998 Elder Research T8-53

KDD-98: A Comparison of Leading Data Mining Tools

CART: Gains Chart

updated October 19, 1998© 1998 Elder Research T8-54

KDD-98: A Comparison of Leading Data Mining Tools

Visualization HistogramsPie

Charts

Scatter/Line Plots

Rotating Scatter

Conditional Plots

Classification Decision Regions

Correlation Plots

Clementine √√ √√ √√ √−√− √√Darwin √−√− √−√− √−√−DataCruncher √√ √√ √√ √√Enterprise Miner √√ √√ √√ √−√− √√ √√GainSmarts √−√− √−√−Intelligent Miner √√ √√ √√ √√MineSet √√ √√ √√ √√ √√Model 1 √√ √√ √√ModelQuest Enterprise √√ √√PRW √√ √√ √√

CARTScenario √√NeuroShell √√OLPARS √√ √√ √√ √−√− √√ √√See5 √√S-Plus √√ √√ √√ √√ √√WizWhy

updated October 19, 1998© 1998 Elder Research T8-55

KDD-98: A Comparison of Leading Data Mining Tools

Clementine: Visualization

User-created sub-region

updated October 19, 1998© 1998 Elder Research T8-56

KDD-98: A Comparison of Leading Data Mining Tools

OLPARS: Visualization

updated October 19, 1998© 1998 Elder Research T8-57

KDD-98: A Comparison of Leading Data Mining Tools

OLPARS: Decision Space

updated October 19, 1998© 1998 Elder Research T8-58

KDD-98: A Comparison of Leading Data Mining Tools

Model 1: Target Sensitivities

updated October 19, 1998© 1998 Elder Research T8-59

KDD-98: A Comparison of Leading Data Mining Tools

MineSet: Geographical Visualization

updated October 19, 1998© 1998 Elder Research T8-60

KDD-98: A Comparison of Leading Data Mining Tools

Automation Method of AutomationFree Text

Annotation of Steps

Clementine Visual Programming, Programming Language √√Darwin Programming Language √√DataCruncher (Task manager)Enterprise Miner Visual Programming, Programming Language √√GainSmarts Macro Language, Wizards √−√−Intelligent Miner (Wizards)MineSet Data History, LogModel 1 Model WizardModelQuest Batch AgendaPRW Experiement Manager; Macros √√

CART Built-in Basic ScriptingScenarioNeuroShellOLPARSSee5S-Plus Scripting (S); C/C++WizWhy

updated October 19, 1998© 1998 Elder Research T8-61

KDD-98: A Comparison of Leading Data Mining Tools

PRW: Experiment Manager

updated October 19, 1998© 1998 Elder Research T8-62

KDD-98: A Comparison of Leading Data Mining Tools

Model 1: Model Summary

updated October 19, 1998© 1998 Elder Research T8-63

KDD-98: A Comparison of Leading Data Mining Tools

A Recent Breakthrough: Bundling

• Case Weights

• Data Values

• Guiding Parameters

• Variable Subsets

1) Construct varied models, and2) Combine their estimates

Generate component models by varying:

Combine estimates using:

• Estimator Weights

• Voting

• Advisor Perceptrons

• Partitions of Design Space

updated October 19, 1998© 1998 Elder Research T8-64

KDD-98: A Comparison of Leading Data Mining Tools

Example Bundling Techniques

• Bayes: sum estimates of possible models, weighted by priors

• GMDH (Ivakhenko 68) -- multiple layers of quadraticpolynomials, using two inputs each, fit by LR

• Stacking (Wolpert 92) -- train a 2nd-level (LR) model usingleave-1-out estimates of 1st-level (neural net) models

• Bagging (Breiman 96) (bootstrap aggregating) -- bootstrap data(to build trees mostly); take majority vote or average

• Bumping (Tibshirani 97) -- bootstrap, select single best

• Boosting (Freund & Shapire 96) -- weight error cases by βτ =(1-e(t))/e(t), iteratively re-model; weight model t by ln(βτ)

• Crumpling (Anderson & Elder 98) -- average cross-validations

• Born-Again (Breiman 98) -- invent new X data...

updated October 19, 1998© 1998 Elder Research T8-65

KDD-98: A Comparison of Leading Data Mining Tools

Distinctives Strengths Weaknesses

Clementine vis ual inte rface ; algorithm bread th sca lability

Darwin e ffic ient c lie nt-s e rve r; intuitive inte rface options no uns upe rvis e d ; limited vis ualization

DataCruncher e a s e o f us e s ingle a lgorithm

Enterprise Miner depth of algorithms ; visual inte rface harde r to use ; ne w product is s ue s

GainSmarts data trans formations , built on SAS ; a lgorithm option depth no uns upe rvis e d ; limited vis ualization

Intelligent Miner algorithm breadth; graphical tre e /clus te r output fe w a lgorithm options ; no automation

MineSet data visualization fe w a lgorithms; no model e xport

Model 1 e a s e o f us e ; automated model discove ry rea lly a ve rtical tool

ModelQuest breadth of algorithms some non-intuitive inte rface options

PRW e xtens ive a lgorithms; automated model s e le c tion limited visualization

CART depth of tree op tions difficult file I/O; limited visualization

Scenario e a s e o f us e narrow analysis path

NeuroShell multiple neural ne twork architec ture s unorthodox inte rface ; only neural ne tworks

OLPARS multiple s tatis tical algorithms; cla s s -bas e d vis ualization dated inte rface ; difficult file I/O

See5 depth of tree op tions limited visualization; fe w data options

S-Plus depth of algorithms ; visualization; programable /e xte ndable limited inductive m e thods ; s te e p lea rning curve

WizWhy e a s e o f us e ; e a s e o f mode l unde rs tanding limited visualization

updated October 19, 1998© 1998 Elder Research T8-66

KDD-98: A Comparison of Leading Data Mining Tools

Closing Observations

• Data Mining Tools Can:– Enhance inference process

– Speed up design cycle

• Data Mining Tools Can Not:– Substitute for statistical and domain expertise

• Users are advised to:– Get training on tools

– Be alert for product upgrades

updated October 19, 1998© 1998 Elder Research T8-67

KDD-98: A Comparison of Leading Data Mining Tools

Index to Tools(Page numbers in italics refer to screen captures)

Clementine 7, 8, 10, 18 , 20, 31, 32, 35, 41, 46, 54, 55 , 60, 65Darwin 7, 8, 10, 20, 31, 32, 35, 46, 51 , 52 , 54, 60, 65DataCruncher 7, 8, 10, 19 , 20, 31, 46, 50 , 54, 60, 65Enterprise Miner 7, 8, 10, 17 , 20, 31, 32, 33 , 35, 38 , 39 , 41, 43 , 46, 54, 60, 65Gainsmarts 7, 8, 10, 20, 31, 35, 41, 46, 54, 60, 65Intelligent Miner 7, 8, 10, 16 , 20, 31, 32, 35, 41, 44 , 45 , 46, 47 , 54, 60, 65MineSet 7, 8, 10, 14 , 20, 31, 35, 40 , 41, 42 , 46, 54, 59 , 60, 65Model1 7, 8, 10, 20, 22 , 23 , 31, 32, 35, 41, 46, 54, 58 , 60, 62 , 65ModelQuest 7, 8, 10, 20, 31, 35, 41, 46, 54, 60, 65PRW 7, 8, 10, 15 , 20, 21 , 31, 32, 34 , 41, 46, 54, 60, 61 , 65

CART 7, 8, 10, 20, 31, 35, 36 , 46, 53 , 54, 60, 65Neuroshell 7, 8, 10, 20, 31, 32, 46, 54, 60, 65pcOLPARS 7, 8, 10, 13 , 20, 31, 32, 46, 54, 56 , 57 , 60, 65Scenario 7, 8, 10, 20, 24 , 31, 35, 37 , 41, 46, 48 , 54, 60, 65See5 7, 8, 10, 20, 31, 35, 46, 54, 60, 65S-Plus 7, 8, 10, 20, 31, 35, 41, 46, 54, 60, 65WizWhy 7, 8, 10, 20, 31, 46, 49 , 54, 60, 65

updated October 19, 1998© 1998 Elder Research T8-68

KDD-98: A Comparison of Leading Data Mining Tools

Forthcoming Report

• Report provides detailed comparison of high-enddata mining tools, including capabilities, ease ofuse, and practical tips.

• Available for $695 from Elder Research(http://www.datamininglab.com), Q4 1998.

• Purchasers receive brief free consulting session toexplore report findings in more detail, if desired.

Note: The analyses and reviews were performed completely independently,and were made possible by the cooperation of the vendors, for which ElderResearch is very grateful. The companies, however, provided no financialsupport, and had no influence on its editorial content.