Supporting Decision Making
A Framework for IS Management
Introduction (2)
Most computer systems support decision making because all software programs involve automating decision steps that people would take
Decision making is a process that involves a variety of activities, most of which handle information
A wide variety of computer-based tools and approaches can be used to confront the problem at hand and work through its solution
Introduction (3)
Computer technologies that support decision making
Decision support systems (DSSs)
Data mining
Executive information systems (EISs)
Expert systems (ESs)
Agent-based modeling
Multidisciplinary foundations for DS technologies:
Database research, artificial intelligence, statistical inference, human-computer interaction, simulation methods, software engineering, etc.
Case Example---A Problem-Solving Scenario
Using an EIS to discover a sales shortfall in one region
Investigate several possible causes:
Economic conditions
Competitive analysis
Written sales reports
A data mining analysis
Result: no clear problems revealed
Decision Support Systems---History
Two contributing areas of research in 1950s-1960s
Organizational decision making at CMU
Interactive computer systems at MIT
Mid-1970s: single-user, model-oriented DSS
Mid- to late 1980s: EIS, GDSS, ODSS
1990s: data warehousing and OLAP
Late 1990s-2000s: data mining, Web-based analytical applications
What is a DSS?
A DSS aims to use IT to relieve humans of some decision making or help us make more informed decisions
Systems that support, not replace, managers in their decision-making activities
DSSs are defined as:
Computer-based systems
That help decision makers
Confront ill-structured problems
Through direct interaction
With data and analysis models
DSS Architecture (1)
DSS Architecture (2)
The Dialog Component
Linking the user to the system
The Data Component
Data sources: use all the important data sources within and outside the organization in the form of summarized data (DW & DM)
The Model Component
Models provide the analysis capabilities for a DSS
Using a mathematical representation of the problem, algorithmic processes are employed to generate information to support decision making
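As a minimal sketch of what a model component does, the following runs a what-if analysis over a toy profit model. The linear demand curve, the cost figures, and every function name are illustrative assumptions, not part of the original slides.

```python
def demand(price, base_units=1000, sensitivity=50):
    """Assumed linear demand: each extra unit of price loses `sensitivity` sales."""
    return max(0, base_units - sensitivity * price)


def profit(price, unit_cost, fixed_cost):
    """Hypothetical profit model: margin times units sold, minus fixed cost."""
    return (price - unit_cost) * demand(price) - fixed_cost


def what_if(prices, unit_cost=4, fixed_cost=1500):
    """Evaluate the model for each candidate price; return (price, profit) pairs."""
    return [(price, profit(price, unit_cost, fixed_cost)) for price in prices]


scenarios = what_if(prices=[6, 8, 10, 12, 14])
best_price, best_profit = max(scenarios, key=lambda pair: pair[1])
print(scenarios)
print("best price:", best_price, "profit:", best_profit)
```

The decision maker stays in control: the model only generates the numbers for each scenario, and changing an assumption (say, `sensitivity`) is a one-line edit followed by a rerun.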
A Taxonomy of DSS
Using the mode of assistance as the criterion:
A model-driven DSS
A communication-driven DSS
A data-driven DSS or data-oriented DSS
A document-driven DSS
A knowledge-driven DSS
Executive Information System (1)
The emphasis of EIS is on graphical displays and easy-to-use user interfaces
EIS can be viewed as a DSS that:
Provides access to summary performance data
Uses graphics to display and visualize the data in an easy-to-use fashion, and
Has a minimum of analysis for modeling beyond the capability to "drill down" in summary data to examine components
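The "drill down" capability can be sketched in a few lines: start from a regional summary, find the weakest region, then break that region into its components. The sales records and function names below are made up for illustration.

```python
from collections import defaultdict

# Hypothetical records: (region, product, actual sales, target sales)
SALES = [
    ("North", "Widgets", 120, 100),
    ("North", "Gadgets", 80, 75),
    ("South", "Widgets", 60, 100),
    ("South", "Gadgets", 70, 70),
]


def summarize(records, key_index):
    """Return actual-minus-target totals grouped by the chosen column."""
    totals = defaultdict(lambda: [0, 0])
    for record in records:
        totals[record[key_index]][0] += record[2]
        totals[record[key_index]][1] += record[3]
    return {key: actual - target for key, (actual, target) in totals.items()}


def drill_down(records, region):
    """Break one region's summary figure into its per-product components."""
    in_region = [record for record in records if record[0] == region]
    return summarize(in_region, key_index=1)


by_region = summarize(SALES, key_index=0)          # summary view
worst_region = min(by_region, key=by_region.get)   # where is the shortfall?
detail = drill_down(SALES, worst_region)           # what is behind it?
print(by_region, worst_region, detail)
```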
Executive Information System (2)
EISs aim to provide both internal and external information relevant to meeting the strategic goals of the organization
Gauge company performance
Scan the environment
EIS and data warehousing technologies are converging in the marketplace
The term EIS has lost popularity in favor of Business Intelligence
Data Mining: Motivations
The explosive growth of data: from terabytes to petabytes
Data collection and data availability
Automated data collection tools, database systems, Web, computerized society
Major sources of abundant data:
Business: Web, e-commerce, transactions, stocks, …
Science: remote sensing, bioinformatics, …
Society and everyone: news, digital cameras, YouTube
We are drowning in data, but starving for knowledge!
“Necessity is the mother of invention”: data mining, the automated analysis of massive data sets
What Is Data Mining?
Data mining (knowledge discovery from data):
Extraction of interesting patterns or knowledge from huge amounts of data
Alternative names:
Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.
Watch out: is everything “data mining”? Not, for example:
Simple search and query processing
(Deductive) expert systems
Knowledge Discovery (KDD) Process
Data mining: the core of the knowledge discovery process
[Figure: the KDD process. Databases -> data cleaning and data integration -> data warehouse -> selection -> task-relevant data -> data mining -> pattern evaluation]
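The stages of the KDD process can be sketched as a chain of small functions. This is a toy illustration only: the records, the field names, and the "mining" step (a simple frequency count) are assumptions chosen to keep the example short.

```python
def integrate(*sources):
    """Data integration: combine records from several sources."""
    return [record for source in sources for record in source]


def clean(records):
    """Data cleaning: drop records that contain missing values."""
    return [record for record in records if None not in record.values()]


def select(records, fields):
    """Selection: keep only the task-relevant fields."""
    return [{field: record[field] for field in fields} for record in records]


def mine(records, field):
    """Data mining (toy version): count how often each value occurs."""
    counts = {}
    for record in records:
        counts[record[field]] = counts.get(record[field], 0) + 1
    return counts


def evaluate(patterns, min_count):
    """Pattern evaluation: keep only patterns frequent enough to matter."""
    return {value: n for value, n in patterns.items() if n >= min_count}


# Hypothetical source databases
store_a = [{"customer": 1, "item": "beer"}, {"customer": 2, "item": None}]
store_b = [{"customer": 3, "item": "beer"}, {"customer": 4, "item": "milk"}]

warehouse = clean(integrate(store_a, store_b))
task_data = select(warehouse, fields=["item"])
patterns = evaluate(mine(task_data, "item"), min_count=2)
print(patterns)
```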
Architecture: A Typical Data Mining System
[Figure: components of a typical data mining system: graphical user interface; pattern evaluation; data mining engine; knowledge base; database or data warehouse server; data cleaning, integration, and selection; data sources (database, data warehouse, World-Wide Web, other information repositories)]
Data Mining: Confluence of Multiple Disciplines
[Figure: data mining at the confluence of database technology, statistics, machine learning, pattern recognition, algorithms, visualization, and other disciplines]
Why Not Traditional Data Analysis?
Tremendous amount of data
Algorithms must be highly scalable to handle terabytes of data
High dimensionality of data
A micro-array may have tens of thousands of dimensions
High complexity of data:
Data streams and sensor data
Time-series data, temporal data, sequence data
Structured data, graphs, social networks and multi-linked data
Heterogeneous databases and legacy databases
Spatial, spatiotemporal, multimedia, text and Web data
Software programs, scientific simulations
New and sophisticated applications
Multi-Dimensional View of Data Mining (1)
Data to be mined:
Relational, data warehouse, transactional, stream, object-oriented/relational, active, spatial, time-series, text, multi-media, heterogeneous, legacy, WWW
Knowledge to be mined:
Characterization, discrimination, association, classification, clustering, trend/deviation, outlier analysis, etc.
Multiple/integrated functions and mining at multiple levels
Multi-Dimensional View of Data Mining (2)
Techniques utilized:
Database-oriented, data warehouse (OLAP), machine learning, statistics, visualization, etc.
Applications adapted:
Retail, telecommunication, banking, fraud analysis, bio-data mining, stock market analysis, text mining, Web mining, etc.
Data Mining Functionalities (1)
Multidimensional concept description: characterization and discrimination
Generalize, summarize, and contrast data characteristics, e.g., dry vs. wet regions
Frequent patterns, association, correlation vs. causality
Diaper -> Beer [support 0.5%, confidence 75%]
Classification and prediction
Construct models (functions) that describe and distinguish classes or concepts for future prediction
E.g., classify countries based on (climate), or classify cars based on (gas mileage)
Predict some unknown or missing numerical values
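The two bracketed numbers on the diaper and beer rule are its support (the share of all transactions containing both items) and its confidence (the share of diaper transactions that also contain beer). A minimal sketch of both measures follows; the five baskets are made-up data, so the values differ from the slide's 0.5% and 75%, and the code assumes the antecedent occurs at least once.

```python
def count(transactions, itemset):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for basket in transactions if itemset <= set(basket))


def support(transactions, itemset):
    """Fraction of all transactions containing the itemset."""
    return count(transactions, itemset) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Fraction of antecedent transactions that also contain the consequent."""
    both = count(transactions, set(antecedent) | set(consequent))
    return both / count(transactions, antecedent)


# Hypothetical market baskets
baskets = [
    {"diaper", "beer", "milk"},
    {"diaper", "beer"},
    {"diaper", "bread"},
    {"milk", "bread"},
    {"diaper", "beer", "bread"},
]

rule_support = support(baskets, {"diaper", "beer"})
rule_confidence = confidence(baskets, {"diaper"}, {"beer"})
print(f"Diaper -> Beer [{rule_support:.0%}, {rule_confidence:.0%}]")
```

A high confidence says the items co-occur, not that one purchase causes the other, which is the "correlation vs. causality" caution on the slide.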
Data Mining Functionalities (2)
Cluster analysis
Class label is unknown: group data to form new classes, e.g., cluster houses to find distribution patterns
Maximizing intra-class similarity and minimizing interclass similarity
Outlier analysis
Outlier: a data object that does not comply with the general behavior of the data
Noise or exception? Useful in fraud detection and rare-event analysis
Trend and evolution analysis
Trend and deviation: e.g., regression analysis
Periodicity analysis
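As one simple illustration of outlier analysis, the sketch below flags values that lie more than a chosen number of standard deviations from the mean (a z-score test). This is only one of many possible techniques; the threshold of 2.0 and the transaction amounts are assumptions for the example.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=2.0):
    """Return values whose distance from the mean exceeds `threshold` std devs."""
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:  # all values identical: nothing can stand out
        return []
    return [v for v in values if abs(v - center) / spread > threshold]


# Hypothetical card transaction amounts; one looks unlike the rest
amounts = [20, 22, 19, 21, 23, 20, 500]
suspicious = find_outliers(amounts)
print(suspicious)
```

Whether a flagged value is noise to discard or an exception worth investigating (fraud, a rare event) is a judgment the technique itself cannot make.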
Major Issues in Data Mining (1)
Mining methodology:
Mining different kinds of knowledge from diverse data types, e.g., bio, stream, Web
Performance: efficiency, effectiveness, and scalability
Pattern evaluation: the interestingness problem
Incorporation of background knowledge
Handling noise and incomplete data
Parallel, distributed and incremental mining methods
Integration of the discovered knowledge with existing knowledge: knowledge fusion
Major Issues in Data Mining (2)
User interaction:
Data mining query languages and ad-hoc mining
Expression and visualization of data mining results
Interactive mining of knowledge at multiple levels of abstraction
Applications and social impacts:
Domain-specific data mining and invisible data mining
Protection of data security, integrity, and privacy
Artificial Intelligence (1)
AI is a group of technologies that attempts to mimic our senses and emulate certain aspects of human behavior such as reasoning and communication
1956: a conference at Dartmouth College
John McCarthy, Marvin Minsky, Allen Newell and Herbert Simon (MIT, CMU and Stanford)
1965, H. A. Simon: "machines will be capable, within twenty years, of doing any work a man can do"
1967, Marvin Minsky: "Within a generation ... the problem of creating 'artificial intelligence' will substantially be solved"
Heavily funded by DARPA
Artificial Intelligence (2)
They had failed to recognize the difficulty of some of the problems they faced:
The lack of raw computing power
The intractable combinatorial explosion of their algorithms
The difficulty of representing commonsense knowledge and doing commonsense reasoning
The incredible difficulty of perception and motion
The failings of logic
First AI Winter
In 1974, DARPA cut off all undirected, exploratory research in AI
Artificial Intelligence (3)
In the early 80s, the field was revived by the commercial success of expert systems
By 1985 the market for AI had reached more than a billion dollars
Minsky and others warned the community that enthusiasm for AI had spiraled out of control and that disappointment was sure to follow
Second AI Winter
The collapse of the Lisp Machine market in 1987
Artificial Intelligence (4)
In the 90s AI achieved its greatest successes
Artificial intelligence was adopted throughout the technology industry, providing the heavy lifting for:
Data mining
Logistics
Medical diagnosis
…
Expert System
An expert system is an automated type of analysis or problem-solving model that deals with a problem the way an "expert" does
The process involves consulting a base of knowledge or expertise to reason out an answer based on the characteristics of the problem
Architecture of an ES
[Figure: architecture of an ES. Components: user, user interface, inference engine, knowledge base. Flows: description of a problem (from the user); advice and explanation (to the user)]
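The interplay of knowledge base, inference engine, and user interface can be sketched with a tiny forward-chaining rule engine. This is an illustrative toy, not a description of any particular expert system shell; the rules, fact names, and the `advise:` prefix are all assumptions.

```python
# Knowledge base: each rule is (facts that must hold, fact to conclude).
RULES = [
    ({"sales down", "competitor price cut"}, "losing on price"),
    ({"losing on price"}, "advise: review regional pricing"),
    ({"sales down", "economy weak"}, "advise: adjust the forecast"),
]


def infer(facts, rules):
    """Inference engine: apply rules repeatedly until nothing new is concluded."""
    known = set(facts)
    explanation = []
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                explanation.append((sorted(conditions), conclusion))
                changed = True
    return known, explanation


def consult(problem_description):
    """User interface: problem description in, advice and explanation out."""
    known, explanation = infer(problem_description, RULES)
    advice = sorted(fact for fact in known if fact.startswith("advise:"))
    return advice, explanation


advice, why = consult({"sales down", "competitor price cut"})
print(advice)
for conditions, conclusion in why:
    print("because", conditions, "=>", conclusion)
```

Keeping the rules in data rather than in code is the point of the architecture: the expertise can be extended without touching the inference engine.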
Knowledge Representation
In AI, the primary aim of knowledge representation is to store knowledge so that programs can process it and achieve the verisimilitude of human intelligence
The representation theory has its origin in cognitive science
Knowledge can be represented in a number of ways
Case-based reasoning
Artificial neural networks
Stored as rules
Case-based Reasoning (1)
Case-based reasoning:
The process of solving new problems based on the solutions of similar past problems
A case consists of a problem, its solution, and, typically, annotations about how the solution was derived
Case-based Reasoning (2)
Case-based reasoning as a four-step process:
Retrieve: given a target problem, retrieve cases from memory that are relevant to solving it
Reuse: map the solution from the previous case to the target problem
Revise: test the new solution and, if necessary, revise it
Retain: after the solution has been successfully adapted to the target problem, store the resulting experience as a new case in memory
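The four steps above can be sketched as four small functions. Everything here is a simplifying assumption made for illustration: cases are dictionaries, similarity is the number of matching attribute values, and the `works` and `repair` callables stand in for real-world testing and adaptation.

```python
def similarity(problem_a, problem_b):
    """Count the attribute values that two problem descriptions share."""
    return sum(1 for key, value in problem_a.items() if problem_b.get(key) == value)


def retrieve(case_base, problem):
    """Retrieve: find the stored case most similar to the target problem."""
    return max(case_base, key=lambda case: similarity(case["problem"], problem))


def reuse(case, problem):
    """Reuse: map the old solution to the new problem (here, copied as is)."""
    return case["solution"]


def revise(problem, solution, works, repair):
    """Revise: test the proposed solution and repair it if the test fails."""
    return solution if works(problem, solution) else repair(problem, solution)


def retain(case_base, problem, solution):
    """Retain: store the new experience as a case for future use."""
    case_base.append({"problem": problem, "solution": solution})


def solve(case_base, problem, works, repair):
    candidate = reuse(retrieve(case_base, problem), problem)
    solution = revise(problem, candidate, works, repair)
    retain(case_base, problem, solution)
    return solution


# Hypothetical help-desk case base
case_base = [
    {"problem": {"device": "printer", "symptom": "no output", "light": "blinking"},
     "solution": "clear paper jam"},
    {"problem": {"device": "printer", "symptom": "faded print", "light": "steady"},
     "solution": "replace toner"},
]
new_problem = {"device": "printer", "symptom": "no output",
               "light": "blinking", "model": "X200"}

solution = solve(case_base, new_problem,
                 works=lambda problem, proposed: True,    # assume the fix succeeds
                 repair=lambda problem, proposed: proposed)
print(solution, len(case_base))
```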
Supervised vs. Unsupervised Learning
Supervised learning
Supervision: the training data (observations, measurements, etc.) are accompanied by labels indicating the class of the observations
New data is classified based on the training set
Unsupervised learning
The class labels of the training data are unknown
Given a set of measurements, observations, etc., the aim is to establish the existence of classes or clusters in the data
Artificial Neural Network (1)
An interconnected group of artificial neurons
Using a mathematical or computational model for information processing based on a connectionist approach to computation
An adaptive system that changes its structure based on external or internal information that flows through the network
ANNs can be used to model complex relationships between inputs and outputs or to find patterns in data
Non-linear statistical data modeling or decision making tools
Artificial Neural Network (2)
Training set:
(1) high salary, owns a house, has a dog, [profitable customer]
(2) less than 3 years on job, prior bankruptcy, owns a dog, [deadbeat]
......
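To make the training idea concrete, here is a minimal sketch of a single artificial neuron (a perceptron), the simplest building block of a network rather than a full ANN. The first two samples encode the two customers listed above; the other two, the 0/1 feature encoding, and the function names are assumptions added for illustration, since the slide elides the rest of the training set.

```python
def predict(weights, bias, features):
    """Fire (1 = profitable customer) if the weighted sum of inputs is positive."""
    activation = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if activation > 0 else 0


def train(samples, epochs=20, rate=1):
    """Perceptron learning rule: nudge the weights after every wrong prediction."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for features, label in samples:
            error = label - predict(weights, bias, features)
            weights = [w + rate * error * x for w, x in zip(weights, features)]
            bias += rate * error
    return weights, bias


# Features: (high_salary, owns_house, prior_bankruptcy, short_tenure, owns_dog)
# Label: 1 = profitable customer, 0 = deadbeat
TRAINING_SET = [
    ((1, 1, 0, 0, 1), 1),  # customer (1) from the slide
    ((0, 0, 1, 1, 1), 0),  # customer (2) from the slide
    ((1, 0, 0, 0, 0), 1),  # hypothetical extra sample
    ((0, 1, 1, 0, 0), 0),  # hypothetical extra sample
]

weights, bias = train(TRAINING_SET)
print(weights, bias)
```

In this toy run the weight on `owns_dog` ends at zero: both kinds of customer own dogs, so the neuron learns that the attribute carries no information.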
Rule-based Systems (1)
Knowledge stored as rules
The most commonly used form of rule is the if-then statement
E.g., IF some condition THEN some action
A rule-based inference model: the decision tree
Each internal node (non-leaf node) denotes a test on an attribute
Each branch represents an outcome of the test
Each leaf node holds a class label
Rule-based Systems (2)
age      income   student   credit_rating   buys_computer
<=30     high     no        fair            no
<=30     high     no        excellent       no
31..40   high     no        fair            yes
>40      medium   no        fair            yes
>40      low      yes       fair            yes
>40      low      yes       excellent       no
31..40   low      yes       excellent       yes
<=30     medium   no        fair            no
<=30     low      yes       fair            yes
>40      medium   yes       fair            yes
<=30     medium   yes       excellent       yes
31..40   medium   no        excellent       yes
31..40   high     yes       fair            yes
>40      medium   no        excellent       no
Training dataset for decision tree buys_computer
Rule-based Systems (3)
age?
  <=30: student?
    no: no
    yes: yes
  31..40: yes
  >40: credit rating?
    excellent: no
    fair: yes
Decision tree buys_computer
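Each path from the root of the buys_computer tree to a leaf is one if-then rule. The sketch below writes the tree as nested conditions and checks it against the 14 training rows; the function name and the tuple layout are choices made for this example.

```python
def buys_computer(age, student, credit_rating):
    """The decision tree expressed as if-then rules."""
    if age == "<=30":
        return "yes" if student == "yes" else "no"
    if age == "31..40":
        return "yes"
    # Remaining branch: age > 40, decided by credit rating
    return "yes" if credit_rating == "fair" else "no"


# (age, income, student, credit_rating, buys_computer)
TRAINING_DATA = [
    ("<=30", "high", "no", "fair", "no"),
    ("<=30", "high", "no", "excellent", "no"),
    ("31..40", "high", "no", "fair", "yes"),
    (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"),
    (">40", "low", "yes", "excellent", "no"),
    ("31..40", "low", "yes", "excellent", "yes"),
    ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"),
    (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excellent", "yes"),
    ("31..40", "medium", "no", "excellent", "yes"),
    ("31..40", "high", "yes", "fair", "yes"),
    (">40", "medium", "no", "excellent", "no"),
]

misclassified = [
    row for row in TRAINING_DATA
    if buys_computer(age=row[0], student=row[2], credit_rating=row[3]) != row[4]
]
print(len(TRAINING_DATA), "rows,", len(misclassified), "misclassified")
```

Note that `income` never appears in the rules: the tree classifies every training row without testing it.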
Agent-based Modeling
Simulate the behavior that emerges from the decisions of a large number of distinct individuals
Computer generated agents, each making decisions typical of the decisions an individual would make in the real world
Trying to understand the mysteries of why businesses, markets, consumers, and other complex systems behave as they do
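A minimal sketch of the idea, assuming a simple threshold model of product adoption: each agent adopts once it sees at least as many adopters as its personal threshold. The model, the thresholds, and the function name are illustrative assumptions, not taken from the slides.

```python
def final_adopters(thresholds, max_steps=100):
    """Run the simulation until no agent changes; return how many adopted.

    thresholds[i] is the number of existing adopters agent i needs to see
    before adopting. Every agent applies the same simple rule each step.
    """
    adopted = [False] * len(thresholds)
    for _ in range(max_steps):
        count = sum(adopted)
        updated = [was or count >= t for was, t in zip(adopted, thresholds)]
        if updated == adopted:  # nobody changed: the system has settled
            break
        adopted = updated
    return sum(adopted)


# Two populations of ten agents that differ in a single threshold
cascade = final_adopters([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
stalled = final_adopters([0, 2, 2, 3, 4, 5, 6, 7, 8, 9])
print(cascade, stalled)
```

The two populations are almost identical, yet one ends with everyone adopting and the other with a single adopter. Outcomes like this emerge from the interaction of individual decisions and are hard to see from aggregate data alone.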
Toward the Real-Time Enterprise
The essence of the phrase real-time enterprise is that organizations can know how they are doing at the moment
Digitization and automation of some crucial enterprise activities traditionally completed by people
Especially information analysis
Better sense-and-response
Real-time Reporting
Real-time reporting is occurring on a whole host of fronts including:
Enterprise nervous systems
A network that connects people, applications and devices
To coordinate company operations
Straight-through processing
To reduce distortion in supply chains
Real-time CRM
To automate decision making relating to customers
Communicating objects
To gain real-time data about the physical world
E.g., radio frequency identification (RFID) devices
The Dark Side of Real Time
Object-to-object communication could compromise privacy
Knowing the exact location of a company truck every minute of the day is an invasion of the driver's privacy
In the era of speed, a situation can become very bad very fast
E.g., "circuit breakers" to stop deep dives on the NYSE