Examples of data mining Marketing & Advertisement (Case of Bank of America) In the past, each caller...

40
Introduction to Data Mining

Transcript of Examples of data mining Marketing & Advertisement (Case of Bank of America) In the past, each caller...

  • Slide 1
  • Slide 2
  • Examples of data mining Marketing & Advertisement (Case of Bank of America) In the past, each caller would have listened to the same marketing advertisement, whether or not it was relevant to the callers interests. we want to be as relevant as possible to each customer. Bank of Americas customer service representatives have access to individual customer profiles, so that the customer can be informed of new products or services that may be of greatest interest to him or her. Data mining helps to identify the type of marketing approach for a particular customer, based on the customers individual profile.
  • Slide 3
  • Examples of data mining Medical (Case of Childrens Memorial Hospital) Brain tumors represent the most deadly cancer among children, with nearly 3000 cases diagnosed per year in the United States, nearly half of which are fatal. Director of brain tumor research at Childrens Memorial Hospital in Chicago, has set the goal of building a gene expression database for pediatric brain tumors, in an effort to develop more effective treatment. As one of the first steps in tumor identification, the director uses the Clementine data mining software suite published by SPSS, Inc., to classify the tumor into one of 12 or so salient types.
  • Slide 4
  • WHAT IS DATA MINING?
  • Slide 5
  • Data mining is the process of discovering: Meaningful new correlations Patterns Trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies as well as statistical and mathematical techniques.
  • Slide 6
  • WHAT IS DATA MINING? Data mining is the analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner
  • Slide 7
  • WHAT IS DATA MINING? Data mining is interdisciplinary field bringing together techniques from machine learning, pattern recognition, statistics, databases and visualization to address the issue of information extraction from large data bases
  • Slide 8
  • Introduction Data mining skills are in high demand as organizations increasingly put data repositories online. Effectively analyzing information from customers, partners, and suppliers has become important to more companies. Many companies have implemented a data warehouse strategy and are now starting to look at what they can do with all that data
  • Slide 9
  • WHY DATA MINING? The ongoing remarkable growth in the field of data mining and knowledge discovery has been fueled by a variety of factors: The storing of the data in data warehouses, so that the entire enterprise has access to a reliable current database The availability of increased access to data from Web navigation and intranets The competitive pressure to increase market share in a globalized economy The development of off-the-shelf commercial data mining software suites The tremendous growth in computing power and storage capacity
  • Slide 10
  • FALLACIES OF DATA MINING Fallacy 1. There are data mining tools that we can turn loose on our data repositories and use to find answers to our problems. Reality. There are no automatic data mining tools that will solve your problems mechanically while you wait. Data mining is a process. Fallacy 2. The data mining process is autonomous, requiring little or no human oversight. Reality. As we saw above, the data mining process requires significant human interactivity at each stage. Even after the model is deployed, the introduction of new data often requires an updating of the model. Continuous quality monitoring and other evaluative measures must be assessed by human analysts.
  • Slide 11
  • FALLACIES OF DATA MINING Fallacy 3. Data mining pays for itself quite quickly. Reality. The return rates vary, depending on the startup costs, analysis personnel costs, data warehousing preparation costs, and so on. Fallacy 4. Data mining software packages are intuitive and easy to use. Reality. Again, ease of use varies. However, data analysts must combine subject matter knowledge with an analytical mind and a familiarity with the overall business or research model.
  • Slide 12
  • FALLACIES OF DATA MINING Fallacy 5. Data mining will identify the causes of our business or research problems. Reality. The knowledge discovery process will help you to uncover patterns of behavior. Again, it is up to humans to identify the causes. Fallacy 6. Data mining will clean up a messy database automatically. Reality. Well, not automatically. As a preliminary phase in the data mining process, data preparation often deals with data that has not been examined or used in years. Therefore, organizations beginning a newdata mining operation will often be confronted with the problem of data that has been lying around for years, is stale, and needs considerable updating.
  • Slide 13
  • Knowledge Discovery Process
  • Slide 14
  • The Knowledge Discovery in Databases process comprises of a few steps leading from raw data collections to some form of new knowledge. The iterative process consists of the following steps: Data cleaning: also known as data cleansing, it is a phase in which noise data and irrelevant data are removed from the collection. Data integration: at this stage, multiple data sources, often heterogeneous, may be combined in a common source. Data selection: at this step, the data relevant to the analysis is decided on and retrieved from the data collection. Data transformation: also known as data consolidation, it is a phase in which the selected data is transformed into forms appropriate for the mining procedure.
  • Slide 15
  • Knowledge Discovery Process Data mining: it is the crucial step in which clever techniques are applied to extract patterns potentially useful. Pattern evaluation: in this step, strictly interesting patterns representing knowledge are identified based on given measures. Knowledge representation: is the final phase in which the discovered knowledge is visually represented to the user. This essential step uses visualization techniques to help users understand and interpret the data mining results.
  • Slide 16
  • WHAT TASKS CAN DATA MINING ACCOMPLISH? The following list shows the most common data mining tasks. Estimation Prediction Classification Clustering Association
  • Slide 17
  • Classification There is a target categorical variable Example: income bracket: partitioned into 3 classes or categories: high income, middle income, and low income
  • Slide 18
  • Classification Classify the income brackets of persons not currently in the database, based on other characteristics associated with that person, such as age, gender, and occupation Classification Task
  • Slide 19
  • Classification - Steps First : examine the data set containing both the predictor variables and the (already classified) target variable i.e income bracket. The algorithm (software) learns about which combinations of variables are associated with which income brackets. For example, older females may be associated with the high- income bracket. This data set is called the training set. Second: The algorithm would look at new records, for which no information about income bracket is available. Based on the classifications in the training set, the algorithm would assign classifications to the new records. For example, a 63-year-old female professor might be classified in the high-income bracket.
  • Slide 20
  • Classification For example, in the medical field, suppose that we are interested in classifying the type of drug a patient should be prescribed, based on certain patient characteristics, such as the age of the patient and the patients sodium/potassium ratio. Figure 1.4 is a scatter plot of patients sodium/potassium ratio against patients ages for a sample of 200 patients. The particular drug prescribed is symbolized by the shade of the points. Light gray points indicate drug Y; medium gray points indicate drug A or X; dark gray points indicate drug B or C.
  • Slide 21
  • Which drug should be prescribed for a young patient with a high sodium/potassium ratio? Which drug should be prescribed for older patients with low sodium/potassium ratios?
  • Slide 22
  • Other Classification examples Determining whether a particular credit card transaction is fraudulent Assessing whether a mortgage application is a good or bad credit risk Diagnosing whether a particular disease is present Determining whether a will was written by the actual deceased, or fraudulently by someone else Identifying whether or not certain financial or personal behavior indicates a possible terrorist threat
  • Slide 23
  • Estimation Estimation is similar to classification except that the target variable is numerical rather than categorical. Example: estimating the systolic blood pressure reading of a hospital patient, based on the patients age, gender, body-mass index, and blood sodium levels The relationship between systolic blood pressure and the predictor variables in the training set would provide us with an estimation model. We can then apply that model to new cases.
  • Slide 24
  • Examples of Estimation Estimating the amount of money a randomly chosen family of four will spend for back-to-school shopping this new school season. Estimating the grade-point average (GPA) of a masters student, based on that students undergraduate GPA. Estimating the number of goals per game that Ronaldo will score during next UEFA, based on his last goals.
  • Slide 25
  • Prediction Prediction is similar to classification and estimation, except that for prediction, the results lie in the future regardless of numerical or categorical Any of the methods and techniques used for classification and estimation may also be used, for prediction. Traditional statistical methods - point estimation and confidence interval estimations, simple linear regression and correlation, multiple regression Data mining and knowledge discovery methods such as neural network
  • Slide 26
  • Examples of Prediction Predicting the price of a stock three months into the future Predicting the percentage increase in traffic deaths next year if the speed limit is increased Predicting the winner of this falls baseball World Series, based on a comparison of team statistics Predicting whether a particular molecule in drug discovery will lead to a profitable new drug for a pharmaceutical company
  • Slide 27
  • Examples of Prediction Predicting the price of a stock three months into the future
  • Slide 28
  • Clustering Clustering refers to the grouping of records, observations, or cases into classes of similar objects. A cluster is a collection of records that are similar to one another, and dissimilar to records in other clusters. Clustering differs from classification in that there is no target variable for clustering. The clustering task does not try to classify, estimate, or predict the value of a target variable. Instead, clustering algorithms seek to segment the entire data set into relatively homogeneous subgroups or clusters.
  • Slide 29
  • Slide 30
  • demographic profile of each of the geographic areas in the country, as defined by zip code. One of the clustering mechanisms they use is the PRIZM segmentation system, which describes every U.S. zip code area in terms of distinct lifestyle types
  • Slide 31
  • Clustering Example clusters for zip code 90210, Beverly Hills, California. Beverly Hills, CA 90210's most common PRIZM NE Segments are: NumberName 16Bohemian Mix 07Money & Brains 03Movers & Shakers 01Upper Crust 04Young Digerati
  • Slide 32
  • Examples of Clustering Group of residents according to zip codes. For accounting auditing purposes, to segment financial behavior into benign and suspicious categories As a dimension-reduction tool when the data set has hundreds of attributes For gene expression clustering, where very large quantities of genes may exhibit similar behavior
  • Slide 33
  • Association The association task for data mining is the job of finding which attributes go together. Most prevalent in the business world, where it is known as affinity analysis or market basket analysis The task of association seeks to uncover rules for quantifying the relationship between two or more attributes. Association rules are of the form If antecedent, then consequent, together with a measure of the support and confidence associated with the rule.
  • Slide 34
  • Association Another example, a particular supermarket may find that of the 1000 customers shopping on a Thursday night, 200 bought diapers, and of those 200 who bought diapers, 50 bought Formula Milk. Thus, the association rule would be If buy diapers, then buy Formula Milk with a support of 200/1000 = 20% and a confidence of 50/200 = 25%.
  • Slide 35
  • Examples of Association Investigating the proportion of subscribers to a companys cell phone plan that respond positively to an offer of a service upgrade Examining the proportion of children whose parents read to them who are themselves good readers Predicting degradation in telecommunications networks Finding out which items in a supermarket are purchased together and which items are never purchased together Determining the proportion of cases in which a new drug will exhibit dangerous side effects
  • Slide 36
  • Bohemian Mix Midscale, Middle Age Mix A collection of mobile urbanites, Bohemian Mix represents the nation's most liberal lifestyles. Its residents are an ethnically diverse, progressive mix of young singles, couples, and families ranging from students to professionals. In their funky row houses and apartments, Bohemian Mixers are the early adopters who are quick to check out the latest movie, nightclub, laptop, and microbrew.
  • Slide 37
  • Money & Brains Upscale, Older Mix The residents of Money & Brains seem to have it all: high incomes, advanced degrees, and sophisticated tastes to match their credentials. Many of these city dwellers are married couples with few children who live in fashionable homes on small, manicured lots.
  • Slide 38
  • Movers & Shakers Wealthy, Middle Age w/o Kids Movers & Shakers is home to America's up-and-coming business class: a wealthy suburban world of dual-income couples who are highly educated, typically between the ages of 35 and 54. Given its high percentage of executives and white-collar professionals, there's a decided business bent to this segment: members of Movers & Shakers rank number one for owning a small business and having a home office.
  • Slide 39
  • Upper Crust Wealthy Older w/o Kids The nation's most exclusive address, Upper Crust is the wealthiest lifestyle in America--a haven for empty- nesting couples between the ages of 45 and 64. No segment has a higher concentration of residents earning over $100,000 a year or possessing a postgraduate degree. And none has a more opulent standard of living.
  • Slide 40
  • Young Digerati Upscale, Younger Mix Young Digerati are tech-savvy and live in fashionable neighborhoods on the urban fringe. Affluent, highly educated, and ethnically mixed, Young Digerati communities are typically filled with trendy apartments and condos, fitness clubs and clothing boutiques, casual restaurants and all types of bars--from juice to coffee to microbrew.