Unsupervised Learning Clustering K-Means. Recall: Key Components of Intelligent Agents...

Slide 1
Unsupervised Learning: Clustering, K-Means

Slide 2
Recall: Key Components of Intelligent Agents
Representation Language: Graph, Bayes Nets, Linear functions
Inference Mechanism: A*, variable elimination, Gibbs sampling
Learning Mechanism: Maximum Likelihood, Laplace Smoothing, gradient descent, perceptron, k-Nearest Neighbor, many more: k-means, EM, PCA, ...
-------------------------------------
Evaluation Metric: Likelihood, quadratic loss (a.k.a. squared error), regularized loss, margins, many more: 0-1 loss, conditional likelihood, precision/recall, ...

Slide 3
Supervised vs. Unsupervised Learning

Supervised Learning: Labeled Data
X_11  X_12  ...  X_1N    Y_1
X_21  X_22  ...  X_2N    Y_2
...
X_M1  X_M2  ...  X_MN    Y_M

Unsupervised Learning: Unlabeled Data
X_11  X_12  ...  X_1N    ?
X_21  X_22  ...  X_2N    ?
...
X_M1  X_M2  ...  X_MN    ?

In supervised learning, the learning algorithm is given training examples that contain inputs (the X values) and labels or outputs (the Y values).
In unsupervised learning, the learning algorithm is given training examples that contain inputs (the X values), but no labels or outputs (no Y values).
It's called unsupervised because there are no labels to supervise the learning algorithm during the learning process and steer it toward the right model.

Slide 4
Example Unsupervised Problem 1
Are these data points distributed completely randomly, or do you see some structure in them?
How many clusters do you see? Choices: None, 1, 2, 3, 4, 5
[Figure: data points plotted on axes X1 and X2]

Slide 5
Example Unsupervised Problem 1
Are these data points distributed completely randomly, or do you see some structure in them?
Structured: there are clusters!
How many clusters do you see? Choices: None, 1, 2, 3, 4, 5
[Figure: data points plotted on axes X1 and X2]

Slide 6
Example Unsupervised Problem 2
There are 2 input variables, X1 and X2, in this space. So this is called a 2-dimensional space.
How many dimensions are actually needed to describe this data? Choices: 0, 1, 2, 3
[Figure: data points plotted on axes X1 and X2]

Slide 7
Example Unsupervised Problem 2
There are 2 input variables, X1 and X2, in this space. So this is called a 2-dimensional space.
How many dimensions are actually needed to describe this data?
1 dimension captures most of the variation in this data. 2 dimensions will capture everything.
[Figure: data points plotted on axes X1 and X2]

Slide 8
Types of Unsupervised Learning
Density Estimation
- Clustering (Example 1)
- Dimensionality Reduction (Example 2)
Factor Analysis
- Blind signal separation

Slide 9
Example Open Problem in AI: Unsupervised Image Segmentation (and Registration)
Examples taken from (Felzenszwalb and Huttenlocher, Int. Journal of Computer Vision, 59:2, 2004). http://cs.brown.edu/~pff/segment/

Slide 10
The K-Means Clustering Algorithm
Inputs:
1) Some unlabeled (no outputs) training data
2) A number K, which must be greater than 1
Output: A label between 1 and K for each data point, indicating which cluster the data point belongs to.

Slide 11
Visualization of K-Means
Data

Slide 12
Visualization of K-Means
1. Generate K random initial cluster centers, or means.

Slides 13-14
Visualization of K-Means
2. Assign each point to the closest mean point.
Visually, the mean points divide the space into a Voronoi diagram.

Slides 15-16
Visualization of K-Means
3. Recompute the mean (center) of each colored set of data.
Notice: means do not have to be at the same position as a data point, although sometimes they might be.

Slides 17-23
Visualization of K-Means
4. Repeat steps 2 & 3 until the means stop moving (convergence).
   a. Repeat step 2 (assign each point to the nearest mean).
   b. Repeat step 3 (recompute means).
Quiz (slides 20 and 22): Where will the means be after the next iteration?
Answer (slides 21 and 23): [shown in figure]

Slide 24
Formal Description of the Algorithm
Input:
1) X_11, ..., X_1N; ...; X_M1, ..., X_MN
2) K
Output: Y_1; ...; Y_M, where each Y_i is in {1, ..., K}

Slide 25
Formal Description of the Algorithm

Slide 26
Evaluation metric for K-means

Slide 27
Complexity of K-Means
Finding a globally-optimal solution to WCSS is known to be an NP-hard problem.
K-means is known to converge to a local minimum of WCSS.
K-means is a heuristic or greedy algorithm, with no guarantee that it will find the global optimum.
On real datasets, K-means usually converges very quickly. Often, people run it multiple times with different random initializations, and choose the best result.
In the worst case, K-means can still take exponentially many iterations, even to find a local minimum. However, such cases are rare in practice.

Slide 28
Quiz
Is K-means
Classification or Regression?
Generative or Discriminative?
Parametric or Nonparametric?

Slide 29
Answer
Is K-means
Classification or Regression?
- classification: output is a discrete value (cluster label) for each point
Generative or Discriminative?
- discriminative: it has fixed input variables and output variables.
Parametric or Nonparametric?
- parametric: the number of cluster centers (K) does not change with the number of training data points

Slide 30
Quiz
Is K-means
Supervised or Unsupervised?
Online or batch?
Closed-form or iterative?

Slide 31
Answer
Is K-means
Supervised or Unsupervised?
- Unsupervised
Online or batch?
- batch: if you add a new data point, you need to revisit all the training data to recompute the locally-optimal model
Closed-form or iterative?
- iterative: training requires many passes through the data

Slide 32
Quiz
Which of the following problems might be solved using K-Means? Check all that apply. For those that work, explain what the inputs and outputs (X and Y variables) would be.
Segmenting an image
Finding galaxies (dense groups of stars) in a telescope's image of the night sky
Identify different species of bacteria from DNA samples of bacteria in seawater

Slide 33
Answer
Which of the following problems might be solved using K-Means? Check all that apply. For those that work, explain what the inputs and outputs (X and Y variables) would be.
Segmenting an image: Yes. Inputs are the pixel intensities, outputs are segment labels.
Finding galaxies (dense groups of stars) in a telescope's image of the night sky: Yes. Inputs are star locations, outputs are galaxy labels.
Identify different species of bacteria from DNA samples of bacteria in seawater: Yes. Inputs are gene sequences, outputs are species labels.
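
Supplementary sketch (not part of the original slides)
The four steps visualized on slides 12-23, and the advice on slide 27 to rerun with different random initializations and keep the best result, could be sketched roughly as follows in plain Python. Several things here are assumptions rather than anything the slides specify: WCSS is taken to mean the within-cluster sum of squared distances (the slides use the acronym without expanding it), the initial means are drawn from the data points themselves, an empty cluster simply keeps its previous mean, labels run from 0 to K-1 rather than 1 to K, and all function names are hypothetical.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, means):
    """Step 2: label each point with the index of its closest mean."""
    return [
        min(range(len(means)), key=lambda k: squared_distance(p, means[k]))
        for p in points
    ]


def recompute_means(points, labels, means):
    """Step 3: move each mean to the centroid of the points assigned to it.

    Assumption: a cluster that received no points keeps its old mean.
    """
    new_means = []
    for k, old_mean in enumerate(means):
        members = [p for p, y in zip(points, labels) if y == k]
        if members:
            centroid = tuple(sum(c) / len(members) for c in zip(*members))
            new_means.append(centroid)
        else:
            new_means.append(old_mean)
    return new_means


def wcss(points, labels, means):
    """Assumed metric: sum of squared distances from each point to its mean."""
    return sum(squared_distance(p, means[y]) for p, y in zip(points, labels))


def k_means(points, k, rng, max_iters=100):
    """One run of steps 1-4; returns (labels, means) with labels in 0..k-1."""
    # Step 1 (assumption): use k distinct data points as the initial means.
    means = rng.sample(points, k)
    for _ in range(max_iters):
        labels = assign(points, means)
        new_means = recompute_means(points, labels, means)
        if new_means == means:  # Step 4: the means stopped moving.
            break
        means = new_means
    return assign(points, means), means


def best_of_restarts(points, k, restarts=10, seed=0):
    """Run k_means several times and keep the run with the lowest WCSS."""
    rng = random.Random(seed)
    runs = [k_means(points, k, rng) for _ in range(restarts)]
    return min(runs, key=lambda run: wcss(points, run[0], run[1]))
```

Calling best_of_restarts(points, 2) on a list of 2-D tuples returns a (labels, means) pair. Each individual run only reaches a local minimum of the assumed WCSS, which is why the sketch keeps the best of several restarts instead of trusting a single run.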
