Alexander Gammerman - Machine Learning for Big Data
Transcript of Alexander Gammerman - Machine Learning for Big Data
Machine Learning for Big Data
Alexander Gammerman
Computer Learning Research Centre, Royal Holloway, University of London
Trends in Big Data
STFC/RUSI: Big Data for Security and Resilience
March 7th, 2014
1 / 19
Layout
1 Debunking the myth
2 Machine Learning (Data Analytics)
3 Trends in Machine Learning for Big Data
4 Conclusions
2 / 19
"Fashionable" pursuit
AI, Cybernetics, Neural Networks, Expert Systems, Big Data?
Big Data, small data, any data: what we need is Data Analysis, or Data Analytics, or Machine Learning.
3 / 19
Machine Learning: what is it?
ML is the intersection of Statistics and Computer Science.
Statistics deals with inference: obtaining valid conclusions from data under various models and assumptions.
Computer Science considers what is computable, develops efficient algorithms, and deals with data storage and manipulation.
ML takes past data, "learns" from it, and tries to find rules and regularities in the data in order to make predictions for future examples. Efficient algorithms have to be developed to make valid predictions.
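As a toy illustration of that loop (learning a rule from past examples, then predicting a future one), here is a minimal sketch, not from the talk, using a nearest-centroid rule on made-up one-dimensional data:

```python
# Toy illustration (not from the talk): "learning" a rule from past
# labelled examples and using it to predict future examples.
# A nearest-centroid classifier on 1-D data; all numbers are made up.

def fit(examples):
    """Compute the mean of each class from (value, label) pairs."""
    sums, counts = {}, {}
    for x, y in examples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    """Assign a new example to the class with the nearest centroid."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))

past = [(1.0, "small"), (1.2, "small"), (9.8, "large"), (10.1, "large")]
model = fit(past)
print(predict(model, 1.5))   # -> small
print(predict(model, 9.0))   # -> large
```

The "rule" here is just two class means, but the shape is the same as in any ML method: a fit step over past data, then a predict step on examples never seen before.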
4 / 19
Computer Learning Research Centre (CLRC) at Royal Holloway, University of London
Established in 1998 to develop machine learning theory, including the design of efficient algorithms for data analysis.
CLRC Fellows include several prominent researchers: Vapnik and Chervonenkis (the two founders of statistical learning theory), Shafer (co-founder of the Dempster-Shafer theory), Rissanen (inventor of the Minimum Description Length principle), and Levin (one of the three founders of the theory of NP-completeness, who also made fundamental contributions to Kolmogorov complexity).
5 / 19
Recent years have seen an explosion of interest in machine-learning methods, in particular statistical learning theory. Statistical learning theory has similar goals to statistical science, but
it is nonparametric and
it is concerned with the problem of prediction.
6 / 19
Problems and Current Techniques
Classical techniques work for small-scale, low-dimensional data, but run into conceptual and computational difficulties for high-dimensional data. Open issues: validity of predictions, confidence measures, online prediction.
Current techniques for the dimensionality problem: Support Vector Machines (Vapnik, 1995, 1998; Vapnik and Chervonenkis, 1974) and kernel methods. A new technique for the validity problem: Conformal Predictors.
7 / 19
Projects
Compact Descriptors for Automatic Target Identification (with QinetiQ).
Statistical profiling of offenders (with the Home Office).
Material identification with atmosphere corrections (with Waterfall Solutions).
Unmixing spectra (with QinetiQ).
Anomaly detection (vehicles) (with Thales).
Fault Diagnosis (with Marconi Instruments).
8 / 19
Projects – cont’d
Abdominal Pain (with Western General Hospital, Edinburgh).
Ovarian Cancer (with Institute for Women's Health, UCL).
Depression (with Institute of Psychiatry, King's College).
Child Leukemia (with Royal London Hospital).
Heart Diseases (with Institute for Women's Health, UCL).
Analysis of microarrays (with Veterinary Laboratories Agency, DEFRA).
Protein-Protein Interaction (EU project).
9 / 19
How much data do we need to answer our questions?
Big Data: V^3
Volume: Gigabyte (10^9); Terabyte (10^12); Petabyte (10^15); Exabyte (10^18); Zettabyte (10^21).
Variety: structured, semi-structured, unstructured; text, image, audio, video.
Velocity: dynamic; time-varying, etc.
Plus: high dimensionality.
But: if the answer is a Zettabyte, what is the question?
The global data supply reached 2.8 zettabytes (ZB) in 2012 - or 2.8 trillion GB - but just 0.5% of this is used for analysis, according to the Digital Universe Study. Volumes of data are projected to reach 40 ZB by 2020, or 5,247 GB per person.
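The slide's figures can be sanity-checked with a little decimal-unit arithmetic; the world-population figure below is an assumption of mine, not from the slide:

```python
# Back-of-the-envelope check of the slide's figures (decimal units).
# The world population used here (~7.6 billion) is an assumption;
# the slide's 5,247 GB/person implies a slightly different figure.

ZB = 10**21  # zettabyte in bytes
GB = 10**9   # gigabyte in bytes

supply_2012 = 2.8 * ZB
print(supply_2012 / GB / 10**12)  # -> 2.8 (trillion GB, as the slide says)

projected_2020 = 40 * ZB
population = 7.6e9                # assumed
print(projected_2020 / GB / population)  # roughly 5,263 GB per person
```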
10 / 19
We don't need big data per se: we need to have a problem first, and then decide how much data we need to solve it.
If a child wants to learn the concept of a car, he or she doesn't need a million or a billion cars to learn it: 10 or 100 are enough.
If we want to predict digits, we can learn on the first 100 or 1000 digits and confidently, with high accuracy, identify the next one.
11 / 19
Figure: USPS data
12 / 19
Figure: Conformal Predictors on USPS data: online cumulative multiple predictions at different confidence levels ("Hedging Predictions in Machine Learning" by A. Gammerman and V. Vovk, The Computer Journal (2007) 50 (2): 151-163).
13 / 19
In fact, there is a well-known result in machine learning. In the past, people thought that the larger the training set, the more accurate the results that could be obtained. But the founders of statistical learning theory, V. Vapnik and A. Chervonenkis, showed that it is not just the length of the training data that matters: another characteristic, called "capacity", is more important.
14 / 19
Trends in Machine Learning for Big Data
How do we make machine learning algorithms scale to large datasets? There are two main approaches: (1) developing parallelizable ML algorithms and integrating them with large parallel systems, and (2) developing more efficient algorithms.
The growth of data is driving the need for parallel and online algorithms and models that can handle this "Big Data".
We need to explore the computational foundations of performing these analyses in the context of parallel and cloud architectures.
15 / 19
Large-scale modeling techniques and algorithms include
transductive and inductive models,
online compression models (extension of conformal predictors),
graphical models,
deep learning and semi-supervised learning algorithms,
clustering algorithms,
parallel learning algorithms.
The computational techniques provide a basic foundation in large-scale programming, ranging from the basic "parfor" to parallel abstractions such as MapReduce (Hadoop) and GraphLab.
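The MapReduce pattern mentioned above can be sketched in a few lines; this is an illustrative single-process word count, not code from the talk. Frameworks such as Hadoop distribute the map, shuffle, and reduce phases across machines, but the structure is the same:

```python
# Illustrative sketch of the MapReduce pattern: a word count split
# into map, shuffle, and reduce phases. Here everything runs in one
# process; a framework like Hadoop would run each phase in parallel
# across many machines.
from collections import defaultdict

def map_phase(documents):
    """Emit (word, 1) pairs from each document."""
    for doc in documents:
        for word in doc.split():
            yield word, 1

def shuffle(pairs):
    """Group values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Sum the counts for each word."""
    return {word: sum(values) for word, values in groups.items()}

docs = ["big data big models", "big data small data"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts["big"], counts["data"])  # -> 3 3
```

The point of the abstraction is that `map_phase` and `reduce_phase` are pure functions over independent chunks, which is exactly what makes them parallelizable.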
16 / 19
Transduction
Figure: Induction and Transduction [V. Vapnik, 1995]. Inductive learning moves from data (past examples) to general knowledge; deduction moves from general knowledge to particular (future) examples; transduction goes directly from past examples to future examples.
17 / 19
Why use conformal predictions?
Why, after 100 years of research in statistics, do we need yet anothermethod of prediction?
It is simple and rigorous.
Given any of a wide range of learning/statistical prediction methods,conformal prediction can be used as a wrapper to provide a measureof confidence.
It is valid under weak assumptions.
It limits the fraction of prediction mistakes from the start. (Crudely, a predictor can either make a prediction, or else say "don't know", possibly in a graded way, such as giving a wide prediction interval.)
It works in practice.
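The "wrapper" idea above can be sketched concretely. Below is a minimal transductive conformal predictor built on a nearest-neighbour nonconformity score; the 1-D data and the particular score are illustrative choices of mine, not the method of the paper cited earlier:

```python
# A minimal sketch of conformal prediction: wrap a nonconformity
# score (here, nearest-neighbour based) to output a set of labels
# with a guaranteed error rate. Data and score are illustrative.

def nonconformity(rest, x, y):
    """Distance to the nearest same-class example divided by distance
    to the nearest other-class example; large means (x, y) conforms
    badly with the rest of the data."""
    same = min(abs(x - xi) for xi, yi in rest if yi == y)
    other = min(abs(x - xi) for xi, yi in rest if yi != y)
    return same / other if other > 0 else float("inf")

def conformal_predict(train, x, labels, epsilon):
    """Return every label whose p-value exceeds the significance
    level epsilon. Under the exchangeability assumption, the true
    label is excluded with probability at most epsilon."""
    prediction = set()
    for y in labels:
        extended = train + [(x, y)]
        alphas = []
        for i, example in enumerate(extended):
            rest = extended[:i] + extended[i + 1:]
            alphas.append(nonconformity(rest, *example))
        # p-value: fraction of examples at least as strange as (x, y)
        p_value = sum(a >= alphas[-1] for a in alphas) / len(alphas)
        if p_value > epsilon:
            prediction.add(y)
    return prediction

train = [(1.0, "A"), (1.2, "A"), (9.8, "B"), (10.0, "B")]
print(conformal_predict(train, 1.1, ["A", "B"], 0.2))  # -> {'A'}
```

At a stricter significance level the prediction set can grow to contain several labels, which is the graded "don't know" described in the last bullet point.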
18 / 19
Conclusions
"It took Deep Thought 7.5 million years to answer the ultimate question. As nobody knew what the ultimate question to Life, The Universe and Everything actually was, nobody knows what to make of the answer (42)."
Nowadays, as John Poppelaars noticed, many people think that Big Data will help us find the ultimate question.
But I already know that it is not Big Data, and the answer is not 42: it is Machine Learning.
19 / 19