This version: 4/4/17 4:37 PM
A guided tour to Machine Learning using MATLAB®
© Oge Marques, PhD – 2016-2017
Introduction
• This document guides you through several tutorials, papers, and resources related to Machine Learning (with emphasis on image and vision tasks) using MATLAB.
• It assumes no prior exposure to Machine Learning or MATLAB.
• It is structured as a step-by-step guide. It is best that you follow it in the intended sequence.
Part 1 - Accessing MATLAB

You are expected to have frequent access to a computer running MATLAB and some of its toolboxes (notably, the Image Processing, Computer Vision System, Statistics and Machine Learning, Neural Network, and Fuzzy Logic toolboxes) for your assignments and projects.

Here are some options to consider:

1. Purchase your own copy of the student version of MATLAB. It costs $99, is fully functional, and comes with several toolboxes. For more details, go to: http://www.mathworks.com/academia/student_version/

2. Access via the Engineering Cloud (https://tsg.eng.fau.edu/software/matlab/). Instructions on how to access the Engineering Cloud are posted here: https://tsg.eng.fau.edu/software/vmware-remote-desktop-access/ Once connected, select the "All Engineering Students" pool of Windows desktops. Once logged into Windows, click on Start -> All Programs -> MATLAB 2016a -> MATLAB 2016a. Be sure to save any files to your Z: drive. If you have issues connecting or starting MATLAB, contact [email protected].
3. Consider a free alternative to MATLAB. The most popular is GNU Octave (http://www.gnu.org/software/octave/), which has been used by many Machine Learning professors, students, and researchers. These alternatives are a "work in progress" and are not 100% compatible with MATLAB, so please use them at your own risk.
Part 2 - Learning the basics of MATLAB

1. Take the MATLAB Onramp training available at https://matlabacademy.mathworks.com/R2016a/portal.html?course=gettingstarted (see separate document for step-by-step instructions).

2. (OPTIONAL) Watch the 46-min "Introduction to MATLAB" video: www.mathworks.com/videos/introduction-to-matlab-81592.html Don't forget to download the associated source code: http://www.mathworks.com/matlabcentral/fileexchange/49570-introduction-to-matlab--february-2015-
Part 3 - Learning the basics of Machine Learning in MATLAB

1. Read the "Introducing Machine Learning" e-book (available on Canvas).

2. Read "Supervised Learning Workflow and Algorithms": https://www.mathworks.com/help/stats/supervised-learning-machine-learning-workflow-and-algorithms.html

3. (OPTIONAL) Watch the 35-min "Machine Learning Made Easy" video: www.mathworks.com/videos/machine-learning-with-matlab-100694.html Don't forget to download the associated source code: http://www.mathworks.com/matlabcentral/fileexchange/50232-machine-learning-made-easy

4. (OPTIONAL) Watch the 41-min "Machine Learning with MATLAB" video: http://www.mathworks.com/videos/machine-learning-with-matlab-81984.html Don't forget to download the associated source code: http://www.mathworks.com/matlabcentral/fileexchange/42744-machine-learning-with-matlab
Part 4 - Classification using decision trees in MATLAB

Inspired by the steps at https://www.mathworks.com/help/stats/classification-trees-and-regression-trees.html

1. Run the example file dtIntro.m, paying attention to the following aspects:
   a. How to load a dataset (in this case, it's already available in .mat format)
   b. How to create a decision tree, view it, and use it to make a prediction using unseen data
   c. How to compute the resubstitution error of the resulting classification tree
   d. How to compute cross-validation accuracy
   e. How to select the appropriate tree depth
   f. How to prune the tree

2. Run the example file dtIris.m, paying attention to the following aspects:
   a. How to load a dataset (in this case, it's already available in .mat format)
   b. How to plot different views of the dataset (whenever feasible) in order to better understand the data
   c. How to create a decision tree, view it, and use it to make a prediction using unseen data
   d. How to compute the resubstitution error of the resulting classification tree
   e. How to compute cross-validation accuracy
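If you want a preview before opening the example files, the workflow above can be sketched at the MATLAB command line as follows. This is a minimal sketch using the built-in Fisher iris data; the actual contents of dtIntro.m and dtIris.m may differ in their details.

```matlab
% Minimal sketch of the decision-tree workflow described above, using
% the built-in Fisher iris data (dtIntro.m / dtIris.m may differ).
load fisheriris                           % meas (150x4), species (150x1 cell)
tree = fitctree(meas, species);           % grow a classification tree
view(tree, 'Mode', 'graph')               % visualize the tree
label = predict(tree, [5.0 3.5 1.4 0.2])  % classify an unseen sample
resubErr = resubLoss(tree)                % resubstitution error
cvtree = crossval(tree);                  % 10-fold cross-validation (default)
cvErr = kfoldLoss(cvtree)                 % cross-validated error
prunedTree = prune(tree, 'Level', 1);     % prune one level off the tree
```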
Part 5 - Linear Regression in MATLAB

1. Run the examples in the 'Stanford' subfolder. They are from Andrew Ng's "Machine Learning" course (MOOC) – Stanford University – Fall 2011.
   a. ex1.m shows linear regression for one variable
   b. ex1_multi.m shows linear regression with multiple variables. It also introduces feature normalization and the normal equation method (an alternative to gradient descent)

2. Run the example IntroToLinearRegression.m in the 'Mathworks' subfolder.
   a. It is based on https://www.mathworks.com/help/matlab/data_analysis/linear-regression.html. It builds and compares two simple linear regression models and introduces the coefficient of determination.

3. Run the examples in the 'Regression_Demos' subfolder. They are also available at: http://www.mathworks.com/matlabcentral/fileexchange/35789-new-regression-capabilities-in-r2012a
   a. (OPTIONAL, but recommended) Watch the associated webinar/video: https://www.mathworks.com/videos/regression-analysis-with-matlab-new-statistics-toolbox-capabilities-in-r2012a-81869.html
   b. Explore the examples following this sequence: StraightLine.m, CurvesSurfaces.m, and NonLinear.m. (Skip the Housing.m, Model.m, and GLMs.m examples)
   c. Don't be intimidated or discouraged by the rich amount of information available in some MATLAB objects, e.g., LinearModel.
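The two fitting approaches mentioned above can be sketched side by side: an OLS model object (as in IntroToLinearRegression.m-style examples) and the normal equation (the closed-form alternative to gradient descent). The data below is synthetic, not the data used in the exercise files.

```matlab
% Hedged sketch: fitting a straight line two ways (synthetic data).
x = (1:10)';
y = 2.5*x + 1 + 0.3*randn(10,1);   % noisy linear data

% 1) fitlm: ordinary least squares as a LinearModel object
mdl = fitlm(x, y);
disp(mdl.Rsquared.Ordinary)        % coefficient of determination (R^2)

% 2) The normal equation: theta = (X'X)^(-1) X'y
X = [ones(10,1) x];                % design matrix with intercept column
theta = (X'*X) \ (X'*y)            % theta(1) ~ intercept, theta(2) ~ slope
```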
Part 6 - Logistic Regression in MATLAB

1. Run the examples in the 'Stanford' subfolder. They are from Andrew Ng's "Machine Learning" course (MOOC) – Stanford University – Fall 2011.
   a. ex2.m shows logistic regression
   b. ex2_reg.m shows the use of additional polynomial features and the impact of regularization on logistic regression with multiple variables. Don't forget to change the value of the regularization parameter, lambda, in line 90, and run that section every time you do so. Notice how the resulting decision boundary changes as a result of changes in lambda.
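You can also explore the effect of lambda outside the exercise files. Note the assumption here: the Stanford exercises implement the logistic cost function and gradient by hand, while this sketch uses the Statistics and Machine Learning Toolbox instead, so it is an analogous experiment, not their code.

```matlab
% Sketch of regularized (ridge) logistic regression via the toolbox,
% NOT the hand-rolled implementation in ex2_reg.m.
load fisheriris
X = meas(51:end, 3:4);                 % petal features, two classes only
y = categorical(species(51:end));      % versicolor vs. virginica
lambda = 0.1;                          % try different values here
mdl = fitclinear(X, y, 'Learner', 'logistic', ...
                 'Regularization', 'ridge', 'Lambda', lambda);
trainErr = loss(mdl, X, y)             % watch how this varies with lambda
```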
Part 7 - The Classification Learner App

Goal: Learn how to use the MATLAB Classification Learner App to perform 3-class classification on Fisher's Iris dataset.

1. Dataset:
In this example, we will use Fisher's Iris dataset. This is a sample dataset included in the MATLAB Statistics and Machine Learning Toolbox. You can find all sample datasets at: https://www.mathworks.com/help/stats/_bq9uxn4.html

2. View the dataset:
*This step is not necessary; it is just to give an idea of what this dataset looks like. To load a dataset into the MATLAB workspace, type: load filename. In this particular example, we will type in the Command Window: load fisheriris.mat. You can view the datasets loaded into the workspace by double-clicking the matrix name in the Workspace window. (Please note that you may have a different window layout than the screenshot below.) In this example, meas is a 150x4 double matrix. There are 150 rows; each row represents one instance. The 4 columns store attribute information (col 1: sepal length in cm; col 2: sepal width in cm; col 3: petal length in cm; col 4: petal width in cm). The class for each instance is stored in a separate 150x1 cell array called "species". In this case, the first 50 instances belong to class Setosa, the following 50 belong to class Versicolor, and the last 50 belong to class Virginica.
3. Prepare the data.
We need to first load the Fisher iris dataset and create a table of measurement predictors (or features) using variables from the dataset to use for classification. Type the following command after the Command Window prompt:

fishertable = readtable('fisheriris.csv');

4. Start the Classification Learner App. There are two ways of doing this:
a) MATLAB Toolstrip: On the APPS tab, under Math, Statistics and Optimization, click the app icon (see screenshot below).
b) MATLAB command prompt: type classificationLearner
5. On the Classification Learner tab, in the File section, click New Session. (See screenshot below.)
6. In the New Session dialog box, select the table fishertable from the workspace list.
Note: If you did optional step 2, you may find meas in the dialog as well; make sure fishertable is selected. Observe that the app has selected response and predictor variables based on their data type. Petal and sepal length and width are predictors, and species is the response that you want to classify. For this example, do not change the selections.

7. Accept the default validation option (5-fold cross-validation) and continue by clicking Start Session. You will see a session like the following screenshot.
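For reference, the 5-fold cross-validated accuracy the app reports has a command-line counterpart. This is a sketch of the same idea, not the app's exact internals:

```matlab
% Sketch: 5-fold cross-validated accuracy at the command line,
% analogous to (but not identical with) what the app computes.
load fisheriris
tree = fitctree(meas, species);
cvtree = crossval(tree, 'KFold', 5);
accuracy = 1 - kfoldLoss(cvtree)   % comparable to the app's Accuracy box
```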
8. Choose a classification model.
In this case, we shall use a simple decision tree. To create a classification tree model, on the Classification Learner tab, in the Classifier section, click the down arrow to expand the gallery and click Simple Tree. Then disable the 'Use Parallel' button (if it's set to ON) and click Train.

9. Examine results.
The Simple Tree model is now in the History list. The model validation score is in the Accuracy box. This number may be slightly different in your case. Examine the scatter plot. An X indicates a misclassified point. The blue points (setosa species) are all correctly classified, but some of the other two species are misclassified. Under Plot, switch between the Data and Model Predictions options. Observe the color of the incorrect (X) points. Alternatively, while plotting model predictions, to view only the incorrect points, clear the Correct checkbox. On the Classification Learner tab, in the Plots section, click Confusion Matrix or ROC Curve to generate a Confusion Matrix or ROC Curve, respectively. Each plot will open on a separate tab. See representative screenshots on the next page. Experiment with changing the settings in each Plot section to fully examine how the currently selected classifier performed in each class.
10. Choose another model.
You can train different models to compare to the decision tree by choosing other models in the Classifier section. In this example, I chose Fine KNN¹. Click Fine KNN, and then click Train. After training, you can see the Fine KNN in the History list. You can click each model in the History list to view and compare the results. The accuracy value may be slightly different in your case.
1 Technically, you don't know what a kNN classifier is, since we haven't covered it in class (yet). But that's on purpose! My goal is to show that you can pick other classifiers, train them, and 'play' with their parameters rather easily, even if you don't quite know what is "inside the box".
11. Try using different attributes.
To try to improve the model, try using different features. See if you can improve the model by removing features with low predictive power. On the Classification Learner tab, in the Features section, click Feature Selection. You can remove a feature by unchecking the box beside it.
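If you want a hint about which features carry predictive power before unchecking boxes in the app, one command-line option is predictorImportance on a trained tree (a sketch; the app does not expose this directly):

```matlab
% Sketch: estimating per-feature predictive power at the command line.
load fisheriris
tree = fitctree(meas, species);
imp = predictorImportance(tree)    % one importance value per column of meas
```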
After performing Feature Selection, a new model will appear on the left-hand side of the app. You should then train it and compare the accuracy results (as well as confusion matrix, ROC curve, AUC, etc.) against the previously trained models. MATLAB will indicate the best model so far by highlighting the highest accuracy values. See the screenshot below (obtained after trying 5 variants of decision trees and 2 variants of kNN).
12. Advanced classifier options.
To learn about model settings, choose a model in the History list and view the advanced settings. The options in the Classifier gallery are preset starting points, and you can change further settings. On the Classification Learner tab, in the Training section, click Advanced. For decision trees, consider changing the Maximum Number of Splits setting (which controls tree depth), then train a new model by clicking Train. View the settings for the selected trained model in the Current model pane, or in the Advanced dialog box.
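The app's Maximum Number of Splits setting corresponds to the 'MaxNumSplits' name-value pair of fitctree at the command line; a quick sketch:

```matlab
% Sketch: limiting tree depth programmatically, mirroring the app's
% Maximum Number of Splits setting.
load fisheriris
shallowTree = fitctree(meas, species, 'MaxNumSplits', 4);
view(shallowTree, 'Mode', 'graph')   % a shallower, more interpretable tree
```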
13. Export trained model.
To export the best trained model to the workspace, on the Classification Learner tab, in the Export section, click Export Model. In the Export Model dialog box, click OK to accept the default variable name trainedClassifier, or change it to another name.

If the exported model is a decision tree called trainedTreeClassifier, use the following line in the MATLAB Command Window to view the resulting model (tree):

view(trainedTreeClassifier.ClassificationTree, 'mode', 'graph');

You can use the exported classifier to make predictions on new data. For example, to make predictions for the fishertable data in your workspace, enter:

yfit = trainedClassifier.predictFcn(fishertable)

The output yfit contains a class prediction for each data point. See the Command Window screenshot below.
14. Generate code.
If you want to automate training the same classifier with new data, or learn how to programmatically train classifiers, you can generate code from the app. To generate code for the best trained model, on the Classification Learner tab, in the Export section, click Export Model > Generate Code. The app generates code from your model and displays the file in the MATLAB Editor. See the screenshot below for the generated code in the Editor.

MathWorks provides many nice, detailed examples. Alternatively, you can refer to these links:
https://www.mathworks.com/help/stats/train-decision-trees-in-classification-learner-app.html
http://www.mathworks.com/help/stats/train-logistic-regression-classifiers-in-classification-learner-app.html