Source: cs229.stanford.edu/proj2015/113_poster.pdf


Predicting Final Scores of Major League Baseball Games

Nicolas Cserepy, Robbie Ostrow, Ben Weems

CS229: Machine Learning
December 8th, 2015

Data Sources

1. Retrosheet.org
2. Chadwick: Software Tools for Scoring Baseball Games
3. sportsbookreview.org

Introduction

Baseball: America's national pastime. The MLB had $7.2 billion in revenue in 2010, and, according to CNBC, $30 to $40 billion is bet illegally on the game every year. There are 2,430 baseball games in a season. That's over 190,000 plate appearances each year. We leveraged comprehensive data since 1980 (over 7 million data points) to predict the scores of baseball games. We used the data to repeatedly simulate every at bat of every game. Sophisticated analysis of these results revealed surprisingly predictive statistics that leave us questioning the proverbial wisdom: "The house always wins."

Simulating Games

Exploring the State Space

We treat baseball games as a Markov decision process. However, there are too many possible states and transitions to feasibly calculate true probabilities. As such, we randomly sample from the state space to estimate the true distribution of games.

Learning P(actions|state)

Statistics are ubiquitous in baseball. We know every player's batting average and every pitcher's earned run average; we can calculate any statistic we need, at arbitrary levels of granularity. A player's hitting statistics, however, are not sufficient to calculate the probability distribution for some at bat. These probabilities are a combination of hitting statistics, pitching statistics, and environmental variables (like runners on base, handedness, etc.). We have 7.2 million instances of at bats, which we featurized into about 100 features (mostly sparse). Each instance is matched to a result, like "single," "strikeout," or "fielder's choice". Splitting these examples into 70% development and 30% testing, we run multinomial logistic regression to more accurately calculate P(actions|state).
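As a rough illustration of what multinomial logistic regression computes here, the sketch below fits a softmax model on toy data and reads off a probability for each result. Everything in it is an assumption made for illustration: the three-result subset in `ACTIONS`, the three-element feature vector (a bias term plus two binary features named after entries in Table 1), the synthetic data rule, and the plain batch gradient descent trainer. It is not the authors' pipeline, which used about 100 features and 7.2 million at bats.

```python
import math
import random

ACTIONS = ["single", "strikeout", "walk"]  # illustrative subset of results


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, features):
    """P(action | state): one weight vector per action, dotted with the features."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(scores)


def train(examples, n_features, epochs=200, lr=0.5):
    """Fit the weights by batch gradient descent on the cross-entropy loss."""
    weights = [[0.0] * n_features for _ in ACTIONS]
    for _ in range(epochs):
        grads = [[0.0] * n_features for _ in ACTIONS]
        for features, label in examples:
            probs = predict(weights, features)
            for k in range(len(ACTIONS)):
                error = probs[k] - (1.0 if k == label else 0.0)
                for j, x in enumerate(features):
                    grads[k][j] += error * x
        for k in range(len(ACTIONS)):
            for j in range(n_features):
                weights[k][j] -= lr * grads[k][j] / len(examples)
    return weights


def mean_log_loss(weights, examples):
    """Average negative log-probability assigned to the observed results."""
    return -sum(math.log(predict(weights, f)[y]) for f, y in examples) / len(examples)


# Synthetic at bats: features are [bias, has_one_out, batter_and_pitcher_same_handed].
rng = random.Random(0)
examples = []
for _ in range(300):
    same_hand = rng.randint(0, 1)
    features = [1.0, float(rng.randint(0, 1)), float(same_hand)]
    # Toy rule (an assumption): same-handed matchups produce more strikeouts.
    true_probs = [0.2, 0.6, 0.2] if same_hand else [0.5, 0.2, 0.3]
    label = rng.choices(range(len(ACTIONS)), weights=true_probs)[0]
    examples.append((features, label))

split = int(0.7 * len(examples))  # 70% development, 30% testing
dev, test = examples[:split], examples[split:]
weights = train(dev, n_features=3)

same_handed_state = [1.0, 0.0, 1.0]
print(dict(zip(ACTIONS, predict(weights, same_handed_state))))
print("held-out log loss:", mean_log_loss(weights, test))
```

The predicted probabilities for a state are exactly what a simulator needs in order to draw a weighted random result for the batter.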

Conclusions

We have successfully represented Major League Baseball games as a Markov chain, and through Monte Carlo simulations of these games we can generate meaningful results that are competitive with state-of-the-art techniques. By focusing on high-confidence games, we generate meaningful results: we found, with p < .05, that our predictions for games with more than 80% confidence are expected to guess the correct result for the over/under. We're still working on tweaking our feature selection and algorithm, and expect to improve our model and results in the coming days.

"Actions" is the set of possible results for the batter, and "outcomes" is the set of possible states after a play has been made.

1. Enter start state (away team hitting, nobody on base, etc.)
2. Repeat until game end:
   (a) Calculate P(actions|state) (learn this probability!)
   (b) Choose weighted random action
   (c) Calculate P(outcomes|action) (we assume that this is the same for all games)
   (d) Choose weighted random outcome
   (e) Go to the state that outcome specifies
3. Gather statistics about simulation

We simulate each game 10,000 times. The key to simulating accurate games is learning P(actions|state).
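The simulation steps above can be sketched in a few dozen lines. This is a heavily simplified illustration, not the authors' simulator: `action_probs` is a hypothetical fixed table standing in for the learned P(actions|state), the `OUTCOMES` table of runner advancement is invented, outs never move runners, and every game is exactly nine full innings with no extra innings or walk-offs.

```python
import random
from collections import Counter


def action_probs(state):
    """Hypothetical stand-in for the learned P(actions|state); ignores the state."""
    return {"out": 0.68, "walk": 0.08, "single": 0.15,
            "double": 0.05, "triple": 0.01, "home_run": 0.03}


# Hypothetical P(outcomes|action), shared by all games: the batter takes
# `bases`, and every runner already on base advances by a sampled amount.
OUTCOMES = {
    "single": {"bases": 1, "runner_advance": {1: 0.6, 2: 0.4}},
    "double": {"bases": 2, "runner_advance": {2: 0.6, 3: 0.4}},
    "triple": {"bases": 3, "runner_advance": {3: 1.0}},
    "home_run": {"bases": 4, "runner_advance": {4: 1.0}},
}


def weighted_choice(rng, dist):
    """Draw one key from a {option: probability} mapping."""
    options = list(dist)
    return rng.choices(options, weights=[dist[o] for o in options])[0]


def walk(runners):
    """Batter takes first; runners move up one base only when forced."""
    first_empty = 1
    while first_empty in set(runners):
        first_empty += 1
    return [r + 1 if r < first_empty else r for r in runners] + [1]


def play_half_inning(rng):
    """Simulate plate appearances until three outs; return runs scored."""
    outs, runs, runners = 0, 0, []  # runners holds the base (1-3) of each runner
    while outs < 3:
        state = {"outs": outs, "runners": tuple(sorted(runners))}
        action = weighted_choice(rng, action_probs(state))  # steps (a) and (b)
        if action == "out":
            outs += 1
            continue
        if action == "walk":
            runners = walk(runners)
        else:
            outcome = OUTCOMES[action]
            advance = weighted_choice(rng, outcome["runner_advance"])  # (c), (d)
            runners = [r + advance for r in runners] + [outcome["bases"]]
        runs += sum(1 for r in runners if r >= 4)  # step (e): move to the new state
        runners = [r for r in runners if r < 4]
    return runs


def simulate(n_games, seed=0, innings=9):
    """Step 3: histogram of total runs (both teams) over n_games simulated games."""
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(n_games):
        totals[sum(play_half_inning(rng) for _ in range(2 * innings))] += 1
    return totals


totals = simulate(1000)  # the poster uses 10,000 simulations per game
print(sorted(totals.items()))
```

Replacing `action_probs` with a model that actually depends on the state (batter, pitcher, runners, and so on) is where the learned probabilities would plug in.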

Table 1. A few examples of features.

Binary-Valued                     Real-Valued
has_one_out                       bat_singles_per_try_against_same_hand
is_7th_inning                     pitcher_doubles_per_batter
batter_and_pitcher_same_handed    bat_homers_in_last_30_attempts
runner_on_second                  bat_doubles_per_try_against_diff_hand
in_stadium_5                      bat_walks_per_plate_appearance

Graph 1. Confidence vs. Return (axes: confidence in result, average return on $100 bet)

Graph 2: Example simulation (axes: runs scored, count of games)
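The poster does not spell out how "confidence" is computed. One plausible reading, assumed here purely for illustration, is the fraction of simulated games whose total runs land on one side of the bookmaker's over/under line. The function names, the handling of pushes, and the sample totals below are all hypothetical.

```python
def over_under_confidence(simulated_totals, line):
    """Return (pick, confidence) for an over/under line.

    Confidence is the share of simulations on the majority side of the line.
    Totals exactly equal to the line (a push) are excluded.
    """
    over = sum(1 for t in simulated_totals if t > line)
    under = sum(1 for t in simulated_totals if t < line)
    decided = over + under
    if decided == 0:
        return None, 0.0
    if over >= under:
        return "over", over / decided
    return "under", under / decided


def should_bet(simulated_totals, line, threshold=0.8):
    """Only bet when confidence exceeds the threshold (80% in the poster)."""
    pick, confidence = over_under_confidence(simulated_totals, line)
    return pick if confidence > threshold else None


# Ten made-up simulated run totals for a single game.
totals = [10, 11, 9, 12, 6, 10, 9, 13, 8, 11]
print(over_under_confidence(totals, 7.5))  # 9 of 10 totals exceed 7.5
print(should_bet(totals, 8.5))             # 8 of 10 is not above the 0.8 threshold
```

With 10,000 simulations per game instead of ten, the same calculation would give a much smoother confidence estimate.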