Synthesis of application-level utility functions for autonomic …...1 Synthesis of...

1

Synthesis of application-level utility functions for autonomic self-assessment GiuseppeValetto,PauldeGrandis&DaleSeybold,Jr.DrexelUniversity,DepartmentofComputerScience

Tel.+1215895‐2669

Fax+1215895‐0545

[email protected]@[email protected]

Thisisanextendedversionofthepaper“ElicitationandUtilizationofApplicationlevelUtilityFunctions”,publishedintheProceedingsofthe6thInternationalConferenceonAutonomicComputing(ICAC2009),Barcelona,Spain,June2009.AbstractWepresentanon‐analyticapproachtoself‐assessmentforAutonomicComputing.Ourapproachleveragesutilityfunctions,atthelevelofanautonomicapplication,orevenasingletaskorfeatureperformedbythatapplication.Thispaperdescribesthefundamentalstepsofourapproach:instrumentationoftheapplication;collectionofexhaustivesamplesofruntimedataaboutrelevantqualityattributesoftheapplication,aswellascharacteristicsofitsruntimeenvironment;synthesisofautilityfunctionthroughstatisticalcorrelationoverthecollecteddatapoints;andembeddingofcodecorrespondingtotheequationofthesynthesizedutilityfunctionwithintheapplication,whichenablesthecomputationofutilityvaluesatruntime.Weemployanumberofcasestudies,withtheirresultsandimplications,tomotivateanddiscussthesignificanceofapplication‐levelutility,illustrateourstatisticalsynthesismethod,anddescribeourframeworkforinstrumentation,monitoring,andutilityfunctionembedding/evaluation.

Keywords:Applicationutility,utilityfunctions,instrumentation,nonanalyticselfassessment.

Abbreviations:Aupy:ApplicationutilityinPython;BSD:BerkeleySoftwareDistribution;COTS:CommonOff‐The‐Shelf;CPU:CentralProcessingUnit;FTP:FileTransferProtocol;JDK:JavaDevelopmentKit;IP:InternetProtocol;IT:InformationTechnology;MAPE‐K:Monitor,Analyze,PlanandExecute–Knowledge;NTP:NetworkTimeProtocol;QoS:QualityofService;RFC:RequestForContribution;RL:ReinforcementLearning;SSH:SecureShell;VoIP:VoiceoverIP

Introduction Autonomic software systems should be able to adapt to a wide range ofconditions, including unforeseen and unexpected events thatmay occur asthesystemisoperating.Sucheventsmayoftencausethemostcriticalin‐the‐field faults, since, by definition, they have not been accounted for, ordiscovered, during the various phases of the engineering process. It istherefore arguable that the significant additional cost and complexityincurred in designing and implementing autonomic mechanisms for asoftwaresystemmayat timesbe justifiableonly incase theyprovidesomelevel of protection from unpredictable conditions. However, self‐adaptivefacilitiesareoftendesignedtocoveronlycertainclassesofexpectedorlikelyconditions,anddesigningfortheunexpectedremainsonemajorchallengefortheengineeringofautonomicsystems. Thispaperpresentsanapproachtoattackthatchallenge,relatedtotheareaof self‐assessment, a critical pre‐requisite for being able to take decisions

2

uponanyadaptationthatmayneedtobeenacted.Self‐assessmentrequiresthat the autonomic system possesses means to recognize at least somesalient properties of its operational state, andpredicate upon that state, inordertoevaluatewhetheritissatisfactory,orshouldratherbechanged.Manycurrenttechniquespursuingself‐assessmentare–instrictaccordwiththeMAPE‐Kreferencearchitecture forautonomicsystems[13] ‐analytic innature: for example, they may monitor the occurrence of specific runtimeconditionswithinthesystem,asin[2],andtrytodiagnosetheirrootcause,asin[19]. We propose a nonanalytic approach that leverages the concept ofapplication‐levelutility,anabstractionthatcaptures inasinglescalarvaluethe salient properties of an application, such as health, value, or quality ofservice, with respect to its operational requirements. In practice, ourapproachforutility‐basedself‐assessmentisthree‐phased.Firstly,bymeansofinstrumentation,wecollectruntimedataattwodistinctlevels:anarrayofmeasures sampling the environmental operating conditions for theapplication; and, in parallel, another set of quantitative characteristics ofapplication‐level quality and performance. We then synthesize a utilityfunction for the application, through the statistical correlation of the datacollected at those two levels. Finally, we turn utility into an intrinsiccharacteristic of the application itself, by using meta‐programmingtechniquestoembedtheutilityfunctioninitsruntimelibraries.Thatenablestheonlineevaluationofapplicationutility,which laysanobjectivebasis forself‐assessmentand, in turn,decision‐makingon theneedand the scopeofautonomicadaptations.In the remainder of this paper, we first discuss the motivation andimplications of application‐level utility for autonomic self‐assessment, alsoemploying examples originating fromour various experiments in this area.Wethendescribeourstatisticalmethodforsynthesizingutilityfunctionsonthe basis of runtime monitoring of the application. We also describe aframework we have developed to enable the synthesis and embedding ofutility functions in application code, called Aupy, and present some casestudiesinwhichweappliedAupyindifferentdomains,withtheirresults.Wediscuss the engineering lessonswehave learnedwhile buildingAupy,withimplications in particular to equipping legacy applications with utilityfunctions.Finally,weoutline furtherresearchopportunitiesonapplication‐level utility for autonomic computing. But, first, we briefly review theconceptsofutilityandutilityfunctions,anddiscusstheirmeaningandusageinAutonomicComputingandinourownwork.

Utility in autonomic computing Theconceptofutilityhas itsroots inUtilitarianphilosophy,and itsearliestapplication perhaps in the field of microeconomics, where it indicates thesatisfaction generated from owning or consuming a good or service.ComputerSciencehasborrowedtheterm,andassignedsimilarsemanticstoit.Forexample,inArtificialIntelligence,expectedutilityhasbeenindicatedasa unifying principle for decision‐making rules [6], thus highlighting howutilityexpressesthelevelofsatisfactionofanagent’sgoals.Utilityhasbeenemployedinmanyothercomputingareas,andevenjustanoverviewwould

3

exceedourspacelimitations.Toofferjustafewexamples,incommunicationsystemsutilitycanbeusedtooptimizetheconsumption[3],orpricing[31],of networking resources. In distributed computing, it can for instancearbitratetheallocationofshareddataandotherresourcesamongconcurrentprocesses[15]. In all cases, it is essential to be able to make the utility abstractionoperational.That translates intodevisingautility function that cancapturethedefinition‐andcomputevaluesof‐utilityinthecontextathand.Autilityfunctioncanbeexpressedinthesimpleform:

U=f(V)=f[s0,…,sn]which denotes how utility depends on a vector V of attributes (or statevariables) that capture the state of the system and its environment; thefunction f maps that ‐ possibly large ‐ vector characterizing the runningsystemtoasinglescalarvalueU,conventionallyinthe[0,1]range.In Autonomic Computing, as discussed in [14], utility functions becomeparticularly attractive because of their ability to quantitatively attributepoints in the system state space. They thus provide an objective andquantitative basis for sound, automated decision‐making, and they can tiethosedecisionstohigh‐levelconceptualrequirements,concernsandgoals.Autilityfunctioncanforexampleguideautomatedagentstodeviseorcompareself‐optimizationstrategies,inordertoensurecertainlevelsofperformance,orevenassignmonetaryvaluetotheoperationofasystem.Acommonself‐optimization context in which utility has been applied is the automatedmanagementof large,multi‐applicationcomputing infrastructures.Walshetal. [29], andDas et al. [7], amongothers, have shownhowutility functionscanguidetheallocationofresourcesinadatacenter,basedontheevaluationof the (economic) value of the resources being provisioned for hostedapplications,andofthoseapplicationsfortheirownersorusers.Adifferentself‐optimization scenario is proposed by Cheng et al [5], in which utilityevaluationenables the choiceamongalternative strategies foradapting theconfigurationofthearchitectureofasoftwaresystematruntime.Notice that in many current approaches, including the ones above, thedefinition of the utility functions is not a direct concern for the autonomicsystem itself. Utility functions are oftenmanually elicited with the help ofdomain experts or application users [11]. In those cases, we talk aboutsubjective utility, as it reflects the goals and the values of those humansubjects. For instance, in previouswork co‐authoredby the second author,the elicitation of subjective utility is used to characterize a communicationapplication[25].InAutonomiccomputing,however,itwouldbedesirabletomove towards objective utility, and from elicitation to (at least partially)automatedsynthesis,sincethatleadstoutilityfunctionsthatfaithfullyfittherecorded usage and intended purpose of the target system. In certaincontexts,forinstancethemanagementofITinfrastructures,objectiveutilityfunctions can be derived from contractual agreements, such as SLAs [24].Also, some automated approaches based on negotiation among multipleparties have beenproposed, limited to the resource allocation scenario [1][20].OurworkonAupyisdifferentinpurposeaswellasscope.Itsfocusisuponthe non‐analytic self‐assessment of a single application, or even a single

4

recognizable service or task executed by that application. Some other non‐analyticapproachesexist,based,e.g.,onclustering[9][22].Theyareabletodistinguish between anomalous and common behavior of an application;however, they cannot quantitatively attribute application states thatcorrespond to either anomalies or normal operation. Our application‐levelutility functions can automatically compute a value for the current state ofoperation of the application or task, denoting, for instance, its health orperformance. In that sense, our work is perhaps most similar to that ofSterrittandBustard[26],whichalsoestablishesaframeworkforquantitative(althoughdiscrete)self‐assessment,bymeansofamulti‐levelscalar,calleda“pulse”. However, in that work the definition of pulse levels for anapplication,with their semanticsand thresholds, is left to the implementer.Weareinterested,instead,alsointheobjectivesynthesisofutilityfunctionsthataretightlylinkedtoobservableruntimeproperties.To thatend,weemployastatistics‐basedapproach that ispredicateduponthe identificationof a setof observable characteristicsof theapplicationathand, as well as its runtime environment. The selection of thosecharacteristicsisacriticalstepthatcurrentlyreliesuponexpertknowledge.Forthatreason,wehavefoundthatourapproachappliesparticularlywelltocertain classes of applications, which have some intrinsic coherence ofpurposeintheirfeatureset,suchasprotocol‐basedcommunicationsystems,or infrastructure‐level services and facilities. In the case of veryheterogeneoussystems,likeenterpriseapplications,whichmayofferalargenumberofverydiversefeaturestotheirclients,itbecomesmorecomplicatedto conceive auniformsetof salient characteristics that is representativeofthe system’s utility. However, in those cases, our approach is fine‐grainedenoughtoenabletheself‐assessmentofsomegivenspecifictasksofinterest.Theinterestedreadercanfindamoreexhaustivediscussiononapplicabilityin a later Section of this paper, delineating our experience with our Aupyframework.

Application-level utility Attributing utility to application tasks

As discussed above, the concept of utilitywithin the domain of AutonomicComputing isapplied largelyata“macro” level, that is, toguidepolicies forthe automated management of large‐scale computing infrastructures thatmay host many diverse applications. We maintain that it can be at leastequallyimportantwhenappliedatamorefine‐grained,“micro”level,whichpertainstotheassessmentofhowwellasingleapplicationisabletoperformthetask(ortasks)requestedbyitsusers,orbyothersystemsthatbehaveastheapplication’sclients.Inthatrespect,utilitymustreflectthevalueofferedbytheservicesmadeavailablebytheapplicationtothoseusers/clients,andmustthusbedefinedfromtheirpointofview.In order to elucidate the potential of utility functions for application self‐assessment, we describe two preliminary experiences we have had in thisarea. The first one has to do with assessing the utility of an interactivecommunication system, based on the perceived support it provides to itsusers in fulfilling various collaborative tasks. The second one, instead,discusseshowasystemfordistributedparallelcomputationcanbenefitfrom

5

beingawareoftheutilityofthevariousworkingsubsystemsitiscomposedof. Those experiences offer a rationale for self‐assessment via application‐level utility, and, more specifically, provided us with the motivation fordevelopingaframeworklikeAupy.

Utility of IP teleconferencing

Webrieflydiscussheretheset‐upandoutcomeofacasestudythatismorethoroughlydescribedin[25].ThatcasestudyinvolvedanumberofvolunteerusersofanIPteleconferencingsystem,andrequiredthemtotrytoperformvarious tasks that required pairwise remote communication, mediatedthroughtheIPconferencingsystem.User tasks varied from simple communication exchanges, to time‐sensitiveexchanges,tomorearticulatedcollaborativetasks.Thosetaskswerecarriedoutbypairsofvolunteersontopofaframeworkthatenabledthrottlingthebasic parameters of the network channel upon which IP conferencingoccurred, thus allowing to create a set of tests for each task. A testwouldcorrespond to a combination of bandwidth, latency, and packet lossconditions assigned to the network channel. For each test and task, thecollectionofapplication‐leveldatawascarriedoutbytheusersthemselves,whosubjectivelyratedandrecordedtheirabilityandsatisfactionincarryingout that particular task with the support of the IP teleconferencingapplication. That data on userperceived utility was then correlated to themetricsofthenetworkchannelusedbytheapplication,inordertoobtainamappingofhowtheutilityforeachgiventaskdependedontheenvironment‐levelconditions.For each task, repetitions of test with different network conditions anddifferentpairsofusersyieldedthousandofdatapoints.Astatisticalanalysisontheoutcomeofthoseexperimentsdeterminedthatthesensitivityofeachtasktothevaryingnetworkconditionsisdifferent.Forinstance,increasesinnetworklatencyhaveabiggereffectonthe(lackof)userabilitytoperformsatisfactorily time‐constrained tasks, or collaborative tasks that requiresubstantial amounts of interaction and back‐and‐forth communicationbetween users. Bandwidth and packet loss, instead, are dominant in tasksthat require simpler communication exchanges, such as asking a triviaquestionandgettingananswer.Asa result,differentutility functionswerestatistically devised for our IP teleconferencing application, on a pertaskbasis,byusingthedatamapsprovidedbytheexperiments,andcurve‐fittingthe values of perceived utility to the three environment‐level parametersconsidered.This study not only highlighted the task‐dependent nature of application‐levelutility,butalsoofferedamethodtodeviseoperationalutilityfunctionsforagivenapplicationtaskbasedonmonitoringandstatisticanalysis.

Utility of MapReduce workers

MapReduce is a paradigm for distributed parallel processing of largeamounts of data that has been pioneered by Google [8]. It allows theharnessingoftheresourcesofalargenumberofcommercial‐grademachines(calledworkers, orworkernodes) in a very simpleway, by segmenting theoriginal data set in chunks that can be processed independently, assigning

6

those chunks to the worker nodes for the required processing, and thenmerging the partial results they produce back together. The segmentation,assignment and merging phases of a MapReduce job outlines above areorchestrated by a master node, and may take advantage of large‐scaledistributed storage systems for the input data set, the intermediate resultsandthefinaloutput,suchastheGoogleFileSystem[10];howeverthatdoesnotrepresentapre‐requisite.Since it does not require special‐purpose hardware, or complex parallelsystemsoftware,MapReducehasrecentlybecomequitepopularforallkindsof data‐intensive applications that do not require interaction among theprocessing tasks running onworker nodes. Themain tenets ofMapReduceare also in line with the vision of utility computing [23], and possiblyamenabletobeimplementedasaserviceinacloud,asdiscussedforinstancebyJinandBuyya[12].Oneknownissuewiththisparadigmisits“endgame”,sinceaMapReducejobcan be completed only as quickly as the completion time of its slowestworker node. Strategies for mitigating that issue exist: for example, [8]describes how, if someworkers are unreasonably delaying job completion,whereas the rest of theworker nodes have already finished their part, themaster can abort the processing of the late workers, and re‐assign thecorresponding chunks to some other available nodes. That helps in thosesituations in which some workers are “hanging”, have crashed, or havesimplybecomeexcessivelyslow forsomereason,and ithasbeenshowntoimprove the endgame of MapReduce systems. However, strategies of thiskindonlyreacttoaproblemonceithasalreadyoccurred,andtheirbenefitsare directly tied to the timeliness and accuracy of the problem detectionmechanismsputinplace.Wehavethusstartedtoexplorehowtheconceptofutilityofworkernodes,as seen from the point of view of themaster, can benefit the execution ofMapReducejobs.Thebasicideaisthatafunctioncontinuouslycomputingavalue of utility for eachworker node for each job – based on theworker’spastandcurrentbehavior‐couldbecomepartoftheassignmentpolicyofthemasternode;itcouldthenhelpatthesametimetoimprovecompletiontimefor thevarious jobs, andoptimize theusageofworker resources,basedontheirefficiencyinsatisfyingthespecificprocessingneedsofdifferentjobs.Toexamine this hypothesis we have developed a MapReduce simulator, andcarriedoutaseriesofexperimentsonMapReduceexecution.Our in‐house simulator encompasses the concepts ofmaster,workers, andnetwork links for data transfer, and its parameters include number ofworkers, data chunk size, total data set size, bandwidth for network links,processor speed, maximum number of simultaneous connections that themaster can sustain, etc. The bandwidth of network links varies over time.Similarly, to simulate a set ofworkerswith heterogeneous capabilities dueforexampletodifferenthardware,eachworkerissetwithaprocessingspeedparameter for each job,which varies in a +/‐ 50% rangewith respect to anominalspeed.Eachworkeralsohasavariableloadvalue,whichrepresentsthepercentageoftheprocessorthatisunavailablebecauseitisbeingusedbyother programs. Therefore, the computational power available to eachworkerforagivenjobisdeterminedby:

Computationalpower=processingspeed*(1‐load)

7

Theutilityfunctionthatwehaveuseddependsnotonlyonthefactorabove,butalsoonthecurrentbandwidthofthenetworklinkavailabletoeachnode.Thisdetermines the timeneeded for transferring a chunk, orworkunit, tothe worker, as well as transferring the worker’s results back (we haveassumedinoursimulationsthatthesizeoftheworkunitininputandthesizeoftheprocessingresultsaresimilar).Therefore,thetotaltimeforaworkernodetocompleteaworkunitis:Completiontime=2*UnitSize/bandwidth+UnitSize/ComputationalpowerWecanthenrearrangethatequationtoexpresstheutilityofaworkerfromthepointofviewoftheMapReducemaster,thatis,intermsoftheamountofdata a worker can go through per unit of time. That yields the followingutilityfunction:

€

Uworker = UnitsizeCompletiontime

=1

2bandwidth

+1

Computationalpower

Workerscalculatetheirutilityperiodically(for instance,everyminute),andsend that value to the master. The master records the utility of all of theworkersinvolvedinajob,aswellasthestateofallworkunits(i.e.,whetherthey are unassigned, assigned, or finished), and which work units areassigned toeachworker.Notice that thecommunicationofutilityvalues tothemasteralsoservesasaheartbeatsignal.Themastertimestampsthelastmessageitreceivedfromeachworker,andifutilitydataisnotrefreshedforacertainamountoftime,thentheworkerisconsidered“down”,anditsutilityissetto0untilfurthernoticebythatworker.The MapReduce master exploits its knowledge of utility to rank workernodes,andassignmoreworktothosenodeswithhigherutilityvalues.Itcanalso re‐assign chunks from workers whose utility has dropped, to otherworkersthatrankbetter.Finally,sinceworkerutilityisexpressedintermsofunit sizeandcompletion time, the total amountofdata (i.e., thenumberofchunks)beingassignedtoeachworkercanbe tuned insuchaway tokeepworkersbusyasmuchaspossible. Wehavecomparedsuchautility‐awareassignmentpolicy,toasimplepolicythat assigns chunks uniformly to all available nodes, and implements theendgame re‐assignment optimization we mentioned above, in case someworkershangorcrash. Inaseriesofpairwisesimulations,wehavenoticedsignificantimprovementswith regard to both job completion times, and the overall efficiency of theMapReducesystem,expressedasthepercentageofworkersthatremainbusythroughout the duration of the job. Table 1 reports the comparison ofcompletion times data for three different experimental set‐ups: the firstexperimentwasperformedwith100workers,aunitsizeof15MB,totaldatasize of 72,000MB, 600KB/s averagenetwork speed, and60KB/s averageprocessing speed; the second experimentwas the same as the first, exceptthat it had 300workers, and a total data size of 300,000MB. In the thirdexperiment the total data size was changed to of 600,000 MB, and theaverageprocessingspeedwaschangedto40KB/s.Furthermore,Figures1and2,provideacomparisonofefficiencydatarelatedtoworkersutilizationlevelsforthefirstexperimentdescribedabove.Inthe

8

plotson the left‐handside, thegreen lineshows theproportionofworkersthatarebusyoverthedurationofthejob,whereasontheright‐handsideweshow a graph of the distribution of worker utilization broken into 5%increments. As evident from those Figures, the utility‐aware assignmentpolicy is superior, and ensures very high, constant utilization of workerresources, whereas with the simple assignment policy the utilization ofworkers varies widely between the 45% and 95% percentiles. Also thegraphsforourexperiments#2and3presentsimilarresults.We consider this kind of results particularly encouraging to continueexploring application‐level utility in the context of MapReduce distributedcomputing. Therefore, we have started to look into ways in whichwe canconfirm these results through amore extensive experimental plan:we areconsidering using a third‐party MapReduce simulation platform forindependentvalidation,suchasMRPerf[30],aswellastakingadvantageofMapReducecapabilitiesmadeavailablebycloudinfrastructures,suchastheAmazonElasticMapReduce1.

Lessons learned

Thetwoexperiencesdescribedaboveofferanumberofdifferentinsightsonthe issueofapplication‐levelutility.The lattersuggests thepotentialof thisconcept for achieving self‐assessment, and, in turn, being able to takeinformedadaptationdecisions,without requiringdeepanalytic capabilities,even in the case of significantly complex, large‐scale applicationenvironments. In the case at hand, the utility values of a worker are keptdistinct for any different jobs running concurrently on the MapReduceinfrastructure;themasteristhusabletotuneitspoliciesonaper‐jobbasis,withoutanyjob‐specificknowledge,oraneedtodiagnosetheworkers’state.Itisnoticeable,however,thatinthatcasestudy,wehaveusedapre‐defineddefinitionofworkerutility,anditscorrespondingutilityfunction.Theexperiencegainedwith the IP teleconferencingcasestudy, instead,hasimproved our understanding of how utility can be linked to environment‐levelconditionsthatimpacttheabilityoftheapplicationtoperformitstasks.That insight, in turn, has enabledus todevise amethod fordefiningutilityfunctions starting from a set of observable runtime parameters, usingstatistical tools. Thatmethod is the basis of the automated process for thesynthesisofutilityfunctionswehaveadoptedforAupy,andisatthecoreofthedesignoftheAupyframework,asdiscussedinthefollowingSections.

Synthesizing application-level utility functions In order to attribute utility semantics to a given application,we conduct aprocessofcorrelationandstatisticalfitbetweenavectorofattributesoftheapplication behavior, and another vector of characteristics of its runtimeenvironment.That process is based on the statistical utility elicitation technique used in[25] for IP teleconferencing, but we have worked to extend it to cases inwhich theutility information isnotprovidedbyusers.Our goal is tomovefromelicitationtoautomatedsynthesis,andsubstituteuserperceptionwith

1 Seehttp://aws.amazon.com/elasticmapreduce/

9

objectivemeasuresofbehavioralparametersoftheapplication,asobserved,forexample,byanyclientsystemusingthatapplication’sservices.Toachievethat, we instrument the application, and sample a chosen set of salientapplication‐level parameters, in conjunction with environmentcharacteristics.We then try tocorrelateapplication‐levelandenvironment‐levelmeasures:wecalculatevariousmomentsofthemultidimensionalgraphgenerated from the environment‐level measures, and curve‐fit it to thecorrespondingapplication‐levelgraph.Theequation thatproduces thebestfitbecomesourutility function,whichallowsus toassignautilityscalar tothe running application, by simply monitoring the vector of environment‐andapplication‐levelcharacteristicsincludedinthatequation. Letusconsider,forexample,theutilityfunctionfordownloadingafilewithan FTP service (FTP utility is discussed in detail as a case study furtherbelow). For such a task, the most significant application‐level measure ofutility,seenfromthepointofviewoftheFTPclientrequestingthetransfer,isthroughput. That effectively incorporates and abstracts a number ofphenomena that occur within the FTP networking environment, such asprotocol handshaking and control flow, transfer error, packet loss andretransmission,etc. Italsodependsonotherenvironmental factors,suchasthe raw throughput and the latency provided by the network socket, andtheir fluctuations over the time of the transfer. We observe thoseenvironmentmetrics,aswellasthroughputasseenfromtheFTPclient,foralargenumberoftestruns.Weemployacontrolledtestbedthatenablesustocollectatcorrespondingtimeintervalsthosetuples(ourdatapoints),aswellasvarytherunningconditionsfromtestruntotestrun,sothatwecancrawlsystematicallythroughthemulti‐dimensionaloperationspaceconstitutedbythe metrics being observed. We then proceed to synthesize offline theequation for the application‐levelutility forFTPdownload, basedupon thecollecteddatapoints.Acriticalaspectofourtechniqueisthechoiceofparameterstobeobserved,whichremainsapplication‐oreven task‐specific.However,anadvantageofour approach over analytic techniques for self‐assessment is that – oncedefined – utility functions do not need to monitor any application‐specificevents and patterns thereof, nor develop any diagnostic capability toprove/disprove hypotheses that are specific to the application and thosepatterns. A utility‐based self‐assessment approach may thus requiresignificantlylessdomainknowledgethanmostanalytictechniques,whichisadvantageous, since – as observed for example by Tesauro in [27] ‐ theacquisition,modelingandmaintenanceofdomainknowledge inAutonomicComputing is itselfanopenresearchissue,andtendstobetime‐consumingandhardtogeneralize.Anothermeritofourapproachisthatdatacollectioncanbe fully automated through instrumentationof theapplicationathand,thus enabling collection in contexts that closely match its natural runningconditions. Moreover, once our utility function are synthesized, they arealreadytiedtomeasuresthatareinherentlyobservable:therefore,thesameinstrumentationpreviouslyused for the synthesis process canbedeployedwithinautility‐awareversionoftheapplication,tocomputeitsutilityonthefly,asitoperatesinthefield.

10

Making software applications utility-aware We have prototyped a system of code annotations that supports the datacollectionandsynthesisprocessoutlinedabove,aswellastheembeddingofcode representing the resulting utility function within the runtime of theapplication,atthesamepointswherethedatawascollected.Ourannotationsintercept the program execution upon entering marked functions in thetarget application. To the target application, those function calls seemsemantically identical, however a background process is triggered by theannotation, which monitors and collects the parameters of interest, orcalculates utility on the basis of those parameters. For our prototype, weselected Python as the target language, for its advanced introspectionmechanismsanditsabilitytofostermeta‐levelprogrammingfunctionality.Infact,thename“Aupy”standsfor“applicationutilitywithPython”.InthefirstversionofAupy,ourannotationswereimplementedusingaseriesof Python decorators, associated to method definitions. Those decoratorstrigger code in the Aupy framework that is used either for monitoring orutilitycomputation. Table2providesasynopsisofthosedecorators.The@monitordecoratoristhe centerpiece of our system: it can be associated to any method in theapplication, and can take as an argument a list of functions that are to beexecuted in the same scope as the original “target” method. Those otherfunctionsmustbemarkedwitheither@pre_utility,@post_utility,@utility, or@concurrent_utilityannotations,andspecifywhethertheircodeshouldbe executed – respectively ‐ before the target method, after the targetmethod, before and after the target method (i.e. as a wrapper), orconcurrently(thatis,inthebackgroundbyaseparatethread,whilethetargetmethodperformsitsoperations). Thesedecoratedfunctionsdonottakeanyargumentsdirectly,but,sincetheyareinthesameprogrammingscopeastheirtarget,theyhavevisibilityontoits data, and, specifically, its arguments. They can therefore be used toobservetheworkdonebythetargetmethod,aspartofthesynthesisphaseofourprocess,inwhichwegatherruntimedataontheoperationofthesystem.Similarly, they can be used ‐ once the synthesis of the utility function hasbeencompleted‐forcomputingvaluesofutilitybasedontheobservationofthesamedata.InasecondversionofAupy,wealsointroducedPythoncontexts.Annotationsvia method decorators remain very convenient to make legacy systemsutility‐aware, since Python makes possible to associate them to methodsdefined within ‐ and imported from – libraries. Contexts, and the scopingconstructstheyenable,allowusassociateutilitynotonlytoentiremethods,but also to any block of application code. Therefore, the granularity of ourmonitoring and utility calculation facilities can become finer at will. Thesyntaxofourutilitycontextsisasfollows(boldfacepartshighlightkeywordsandother fixed syntactic elements,whereasparts in anglebracketsdenoteparametricelementsoftheconstruct):

11

with Utility.contextUtility (

<utility function>, <u.f. argument list>, evalFunc=<function>, utilityCallback=<callback function> utilityCallbacArgs=<array of argument for callback>) as status: # here follows original target code …

By analyzing the construct above,we can discuss some further advantagesintroducedwithcontexts.Firstofall,itisworthnoticingthatcontextscanbecombined,byusingmultiplenestedwithstatements,eachassociatedwithitsownutility function.Theability for anouter “parent” context todecide theparametersofaninnercontext,andpassinformationtoit,opensthedoortosophisticatedwaystodefineandcomputeutility.Forexample, it ispossibletodeviseastepwisemethodofutilitycalculation,inwhichthevaluereturnedby the utility function of the outer context becomes one of the argumentspassedtotheutilityfunctionoftheinnercontext.Itisalsopossibletocodeanescalation of diagnostic mechanisms: in such a case, the utility valuescomputed by the outer context are used by the inner context in order todecide whether to activate a different method of utility calculation, whichperhapsworksatadifferentprecisionlevel,ortakesintoaccountadditionaldata,orismoredemandingintermsofcomputingresources.Alsothestatusvariable‐whichisusedtodenotewhetheracontextanditsassociatedutilityfunction,hasbeenexecutedcorrectly–canbeleveragedbynestedcontexts:aninnercontexthasaccesstothatvariable,andcandecidewhatneedstobedonebasedupontheinformationitcarries. Other noteworthy new features are the (optional) evaluation and callbackfunctions.Anevaluationfunctioncanbeusedtopost‐processtherawutilityvalue returned by the utility function; it always returns a boolean value.Evaluationfunctionsservetwomainpurposes.Firstofall,theyallowaclean‐cut separation between code that measures the utility of the application(whichresides intheutility functionassociatedwiththecontext),andcodethatmayuseand interpret thoseutilitymeasures,whichmaybe situation‐dependent,andshouldbeplacedintheevaluationfunction.Thatcontributestoprovidecontextswithricheroperationalsemantics.Forexampleanevaluationfunctionliketheonebelow:def evalAgainstStdDev(u): # u is current utility value

median = computeMedian(u) sd = computeStdDev(u) if (u > median + 2*sd) or (u< median – 2*sd): return False return True

canbeusedtocomparethecurrentvalueofutilitytoadistributionofutilityvaluesrecordedinthatpast,andflagthecurrentvalue,incaseitrepresentsatwo‐standard‐deviationsanomaly.Inthesimplestcase,andasadefault, theAupy programmer can also define bool as the evaluation function, whichrepresentsapass‐throughoption:inthatcase,inpractice,theutilityfunctionitselftakesalsotheroleoftheevaluationfunction.

12

Theothermainpurposeofevaluationfunctionsistoworkasatriggerforthecorrespondingcallbackfunction:wheneveranevaluationfunctionfails(thatis,returnsFalse),andincasethecontextincludestheindicationofacallbackfunction, that function is invokedwith thearrayofparametersprovided inthe context definition. The intent behind callback functions is to provide aflexible andpracticalway to connect the self‐assessment capabilitiesof theAupy frameworkwithanyavailableprovisions for theproactiveadaptationof the application.Callback functions could implement a set of adaptations,suchasremediesandoptimizations,customizedfortheapplicationathand,or could represent the entry point of some other comprehensive, genericframeworkthatcanbeusedtoeffectsuchadaptations,andcanbe– in thisway – plugged in togetherwithAupy to construct a full‐fledged autonomicsolution.Finally, contexts, analogously to decorators, can have differentmoments ofexecution, with respect to the execution of the target code. We havedeveloped,besidesUtility.contextUtility,whichisdepictedintheexampleabove and denotes a wrapper, also the Utility.preContext,Utility.postContext,andUtility.concurrentContextconstructs.In the current version of Aupy, which has become an open source projectundertheMITlicensetofosterfurtherenhancementsandexperimentation 2,method decorators and contexts co‐exist. Their combination supports theretrofitting of legacy systems with utility‐aware behaviors, as well as thedevelopmentofnewsystemswiththesamecapabilities.Thatisillustratedinthe twocasestudies that followanddescribeourexperienceandresults inworkingwithAupy.

Case studies FTP download

The goal of this case studywas to validate our approach to synthesis andembedding of application‐level utility function, as well as the Aupyannotationframework,withinawell‐understoodscenario.WechosetheFTPprotocol, since it is a well‐known and widely studied system, with a clearprincipled parameter that can be used to represent the utility of theapplication,thatis,datathroughput.ThiscasestudywasconductedusingastandardFTPclientwritteninPythonand conforming to the FTP RFC [21]. The FTP client was instrumented torecord itsapplication‐level throughput,aswellasCPUutilization,andRAMusage. From the networking environment, we recorded raw channelthroughput and inter‐arrival time, asmeasured on the socket of the clientmachine. The taskswe profiledwith data collectionwere the download ofonelargemulti‐gigabytefile(inthefollowing,FTPbulkexperiment),andtheconsecutivedownloadofmultipleone‐megabytefiles(inthefollowing,FTPsmallexperiment). Firstofall,anannotatedversionofPythonFTPlibrarywasconstructed.Toaccomplishthat,asubclassoftheFTPclassinthatlibrarywasdeveloped,andweusedAupyannotations tomark thesubclassmethods implementing theFTPcommandsascodetobemonitored.Asshownintheexamplebelow,the2 Seehttp://www.aupy.org

13

annotated methods simply wrap pass‐through statements that invoke thesamefunctionalityintheoriginalFTPlibrary.

@monitor(None) def ntransfercmd(self, command): return super(SafeFTP,

self).ntransfercmd(command) Next,aconcurrentutilityfunctionwasdeveloped,tocaptureandlogdataattheapplicationaswellasattheenvironmentlevel.

@concurrent_utility def ftp_utility(self): # monitoring code below

... The effect of this set of annotations is that, as the monitored transfercommand gets executed, Aupy automatically finds all the utility functionsdefinedwithinthesamelanguagescope‐inthisexampleonlyftp_utility()isavailable‐andrunthemasprescribedbytheir@utilityannotation. Finally,wehijackedtheoriginallibraryandsubstituteditwiththeannotatedversion:thatiseasilyaccomplishedduetothepossibilityinPythontoimportlibrary classes under alternate names, to preserve andmaintain control ofthe namespace. The Python code below overrides the namespace of theoriginalFTPlibrary.

# from ftplib import FTP from aupy_ftp import SafeFTP as FTP

The instrumented FTP client application was then run on a test bed,comprised of two Linux endpoint machines and a BSD machine, whichutilized thedummynetutility 3 to simulate a variety of network conditionsforour test runs.A single test run isa complete file transfer scenario foradummynetnetworksetup,whichiscomprisedofgivenvaluesforbandwidth,latency, and packet loss settings. The clocks on all the machines weresynchronizedbeforeeachtestrunusingNTP[18].Abackchannelwasusedforloggingandadministrativeactions,allowingthenetworkchannelfortheFTPexperimentstobefullydedicatedtotransfers.Wecarriedoutasetof125testrunsandrepeateditthreetimesforboththeFTP‐bulkandFTP‐smallexperiments.Duringeachtestrun,measurementsofenvironmental and application attributes were recorded each second. Thisprocess allowed us to establish quantitative relationships between thoseattributes, by aggregating the data and performing their interpolation.Finally,we obtained an equation that estimates client‐side FTP applicationutility purely on the basis of environment‐level metrics, i.e., packet inter‐arrivaltimeandchannelthroughput. Since for this case study our utility definition directly corresponds to aspecific principal QoS parameter of FTP (data throughput), the statisticalprocess we employ can be seen as analogous to QoS profiling approachesadoptedinadaptiveoractivemiddleware,suchasQualProbes[17].However,thegoalthereistocorrelateapplication‐levelparameterstooneanother,astheactivemiddlewareisinterestedinevaluatinghowinterveninguponsomeparameters under its control may impacts other facets of the applicationrunningontopofitsplatform.Ourgoalisinsteadtofindoutbyobservation

3 Seehttp://info.iet.unipi.it/~luigi/dummynet/

14

whatsetofmeasurableparametersatenvironmentlevelcanbebestusedtoaccuratelymodelthequalityfactorsthatarechosenasbeingrepresentativeoftheapplicationutility.From the process above, we could clearly see how – like in our previousexperience with IP conferencing – the FTP application utility is task‐dependent: the equations we synthesized for the bulk‐FTP and small‐FTPexperiments differ significantly, as shown in the formulas below and theircoefficients,whicharelistedinTable3.

€

Ubulk = a1X1+ a2X2+ a4X4 + a5X5+ b1Y1+ b2Y2+ b5Y5Usmall = a4X4 + a5X5+ b1Y1+ b2Y2+ b4Y4 + b5Y5

Ourempiricalverificationconfirmedhowestimatedutility,i.e.,thevaluesofutility computed through the above equations, approximateswell observedutility,thatis,forthiscasestudy,themonitoredvaluesofthroughputfortheFTPclientatcorrespondingdatapoints.WeshowinFigure3andFigure4acomparisonbetweenobservedandestimatedutility,usingtheplotsobtainedwith MATLAB from the FTP‐bulk experiment. In those Figures, values ofutility are color‐coded, with red being highest utility, and deep blue beinglowest utility. Those color‐coded utility maps are situated within a tri‐dimensional space of observable environmental characteristics of thenetwork(bandwidth,packetlossandlatency).Forsimplicityofvisualization,theFiguresdisplayonlysome“slices”ofthatspace.OurnextstepwastoembedthesynthesizedfunctionsintheFTPclient.Forthat, we simply had to enhance the function ftp_utility, which previouslysimply used to log our monitored metrics, with a call to Python coderepresenting the utility equations. We also carried out further validation,sincewecontinuedto independently logtheFTPclient throughput, tokeeptheobservedvs.estimatedcomparisonactiveonline.TheresultsreportedinFigures3and4wereconfirmedalsointhatvalidationphase.

Log explosion

The goal of this case study was to investigate application‐level utility in adomaindifferentfromprotocol‐basedapplications(suchasFTPorVoIP),andto experimentwith our technique to solve a real‐world problem related tothemanagement of IT infrastructure. The problem at hand regards certainfaults that can cause a system to consume a global resource rapidly andunexpectedly. As a consequence, other functioning parts of the system arestarved, which usually results in complete system failure. Such a cripplingproblem can arise for many disparate reasons in a variety of computingsystems,inthesmallaswellasinthelarge.Wehavebeenabletoworkwithonespecificformofthatproblem,occurringwithin a popular Python‐powered Web site (more than 1 million uniqueviews per day,): a “log explosion”, which caused the log aggregation androtation subsystemof theWeb site infrastructure to be overwhelmedwithhard disk writes, rapidly and at unpredictable junctures. System data onthose events was collected and analyzed, to discover that a round‐robinschedulerinchargeofserver‐sidecachereplicationwithinacachingclusterwouldattimesfailandremain“stuck”onagivennode. Everyattemptthe

15

schedulerwouldmaketomovetothenextnodewouldfail,resultinginalogentry.Veryquicklythelogwouldexhaustdiskspace,tothepointthatthesitewouldnolongerbeabletoacceptnewconnections(whichrequiresloggingtobefunctional).Alsoattemptstologontothenodesbyothermeans,likeanSSHsession,wouldfail,sincetherewouldbenoroomonthedisktologthesession information. Existing monitoring software for the site could notdetectthisproblemquicklyenoughtoavoidthiscompletesystemfailure.Wethereforedecidedtotrytouseourframeworktorecognizelogexplosionsastheyhappen,withouthavingto investigateorresolvetherootcauseof thatfaulty behavior. Our hypothesis was that by monitoring the rate andacceleration of logwrites,we could comeupwith a utility function for theloggingfacilitythatwouldtelluswhenanexplosionwasabouttooccur.Ourtestbedforthisexperimentwassimple.Onemachinewasusedtorunaclean‐roomre‐engineeredversionofthefaultyround‐robinscheduler. Thesystem rotated through logical representations of nodes, one ofwhichwasinjected to cause a fault. Additionally, in the experiment, log informationwas also piped to standard output, to watch the system live. It was pre‐determined that on every twentieth rotation a faultwould be triggered ononeofthenodes,causingittobecomeunresponsivetoallschedulerrequests.WethenembeddedAupycodewithinthescheduler,tocapturetherateoflogwritesaswellastheiraccelerationrate.Undernormaloperatingconditions,the recorded logwrites rate by the schedulerwas consistently low, in theorder of one write/sec. Our concurrent monitors, implemented as Aupycontexts, were set tomonitor at about the same interval. During repeatedexperiments, when an explosion occurred, log write rates increased twoordersofmagnitudeinthefirstsecond,toroughly720‐780writes/sec,andagain two orders of magnitude in the following second to about 15000writes/sec. In our particular testing environment, hardware limits for diskwriteaccessratewerereachedwithinthethirdsecond.Accelerationappearsthereforetobeexponential,whichshowstheextremityofthephenomenon.Consequently, its representation in terms of a utility function produced astep‐wisefunctiondroppingfrom1to0intheeventofanexplosion.Whereas we had hoped for a utility profile, which could have possiblyprovidedmorepreliminaryevidenceofthebuild‐upofthephenomenon,thestep‐wisefunction,onceembeddedinthesystemallowedustorecognizelogexplosions within one second from the triggering of the log explosionphenomenon.We could thus easily put into place a provision to block theround‐robinschedulerintime,throughacustomizedcallbackfunction,stopits runaway logging, and avoid further clogging of the system and its totalfailure.Weintendtocarryoutmoreworkonthisparticularexample,togainfurther insight on the early assessment of explosive resource consumptionproblems.

Discussion of Aupy Our work has provided us with a number of insights on application‐levelutility for autonomic self‐assessment. First of all, we have gained anunderstandingofaprocesstosynthesizeutilityfunctionsbymonitoringthebehaviorof applications, and their environment.Thatprocess is repeatableandrequiresonlyamodicumofapplication‐specificknowledge.Althoughwe

16

haveexperienceonlywithcarryingoutthatprocessinlabconditions,andinpart (i.e. the delicate statistical analysis phase) offline, it seems feasible toconsider it as a basis for working toward online, automated synthesis ofutilityfunctions.WeexpandonthismatterasapossibledirectionforfutureresearchinthenextSection.Ourcasestudieshaveconfirmed thatutilityat theapplication level is task‐dependent,andanumberofconsequencesdescendfromthat.Sinceanarrayofutilityfunctionsapplytothesamesystem,and,insomecases,suchasourcase study on FTP download, even to the same application feature, havingsynthesizedcorrectly thosedistinctutility functions isnotpersesufficient:onemustalsohavethemeanstoselectthemopportunistically,basedonthecontextofthetaskbeingperformed.Thathasengineeringimplications,andhasbecomeoneoftherequirementsdrivingourAupyframework.Ourlatestprototypeallowsassociatingmultiple(possiblyconcurrent)utilityfunctionstothesamecodeblock.Moreover,mechanismsbasedonandenabledbyourutilitycontexts, like theirnesting,canalsoaccount forconditionalselectionandcompositionofourutilityfunctions.The task‐dependent nature of application utility also helps understandinghowourmethodfitsnaturallywellcertainspecificclassesofapplications.Asmentioned earlier, one prominent such class is constituted bycommunication‐oriented systems, because they are usually based onprotocols that formally codify and restrictwhat the application is intendedfor.Thatinformationistypicallysufficientfortheextractionoftheprincipalfeatures of the system, and to guide the synthesis of appropriate utilityfunctions for the application tasks that reify those features. Similarconsiderationsoftenapplytoinfrastructure‐levelsoftwaresystemsthat‐likeinonourcasestudyonserverlogging–arecharacterizedbyalimitedsetofspecific,self‐containedservicestheyprovidetothecomputingenvironmentas a whole. In the case of other classes of applications, e.g., enterprisesystems,thesynthesisofutilityfunctionsacrossthespectrumofallpossiblefeatures anduser tasks canquicklybecomeunfeasible. Inparticular, in theabsenceof automated support in the synthesisprocess, ourapproachmustnecessarily be limited to a subset of those features, typically the ones thatdomainexpertsconsiderthemostimportantorquality‐critical.Anadvantageofthefine‐grainedutilityabstractionsrequiredandatthesametime enabled by our framework is a viable approach to comparable,interoperableandcomposabledefinitionsof“health”fordifferentautonomicapplications,whichdescendsfromtheunit‐lessandsummativepropertiesofutilityfunctions.Itiswellknownhowthisappliesatamacroscopiclevel,e.g.,for multiple applications operating within the same provisioninginfrastructure (leading to a view of utility for the whole computingenvironment, as discussed in [29]). It applies as well at themicro‐level ofdifferent tasks being performed within the same application: for example,related to our FTP case study, the utility of the FTPdownload service as awholecouldbeexpressedasacombinationoftheutilityvaluesrecordedforbulk downloads and small download tasks as they occur. Things areanalogousintheincreasinglycommoncaseofcomponent‐basedapplications,or even systems‐of‐systems, in which different pieces of legacy, COTS orotherwise third‐party functionality are integrated into anewapplication. Ifall componentscouldbecharacterizedwithembeddedutility functions, the

17

utility of the overall composite system could be derived by combining theutilityvaluesrecordedforeachofthevariouscomponents.Lastly,wehavebeenabletotestthelimitsofourhypothesisthattheutilityabstraction can lead to non‐analytic self‐assessment. Domain‐specificknowledgeplaysaroleinourapproach,butonlyintheinitialdatacollectionphase,fortheselectionofthebasiccharacteristicsoftheapplicationortaskthat must be monitored. Past that phase, the whole process of functionsynthesis andembedding, aswell as the computationofutility at run time,remain application‐agnostic. We see this as a major factor enabling theassessmentoftheapplicationunderscrutiny,nomatterhowunexpectedorunpredictableitsstateofoperationcanbecome.

Ongoing and future work An effort that is currently undergoing aims at making available the self‐assessment provisions described so far beyond the repertoire of softwareapplication developed in Aupy. We would like to be able to apply thoseprovisions to prevalent examples of system software, middleware andapplication‐level software. For that reason, we have begun the porting ofAupy to Java. So far, we have implemented and successfully validatedfunctionality that is analogous to Python decorators, by leveraging acombination of Java annotations,which have become a standard feature ofthe JDK since its 1.5 release, and the JBoss Aspect Oriented Programmingframework4, which offers an open‐source library that enables to associatecustomcodetoJavaannotations.Asournextstepinthiseffort,wewillattackthechallengeofportingalsothemeta‐programmingsemanticsandfeaturesofourutilitycontextstotheJavaplatform.A central part of our approach is the process for the synthesis of utilityfunctions, which requires the collection of a large amount of data and itsstatisticalanalysis.Althoughdatacollectionisautomated,thisiscurrentlyatime‐consuming process that occurs in the lab, by means of a test bedenvironmentthatneedstobeconfiguredforeachapplication.Moreover,thestatisticalanalysisoccursoffline,andcanberatherlabor‐intensive.Thisisanissue that must be resolved, in order to adopt utility functions for self‐assessment extensively (that is, for a diverse range of applications), and atthe correct level of granularity (that is, for the distinct features and tasksenabledbyagivenapplication).Wearenowconsideringhowtoattackthatissue. First of all,wewant tomove from in‐the‐lab to in‐the‐fielddata collection.With Aupy, it is easy to deploy in the field an instrumented version of theapplication under scrutiny, so that its various features can be exercised inregular working conditions. However, in that case, it may be difficult toproduceanexhaustivemapofevenlydistributeddatapoints,tobefedtoourstatistical correlation analysis. It is likely that many of the collected datapointswould tend to gather in oneormoredense areaswithin that space,with other data points constituting outliers or operational anomalies.Consequently, itmightbeunfeasible to synthesizeanequation that is validoutside of the main operation areas, and which covers and accurately

4 Seehttp://www.jboss.org/jbossaop/

18

attributes utility to anomalous data points. On the other hand, we couldidentifyandtellapartdifferentoperationareasbyanalyzingin‐the‐fielddatawithclusteringtechniques–asshownforinstancebyQuirozetal.[22]‐andpossiblycharacterizethoseoperationareasintermsofdifferentapplicationfeaturesbeingexercised.Thatwouldeasily leadto thedefinitionofdistinctutility functions for each of them. Drawing an example from our FTP casestudy, tasks likedownloadinga single largemulti‐GB file, anddownloadingmultiple1MBfileswouldcorrespondtotwodistinct,recognizabledatapointregions,andcouldbetreatedindependentlyfromoneanother.We also plan to leverage some AI‐based techniques, to advance towardsfuller automation of the utility function synthesis procedure. For example,cooperative negotiation has been already applied to Autonomic Computingcontexts, in which overall utility is not known or cannot be expressed apriori,andmustbeelicitedincrementallyfrommultipleinterestedparties[1][20].Toapplyandadaptthoseearlierresultstoourcontext,wecouldemploymultipleevaluatingagents,eachdefiningitsutilityintermsofasingleaxisofthemulti‐dimensionalspacebeingmonitored.Eachagentwouldcomputeitslocalutilityvalues foranumberof samplerequests,and theaggregationofthevariousutilityperspectivesthusgeneratedwouldbeusedtocomputetheoverall utility function. Methods for constructing dependency graphs andprofilingtherelationshipsamongmultiplequalityfactors,analogoustothosedescribedin[16][17]byLietal,couldbeusedtoguidesuchanaggregationprocess,aswellasthechoiceoftherelevantdimensions.Furtherinsightonthis matter could come from the field of automated learning. We plan toexperimentwith supervised and unsupervised learning techniques, to helpderiving utility functions from the potentially large and high‐dimensionalspacemonitoredthroughAupy.Finally,sinceautilityfunctioncanberegardedasamulti‐dimensionalmapofapplication value, in which the axes represent environmental as well asapplication attributes, and assuming a management infrastructure that isable to actuate some subset of those attributes (at either level), it isconceivabletoelaborateactionabletrajectorieswithinthatmap,tomoveanapplication from a low‐utility state towards a higher‐utility state. Asomewhat similar idea has been pursued for example in [4]. To efficientlycalculate those trajectories, we could again leverage machine learningtechniques, in particular Reinforcement Learning (RL). RL has beensuccessfullyusedforAutonomicComputing intherecentpasttosynthesizepoliciesfordecision‐making,aimingonceagainatself‐optimization[27][28].A key concept in RL is “reward”, a scalar that valuates the observedconsequence of a decision, andwhich the learning agentmaking decisionsaims tomaximize. The concept of utility naturally maps onto RL rewards;thus,havingsynthesizedautilityfunction,itispossibletocomputerewardsfortrajectorieswithinthestatespaceoftheapplication,andsegmentswithinatrajectory.Withthatdevelopment,utility‐basedmethodscouldbeappliedinAutonomicComputing not only towards non‐analytic self‐assessment (i.e., within thereactive/introspective“Monitor+Analyze”portionoftheMAPEcontrolloop),but also to the equally important areaofnon‐analytic, automateddecision‐making.Suchatechniquecouldthusenablethedesignofprovisionsforthe

19

“Plan+Execute”halfoftheMAPEloopdevotedtoproactiveadaptation,whichcouldpotentiallycopewithunexpectedeventsandconditions.

Conclusions Thispaperpresentsamethodandatoolforthesynthesisofapplication‐levelutilityfunctions,whichcanbeemployedforautonomicself‐assessment.Ourworkhasprovideduswithanunderstandingofthepotentialofconsideringutility at the level of an application and its single, recognizable tasks orfetures.Ithasalsoledustodeviseasynthesisprocessbasedonthestatisticalanalysis of measurable data that is collected at two levels: the applicationunder scrutiny, and the running environment hosting that application.Wehave also acquired know‐how on the engineering of a framework for datamonitoring and collection, as well as the embedding of synthesized utilityfunctionsintheruntimecode,whichleadstoseamlesscomputationofutilityas theapplicationsoperates.Toexploit thatknow‐how,wehavedevelopedan open source prototype, Aupy, and experimented with it in two casestudies from diverse application domains. Through those case studies wehave acquired further insight on the task‐based nature of application‐levelutility, which has prompted us to include in our framework fine‐grained,block‐wisesupportforutilityfunctions.Wearenowinvolvedtoexpandthefunctionalityofourprototypetootherplatforms.Thisworkshowshowapplication‐levelutility isaviableabstractionforthenon‐analytic self‐assessment of autonomic systems, and can be madeoperationalfornewlydevelopedaswellaslegacysystems.Finally,thisworkcanrepresentafirststepthatenablesfurtherresearchonapplication‐level utility in Autonomic Computing, regarding the fullyautomated,in‐the‐fieldsynthesisofutilityfunctions,aswellasthepossibilityto leverage utility information for automated and non‐analytic decision‐makingonapplicationadaptation.

Acknowledgements The authorswould like to thank the othermembers of the SoftwareEngineering Research Group (SERG) at Drexel University. In particular,we are grateful toProf. Spiros Mancoridis, and to students Max Shevertalov and Ed Stehle, who have beeninstrumentalininitiatingandcarryingouttheworkleadingtothereportedresults.Sepcialthanks to Justin Hummel for his work on the porting of Aupy to the Java platform, hisassistanceindocumentingAupyfeatures,andhishelpintherevisionofthismanuscript.

References [1] C.Boutilier,R.Das,J.O.Kephart,G.Tesauro,andW.E.Walsh,“CooperativeNegotiationinAutonomicSystemsusingIncrementalUtilityElicitation,”InProceedingsofthe19thConferenceonUncertaintyinArtificialIntelligence,Acapulco,Mexico,August2003.

[2] M.Brodie,S.Ma,G.Lohman,L.Mignet,M.Wilding,J.Champlin,andP.Sohn.“QuicklyFindingKnownSoftwareProblemsviaAutomatedSymptomMatching”,inProceedingsofthe2ndInternationalConferenceonAutonomicComputing(ICAC2005),Seattle,WA,USA,June2005.

[3] Z.CaoandE.W.Zegura,“Utilitymax‐min:anapplication‐orientedbandwidthallocationscheme,”inProceedingsofIEEEINFOCOM'99,NewYork,NY,USA,April1999.

[4] M.Karlsson,andM.Covell,“DynamicBlack‐BoxPerformanceModelEstimationforSelf‐TuningRegulators”,inProceedingsofthe2ndInternationalConferenceonAutonomicComputing(ICAC2005),Seattle,WA,USA,June2005.

20

[5] S.W.Cheng,D.Garlan,andB.Schmerl,“Architecture‐basedself‐adaptationinthepresenceofmultipleobjectives,”inProceedingsoftheICSEInternationalWorkshoponSelf‐adaptationandself‐managingsystems(SEAMS2006),Shangai,China,May2006.

[6] F.C.Chu,andJ.Y.Halpern,“GreatExpectations.PartII:generalizedexpectedutilityasauniversaldecisionrule”,ArtificialIntelligence,Elsevier,159(1‐2):297‐302,November2004.

[7] R.Das,I.Whalley,andJ.O.Kephart,“Utility‐basedcollaborationamongautonomousagentsforresourceallocationindatacenters”,inProceedingsofthe5thIntl.JointConferenceonAutonomousAgentsandMultiagentSystems,Hakodate,Japan,2006.

[8] J.Dean,andS.Ghemawat,“MapReduce:SimplifiedDataProcessingonLargeClusters”,inProceedingsofthe6thSymposiumonOperatingSystemsDesignandImplementation(OSDI2004),SanFrancisco,CA,USA,December2004.

[9] W.Dickinson,D.Leon,andA.Podgurski,“Findingfailuresbyclusteranalysisofexecutionprofiles”,inProceedingsofthe23rdInternationalConferenceonSoftwareEngineering(ICSE),Toronto,Ontario,Canada,May2001

[10] S.Ghemawat,H.Gobioff,andS.Leung,“TheGoogleFileSystem”,inProceedingsofthe19thACMSymposiumonOperatingSystemsPrinciples,LakeGeorge,NY,USA,October2003.

[11] D.E.Irwin,L.E.GritandJ.S.Chase,“BalancingRiskandRewardinaMarket‐BasedTaskService,”inProceedingsofthe13thIEEEInternationalSymposiumonHighPerformanceDistributedComputing(HPDC‐13),Honolulu,HI,USA,June2004.

[12] C.Jin,andR.Buyya,“MapReduceProgrammingModelfor.NET‐BasedCloudComputing”,LectureNotesinComputerScienceVol.5704,pp.417‐428,Springer,2009.

[13] J.O.Kephart,andD.M.Chess,“TheVisionofAutonomicComputing”,IEEEComputer36(1):41‐50,January2003

[14] J.O.Kephart,andR.Das“AchievingSelf‐ManagementviaUtilityFunction”,IEEEInternetComputing11(1):40‐48,January2007.

[15] J.F.Kurose,andR.Simha,“AMicroeconomicApproachtoOptimalResourceAllocationinDistributedComputerSystems”,IEEETransactionsonComputers,May1989,38(5):pp705‐717.

[16] B.Li.“AHierarchicalGraphModelforProbingMultimediaApplications”.InProceedingsoftheIEEEInternationalConferenceonMultimediaandExpo(ICME2001),Tokyo,Japan,August2001.

[17] B.LiandK.Nahrstedt.“QualProbes:MiddlewareQoSProfilingServicesforConfiguringAdaptiveApplications”.InProceedingsofMiddleware2000,HudsonRiverValley,NY,USA,April2000.

[18] D.L.Mills,“NetworkTimeProtocol(Version3)Specification,ImplementationandAnalysis”,IETFRFC1305,March1992,availableat:http://www.faqs.org/rfcs/rfc1305.html

[19] A.V.Mirgorodskiy,N.Maruyama,andB.P.Miller,“Problemdiagnosisinlarge‐scalecomputingenvironments”,inProceedingsofthe2006ACM/IEEEConferenceonSupercomputing,Tampa,FL.,USA,2006.

[20] R.Patrascu,C.Boutilier,R.Das,J.O.Kephart,G.Tesauro,andW.E.Walsh,“NewApproachestoOptimizationandUtilityElicitationinAutonomicComputing,”InProceedingsoftheNationalConferenceonArtificialIntelligence,Pittsburgh,PA,USA,July2005.

[21] J.Postel,andJ.Reynolds,“FileTransferProtocol(FTP)”,IETFRFC959,October1985,availableat:http://www.ietf.org/rfc/rfc959.txt

[22] A.Quiroz,N.Gnanasambandam,M.Parashar,andN.Sharma,“RobustClusteringAnalysisfortheManagementofSelf‐MonitoringDistributedSystems,”JournalofClusterComputing,12(1):73‐85,March2009.

[23] M.A.Rappa,“Theutilitybusinessmodelandthefutureofcomputingservices”,IBMSystemsJournal,43(1):32‐42,InternationalBusinessMachineCorporation,2004.

[24] M.SalleandC.Bartolini,“Managementbycontract,”InProceedingsofNetworkOperationsandManagementSymposium(NOMS2004),Seoul,Korea,April2004.

[25] E.Stehle,M.Shevertalov,P.deGrandis,S.Mancoridis,andM.Kam,“TaskDependencyofUserPerceivedUtilityinAutonomicVoIPSystems”,inProceedingsoftheInternationalConferenceonAutonomicandAutonomousSystems,(ICAS2008),Gosier,Guadeloupe,March2008.

21

[26] R.Sterrit,andD.Bustard.“AHealth‐CheckModelforAutonomicSystemsBasedonaPulseMonitor”,KnowledgeEngineeringReview,21(3):195‐204,September2006.

[27] G.Tesauro,“ReinforcementLearninginAutonomicComputing.AManifestoandCaseStudies,”IEEEInternetComputing,11(1):22‐30,January2007.

[28] G.Tesauro,N.K.Jong,R.Das,M.N.Bennani,“Ontheuseofhybridreinforcementlearningforautonomicresourceallocation,”JournalofClusterComputing,10(3):287‐299,September2007.

[29] W.E.Walsh,G.Tesauro,J.O.Kephart,andR.Das,“UtilityFunctionsinAutonomicSystems”,inProceedingsoftheInternationalConferenceonAutonomicComputing(ICAC2004),NewYork,NY,USA,May2004.

[30] G.Wang,A.R.Butt,P.Pandey,andK.Gupta,“ASimulationApproachtoEvaluatingDesignDecisionsinMapReduceSetups”,inProceedingsofthe17thAnnualIEEE/ACMInternationalSymposiumonModelling,AnalysisandSimulationofComputerandTelecommunicationSystems(MASCOTS'09),London,UK.Sep.2009.

[31] X.WangandH.Schulzrinne,“Pricingnetworkresourcesforadaptiveapplicationsinadifferentiatedservicesnetwork,”inProceedingsofINFOCOM2001,Anchorage,AK,USA,April2001.

22

Figure1:MapReduceworkerutilizationwithutility‐awareassignmentpolicy.

Figure2:MapReduceworkerutilizationwithsimpleassignmentpolicy.

Figure3:observedFTPutility.

23

Figure4:estimatedFTPutility.

24

Table1:comparisonofMapReduceassignmentpolicies–jobcompletiontime.EXPERIMENT UTILITYAWARE

POLICYSIMPLEPOLICY

IMPROVEMENT

#1 25,939s 38,926s 33%#2 34,226s 47,368s 28%#3 120,671s 188,425s 36%Table2:AupyAnnotations.INTENT SYNTAX NOTESMonitoroperationoftheannotatedcode

@monitor(*args) *args isadynamiclistpopulatedwithutilityfunctions.IfNoneispassed,itwilluseallfunctionsdefinedwithinthesamescopeastheannotatedcode.

Pre‐evaluatedutilityfunction

@pre_utility Evaluationoccursuponenteringtheannotatedcode.

Post‐evaluatedutilityfunction

@post_utility Evaluationhappensuponleavingtheannotatedcode.

Envelopingutilityfunction

@utility Evaluationoccursuponentering,andagainafterleaving,theannotatedcode.

Concurrentlyevaluatedutilityfunction

@concurrent_utility Evaluationoccursconcurrentlywiththeexecutionoftheannotatedcode.

Table3:utilityequationsfortheFTPcasestudyLegend

Xi : ith statistical moment of packet inter-arrival time Yi : ith statistical moment of socket throughput

FTP-bulk FTP-small a1 -175150.7079 a1 0 a2 -145163.8736 a2 0 a3 0 a3 0 a4 -579.7485 a4 1028.3839 a5 2.0136 a5 -157.4976 b1 0.9241 b1 0.56368 b2 -0.2341 b2 -0.42717 b3 0 b3 0 b4 0 b4 612.90247 b5 -0.17755 b5 15.18621

Synthesis of application-level utility functions for autonomic …...1 Synthesis of...

Documents

Transcript of Synthesis of application-level utility functions for autonomic …...1 Synthesis of...