Journal of Software Engineering and Applications, 2011, 4, 396-410
doi:10.4236/jsea.2011.47046 Published Online July 2011 (http://www.SciRP.org/journal/jsea)

Synthetic Workload Generation for Cloud Computing Applications

Arshdeep Bahga, Vijay Krishna Madisetti
Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, USA.
Email: {arshdeep, vkm}@gatech.edu

Received May 19th, 2011; revised June 18th, 2011; accepted June 26th, 2011.

ABSTRACT
We present techniques for characterization, modeling and generation of workloads for cloud computing applications. Methods for capturing the workloads of cloud computing applications in two different models - benchmark application and workload models - are described. We give the design and implementation of a synthetic workload generator that accepts the benchmark and workload model specifications generated by the characterization and modeling of workloads of cloud computing applications. We propose the Georgia Tech Cloud Workload Specification Language (GT-CWSL) that provides a structured way for specification of application workloads. The GT-CWSL combines the specifications of benchmark and workload models to create workload specifications that are used by a synthetic workload generator to generate synthetic workloads for performance evaluation of cloud computing applications.

Keywords: Synthetic Workload, Benchmarking, Analytical Modeling, Cloud Computing, Workload Specification Language

1. Introduction
Synthetic workload generation techniques are required for performance evaluation of complex multi-tier applications such as e-Commerce, Business-to-Business, Banking and Financial, Retail and Social Networking applications deployed in cloud computing environments. Each class of applications has its own characteristic workloads. There is a need for automating the process of extraction of workload characteristics from different applications, and for a standard way of specifying the workload characteristics that can be used for synthetic workload generation for evaluating the performance of applications. The performance of complex multi-tier systems is an important factor for their success. Therefore, performance evaluations are critical for such systems. Provisioning and capacity planning is a challenging task for complex multi-tier systems as they can experience rapid changes in their workloads. Over-provisioning in advance for such systems is not economically feasible. Cloud computing provides a promising approach of dynamically scaling capacity up or down based on the application workload. For resource management and capacity planning decisions, it is important to understand the workload characteristics of such systems and measure the sensitivity of the application performance to the workload attributes.
In this paper we briefly propose: 1) techniques for extraction of semantic and time behaviors from applications at both task and operational levels for multi-tenanted cloud platforms, 2) benchmark and workload models for complex multi-tier applications that allow describing different benchmarks in the form of building blocks, 3) the Georgia Tech Cloud Workload Specification Language (GT-CWSL) that provides a standard way for defining application workloads in a form that can be used by synthetic workload generation techniques, and 4) workload generation techniques based on the workload specifications of enterprise applications, for generating synthetic workloads. In this paper we describe a workload characterization, modeling and generation approach that can be used for a wide range of multi-tier applications. We evaluate the proposed methodology using the RUBiS e-commerce benchmark that models an online auction site (such as ebay.com) and the TPC-W benchmark that models an online book store (such as amazon.com). We describe the characterization and modeling of the workloads of the RUBiS and TPC-W benchmarks and provide a comparison of the generated synthetic workloads and the empirical workloads obtained from logged traces.
2. Related Work
Several studies on analysis and modeling of web workloads have been done [1-3]. Since obtaining real traces from complex multi-tier systems is difficult, a number of benchmarks have been developed to model the real systems [4-8]. There are several workload generation tools developed to study Web servers, such as SPECweb99 [6], SURGE [9], SWAT [10] and httperf [3]. Such workload generation tools repeatedly send requests from machines configured as clients to the intended systems under test. Table 1 provides a comparison of a few workload generation tools. Several other tools generate synthetic workloads through transformation (e.g., permutation) of empirical workload traces [11-13]. The commonly used techniques for workload generation are user emulation and aggregate workload generation. In user emulation, each user is emulated by a separate thread that mimics the actions of a user by alternating between making requests and lying idle. The attributes for workload generation in the user emulation method include think time, request types, inter-request dependencies, etc. User emulation allows fine-grained control over modeling the behavioral aspects of the users interacting with the system under test; however, it does not allow controlling the exact time instants at which the requests arrive at the system [9]. This is because in user emulation a new request is issued only after the response to the previous request has been received. Thus, due to network delays, heavy loads on the system under test, etc., the intervals between successive requests increase. Aggregate workload generation is another approach that allows specifying the exact time instants at which the requests should arrive at the system under test [14]. However, there is no notion of an individual user in aggregate workload generation; therefore, it is not possible to use this approach when dependencies between requests need to be satisfied. Dependencies can be of two types - inter-request and data dependencies. An inter-request dependency exists when the current request depends on the previous request, whereas a data dependency exists when the current request requires input data that is obtained from the response of the previous request.

3. Motivation
We now describe the motivation for workload characterization and modeling, workload specification and synthetic workload generation for cloud computing applications.

3.1. Workload Modeling
Workload modeling involves creation of mathematical models that can be used for generation of synthetic workloads. Workloads of applications are often recorded as traces of workload-related events such as arrival of requests along with the time-stamps, details about the users requesting the services, etc. Analysis of such traces can provide insights into the workload characteristics, which can be used for formulating mathematical models for the workloads.

3.2. Workload Specification
Since the workload models of each class of cloud computing applications can have different workload attributes, there is a need for standardizing the specification of application workloads. A Workload Specification Language (WSL) can provide a structured way for specifying the workload attributes that are critical to the performance of the applications.
A WSL can be used by synthetic workload generators for generating workloads with slightly varying characteristics. This can be used to perform a sensitivity analysis of the application performance to the workload attributes by generating synthetic workloads.

3.3. Synthetic Workload Generation
An important requirement for a synthetic workload generator is that the generated workloads should be representative of the real workloads and should preserve the important characteristics of real workloads, such as inter-session and intra-session intervals. There are two approaches to synthetic workload generation: 1) the empirical approach, in which traces of applications are sampled and replayed to generate the synthetic workloads, and 2) the analytical approach, which uses mathematical models to define the workload characteristics that are used by a synthetic workload generator. The empirical approach lacks flexibility, as the real traces obtained from a particular system are used for workload generation, and these may not represent well the workloads on other systems with different configurations, load conditions, etc. On the other hand, the analytical approach is flexible and allows generation of workloads with different characteristics by varying the workload model attributes. With the analytical approach it is possible to modify the workload model parameters one at a time and investigate the effect on application performance, to measure the application sensitivity to different parameters.

4. Current Challenges & Contributions
We now describe the shortcomings in previous approaches and the contributions of our proposed methodology to address these shortcomings.
Table 1. Comparison of published approaches.

httperf [3]. Approach: has a core HTTP engine that handles all communication with the server, a workload generation module that is responsible for initiating appropriate HTTP calls at the appropriate times, and a statistics collection module. Application: for generating various HTTP workloads and for measuring server performance. Input: request URLs, specifications of the request rates, number of connections, etc. Output: requests generated at the specified rate.

SURGE [9]. Approach: uses an offline trace generation engine to create traces of requests; web characteristics such as file sizes, request sizes, popularity, temporal locality, etc., are statistically modeled. Application: for testing network and server performance. Input: pre-computed data-sets consisting of the sequence of requests to be made, the number of embedded files in each web object to be requested, and the sequences of Active and Inactive OFF times to be inserted between requests. Output: traffic that agrees with the six distributional models that make up the SURGE model (file sizes, request sizes, popularity, embedded references, temporal locality, and OFF times).

SWAT [10]. Approach: uses a trace generation engine that takes sessionlets (a sequence of request types from a real system user) as input and produces an output trace of sessions for stress testing; SWAT uses httperf for request generation. Application: for session-based web applications. Input: trace of sessionlets obtained from access logs of a live system under test, specifications of think time, session length, session inter-arrival time, etc. Output: trace of sessions for stress testing.

4.1. Accuracy
The effectiveness of any benchmarking methodology is defined by how accurately it is able to model the performance of the application. Accuracy of a benchmarking methodology is determined by how closely the generated synthetic workloads mimic the realistic workloads. Aggregate workload generation techniques, such as the one used in Geist [14], can run into difficulties when inter-request or data dependencies exist. Therefore, we adopt a user emulation approach in which the workload characterizations are in the form of the behavior of an individual user. By accurately modeling the application characteristics (request types, inter-request dependencies, data dependencies, transition probabilities, think times, inter-session intervals and session lengths) through analysis of application traces, our proposed methodology is able to generate workloads that are representative of the real workloads.

4.2. Ease of Use
The existing tools reviewed in the related work section require a significant amount of hand-coding effort for writing workload generation scripts that take into account the dependencies between requests, workload attributes, etc. For example, TPC-W [7] uses a remote browser emulation (RBE) system for generating workloads that accepts specifications for the workload mix, which are provided in separate script files. Furthermore, there are scripts for the requests that are executed such that the specified workload mix can be obtained. To add new specifications for the workload mix and new requests, additional scripts need to be written. Writing additional scripts for new requests may be complex and time consuming, as inter-request dependencies need to be taken care of.
In our proposed approach, we perform an automated analysis of the application traces and extract the application characteristics using the benchmark and workload model generators. New requests can be added by changing the benchmark model specification, whereas a new workload mix can be specified by making changes in the workload model specification. The GT-CWSL code generator accepts these benchmark and workload model specifications and generates the benchmark driver file that drives the synthetic workload generator.

4.3. Flexibility
Our proposed methodology allows fine-grained control over workload attributes such as think time, inter-session interval, session length, workload mix, etc. By varying the characterizations of the attributes of the benchmark and workload models, different workloads can be generated to test the system under study and also to perform a sensitivity analysis of the performance of the system to the various model attributes. Sensitivity analysis is performed by varying one workload characteristic at a time while keeping the others constant. Such an analysis is not possible using an empirical approach, as it is not possible to obtain empirical traces with such varying workloads.

4.4. Wide Application Coverage
Workload modeling and generation techniques have been
investigated in the past for different classes of applications. For each class of application, different workload specification and generation approaches have been used. Our approach differs from the existing approaches in that we provide a generic methodology for extraction of workload characteristics from different classes of applications, capture the workload characteristics in benchmark and workload models, and provide a synthetic workload generator that accepts the workload specifications in the form of our proposed cloud workload specification language (GT-CWSL). The advantage of using two separate models to guide the synthetic workload generation is that the proposed workload generation process becomes independent of the application under study. The benchmark model captures the different request types/operations allowed in the benchmark application, the proportions of the different request types and the dependencies between the requests. The workload model captures workload attributes such as inter-session interval, think time and session length. Since the synthetic workload generator used in our methodology is generic in nature and generates workloads based on the GT-CWSL specifications, the workload generation process becomes independent of the application.

5. Proposed Methodology
Figure 1 shows an overview of our proposed approach for workload characterization, modeling and generation.

5.1. Trace Generation and Analysis
A benchmark application is instrumented to generate traces which have information regarding the user, the requests submitted by the user and the time-stamps of the requests. Typically, in benchmark applications separate threads are created for each user, where each thread creates an instance of the load driver or the load generation logic. By instrumenting the load driver of a benchmark application we can obtain the access logs. An example of a trace generated from a benchmark application is shown in Table 2. Each entry in the trace has a time-stamp, request type, request parameters and the user's IP address. For a benchmark application that uses a synthetic workload generator running on a single machine, the IP address cannot be used for identifying the users. In that case, the thread-ID (where each thread represents a separate user) is used. The trace generated from a benchmark has all the requests from all users merged into a single file. The trace analyzer identifies unique users/sessions based on the IP address or thread-ID from which the request came. The terms user and session cannot always be used interchangeably because a single user can create multiple sessions. Therefore, we use a time-threshold to identify a session: all requests that come from a single user within that threshold are considered a single session.

Figure 1. Proposed methodology for workload characterization, modeling and generation.
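To illustrate the session identification step described above, the following sketch (hypothetical Java types and method names, not taken from the GT-CWSL tooling) groups trace entries by user identifier and starts a new session whenever the gap between successive requests of the same user exceeds a time threshold.

import java.util.*;

// Minimal sketch of the session-splitting step of Section 5.1; the trace format
// and the threshold value are illustrative assumptions.
class TraceEntry {
    long timestamp;          // request time-stamp from the access log
    String userId;           // IP address or thread-ID identifying the user
    String requestType;      // e.g. "Home", "Login", "ViewEvent"
    TraceEntry(long ts, String user, String type) {
        timestamp = ts; userId = user; requestType = type;
    }
}

class SessionSplitter {
    // Assumes the trace is ordered by time-stamp. Groups requests by user and
    // starts a new session when the per-user gap exceeds the threshold.
    static Map<String, List<List<TraceEntry>>> split(List<TraceEntry> trace,
                                                     long thresholdMillis) {
        Map<String, List<List<TraceEntry>>> sessions = new HashMap<>();
        Map<String, TraceEntry> lastSeen = new HashMap<>();
        for (TraceEntry e : trace) {
            List<List<TraceEntry>> userSessions =
                sessions.computeIfAbsent(e.userId, k -> new ArrayList<>());
            TraceEntry prev = lastSeen.get(e.userId);
            if (prev == null || e.timestamp - prev.timestamp > thresholdMillis) {
                userSessions.add(new ArrayList<>());   // a new session begins
            }
            userSessions.get(userSessions.size() - 1).add(e);
            lastSeen.put(e.userId, e);
        }
        return sessions;
    }
}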
Table 2. Trace generated from a benchmark application.

1119844280621 Home ip:192.168.0.2
1119844280635 Login username:user1 password:pwd1 ip:192.168.0.2
1119844280646 AddEvent eventname:event1 date:05062010 venue:room1 description:meeting ip:192.168.0.2
1119844280648 Home ip:192.168.0.3
1119844280655 ViewEvent eventname:event1 ip:192.168.0.2
1119844280662 Login username:user2 password:pwd2 ip:192.168.0.3
1119844280675 ViewEvent eventname:event2 ip:192.168.0.3

5.2. Modeling Workloads
The trace generated from a benchmark application is analyzed by a trace analyzer. Two different models are generated from the analysis of the traces: 1) a benchmark model and 2) a workload model. The attributes of the benchmark and workload models are shown in Table 3. The characterizations for the attributes of the benchmark and workload models are obtained by the analysis of empirical traces obtained from benchmark applications.

Table 3. Attributes of benchmark and workload models.
Benchmark model: Operations, Workload mix, Inter-request dependencies, Data dependencies
Workload model: Inter-session interval, Think time, Session length

a) Session: A set of successive requests submitted by a user constitutes a session.
b) Inter-Session Interval: The inter-session interval is the time interval between successive sessions.
c) Think Time: In a session, a user submits a series of requests in succession. The time interval between two successive requests is called the think time. Think time is the inactive period between subsequent requests in a session; it is the time taken by the user to review the response of a request and decide what the next request should be.
d) Session Length: The number of requests submitted by a user in a session is called the session length.
e) Workload Mix: The workload mix defines the transitions between different pages of an application and the proportions in which the pages are visited.

5.3. Benchmark Model
The benchmark model includes attributes such as operations, workload mix, inter-request dependencies and data dependencies. The benchmark modeling approach in our proposed methodology is based on the Probabilistic Finite State Machine (PFSM) model [15]. A Probabilistic Finite State Machine (PFSM) is a non-deterministic finite state machine in which every transition has an associated output and a probability. A PFSM M is defined by a tuple M = (I, O, S, T, P), where I = {a1, ..., ap} is the finite input alphabet, O = {o1, ..., oq} is the finite output alphabet, S = {s1, ..., sn} is a finite set of states, T is the set of transitions and P is the probability of a transition. Each transition is defined as a tuple t = (s, q, a, o), where s is the current state, q is the next state, a is the input symbol and o is the output symbol. For every state s and input symbol a, the sum of the probabilities of all the transitions out of s on input a is equal to 1. A PFSM is represented as a transition graph with n nodes, where each node represents a state. Transitions are represented by directed edges between the states. A directed edge exists between two states s and q only if the associated probability of the transition P(s, q, a, o) > 0. We now describe the operation of a PFSM. Consider the initial state of the machine to be si. When an input ak is received, the machine makes a transition from si to sj with probability P(si, sj, ak, ol), and produces the output ol.
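As an illustration of the PFSM operation just described, the following sketch (hypothetical Java types, not part of GT-CWSL) stores transitions with their probabilities and selects the next transition for a given state and input by weighted random choice; the probabilities for each (state, input) pair are assumed to sum to 1, as the definition requires.

import java.util.*;

// Hypothetical PFSM sketch: transitions carry an output symbol and a probability.
class Transition {
    String from, to, input, output;
    double probability;
    Transition(String from, String to, String input, String output, double p) {
        this.from = from; this.to = to; this.input = input; this.output = output;
        this.probability = p;
    }
}

class Pfsm {
    private final List<Transition> transitions = new ArrayList<>();
    private final Random rng = new Random();

    void add(Transition t) { transitions.add(t); }

    // Pick the next transition out of 'state' on 'input' with probability
    // proportional to its weight (weights for a (state, input) pair sum to 1).
    Transition step(String state, String input) {
        double r = rng.nextDouble(), cumulative = 0.0;
        Transition last = null;
        for (Transition t : transitions) {
            if (t.from.equals(state) && t.input.equals(input)) {
                last = t;
                cumulative += t.probability;
                if (r <= cumulative) return t;
            }
        }
        if (last != null) return last;   // guard against floating-point round-off
        throw new IllegalStateException("No transition from " + state + " on " + input);
    }
}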
For modeling different benchmarks we use the PFSM as follows. Each state in the PFSM represents a web page of the benchmark. The directed edges between states represent the transitions between different pages. Each transition has an input, an output and a probability. Inputs in the PFSM represent the operations of the application or the requests submitted by the user. Each operation/request R is defined as a tuple R = (X, D), where X is the request type and D is the data associated with the request, which is represented as key-value pairs.
For representing the data associated with each request in the benchmark model, we use data substitution tags. The data substitution tags are used for generating the dynamic URLs during the synthetic workload generation. A data substitution tag can have two types of functions: 1) data generation and 2) data extraction. Consider the following URL generated from the benchmark model specification:
http://Server/App/registerUser.php?name=<generateUsername(5,10)>&password=<generatePwd()>&email=<generateEmail("@app.com")>
The functions used in the data substitution tags, such as generateUsername(5,10), are the data generation functions, which generate synthetic data. For example, generateUsername(5,10) generates a random user name of length between 5 and 10 characters. Now consider another URL shown below:
http://Server/App/viewEvent.php?eventname=<extractEventNameFromHTML()>
The function extractEventNameFromHTML() used in the data substitution tag is a data extraction function, which extracts the value for the request parameter eventname from the response of the previous HTML page.
Figure 2 shows the PFSM model for a social event calendar application. Table 4 shows the details of the transitions in the PFSM model for the social event calendar application. Outputs in the PFSM model for an application represent the return values of the operations/requests.

Figure 2. PFSM model for a social event calendar application.

Characterization of Benchmark Model Attributes: Characterization of the benchmark model attributes involves identification of the different operations/request types in a benchmark application, the proportions of the different request types (i.e., the workload mix), and the inter-request and data dependencies. Given a trace of a benchmark application as shown in Table 2, the benchmark model generator first identifies the unique pages in the application, or the request types in the trace, which are the states in the PFSM model. Then the transitions between different pages of the application and the proportions in which the pages are visited are identified. These transitions represent the inter-request dependencies. The data associated with each request is also identified, which appears in the trace in the form of request parameters. Table 4 shows an example of a characterization of the benchmark model attributes for a social event calendar application.
Benchmark Model Specification: The benchmark model specifications are formulated as an XML document that is input to the GT-CWSL code generator. Table 5 shows the specifications of the benchmark model for a social event calendar application. The benchmark model specification contains details on the various request types in the benchmark application, the request parameters and the transition probabilities for the PFSM model of the benchmark application.

5.4. Workload Model
The workload model includes attributes of the workload such as inter-session interval, think time and session length. The workload model describes the time behavior of the user requests. When multiple users submit requests to an application simultaneously, the workload model attributes such as inter-session interval, think time and session length are important for studying the performance of the application. Think time and session length capture the client-side behavior in interacting with the application, whereas the inter-session interval is a server-side aggregate that captures the behavior of a group of users interacting with the application.
Characterization of Workload Model Attributes: For characterizing the workload model attributes, it is necessary to identify independent users/sessions in the trace. The trace analyzer identifies unique users and sessions from the trace of a benchmark application. A statistical analysis of the user requests is then performed to identify the right distributions that can be used to model the workload model attributes such as inter-session interval, think time and session length. The steps involved in characterizing the workload model attributes are as follows:
1) Select Candidate Distributions: We consider four candidate distributions for the workload model attributes: (1) the exponential distribution, (2) the hyper-exponential distribution, (3) the Weibull distribution, and (4) the Pareto distribution.
We now briefly describe the reasons for considering them as candidate distributions. The exponential distribution can be used for modeling inter-session intervals. Previous studies have shown that session arrivals constitute a Poisson process, in which the arrivals are independent and uniformly distributed. The inter-arrival times of a Poisson process are exponentially distributed.
Table 4. Transitions in the PFSM model for a social event calendar application (request data shown as key-value pairs).
<username, user1>, <password, pwd>
<username, user1>, <password, pwd>, <email, user1@app.com>
<eventname, event1>, <date, 12072010>, <venue, room1>, <description, meeting>
<username, user1>, <password, pwd>
<eventname, event3>, <date, 05062010>, <venue, room3>, <description, meeting>

Hyper-exponential distributions can be used for modeling think times and inter-session intervals. The difference between the hyper-exponential and the exponential distribution is that the hyper-exponential distribution has a larger variance with respect to the mean, whereas an exponential distribution has variance equal to the mean. The Pareto distribution is a "heavy-tailed" distribution in which very large values have non-negligible probability. The Pareto distribution can be used to model session lengths, where long sessions have a non-negligible probability.
2) Parameter Estimation: Given a set of candidate distributions for the workload model attributes, the parameter estimation process identifies the parameters of the distributions that best fit the data. We use the Maximum Likelihood Estimation (MLE) method for parameter estimation. The MLE method produces the parameter values that maximize the probability of sampling the given data values. Consider a distribution defined by a parameter θ. The likelihood of observing a set of samples {x1, ..., xn} is given by
L(x1, ..., xn; θ) = ∏ i=1..n f(xi; θ),
where f(x; θ) is the density of the distribution with parameter θ. Setting dL/dθ = 0, we can find the value of the parameter that maximizes the likelihood. We use the MLE tool [16], which provides a language for building and estimating parameters of likelihood models. We use the PDF types EXPONENTIAL, HYPER2EXP, WEIBULL and PARETO supported in MLE.
3) Checking the Goodness of Fit: To verify the goodness of fit of the distributions with the estimated parameter values, we perform the statistical test devised by Kolmogorov and Smirnov (KS test). The KS test is based on calculating the maximum distance between the cumulative distribution function of the candidate distribution and the empirical distribution. We use an online tool available at [17] for performing the KS tests for the experiments. The data sets obtained from the logged and estimated distributions for the workload model attributes are the input to the online tool, which calculates the maximum distance between the CDFs of the two input data sets.
Workload Model Specification: The workload model specifications are formulated as an XML document that is input to the GT-CWSL code generator. Table 6 shows the specifications of the workload model for a social event calendar application. The workload model contains specifications of the distributions for the workload model attributes such as think time, inter-session interval and session length. The GT-CWSL code generator supports Negative-Exponential, Weibull, Hyper-Exponential and Pareto distributions for the workload model attributes.
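As an illustration of how a generator might interpret such a specification, the following sketch (a hypothetical helper, not the GT-CWSL implementation) draws think-time samples from a negative exponential distribution with the mean, minimum and maximum values of Table 6; clamping samples to [min, max] is an assumption about how the <min> and <max> fields are applied.

import java.util.Random;

// Hypothetical sampler for a workload model attribute specified as a
// Negative-Exponential distribution with mean, min and max (see Table 6).
class NegativeExponentialSampler {
    private final double mean;
    private final long min, max;
    private final Random rng = new Random();

    NegativeExponentialSampler(double mean, long min, long max) {
        this.mean = mean; this.min = min; this.max = max;
    }

    // Inverse-transform sampling: -mean * ln(1 - U) is exponential with the given mean.
    // Samples are clamped to [min, max] (an assumption about the spec fields).
    long nextMillis() {
        double sample = -mean * Math.log(1.0 - rng.nextDouble());
        return Math.max(min, Math.min(max, Math.round(sample)));
    }
}

// Example: think time with mean 4000 ms, min 100 ms, max 20000 ms, as in Table 6.
// NegativeExponentialSampler thinkTime = new NegativeExponentialSampler(4000, 100, 20000);
// long pause = thinkTime.nextMillis();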
Table 5. Benchmark model specification for a social event calendar application.

<?xml version="1.0" encoding="UTF-8"?>
<benchmark name="SocialEventCalendar">
  <requests>
    <request name="Home">
      <path>/home.html</path>
    </request>
    <request name="Login">
      <path>/login.php</path>
      <param name="username"><data>getUserName()</data></param>
      <param name="password"><data>getPassword()</data></param>
    </request>
    <request name="Register">
      <path>/register.php</path>
      <param name="username"><data>generateUsername()</data></param>
      <param name="password"><data>generatePassword()</data></param>
      <param name="email"><data>generateEmail()</data></param>
    </request>
    ...
    <request name="ViewEvent">
      <path>/viewEvent.php</path>
      <param name="eventname"><data>extractStringFromHTML("eventname")</data></param>
    </request>
  </requests>
  <workloadMix>
    <request><name>Home</name>
      <r>0</r><r>60</r><r>40</r><r>0</r><r>0</r></request>
    <request><name>Login</name>
      <r>20</r><r>0</r><r>0</r><r>40</r><r>40</r></request>
    <request><name>Register</name>
      <r>20</r><r>10</r><r>0</r><r>60</r><r>10</r></request>
    <request><name>AddEvent</name>
      <r>40</r><r>0</r><r>0</r><r>0</r><r>60</r></request>
    <request><name>ViewEvent</name>
      <r>40</r><r>0</r><r>0</r><r>60</r><r>0</r></request>
  </workloadMix>
</benchmark>

Table 6. Workload model specification for a social event calendar application.

<?xml version="1.0" encoding="UTF-8"?>
<workload name="SocialEventCalendar">
  <thinkTime>
    <distribution>NegativeExponential</distribution>
    <mean>4000</mean>
    <min>100</min>
    <max>20000</max>
    <deviation>2</deviation>
  </thinkTime>
  <interSessionInterval>
    <distribution>NegativeExponential</distribution>
    <mean>3000</mean>
    <min>100</min>
    <max>15000</max>
    <deviation>2</deviation>
  </interSessionInterval>
  <sessionLength>
    <distribution>NegativeExponential</distribution>
    <mean>10</mean>
    <min>5</min>
    <max>50</max>
    <deviation>2</deviation>
  </sessionLength>
</workload>

5.5. Performance Policies
The performance policies specify the service expectations from the benchmark. The performance requirement specifications include a series of service level objectives (SLOs) that define performance metrics such as the response time specification for each request in the application.

5.6. GT-CWSL
GT-CWSL provides specifications for the workload mix, the benchmark requests and the workload model attributes such as think time, inter-session interval and session length distributions, using Java annotations. The GT-CWSL code generator uses the Faban driver framework [18]. Faban provides a framework for developing workloads (called the Driver Framework) and a mechanism for run execution and management (called the Harness). The GT-CWSL code generator takes the benchmark model, workload model and performance policy specifications as input and generates the benchmark driver file that includes the GT-CWSL specifications as Java annotations. Table 7 shows a snippet of the generated GT-CWSL code, which forms a part of the benchmark driver logic. The benchmark driver contains the logic defining how to interact with the system under test. The requests specified in the driver are selected for execution in such a manner as to obtain the workload mix specified in the workload model. The benchmark requests, which are annotated as @Request, define the logic that is used to generate the load for the system under test. The benchmark requests contain implementations for generating the requests for the system under test and the data associated with each request. The implementations for the data generation and the data extraction functions which are specified in the data substitution tags are provided in the benchmark driver.
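A minimal sketch of what such driver-side helper functions might look like is shown below (illustrative only; the actual GT-CWSL implementations are not reproduced in the paper, and the regular-expression based extraction is an assumption about how values could be pulled from the previous response).

import java.util.Random;
import java.util.regex.*;

// Illustrative data generation and data extraction helpers of the kind referenced
// by the data substitution tags (e.g. generateUsername(5,10), extractEventNameFromHTML()).
class DataFunctions {
    private static final Random rng = new Random();

    // Data generation: random lower-case user name with length in [minLen, maxLen].
    static String generateUsername(int minLen, int maxLen) {
        int len = minLen + rng.nextInt(maxLen - minLen + 1);
        StringBuilder sb = new StringBuilder(len);
        for (int i = 0; i < len; i++) sb.append((char) ('a' + rng.nextInt(26)));
        return sb.toString();
    }

    // Data extraction: pull the value of a named parameter out of the previous
    // HTML response, e.g. from a link such as viewEvent.php?eventname=event1.
    static String extractFromHTML(String html, String paramName) {
        Matcher m = Pattern.compile(paramName + "=([A-Za-z0-9_]+)").matcher(html);
        return m.find() ? m.group(1) : null;
    }
}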
5.7. Run Configuration
In addition to the specifications for the benchmark and workload models and the performance policies, a run configuration file is required to provide the input parameters that control the benchmark run on the system under test. The run configuration contains specifications of the ramp up, steady state and ramp down times, the number of users, the output directory, etc.
5.8. Test Harness Code
To automate the running of benchmarks and to queue multiple benchmark runs we use the Faban Harness infrastructure [18]. The Faban Harness provides a web interface to launch and queue benchmark runs and to visualize the results. In order to hook the benchmark driver to the Faban Harness, a test harness code is required. Although the logic for driving the workload is specified in the benchmark driver file, a mechanism is needed to connect the driver logic to the Faban Harness. This mechanism is provided by the test harness code. The test harness code defines the process of running the benchmark and the logic for customizing and controlling the behavior of the benchmark. The test harness code includes methods for starting and stopping a run, validation of the configuration file, configuring the benchmark before a run, and any pre-processing and post-processing that may be required for a run.

5.9. Synthetic Workload Generation
Figure 3 shows the block diagram of the synthetic workload generator used in our proposed methodology. This workload generator is built using the Faban run execution and management infrastructure [18], which is an open source facility for deploying and running benchmarks. We have extended the Faban Harness to accept the GT-CWSL specifications that are generated by the GT-CWSL code generator using the benchmark and workload models. This synthetic workload generator allows generating workloads for multi-tier benchmark applications that are deployed across several nodes in a cloud.
Figure 3 shows the Faban Master agent that controls the Driver agents that run on one or more machines, and the system under test (SUT), which can have one or more machines. The different components of Faban are as follows:
1) Master: The Faban Master contains a web server that runs the Faban Harness, which provides a web interface to launch and queue benchmark runs and visualize the results. Multiple benchmark runs can be submitted to the system under test.
2) RunQueue: The Run Queue manages the benchmark runs, which are run in a first in, first out (FIFO) manner.
3) LogServer: The Log Server collects pseudo real-time logs from the systems under test.
4) Agent: The Agent is the mechanism that drives the load. Agents are deployed on both the driver systems and the systems under test. These agents control the benchmark runs and collect the system statistics and metrics which are used for performance evaluation.
5) Agent Thread: Multiple agent threads are created by an agent, where each thread simulates a single user.
6) Registry: The Registry registers all the agents with the Master so that the Master can submit the load driving tasks to the agents.
7) Driver: The Driver is a class supplied by the developer that defines the logic for workload generation, the workload characteristics, the benchmark operations and the logic for generating requests and the associated data for each of the benchmark operations.
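To make the agent-thread behavior concrete, the following sketch (hypothetical code, not Faban's actual API) shows the session loop a per-user thread might run, drawing a session length and think times from the workload model and the next request from the workload mix; the samplers and selection logic are placeholders.

import java.util.Random;

// Hypothetical per-user emulation loop: each thread simulates one user, running a
// session whose length, think times and request sequence would come from the
// workload and benchmark models.
class UserEmulationThread implements Runnable {
    private final Random rng = new Random();

    public void run() {
        try {
            int sessionLength = 5 + rng.nextInt(46);      // placeholder for the modeled session length
            String current = "Home";
            for (int i = 0; i < sessionLength; i++) {
                submitRequest(current);                   // issue the request to the system under test
                Thread.sleep(sampleThinkTimeMillis());    // pause for the modeled think time
                current = selectNextRequest(current);     // weighted choice per the workload mix
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private long sampleThinkTimeMillis() { return 100 + rng.nextInt(3900); }   // placeholder
    private String selectNextRequest(String current) { return current; }        // placeholder
    private void submitRequest(String name) { /* HTTP call to the SUT */ }
}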
name = "SocialEventCalendar" ) @ThinkTime ( distType = DistributionType.NEGEXP, distMin = 100, distMean = 300 0, distMax = 15000, distDeviation = 2 ) @InterSessionInter v al ( distType = DistributionType.NEGEXP, distMin = 100, distMean = 400 0, distMax = 20000, distDeviation = 2 ) @SessionLength ( distType = DistributionType.NEGEXP, distMin = 5, distMean = 10, distMax = 50, distDev ia tion = 2 ) @CommonPolicies( maxUsers = 100000, metric = "req/s", unit = TimeUnit.MILLISECONDS ) @WorkloadMix ( requests = {"Home", "Login", "R egi s ter", "AddEvent", "ViewEvent"}, mix = { @Ro w ({ 0, 60, 40, 0, 0}), @Row ({ 20, 0, 0, 40, 40}), @Row ({ 20, 10, 0, 60, 10}), @Row ({ 40, 0, 0, 0, 60} ), @Row ({ 40, 0, 0, 60, 0}) }, deviation = 2 ) @Req uest ( name = "Home", path = "/home.html", max9 0t h = 50 ) ... @Req uest ( name = "Login", path = "/login.php", data = "?user- name=<generateUserName()>&pas s word=<generatePassword()>", max90th = 100 )
Figure 3. Synthetic workload generation.

5.10. Master Controller
The Master Controller is responsible for controlling the benchmark runs on the system under test. The Master Controller starts and stops the benchmark runs based on the specifications in the run configuration file. In addition to controlling the runs, the Master Controller also collects the runtime metrics from the system under test. Currently, in our proposed framework, the Faban Master performs the tasks of the Master Controller.

5.11. Deployment Tools
Our proposed framework uses a number of deployment tools. For deploying the benchmark driver, a benchmark deploy image (jar file) is created from the benchmark driver file. The Faban Harness provides a utility for deploying the benchmark deploy image on the systems under test. The Faban Harness also provides a utility for deploying services such as Apache2HttpdService, MySQLService, etc. The services which are configured in the run configuration are started by the Faban Harness before the benchmark run starts and stopped after the run completes.
The Faban framework allows deployment of pluggable tools for collecting information from specific server software, for example, a tool for gathering statistics from a MySQL instance using the MySQL query interface, or a tool for looking into the Oracle database. Tools get configured before the run starts and they actually
collect information from specific server software during the steady state.
For deploying the benchmark application on the system under test we developed a deployment utility that transfers the benchmark application files to the web server. The details of the web server on which the benchmark application is deployed (such as the hostname, host port, etc.) are specified in a deployment configuration file.

6. Experiment Setup
To demonstrate the proposed workload characterization, modeling and generation approach, we created benchmark and workload models for the Rice University Bidding System (RUBiS) [5] benchmark. RUBiS is an auction site prototype which has been modeled after the Internet auction website eBay. To study the effect of the different workload attributes on the response times, we performed a series of experiments by varying the workload attributes such as think time, inter-session interval, session length and number of users. The experiments were performed on a machine with an Intel Core i5 3.2 GHz processor, 4 GB memory and 1 TB disk space. We used a PHP implementation of the RUBiS benchmark for all the experiments. The benchmark was executed on an Apache 2.2.14 web server and a MySQL 5.1.41 database server. We used the Sysstat utility for measuring the system metrics. The performance metric used for comparison of different runs is the 90th percentile of the response time. To validate that the proposed approach works for a wide range of benchmarks, we repeated the above experiments for the TPC-W benchmark [7], which models an online bookstore. We used a Java Servlets version of the TPC-W benchmark that works with a MySQL database.

7. Performance Evaluation
We instrumented the PHP implementation of the RUBiS benchmark and obtained traces of the user requests, similar to the trace shown in Table 2. From the analysis of the logged traces the benchmark and workload models were generated. We considered a subset of the request types of the RUBiS benchmark for the benchmark model. The distributions for the workload model attributes were estimated using the MLE approach described in Section 5. Table 8 shows the KS test results for goodness of fit of the estimated distributions for the workload model attributes. The implementations of the data generation and data extraction functions are provided in the benchmark driver. From the KS test results it is observed that exponential distributions best fit the logged think time and inter-session interval attributes, whereas a Weibull distribution best fits the logged session length.

Table 8. KS test results for workload model attributes. Columns: Attribute, Exponential, Weibull, Pareto.

Figure 4. Comparison of logged and estimated think time distributions for RUBiS benchmark application.

Figure 5. Comparison of logged and estimated inter-session interval distributions for RUBiS benchmark application.

Figures 4 - 6 show the comparisons of the cumulative distribution functions (CDFs) of the logged and estimated distributions for think time, inter-session interval and session length, respectively, for the RUBiS benchmark application. Figures 7 - 9 show the comparisons of the CDFs of the distributions of think time, inter-session interval and session length, respectively, of the logged and the generated synthetic workloads for the RUBiS benchmark application.
From these plots it is observed that the distributions of the workload attributes for the logged and generated synthetic workloads for the RUBiS benchmark application are very close to each other, which validates that our proposed approach for workload modeling and generation closely simulates the real workloads.
Figure 6. Comparison of logged and estimated session length distributions for RUBiS benchmark application.

Figure 7. Comparison of logged and synthetic think time distributions for RUBiS benchmark application.

Figure 8. Comparison of logged and synthetic inter-session interval distributions for RUBiS benchmark application.

Figure 9. Comparison of logged and synthetic session length distributions for RUBiS benchmark application.

We now provide the results of the sensitivity analysis. We performed a number of experiments by varying the workload attributes one at a time to measure the sensitivity of the performance of the system under test to the workload attributes. Figure 10 shows the effect of think time on the 90th percentile of the response time (R90). For this experiment we performed a run with a steady state time of 5 minutes and the same number of users, average inter-session interval, average session length and workload mix. From Figure 10 it is observed that as the think time increases, R90 decreases. The reason for this is that as the think time increases while keeping the other workload attributes fixed, the mean request arrival rate decreases. Since fewer requests are serviced per second with an increasing think time, R90 decreases. Figure 11 shows the effect of the inter-session interval on R90. We performed a run with a steady state time of 5 minutes while keeping the other workload attributes such as the number of users, think time, average session length and workload mix the same. From Figure 11 it is observed that as the inter-session interval increases, R90 decreases. This is because with an increasing inter-session interval, the mean request arrival rate decreases; thus fewer requests are serviced per second, which decreases R90. Figure 12 shows the effect of session length on R90. We performed a run with a steady state time of 5 minutes, and the same number of users, average think time, average inter-session interval and workload mix. From Figure 12 it is observed that by increasing the session length, R90 increases. This is because for larger session lengths, the number of concurrent sessions, and thus the mean request arrival rate, increases.
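The R90 metric used in these experiments is simply the 90th percentile of the observed response times; a minimal sketch of computing it from a list of samples is given below (illustrative helper, not the paper's measurement code; the nearest-rank method is an assumption).

import java.util.*;

// Hypothetical helper: 90th percentile of response times (R90), nearest-rank method.
class ResponseTimeStats {
    static double percentile(List<Double> responseTimesMillis, double p) {
        List<Double> sorted = new ArrayList<>(responseTimesMillis);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.size());   // nearest-rank position
        return sorted.get(Math.max(0, rank - 1));
    }

    public static void main(String[] args) {
        List<Double> samples = Arrays.asList(120.0, 85.0, 240.0, 95.0, 150.0,
                                             310.0, 60.0, 180.0, 90.0, 130.0);
        System.out.println("R90 = " + percentile(samples, 90.0) + " ms");
    }
}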
Figure 10. Effect of think time for RUBiS benchmark application.

Figure 11. Effect of inter-session interval for RUBiS benchmark application.

Figure 12. Effect of session length for RUBiS benchmark application.

Existing approaches such as SURGE [9] and SWAT [10] have used an offline trace generation and request generation approach, where a trace is first generated that meets the desired workload characteristics and then a request generation engine is used to submit the requests from the generated trace. The advantage of offline trace generation is that it separates the complex process of computing request parameters and workload attributes from the request submission step. In the request submission process the requests are read from the trace and submitted to the system under test. However, for performing rapid sensitivity analysis, where only one workload attribute is changed at a time while keeping the others constant, an online trace generation approach is preferred. In the online trace generation and request submission approach, the threads that emulate the users generate and submit the requests to the system under test. Our proposed approach differs from the existing approaches as it provides both offline and online trace generation capability, and can be used to perform a rapid sensitivity analysis as shown in Figures 10 - 12.
In order to validate that the proposed approach for workload characterization, modeling and generation works for a wide range of benchmarks, we repeated the above experiments for the TPC-W benchmark application. Figures 13 - 15 show the comparisons of the cumulative distribution functions (CDFs) of the logged and estimated distributions for think time, inter-session interval and session length, respectively, for the TPC-W benchmark application. Figures 16 - 18 show the comparisons of the CDFs of the distributions of think time, inter-session interval and session length, respectively, of the logged and the generated synthetic workloads for the TPC-W benchmark application. From these plots it is observed that the distributions of the workload attributes for the logged and generated synthetic workloads for the TPC-W benchmark are very close to each other, which validates that our proposed approach for workload modeling and generation closely simulates the real workloads.

Figure 13. Comparison of logged and estimated think time distributions for TPC-W benchmark application.
Figure 14. Comparison of logged and estimated inter-session interval distributions for TPC-W benchmark application.

Figure 15. Comparison of logged and estimated session length distributions for TPC-W benchmark application.

Figure 16. Comparison of logged and synthetic think time distributions for TPC-W benchmark application.

Figure 17. Comparison of logged and synthetic inter-session interval distributions for TPC-W benchmark application.

Figure 18. Comparison of logged and synthetic session length distributions for TPC-W benchmark application.

8. Conclusions & Future Work
Traditional approaches for workload modeling and generation have been application specific. There are a number of benchmarks available for complex multi-tier applications, which have their own specific workload generators. There is a lack of a standard approach for specification of the workload attributes for different application benchmarks. In this paper we proposed a methodology for characterization, modeling and generation of workloads for complex multi-tier enterprise applications that are deployed in cloud computing environments. The proposed approach automates the process of extraction of workload characteristics from different applications. We
used an analytical modeling approach to represent the behavior of applications and their workload characteristics. A methodology for the creation of benchmark and workload models was proposed that can be used for modeling different cloud application benchmarks. To specify the benchmark and workload models in a standard way that can be used for synthetic workload generation, we proposed the Georgia Tech Cloud Workload Specification Language (GT-CWSL). A GT-CWSL code generator was developed that generates the specifications that are input to a synthetic workload generator. We demonstrated the effectiveness of the proposed methodology by modeling the RUBiS auction site and TPC-W online bookstore benchmarks. Results showed that the generated synthetic workloads closely match the real workloads. With a synthetic workload generator that accepts GT-CWSL specifications it is possible to perform a sensitivity analysis of the performance of the system under test to different workload attributes. Future work will focus on adding new attributes to the benchmark and workload models, such as temporal locality, file size, request size and file popularity, and on performing studies on the effects of these attributes on the performance of different multi-tier applications. Furthermore, we will incorporate a cost model for specifying the cost of cloud computing services, and incorporate additional performance metrics such as cost per month, the maximum number of users that can be served for a fixed cost, cost per request, etc.

REFERENCES
[1] G. Abdulla, "Analysis and Modeling of World Wide Web Traffic," Ph.D. Thesis, Virginia Polytechnic Institute and State University, Blacksburg, 1998.
[2] M. Crovella and A. Bestavros, "Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes," IEEE/ACM Transactions on Networking, Vol. 5, No. 6, 1997, pp. 835-846. doi:10.1109/90.650143
[3] D. Mosberger and T. Jin, "Httperf: A Tool for Measuring Web Server Performance," ACM Performance Evaluation Review, Vol. 26, No. 3, 1998, pp. 31-37. doi:10.1145/306225.306235
[4] D. Garcia and J. Garcia, "TPC-W E-Commerce Benchmark Evaluation," IEEE Computer, Vol. 36, No. 2, 2003, pp. 42-48.
[5] RUBiS, 2010. http://rubis.ow2.org
[6] SPECweb99, 2010. http://www.spec.org/osg/web99
[7] TPC-W, 2010. http://jmob.ow2.org/tpcw.html
[8] WebBench, 2010. http://www.zdnet.com/zdbop/webbench/webbench.html
[9] P. Barford and M. E. Crovella, "Generating Representative Web Workloads for Network and Server Performance Evaluation," Proceedings of the 1998 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, Madison, 22-26 June 1998, pp. 151-160.
[10] D. Krishnamurthy, J. Rolia and S. Majumdar, "A Synthetic Workload Generation Technique for Stress Testing Session-Based Systems," IEEE Transactions on Software Engineering, Vol. 32, No. 11, 2006, pp. 868-882.
[11] A. Mahanti, C. Williamson and D. Eager, "Traffic Analysis of a Web Proxy Caching Hierarchy," IEEE Network, Vol. 14, No. 3, 2000, pp. 16-23. doi:10.1109/65.844496
[12] S. Manley, M. Seltzer and M. Courage, "A Self-Scaling and Self-Configuring Benchmark for Web Servers," Proceedings of the ACM SIGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems, Madison, June 1998, pp. 270-271.
[13] Webjamma, 2010. http://www.cs.vt.edu/chitra/webjamma.html
[14] K. Kant, V. Tewari and R. Iyer, "Geist: A Generator for E-Commerce & Internet Server Traffic," IEEE International Symposium on Performance Analysis of Systems and Software, Tucson, 4-5 November 2001, pp. 49-56.
[15] E. Vidal, F. Thollard, C. Higuera, F. Casacuberta and R. C. Carrasco, "Probabilistic Finite-State Machines Part I," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27, No. 7, 2005, pp. 1013-1025.
[16] MLE Tool, 2010. http://faculty.washington.edu/djholman/mle/index.html
[17] Kolmogorov-Smirnov Test, 2010. http://www.physics.csbsju.edu/stats/KStest.html
[18] Faban, 2010. http://faban.sunsource.net