Modelling of Cloud Computing Centers Using
M/G/m Queues
Hamzeh Khazaei
Computer Science Department
University of Manitoba
Winnipeg, Manitoba, Canada
Email: hamzehk@cs.umanitoba.ca
Jelena Misˇic´
Computer Science Department
Ryerson University
Toronto, Ontario, Canada
Email: jmisic@scs.ryerson.ca
Vojislav B. Misˇic´
Computer Science Department
Ryerson University
Toronto, Ontario, Canada
Email: vmisic@scs.ryerson.ca
Abstract—Cloud computing is a new computing paradigm,
whereby shared resources such as infrastructure, hardware plat-
form, and software applications are provided to users on-demand
over the internet (Cloud) as services. Successful provision of
infrastructure-as-a-service (IaaS) and, consequently, widespread
adoption of cloud computing necessitates accurate performance
evaluation that allows service providers to dimension their
resources in order to fulfil the service level agreements with their
customers. In this paper, we describe an analytical model for
performance evaluation of cloud server farms, and demonstrate
the manner in which important performance indicators such as
request response time and number of tasks in the system may
be assessed with sufficient accuracy.
I. INTRODUCTION
Cloud computing is a general term for system architectures
that involves delivering hosted services over the Internet [1].
These services are broadly divided into three categories:
Infrastructure-as-a-Service (IaaS), which includes equipment
such as hardware, storage, servers, and networking com-
ponents are made accessible over the Internet); Platform-
as-a-Service (PaaS), which includes computing platforms—
hardware with operating systems, virtualized servers, and
the like; and Software-as-a-Service (SaaS), which includes
software applications and other hosted services [2]. A cloud
service differs from traditional hosting in three principal as-
pects. First, it is provided on demand, typically by the minute
or the hour; second, it is elastic since the user can have as
much or as little of a service as they want at any given time;
and third, the service is fully managed by the provider.
Due to dynamic nature of cloud environments, diversity
of user’s requests and time dependency of load, providing
expected quality of service while avoiding over-provisioning is
not a simple task [3]. To ensure that the QoS perceived by end
clients is acceptable, the providers must exploit techniques and
mechanisms that guarantee a minimum level of QoS. Although
QoS has multiple aspects such as response time, throughput,
availability, reliability, and security, the primary aspect of QoS
considered in this work is related to response time [4].
In this paper we describe an analytical model for evaluating
the performance of cloud server farms and verify its accuracy
with numerical calculations and simulations. we assume that
any request goes through a facility node and then leaves
the center. A facility node may contain different computing
resources such as web servers, database servers, and others.
We consider the time a request spends in one of those facility
node as the service time.
We model the cloud environment as an M/G/m queuing
system which indicates that inter-arrival time of requests is ex-
ponentially distributed, the service time is generally distributed
and the number of facility nodes is m, without any restrictions
on the number of facility nodes. These two characteristics,
general service time and large number of nodes, have not been
adequately addressed in previous research.
The remainder of the paper is organized as follows: Sec-
tion II gives a brief overview of related work on cloud
performance evaluation and on performance characterization
of M/G/m queueing systems. We introduce our analytical
model in Section III, and derive the analytical solutions for
the desired performance metrics in section IV. The perfor-
mance results are presented in Section V, using discrete event
simulation to validate them. Discussion of our approach and
outlook to future research activities complete the paper.
II. RELATED WORK
As mentioned above, most of the research related to
cloud computing has dealt with implementation issues, while
performance-related issues have received much less attention.
In [5], the authors studied the response time in terms of various
metrics, such as the overhead of acquiring and realizing
the virtual computing resources, and other virtualization and
network communication overhead. To address these issues,
they have designed and implemented C-Meter, a portable,
extensible, and easy-to-use framework for generating and
submitting test workloads to computing clouds.
In [6], the cloud center was modelled as an M/M/m/N
queuing system, which has been used to compute the distribu-
tion of response time. Inter-arrival and service times were both
assumed to be exponentially distributed, and the system had a
finite buffer of size N . The response time was broken down
into waiting, service, and execution periods, assuming that all
three periods are independent which is unrealistic, based on
their own argument.
In [3], the authors consider a cloud center which is modelled
as the classic open network; they obtained the distribution of
response time based on assumption that inter-arrival time and
2011 31st International Conference on Distributed Computing Systems Workshops
1545-0678/11 $26.00 © 2011 IEEE
DOI 10.1109/ICDCSW.2011.13
87
2011 31st International Conference on Distributed Computing Systems Workshops
1545-0678/11 $26.00 © 2011 IEEE
DOI 10.1109/ICDCSW.2011.13
87
本页已使用福昕阅读器进行编辑。
福昕软件(C)2005-2008,版权所有,
仅供试用。ഀ
service time are both exponential. Using the distribution of
response time, they found the relationship among the maximal
number of tasks, the minimal service resources and the highest
level of services.
Theoretical analyses have mostly relied on extensive re-
search in performance evaluation of M/G/m queuing sys-
tems, as outlined in [7], [8], [9]. As solutions for distribution
of response time and queue length in M/G/m systems can’t
be obtained in closed form, suitable approximations were
sought. However, most of these provide reasonably accurate
estimates of mean response time only when number of servers
is comparatively small, (say, less than twenty or so), as well as
small coefficient of variation of service time, CV, (less than
unity), but fail for large number of servers and higher CV.
Approximation errors are particularly pronounced when the
offered load ρ is small, and/or when both the number of servers
m and the CV of the service time, are large [10], [11], [12]. As
a result, these results are not directly applicable to performance
analysis of cloud computing server farms where the number
of servers is huge and service request arrival distribution is
not generally known.
III. THE ANALYTICAL MODEL
We model a cloud server farm as a M/G/m queuing
system which indicates that the inter-arrival time of requests is
exponentially distributed, the service times of tasks’ requests
are independent and identically distributed random variables
with a general distribution whose service rate is µ; we assume
both µ and CV , the coefficient of variation defined as standard
deviation divided by the mean, are finite.
A M/G/m queuing system may be considered as a Markov
process which can be analysed by applying the embedded
Markov chain technique. Embedded Markov Chain technique
requires selection of Markov points in which the state of
the system is observed. Therefore we monitor the number
of the tasks in the system (both in service and queued) at
the moments immediately before the task request arrival. If
we consider the system at Markov points and number these
instances 0, 1, 2, . . . , then we get a Markov chain [13]. Here,
the system under consideration contains m servers, which
render service in order of task request arrivals (FCFS).
Task requests arrival process is Poisson. Task request inter-
arrival time A is exponentially distributed with rate of 1λ . We
denote its Cumulative Distribution Function (CDF) as A(x) =
Prob[A < x] and its probability density function (pdf) as
a(x) = λe−λx. Laplace Stieltjes Transform (LST) of inter-
arrival time is A∗(s) =
∫∞
0
e−sxa(x)dx = λλ+s .
Task service times are identically and independently dis-
tributed according to a general distribution B, with a mean
service time equal to b = 1µ . The CDF of the service time is
B(x) = Prob [B < x], and its pdf is b(x). The LST of service
time is B∗(s) =
∫∞
0
e−sxb(x)dx.
Residual task service time is the time interval from an
arbitrary point (an arrival point in a Poisson process) during
a service time to the end of the service time; we denote it as
B+. Elapsed task service time is the time interval from the
Fig. 1. Embedded Markov points.
beginning of a service time to an arbitrary point of the service
time; we denote it as B−. It can be shown that probability
distribution of residual and elapsed task service times has
the same probability distribution and LST of them can be
calculated as [14]
B∗+(s) = B
∗
−(s) =
1−B∗(s)
sb
(1)
The offered load may be defined as
ρ , λ
mµ
(2)
For practical reasons, we assume that the system never enters
saturation, which means that any request submitted to the
center will get access to the required facility node after a
finite queuing time. Furthermore, we also assume each task
is serviced by a single server (i.e., there are no batch arrivals),
and we do not distinguish between installation (setup), actual
task execution, and finalization components of the service
time; these assumptions will be relaxed in our future work.
A. The Markov chain
We are looking at the system at the moments of task request
arrivals – these points are selected as Markov points. A given
Markov chain has a steady state solution if it is ergodic. Based
on conditions for ergodicity [13] and the above-mentioned
assumptions, it is easy to prove that our Markov Chain is
ergodic. Then, using the steady state solution, we can extract
the distribution of number of tasks in the system as well as the
mean response time. Let An and An+1 indicate the moment
of nth and (n+1)th arrivals to the system, respectively, while
qn and qn+1 indicate the number of tasks found in the system
immediately before these arrivals; this is schematically shown
in Fig. 1. If vn+1 indicates the number of tasks which depart
from the system between An and An+1, the following holds:
qn+1 = qn − vn+1 + 1 (3)
We need to calculate the transition probabilities associated
with this Markov chain, defined as
pij , Prob [qn+1 = j|qn = i] (4)
i.e., the probability that i+ 1− j customers are served during
the interval between two successive task request arrivals.
Obviously for j > i+ 1
pij = 0 (5)
8888
Administrator
附注
到达时刻为嵌入点
本页已使用福昕阅读器进行编辑。
福昕软件(C)2005-2008,版权所有,
仅供试用。ഀ
Fig. 2. State-transition-probability diagram for the M/G/m embedded
Markov chain.
Fig. 3. System behaviour in between two arrivals.
since there are at most i+ 1 tasks present between the arrival
of An and An+1. The Markov state-transition-probability
diagram as in Fig. 2, where states are numbered according to
the number of tasks currently in the system (i.e those in service
and those awaiting service). For clarity, some transitions are
not fully drown, esp. those originating from states above m.
We have also highlighted the state m because the transition
probabilities are different for states on the above and right
hand side of this state.
B. Departure Probabilities
Due to ergodicity of the Markov chain, an equilibrium
probability distribution will exist for the number of tasks
present at the arrival instants; so we define
pik = lim
n→+∞Prob [qn = k] (6)
From [14], the direct method of solution for this equilibrium
distribution requires that we solve the following system of
linear equations:
pi = piP (7)
where pi = [pi0, pi1, pi2, . . .], and P is the matrix whose
elements are one-step transition probabilities pij . To find the
elements of the transition probability matrix, we need to count
the number of tasks departing from the system in between two
successive arrivals. Consider the behaviour of the system, as
shown in Fig. 3. Each server has zero or more departures
during the time between two successive task request arrivals
(the inter-arrival time). Let us focus on an arbitrary server,
which (without loss of generality) could be the server number
1. For a task to finish and depart from the system during the
inter-arrival time, its remaining duration (residual service time
defined in (1)) must be shorter than the task inter-arrival time.
This probability will be denoted as
Px = Prob [A > B+] =
∫ ∞
x=0
P{A > B+|B+ = x }dB+(x)
=
∫ ∞
x=0
(∫ ∞
y=x
λe−λydy
)
dB+(x)
=
∫ ∞
0
e−λxdB+(x) = B∗+(λ)
(8)
In the case when arriving task can be accommodated
immediately by an idle server ( and therefore queue length
is zero) we have to evaluate the probability that such task will
depart before next task arrival. We will denote this probability
as Py and calculate it as:
Py = Prob [A > B] =
∫ ∞
x=0
P{A > B|B = x }dB(x)
=
∫ ∞
x=0
(∫ ∞
y=x
λe−λydy
)
dB(x)
=
∫ ∞
0
e−λxdB(x) = B∗(λ)
(9)
However, if queue is non-empty upon task arrival, the
following may happen. If between two successive task arrivals
a completed task departs from a server, that server will take
a new task from the non-empty queue. That task may be
completed as well before the next task arrival and if the
queue is still non-empty new task may be executed, and so on
until either queue gets empty or new task arrives. Therefore
probability of k > 0 job departures from a single server, given
that there are enough jobs in the queue can be derived from
expressions (8) and (9) as:
Pz,k = B
∗
+(λ)(B
∗(λ))k−1 (10)
Note that Pz,1 = Px. Using these values we are able to
compute the transition probabilities matrix.
C. Transition Matrix
Based on our Markov chain, we may identify four different
regions of operation for which different conditions hold; these
regions are schematically shown in Fig. 4, where the numbers
on horizontal and vertical axes correspond to the number of
tasks in the system immediately before a task request arrival
(i) and immediately upon the next task request arrival (j),
respectively.
• Regarding the region labelled 1, we already know from
Eq. 5 that pij = 0 for i+ 1 < j.
• In region 2, no tasks are waiting in the queue, hence
i < m and j ≤ m. In between the two successive request
8989
Fig. 4. Range of validity for pij equations.
arrivals, i+1−j tasks will complete their service. For all
transitions located on the above side of state m in Fig. 2,
the probability of having i+ 1− j departures is
pij =
(
i
i− j
)
P i−jx (1− Px)jPy
+
(
i
i+ 1− j
)
P i+1−jx (1− Px)j−1(1− Py),
for i < m, j ≤ m
(11)
• Region 3 corresponds to the case where all servers are
busy throughout the inter-arrival time, i.e., i, j ≥ m.
Let us denote the number of jobs which depart from the
system between these two Markov points as w = i+1−j.
In this case all transitions remain to the right of state m
in Fig. 2, and the state transition probabilities can be
computed as
pij =
minw,m∑
s1=min(w,1)
(
m
s1
)
P s1x (1− Px)m−s1 ·
minw−s1,s1∑
s2=min(w−s1,1)
(
s1
s2
)
P s2z,2(1− Pz,2)s1−s2 ·(
s2
w − s1 − s2
)
Pw−s1−s2z,3 (1− Pz,3)s2 (12)
Note that under moderate load it is not likely to have more
than a couple of task departures from a single server.
• Finally, region 4, in which i ≥ m and j < m, describes
the situation where the first arrival (An) finds all servers
busy and a total of i − m tasks waiting in the queue,
which it joins; while at the time of the next arrival (An+1)
there are exactly j tasks in the system, all of which are
in service. The transition probabilities for this region are
pij =
minw,m∑
s1=(m−j)
(
m
s1
)
P s1x (1− Px)m−s1 ·
minw−s1,s1∑
s2=min(w−s1,m−j)
(
s1
s2
)
P s2z,2(1− Pz,2)s1−s2 ·(
s2
w − s1 − s2
)
Pw−s1−s2z,3 (1− Pz,3)s2 (13)
IV. EQUILIBRIUM BALANCE EQUATIONS
After finding matrix P we can establish the balance equa-
tions. The balance equations balance the leaving and entering
a state in equilibrium. This leads to the equations
pii
∑
j 6=i
pij =
∑
j 6=i
pijpji, i, j = 0, 1, 2, . . . (14)
The steady state balance equations outlined above can’t be
solved in closed form, hence we must resort to a numerical
solution. Since the number of balance equations is infinite,
it must be truncated for numerical solution; we have set the
number of equations to twice the number of servers, which
allows us to achieve satisfactory accuracy (as will be explained
below). Thus, for our system the balance equations are
pii =
2m∑
j=0
pijpji , 0 ≤ i ≤ 2m (15)
and augmented by the normalization equation
i=2m∑
i=0
pii = 1. (16)
To obtain the steady state probabilities pi = [pi0, pi1, pi2, ...],
we solved the resulting system of equations using Maple 13
from Maplesoft, Inc. [15].
A. Distribution of the Number of Tasks in the System
Using the steady state probabilities we are able to establish
the probability generating functions (PGFs) for the number of
tasks in the system at an arrival time:
Π(z) =
2m∑
k=0
pizz
k (17)
For any Poisson arrivals system, PASTA property hold [14];
thus, the PGF Π(z) for the distribution of number of tasks in
system at an arrival time is identical to the PGF P (z) for the
distribution of number of tasks in system at an arbitrary time:
P (z) = Π(z) (18)
The mean number of tasks in the system is, then, obtained as
p = P
′
(1) (19)
and using Little’s law, the mean response time is obtained as
t =
p
λ
(20)
9090
B. Distribution of Waiting and Response Time
Let W denote the waiting time in the steady state, and
similarly let W (x) and W ∗(s) be the CDF, of W and its
LST, respectively. For the M/G/m systems, authors in [11]
showed that the queue length, Q, has the same distribution as
the number of tasks which arrive during the waiting time, W ,
Q(z) = W ∗(λ(1− z)) (21)
The left hand side of ( 21) in our system can be calculated as:
Q(z) =
k=m−1∑
k=0
pik +
k=2m∑
k=m
pikz
k−m (22)
Hence, we have
W ∗(s) = Q(z)|z=1−(s/λ) = Q(1− s/λ) (23)
Moreover, the LST of response time is
T ∗(s) = W ∗(s) B∗(s) (24)
in which the B∗(s) is the LST of service time. The i th
moment t(i) of the response time distribution is given by
t(i) =
∫ ∞
0
xidT (x) = i
∫ ∞
0
xi−1[1− T (x)]dx
= (−1)iT ∗(i)(0) i = 2, 3, 4, . . . (25)
V. NUMERICAL VALIDATION
We have assumed that the service time of tasks follows
the gamma distribution; however, our model may accom-
modate other distributions without any changes. Then, we
have performed two experiments with variable task request
arrival rate and coefficient of variation CV (which can be
adjusted independently in the gamma distribution). To validate
the analytical solutions we have also built a discrete even
simulator of the cloud server farm using object-oriented Petri
net-based simulation engine Artifex by RSoftDesign, Inc. [16].
The diagrams in Fig. 5 show analytical and simulation results
(shown as symbols and lines, respectively) for mean number
of tasks in the system as functions of the offered load ρ,
under different number of servers. Two different values of
the coefficient of variation, CV = 0.5 and 1.4, were used;
the corresponding results are shown in Figs. 5(a) and 5(b).
As can be seen, the results obtained by solving the analytical
model agree very well with those obtained by simulation.
The diagrams in Fig. 6(a) and Fig. 6(b) show the mean
response time, again for the same range of input variables
and for the same values of the coefficient of variation. As
above, solid lines correspond to the results of simulation,
while different symbols correspond to different number of
servers for analytical model results. As could be expected,
the response time is fairly steady up to the offered load of
around ρ = 0.8, when it begins to increase rapidly. However,
the agreement b
本文档为【Modelling of Cloud Computing Centers Using】,请使用软件OFFICE或WPS软件打开。作品中的文字与图均可以修改和编辑,
图片更改请在作品中右键图片并更换,文字修改请直接点击文字进行修改,也可以新增和删除文档中的内容。
该文档来自用户分享,如有侵权行为请发邮件ishare@vip.sina.com联系网站客服,我们会及时删除。
[版权声明] 本站所有资料为用户分享产生,若发现您的权利被侵害,请联系客服邮件isharekefu@iask.cn,我们尽快处理。
本作品所展示的图片、画像、字体、音乐的版权可能需版权方额外授权,请谨慎使用。
网站提供的党政主题相关内容(国旗、国徽、党徽..)目的在于配合国家政策宣传,仅限个人学习分享使用,禁止用于任何广告和商用目的。