Ant_Algorithm_for_Grid_Scheduling_Powered_by_local_search

Ant_Algorithm_for_Grid_Scheduling_Powered_by_local_search Int. J. Open Problems Compt. Math., Vol. 1, No. 3, December 2008 Ant Algorithm for Grid Scheduling Powered by Local Search Kousalya.K and Balasubramanie.P Department of Computer Science and Engineering , Kongu Engineering College, Tamilnadu,I...

Int. J. Open Problems Compt. Math., Vol. 1, No. 3, December 2008 Ant Algorithm for Grid Scheduling Powered by Local Search Kousalya.K and Balasubramanie.P Department of Computer Science and Engineering , Kongu Engineering College, Tamilnadu,India, E-mail: keerthi.kous@gmail.com, pbalu_20032001@yahoo.co.in Abstract Grid computing is a form of distributed computing that coordinates and shows computing power, applications, data storage and network resources across dynamic and geographically dispersed organizations. Resource management and application scheduling are the two major problems in grid computing. The resources are heterogeneous in terms of architecture, power, configuration and availability. This complicates the task scheduling problem. The major objective of grid scheduling is to reduce the makespan. Hence the scheduling must consider some specific characteristics of the job and decide the metrics to be used accordingly. Ant algorithm, which is one of the heuristic algorithm suits well for the grid scheduling environment. This paper proposes an modified ant algorithm for Grid scheduling problem that is combined with local search. The proposed ant algorithm takes into consideration the free time of the resources and the execution time of the jobs to achieve better resource utilization and better scheduling. In the evaluation study, a number of intensive experiments are conducted using the standard bench mark problem. The result shows that the proposed ant algorithm is capable of producing high quality scheduling of jobs to grid resources. Thus the algorithm can be used to design efficient dynamic schedulers for real time grid environments. Keywords: computational grid, scheduling algorithm, Heuristic approach, Ant algorithm, simulation 1 Introduction The computational scientist has started to adapt to the grid computing techniques for the past seven years. This has increased the computing power and capability of inter-organization, national and international grid computing infrastructures such 223 Ant Algorithm for Grid Scheduling Powered by Local Search as the e-minerals grid in the U.K, the tera grid in the U.S and the Guhas EGEE [1]. The main aim of grid computing is to integrate clusters into global infrastructures. The grid users need not be aware of the computational resources that are used for executing their jobs and storing their data [1]. Grid computing is emerging as the next generation of parallely distributed computing platform for solving large scale computational and data intensive problems in science, engineering and commerce [4]. It enables the sharing, selection and aggregation of a wide variety of geographically distributed resources including supercomputers, databases, data sources and specialized devices owned by different organizations. Authentication and authorization, secure and reliable file transfer, distributed storage management and resource scheduling across organizational boundaries are the list of problems need to be solved in grid computing area. The Grid users need not be aware of the computational resources that are used for executing their applications and storing their data. The resource allocation to a large number of jobs is hard and much more difficult than the LAN computational environments [2]. The load balancing of the available resources in the computational grid is another important factor. However, different applications have different characteristics, and their demands on the resources may differ greatly. Efficient and application adaptive resource management and scheduling are technical challenges in the Grid. Grid computing is now being used in many applications that are beyond distribution and sharing of resources. The distributed resources are useful only if the Grid resources are scheduled. Using optimal schedulers results in high performance grid computing, where as poor schedulers produce contrast results. Now, the grid scheduling is a big topic in grid environment for new algorithm model. The scheduling in Grid environment has to satisfy a number of constraints on different problems. The existing approaches for scheduling in grid applications uses queuing systems or adhoc schedulers that use specific knowledge of the underlying grid infrastructure to achieve an efficient resource allocation. However, there approaches cannot deal with the complexity of the problem due to dynamic nature of the grid. In fact, job scheduling in computational grids is multi objective in its general formulation and therefore optimization approaches that could tackle many conflicting objectives are imperative. A grid scheduler, often called resource broker, acts as an interface between the user and distributed resources. It hides the complexities of the computational grid from the user. The scheduler does not have full control over the grid and it cannot assume that it has a global view of the grid. The single most challenging issue of the grid scheduler encounters is the dynamicity of resources. Although a resource may be participating in a grid, its main purpose is used by local users of the organization that it belongs to. Therefore, the load on the resource imposes a great strain on grid scheduling. Kousalya.K and Balasubramanie.P 224 The grid scheduling consists of three stages [5]. Resource discovery and filtering in the first phase, the resource selection and scheduling according to the certain objective is the second phase and the job submission is the third phase. The third stage includes the file staging and cleanup. High performance computing and high throughput computing are the two different goals of grid scheduling algorithm. The main aim of the high performance computing is to minimize the execution time of the application. To increase the processing capacity of systems over a long period of time is the aim of the high throughput computing. The proposed approach is the high throughput computing. Grid scheduling is a NP-Complete problem. Heuristic optimization techniques are the best approach to solve NP-complete problem. The four basic heuristic methods for grid scheduling are namely Genetic Algorithm (GA) [7], Simulated Annealing (SA) [10], Ant Colony Optimization (ACO) and Tabu search (TS) [6]. The main focus of this paper is to develop a high throughput scheduling algorithm based on ACO. This technique does the repeat sampling experiments on the model of the system. It uses the stochastic component in the state sampling and/or transition rules. The statistical knowledge about the problem and the estimate of the variables are updated using the above experimental results. The variances in the estimation of the described variables are reduced using the above knowledge repeatedly. In ACO algorithms, the ants try to build the feasible solution to apply the stochastic decision policy repeatedly. This paper implements a modified ant algorithm which results in minimum makespan and maximum resource utilization for the grid scheduling problems. The paper is further organized as follows. In section 2, the related works are discussed. Section 3 contains the problem description. Section 4 contains the result of the proposed method and the advantages of the proposed method, while Section 5 concludes with future direction. This document can be used as a template for Microsoft Word versions 6.0 or later. You may open this document then type over sections of the document or cut and paste to other document and then use adequate styles. The style will adjust your fonts and line spacing. Please set the template for A4 paper (21 x 29.7 cm). For emphasizing please use italics and do not use underline or bold. Please do not change the font sizes or line spacing to squeeze more text into a limited number of pages. 2 Literature Review This section reviews a set of heuristic algorithms which has been designed to schedule the meta-tasks in the computational grids. The collection of independent tasks with no data dependencies is called as meta-task. Meta-tasks are mapped on to the available machines statically; each machine in the computational grid executes a single task at a time. For this mapping, it is assumed that the number of machines, ‘m’ and the number of tasks‘t’, are known a priori. A large number of 225 Ant Algorithm for Grid Scheduling Powered by Local Search heuristic algorithms have been designed to schedule tasks to machines on grid computing systems. The eleven commonly used algorithms are listed as follows. Opportunistic Load Balancing’ (OLB) is one of the easiest techniques. For each job, it finds out the next available machine and simply schedules that job to that machine. The machine selection is in an arbitrary manner. In this algorithm, the expected execution time is not taken into the account, so it produces the poor result [5]. But it uses the resources in the balanced way. The next simplest schedule is User Directed Assignment (UDA). It schedules each job on the resource, which resource will have the best expected execution time for that task. The load will not be balanced across all the available resources. If all jobs will be best expected execution time in one resource than the other resources (consistent) then all the jobs are allocated to that machine only [5]. The next effective approach is to assign each job, in arbitrary order, to the resource on which it is expected to finish earliest. The algorithm calculates the current job’s completion time against the list of available machines. This approach is Fast Greedy method. The benefits of OLB and UDA are combined and form the approach Fast greedy [5]. One of the best and simple heuristics method is called Min-min. In this method, compute the minimum completion time of each task with respect to all machines. The task with the overall minimum completion time is selected and assigned to the corresponding node. The currently assigned job is removed from the unscheduled task list and the above process is repeated until all the tasks are scheduled. Here, all jobs have a good chance to select a suitable resource. So this method automatically minimizes the makespan and balances the load to an extent. It is more complex than the UDA. But it produces better solution when compared to UDA [6]. Another heuristics approach is Max-min. Like Min-min, Max-min also calculates the minimum completion time of each job. It selects a job with the overall maximum of minimum completion time [6]. The Genetic Algorithm (GA) is one of the best methods to search the large solution space. This method operates on a population of chromosomes for a given problem. First it generates the initial population randomly. The initial population may be generated by any other heuristic algorithm; if the population is generated by Min-Min then it is called “seeding” the population with Min-Min [11]. Simulation Annealing (SA) is an iterative technique. It finds out only one possible solution at a time for each Meta task. This method probabilistically allows solution to obtain a better search of the solution space based on a system temperature [10]. The Genetic simulated annealing (GSA) is a combination of genetic algorithm and simulated annealing techniques [12]. Tabu search is also a solution space search. It keeps track of the regions which have already been searched. The new search need not repeat a search near this area[6]. A* is a tree search. Initially, the root node has null solutions.. When the tree grows, the intermediate nodes have a partial solution and the leaf nodes have final Kousalya.K and Balasubramanie.P 226 solutions. Each and every node has its own cost function. The node with minimum cost function is replaced by its children. When a new node is added into the tree, the tree is pruned by deleting the node with the highest cost function. The above process is repeated until a leaf node is reached [13]. 2.1 Comparison of above stated algorithms The algorithms, OLB, UDA, Max-min, SA, GAS and Tabu, do not produce good results [6]. The remaining algorithm Min-min and A* produce good results of makespan. The makespan of the above mentioned algorithm difference is within 10%. GA is little better than Min-min. The A* produces best and worst results as compared with GA and Min-min in a different situations. Min-min algorithm is faster than GA and A*. Min-min also has some pitfalls. In the Min-min method, too many jobs are assigned to a single grid node and this will lead to system overloading and the response time of the job is not assured. This is the main disadvantage of Min-min method. Load balancing is not considered in the OLB method. In this paper [8], the scheduler is available in all the resources. The scheduler looks for the best job to be executed in the resource. Here the jobs travel from one location to another to identify the correct resource which it requires. So the traffic in the grid system will be automatically increased. In this paper [9] communication cost is considered to be an important factor. The grid scheduling problem is a complex one. So, lots of researchers do their research in this area. The main aim of the researches is to find out the optimal solution and to improve the overall system performance. Min-Min, Max-min, fast greedy, tabu search and ant system are some of the heuristic algorithms which create a static environment. Here, they must know the execution time and the workload in advance. In this paper [9], grid simulation architecture using ACO is proposed. The response time and average utilization of resources are used as the evaluation index. In the paper [14], they could improve the performance like job finishing ratio using the ACO algorithm. 2.2 Ant Algorithm Dorigo M. introduced the Ant algorithm in 1996, which is new heuristics, predictive scheduling algorithm. It is based on the real ants. When an ant looks for food, ant deposits some amount of pheromone on the path, thus making, it is followed by a trail of this substance. If an ant tries to move from one place to another then it encounters a previously laid trail. The ant can detect the pheromone trail and decide with high probability to follow it. This ant also reinforces the trail with its own pheromone. When more ants are following the trail, then the pheromone on shorter path will be increased quickly. The quantity of pheromone on every path will affect the possibility of other ants to select path. At last all the ants will choose the shortest path. In paper [9], the experiment 227 Ant Algorithm for Grid Scheduling Powered by Local Search results show that ant algorithm has produced an optimum solution. The ACO algorithm has been used to solve many NP problems, such as TSP, assignment problem, job-shop scheduling and graph coloring successfully. So the ant algorithm is suitable to be used in Grid computing task scheduling. In the grid environment, the algorithm can carry out a new task scheduling by experience, depending on the result in the previous task scheduling. In the grid computing environment, this type of scheduling is very much helpful. So ant algorithm for task scheduling in Grid Computing, is proposed in this paper. 2.3 Local Search The heuristic algorithms can often be improved by combining the local search techniques to take the solution to its local optimum in the search space [3]. The local search technique is to define the neighborhood of a solution. In general a solution will have one or more ‘problem’ resources (those with schedule lengths equal to the makespan of the whole solution). Try to reduce the ‘problem’ resource makespan as this will immediately reduce the overall makespan of the solution. The neighborhood is a solution of single transfer of a job from the problem resource to any other resources. The local search technique analysis, the neighborhood and the transfer which reduces the maximum schedule length of the two resources is involved the most. The above process is repeated until no further improvement is possible 3 Problem Description In this study, the grid is composed of number of hosts. Each host has several computational resources. The resources may be homogeneous or heterogeneous. The grid scheduler finds out the better resource of a particular job and submits that job to the selected host. The grid scheduler does not have control over the resources and also on the submitted jobs. Any machine in grid can execute any job, but the execution time differs. The resources are dynamic in nature. As compared with the expected execution time, the actual time may be varied at the time of running the jobs to the allocated resource. The grid scheduler’s aim is to allocate the jobs to the available nodes. The best match must be found from the list of available jobs to the list of available resources. The selection is based on the prediction of the computing power of the resource. So, lots of problems are needed to be solved in this area. The grid scheduler must allocate the jobs to the resources efficiently. The efficiency depends upon two criteria; one is makespan and the other is flow time. These two criteria are very much important in the grid system. The makespan measures the throughput of the system and flow time measures its QoS. The following assumptions are made before discussing the algorithm. The collection of independent tasks with no data dependencies is called as a meta-task. Each Kousalya.K and Balasubramanie.P 228 machine executes a single task at a time. The meta-task size is one and the numbers of machines are ‘m’. The ant based algorithm is evaluated using the simulated execution times for a grid environment. Before starting the grid scheduling, the expected execution time for each task on each machine must be estimated and represented by an ET matrix. Each row of ET matrix consists of the estimated execution time for a job on each resource and every column of the ET matrix is the estimated execution time for a particular resource of list of all jobs in the job pool. ETij is the expected execution time of task ti and the machine mj. The time to move the executables and data associates with the task ti includes the expected execution matrix ETij. For this algorithm, it is assumed that there are no inter-task communications, each task can execute on each machine, and the estimated expected execution times of each task on each machine is known. The ET matrix will have N x M entries, where N is the number of independent jobs to be scheduled and M is the number of resources which is currently available. Each job’s workload is measured by million of instructions and the capacity of each resource is measured by MIPS. Definition 3.1 The Ready time (Ready m) indicates the time resource ‘m’ would have finished the previously assigned jobs. The completion time of ith job on the jth machine is Max(CTij) is the makespan of the complete schedule. Makespan is used to measure the throughput of the grid system. In general the existing heuristic mapping can be divided into two categories. One is on line mode and the other one is batch mode. In the on line mode, the scheduler is always in ready mode. Whenever a new job arrives to the scheduler, it is immediately allocated to one of

                    本文档为【Ant_Algorithm_for_Grid_Scheduling_Powered_by_local_search】，请使用软件OFFICE或WPS软件打开。作品中的文字与图均可以修改和编辑，
                    图片更改请在作品中右键图片并更换，文字修改请直接点击文字进行修改，也可以新增和删除文档中的内容。 
 该文档来自用户分享，如有侵权行为请发邮件ishare@vip.sina.com联系网站客服，我们会及时删除。

                    [版权声明] 本站所有资料为用户分享产生，若发现您的权利被侵害，请联系客服邮件isharekefu@iask.cn，我们尽快处理。

                    本作品所展示的图片、画像、字体、音乐的版权可能需版权方额外授权，请谨慎使用。

                    网站提供的党政主题相关内容(国旗、国徽、党徽..)目的在于配合国家政策宣传，仅限个人学习分享使用，禁止用于任何广告和商用目的。
                

下载需要：免费已有0 人下载

立即下载

Ant_Algorithm_for_Grid_Scheduling_Powered_by_local_search

你可能还喜欢