Monitoring And Analyzing Distributed Cluster Performance And Statistics Of Atlas Job Flow

dc.contributor	Ramprakash, Sreeranjani	en_US
dc.date.accessioned	2007-08-23T01:56:05Z
dc.date.accessioned	2011-08-24T21:39:47Z
dc.date.available	2007-08-23T01:56:05Z
dc.date.available	2011-08-24T21:39:47Z
dc.date.issued	2007-08-23T01:56:05Z
dc.date.submitted	August 2005	en_US
dc.description.abstract	Grid3 is a Grid facility used by many High Energy Physics experiments to enable physicists to process data-intensive and CPU-intensive jobs more effectively and efficiently. Highlights of Grid3 include: participation by more than 25 sites across the U.S. and Korea, which collectively provide more than 2000 CPUs; resources used by seven different scientific applications, including three high energy physics simulations and four data analyses in high energy physics, biochemistry, astrophysics, and astronomy; more than 100 individuals currently registered with access to the Grid; and a peak throughput of 500-900 concurrently running jobs with a completion efficiency of approximately 75%. Because each application and organization using Grids measures efficiency differently and considers different scheduling parameters, such as the number of successfully completed jobs, turnaround time, and the number of idle processors, scheduling on any Grid still needs to be tailored to individual cases. The ATLAS experiment is a High Energy Physics experiment that uses the services of Grid3, which is now migrating to the Open Science Grid (OSG). This thesis provides monitoring and analysis of performance and statistical data from the individual distributed clusters that together form the ATLAS Grid; these data will ultimately be used to make scheduling decisions on this Grid. The system developed in this thesis uses a layered architecture so that anticipated developments or changes to the existing Grid infrastructure can use this work with minimal or no changes. The starting point of the system is the existing scheduling, which is done manually for the ATLAS job flow. We have provided additional functionality based on the requirements of the High Energy Physics ATLAS team of physicists at UTA.
The system developed in this thesis has successfully monitored and analyzed distributed cluster performance at three sites and is awaiting access to monitor data from three more sites.	en_US
dc.identifier.uri	http://hdl.handle.net/10106/117
dc.language.iso	EN	en_US
dc.publisher	Computer Science & Engineering	en_US
dc.title	Monitoring And Analyzing Distributed Cluster Performance And Statistics Of Atlas Job Flow	en_US
dc.type	M.S.	en_US

Collections

University of Texas at Arlington