Grid Federation: Number of Jobs and File Size Effects on Jobs Time
Keywords:
Data Grids, Data Grid Federation, Data Replication, EDR Optimizer
Abstract
Grid federation is fast emerging as an alternative solution to the problems posed by the large data handling and computational needs of numerous scientific projects worldwide. Efficient access to such widely distributed data sets has become a fundamental challenge in grid computing. Creating replicas and placing them at suitable sites through data replication mechanisms can improve system performance. Data replication reduces data access time, supports load balancing, and reduces bandwidth consumption. In this paper, an enhanced data replication mechanism called EDR is proposed. EDR builds on the Latest Access Largest Weight (LALW) mechanism and applies the principle of exponential growth/decay to both file size and file access history. The mechanism selects a popular file and determines an appropriate number of replicas as well as suitable grid sites for replication. It establishes the popularity of each file by associating a different weight with each historical data access record. Typically, a more recent data access record has a larger weight, which signifies that the record is more relevant to the current data access situation. The proposed EDR mechanism was simulated with varying numbers of jobs and file sizes, using file size and job completion time as the variable metrics. The OptorSim simulator was used to evaluate the proposed mechanism alongside the existing Least Recently Used (LRU) and Least Frequently Used (LFU) mechanisms. The simulation results showed that job completion time increases as both file size and the number of jobs grow. EDR shows improved mean job completion time compared to the LRU and LFU mechanisms.
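The access-history weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' EDR implementation: the function names, the interval-based layout of the access history, and the decay factor of 0.5 are all assumptions made for illustration, and the file-size term of EDR is left out because the abstract does not specify its form.

```python
from collections import defaultdict


def file_popularity(access_history, decay=0.5):
    """Score files by exponentially decayed access counts.

    access_history: list of time intervals, oldest first; each interval is a
    dict mapping a file name to its access count in that interval.
    decay: hypothetical decay factor in (0, 1]; the newest interval has
    weight 1 and each older interval is multiplied by `decay` once more.
    """
    scores = defaultdict(float)
    newest = len(access_history) - 1
    for idx, interval in enumerate(access_history):
        weight = decay ** (newest - idx)  # older records weigh less
        for name, count in interval.items():
            scores[name] += weight * count
    return dict(scores)


def most_popular(access_history, decay=0.5):
    """Return the file with the highest decayed access score."""
    scores = file_popularity(access_history, decay)
    return max(scores, key=scores.get)


# Illustrative data only: a.dat has more accesses overall (10 vs 8),
# but b.dat was accessed more recently.
history = [
    {"a.dat": 8, "b.dat": 1},
    {"a.dat": 2, "b.dat": 3},
    {"b.dat": 4},
]
print(file_popularity(history))
print(most_popular(history))
```

With this made-up history, a plain frequency count would favour a.dat, while the decayed score favours b.dat because its accesses are more recent. This is the behaviour the abstract attributes to giving recent access records a larger weight.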
License
TRANSFER OF COPYRIGHT AGREEMENT
The manuscript is herewith submitted for publication in the Journal of Telecommunication, Electronic and Computer Engineering (JTEC). It has not been published before, and it is not under consideration for publication in any other journals. It contains no material that is scandalous, obscene, libelous or otherwise contrary to law. When the manuscript is accepted for publication, I, as the author, hereby agree to transfer to JTEC, all rights including those pertaining to electronic forms and transmissions, under existing copyright laws, except for the following, which the author(s) specifically retain(s):
- All proprietary rights other than copyright, such as patent rights
- The right to make further copies of all or part of the published article for my use in classroom teaching
- The right to reuse all or part of this manuscript in a compilation of my own works or in a textbook of which I am the author; and
- The right to make copies of the published work for internal distribution within the institution that employs me
I agree that copies made under these circumstances will continue to carry the copyright notice that appears in the original published work. I agree to inform my co-authors, if any, of the above terms. I certify that I have obtained written permission for the use of text, tables, and/or illustrations from any copyrighted source(s), and I agree to supply such written permission(s) to JTEC upon request.