A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster

E. Laxmi Lydia

Volume 17, Issue 10 (2023) Section 1 Paper ID: KO1A4

Authors

E. Laxmi Lydia

Affiliation

Department of Computer Science, Vignan's Institute of Information Technology, India

Abstract

Managing big data storage is a significant challenge in Hadoop clusters because data-intensive applications require a high degree of data access locality. Traditionally, high-performance computing has relied on dedicated servers for data storage and replication. This research introduces a Disparateness-Aware Scheduling algorithm to address resource and job disparity in cluster environments. The proposed method uses K-Centroids clustering and targets energy consumption in Hadoop clusters, with the aim of enhancing system reliability. The K-Centroids algorithm categorizes resources in order to minimize scheduling delay. A novel provisioning mechanism combines load, energy, and network time into the fitness function that Particle Swarm Optimization (PSO) optimizes to select computing nodes. The study also addresses fault tolerance through cluster migration for failed nodes, with PSO used to recompute and predict the optimal nodes. Experimental results show improvements in scheduling length, delay, speed, failure ratio, and energy consumption compared with existing systems.
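As a rough illustration of the PSO-based node selection described in the abstract, the following Python sketch assigns jobs to nodes by minimizing a weighted sum of load, energy, and network time. It is not the paper's implementation: the node attributes, job sizes, weights, the energy scaling factor, and the PSO parameters are all hypothetical values chosen for the example, and the K-Centroids categorization step is not shown.

```python
import random

# Hypothetical node attributes; the abstract gives no values or units.
NODES = [
    {"speed": 1.0, "power": 90.0, "net_time": 0.5},
    {"speed": 2.0, "power": 150.0, "net_time": 0.2},
    {"speed": 1.5, "power": 100.0, "net_time": 0.8},
]
JOBS = [4.0, 2.0, 6.0, 3.0, 5.0]  # hypothetical job sizes (work units)
WEIGHTS = (0.5, 0.3, 0.2)  # assumed weights for load, energy, network time


def decode(position):
    """Map a continuous particle position to one node index per job."""
    n = len(NODES)
    return [min(n - 1, max(0, int(x))) for x in position]


def fitness(position):
    """Weighted cost of a job-to-node assignment; lower is better."""
    assignment = decode(position)
    busy = [0.0] * len(NODES)  # total busy time per node
    network = 0.0
    for size, idx in zip(JOBS, assignment):
        busy[idx] += size / NODES[idx]["speed"]
        network += NODES[idx]["net_time"]
    load = max(busy)  # finishing time of the busiest node
    energy = sum(t * node["power"] for t, node in zip(busy, NODES))
    w_load, w_energy, w_net = WEIGHTS
    # Dividing energy by 100 is an arbitrary scaling assumed for this example.
    return w_load * load + w_energy * energy / 100.0 + w_net * network


def pso(n_particles=20, iterations=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over continuous positions decoded to node indices."""
    rng = random.Random(seed)
    dim, hi = len(JOBS), len(NODES)
    pos = [[rng.uniform(0, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]  # personal best positions
    best_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Keep the position inside the valid index range [0, hi).
                pos[i][d] = min(hi - 1e-9, max(0.0, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return decode(g_pos), g_val


if __name__ == "__main__":
    assignment, cost = pso()
    print("job -> node:", assignment, "cost:", round(cost, 3))
```

Under these assumptions, one way to mimic the recomputation after a node failure would be to remove the failed entry from `NODES` and run `pso()` again on the remaining nodes.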
