A Comparative Study of Scheduling Algorithm Performance for Real-Time Twitter Data on Hadoop
Hadoop is an open-source, Java-based software framework. It consists of two main components: MapReduce and the Hadoop Distributed File System (HDFS). MapReduce comprises the Map and Reduce phases, which are used for data processing, while HDFS is the storage layer where the data is kept. Because jobs often differ widely in their execution characteristics, an appropriate job scheduler is needed, and many schedulers are available to match different job characteristics. The Fair Scheduler works on the principle that every job receives the same share of resources as the other jobs, with the aim of improving performance in terms of average completion time. The Hadoop Fair Sojourn Protocol Scheduler is a scheduling algorithm for Hadoop that schedules jobs based on their size. This study compares the performance of the two schedulers on workloads with the characteristics of Twitter data. The test results show that the Hadoop Fair Sojourn Protocol Scheduler performs better than the Fair Scheduler, by 9.31% in average completion time and by 23.46% in job throughput. The Fair Scheduler, in turn, performs better on task fail rate, by 23.98%.
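The contrast between equal-share and size-based scheduling can be illustrated with a toy model. The sketch below is not the Hadoop Fair Scheduler or the Hadoop Fair Sojourn Protocol implementation. It assumes a single resource of unit capacity, jobs that all arrive at time zero, and job sizes known in advance. The function names and the job sizes are hypothetical and are not taken from the study.

```python
def fair_share_completion_times(sizes):
    """Completion times when all unfinished jobs share one resource equally.

    Toy assumption: all jobs arrive at time 0 and the resource has unit capacity.
    Returned times are ordered from the smallest job to the largest.
    """
    ordered = sorted(sizes)
    n = len(ordered)
    times = []
    clock = 0.0
    served = 0.0  # work each still-running job has received so far
    for k, size in enumerate(ordered):
        active = n - k  # jobs still sharing the resource
        # Each active job must receive (size - served) more work before
        # the current smallest job finishes; the resource is split 'active' ways.
        clock += (size - served) * active
        served = size
        times.append(clock)
    return times


def size_based_completion_times(sizes):
    """Completion times when jobs run one at a time, smallest first.

    This is a plain shortest-job-first simplification of size-based scheduling.
    """
    times = []
    clock = 0.0
    for size in sorted(sizes):
        clock += size
        times.append(clock)
    return times


def average(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


if __name__ == "__main__":
    job_sizes = [6, 1, 2]  # hypothetical job sizes in arbitrary work units
    fair = fair_share_completion_times(job_sizes)
    sized = size_based_completion_times(job_sizes)
    print("fair share :", fair, "average", average(fair))
    print("size based :", sized, "average", average(sized))
```

In this toy setting the last job finishes at the same time under both policies, but the small jobs finish earlier under size-based ordering, which lowers the average completion time. Real clusters add effects the model ignores, such as job arrivals over time, size estimation errors, and task failures.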