An Experimental Survey on Big Data Frameworks


1. An Experimental Survey on Big Data Frameworks
W. Inoubli, S. Aridhi, H. Mezni, M. Maddouri, E. Mephu Nguifo
XLDB 2017, October 10-12, 2017, Casino de Royat, Allée du Pariou, 63130 Royat, France
10/10/2017

2. Context and motivations
Big Data problems:
- Scalability and fault tolerance requirements
- Several applications have been migrated to Big Data
- Several Big Data frameworks have been proposed
XLDB, October 10-12, 2017, Clermont-Ferrand, France

3. Experimental protocol
Batch mode evaluation
- Workload: K-means, WordCount and PageRank.
- Frameworks: Hadoop (MapReduce), Spark and Flink.
- Features: Scalability, configuration parameters.
Stream mode evaluation
- Workload: ETL workload.
- Frameworks: Storm, Spark and Flink.
- Features: Number of events processed.
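To make the WordCount batch workload concrete, here is a minimal single-process sketch of the map, shuffle and reduce pattern that Hadoop MapReduce, Spark and Flink each distribute across a cluster. It is plain Python for illustration only, not the benchmark code used in the survey; the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(lines):
    # Illustrative in-memory pipeline (assumption: the input fits in memory).
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["big data", "Big frameworks"]))
```

In a real framework the map and reduce steps run in parallel on partitions of the input, and the shuffle moves data between machines, which is where much of the response time for large datasets tends to go.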

4. Experimental protocol
Monitoring tool
- Data collection: Kafka
- Data storage: Elasticsearch
- Data visualisation: Kibana

5. Experimental results
Batch mode results: impact of data size on response time, with (a) small datasets and (b) big datasets.

6. Experimental results
Batch mode results:
- Vary the number of machines in the cluster.
- K-means workload
- Study the scalability feature.
- Vary the number of iterations.
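The role of the iteration count in the K-means workload can be seen in a small sketch of Lloyd's algorithm: every iteration makes one full pass over the data, so response time grows with the number of iterations. This is a single-machine illustration on 2-D points, not the implementation benchmarked in the survey; the function name `kmeans` and its `seed` parameter are assumptions made for the example.

```python
import random


def kmeans(points, k, iterations, seed=0):
    """Run Lloyd's algorithm on 2-D points for a fixed number of iterations."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct points drawn at random.
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for x, y in points:
            nearest = min(
                range(k),
                key=lambda i: (x - centroids[i][0]) ** 2 + (y - centroids[i][1]) ** 2,
            )
            clusters[nearest].append((x, y))
        # Update step: move each centroid to the mean of its cluster,
        # keeping the old centroid when a cluster is empty.
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c
            else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(sorted(kmeans(data, k=2, iterations=5)))
```

A distributed framework parallelises the assignment step across machines and aggregates the partial sums for the update step, which is why both the cluster size and the iteration count are natural parameters to vary.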

7. Experimental results
Stream mode results: impact of message size on the number of processed messages, with events of 100 kb and events of 500 kb.
- Storm performs better in the case of small datasets.
- Flink provides good results in the case of big datasets.

8. More at the Poster Session!