Concept and benchmark results for Big Data energy forecasting based on Apache Spark

Jorge Ángel González Ordiano, Andreas Bartschat, Nicole Ludwig, Eric Braun, Simon Waczowicz, Nicolas Renkamp, Nico Peter, Clemens Düpmeier, Ralf Mikut, Veit Hagenmeyer

2018

This article describes a concept for creating and applying energy forecasting models in a distributed environment. It also presents a benchmark that compares the time required to train and apply data-driven forecasting models on a single computer with the time required on a computing cluster. The comparison is based on a simulated dataset and uses both R and Apache Spark. The results indicate cases in which distributed computing based on Spark may be advantageous.

Keywords:

Computer science
Data science
Big data
Data mining
Computational Science and Engineering
Apache Spark
Distributed Computing Environment
Energy forecasting
Computer cluster
Data-driven
