When talking about smart grid data, it is hard to avoid the time variable. A blackout occurred at a specific time; high consumption is expected between 1 and 4 pm because extreme heat increases HVAC usage; customer Smith consumes less on weekdays than on weekends because he is at work on weekdays; pricing is higher between 8 and 10 pm; etc.
As such, most data (energy, pricing, weather, customer behavior) is recorded against time. Data of this sort, where data points occur in successive order over some period of time, is called a time series.
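As a minimal sketch of that definition, a time series can be held in Python as a list of (timestamp, value) pairs kept in chronological order. The readings, the date, and the helper name `is_time_series` are made up for illustration and are not taken from real meter data.

```python
from datetime import datetime

# Hypothetical energy readings in kWh; the date and values are illustrative only.
readings = [
    (datetime(2024, 1, 23, 17, 0), 0.52),
    (datetime(2024, 1, 23, 16, 30), 0.41),
    (datetime(2024, 1, 23, 16, 45), 0.47),
]


def is_time_series(points):
    """Return True if the timestamps are strictly increasing."""
    times = [t for t, _ in points]
    return all(earlier < later for earlier, later in zip(times, times[1:]))


# Raw data may arrive out of order; sorting by timestamp restores the succession.
series = sorted(readings, key=lambda point: point[0])

print(is_time_series(readings))  # checks the raw, unsorted input
print(is_time_series(series))    # checks the sorted series
```

Keeping the timestamp next to each value is what later makes it possible to ask questions such as "what happened between 1 and 4 pm?".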
Time series allow us to track the evolution of data over time: we can see how energy consumption varies with time, how weather relates to day-night patterns, or how customers follow their specific daily or weekly habits. This aspect is the most important one, as it allows utilities to plan for future events by analyzing the time series and applying machine learning techniques to predict consumption spikes, for instance.
The easiest way to represent time series data is a 2D plot in which the data (vertical Y-axis) is plotted against time (horizontal X-axis). In the next image, the energy consumption (kWh) is plotted at 15-minute intervals (in this case between 16:30 and 19:00 on the 23rd of the month):
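A chart of this kind can be sketched in a few lines of Python. The consumption values, the year and month, the output file name, and the helper names `build_readings` and `plot_consumption` are assumptions for illustration. The plotting part relies on the third-party matplotlib package, which has to be installed separately.

```python
from datetime import datetime, timedelta


def build_readings(start, end, step_minutes, values):
    """Pair each value with a timestamp spaced step_minutes apart from start to end."""
    step = timedelta(minutes=step_minutes)
    timestamps = []
    current = start
    while current <= end:
        timestamps.append(current)
        current += step
    if len(timestamps) != len(values):
        raise ValueError("expected one value per timestamp")
    return list(zip(timestamps, values))


def plot_consumption(readings, path="consumption.png"):
    """Plot consumption (Y-axis) against time (X-axis) and save it as an image."""
    # matplotlib is imported here so that build_readings works without it.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt
    import matplotlib.dates as mdates

    times = [t for t, _ in readings]
    kwh = [v for _, v in readings]

    fig, ax = plt.subplots()
    ax.plot(times, kwh, marker="o")
    ax.set_xlabel("Time")
    ax.set_ylabel("Energy consumption (kWh)")
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
    fig.autofmt_xdate()
    fig.savefig(path)
    plt.close(fig)


# Made-up values: 11 readings from 16:30 to 19:00 at 15-minute intervals.
values = [0.41, 0.47, 0.52, 0.60, 0.74, 0.81, 0.79, 0.68, 0.63, 0.55, 0.50]
readings = build_readings(
    datetime(2024, 1, 23, 16, 30), datetime(2024, 1, 23, 19, 0), 15, values
)
```

Calling `plot_consumption(readings)` writes the chart to `consumption.png`, with time on the horizontal axis and kWh on the vertical axis.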
In a future post I will show how to load and prepare the raw data for analysis.