In a previous post, I explained how smart grid time series data can be cleaned in preparation for data analysis. Not all analyses involve predictions. In fact, the preliminary information that I will extract in this post is usually part of an additional data preparation step, and it describes properties of the time series that are used in the subsequent steps of time-series forecasting.
As always, you can try and test the code in Google Colab and use this file for experimenting. You could also generate your own cleaned time series file by following the steps in the Cleaning up smart grid data (with examples) post.
As usual, let me know your thoughts on this in the comments below.
Trend stationary vs difference stationary vs stationary time series
In the previous Google Colab notebook, you encountered the notions of trend stationary, difference stationary, and (strict) stationary time series. These may be confusing as they all contain the word stationary. In fact, trend stationary and difference stationary time series are not stationary as they stand. A trend stationary series becomes stationary once its deterministic trend is removed (for example, by fitting the trend and subtracting it), while a difference stationary series becomes stationary once it is differenced.
For trend stationary time series (those failing the KPSS test), the trend is deterministic, meaning the time series returns to its trend following a disturbance. In consumption time series, such a trend can be the seasonality: despite fluctuations over the year, the consumption pattern returns to the same trend (for instance, each summer the trend is the same).
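To make the "returns to its trend" behaviour concrete, here is a minimal sketch on synthetic data. It assumes a linear trend plus AR(1) noise. The function name `simulate_trend_stationary` and all parameter values are illustrative choices of mine, not something taken from the smart grid data set. The same series is simulated twice, once with a one-off disturbance, so the gap between the two runs shows how quickly the disturbance dies out.

```python
import numpy as np

def simulate_trend_stationary(n, shock_at=None, shock_size=0.0, seed=0):
    """Linear trend plus stationary AR(1) noise (all values are illustrative)."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, 0.5, n)
    if shock_at is not None:
        eps[shock_at] += shock_size  # one-off disturbance
    noise = np.zeros(n)
    for i in range(1, n):
        # AR(1) with coefficient 0.5: past disturbances are halved at every step
        noise[i] = 0.5 * noise[i - 1] + eps[i]
    t = np.arange(n)
    return t, 10.0 + 0.02 * t + noise

# Same random seed, so the two runs differ only by the disturbance at step 100
t, base = simulate_trend_stationary(365)
_, shocked = simulate_trend_stationary(365, shock_at=100, shock_size=5.0)

gap = shocked - base
print("gap at the shock, 5 and 20 steps later:", gap[100], gap[105], gap[120])

# Removing the deterministic trend: fit a straight line and subtract it
slope, intercept = np.polyfit(t, base, 1)
residuals = base - (slope * t + intercept)
print("estimated slope:", slope)
```

The gap shrinks geometrically after the disturbance, which is what returning to the trend means here. The residuals left after subtracting the fitted line are the part you would then treat as stationary.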
For difference stationary time series (those passing the ADF test), the trend is stochastic, meaning a disturbance has a permanent effect and the time series does not return to its previous trend. In consumption time series, taking just the consumption during the morning of a hot day could yield a difference stationary time series, as the consumption would steadily increase as the temperature rises due to the use of the HVAC. The series would be difference stationary because the differences between consecutive values, rather than the values themselves, form a stationary series that fluctuates around a constant level.
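The stochastic-trend case can be sketched in the same way. The snippet below assumes a random walk with drift as a stand-in for a difference stationary series. Again, `simulate_random_walk` and its parameter values are hypothetical and only serve the illustration. The disturbance stays in the level of the series for good, while the first differences fluctuate around a constant value.

```python
import numpy as np

def simulate_random_walk(n, drift=0.02, shock_at=None, shock_size=0.0, seed=0):
    """Random walk with drift: each value is the previous one plus drift plus noise."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, 0.5, n)
    if shock_at is not None:
        eps[shock_at] += shock_size  # one-off disturbance
    return 10.0 + np.cumsum(drift + eps)

# Same random seed, so the two runs differ only by the disturbance at step 100
base = simulate_random_walk(365)
shocked = simulate_random_walk(365, shock_at=100, shock_size=5.0)

gap = shocked - base
print("gap before, at, and long after the shock:", gap[99], gap[100], gap[300])

# First differencing removes the stochastic trend: what is left is drift plus noise
diffs = np.diff(base)
print("mean of the differences:", diffs.mean())
```

Unlike the trend stationary sketch, the gap does not decay here: once the disturbance has happened, every later value carries it. This is why differencing, not subtracting a fitted trend line, is the appropriate transformation for this kind of series.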