Challenge

We were asked by our client to predict future energy consumption in a commercial building in order to provide visibility on upcoming energy costs and load for budgeting and forecasting purposes.

The data provided by the client consisted of hourly kWh consumption readings over a 12-month period. This was sufficient for naive time series modelling, but a single year of history did not allow us to take year-over-year seasonality into account. That seasonality can have a considerable impact on electricity consumption, especially in climates that swing between extreme low and high temperatures.

Solution

The data was first cleaned: anomalies such as outages were removed and the affected values were estimated.
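
As an illustration of this cleaning step, the sketch below fills missing readings by linear interpolation. The choice of linear interpolation, the use of `None` to mark an outage, and the name `fill_gaps` are assumptions made for the example; the estimation method actually used in the project is not described here.

```python
from typing import List, Optional


def fill_gaps(readings: List[Optional[float]]) -> List[float]:
    """Replace None readings by linear interpolation between known neighbours.

    Gaps at the very start or end copy the nearest known value.
    """
    known = [i for i, value in enumerate(readings) if value is not None]
    if not known:
        raise ValueError("no valid readings to interpolate from")

    filled = list(readings)
    for i, value in enumerate(readings):
        if value is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = readings[right]
        elif right is None:
            filled[i] = readings[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = readings[left] + weight * (readings[right] - readings[left])
    return filled


# Hypothetical hourly kWh readings with a two-hour outage marked as None.
hourly_kwh = [10.0, None, None, 16.0, 15.0]
cleaned = fill_gaps(hourly_kwh)
```

For long outages, a straight line across the gap flattens the daily pattern, so copying the same hours from a neighbouring week may be a better estimate in practice.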

As a next step, we prepared a baseline to compare our models against. Given the strong periodicity of the time series, a natural baseline is to set each forecast equal to the value observed at the same time the previous week.
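
A minimal sketch of such a weekly baseline is shown below, assuming the history is a plain list of hourly values with no gaps. The function name `weekly_naive_forecast` and the default 24-hour horizon are chosen for illustration only.

```python
HOURS_PER_WEEK = 7 * 24


def weekly_naive_forecast(history, horizon=24):
    """Forecast each of the next `horizon` hours as the value seen one week earlier."""
    if len(history) < HOURS_PER_WEEK:
        raise ValueError("need at least one full week of hourly history")
    if horizon > HOURS_PER_WEEK:
        raise ValueError("horizon cannot exceed one week for this baseline")
    # The hour at position len(history) + h maps to len(history) + h - 168.
    start = len(history) - HOURS_PER_WEEK
    return history[start:start + horizon]


# Hypothetical history: 200 hourly values, here simply 0..199.
history = list(range(200))
forecast = weekly_naive_forecast(history, horizon=24)
```

Any candidate model then has to achieve a lower error than this forecast on the same evaluation period to justify its extra complexity.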

To beat this baseline, some additional features were engineered:

  • Weather features (temperature, pressure, etc.)
  • Holidays, weekday, weekend
  • Fourier terms, since the analysis of the time series showed weekly and daily periodicity
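
The calendar and Fourier features in the list above could be built along the following lines. This is a sketch under assumptions: the feature names, the number of harmonics (two per period) and the `holidays` argument are hypothetical, and weather features are left out because they depend on an external data source.

```python
import math
from datetime import datetime

# Fixed reference point for measuring elapsed hours; 2000-01-03 is a Monday.
ORIGIN = datetime(2000, 1, 3)


def fourier_features(ts, period_hours, order, prefix):
    """Sine/cosine pairs encoding the position of `ts` within a repeating period."""
    hours = (ts - ORIGIN).total_seconds() / 3600
    phase = (hours % period_hours) / period_hours  # position in the cycle, in [0, 1)
    feats = {}
    for k in range(1, order + 1):
        angle = 2 * math.pi * k * phase
        feats[f"{prefix}_sin{k}"] = math.sin(angle)
        feats[f"{prefix}_cos{k}"] = math.cos(angle)
    return feats


def time_features(ts, holidays=frozenset()):
    """Calendar flags plus daily and weekly Fourier terms for one timestamp."""
    feats = {
        "weekday": ts.weekday(),  # Monday = 0, Sunday = 6
        "is_weekend": int(ts.weekday() >= 5),
        "is_holiday": int(ts.date() in holidays),
    }
    feats.update(fourier_features(ts, 24, 2, "daily"))
    feats.update(fourier_features(ts, 7 * 24, 2, "weekly"))
    return feats


features = time_features(datetime(2023, 11, 4, 0, 0))  # a Saturday at midnight
```

The sine/cosine encoding keeps 23:00 and 00:00 close together in feature space, which a plain hour-of-day integer does not.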

Different models were then trained to predict one day ahead (i.e. 24 steps ahead since we have hourly data). The evaluation was made on the month of November, comparing both the sliding and expanding window techniques for the training set (more details on that here).
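
The difference between the two windowing techniques can be sketched as follows, assuming the model is re-evaluated once per day on index positions of the hourly series. The function name `walk_forward_splits` and its parameters are hypothetical and not taken from the project code.

```python
def walk_forward_splits(n_samples, initial_train, horizon=24, expanding=True):
    """Yield (train_range, test_range) pairs for day-ahead walk-forward evaluation.

    expanding=True  -> the training set always starts at 0 and grows each step.
    expanding=False -> the training set keeps a fixed length and slides forward.
    """
    end = initial_train
    while end + horizon <= n_samples:
        start = 0 if expanding else end - initial_train
        yield range(start, end), range(end, end + horizon)
        end += horizon


# Hypothetical example: 4 days of hourly data, with the first 2 days as initial training set.
expanding_splits = list(walk_forward_splits(96, 48, expanding=True))
sliding_splits = list(walk_forward_splits(96, 48, expanding=False))
```

In both cases the test range always lies strictly after the training range, so no future observation leaks into training.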

The final model was a Random Forest fine-tuned on an expanding training window, predicting energy consumption with a mean absolute percentage error (MAPE) of 2.6%.
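
For reference, the mean absolute percentage error is the average of |actual - predicted| / |actual|, expressed as a percentage. A minimal sketch of the calculation is given below; the function name `mape` and the sample values are illustrative and are not the project's data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)


# Hypothetical values: errors of 10% and 5% average to roughly 7.5%.
score = mape([100.0, 200.0], [110.0, 190.0])
```

Because the error is divided by the actual value, hours with very low consumption weigh heavily on the score, which is worth keeping in mind when comparing models.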

The project was completed in 4 weeks.