During the time I spent working on various smart grid projects, I often found myself having to apply scientific results to real-life applications. However, I found that in reality practice beats theory: integrating machine learning and optimization algorithms into production-level applications rarely produces the same results as during testing.
The scenario
Let me describe one actual case where our team had to demonstrate near real-time automated demand response (energy curtailment to meet the supply in peak hours) to stakeholders, on buildings fully integrated and controlled through EBI (Enterprise Buildings Integrator) from Honeywell. With residential customers this cannot be easily accomplished due to the real-time constraint and factors such as customer compliance, but it works in theory in microgrids where automated control software like EBI is installed. To make things a bit more complicated, the objective was not only to reduce consumption by a specific target but also to keep the reduction stable throughout the event (curtailing everything in the first 15 minutes and nothing for the rest of the hour was undesirable). Our team developed a building selection algorithm which, based on historical data, would select the right combination of buildings to meet a certain curtailment target, say 1 MWh. In addition, to adapt to unforeseen scenarios, the algorithm would reselect buildings every 15 minutes to cope with cases where the expected curtailment was not observed; a rough sketch of such a selection step is shown below. Energy consumption and curtailment estimates were obtained through machine learning algorithms.
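To make the idea concrete, here is a minimal sketch of what one such selection step could look like. The building names, per-building curtailment estimates, and the greedy heuristic are all illustrative assumptions on my part, not the actual production algorithm.

```python
# Illustrative, greedy knapsack-style selection of buildings for one
# 15-minute interval. All names and kWh figures are made up.
estimated_curtailment_kwh = {
    "library": 120.0,
    "dorm_a": 80.0,
    "lab_building": 150.0,
    "office_1": 60.0,
    "lecture_hall": 95.0,
}

def select_buildings(estimates, target_kwh, overshoot=1.05):
    """Pick buildings whose estimated curtailment sums close to the target."""
    selected, total = [], 0.0
    # Consider the largest estimated contributors first.
    for name, kwh in sorted(estimates.items(), key=lambda item: -item[1]):
        if total + kwh <= target_kwh * overshoot:
            selected.append(name)
            total += kwh
    return selected, total

buildings, expected = select_buildings(estimated_curtailment_kwh, target_kwh=250)
print(buildings, expected)  # rerun every 15 minutes with refreshed estimates
```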
In theory, our machine learning algorithms for curtailment estimation had errors below 10% for 16 buildings and above 20% for 2 buildings. For those interested, our error metric was MAPE (Mean Absolute Percentage Error), with kWh recorded at the building level. Those are good numbers considering the challenges of forecasting on smart grid data. To our apparent advantage, we were working with campus buildings with specific purposes (offices, libraries, dorms, lecture and lab rooms, etc.), hence buildings where consumption tends to follow a rather predictable pattern. In addition, our building selection algorithm aimed for a near-optimal solution to what is essentially a knapsack problem, a well-known combinatorial optimization problem. It appeared that we had everything to our advantage: relatively small errors and a good algorithm.
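For those unfamiliar with the metric, MAPE is simply the mean of the absolute percentage errors between forecast and recorded values. A quick sketch, using made-up numbers, is below:

```python
# Mean Absolute Percentage Error (MAPE) for one building's interval forecasts.
# The actual/forecast kWh values are invented for illustration only.
actual   = [210.0, 195.0, 230.0, 250.0]   # recorded kWh per interval
forecast = [220.0, 180.0, 240.0, 245.0]   # model estimates

def mape(actual, forecast):
    """Return MAPE in percent."""
    return 100.0 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)

print(f"MAPE: {mape(actual, forecast):.1f}%")  # ~4.7% in this toy example
```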
So how did everything go?
Long story short, on our demo day, after a few dry runs, our team prepared a live demo for the stakeholders. The intent was to run the demand response event for one hour with a specified curtailment target. During that hour, our system would monitor and estimate the curtailment based on ISO models (Southern California Edison, to be more exact), while our building selection algorithm would make decisions to add or remove buildings in the hope of keeping the curtailment stable and meeting the target.
We all gathered in the demo room and started the demand response event. While the event was running we had discussions and presentations on our theoretical and dry-run experiments (in which, I should add, we had never met the target despite trying various curtailment estimation methods). Halfway through it was becoming clear that we could not curtail enough, and by the time it was all over we had managed to achieve about half of the target.
A short paper on the proposed integrated system can be read on ResearchGate.
So what went wrong?
We had the estimated curtailment numbers and their associated errors, and we had the building selection algorithm, so where did we go wrong? One possible answer is that our curtailment estimations were off in the first place. The problem of curtailment estimation is nearly impossible to solve because there is no way to know for sure how much you reduced: there is no baseline. You would have to know your consumption in the absence of curtailment, which is impossible, because then you wouldn’t have a curtailment to measure in the first place. In practice, utilities rely on the consumption just before the curtailment event or the consumption on similar past days, but both approaches are subjective and give different results, as the sketch below illustrates. So in the end we are all just guessing, but who is right? I will talk more about curtailment estimation in a future post, so stay tuned if it doesn’t all make sense right now.
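To illustrate how much the choice of baseline matters, here is a toy comparison of the two approaches mentioned above; all numbers are invented, and a real utility baseline calculation would of course be more involved.

```python
# Toy comparison of two common baseline choices for one building.
# All kWh figures are invented for illustration only.
consumption_during_event = 820.0   # measured kWh during the DR hour

# Baseline 1: consumption in the hour just before the event.
baseline_pre_event = 1010.0

# Baseline 2: average consumption for the same hour on similar past days.
similar_days = [980.0, 1050.0, 995.0, 1100.0, 1025.0]
baseline_similar_days = sum(similar_days) / len(similar_days)

print(f"Curtailment vs. pre-event baseline:   {baseline_pre_event - consumption_during_event:.0f} kWh")
print(f"Curtailment vs. similar-day baseline: {baseline_similar_days - consumption_during_event:.0f} kWh")
# The two estimates differ, and neither can be checked against the true
# (unobservable) consumption without curtailment.
```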
Best vs. correct answer
The takeaway from my short story is that, irrespective of how good the algorithms themselves are, it all comes down to how we interpret data and whether or not we have the right information to draw the correct conclusion. It may well be that the answers to some problems are by nature data-influenced and that there is no “correct” answer, just the best answer given a well-defined business objective. In other words, given a specific KPI (Key Performance Indicator) and a lack of the right information, the machine learning algorithm that provides the best answer against that KPI is the one to use. So in our case, given the outcomes of several algorithms, we could have chosen the one which overestimates compared to the rest, and we could probably have built a solid case around it, as there is no solid baseline to compare against.
Of course, as always, I am more than keen to hear your thoughts on this in the comments below.