Energy and Power Engineering, 2011, 3, 9-16
doi:10.4236/epe.2011.31002 Published Online February 2011 (http://www.SciRP.org/journal/epe)
Copyright © 2011 SciRes. EPE
Short-Term Electricity Price Forecasting Using a
Combination of Neural Networks and Fuzzy Inference
Evans Nyasha Chogumaira, Takashi Hiyama
Department of Computer Science and Electrical Engineering, Kumamoto University, Kumamoto, Japan
E-mail: evans@st.cs.kumamoto-u.ac.jp
Received October 28, 2010; revised November 3, 2010; accepted November 4, 2010
Abstract
This paper presents an artificial neural network, ANN, based approach for estimating short-term wholesale
electricity prices using past price and demand data. The objective is to utilize the piecewise continuous na-
ture of electricity prices on the time domain by clustering the input data into time ranges where the variation
trends are maintained. Due to the imprecise nature of cluster boundaries a fuzzy inference technique is em-
ployed to handle data that lies at the intersections. As a necessary step in forecasting prices the anticipated
electricity demand at the target time is estimated first using a separate ANN. Data from the Australian
New South Wales electricity market was used to test the system. The developed system shows considerable
improvement in performance compared with approaches that regard price data as a single continuous time
series, achieving a MAPE of less than 2% for hours with steady prices and 8% for the clusters covering time
periods with price spikes.
Keywords: Electricity Price Forecasting, Short-Term Load Forecasting, Electricity Markets, Artificial Neural
Networks, Fuzzy Logic
1. Introduction
With deregulation in many electricity markets around the
world, knowledge of possible future values of demand
and the corresponding price has become more significant
to the different entities on the market – generators and
electricity traders for determining bidding strategies, and
system operators for administration of the market [1].
Generally, commodity prices are driven by the balance of
supply and demand. In electricity markets the traded
‘commodity’ cannot be stockpiled economically, so the
constraints are defined by the system's total capacity to
satisfy demand at any given time [2]. This causes
electricity prices to be highly volatile, which masks the
observable trends necessary for forecasting future values,
especially in the short term.
Short-term forecasts cover the period from a few min-
utes to about one week ahead. These are useful for dis-
patch and short-term or spot trading. Short term trading is
meant to service the short-term variations in load and the
actual prices are only known after matching of bids and
offers by the market operator [1]. This presents a
challenge: to place effective bids, the traders need to
have an idea of the future values of the demand and its
corresponding price.
Different models have been employed in power sys-
tems for achieving forecasting accuracy and these include:
regression, statistical and state space methods [3-6]. Arti-
ficial intelligence based approaches have been explored
based on expert systems, evolutionary programming,
fuzzy systems, artificial neural networks and various
combinations of these. The most widely used approach in
previous works has been to develop a mathematical
model of the power system and then perform simulations
to determine the required values [7-9]. The main
challenge with this has been to build accurate non-linear
mathematical models. In addition, complete system data is
not always readily available, and great computational
effort is required [2].
To estimate future values of any parameter one needs
to have some information on factors that influence that
parameter, or on trends that describe the parameter of
interest. A number of parameters have been analyzed to
determine their usefulness as input data for short-term
estimation. These include weather variables, past demand
and price data. Weather variables such as temperature,
used in [10], are either averaged information or forecasted
values, so there is always a chance that the uncertainty could
E. N. CHOGUMAIRA ET AL.
affect the output of price and load forecasting. As for past
price, data analysis shows that trends are not maintained
consistently for all hours of a day, or even across the
different days of the week.
Presented in this paper is an approach utilizing artificial
neural networks (ANN) to extract trends from past data of
the same parameters and use that to predict possible future
values. A data clustering approach is adopted that
groups the data into sets with cut-off points that vary from
one day to the other. This creates zones of intersection,
and to handle these points fuzzy logic is used.
In the wholesale electricity markets different types of
prices exist, but the most basic pricing concept is the
market clearing price (MCP). When there is transmission
congestion in the system, other price concepts such
as the locational marginal price (LMP) and the zonal market
clearing price (ZMCP) are used. This paper focuses
on the MCP, which is not dependent on internal transmission
constraints.
2. Selecting Input Data
A variety of parameters have generally been used as input
to price and demand forecasting by different researchers,
from weather variables, socio-economic indices, time
indicators to past trends of the demand and its corresponding
price. One approach would be to adopt techniques
used in long-term forecasts, which use most of the
factors stated above, but the disadvantage of these is that
useful information about changes during price spikes is
discarded as outliers [11-13].
Statistical analysis was used to find the dominant factors
that determine the electricity price. An overview
of price and demand variation for a week, shown in
Figure 1, indicates that the price variation is a complex
trend that requires detailed analysis to establish the
influencing factors. The figure also shows that demand varies
with a readily observable repetitive pattern, yet in the
same period the price shows spikes of varying magnitude
and duration.
Weather variables were also considered; the parameters
of interest are temperature and cloud cover. From Figure
2 we can see that temperature and cloud cover are similar
in terms of the information they provide. Correlation with
demand is observable only over a long period of time; in the
short term weather variables cannot be relied on as input to
load forecasting.
Socio-economic variables are performance indicators
like gross domestic product, which are complex figures
incorporating estimates and averages, making them not the
best inputs for further estimation. Significant social
activities are strongly linked to the calendar days and hours
of the day such that simply including an element of time
in the input obviates them.
Figure 1. Price and demand variations for a week.
Figure 2. Weather variables and electricity demand.
Past data has been selected for this approach, as other
factors are either estimates in themselves or data averaged
over a long time, making them not ideal for short-term
application. A similarity index was calculated for past
demand and past price values, defined as follows:
$$J_{MT} = \left[\sum_{k}\left(P_{DM}(k) - P_{DT}(k)\right)^{2}\right]^{1/2} \tag{1}$$
Where,
$P_{DM}(k)$ - parameter value on the target day for hour k
$P_{DT}(k)$ - parameter value from past data for hour k
k - sampling interval
The smaller the figure, the higher the similarity between
the compared days. The similarity indices show a very
significant difference from one day to the next, so to
avoid masking trends by grouping data from different
days, each day of the week is treated individually.
National holidays are treated as Sundays. Further correlation
analysis was done through the scatter plots shown in
Figures 3-5.
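As a rough illustration of how the similarity index in (1) can be used to rank past days, the following Python sketch treats the index as the Euclidean distance between two daily profiles sampled at the same hours. The function name `similarity_index`, the day labels and the sample values are assumptions made for this example; they are not taken from the market data.

```python
import math


def similarity_index(target_day, past_day):
    """Euclidean distance between two equally sampled daily profiles.

    A smaller value means the two days are more similar.
    """
    if len(target_day) != len(past_day):
        raise ValueError("profiles must have the same number of samples")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(target_day, past_day)))


# Hypothetical price profiles (four samples per day, for brevity only).
target = [20.0, 22.0, 30.0, 25.0]
candidates = {
    "previous Monday": [21.0, 22.0, 29.0, 25.0],
    "previous Sunday": [15.0, 16.0, 20.0, 18.0],
}

# Rank the candidate days from most to least similar.
ranked = sorted(candidates, key=lambda d: similarity_index(target, candidates[d]))
```

In this toy case the Monday profile ranks first, which mirrors the reasoning for treating each day of the week individually.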
From the scatter plots we can easily see that the price
variation between different days can be described by
more than one trend, and therefore cannot be easily ap-
proximated by a single function. As for the demand the
relationship between one day’s data set and the next is
nearly linear.
In Figure 5 the comparison between price and demand
Figure 3. Correlation between different days for price.
Figure 4. Correlation between different days for demand.
Figure 5. Price against demand.
shows an almost linear relationship up to some point at
which there is a sudden rise in price. This shows that
there is a significant change in the dominant factors that
determine the price at some level of demand, suggesting
that price can be regarded as having multiple trends for
which calculations can be done separately.
Published market data shows the power served, which
is almost always the power demanded. However, infor-
mation about the supplying generators is not readily
available publicly, yet this has more influence on the
price than the magnitude of demand. Normally, expensive
generators are dispatched last, so at high demand periods
the price-to-demand ratio tends to be much higher, as can be
seen in Figure 5.
3. System Configuration and Training
3.1. Artificial Neural Networks
An artificial neural network (ANN) is a model that emu-
lates the functional architecture of the human brain. In
this research a multi-layer perceptron ANN is adopted.
This ANN consists of: an input layer, hidden layers and
an output layer as shown in Figure 6. Except for the in-
put layer, each neuron receives a signal that is a linearly
weighted sum of the outputs from all the neurons in the
preceding layer.
Activation of neuron j in layer k is then defined as
$$u_j^{(k)} = f_j\left(\sum_{i} w_{ji}\, u_i^{(k-1)}\right) \tag{2}$$
Where, if neuron j is a hidden neuron,
$$f_j(x) = \frac{1}{1 + \exp(-x)}.$$
Otherwise, $f_j(x) = x$.
i covers all the neurons in the layer (k−1).
Note, the activation of the jth neuron in the kth layer,
uj(k) in (2), is only a function of the activations of the
neurons in the (k−1)th layer and the weights which connect
the jth neuron in the kth layer with the neurons in the
(k−1)th layer. The non-linearity fj(x) can be any monotonic
function differentiable in the x domain [14].
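The layer-by-layer calculation in (2) can be sketched in Python as follows. This is a minimal forward pass under stated assumptions: no bias terms are included because (2) does not show any, the sigmoid is applied to hidden neurons and the identity to output neurons, and the toy weights as well as the names `layer_output` and `forward` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function used for hidden neurons."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(prev_activations, weights, activation):
    """Activations of one layer.

    weights[j][i] connects neuron i of the previous layer to neuron j.
    """
    return [
        activation(sum(w * u for w, u in zip(row, prev_activations)))
        for row in weights
    ]


def forward(inputs, hidden_weights, output_weights):
    """One hidden layer (sigmoid) followed by a linear output layer."""
    hidden = layer_output(inputs, hidden_weights, sigmoid)
    return layer_output(hidden, output_weights, lambda x: x)


# Toy 2-2-1 network with hypothetical weights.
hidden_w = [[0.5, -0.5], [1.0, 1.0]]
output_w = [[1.0, -1.0]]
y = forward([1.0, 1.0], hidden_w, output_w)
```

Each output depends only on the activations of the preceding layer and the connecting weights, as noted for (2).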
3.2. Fuzzy Inference Systems
Fuzzy inference is an implementation of fuzzy logic in
which linguistic-like rules map the input onto the output
space without strict specification of the input [3,4].
Fuzzy Logic: if X is a universe of discourse with elements
denoted by x, then the fuzzy set A in X is defined
as a set of ordered pairs, A = {(x, µA(x)) | x ∈ X}. µA(x) is
called the membership function of x in A. Figure 7 shows
Figure 6. Multi layer perceptron type ANN.
Figure 7. Triangular membership function.
the triangular membership function used for three different
fuzzy sets: Low, Medium and High.
Fuzzy inference systems (FIS) use fuzzy rules (IF-THEN)
and fuzzy reasoning, an inference procedure that
obtains conclusions from a set of fuzzy rules and known
facts. Three conceptual components thus form the basic
structure of a fuzzy inference system: the rule base
(fuzzy rules selection), the database (membership functions)
and the reasoning mechanism (inference procedure).
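A triangular membership function of the kind shown in Figure 7 could be evaluated as in the sketch below. The break points of the Low, Medium and High sets are hypothetical values on a normalized [0, 1] axis, chosen only so that neighboring sets overlap; they are not the tuned values of the system, and the names `triangular` and `fuzzify` are assumptions.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 at the feet (left, right), 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets as (left foot, peak, right foot).
FUZZY_SETS = {
    "Low": (-0.5, 0.0, 0.5),
    "Medium": (0.0, 0.5, 1.0),
    "High": (0.5, 1.0, 1.5),
}


def fuzzify(x):
    """Degree of membership of x in each fuzzy set."""
    return {name: triangular(x, *abc) for name, abc in FUZZY_SETS.items()}


memberships = fuzzify(0.25)
```

A value of 0.25 belongs partly to Low and partly to Medium, which is the situation that arises for data lying between two clusters.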
3.3. System Configuration
A number of configurations based on the proposed approach
were set up. A system with cascaded processing
elements was set up as shown in Figure 8. Included
are data conditioning and classification blocks as part of
preprocessing and post-processing for the price estimation
path.
3.4. Short-Term Demand Estimation
Short-term load forecasting (STLF) is implemented
through a single stage ANN calculation process. Two
input configurations are taken for comparison. They both
use past demand data as input, supplied either concurrently
or sequentially. With sequential inputs, the input vectors
occur in a specific time order. With concurrent inputs, the
input ordering is not important as the inputs do not interact
with each other. A separate ANN was trained for each day of
the week.
3.5. Short-Term Price Estimation
For short-term price estimation the estimation system
was set up in two configurations, one with data clustering
and the other without, for comparison. The preprocessing
block in the first case (referred to as method-1) has logic
for data clustering and normalizing, i.e., mapping the
data onto the [0, 1] domain. The clusters are defined as
follows:
1) normal variations
2) sharply rising prices
3) sharply falling prices
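The normalizing step of method-1 is described only as a mapping onto the [0, 1] domain. One common way to obtain such a mapping is min-max scaling, sketched below in Python. Whether this exact scaling was used is an assumption, and the function names are hypothetical.

```python
def fit_min_max(values):
    """Return the (minimum, maximum) pair used for scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return lo, hi


def normalize(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    return [(v - lo) / (hi - lo) for v in values]


def denormalize(scaled, lo, hi):
    """Inverse mapping, for converting network outputs back to prices."""
    return [lo + s * (hi - lo) for s in scaled]


# Hypothetical prices within one cluster.
prices = [10.0, 20.0, 30.0]
lo, hi = fit_min_max(prices)
scaled = normalize(prices, lo, hi)
```

Scaling each cluster separately would keep the spike clusters from compressing the normal price range, although the source does not state which scope was used.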
Figure 8. System configuration.
Figure 9 shows the moving average and the derivative
taken for a typical weekday.
$$P(t) \in \begin{cases} A; & AD \le 2 \cdot MA \\ B; & AD > 2 \cdot MA \;\&\; FD > 0 \\ C; & AD > 2 \cdot MA \;\&\; FD < 0 \end{cases} \tag{3}$$
Where, AD - Actual price data
MA - moving average of price data
FD - first derivative of the price data
$$FD = \frac{\Delta P}{\Delta t} = \frac{P_n - P_{n-1}}{\Delta t}$$
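The cluster assignment in (3) can be sketched in Python as below. Several details are assumptions made for illustration only: a trailing moving average with a hypothetical window length, a backward difference for the first derivative, and a threshold of `factor` times the moving average with `factor = 2.0` separating normal prices from spikes. Points that meet neither spike condition are labeled A.

```python
def moving_average(prices, window):
    """Trailing moving average; shorter windows are used at the start."""
    averages = []
    for n in range(len(prices)):
        chunk = prices[max(0, n - window + 1):n + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages


def first_derivative(prices, dt=1.0):
    """Backward difference (P_n - P_{n-1}) / dt; the first sample gets 0."""
    return [0.0] + [
        (prices[n] - prices[n - 1]) / dt for n in range(1, len(prices))
    ]


def assign_clusters(prices, window=6, factor=2.0):
    """Label each sample A (normal), B (rising spike) or C (falling spike)."""
    ma = moving_average(prices, window)
    fd = first_derivative(prices)
    labels = []
    for price, average, slope in zip(prices, ma, fd):
        if price > factor * average and slope > 0:
            labels.append("B")
        elif price > factor * average and slope < 0:
            labels.append("C")
        else:
            labels.append("A")
    return labels


# Hypothetical price series with one spike.
series = [20.0] * 6 + [300.0, 250.0, 20.0]
labels = assign_clusters(series)
```

Because the moving average changes from day to day, the cut-off points between clusters move as well, which is what creates the intersection zones handled by the fuzzy inference block.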
In the second configuration, method-2, instead of
clustering, preprocessing is done by logarithmic
conditioning as in (4).
$$P_{in}(t) = \begin{cases} P(t); & P(t) \le UL \\ UL + UL \cdot \log\left(\dfrac{P(t)}{UL}\right); & P(t) > UL \end{cases} \tag{4}$$
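The logarithmic conditioning of method-2 and a matching recovery step can be sketched as a pair of inverse functions. The sketch assumes that UL is an upper price limit above which prices are compressed, that the logarithm is the natural logarithm, and that recovery is the exact inverse of conditioning; the function names and the numerical values are hypothetical.

```python
import math


def condition(price, upper_limit):
    """Leave prices up to the limit unchanged; compress higher prices."""
    if price <= upper_limit:
        return price
    return upper_limit + upper_limit * math.log(price / upper_limit)


def recover(value, upper_limit):
    """Inverse of condition(): expand values above the limit again."""
    if value <= upper_limit:
        return value
    return upper_limit * math.exp((value - upper_limit) / upper_limit)


# Hypothetical spike of 500 with an upper limit of 100.
damped = condition(500.0, 100.0)
restored = recover(damped, 100.0)
```

The conditioned series stays continuous at the limit, so the network sees damped spikes rather than discarded outliers.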
The post-processing block for method-1 contains the fuzzy
inferencing logic for the data cluster intersections. The
choice of membership function is influenced by the need for
simplicity, convenience, speed and efficiency. The triangular
membership function was chosen for these reasons.
For the second configuration, the post-processing block has
the logic for data recovery, shown in (5):
$$P(t) = \begin{cases} P_{out}(t); & P_{out}(t) \le UL \\ UL \cdot \exp\left(\dfrac{P_{out}(t) - UL}{UL}\right); & P_{out}(t) > UL \end{cases} \tag{5}$$
Figure 9. Actual price, first derivative and moving average.
3.6. Training
The proposed system was set up in the various configurations
in a Matlab environment. The ANN blocks were
initialized with randomly generated numbers as weights and
then trained on past data. Back propagation with the gradient
descent method was used for updating the weights. The
training performance measure used is the mean square
error, MSE:
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(V_a(i) - V_f(i)\right)^{2} \tag{6}$$
The optimum number of training vectors/patterns required
to classify test examples with an error limit of δ is
approximately equal to the number of weights in the network
multiplied by the inverse of the error limit [15].
This was used as a guideline in preparing the input data
for training. The training performance for the three data
clusters A, B and C is shown in Figure 10.
A triangular membership function was used for the fuzzy
inference system (FIS), and the rules were developed
incrementally in the tuning process. The inputs supplied to
the FIS block are the estimated price for the intersection
region and the estimated demand for the target hours.
Tuning the fuzzy inference system was done by
changing the rule antecedents or conclusions and by changing
the centers of the input and/or output membership functions,
based on the error between the target output and the
calculated value. No predetermined method was used for
the tuning; a trial and error approach was followed, seeking
to modify only the components pulling the output in the
direction of the error.
Figure 10. ANN training for price estimation.
4. Simulation and Results
After training, the different network configurations were
tested with data sets not used for training to measure
the performance. The platform for the simulation was a
Windows desktop computer with 2 GB of RAM. The post
training computer executions during simulations were
almost instantaneous, although speed was not the most
critical factor in this study as the system is for offline,
not real-time, forecasting. The following sections show the
test results and analysis. The performance measure used
is the mean absolute percentage error, MAPE, defined as:
$$\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\frac{\left|V_a(i) - V_f(i)\right|}{V_a(i)} \times 100\% \tag{7}$$
In (6) and (7), $V_a(i)$ is the actual value, $V_f(i)$ is the
forecasted value and N is the number of samples.
Table 1 shows the estimation performance for demand
forecasting in the two configuration variations.
Table 1. Demand forecasting results.
Input vector elements                        Network size      MAPE (%)
2 weeks past demand                          96 – 72 – 48      2.3
3 weeks past demand                          144 – 96 – 48     3.5
4 weeks past demand                          192 – 144 – 48    10.3
Past demand values for 3 immediate hours,    6 – 3 – 1         2.5
demand values for the same hour as the       6 – 4 – 1         2.02
target hour from 3 previous weeks            6 – 5 – 1         2.43
                                             6 – 7 – 1         2.25
                                             6 – 8 – 1         2.21
                                             6 – 10 – 1        6.09
The results in Table 1 show that demand estimation in
the short term can be achieved with good accuracy
through a well trained ANN. The difference between the two
variations is mostly due to the network size; the higher
dimension networks require a lot of training data before
they are able to generalize.
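The error measures in (6) and (7), which underlie the MAPE figures reported in Table 1, can be computed as in the following sketch. It assumes that the two quantities compared are the actual and the forecasted values, that MAPE is expressed in percent, and that no actual value is zero; the function names and sample numbers are hypothetical.

```python
def mse(actual, forecast):
    """Mean square error between actual and forecasted values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical demand values and forecasts.
actual_demand = [100.0, 200.0]
forecast_demand = [110.0, 190.0]
error_percent = mape(actual_demand, forecast_demand)
```

Because MAPE divides by the actual value, it is convenient for comparing clusters with very different price levels, whereas MSE weights large spike errors more heavily during training.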
Figures 11 and 12 show the results for the estimation
plotted against the target for demand. The main reasons
for the estimation performance shown for demand with a
single stage ANN are: the consistency in trends for the
demand data and the grouping/clustering of demand data
into similar weekdays.
Results for price estimation using method-1 are shown
in Tables 2-4 for each data cluster. For the 5-input
networks, the input vectors comprise the estimated demand
at the target hour and the values for the same hour as the
target from the 4 previous weeks. For the 9-input networks,
price values for the 4 immediate past hours are also added.
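The composition of the 5-input and 9-input vectors can be illustrated with the sketch below. It assumes that the same-hour values are past prices, that the price history is a flat list indexed by sample with the target located at index `target`, and that one week spans `samples_per_week` samples; the function name and the example numbers are hypothetical.

```python
def build_price_input(prices, estimated_demand, target, samples_per_week,
                      n_weeks=4, n_recent=0):
    """Assemble one input vector for the price-estimation network.

    Layout: [estimated demand, same-hour prices from the previous n_weeks,
    prices of the n_recent samples immediately before the target].
    """
    max_lag = max(n_weeks * samples_per_week, n_recent)
    if target - max_lag < 0 or target > len(prices):
        raise ValueError("not enough price history for the requested lags")
    same_hour = [
        prices[target - w * samples_per_week] for w in range(1, n_weeks + 1)
    ]
    recent = [prices[target - h] for h in range(1, n_recent + 1)]
    return [estimated_demand] + same_hour + recent


# Hypothetical hourly history (168 samples per week); the target is the
# sample immediately after the end of the history.
history = [float(i) for i in range(700)]
x5 = build_price_input(history, 9000.0, target=700, samples_per_week=168)
x9 = build_price_input(history, 9000.0, target=700, samples_per_week=168,
                       n_recent=4)
```

The 9-input variant simply appends the four most recent prices to the 5-input vector, so the two network families can share the same preprocessing.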
Results shown in the tables also illustrate the effect of
changing the size of the input vector on performance.
Forecasting accuracy for cluster A is significantly better
than for the other clusters because it represents hours of
steady variation in price. Figures 13-15 illustrate the
level of agreement between the forecasted values and the
target for each cluster. Figure 16 shows the forecasted
values for hours falling in the intersection of cluster A
and cluster B.
Results for forecasting using method-2 are shown in
Table 5. Increasing the input vector size by increasing the
number of weeks of past data initially provides a wider
variety of data to the ANN, but beyond a certain network
size the ANN loses the ability to generalize. Figure 17 also
shows the output obtained with this method.
Figure 11. Demand estimation network size [6-4-1] with
sigmoid transfer function.
Figure 12. Demand estimation linear network [6-4-1].
Table 2. Forecasting results for cluster A.
Network size    MAPE (%)    Network size    MAPE (%)
5 – 5 – 1       1.065       9 – 5 – 1       1.870
5 – 8 – 1       0.598       9 – 8 – 1       0.914
5 – 11 – 1      1.312       9 – 11 – 1      0.704
5 – 13 – 1      0.944       9 – 13 – 1      1.343
5 – 14 – 1      0.813       9 – 14 – 1      1.451
Table 3. Forecasting results for cluster B.
Network size    MAPE (%)    Network size    MAPE (%)
5 – 5 – 1       3.0         9 – 5 – 1       9.7
5 – 7 – 1       6.3         9 – 7 – 1       16.7
5 – 9 – 1       4.6         9 – 9 – 1       17.2
5 – 11 – 1      3.6         9 – 11 – 1      9.9
5 – 13 – 1      6.6         9 – 13 – 1      23.1
Table 4. Forecasting results for cluster C.
Network size    MAPE (%)    Network size    MAPE (%)
5 – 7 – 1       3.3         9 – 7 – 1       13.7
5 – 9 – 1       2.0         9 – 9 – 1       7.3
5 – 11 – 1      2.4         9 – 11 – 1      9.2
5 – 13 – 1      1.9         9 – 13 – 1      9.1
5 – 15 – 1      2.6         9 – 15 – 1      16.5
Figure 13. Price estimation method-1: cluster A.
Figure 14. Price estimation method-1: cluster B.
Figure 15. Price estimation method-1: cluster C.
Figure 16. Price estimation method-1: intersection AB.
Table 5. Price forecasting results using method-2.
Case    Input vector elements                    Network size      MAPE (%)
1       Forecasted demand, 1 week past price     96 – 72 – 48      18
2       Forecasted demand, 2 weeks past price    144 – 96 – 48     21
3       Forecasted demand, 3 weeks past price    192 – 144 – 48    8
4       Forecasted demand, 4 weeks past price    240 – 192 – 48    17
Figure 17. Price estimation method-2.
5. Conclusions
Electricity price in the short-term shows high volatility,
making it difficult to predict its future value, but with the
proposed approach significant improvement in the
forecasting performance has been observed. One key element
in the processing is selection of input data and careful
preprocessing. Data clustering into ranges in which
observable trends are maintained allows for the extraction of
such trends using a separate ANN trained for each specific
cluster. Data samples that cannot fit into the strictly
defined clusters form intersection zones that, in addition to
ANN, are processed using fuzzy inference.
Electricity demand shows smoothly varying trends
such that a single stage, well tuned ANN block
managed to achieve good accuracy. The two configuration
options tested gave comparable performance.
However, the second configuration, which uses large input
vectors comprising complete days' data, requires more
computational effort, so the first configuration is
considered optimum.
For electricity price estimation, data clustering gives a
comparatively much higher accuracy for the normal price
region and a significantly better result for the times
characterized by spikes. It is therefore concluded that the
way to approach electricity price forecasting while
maintaining all the data is to use a method that employs
data clustering even within a day’s data set.
6. References
[1] M. Shahidehpour, H. Yamin and Z. Li, “Market Opera-
tions in Electric Power Systems,” John Wiley & Sons,
Chichester, 2002. doi:10.1002/047122412X
[2] A. K. Topalli, I. Erkmen and I. Topalli, “Intelligent
Short-term Load Forecasting in Turkey,” Electrical Pow-
er and Energy Systems, Vol. 28, 2006, pp. 437-447. doi:
10.1016/j.ijepes.2006.02.004
[3] R. C. Garcia, et al., “GARCH Forecasting Model to Pre-
dict Day-ahead Electricity Prices,” IEEE Transactions on
Power Systems, Vol. 20, No. 2, May 2005, pp. 867-874.
doi:10.1109/TPWRS.2005.846044
[4] M. Stevenson, “Filtering and Forecasting Spot Electricity
Prices in the Increasingly Deregulated Australian Elec-
tricity Market,” Quantitative Finance Research Centre,
University of Technology, Sydney, 2001.
[5] N. Hubele, et al., “Identification of Seasonal Short-term
Load Forecasting Models Using Statistical Decision
Functions,” IEEE Transactions on Power Systems, Vol. 5,
No. 1, 1990, pp. 40-5. doi:10.1109/59.49084
[6] M. El-Hawary, et al., “Short-Term Power System Load
Forecasting Using the Iteratively Reweighted Least
Squares Algorithm,” Electrical Power Systems Research,
Vol. 19, 1990, pp. 11-22. doi:10.1016/0378-7796(90)90003-L
[7] V. S. Kodogiannis and E. M. Anagnostakis, “A Study of
Advanced Learning Algorithms for Short-term Load
Forecasting,” Engineering Applications of Artificial
Intelligence, Vol. 12, 1999, pp. 159-173.
doi:10.1016/S0952-1976(98)00064-5
[8] G.-C. Liao and T.-P. Tsao, “Application of Fuzzy Neural
Networks and Artificial Intelligence for Short-term load
Forecasting,” Electrical Power Systems Research, Vol.
70, 2004, pp. 237-244. doi:10.1016/j.epsr.2003.12.012
[9] H. Yamin, M. Shahidehpour and Z. Li, “Adaptive
short-term Price Forecasting using artificial Neural Net-
works in the Restructured Power Markets,” Electrical
Power and Energy Systems, Vol. 26, 2004, pp. 571-581.
doi:10.1016/j.ijepes.2004.04.005
[10] P. Mandal, T. Senjyu, N. Urasaki and T. Funabashi, “A
Neural Network Based Several-Hour-Ahead Electric
Load Forecasting using Similar Days Approach,” Elec-
trical Power and Energy Systems, Vol. 28, 2006, pp.
367-373. doi:10.1016/j.ijepes.2005.12.007
[11] S. Rahman and R. Bhatnager, “An Expert System based
Algorithm for Short Term Load Forecast,” IEEE Trans-
actions on Power Systems, Vol. 3, No. 2, 1988, pp. 392-
399. doi:10.1109/59.192889
[12] Q. Lu, et al., “An Adaptive Nonlinear Predictor with
Orthogonal Escalator Structure for Short-term Load Fo-
recasting,” IEEE Transactions on Power Systems, Vol. 4,
No. 1, 1989, pp. 158-164. doi:10.1109/59.32473
[13] I. Moghram and S. Rahman, “Analysis and Evaluation of
Five Short-Term Load Forecasting Techniques,” IEEE
Transactions on Power Systems, Vol. 4, No. 4, 1989, pp.
1484-1491. doi:10.1109/59.41700
[14] M. Ikeda and T. Hiyama, “ANN Based Designing and
Cost Determination System for Induction Motor,” IEEE
Proceeding Electrical Power Application, Vol. 152, No.
6, 2005, pp. 1595-1602. doi:10.1049/ip-epa:20050173
[15] E. Baum and D. Haussler, “What Net Size Gives Valid
Generalization?” Neural Computation, Vol. 1, No. 1,
1989, pp. 151-160. doi:10.1162/neco.1989.1.1.151