Applied Mathematics
Vol.05 No.10(2014), Article ID:46520,10 pages
10.4236/am.2014.510141
Modeling and Design of Real-Time Pricing Systems Based on Markov Decision Processes
Koichi Kobayashi1*, Ichiro Maruta2, Kazunori Sakurama3, Shun-ichi Azuma2
1School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan
2Graduate School of Informatics, Kyoto University, Kyoto, Japan
3Graduate School of Engineering, Tottori University, Tottori, Japan
Email: *k-kobaya@jaist.ac.jp
Copyright © 2014 by authors and Scientific Research Publishing Inc.
This work is licensed under the Creative Commons Attribution International License (CC BY).
http://creativecommons.org/licenses/by/4.0/



Received 2 April 2014; revised 2 May 2014; accepted 9 May 2014
ABSTRACT
A real-time pricing system of electricity is a system that charges different electricity prices for different hours of the day and for different days, and is effective for reducing the peak and flattening the load curve. In this paper, using a Markov decision process (MDP), we propose a modeling method and an optimal control method for real-time pricing systems. First, the outline of real-time pricing systems is explained. Next, a model of a set of customers is derived as a multi-agent MDP. Furthermore, the optimal control problem is formulated, and is reduced to a quadratic programming problem. Finally, a numerical simulation is presented.
Keywords:
Markov decision process, Optimal control, Real-time pricing system

1. Introduction
In recent years, there has been growing interest in energy and the environment. For energy and environmental problems such as energy saving, several approaches have been studied (see, e.g., [1] [2]). In this paper, we focus on real-time pricing systems of electricity. A real-time pricing system of electricity is a system that charges different electricity prices for different hours of the day and for different days, and is effective for reducing the peak and flattening the load curve (see, e.g., [3] - [6]). In general, a real-time pricing system consists of one controller, which decides the price at each time, and multiple electric customers such as commercial facilities and homes. If electricity conservation is needed, then the price is set to a high value. Since the economic burden on customers then becomes high, customers conserve electricity. Thus, electricity conservation is achieved. In the existing methods, the price at each time is given by a simple function of quantities such as power consumption and voltage deviation (see, e.g., [6]). In order to realize more precise pricing, it is necessary to use a mathematical model of customers.
In this paper, using a Markov decision process (MDP), we propose a mathematical model of real-time pricing systems. Since in many cases the status of electricity conservation of customers is discrete and stochastic, it is appropriate to use an MDP. A set of electricity customers is then modeled by a multi-agent MDP. Furthermore, we consider the finite-time optimal control problem. By appropriately setting the cost function, customers can be induced to conserve electricity actively. This problem can be used in model predictive control, a control method in which the finite-time optimal control problem is solved at each time. In addition, the finite-time optimal control problem can be reduced to a quadratic programming problem. The proposed approach provides a basis for real-time pricing systems.
This paper is organized as follows. In Section 2, the outline of real-time pricing systems is explained. In Section 3, a model of electricity customers is derived. In Section 4, the optimal control problem is formulated, and its solution method is derived. In Section 5, a numerical simulation is shown. In Section 6, we conclude this paper.
Notation: Let $\mathcal{R}$ denote the set of real numbers. Let $I_n$, $0_{m \times n}$ denote the $n \times n$ identity matrix, the $m \times n$ zero matrix, respectively. For simplicity, we sometimes use the symbol $0$ instead of $0_{m \times n}$, and the symbol $I$ instead of $I_n$. For two events $A$, $B$, let $E[A \mid B]$ denote the conditional expected value of $A$ under the event $B$.
2. Outline of Real-Time Pricing Systems
In this section, we explain the outline of real-time pricing systems studied in this paper.
Figure 1 shows an illustration of the real-time pricing systems studied in this paper. The system consists of one controller and multiple electric customers such as commercial facilities and homes. We suppose that each customer can monitor the status of electricity conservation of other customers. In other words, the status of one customer affects that of other customers. For example, for commercial facilities, we suppose that the status of rival facilities can be checked through their lighting, blogs, Twitter, and so on. Depending on power consumption, i.e., the status of electricity conservation, the controller determines the price at each time. If electricity conservation is needed, then the price is set to a high value. Since the economic burden on customers then becomes high, customers conserve electricity. Thus, electricity conservation is achieved.
In this paper, the status of electricity conservation of each customer is modeled by a Markov decision process (MDP). Then a set of customers is modeled by a multi-agent MDP (MA-MDP). Furthermore, by using the obtained MA-MDP model, we consider the optimal control problem and its solution method.
Figure 1. Illustration of real-time pricing systems.
3. Model of Customers
First, consider modeling the dynamics of each customer by a one-dimensional MDP. The value of the state 
is randomly chosen among the finite set
. The element of 
electricity conservation, and “


where 


the probability that the state is 

The transition probability matrix 
The control input is determined under the condition for each element:

and the condition for each column:

Next, consider modeling the dynamics of a set of customers by an MA-MDP. The number of customers is
given by


given by
Then, we suppose that the MA-MDP model expressing the dynamics of a set of customers is given by

where 
condition:

For simplicity of discussion, coupling terms are given by
some condition corresponding to (5).
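The element-wise and column-wise conditions on the transition probability matrix can be illustrated with a short sketch. It assumes, as an illustration only, that the probability distribution evolves as pi(t+1) = P pi(t) with every element of P in [0, 1] and every column of P summing to one. The function names, the 3-state matrix `P`, and the initial distribution `pi0` are hypothetical and are not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]


def is_column_stochastic(P: Matrix, tol: float = 1e-9) -> bool:
    """Check that P is square, 0 <= P[i][j] <= 1, and each column sums to 1."""
    n = len(P)
    if any(len(row) != n for row in P):
        return False
    elements_ok = all(-tol <= p <= 1.0 + tol for row in P for p in row)
    columns_ok = all(
        abs(sum(P[i][j] for i in range(n)) - 1.0) <= tol for j in range(n)
    )
    return elements_ok and columns_ok


def step(P: Matrix, pi: List[float]) -> List[float]:
    """One-step update pi(t+1) = P pi(t) of the state distribution (assumed form)."""
    return [sum(P[i][j] * pi[j] for j in range(len(pi))) for i in range(len(P))]


# Hypothetical 3-state example: column j holds the probabilities of leaving state j.
P = [
    [0.7, 0.2, 0.0],
    [0.3, 0.6, 0.5],
    [0.0, 0.2, 0.5],
]
pi0 = [1.0, 0.0, 0.0]  # the customer starts in the first state with probability 1

assert is_column_stochastic(P)
pi1 = step(P, pi0)
print(pi1, sum(pi1))
```

If both conditions hold, the updated vector remains a probability distribution, which is why the conditions must be imposed on any matrix that the control input modifies.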
4. Optimal Control
4.1. Problem Formulation
Consider the following problem.
Problem 1. Suppose that for the MA-MDP model (4) expressing the dynamics of customers, the initial state




input sequence



subject to the following constraint:

where 


Hereafter, for simplicity of notation, the condition 
By using the constraint (7), the input constraint such as 
adjusting


4.2. Solution Method
We derive a solution method for Problem 1. First, consider the MDP model (1). The MDP model is a class of nonlinear systems. However, in this case, it can be transformed into a linear system. The MDP model (1) can be rewritten as
where
By the property of the probability distribution, the relation 

where
Next, by using the linear system (8), consider representing the MA-MDP model (4) as a linear system. The linear system for the customer 
Then, the MA-MDP model (4) can be equivalently transformed into the following linear system:

where
Finally, consider the cost function (6). Define
Then we can obtain
Therefore, the cost function (6) can be rewritten as

From the above discussion, Problem 1 is equivalent to the following problem.
Problem 2
Problem 2 is reduced to a quadratic programming (QP) problem, and can be solved by a suitable solver such as MATLAB or IBM ILOG CPLEX. In addition, if
linear programming (LP) problem (we remark that 
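To make the notion of a QP problem concrete, the following minimal sketch solves a small convex QP with box constraints by projected gradient descent. It is not the solver used in the paper and does not reproduce the constraints of Problem 2; the matrix `Q`, the vector `c`, the bounds, and the step size are hypothetical values chosen for illustration. In practice a dedicated QP solver would be used.

```python
from typing import List


def solve_box_qp(
    Q: List[List[float]],
    c: List[float],
    lower: List[float],
    upper: List[float],
    step: float,
    iters: int = 500,
) -> List[float]:
    """Minimize 0.5 * x'Qx + c'x subject to lower <= x <= upper.

    Projected gradient descent: take a gradient step, then clip each
    component back into its interval. Q is assumed symmetric positive
    definite and step is assumed to be at most 1 / (largest eigenvalue of Q).
    """
    n = len(c)
    # Start from the point of the box closest to the origin.
    x = [min(max(0.0, lower[i]), upper[i]) for i in range(n)]
    for _ in range(iters):
        grad = [sum(Q[i][j] * x[j] for j in range(n)) + c[i] for i in range(n)]
        x = [min(max(x[i] - step * grad[i], lower[i]), upper[i]) for i in range(n)]
    return x


# Hypothetical data: Q has eigenvalues 1 and 3, so step = 1/3 is admissible.
Q = [[2.0, 1.0], [1.0, 2.0]]
c = [-1.0, -4.0]
x_opt = solve_box_qp(Q, c, lower=[0.0, 0.0], upper=[1.0, 1.0], step=1.0 / 3.0)
print(x_opt)  # close to [0, 1], where the optimality conditions hold
```

The unconstrained minimizer of this example lies outside the box, so both bounds are active at the solution; this mirrors how probability bounds on the decision variables can shape the optimum of a constrained QP.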
5. Numerical Example
Since it is difficult to use data from real systems, we present an artificial example. The state is chosen among the finite set


system for the consumer 
The parameters 
The parameters







From

addition, the input constraint 
In this numerical example, we consider the following two cases:
• Case (i): The price for each customer is the same (i.e., 
• Case (ii): The price for each customer is different.
Case (i) is the conventional case in real-time pricing systems. In Case (ii), we suppose that the difference in the price is covered by using local currencies such as the Eco-point system [8] and the Eco-money system [9]. These systems were introduced in Japan to stimulate the economy and raise awareness of global warming. In the Eco-point system, points, which correspond to money in a local currency, are awarded for products that are effective from the viewpoints of electricity conservation and the environment. Such a system for energy management systems has been discussed in [10].
Next, we present the computational results. First, the computational result in Case (i) is explained. Figures 2-6 show the probability distribution for each customer. From these figures, we see that 


electricity maximally, with a certain probability. Furthermore, the optimal value of the cost function is
Figure 2. π1(t) in Case (i).
Figure 3. π2(t) in Case (i).
Figure 4. π3(t) in Case (i).
Figure 5. π4(t) in Case (i).
Next, the computational result in Case (ii) is explained. Figures 7-11 show the probability distribution for each customer. Comparing Figures 2-6 with Figures 7-11, we see that transient responses of 
improved in Case (ii). In particular, for the customer

Figure 6 and Figure 11). Furthermore, the optimal value of the cost function is 84.5057, and we see that the optimal value of the cost function is improved. The optimal control input

Figure 6. π5(t) in Case (i).
Figure 7. π1(t) in Case (ii).
Figure 8. π2(t) in Case (ii).
Figure 9. π3(t) in Case (ii).
Figure 10. π4(t) in Case (ii).
From these values, we see that in the steady state, 


Figure 11. π5(t) in Case (ii).
6. Conclusions
In this paper, we have proposed a modeling method and an optimal control method for real-time pricing systems using an MDP-based approach. In many cases, the status of electricity conservation of customers is discrete and stochastic, and the use of the MDP model is effective. A real-time pricing system is modeled by a multi-agent MDP, and the optimal control problem is reduced to a QP problem. Furthermore, a numerical simulation has been shown. The proposed method provides a new method for real-time pricing of electricity.
There are several open problems. First, it is important to develop an identification method for the MA-MDP model based on existing results for MDPs (see, e.g., [11]). Since the effect of couplings between customers was simplified, it is also important to model it more precisely. Next, the optimal control problem is reduced to a QP problem or an LP problem. These problems can be solved faster than a combinatorial optimization problem such as a mixed integer programming problem. However, for large-scale systems, the computation time for solving the optimal control problem will be long. Therefore, it is important to develop a distributed algorithm.
Acknowledgements
This research was partly supported by JST, CREST.
References
[1] Camacho, E.F., Samad, T., Garcia-Sanz, M. and Hiskens, I. (2011) Control for Renewable Energy and Smart Grids. In: Samad, T. and Annaswamy, A.M., Eds., The Impact of Control Technology, IEEE Control Systems Society, New York.
[2] Ruihua, Z., Yumei, D. and Yuhong, L. (2010) New Challenges to Power System Planning and Operation of Smart Grid Development in China. Proceedings of the 2010 International Conference on Power System Technology, Hangzhou, 24-28 October 2010, 1-8.
[3] Borenstein, S., Jaske, M. and Rosenfeld, A. (2002) Dynamic Pricing, Advanced Metering, and Demand Response in Electricity Markets. Center for the Study of Energy Markets, University of California, Berkeley.
[4] Roozbehani, M., Dahleh, M. and Mitter, S. (2010) On the Stability of Wholesale Electricity Markets under Real-Time Pricing. Proceedings of the 49th IEEE Conference on Decision and Control, Atlanta, 15-17 December 2010, 1911-1918.
[5] Samadi, P., Mohsenian-Rad, A.-H., Schober, R., Wong, V.W.S. and Jatskevich, J. (2010) Optimal Real-Time Pricing Algorithm Based on Utility Maximization for Smart Grid. Proceedings of the 1st IEEE International Conference on Smart Grid Communications, Gaithersburg, 4-6 October 2010, 415-420.
[6] Vivekananthan, C., Mishra, Y. and Ledwich, G. (2013) A Novel Real Time Pricing Scheme for Demand Response in Residential Distribution Systems. Proceedings of the 38th Annual Conference of the IEEE Industrial Electronics Society, Montreal, 25-28 October 2012, 1954-1959.
[7] Bello, D. and Riano, G. (2006) Linear Programming Solvers for Markov Decision Processes. Proceedings of the 2006 IEEE Systems and Information Engineering Design Symposium, Charlottesville, 28 April 2006, 90-95. http://dx.doi.org/10.1109/SIEDS.2006.278719
[8] Eco-Point System for Housing. http://www.vec.gr.jp/english/topics/100217_1.htm
[9] Kyoto Eco Money. http://www.city.kyoto.jp/koho/eng/topics/2012_8/index.html
[10] Sawashima, K., Kubota, Y., Lu, H., Takemae, T., Yoshida, K. and Wan, Y. (2011) Socio-Personal Energy Management System. Keio ALPS2011 Group K Final Report. http://lab.sdm.keio.ac.jp/alps2011k/FinalReport-ALPS2011-K.pdf
[11] Rust, J. (1994) Structural Estimation of Markov Decision Processes. In: Handbook of Econometrics, Elsevier, Amsterdam, Vol. IV, Chapter 51, 3081-3143.
NOTES
*Corresponding author.