In this paper, an autonomous and distributed demand-side management strategy based on Bayesian game theory is developed and applied among users in a grid-connected micro-grid with storage. To derive the strategy, the energy consumption of the shiftable loads belonging to a given user is modelled as a noncooperative three-player game of incomplete information, in which each user plays against the storage unit and an opponent aggregating all the other users in the micro-grid. Each player is assumed to be endowed with statistical information about its own behavior and that of its opponents, so that it can take the actions that maximize its expected utility. The proposed strategy is evaluated by simulating, in the MATLAB environment, a grid-connected micro-grid with a storage device; the results show its efficacy when it is employed to manage the charging of electric vehicles.

Demand-Side Management (DSM), which is the management mechanism of demand side in the next generation of the grid [

Research in Bayesian game is much going beyond the game with complete or incomplete information. The authors of [

The rest of the manuscript is organized as follows. The MG model considered in our work is described in System Model. A novel DSM strategy based on game theory is developed in Demand-Side Management based on Bayesian Game Theory. Some performance results are illustrated in Numerical Results, where its use in the management of the recharge of PHEVs in a MG is analyzed.

In this study we consider a low-voltage MG consisting of $N$ residential users, indexed by $n \in \mathcal{N} \triangleq \{1, \cdots, n, \cdots, N\}$ with $N = |\mathcal{N}|$, each equipped with RE sources (e.g., solar PV panels). Users are connected to each other and to the public utility via power lines. The residential consumers gathered in the MG community share their surplus energy by storing it in a shared ESU managed by a controller, and act as a single entity when interacting with the public utility. Each user has two types of power loads: ULs and SLs. ULs are appliances that can be turned on at arbitrary instants of the day, i.e. their energy consumption schedule is strictly constrained; this category contains appliances such as refrigerator-freezers, heating, electric stoves and lighting [

Furthermore, each household in the MG is assumed to be equipped with an SM, which controls and monitors energy sharing and electricity consumption. Each household's SM also exchanges information with the other SMs via a data network (RE forecasts, energy prices and the customers' demands at every instant) and can obtain information about the energy available in the storage unit. We assume that the communication between the MG and the power utility is supervised by an MSM, i.e. an upgraded SM adapted for operating at high power and serving as the intermediate link between the NAN and the main grid global network (BN). The architecture of the proposed MG is shown in

At every instant of time $t$, each household $n \in \mathcal{N}$ is characterized by the following powers: the renewable power produced by its own RESs and the power demanded by its appliances (SLs and ULs). The real-time power exchanged by each customer with the MG is evaluated as follows:

$$l_n(t) = l_n^{(r)}(t) + l_n^{(s)}(t). \tag{1}$$

Equation (1) gives the instantaneous power $l_n(t)$ exchanged by user $n$ with the MG at time $t$. It is the sum of $l_n^{(r)}(t)$, which accounts for both the power from the user's RESs and the power absorbed by its ULs, and $l_n^{(s)}(t)$, which depends on the activation of its SLs. This power can be positive (if the user is absorbing power from the MG) or negative (if the user is supplying power to the MG), and it is constrained by the following inequality:

$$P_{g,\max}^{(n)} < l_n(t) < L_{a,\max}^{(n)} \tag{2}$$

where $P_{g,\max}^{(n)}$ is the maximum power generated by the $n$-th user's renewable resources (a non-positive value in this sign convention) and $L_{a,\max}^{(n)}$ is the maximum power consumed by the same user.

The battery model requires some clarification concerning its power exchange. The battery's controller provides real-time monitoring of the power $p_b(t)$ exchanged by the battery with the MG at time $t$; this quantity is positive (negative) if the battery is charging from (discharging to) the MG and satisfies the following inequality at any instant $t$:

$$P_{dch,\max}^{(b)} < p_b(t) < P_{ch,\max}^{(b)} \tag{3}$$

where $P_{dch,\max}^{(b)}$ and $P_{ch,\max}^{(b)}$ represent, respectively, the maximum power that can be discharged from the battery and the maximum power that can be absorbed when charging it.

The overall power monitored by the MSM (see

follows:

$$l_T(t) = \sum_{n=1}^{N} l_n(t) + p_b(t). \tag{4}$$

The overall power is bounded by $P_{pu}^{(inj)}$, the (negative) maximum power that can be injected into the main grid, and $P_{pu}^{(abs)}$, the (positive) maximum power that can be absorbed from the public utility, which leads to the following inequality:

$$P_{pu}^{(inj)} < l_T(t) < P_{pu}^{(abs)}. \tag{5}$$

The maximum power that can be injected into the public utility is a negative value given by the sum of the maximum powers generated by all RESs and the maximum power that the battery can deliver when discharging; it is expressed as follows:

$$P_{pu}^{(inj)} \triangleq \sum_{n=1}^{N} P_{g,\max}^{(n)} + P_{dch,\max}^{(b)} < 0. \tag{6}$$
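As a minimal sketch of the power balance in Equations (4)-(6), the following Python functions compute the overall power and the injection bound and check the constraint of Equation (5). The function names and the three-user numbers are illustrative assumptions, not values from the simulations.

```python
def total_power(user_powers, battery_power):
    """Overall power l_T of Eq. (4): sum of the users' powers plus the battery power."""
    return sum(user_powers) + battery_power


def max_injected_power(max_generated, max_discharge):
    """Bound P_pu^(inj) of Eq. (6); every argument is expected to be <= 0."""
    return sum(max_generated) + max_discharge


def within_utility_limits(l_total, p_inj, p_abs):
    """Strict double inequality of Eq. (5)."""
    return p_inj < l_total < p_abs


# Hypothetical three-user example, powers in kW (positive = absorbed from the MG).
users = [1.0, -0.5, 2.0]
l_total = total_power(users, battery_power=0.5)
p_inj = max_injected_power([-3.0, -3.0, -3.0], max_discharge=-4.0)
print(l_total, p_inj, within_utility_limits(l_total, p_inj, p_abs=12.0))
```

The sign convention follows the text: absorbed power is positive, generated or discharged power is negative.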

In this Section a brief description of our game model is provided and, on the basis of this model, a mixed strategy for the activation of SLs is developed.

The SM installed at each prosumer’s premise is considered as player (taken as player 1 in the following) behaving in a selfish and rational manner, capable of turning the load on or off. Furthermore, as this SM competes with the rest of the MG community in the exploitation of the energy resources available, we can model other N − 1 prosumers as a single aggregated opponent called player 2 and the shared battery as player 3 because it will be competing with all the prosumers for charging and discharging.

The power flow for player 2 is defined as the difference between the overall power available in the MG and power for player 1 and player 3; it can be expressed as follows:

$$l_{-n}(t) \triangleq l_T(t) - l_n(t) - p_b(t) = \sum_{\substack{m=1 \\ m \neq n}}^{N} l_m(t) \tag{7}$$

which satisfies the following inequality:

$$P_{g,\max}^{(-n)} < l_{-n}(t) < L_{a,\max}^{(-n)} \tag{8}$$

where $L_{a,\max}^{(-n)} > 0$ and $P_{g,\max}^{(-n)} \le 0$ are, respectively, the maximum powers absorbed and generated by player 2. Note that $l_{-n}(t) > 0$ when player 2 is absorbing power from the MG and $l_{-n}(t) < 0$ when player 2 is providing power to the MG.

These definitions reduce a complicated game of $N + 1$ players to a simplified three-player game, which can be used to describe the interactions between each player and the rest of the MG community.

From each player's point of view, there is a payoff associated with each of its actions. The evaluation of the payoffs gives a full description of the game. In our work we assume that:

• For prosumers, the action of keeping the SLs off is associated with a payoff equal to 0 for the corresponding prosumer, regardless of the power absorbed or generated by the other prosumers and the battery.

• The activation of SLs entails a variation in the payoff of the corresponding prosumer because it changes the operating conditions of the MG; the associated payoff $EP_n$ therefore depends on the expected (statistical) future consumption/generation of the whole MG community.

• The payoff of the battery (player 3) depends on its charging and discharging efficiency. In other words, its payoff decreases as the number of charging cycles grows; we assume that the battery's life, and hence its capacity, is reduced as a function of its charging cycles.

The derivation of the expected payoffs $EP_n$ in the following sections takes into account:

1) A pricing model for the power shared between the players and the MG, meaning that each power exchange is paid for or rewarded with a certain amount of monetary units.

2) Specific statistical information available at each prosumer’s SM and the battery’s controller.

The pricing model takes into account any power exchange between players and the MG. A provision of service, accounted by a power exchange between player and the MG, involves a variation in the total amount of virtual currency owned by the corresponding player. In our work, such variation depends on the MG’s condition and the cost function.

We assume in the following that this variation depends on the operating condition of the MG, represented by a state variable that can take two values. These values describe the normal (briefly, state 0) and stress (briefly, state 1) operating conditions. To characterize these states, we consider a positive power threshold $P_c$ that satisfies the following inequalities:

$$l_T(t) \le P_c \quad \text{for the normal state} \tag{9}$$

$$P_c < l_T(t) < P_{pu}^{(abs)} \quad \text{for the stress state}. \tag{10}$$

The normal state represents the regular operating condition of the MG, whereas the stress state corresponds to a high power consumption that may entail a risk of blackout.
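The two operating conditions of Equations (9) and (10) amount to a threshold test on the overall power. A minimal sketch follows; the function name and the numerical values in the usage line are illustrative assumptions.

```python
def operating_state(l_total, p_c, p_abs):
    """Return 0 (normal state, Eq. (9)) or 1 (stress state, Eq. (10))."""
    if l_total <= p_c:
        return 0
    if l_total < p_abs:
        return 1
    # Outside the range allowed by Eq. (5): neither state applies.
    raise ValueError("overall power is not below the absorption limit")


# Hypothetical values in kW.
print(operating_state(150.0, p_c=200.0, p_abs=420.0))
print(operating_state(300.0, p_c=200.0, p_abs=420.0))
```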

The cost function is defined as:

$$C(l_n, l_{-n}, p_b) \triangleq -k_A(l_T) \cdot \max(l_n, 0) - k_G(l_T) \cdot \min(l_n, 0) + k_F(l_n, l_{-n}, p_b) \cdot g(l_n, l_{-n}, p_b). \tag{11}$$

The cost function expresses, for given players' powers, the cost (if negative) or reward (if positive) for the considered player. The first term of Equation (11) represents the cost associated with the power absorbed by player 1 from the MG, the second represents the gain from the power supplied to the MG, and the third is a fairness term referring to the instantaneous power exchange between player 1 and players 2 and 3. The coefficients $k_A$, $k_G$ and $k_F$ are weight functions that can be adjusted by the MSM in order to influence the behavior of the MG community. The dependence of the powers on time has been omitted to ease the reading.

In our work, the weight functions appearing in the right-hand side of Equation (11) are given by the following expression:

$$k_X(l_T) \triangleq \begin{cases} k_X^{(j)} & \text{for } l_T \le P_c \\ k_X^{(j)} + k_X^{(j)} \, (l_T - P_c)/P_c & \text{for } l_T > P_c. \end{cases} \tag{12}$$

In Equation (12), $X$ can take two values, $A$ (absorbing) and $G$ (generating), and $j$ can also take two values, 0 (normal state) and 1 (stress state).
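The cost function of Equation (11) and the weights of Equation (12) can be sketched as follows. This is only an illustration under two stated assumptions: the state index $j$ is taken to be 0 in the first branch of Equation (12) and 1 in the second, and the fairness product $k_F \cdot g$, whose form is not given in this section, is passed in as a precomputed number. The function names are hypothetical.

```python
def weight(l_total, p_c, k_normal, k_stress):
    """Weight k_X(l_T) of Eq. (12), assuming j = 0 below the threshold and j = 1 above it."""
    if l_total <= p_c:
        return k_normal
    return k_stress + k_stress * (l_total - p_c) / p_c


def cost(l_n, l_total, p_c, k_abs, k_gen, fairness=0.0):
    """Cost of Eq. (11); `fairness` stands in for k_F * g (assumed to be supplied by the caller)."""
    k_a = weight(l_total, p_c, *k_abs)
    k_g = weight(l_total, p_c, *k_gen)
    return -k_a * max(l_n, 0.0) - k_g * min(l_n, 0.0) + fairness


# Hypothetical numbers: (normal, stress) weights and powers in arbitrary consistent units.
k = (30.0, 200.0)
print(cost(2.0, l_total=100.0, p_c=200.0, k_abs=k, k_gen=k))   # absorbing in the normal state
print(cost(-1.0, l_total=100.0, p_c=200.0, k_abs=k, k_gen=k))  # generating in the normal state
print(cost(2.0, l_total=300.0, p_c=200.0, k_abs=k, k_gen=k))   # absorbing in the stress state
```

With this sign convention an absorbing player gets a negative value (a cost) and a generating player a positive one (a reward), and the cost of absorbing grows linearly with the excess of $l_T$ over $P_c$.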

As far as Bayesian game theory is concerned, we assumed in item 2) above that statistical information is available at every SM and at the battery controller. Player 1 is thus provided with three different probability density functions (pdfs), $f_{l_T^{(j)}}(x;\tau)$, $f_{l_n^{(r)}}(x;\tau)$ and $f_{p_b}(x;\tau)$, which are related to the overall power available in the MG, the $n$-th prosumer's behavior and the battery's behavior, respectively. In short, the statistical knowledge of the controller and of the $n$-th SM about the complete MG can be summarized as follows:

1) The first-order pdf $f_{l_T^{(j)}}(x;\tau)$, with $\tau > t$, of the overall power absorbed by the MG or supplied to the public utility in the absence of DSM.

2) The first-order pdf $f_{l_n^{(r)}}(x;\tau)$ of the instantaneous portion $l_n^{(r)}(t)$ of $l_n(t)$ (see Equation (1)).

3) The first-order pdf $f_{p_b}(x;\tau)$ of the instantaneous battery power $p_b(t)$ (see Equation (3)).

In order to derive the payoff function $EP_n$, the prosumer's SM needs to know the joint pdf $f_{l_n^{(r)},\, p_b,\, l_{-n}^{(j)}}(x, y, z; \tau)$. The number of prosumers forming player 2 influences the statistical behavior of $l_{-n}^{(j)}$, which may therefore differ, in terms of power consumption/generation, from that of player 1 and player 3. We assume in the following that the joint pdf can be factored as follows:

$$f_{l_n^{(r)},\, p_b,\, l_{-n}^{(j)}}(x, y, z; \tau) = f_{l_n^{(r)}}(x; \tau) \, f_{p_b}(y; \tau) \, f_{l_{-n}^{(j)}}(z; \tau). \tag{13}$$

It is interesting to mention that in order to estimate the above indicated pdfs, specific learning algorithms have to be developed for a real world implementation of the suggested strategy.

Firstly, the pdf $f_{l_T^{(j)}}(x;\tau)$ can be evaluated by the MSM and its representation then distributed to all players. To achieve that, the MSM must be provided with exact knowledge of the past consumption/generation of the players. After receiving the necessary data about the users and the weather predictions, the MSM exploits them using machine learning tools, such as regression models, to forecast the statistical behavior of the MG. The approximation $f_{l_T^{(j)}}(x;\tau) \cong f_{l_{-n}^{(j)}}(x;\tau)$ can be adopted when the number $N$ of prosumers is large.

Secondly, an approximation of the pdf $f_{l_n^{(r)}}(x;\tau)$ can be evaluated by each prosumer's SM using machine learning tools on the basis of its real-time energy consumption data stored over a number of different days.

Lastly, the pdf $f_{p_b}(x;\tau)$ can be estimated by the battery controller, which senses the battery power level in real time and can predict the statistical charging/discharging behavior of the battery.

Knowing the cost function $C(l_n, l_{-n}, p_b)$ given in Equation (11) and the statistical information previously described, the expected payoff related to switching on (briefly, ON) the SLs can be calculated as follows. We first define the expected overall cost $EC_n(l_n^{(s)}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)})$ charged to player 1 for its power flow in the time slot $[t_{sl,0}^{(n)}, t_{sl,1}^{(n)}]$, with $t_{sl,1}^{(n)} = t_{sl,0}^{(n)} + T_{sl}^{(n)}$. It is the integral over this interval of the cost function averaged with respect to $l_n^{(r)}$, $l_{-n}^{(j)}$ and $p_b$, and is given by the following equation (note that $l_n(t) < P_{pu}^{(abs)} - l_{-n}(t) - p_b(t)$; see Equations (5) and (7)):

$$EC_n(l_n^{(s)}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) \triangleq \int_{\tau = t_{sl,0}^{(n)}}^{t_{sl,1}^{(n)}} \int_{x_2 = P_{g,\max}^{(-n)}}^{P_{pu}^{(abs)} - L_{a,\max}^{(n)} - P_{ch,\max}^{(b)}} \int_{x_1 = P_{g,\max}^{(n)}}^{\min\left(L_{a,\max}^{(n)},\, P_{pu}^{(abs)} - x_2 - P_{ch,\max}^{(b)}\right)} \int_{x_3 = P_{dch,\max}^{(b)}}^{P_{ch,\max}^{(b)}} C(l_n, l_{-n}, p_b) \cdot f_{l_n^{(r)},\, p_b,\, l_{-n}^{(j)}}\left(x_1 - l_n^{(s)}(\tau), x_3, x_2; \tau\right) dx_3 \, dx_1 \, dx_2 \, d\tau \tag{14}$$

Next, the expected payoff $EP_n$ associated with the ON action of player 1 is defined as the difference between the expected cost related to the activation of the considered load at $t = t_{sl,0}^{(n)}$ and that associated with keeping it off, which leads to:

$$EP_n(l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) \triangleq EC_n(l_n^{(s)+}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) - EC_n(l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) \tag{15}$$

The parameters $l_n^{(s)+}$ and $l_n^{(s)-}$ represent the function $l_n^{(s)}(t)$ (see Equation (15)) associated with the ON and OFF actions, respectively. A simplified expression for $EP_n(l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)})$ in (15) can be found as follows. We first substitute (13) into Equation (14) and get:

$$EC_n(l_n^{(s)}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) \cong \int_{\tau = t_{sl,0}^{(n)}}^{t_{sl,1}^{(n)}} \int_{x_2 = P_{g,\max}^{(-n)}}^{P_{pu}^{(abs)} - L_{a,\max}^{(n)} - P_{ch,\max}^{(b)}} f_{l_{-n}^{(j)}}(x_2; \tau) \int_{x_3 = P_{dch,\max}^{(b)}}^{P_{ch,\max}^{(b)}} f_{p_b}(x_3; \tau) \int_{x_1 = P_{g,\max}^{(n)}}^{\min\left(L_{a,\max}^{(n)},\, P_{pu}^{(abs)} - x_2 - P_{ch,\max}^{(b)}\right)} C(l_n, l_{-n}, p_b) \cdot f_{l_n^{(r)}}\left(x_1 - l_n^{(s)}(\tau); \tau\right) dx_1 \, dx_3 \, dx_2 \, d\tau \tag{16}$$

We can further simplify this equation by replacing the upper limit $\min\left(L_{a,\max}^{(n)}, P_{pu}^{(abs)} - x_2 - P_{ch,\max}^{(b)}\right)$ of the integral over $x_1$ in the right-hand side of Equation (16) with $L_{a,\max}^{(n)}$; this simplification is justified by the fact that the integrand takes negligible values in the interval thereby added to the integration domain, which yields:

$$EC_n(l_n^{(s)}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) \cong \int_{\tau = t_{sl,0}^{(n)}}^{t_{sl,1}^{(n)}} \int_{x_2 = P_{g,\max}^{(-n)}}^{P_{pu}^{(abs)} - L_{a,\max}^{(n)} - P_{ch,\max}^{(b)}} f_{l_{-n}^{(j)}}(x_2; \tau) \int_{x_3 = P_{dch,\max}^{(b)}}^{P_{ch,\max}^{(b)}} f_{p_b}(x_3; \tau) \int_{x_1 = P_{g,\max}^{(n)}}^{L_{a,\max}^{(n)}} C(l_n, l_{-n}, p_b) \, f_{l_n^{(r)}}\left(x_1 - l_n^{(s)}(\tau); \tau\right) dx_1 \, dx_3 \, dx_2 \, d\tau \tag{17}$$

We finally substitute Equation (17) into Equation (15), which leads to the following equation after some manipulations:

$$EP_n(l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) \cong \int_{l_{-n} = P_{g,\max}^{(-n)}}^{P_{pu}^{(abs)} - L_{a,\max}^{(n)} - P_{ch,\max}^{(b)}} \beta(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) \, dl_{-n} \tag{18}$$

where:

$$\beta(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) = \int_{\tau = t_{sl,0}^{(n)}}^{t_{sl,1}^{(n)}} f_{l_{-n}^{(j)}}(l_{-n}; \tau) \int_{y = P_{dch,\max}^{(b)}}^{P_{ch,\max}^{(b)}} f_{p_b}(y; \tau) \int_{x = P_{g,\max}^{(n)}}^{L_{a,\max}^{(n)}} C(l_n, l_{-n}, p_b) \cdot \left[ f_{l_n^{(r)}}\left(x - l_n^{(s)+}; \tau\right) - f_{l_n^{(r)}}\left(x - l_n^{(s)-}; \tau\right) \right] dx \, dy \, d\tau \tag{19}$$
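The nested integrals in Equations (14)-(19) are expectations over three independent quantities, so the expected payoff at a single instant can be approximated by sampling instead of by quadrature. The sketch below is a Monte Carlo stand-in for the integrals, not the authors' method, under loud assumptions: Gaussian pdfs with made-up parameters, an OFF action that adds no shiftable power, a single time instant (the integral over $\tau$ is omitted), no truncation at the integration limits, and a purely linear cost without fairness term. All names are hypothetical.

```python
import random


def expected_payoff_mc(load_on, cost, draw_residual, draw_others, draw_battery,
                       n_samples=10000, seed=0):
    """Estimate E[C(ON)] - E[C(OFF)] at one instant, using the factorization of Eq. (13)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = draw_residual(rng)   # sample of l_n^(r)
        z = draw_others(rng)     # sample of l_{-n}
        y = draw_battery(rng)    # sample of p_b
        # ON adds the shiftable power to the residual one; OFF is assumed to add nothing.
        total += cost(x + load_on, z, y) - cost(x, z, y)
    return total / n_samples


def linear_cost(l_n, l_others, p_b, k=30.0):
    """Toy cost: same weight for absorption and generation, no fairness term (assumption)."""
    return -k * l_n


ep = expected_payoff_mc(
    load_on=3.6,
    cost=linear_cost,
    draw_residual=lambda rng: rng.gauss(1.0, 0.5),
    draw_others=lambda rng: rng.gauss(150.0, 20.0),
    draw_battery=lambda rng: rng.gauss(0.0, 10.0),
)
print(ep)
```

With this toy cost the payoff does not depend on the opponents; replacing `linear_cost` with a state-dependent cost makes the estimate sensitive to the pdf of $l_{-n}$, which is the dependence that the density $\beta$ exposes.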

The function $\beta$ in Equation (19) can be interpreted as an expected cost density because it indicates how the overall expected payoff $EP_n(l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)})$ in (18) is distributed over the $l_{-n}$ axis in the considered time interval $[t_{sl,0}^{(n)}, t_{sl,1}^{(n)}]$. As in the work of [

1) The game is replayed by player 1 every $T_s$ s until the $n_{sl}$-th SL is activated or the maximum activation time limit is reached;

2) For each shiftable load, the activation interval spans $N_l^{(n)}$ slots, which means that the activation time interval for the considered load is $T_{sl}^{(n)} = N_l^{(n)} T_s$;

3) The cost density function $\beta(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)})$ in (19) can be formulated as the sum of $N_l^{(n)}$ expressions, each related to a different time slot. To each time slot, we assign a weight factor decreasing exponentially with the slot index [

Taking the above considerations into account, the new expression of the expected cost density is:

$$\tilde{\beta}(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) = \frac{1 - \omega}{1 - \omega^{N_l^{(n)}}} \sum_{z=0}^{N_l^{(n)} - 1} \omega^{z} \cdot \beta_z(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}) \tag{20}$$

where:

$$\beta_z(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}) \triangleq \beta\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)} + z T_s, t_{sl,0}^{(n)} + (z + 1) T_s\right). \tag{21}$$
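The prefactor in Equation (20) normalizes the geometric weights $\omega^z$ so that they sum to one, which makes $\tilde{\beta}$ a weighted average of the per-slot densities $\beta_z$. A short sketch, assuming $0 < \omega < 1$ and using hypothetical function names and values:

```python
def slot_weights(n_slots, omega):
    """Normalized weights (1 - w) / (1 - w**N) * w**z of Eq. (20), for z = 0..N-1."""
    if not 0.0 < omega < 1.0:
        raise ValueError("omega is assumed to lie strictly between 0 and 1")
    norm = (1.0 - omega) / (1.0 - omega ** n_slots)
    return [norm * omega ** z for z in range(n_slots)]


def weighted_density(beta_slots, omega):
    """Weighted combination of per-slot densities beta_z at a fixed value of l_{-n}."""
    weights = slot_weights(len(beta_slots), omega)
    return sum(w * b for w, b in zip(weights, beta_slots))


# Hypothetical per-slot values of beta_z for one value of l_{-n}.
print(slot_weights(4, 0.5))
print(weighted_density([2.0, 1.0, 0.5, 0.25], 0.5))
```

Earlier slots receive larger weights, so the cost expected right after the activation instant dominates the decision.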

In the game considered here, player 1 attempts to maximize its own expected payoff $EP_n$. For that reason, the optimal pure strategy can be formulated as follows:

$$\hat{t}_{sl,0}^{(n)} = \arg\max_{\tilde{t}_{sl,0}^{(n)} \in S_0^{n}} EP_n\left(l_n^{(s)+}, l_n^{(s)-}; \tilde{t}_{sl,0}^{(n)}, \tilde{t}_{sl,0}^{(n)} + T_{sl}^{(n)}\right) \tag{22}$$

where $S_0^{n} = \{ t_p \mid t_p = t_{sl,0}^{(n)} + p T_s;\ p = 0, 1, 2, 3, \cdots \}$ represents all the possible instants at which loads can be activated. Note that the expected payoff $EP_n(l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)})$ mainly depends on the power consumption of the other players, which makes it difficult to derive the equilibrium point for the optimal strategy (22).

In our work, we did not adopt the strategy given by Equation (22) for the following reasons:

1) As stated before, the MSM estimates and periodically broadcasts updates of two pdfs: $f_{l_{-n}^{(j)}}(y;\tau)$, related to the overall power in the MG, and $f_{p_b}(z;\tau)$, related to the battery state (charging or discharging) and power level. In the same way, the pdf $f_{l_n^{(r)}}(x;\tau)$ related to the power consumption of a prosumer is estimated by the $n$-th SM at least on a daily basis. As a result, the expected payoff appearing in the right-hand side of Equation (22) may take different values when computed at different instants of time, and it needs to be recalculated whenever an update of these pdfs is broadcast.

2) When multiple SLs are simultaneously activated by the $n$-th prosumer, they need to be properly and efficiently scheduled.

These remarks, together with previous work on load management, have led us to develop a mixed strategy, which is described in the following section.

In the proposed game, player 1 replays the game at instants $t_p = t_{sl,0}^{(n)} + p T_s$, with $p = 0, 1, \cdots, K_n - 1$, until it chooses to turn the load ON or the maximum number of activation trials ($K_n$) is reached. The action in the $p$-th attempt is chosen randomly, with probabilities $P_{on}^{(n)}[p]$ and $(1 - P_{on}^{(n)}[p])$ for the ON and OFF actions, respectively; $P_{on}^{(n)}[p]$ is thus the activation probability of the $n$-th prosumer in the $p$-th attempt. We highlight that:

• Given the activation vector $P_{on}^{(n)}[p]$, with $p = 0, 1, \cdots, K_n - 1$, the probability $P_s^{(n)}$ that the ON action is chosen within $K_n$ trials is expressed as follows:

$$P_s^{(n)} = P_{on}^{(n)}[0] + \sum_{l=1}^{K_n - 1} P_{on}^{(n)}[l] \prod_{k=0}^{l-1} \left(1 - P_{on}^{(n)}[k]\right). \tag{23}$$

• If the activation probability $P_{on}^{(n)}[p]$ remains constant over the $K_n$ trials (i.e. $P_{on}^{(n)}[p] = P_{on}^{(n)}$ for any $p = 0, 1, \cdots, K_n - 1$), the probability of success in Equation (23) reduces, after factoring, to:

$$P_s^{(n)} = 1 - \left(1 - P_{on}^{(n)}\right)^{K_n}. \tag{24}$$

Solving Equation (24) for the activation probability gives:

$$P_{on}^{(n)} = 1 - \left(1 - P_s^{(n)}\right)^{1/K_n} \cong \frac{P_s^{(n)}}{K_n} \tag{25}$$

which is the activation probability to be selected at each trial to obtain a probability of success equal to $P_s^{(n)}$.
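Equations (23)-(25) can be checked numerically. The sketch below evaluates the success probability of Equation (23) for an arbitrary activation vector and inverts Equation (24) for the constant case; the function names and the values $P_s^{(n)} = 0.9$, $K_n = 24$ are illustrative assumptions.

```python
def success_probability(p_on):
    """Eq. (23): probability that ON is chosen in at least one of the len(p_on) trials."""
    total, still_off = 0.0, 1.0
    for p in p_on:
        total += p * still_off   # ON chosen now, after OFF in every earlier trial
        still_off *= 1.0 - p     # probability that every trial so far ended OFF
    return total


def constant_activation_probability(p_s, k_n):
    """Exact inversion of Eq. (24), i.e. the left-hand equality of Eq. (25)."""
    return 1.0 - (1.0 - p_s) ** (1.0 / k_n)


p_exact = constant_activation_probability(0.9, 24)
print(p_exact, 0.9 / 24)                   # exact value versus the approximation P_s / K_n
print(success_probability([p_exact] * 24))
```

The approximation $P_s^{(n)} / K_n$ in Equation (25) is a first-order expansion that is accurate for small $P_s^{(n)}$; for the hypothetical values above the exact expression gives about 0.09 against 0.0375, so the two can differ noticeably when the target success probability is close to 1.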

The objective of our mixed strategy is to adjust the activation probabilities $P_{on}^{(n)}[p]$, $p = 0, 1, \cdots, K_n - 1$, so as to minimize, on average over the set of prosumers, the reduction in the expected utility $EP_n(l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)})$ evaluated on the basis of the expected cost density (19). To derive this strategy, we first define the daily average (where $t_b$ represents the beginning of the considered day and $T_D = 86400$ s its duration):

$$\bar{\beta}(l_{-n}, l_n^{(s)+}, l_n^{(s)-}) \triangleq \frac{1}{T_D} \int_{\tau = t_b}^{T_D + t_b} \tilde{\beta}\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; \tau, \tau + T_{sl}^{(n)}\right) d\tau \tag{26}$$

of the expected cost density $\beta(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)})$ in (19), and the function $\varphi(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)})$, which represents, for a given $l_{-n}$, the deviation of the expected cost density from its average and can be expressed as follows:

$$\varphi(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) \triangleq \tilde{\beta}(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) - \bar{\beta}(l_{-n}, l_n^{(s)+}, l_n^{(s)-}). \tag{27}$$

Then, the deviation $\Delta EP_n(l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)})$ of the expected payoff $EP_n(l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)})$ from its daily average in the considered time interval $(t_{sl,0}^{(n)}, t_{sl,1}^{(n)})$ is given by the following equation:

$$\Delta EP_n(l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) \triangleq EP_n(l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) - \overline{EP_n}(l_n^{(s)+}, l_n^{(s)-}) = \int_{l_{-n} = P_{g,\max}^{(-n)}}^{P_{pu}^{(abs)} - L_{a,\max}^{(n)} - P_{ch,\max}^{(b)}} \varphi(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) \, dl_{-n} \tag{28}$$

where

$$\overline{EP_n}(l_n^{(s)+}, l_n^{(s)-}) = \int_{l_{-n} = P_{g,\max}^{(-n)}}^{P_{pu}^{(abs)} - L_{a,\max}^{(n)} - P_{ch,\max}^{(b)}} \bar{\beta}(l_{-n}, l_n^{(s)+}, l_n^{(s)-}) \, dl_{-n}. \tag{29}$$

The integration domain $\Lambda^{(n)} = \left[P_{g,\max}^{(-n)},\ P_{pu}^{(abs)} - L_{a,\max}^{(n)} - P_{ch,\max}^{(b)}\right]$ of the integral appearing in the right-hand side of Equation (28) can be divided into two parts, $\Sigma^{(n)}$ and its complement $\overline{\Sigma^{(n)}}$, where:

$$\Sigma^{(n)} \triangleq \left\{ l_{-n} \mid l_{-n} \in \Lambda^{(n)},\ \varphi(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) < 0 \right\}. \tag{30}$$

The deviation of the expected payoff from its average in Equation (28) can now be rewritten as follows:

$$\Delta EP_n(l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) = \int_{\overline{\Sigma^{(n)}}} \varphi(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) \, dl_{-n} - \int_{\Sigma^{(n)}} \left| \varphi(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) \right| dl_{-n}. \tag{31}$$

Note that the first term in the right-hand side of Equation (31) is a positive quantity describing a reward in monetary units, whereas the second one, $-\int_{\Sigma^{(n)}} \left| \varphi(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) \right| dl_{-n}$, is a negative term representing a cost or loss of monetary units.

The equilibrium point of our game model can be defined as a reference power level $\bar{P}_r$ for the overall power flow $l_T(t)$ in (4) and consequently, under the assumption $l_{-n}(t) \cong l_T(t)$, for $l_{-n}(t)$ in (7). We then partition the set $\Sigma^{(n)}$ in (30) into two subsets given by:

$$\Sigma_{+}^{(n)} = \left\{ l_{-n} \mid l_{-n} \in \Sigma^{(n)},\ l_{-n} > \bar{P}_r \right\} \tag{32}$$

and

$$\Sigma_{-}^{(n)} = \left\{ l_{-n} \mid l_{-n} \in \Sigma^{(n)},\ l_{-n} < \bar{P}_r \right\} \tag{33}$$

and the error signal is defined by:

$$e_n[p] \triangleq \int_{l_{-n} \in \Sigma_{+}^{(n)}} \left| \varphi(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) \right| dl_{-n} - \int_{l_{-n} \in \Sigma_{-}^{(n)}} \left| \varphi(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}) \right| dl_{-n} \tag{34}$$
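When $\varphi$ is available on a grid of $l_{-n}$ values, the error signal of Equation (34) can be approximated by summing signed areas over the sub-intervals where $\varphi$ is negative. The sketch below uses a simple midpoint-style rule; the function name, the grid and the sample values are illustrative assumptions.

```python
def error_signal(l_grid, phi, p_ref):
    """Approximate Eq. (34) from samples phi[i] = phi(l_grid[i]) on an increasing grid."""
    e = 0.0
    for i in range(len(l_grid) - 1):
        width = l_grid[i + 1] - l_grid[i]
        l_mid = 0.5 * (l_grid[i] + l_grid[i + 1])
        phi_mid = 0.5 * (phi[i] + phi[i + 1])
        if phi_mid < 0.0:                      # sub-interval treated as part of Sigma, Eq. (30)
            area = abs(phi_mid) * width
            # Sigma_+ (above the reference) adds area, Sigma_- (below it) subtracts area.
            e += area if l_mid > p_ref else -area
    return e


# Hypothetical samples: phi is negative everywhere, reference power at l_{-n} = 1.
print(error_signal([0.0, 1.0, 2.0, 3.0, 4.0], [-1.0] * 5, p_ref=1.0))
```

Sub-intervals in which $\varphi$ changes sign are classified by the sign of the averaged endpoint values, so a finer grid gives a better approximation of the sets $\Sigma_{+}^{(n)}$ and $\Sigma_{-}^{(n)}$.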

It is important to note that the two integrals appearing in the right-hand side of Equation (34) represent the areas of specific regions under the function $\varphi(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)})$ in (27), as illustrated in

From Equation (34), if the error signal $e_n[p]$ is positive, i.e. the first term in the right-hand side of Equation (34) is greater than the second, as $l_{-n}$ (the power of player 2) or the overall power flow is below the threshold, then player 1 should be encouraged to increase the activation probability $P_{on}^{(n)}[p]$ of its SLs; conversely, it should be discouraged when $e_n[p]$ is negative. To achieve that, we need a strategy that adapts $P_{on}^{(n)}[p]$ (with $p = 0, 1, 2, \cdots, K_n - 1$) on the basis of the signal $e_n[p]$ and increases monotonically with this signal. We adopted, for its simplicity, the following formula:

$$P_{on}^{(n)}[p] = \bar{P}_n + \gamma_n \tilde{e}_n[p] \tag{35}$$

where $\bar{P}_n$ represents a reference probability level, $\gamma_n$ is a real positive parameter and $\tilde{e}_n[p]$ is defined as follows:

$$\tilde{e}_n[p] \triangleq \Phi_n(e_n[p]) \tag{36}$$

where:

$$\Phi_n(e) \triangleq \begin{cases} -\bar{P}_n \gamma_n^{-1} & \text{for } e < -\bar{P}_n \gamma_n^{-1} \\ e & \text{for } -\bar{P}_n \gamma_n^{-1} \le e \le (1 - \bar{P}_n) \gamma_n^{-1} \\ (1 - \bar{P}_n) \gamma_n^{-1} & \text{for } e > (1 - \bar{P}_n) \gamma_n^{-1} \end{cases} \tag{37}$$
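The update rule of Equations (35)-(37) is a linear map of the error signal followed by saturation. A minimal sketch, with hypothetical function names and parameter values:

```python
def clip_error(e, p_ref, gamma):
    """Clipping function Phi_n of Eq. (37); gamma is assumed to be strictly positive."""
    lower = -p_ref / gamma
    upper = (1.0 - p_ref) / gamma
    return min(max(e, lower), upper)


def activation_probability(e, p_ref, gamma):
    """Eq. (35) with the clipped error of Eq. (36); the result stays within [0, 1]."""
    return p_ref + gamma * clip_error(e, p_ref, gamma)


# Hypothetical reference level and gain.
for e in (-100.0, -0.01, 0.0, 0.2, 100.0):
    print(e, activation_probability(e, p_ref=0.1, gamma=2.0))
```

The clipping bounds are chosen so that the extreme values of $\tilde{e}_n[p]$ map exactly to probabilities 0 and 1.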

Equation (37) represents a prosumer-dependent clipping function, which limits the range of $P_{on}^{(n)}[p]$, evaluated using Equation (35), to $[0, 1]$. The computation of Equation (35) requires the knowledge of the parameters $\bar{P}_n$ and $\gamma_n$. In our work the value suggested by (25) has been selected:

$$\bar{P}_n = \frac{P_s^{(n)}}{K_n} \tag{38}$$

which means that the value assigned to $\bar{P}_n$ is the value that each element of the sequence $P_{on}^{(n)}[p]$ should take if all the activation trials made by the $n$-th prosumer were equally likely. On the other hand, the evaluation of $\gamma_n$ follows an optimization approach based on the following considerations. After substituting (36) into (35) and (35) into (23), we get the following expression:

$$P_s^{(n)} = f_s(\bar{P}_n, \gamma_n) = \bar{P}_n + \gamma_n \Phi_n(e_n[0]) + \sum_{l=1}^{K_n - 1} \left[ \bar{P}_n + \gamma_n \Phi_n(e_n[l]) \right] \cdot \prod_{k=0}^{l-1} \left( 1 - \bar{P}_n - \gamma_n \Phi_n(e_n[k]) \right) \tag{39}$$

We can notice from Equation (39) that the probability of success $P_s^{(n)}$ depends nonlinearly on the parameter $\gamma_n$. For a given value of $P_s^{(n)}$ and a specific sequence $\{e_n[l]\}$, the condition $f_s(P_s^{(n)} K_n^{-1}, \gamma_n) = P_s^{(n)}$ must be satisfied. After selecting $\bar{P}_n$ according to Equation (38), the value of $\gamma_n$ satisfying this condition can be found graphically or by means of a simple direct search method, and is used to set the weight of $e_n[p]$ in the evaluation of $P_{on}^{(n)}[p]$ on the basis of Equation (35).
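One possible form of the direct search mentioned above is a plain grid scan over $\gamma_n$ that keeps the value for which $f_s(P_s^{(n)} K_n^{-1}, \gamma_n)$ in Equation (39) is closest to the target $P_s^{(n)}$. The sketch below is an assumption about how such a search could look, not the procedure used for the reported results; the grid range, the step and the error sequence are made up.

```python
def clip(e, p_ref, gamma):
    """Clipping function of Eq. (37)."""
    return min(max(e, -p_ref / gamma), (1.0 - p_ref) / gamma)


def success_from_errors(p_ref, gamma, errors):
    """f_s of Eq. (39): success probability obtained with the update rule of Eq. (35)."""
    total, still_off = 0.0, 1.0
    for e in errors:
        p_on = p_ref + gamma * clip(e, p_ref, gamma)
        total += p_on * still_off
        still_off *= 1.0 - p_on
    return total


def search_gamma(p_s, errors, gamma_max=10.0, step=1e-3):
    """Grid scan for the gamma whose f_s is closest to p_s, with p_ref set by Eq. (38)."""
    p_ref = p_s / len(errors)
    best_gamma, best_gap = None, float("inf")
    for i in range(1, int(round(gamma_max / step)) + 1):
        gamma = i * step
        gap = abs(success_from_errors(p_ref, gamma, errors) - p_s)
        if gap < best_gap:
            best_gamma, best_gap = gamma, gap
    return best_gamma


# Hypothetical error sequence over K_n = 10 trials and target success probability 0.5.
errors = [0.01] * 10
gamma = search_gamma(0.5, errors)
print(gamma, success_from_errors(0.05, gamma, errors))
```

Because $f_s$ need not be monotonic in $\gamma_n$ for a general error sequence, a scan of this kind returns only the best point on the grid; a finer step or a bracketing method can refine it.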

The implementation of our strategy follows many steps that are summarized in

For using the proposed strategy, the following considerations should be taken into account:

1) In the MG, the SLs owned by prosumers can be classified by type, where a type is characterized by two values (absorbed power, absorption duration). When the $n$-th prosumer wants to activate $N_{sl}^{(n)}$ SLs of different types in the same time interval, its SM evaluates a distinct probability for each of them. Turning on a specific SL changes the absorbed power $l_n^{(s)}(t)$, hence the probabilities of the other SLs waiting for activation must be recalculated.

2) Each prosumer repeats the game every $T_s$ until it opts for the ON action or the OFF action. The slot duration $T_s$ is proportional to the overall power. Then, if the OFF action has been selected in a certain repetition of the game, the next attempt should occur when the overall power $l_T(t)$ has undergone a significant change.

3) The energy available in the MG is shared among the three players on the basis of a probabilistic mechanism. It may happen that, at some instant, the power scheduled for the SLs is not sufficient. In that case, the MSM is supposed to broadcast a disconnection message for a specific portion (if not all) of the active SLs to avoid overload risks.

4) The reference power $\bar{P}_r$, which is selected by the MSM and broadcast to all prosumers, plays an important role because a change in its value modifies the equilibrium of the MG. In practice, the value of this threshold is fixed on the basis of the expected consumption over the whole day.
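The repeated-trial mechanism behind these considerations can be sketched as a sequence of Bernoulli draws, one per slot, that stops at the first ON decision. The function name and the probability vector are illustrative assumptions; in the strategy described above each probability would come from Equation (35).

```python
import random


def play_until_on(p_on, rng):
    """Return the index of the attempt in which ON is chosen, or None if every attempt ends OFF."""
    for attempt, p in enumerate(p_on):
        if rng.random() < p:     # ON with probability p, OFF otherwise
            return attempt
    return None


# Hypothetical activation probabilities for K_n = 5 attempts.
rng = random.Random(1)
p_on = [0.05, 0.10, 0.20, 0.40, 0.80]
outcomes = [play_until_on(p_on, rng) for _ in range(10000)]
success_rate = sum(o is not None for o in outcomes) / len(outcomes)
print(success_rate)
```

The empirical success rate can be compared with the value given by Equation (23) for the same probability vector.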

This section describes the results obtained by simulating the DSM strategy for an MG of N = 100 residential prosumers sharing a battery capable of supplying 300 kW for 1 hour. An analysis of the results is also presented in order to evaluate the effectiveness of the Bayesian DSM strategy.

The following assumptions have been made in all our simulations:

1) Each prosumer in the MG has a contract not to exceed a maximum absorbed power of $P_{a,\max}^{(n)} = 6$ kW, of which 3.6 kWh are solely for charging its electric vehicle (PHEV). The PHEV is considered as the only SL of each prosumer. Furthermore, the prosumer's PV panels are able to generate up to $-P_{g,\max}^{(n)} = 3$ kW. In order to account for the daily fluctuations in the power generated by the solar PV panels, we have superposed a zero-mean Gaussian random process on the average power generated by those renewable sources.

2) We have considered 15 appliances for each prosumer, each characterized by a probability mass function (pmf). For a specific appliance, the pmf represents its activation probability over a single day and depends on the prosumer's behavior. An example of a daily power consumption of a random prosumer is shown in ^{th} prosumer’s SM.

3) The Gaussian approximation has also been used for the pdf $f_{l_T^{(j)}}(x;\tau)$ related to the overall power of the MG in the absence of DSM. As previously mentioned, the approximation $f_{l_{-n}^{(j)}}(x;\tau) \cong f_{l_T^{(j)}}(x;\tau)$ has been used, and its parameters (mean value and variance) are assumed to be perfectly known by all prosumers.

4) In our simulation of the proposed DSM strategy, the parameters presented in

5) The MG load demand has been observed for three days, corresponding to 72 hours or 288 slots of 15 minutes each. The activation requests of SLs (only the PHEV, for simplicity) have been concentrated in the second day, after a steady-state condition is reached, in order to evaluate the effectiveness of our DSM strategy under different load demands.

6) The energy storage unit (the battery in our case) has been sized according to the total load demand. In our simulation we have sized the battery to supply half of the load demand (see

| MG parameter | Value | Cost function parameter | Value |
|---|---|---|---|
| $N$ | 100 | $k_A^{(0)} = k_G^{(0)}$ | 30 mu/J |
| $P_{g,\max}^{(n)}$ | −3 kW | $k_A^{(1)} = k_G^{(1)}$ | 200 mu/J |
| $P_{a,\max}^{(n)}$ | 6 kW | $k_F^{(0)}$ | 200 mu/J |
| $P_{pu}^{(abs)}$ | $0.7 \cdot N \cdot P_{a,\max}^{(n)}$ W | $k_F^{(1)}$ | 4.5 mu/J |
| $P_{dch,\max}^{(b)}$ | $0.5 \cdot N \cdot P_{a,\max}^{(n)}$ W | $P_c$ | $P_{pu}^{(abs)}$ |
| $P_{ch,\max}^{(b)}$ | $0.5 \cdot N \cdot P_{a,\max}^{(n)}$ W | | |
| $P_L$ | $0.25 \cdot N \cdot P_{a,\max}^{(n)}$ W | | |
| $T_{sl}^{(n)}$ | 6 h | Activation probability | |
| $T_D$ | 24 h | Parameter | Value |
| $T_s$ | 15 min | $P_s$ | 0.90, 0.95 |
| | | $\delta$ | 0.75 |
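The numerical values implied by the parameter table can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' simulation code: all variable and key names are illustrative assumptions, and only the numbers (N, the 6 kW contract, the 15-minute slot, the 6 h value of T_sl, the three-day horizon and the 0.7, 0.5 and 0.25 factors) are taken from the text and the table.

```python
# Values copied from the parameter table; variable names are illustrative.
N = 100            # number of prosumers
P_A_MAX_KW = 6.0   # contracted maximum absorbed power per prosumer (kW)
T_S_MIN = 15       # slot duration T_s (minutes)
T_SL_H = 6         # T_sl from the table (hours)
OBSERVED_DAYS = 3  # length of the simulated horizon (days)


def derived_parameters():
    """Return the quantities implied by the table, in kW, kWh or slots."""
    total_contracted_kw = N * P_A_MAX_KW
    slots_per_day = 24 * 60 // T_S_MIN
    p_dch_max_kw = 0.5 * total_contracted_kw
    return {
        "slots_per_day": slots_per_day,
        "total_slots": OBSERVED_DAYS * slots_per_day,
        "t_sl_slots": T_SL_H * 60 // T_S_MIN,
        "p_pu_abs_kw": 0.7 * total_contracted_kw,
        "p_ch_max_kw": 0.5 * total_contracted_kw,
        "p_dch_max_kw": p_dch_max_kw,
        "p_l_kw": 0.25 * total_contracted_kw,
        # Energy delivered by a full-power discharge lasting one hour.
        "battery_energy_kwh": p_dch_max_kw * 1.0,
    }


if __name__ == "__main__":
    for name, value in derived_parameters().items():
        print(f"{name}: {value:g}")
```

With these values the horizon contains 288 slots, the battery can discharge at 300 kW (consistent with the 300 kW for 1 hour quoted for the shared battery), and the limit on the power absorbed from the public utility is about 420 kW.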

A sample function of the overall power, which is absorbed by the MG from the public utility when positive and generated by the MG and injected into the grid when negative, is represented in Figures 6-8 for the three days considered. Three cases have been considered, corresponding to the MG operating without the battery, with the battery charging and with the battery discharging. In all three cases, the performance of the proposed DSM strategy has been analyzed for SLs (PHEVs).

In particular, in

Similarly, in

Finally, the simulation of the micro grid was done taking into account the discharge of the battery.

These results evidence that the scheduling of PHEVs may substantially lower the peaks in load demand due to SLs.
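A reduction of the peaks in load demand is commonly quantified by the peak-to-average ratio (PAR) of the load profile. The sketch below assumes the usual definition, PAR = maximum power divided by mean power over the observation window; the function name and the two four-slot profiles are made up for illustration and are not taken from the simulations.

```python
def peak_to_average_ratio(load_kw):
    """Return max(load) / mean(load) for a sequence of per-slot powers."""
    if not load_kw:
        raise ValueError("load profile must not be empty")
    mean = sum(load_kw) / len(load_kw)
    if mean <= 0:
        raise ValueError("PAR is only meaningful for a positive mean load")
    return max(load_kw) / mean


# Toy profiles (kW) with the same total energy: in the second one,
# part of the PHEV charging has been shifted away from the peak slot.
unscheduled = [100.0, 100.0, 400.0, 200.0]
scheduled = [150.0, 150.0, 300.0, 200.0]

par_before = peak_to_average_ratio(unscheduled)
par_after = peak_to_average_ratio(scheduled)
improvement = (par_before - par_after) / par_before

if __name__ == "__main__":
    print(f"PAR before: {par_before:.2f}, after: {par_after:.2f}, "
          f"improvement: {improvement:.0%}")
```

Because shifting a load does not change the energy consumed, the mean stays fixed and any improvement in PAR comes entirely from lowering the peak.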

As we can see, operating the MG with the battery significantly reduces the peaks in the load demand, thanks to the power compensation provided by the battery. This conclusion is also supported by

use of DSM and a shared battery in the MG improves the MG PAR by between 40% and 42% on average. Note that this improvement is substantially better than that provided by the Bayesian game theory developed in [

The expected payoff $EP_n$ (15) for the activation of the PHEV (SL) owned by a prosumer may vary from one prosumer to another, both in the presence and in the absence of the DSM and the battery. Figures 10-12 compare the expected payoff obtained with DSM (blue squares) with that obtained without DSM (red squares), for the MG without the battery and with the battery charging or discharging. Note that the line connecting the two squares of a specific user is blue (red) if the first value is greater (smaller) than the second one.

Especially ^{th} prosumer without considering the battery.

The realization of the values of the above mentioned expected payoffs { E P n , n = 1 , 2 , ⋯ , 100 } when the battery is charging is shown in

The last case in the evaluation of the expected payoffs { E P n , n = 1 , 2 , ⋯ , 100 } related to the activation of the PHEV has been simulated when the battery is discharging; which is exemplified in

These results show that 70, 64 and 73 of the prosumers in the MG community benefit from the DSM strategy in terms of monetary units when the MG operates without the battery, with the battery charging and with the battery discharging, respectively. Further simulations have shown that the average expected payoff for the PHEV activation is:

1) −0.87 mu with DSM and −1.46 mu without it, when the battery is not used;

2) −0.57 mu with DSM and −0.66 mu without it, when the battery is charging;

3) −2.23 mu with DSM and −4.44 mu without it, when the battery is discharging.
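The gain brought by the DSM in each of the three cases follows directly from the averages listed above. The short sketch below only performs that arithmetic; the dictionary layout and the function name are illustrative assumptions, while the payoff values are the ones reported in the list.

```python
# (with DSM, without DSM) average expected payoffs in mu, as reported above.
avg_payoff = {
    "no battery": (-0.87, -1.46),
    "battery charging": (-0.57, -0.66),
    "battery discharging": (-2.23, -4.44),
}


def dsm_gain(with_dsm, without_dsm):
    """Return the absolute gain (mu) and the gain relative to the no-DSM loss."""
    gain = with_dsm - without_dsm
    return gain, gain / abs(without_dsm)


if __name__ == "__main__":
    for case, (with_dsm, without_dsm) in avg_payoff.items():
        gain, rel = dsm_gain(with_dsm, without_dsm)
        print(f"{case}: gain = {gain:.2f} mu ({rel:.0%} of the no-DSM loss)")
```

The average gain is about 0.59 mu without the battery, 0.09 mu with the battery charging and 2.21 mu with the battery discharging, that is, roughly 40%, 14% and 50% of the corresponding loss without DSM.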

In this paper, a DSM strategy based on game theory has been developed; it relies on statistical information about prosumer consumption, the charging/discharging of a shared battery and the overall consumption of a MG. The proposed strategy helps to mitigate fluctuations in the load demand when applied to a MG with SLs, and preserves the privacy of users. Numerical results obtained by applying the strategy to a MG with a shared battery in a multi-user scenario show a significant reduction in the MG PAR when managing the recharge of PHEVs, considered as the shiftable load owned by each user. Furthermore, the strategy provides substantial satisfaction for the activation of those SLs when the community storage contributes to the power supply of the MG. Future work concerns the management of the community storage by autonomously scheduling the charging and discharging of the energy storage units in a MG.

Ininahazwe, H., Muriithi, C.M. and Kamau, S. (2018) Optimal Demand-Side Management for Smart Micro Grid with Storage. Journal of Power and Energy Engineering, 6, 38-58. https://doi.org/10.4236/jpee.2018.62004