Energy and Power Engineering, 2013, 5, 616-620
doi:10.4236/epe.2013.54B119 Published Online July 2013 (http://www.scirp.org/journal/epe)
Operating Analysis and Data Mining System for Power Grid Dispatching
Haiming Zhou1, Dunnan Liu2, Dan Li1, Guanghui Shao3, Qun Li3
1China Electric Power Research Institute, Beijing, China
2School of Economics and Management, North China Electric Power University
3Northeast China Grid Company, Shenyang, China
Email: haimingzhou@163.com, firstname.lastname@example.org, lidan@163.com
Received January, 2013
The dispatching center of a power-grid company is also the data center of the power grid, where a great amount of
operating information is gathered. The valuable information contained in these data means a lot for power grid operating
management, but at present there is no dedicated method for managing operating data resources. This paper introduces an
operating analysis and data mining system for power grid dispatching. Data warehousing and online analytical
processing techniques are used to manage and analyze the large volume of data. The analysis system uses the
real-time data of the power grid to dig out the underlying rules of power grid operation. The system also provides a
research platform for dispatchers and helps to improve the JIT (Just in Time) management of the power system.
Keywords: Power Grid Dispatch; Index System; Data Mining
The network control center is the center of grid operating
data. The large amount of data produced by SCADA,
EMS, CPS and other systems contains all of the information
on the operation and safety of the grid. However, the
current dispatch center information system is mainly used for
data collection, storage, and simple summary and query of
daily operations. It is not designed for effective category
management and correlation analysis of large volumes of
historical information, so it cannot effectively extract useful
knowledge from this valuable data resource to support
scheduling management.
Data mining is the knowledge discovery process of
extracting previously unknown, actionable information
from large databases. It draws on newer information
science technologies, including data warehousing, online
analytical processing (OLAP), and data visualization.
Data mining is widely used in aviation, finance,
telecommunications and other areas, where it has greatly
improved data management and decision-making; in the
power industry, data mining systems have been applied
effectively in power market analysis, supply market
analysis and other fields. Therefore, applying data mining
techniques to power scheduling is technologically
advanced and practically feasible.
Establishing a dispatching operation analysis and data
mining system enables effective management and
scientific analysis of power grid data. It helps to summarize
operating experience and explore the laws of power grid
operation, provides an analysis platform for dispatchers,
and improves the level of dispatching and scientific
decision-making.
This paper first describes the content of dispatching
operation analysis and the index evaluation system,
then describes the data warehouse platform and the
multidimensional data analysis techniques, and finally
gives examples of data mining based on the actual needs
of dispatching operation.
2. Dispatching Content
The status of all aspects of dispatching operation should
be monitored and computed in order to refine grid
operation management, accumulate data and experience
under electricity market conditions, continuously improve
the security, quality and economy of network operation,
and ensure grid security and
The dispatching indicator system quantitatively reflects
all aspects of grid operation. It gives scheduling staff a
comprehensive and clear understanding of the main
features of the network parameters through the
calculation and analysis of a series of core indicators.
According to the dispatching system requirements of the
State Grid Corporation, grid dispatching departments
Copyright © 2013 SciRes. EPE
need to analyze and evaluate three classes of indicators:
operation control, scheduling, and statistical analysis. The
purpose is to form a closed-loop feedback mechanism for
accurate network control and a scientifically sound
operation mode, and to achieve fine-grained
management of the power grid.
2.1. Run Control Class Indicators
Operation control class indicators should be recorded in
real time: indicator reports and quantified grid index
statistics are generated automatically from the EMS and
other automation systems so that the relevant analysis
can be checked.
Dispatchers on duty should be able to view the operation
control class indicators and their basic content in real time,
in order to adopt appropriate control measures and
continuously improve the level of grid control. These
include the following indicators:
1) Frequency indicators:
Include the highest and lowest frequency, the running
time during which the frequency is outside the 50 ± 0.2 Hz
limit, and the pass rate for 50 ± 0.1 Hz.
2) Tie-line indicator:
Including the tie-line A1/A2 or CPS1/CPS2 more lim-
3) Important cross-section power flow indicators:
Include the cumulative time, and its percentage of total
calendar time, during which the power flow on an important
cross-section exceeds its stability limit; and the total time
during which that power flow is between 90% and 100% of
the limit, also as a percentage of total calendar time.
4) Standby indicator:
Include: the maximum and minimum percentage ratio of
real-time spinning reserve capacity to the daily 96-point
load; the percentage ratio of the number of frequency
regulation (FM) units to the number of operating units;
the total available capacity of FM units in service; the
minimum spinning reserve capacity during peak hours; and
the maximum downward spinning capacity during
low-load hours.
5) Voltage indicators:
Include the voltage pass rate of the plants and stations
under the jurisdiction; the proportion of those plants and
stations whose voltage falls below the lower limit during
peak hours; and the proportion whose voltage exceeds the
upper limit during low-load hours.
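As an illustration of how the frequency indicators listed above might be computed, the following is a minimal sketch. It assumes frequency samples taken at a fixed interval; the function name, the sampling period and the small floating-point tolerance are assumptions made for this example and are not taken from the system described in this paper.

```python
from typing import Dict, Sequence

NOMINAL_HZ = 50.0
EPS = 1e-9  # tolerance so that values on a band edge are not lost to rounding


def frequency_indicators(samples_hz: Sequence[float],
                         sample_period_s: float = 1.0) -> Dict[str, float]:
    """Compute basic frequency indicators from equally spaced samples.

    The 0.2 Hz limit and 0.1 Hz pass band follow the text; the sampling
    scheme itself is an assumption of this sketch.
    """
    if not samples_hz:
        raise ValueError("at least one frequency sample is required")
    # Samples outside 50 +/- 0.2 Hz count towards the over-limit running time.
    over_limit = sum(1 for f in samples_hz if abs(f - NOMINAL_HZ) > 0.2 + EPS)
    # Samples inside 50 +/- 0.1 Hz count towards the pass rate.
    passed = sum(1 for f in samples_hz if abs(f - NOMINAL_HZ) <= 0.1 + EPS)
    return {
        "max_hz": max(samples_hz),
        "min_hz": min(samples_hz),
        "over_limit_seconds": over_limit * sample_period_s,
        "pass_rate_percent": 100.0 * passed / len(samples_hz),
    }
```

For example, `frequency_indicators([50.0, 50.05, 49.85, 50.25])` treats only the last sample as over the limit and the first two as passing.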
2.2. Scheduling Class Indicators
1) Load characteristic indicators:
Include: maximum / minimum load; the maximum load
simultaneity rate, which is the ratio of the whole-network
maximum load to the sum of the maximum loads of the
provinces and cities (regions or control areas); the
maximum rise and fall rates of the 96-point load at
substations of 220 kV and above in each province and city
(region or control area); the number of points at which the
96-point load forecast deviates from the actual load by
more than ±2% in each province (or control area); and so on.
2) Power balance indicators:
Include: the completion of the daily and monthly power
supply plans; the completion of the plans of selected power plants.
3) Standby indicator:
Include: whether the spinning reserve meets the
regulations, the status of the emergency reserve that can be
brought up within 10 minutes, and the status of the
load-control reserve available within 30 minutes.
4) Maintenance plan indicators:
Include: the completion of scheduled maintenance and
the rationality of the maintenance plan.
5) Indicators of low-frequency load shedding:
Include: the actual shedding capacity of each
low-frequency load shedding partition, and whether the
actual shedding capacity covers the largest possible N-1
component fault in each region.
6) System automatic safety device index:
Include: the adaptability of installation strategies, the
rationality of device configuration.
7) Voltage indicators:
Include: the rationality of the reactive power arrangement
at central points, and the average voltage of the plants and
stations under the jurisdiction during peak and low-load hours.
8) Important indicators of the transmission section:
Include: the number of times and total time the power flow
runs over the stability limit, and the number of times and
total time the power flow runs at 80% to 100% of the limit.
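Two of the load characteristic indicators above lend themselves to a short worked example. The sketch below assumes that each region supplies a load curve of equal length (96 points per day in the text) and that forecast deviation is measured relative to the actual load; both function names and the relative-deviation convention are assumptions of this example.

```python
from typing import Sequence


def simultaneity_rate(regional_curves: Sequence[Sequence[float]]) -> float:
    """Whole-network maximum load divided by the sum of regional maximum loads.

    The network load at each point is taken as the sum of the regional loads
    at that point, which is an assumption of this sketch.
    """
    if not regional_curves:
        raise ValueError("at least one regional load curve is required")
    network_curve = [sum(point) for point in zip(*regional_curves)]
    return max(network_curve) / sum(max(curve) for curve in regional_curves)


def forecast_deviation_points(forecast: Sequence[float],
                              actual: Sequence[float],
                              limit: float = 0.02) -> int:
    """Count the points where the forecast deviates from actual load by more than the limit."""
    if len(forecast) != len(actual):
        raise ValueError("forecast and actual curves must have the same length")
    return sum(1 for f, a in zip(forecast, actual) if abs(f - a) / a > limit)
```

A simultaneity rate below 1 indicates that the regional peaks do not all occur at the same time.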
2.3. Use of KPI
Once the Key Performance Indices (KPI) for the business
have been identified, they can be filtered down to the
analyzers. KPI usually have these three functions:
1) Reflect the Targets
Where possible offer incentives linked to targets, and
encourage analyzers to involve staff in the setting of their
targets. For example, a sales analyzer may set the sales
team targets of a certain number of new contacts per
week or visits per month. A production manager might
set targets for output, reject rate or work breaks.
2) Find the Cause
Set review dates and compare against previous figures,
your business plan, budget or other agreed standard. If
you spot an anomaly or problem, the KPI will help to
backtrack and pinpoint the cause.
3) Report regularly
Communicate performance figures regularly. Keeping
employees updated will encourage them to focus on
meeting or exceeding their KPI.
2.4. Statistical Analysis of Categories of Indicators
Statistical analysis indicators are statistics computed over
a period of time on the previous two categories of
indicators.
1) Total statistics:
The purpose is to compute the cumulative quantity of
certain indicators over a period of time, such as: the time
(in seconds) the frequency was out of limits, the cumulative
number of unqualified days and failure time, the number of
maintenance tickets and operating tickets executed
monthly, power grid network losses, the number of
unplanned generator outages, and the frequency of fault
trips of power lines, bus, together (main).
2) Mean / extreme value statistics:
The purpose is to compute the maximum, the minimum
or the average of indicators over a period of time, such as:
the largest number of operating tickets in a single day, the
96-point load characteristic curve, together (main)
maximum load change rate, the water consumption rate of
directly dispatched hydropower plants, the coal
consumption rate of directly dispatched thermal power
plants, the net loss of the high-voltage transmission
network, the fast-protection fault clearing rate for lines of
220 kV and above, the number of power cuts and the
average daily capacity, and so on.
3) Percentage statistics:
The purpose is to compute the ratio between the totals of
two types of indicators over a period of time, such as: the
correct operation rate of security control devices, the
tie-line bias and frequency control pass rate, the voltage
pass rate, load forecasting accuracy, the pass rate of the
low-frequency load shedding control capacity, the
operation rate of unit PSS, the operation rate of unit AGC,
and the qualified rate of operating tickets.
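The three kinds of statistics described above (totals, mean / extreme values, and percentages) can be sketched in a few lines. The function names and the sample quantities mentioned afterwards are hypothetical and only illustrate the arithmetic.

```python
from statistics import mean
from typing import Dict, Sequence


def total_statistic(values: Sequence[float]) -> float:
    """Cumulative quantity of an indicator over a period (total statistics)."""
    return sum(values)


def extreme_statistics(values: Sequence[float]) -> Dict[str, float]:
    """Maximum, minimum and average of an indicator over a period."""
    if not values:
        raise ValueError("at least one value is required")
    return {"max": max(values), "min": min(values), "mean": mean(values)}


def percentage_statistic(qualified: float, total: float) -> float:
    """Ratio between two totals, as a percentage (for example a pass rate)."""
    if total == 0:
        raise ValueError("total must be non-zero")
    return 100.0 * qualified / total
```

For instance, summing daily out-of-limit seconds gives a monthly total, and dividing qualified operating tickets by all operating tickets gives a qualified rate.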
3. Scheduling Operation Analysis Data
3.1. Scheduling Integrated Data Warehouse
All the available data relative to the supplier’s bidding
can be divided into three classes:
The data hub is an online system designed for statistical
analysis and decision support applications. It can meet all
the requirements of decision support and online analytical
applications. This data hub is called the data warehouse platform.
The comprehensive dispatching analysis and data mining
platform is responsible for collecting, at regular intervals,
all kinds of the required dispatching operation data and
indicators, including:
1) The basic information of network parameters, power
plants and units.
2) Power generation, by the power, the load plan and
3) Power line flow data and node voltage data.
4) Planned and actual data of local power plants.
5) AGC unit data and various assessment indicator data.
6) Market transaction and load forecast data.
7) Transaction reporting and transaction data.
8) All kinds of information on the regional power market.
9) Regional power grid parameters and inter-provincial
trade data.
The data warehouse is a new application of database
technology; so far, the data warehouse still relies on a
relational database management system to manage the data.
4. OLAP Multidimensional Data Analysis and
A data warehouse contains a lot of data extracted from a
number of databases, but these data deliver their proper
value only when the right tools are chosen and used
effectively.
The data mining methods for analyzing the data in the data
warehouse include online analytical processing (OLAP),
association rule mining, decision tree analysis, cluster
analysis and others. Shanghai power system mainly uses
online analytical processing (OLAP)
4.1. Dispatching Multi-dimensional Analysis and
Multi-dimensional data extraction and OLAP data analysis
can be performed on all kinds of data in the dispatching
data warehouse, as shown in Figure 1:
1) Time dimension: Data classification according to
year, quarter, month, week, day, hour and minute.
2) Period dimension: Data classification according to
the three periods: peak, trough, and waist load.
3) Regional dimension: Data classification according
to different regions.
4) Plant dimensions: Data classification according to
power plant, plant type (bid/peaking/FM/self) and power
generation group.
5) Unit dimensions: Data classification depending on
the capacity of the unit.
6) Line dimensions: Data classification depending on the
line and the line type.
7) Substation dimensions: Data classification according
to the type of substation and the substation.
8) Weather dimensions: Data classification according
to temperature, humidity, sunny or cloudy weather, and
other criteria.
9) Custom dimension: New data classifications can be
created flexibly according to the needs of dispatching
analysis, such as the level of the plant load factor, power
frequency, peak and valley levels and any other levels of
[Figure 1 labels: time dimension (year / quarter / month / week / day, hours / minutes); period (peak / waist / valley); regional dimension; group / plant; different transmission lines; applied to basic business data and various types of indicators.]
Figure 1. Multi-dimension analysis for power grid data.
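The dimension-based classification described above corresponds to an OLAP roll-up: a measure is aggregated over any chosen subset of dimensions. The following is a minimal in-memory sketch, not the data warehouse implementation used in the paper; the record fields (`month`, `period`, `region`, `energy_mwh`) and the sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Sequence, Tuple


def roll_up(records: Iterable[Mapping[str, object]],
            dimensions: Sequence[str],
            measure: str) -> Dict[Tuple[object, ...], float]:
    """Sum one measure over the chosen dimensions (a simple OLAP roll-up)."""
    totals: Dict[Tuple[object, ...], float] = defaultdict(float)
    for record in records:
        # The key is the combination of dimension values for this record.
        key = tuple(record[d] for d in dimensions)
        totals[key] += float(record[measure])
    return dict(totals)


# Hypothetical fact records; field names and values are illustrative only.
records = [
    {"month": "2013-01", "period": "peak", "region": "north", "energy_mwh": 120.0},
    {"month": "2013-01", "period": "valley", "region": "north", "energy_mwh": 50.0},
    {"month": "2013-01", "period": "peak", "region": "south", "energy_mwh": 100.0},
    {"month": "2013-02", "period": "valley", "region": "south", "energy_mwh": 40.0},
]

by_period = roll_up(records, ["period"], "energy_mwh")
by_month_region = roll_up(records, ["month", "region"], "energy_mwh")
```

Changing the `dimensions` argument switches the view of the same data, which is the idea behind slicing by time, period, region, plant or a custom dimension.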
4.2. OLAP-based Reporting System
Previous power system analysis was based on simulation
performed in advance with a variety of control models. The
dispatching data warehouse platform established by this
project instead aims to statistically analyze and mine a large
volume of historical data, and to find potential operating
laws from long-term accumulation. This could not be
achieved with the previous analysis methods and models.
Power grid analysis is divided into daily, monthly and
yearly analysis, in accordance with the requirements for
refined grid operation. Scheduling class and statistical
analysis class indicators should be analyzed monthly and
yearly. The reporting system should be based on real-time
automation systems such as the EMS, which automatically
generate indicator reports and quantified grid index
statistics so that the relevant power grid analysis can be
checked. Dispatchers on duty should be able to view the
basic content and the operation control class indicators in
real time, in order to adopt appropriate control measures
and continuously improve the level of grid control.
1) Day analysis: The aim is to find the grid operation
control problems and deviations of the previous day,
identify their causes, and remedy the problems in the next
day's operation, so as to continuously improve the level of
operational control.
2) Monthly analysis: Mainly through the analysis of
operation plan class and statistical analysis class indicators,
verify the adaptability of the operating mode for the month,
study the problems and deviations in depth, and improve
them in the next month's operation mode arrangements; at
the same time, make recommendations and comments for
other departments and continuously improve the level of
scheduling refinement.
3) Annual analysis: The aim is to make suggestions
and comments for company planning and technical
innovation, mainly through the analysis of statistical
analysis indicators combined with the annual operational
mode. Then it provides support for the operational mode in the
The operator is supposed to strengthen communication
and coordination with the company's development,
infrastructure, production, safety supervision, marketing,
trading and other relevant departments, and implement
corrective measures according to the important issues
revealed by the power grid analysis.
The purpose is to obtain the patterns and trends of grid
operation through the statistics and analysis of power grid
operation, to serve as summary information for the power
grid, and to support the preparation of and decisions on
future operation modes.
5. Dispatching Data Mining Analysis System
5.1. Dispatching Data Mining Analysis Methods
Data mining is the extraction of implicit, previously
unknown, but potentially useful information and
knowledge from a large amount of incomplete, noisy,
fuzzy and random data.
The methods and procedures of data mining for grid
dispatching data are as follows:
1) Statistical analysis of indicators
Various indicators can be created for analysis according to
the needs of data analysis, including:
a) Indicator statistics: aggregate value, maximum /
minimum / average, expectation, variance, standard
deviation.
b) Indicator scoring and early warning: all types of
indicators can be scored based on operating experience,
and early warning thresholds can be determined on the
basis of statistics.
2) Multivariate correlation analysis
For any two or more groups of observed quantities, the
following methods can be used to study their association
and mutual influence:
a) Qualitative analysis: Changes in the curves and their
relevance can be observed using data visualization
technology, which displays multiple dimensions of data by
category in 2D or 3D graphics with rotation and
perspective.
b) Quantitative analysis: Compute relevant statistics
between the observed quantities and their influencing
factors, such as the correlation coefficient and covariance.
3) Other analytical methods
Data mining itself reflects the idea of using appropriate
mathematical methods to explore the laws in the data.
Data mining is not restricted to one or a few methods of
analysis. On the contrary, it is broadly defined to include
methods from statistics, signal processing, information
science, computer graphics, artificial intelligence and other
areas; different methods can be used depending on the
data content.
Commonly used methods include: cluster analysis,
classification, prediction, regression, evolution analysis,
correlation analysis and other methods.
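The quantitative steps above (covariance, correlation coefficient, and statistics-based early warning) can be illustrated with the Python standard library. This is a minimal sketch: the population forms of covariance and standard deviation, and the rule of flagging values more than k standard deviations from the mean, are assumptions of this example, since the text does not specify how thresholds are derived.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def covariance(x: Sequence[float], y: Sequence[float]) -> float:
    """Population covariance of two equally long series."""
    if len(x) != len(y) or not x:
        raise ValueError("series must be non-empty and of equal length")
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two series."""
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance(x, y) / (sx * sy)


def early_warnings(values: Sequence[float], k: float = 2.0) -> List[int]:
    """Indices of values further than k standard deviations from the mean.

    The mean +/- k*sigma threshold is an assumed rule for this sketch.
    """
    m, s = mean(values), pstdev(values)
    return [i for i, v in enumerate(values) if abs(v - m) > k * s]
```

A correlation coefficient close to +1 or -1 between a candidate factor and an indicator suggests a strong linear relationship that is worth examining further.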
5.2. The Practical Application of Dispatching
Power dispatching data mining can be applied to the
following practical issues.
1) The analysis of the reasonableness of the electric
power grid power system.
Analysis methods: analyze the correlation between local
power generation plans and load, and the correlation
between tie-line plans and load.
2) Analysis of electricity contracts in real time
Analysis purpose: to ensure the fairness of power plant
contract implementation and scheduling.
Analysis methods: monitor the real-time completion of
the electricity contracts of all power plants; monitor the
deviation between the planned and actual output of all
power plants; rank electricity completion by the
dispatchers' three shifts and by each team.
3) Analysis of various factors in CPS
Analysis purpose: analyze the impact of various factors
on CPS indicators, in order to improve scheduling policy.
Analysis methods: list the CPS-related factors, compute
the correlation between each factor and the CPS indicators
along the time dimension, rank the factors by the strength
of the relation, and give scheduling control
recommendations to improve
The dispatching operational analysis and data mining
system described in this paper implements and enforces
the scheme of dispatching operational analysis. The
calculation and analysis of the dispatching indicators gives
power grid dispatchers a more comprehensive, clear and
quantitative understanding of grid operation. Mining the
historical data facilitates the understanding of some
complex, interrelated issues in dispatching operation. The
authors hope that this research practice can contribute to
the work of dispatching operation.
 J. Q. Zhao, S. Yan, X. Xiao and Y. Z. Zhou, “Design and
Implementation of Double-core Redundant Power Grid
Dispatch Automation System,” Automation of Electric
Power Systems, Vol. 33, No. 21, 2009, pp. 101-103.
 S. M. Wang, S. N. Wu, D. L. Zhou and W. C. Wu, “Re-
search on Dispatch Training Base Construction Scheme
for Jiangxi Power Grid,” Electric Power, Vol. 42, No. 4,
2009, pp. 70-74.
 “ISO Market Monitoring & Information Protocol,” Cali-
fornia Independent System Operator Corporation FERC
Electric Tariff First Replacement, Vol. 2, No. 497.
 H. L. Jin and H. Liu, “Research on visualization tech-
niques in data mining,” Proceedings of the 2009 Interna-
tional Conference on Computational Intelligence and
Software Engineering, 2009, p. 3.
 A. Koretsune, S. Aoki, T. Konzo, H. Tsuji, S. Shimano
and E. Mimura, “DEA-based data mining for energy
consumption,” 10th IEEE International Conference on
Emerging Technologies and Factory Automation, Vol. 1,
2005, p. 4.