Synchrophasors are state-of-the-art measuring devices that sense grid parameters such as voltage, current, and frequency at a high sampling rate. This paper presents an approach to visualize and analyze the smart-grid data generated by synchrophasors using a visualization tool and a density-based clustering technique. A MATLAB-based circle representation tool is used to visualize the real-time phasor data generated by a smart-grid model that mimics a synchrophasor. A density-based clustering technique is also used to cluster the phasor data with the aim of detecting contingency situations such as bad data, various fault types, and deviations in frequency, voltage, or current values, for better situational awareness. The paper uses data from an IEEE fourteen bus system test-bed modeled in MATLAB/SIMULINK to aid system operators in carrying out various predictive analytics and decisions.
Wide Area Management Systems (WAMS) are being developed by upgrading the existing power grids to enhance the abilities of the grids. Synchrophasors are the units that have the ability to measure various parameters such as voltage, current, and frequency of the lines at a sampling rate of 30 to 120 samples per second [
The IEEE fourteen Bus Test Case [
and load points which are added to and removed from specific buses during the simulation period. Several conditions (such as fault types, topology, generation, and load changes) are introduced into the IEEE test system. The colored blocks in
Data visualization (DV) is the study of the visual representation of data. The main purpose of DV is to transform complicated information into a visually interesting representation for human observation and understanding. Generally, synchrophasors stream the grid data continuously to the operations center at a very high sampling rate. In order to make use of streaming phasor data, a tool is needed that meaningfully represents the data streamed in from the synchrophasors for the system operators to monitor. The authors here develop a tool based on MATLAB to monitor the grid data. The representation can take any form, but it needs to communicate the information in all the data precisely [
The circle representation displays an angular value in conjunction with a scalar quantity. For example, a voltage phasor at any point has a phase angle as well as a magnitude. The phase angle generally varies from −180 to +180 degrees (−π to +π radians) and the magnitude is always positive. Thus, drawing the voltage phasor on a circle is a natural form of DV. First three columns of
Data Mining (DM) is the process of extracting useful information from data sets. DM is used in various disciplines such as medicine, engineering and technology, the sciences, and many more. DM is mainly categorized into two types: data classification and data clustering [
A-phase magnitude | B-phase magnitude | C-phase magnitude | A-phase angle | B-phase angle | C-phase angle |
---|---|---|---|---|---|
0.8290 | 0.8291 | 0.8291 | 3.418616 | −116.5805 | 123.4187 |
0.9817 | 0.9817 | 0.9817 | −5.93905 | −125.9382 | 114.0611 |
1.099 | 1.099 | 1.099 | −11.69568 | −131.6951 | 108.3043 |
0.9817 | 0.9817 | 0.9817 | −5.939051 | −125.9382 | 114.0611 |
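As a minimal sketch of the idea behind the circle representation, each magnitude/angle pair in the table above can be mapped to a point in the plane. The sketch is written in Python rather than the authors' MATLAB tool, the function names are hypothetical, and it assumes the tabulated angles are in degrees.

```python
import math

def phasor_to_xy(magnitude, angle_deg):
    """Map a phasor (magnitude, angle in degrees) to Cartesian coordinates."""
    theta = math.radians(angle_deg)
    return magnitude * math.cos(theta), magnitude * math.sin(theta)

def angle_between(a_deg, b_deg):
    """Smallest separation between two angles, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)

# First row of the table: A-, B-, and C-phase (magnitude, angle) pairs.
row = [(0.8290, 3.418616), (0.8291, -116.5805), (0.8291, 123.4187)]

for magnitude, angle in row:
    x, y = phasor_to_xy(magnitude, angle)
    print(f"|V|={magnitude:.4f}  angle={angle:10.4f} deg  ->  x={x:+.4f}, y={y:+.4f}")

# Separation between the A and B phases, and between the B and C phases.
print(angle_between(row[0][1], row[1][1]), angle_between(row[1][1], row[2][1]))
```

For this row the three phases come out roughly 120 degrees apart, which is the pattern a balanced three-phase system would be expected to show on the circle.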
advantageous over the classification techniques if the data properties are not known. These clustering techniques are also adaptable to changes and help single out useful features that distinguish dissimilar clusters. If one wants to discover clusters or groups of data with arbitrary shapes, then density-based algorithms are among the best methods available. These typically regard clusters as dense regions of objects in the data space that are separated by regions of low density (representing noise). The algorithm this paper considers is Density-Based Spatial Clustering of Applications with Noise (DBSCAN). A MATLAB function for DBSCAN is utilized for our data clustering.
DBSCAN is a density-based clustering algorithm that forms clusters based on the density of data points [
1) Minimal requirements of domain knowledge to determine the input parameters.
2) Discovery of arbitrarily shaped clusters, because the shape of clusters in spatial databases is not necessarily circular.
3) Good efficiency on large databases.
Besides the data to be clustered, DBSCAN takes two input parameters: ε (Eps) and MinPts. DBSCAN works with the following methodology:
Step 1: Initially, select a point X arbitrarily.
Step 2: Retrieve all the points that are density reachable from X w.r.t. ε (Eps) and MinPts.
Step 3: If point X is a core point, a cluster is formed.
Step 4: If point X is a border point and no points are density reachable from X, then DBSCAN visits the next point of the database.
Step 5: The process from Step 1 to Step 4 is continued until all the points have been processed or no new point can be added to any cluster.
All the points are classified into three types: core, border, and noise points.
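The three point types can be illustrated with a small Python sketch. It is not the MATLAB function used in this work, and the function names and toy data are hypothetical. It assumes Euclidean distance and counts a point as a member of its own Eps-neighborhood, which is one common convention.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def classify_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise'."""
    neighbors = [region_query(points, i, eps) for i in range(len(points))]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    kinds = []
    for i, nb in enumerate(neighbors):
        if is_core[i]:
            kinds.append("core")      # dense neighborhood
        elif any(is_core[j] for j in nb):
            kinds.append("border")    # not dense, but within eps of a core point
        else:
            kinds.append("noise")     # neither dense nor near a core point
    return kinds

# Toy 2-D data (hypothetical values, not taken from the test bed).
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.22, 0.0), (2.0, 2.0)]
print(classify_points(pts, eps=0.15, min_pts=3))
# -> ['core', 'core', 'core', 'border', 'noise']
```

The fourth point has only one neighbor besides itself, so it is not a core point, but it lies within Eps of a core point and is therefore a border point. The isolated fifth point is noise.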
The pseudo code for DBSCAN can be written as follows:
Pseudo code for DBSCAN clustering
Input:
X = Set of data points to be clustered.
Eps = Maximum distance between two points for them to be considered neighbors (D).
MinPts = Minimum number of points in the Eps-neighborhood of a point for it to be a core point (Pt).
Output: L =
DBSCAN (Input Set (X), Pt, D)
    foreach xi in the Input set do
        if (xi is not yet labeled) then
            if (xi is a core point) then
                generate a new clusterID
                label xi with clusterID
                expandCluster (xi, X, Pt, D, clusterID)
            else
                label (xi, NOISE)
            end
        end
    end
end

expandCluster (xi, X, Pt, D, clusterID)
    put xi in the seed queue
    while the queue is not empty
        extract c from the queue (where c ϵ X)
        retrieve the neighborhood (eps) of c
        if there are at least minPts neighbors
            foreach neighbor n
                if n is labeled NOISE
                    label n with clusterID
                end
                if n is not labeled
                    label n with clusterID
                    put n in the queue
                end
            end
        end
    end
end
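The pseudo code above can be turned into a short runnable sketch. The version below is written in Python and is not the MATLAB function used for the results in this paper. The names are hypothetical, the distance is assumed to be Euclidean, and a point is counted in its own neighborhood.

```python
import math
from collections import deque

NOISE = -1

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE (-1)."""
    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)          # None means "not yet labeled"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:          # already in a cluster or marked as noise
            continue
        if len(neighbors(i)) < min_pts:    # not a core point
            labels[i] = NOISE              # may later be relabeled as a border point
            continue
        cluster_id += 1                    # start a new cluster from core point i
        labels[i] = cluster_id
        queue = deque([i])
        while queue:
            c = queue.popleft()
            nbrs = neighbors(c)
            if len(nbrs) < min_pts:        # c is a border point: do not expand from it
                continue
            for n in nbrs:
                if labels[n] == NOISE:     # noise reachable from a core point -> border
                    labels[n] = cluster_id
                elif labels[n] is None:
                    labels[n] = cluster_id
                    queue.append(n)
    return labels

# Toy 2-D data (hypothetical values): two dense groups and one isolated point.
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.22, 0.0),
       (2.0, 2.0),
       (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
print(dbscan(pts, eps=0.15, min_pts=3))
# -> [0, 0, 0, 0, -1, 1, 1, 1]
```

The neighborhood search here is a plain linear scan, which keeps the sketch short but makes it quadratic in the number of points. A spatial index would be the usual choice for large data sets.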
This section describes the case studies carried out with DBSCAN on the data generated from the IEEE fourteen bus system. The cases examine how DBSCAN clusters the data under different variations in the fourteen bus model. The four case studies considered here are:
1) Normal steady state;
2) High load condition;
3) Light load condition;
4) Fault conditions.
The test bed is run for 1 second for each case with a sampling interval of 0.02 seconds (50 samples per second). Therefore, the granularity of the data points is very high. The data generated from the model mimics the synchrophasor data and is visualized through the circle representation. It is then also provided as input to the DBSCAN clustering algorithm. The initial parameters for DBSCAN depend on various characteristics of the data, and it is difficult to choose values for them a priori. The minimum number of points (MinPts) and the distance threshold (Eps) are the main tuning terms, and choosing suitable values is what yields meaningful output from DBSCAN.
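One widely used heuristic for picking Eps, which is not part of the procedure reported in this paper and is shown here only as an illustrative assumption, is to sort the distance from every point to its k-th nearest neighbor and look for a sharp rise in the sorted values. The Python sketch below uses hypothetical names and toy data, with k chosen in relation to MinPts.

```python
import math

def k_distances(points, k):
    """Sorted distance from each point to its k-th nearest neighbor (self excluded)."""
    out = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(d[k - 1])
    return sorted(out)

# Toy 2-D data (hypothetical values): a tight group plus one outlier.
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (2.0, 2.0)]
for d in k_distances(pts, k=2):
    print(f"{d:.3f}")
```

In this toy set the first four values are about 0.1 and the last one jumps above 2.5, so an Eps slightly above the flat part of the curve would separate the dense group from the outlier.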
Under normal conditions, all the systems run normally and DBSCAN clusters the data according to its principles of core (green), border (yellow), and noise (red) data points. The left graph in
In real-world systems, heavy load conditions prevail most of the time. The power grids are often overloaded, with demand exceeding the generation capacity. The right side graph in
Lightly loaded situations occur when demand decreases while the generation levels remain high. The left side graph in
points in this light-load condition are shifted towards higher voltages. In comparing
A fault condition in the test system refers to a three-phase-to-ground fault that occurs between 0.6 and 0.7 seconds on the two transmission lines in the system. The right side graph of
This paper emphasizes the need for data mining in the smart grid. An application of the density-based clustering algorithm DBSCAN has been proposed, and different case studies have been developed using the IEEE test system in MATLAB to study the DBSCAN clustering characteristics for smart-grid data. The authors are also developing a data analysis framework for the smart grid with wide-ranging data mining and visualization techniques that will be made available to system operators.
This research work was made possible by grant support from ND EPSCoR (UND0014140) and the Office of the VP (21418-4010-02294).