In this paper, a combination of data clustering and artificial intelligence techniques is used to predict incoming solar radiation on a daily basis. The proposed data clustering technique, known as Perceptually Important Points, groups time-series data into clusters separated by key characteristic points; these clusters are later used as training data for an artificial neural network. The network used is a Focused Time-Delay Neural Network, and the prediction results are analyzed using the Mean Absolute Percentage Error.
Fossil fuels such as crude oil, coal and gas are currently the world's main sources of energy. Fossil fuel reserves are expected to be depleted in the near future, with studies indicating that oil and gas could be depleted as soon as 35 to 37 years from the time of writing [
To slow the depletion of fossil fuels, renewable sources of energy are being used as alternatives. Solar energy is one of the alternative sources being researched, as the influx of solar radiation on the Earth's surface is several orders of magnitude larger than the global power consumption of humanity as a whole [
The ability to accurately predict incoming solar radiation is an important factor in improving the efficiency of a solar energy conversion system. One method of prediction is the empirical model, a technique which uses meteorological parameters as inputs to predict future values of solar radiation.
The main shortcomings of the empirical model are its focus on long-term prediction, its reliance on existing meteorological data, as well as its inability to identify abnormalities and account for sudden changes in data.
This paper proposes an alternative to empirical models: a combination of pattern recognition through data clustering techniques [
Perceptually Important Point (PIP) is a concept introduced by Chung FL, Fu TC, Luk R, and Ng V [
The process behind identifying PIPs is as follows:
1) The first and last points of the time-series graph are set as the start and end points.
2) The point on the graph with the greatest vertical distance from the straight line joining the start and end points is set as the first PIP. It is also used as the end point of the first cluster and the start point of the second cluster.
3) Subsequent PIPs are obtained by finding the point with the greatest vertical distance from the straight line joining the start and end points of each cluster, forming new clusters.
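The vertical distance used in steps 2 and 3 can be written out explicitly. The expression below is a sketch under the assumption that the start point is P1 = (x1, y1), the end point is P2 = (x2, y2), and the candidate point on the time-series is Pn = (xn, yn); the absolute value is an assumption made here so that points above and below the line are treated alike.

```latex
\[
d(x_n) \;=\; \left|\, y_n - \left( y_1 + \frac{y_2 - y_1}{x_2 - x_1}\,(x_n - x_1) \right) \right|,
\qquad x_1 < x_n < x_2
\]
```

The PIP of a cluster is then the point Pn that maximizes d over the interior of that cluster.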
Algorithmically, the process behind segmenting the graph through PIPs can be explained as follows (
Points P1 and P2 on the time-series are established, and the straight line joining them is obtained from the gradient between the two points.
A point Pc on this line and a point Pn on the time-series with the same x-axis value are obtained. The first x-axis value used is x1 + 1.
The difference between the y-axis values of points Pn and Pc is obtained and set as the distance d.
The x-values of Pc and Pn are incremented by one step and the distance d is calculated again. If the new value of d is greater than the previous value, the new value is stored. Otherwise, the new value is discarded and the old value of d is kept as the greatest vertical distance.
The x-values of Pc and Pn are repeatedly incremented until P2 is reached. The point Pn at which d is greatest is taken as the PIP.
The time-series can subsequently be segmented into cluster 1 (data from points P1 to Pn) and cluster 2 (data from points Pn to P2) (
The process is repeated recursively for cluster 1 and cluster 2 to obtain points Pm and Pl, which, together with Pn, segment the time-series into four clusters: P1 to Pm, Pm to Pn, Pn to Pl and Pl to P2.
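The steps above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: it assumes evenly spaced samples indexed from 0, uses the absolute vertical distance, and the function names `find_pip` and `pip_indices` as well as the `depth` parameter (the number of recursion levels) are hypothetical.

```python
from typing import List, Optional, Sequence


def find_pip(y: Sequence[float], start: int, end: int) -> Optional[int]:
    """Return the interior index with the greatest vertical distance from
    the straight line joining (start, y[start]) and (end, y[end]).
    Returns None when the cluster has no interior points."""
    if end - start < 2:
        return None
    slope = (y[end] - y[start]) / (end - start)
    best_index, best_distance = None, -1.0
    for i in range(start + 1, end):
        line_value = y[start] + slope * (i - start)  # point Pc on the line
        distance = abs(y[i] - line_value)            # distance d to point Pn
        if distance > best_distance:                 # keep only a strictly larger d
            best_index, best_distance = i, distance
    return best_index


def pip_indices(y: Sequence[float], start: int, end: int, depth: int) -> List[int]:
    """Recursively split the cluster [start, end] and return the PIP
    indices in ascending order. `depth` is the number of recursion levels."""
    if depth == 0:
        return []
    pip = find_pip(y, start, end)
    if pip is None:
        return []
    left = pip_indices(y, start, pip, depth - 1)
    right = pip_indices(y, pip, end, depth - 1)
    return left + [pip] + right
```

For the made-up series `[0, 1, 5, 2, 1, 4, 0]`, `pip_indices(series, 0, 6, 2)` returns `[1, 2, 5]`: index 2 is the first PIP, and indices 1 and 5 split the two resulting clusters, giving four clusters in total.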
The algorithm for training and utilizing a Focused Time-Delay Neural Network (FTDNN) can be obtained from MathWorks’ Neural Network Toolbox [
All values or names within angle brackets (<>) are subject to change as per user specifications.
Firstly, the cluster dataset has to be loaded and converted into a time sequence using the following commands:
load <dataset>
y = y(1:<n>);
y = con2seq(y);
The FTDNN is then created using the following commands, with the tapped delay lines, hidden layer neurons and number of epochs being variable depending on optimal parameters:
<net> = newfftd(y,y,[1:<d>],<h>);
<net>.trainParam.show = <show>;
<net>.trainParam.epochs = <epochs>;
The prediction begins on the value in the series after the delay, and the initial values in the delay are also required to be loaded:
p = y(<d+1>:end);
t = y(<d+1>:end);
i = y(1:<d>);
The network is then trained to perform one-step-ahead prediction:
<net> = train(<net>,p,t,i);
The network is then ready for use through the calling of the network as a function:
<yp> = <net>(<input>);
The resulting prediction can subsequently be converted for plotting:
<yp> = cell2mat(<yp>);
plot(<yp>);
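To make the role of the tapped delay line concrete, the following Python sketch shows how a series can be arranged into delayed inputs and one-step-ahead targets. It is not part of the MATLAB toolbox workflow above and does not train a network; the function name `make_delay_windows` and the parameter `num_delays` are hypothetical. The idea it illustrates is that the first `num_delays` values only fill the delay line (the role of `i` above), so prediction starts on the value that follows them.

```python
from typing import List, Sequence, Tuple


def make_delay_windows(
    series: Sequence[float], num_delays: int
) -> Tuple[List[List[float]], List[float]]:
    """Arrange a series for one-step-ahead prediction.

    Each input row holds `num_delays` consecutive past values, and the
    matching target is the value that immediately follows that row."""
    values = [float(v) for v in series]
    if not 1 <= num_delays < len(values):
        raise ValueError("num_delays must be between 1 and len(series) - 1")
    inputs = [values[k:k + num_delays] for k in range(len(values) - num_delays)]
    targets = values[num_delays:]  # first target comes right after the delay
    return inputs, targets
```

With the made-up series `[10, 20, 30, 40, 50]` and `num_delays = 2`, the inputs are `[[10, 20], [20, 30], [30, 40]]` and the targets are `[30, 40, 50]`, so each target is predicted from the two readings before it.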
Using the PIP method described earlier, 3 points are obtained. Due to the recursive nature of the PIP algorithm, an odd number of points will always be obtained.
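The odd count can be checked with a short calculation. As a sketch, assume that every cluster is split once at each recursion level and that every cluster still has at least one interior point; the symbols N_k (number of PIPs after k levels) and C_k (number of clusters after k levels) are introduced here for illustration.

```latex
\[
N_0 = 0, \qquad N_k = 2\,N_{k-1} + 1 \;\;\Longrightarrow\;\; N_k = 2^{k} - 1,
\qquad C_k = N_k + 1 = 2^{k}
\]
```

For k = 2 this gives N_2 = 3 PIPs and C_2 = 4 clusters, and N_k is odd for every k of at least 1.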
For this paper, all solar radiation data is obtained from the Geography Weather Station of the National University of Singapore.
As shown in
The first cluster is selected to be the data from minute 405 to minute 705. The point at minute 480 is ignored to reduce the number of clusters, because the data before and after minute 480 follow a similar trend.
The next 17 readings are used to form cluster 2, and the subsequent 20 readings are used to form cluster 3. Readings before cluster 1 and after cluster 3 are ignored because they are approximately zero, indicating that no solar radiation is incident during those periods, and are therefore not required as training data.
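Discarding the near-zero readings at the start and end of the day can be sketched as below. This is an illustrative Python sketch, not the authors' procedure: the function name `trim_near_zero` and the `threshold` parameter are hypothetical, and the text does not state what cutoff counts as "approximately zero".

```python
from typing import List, Sequence


def trim_near_zero(readings: Sequence[float], threshold: float) -> List[float]:
    """Drop leading and trailing readings whose magnitude is at or below
    `threshold`. Readings between the first and last active reading are
    kept, even if some of them dip to zero."""
    active = [i for i, value in enumerate(readings) if abs(value) > threshold]
    if not active:
        return []
    return list(readings[active[0]:active[-1] + 1])
```

For the made-up readings `[0.0, 0.1, 120.0, 350.0, 0.0, 410.0, 90.0, 0.2, 0.0]` with a threshold of `1.0`, the result is `[120.0, 350.0, 0.0, 410.0, 90.0]`. The dip in the middle of the day is kept, since only the edges are trimmed.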
Using MAPE [
The resulting adjusted MAPE values of cluster 1, 2 and 3 are 3.890%, 4.129% and 1.180% respectively.
Although MAPE does not portray the accuracy of the prediction exactly, it is sufficient to show a basic level of competency of the network in performing predictions.
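For reference, the standard Mean Absolute Percentage Error can be computed as in the Python sketch below. This shows only the textbook definition, the mean of |actual − predicted| / |actual| expressed as a percentage; the adjustment behind the "adjusted" values reported above is not specified here and is not reproduced. The function name `mape` is hypothetical.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, in percent.

    Raises ValueError for empty or mismatched inputs, and for any actual
    value of zero, where the percentage error is undefined."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

For the made-up values `actual = [100, 200, 400]` and `predicted = [110, 190, 400]`, the individual errors are 10%, 5% and 0%, so the result is approximately 5%. Because the error is divided by the actual value, readings of zero cannot be scored with this measure.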
In this paper, a system combining the use of data clustering and neural networks is proposed to optimize the prediction of solar radiation.
The paper provides a fundamental level of knowledge on Perceptually Important Points and the focused time-delay neural network that was used in the project.
The methodologies discussed are not limited to the prediction of solar radiation, but can also be applied more generally in other fields, such as the prediction of stock market trends or water current strength for turbines.
The authors declare no conflicts of interest regarding the publication of this paper.
Chan, C.K. and Ler, Y.H. (2018) Prediction of Solar Radiation Using Data Clustering and Time-Delay Neural Network. Journal of Computer and Communications, 6, 91-97. https://doi.org/10.4236/jcc.2018.612009