H. LIN ET AL.
Olkin (KMO) test and Bartlett’s test of sphericity to test whether
the data are fit for FA. KMO values between 0.7 and 0.8 are good,
values between 0.8 and 0.9 are great, and values above 0.9 are
superb (Hutcheson & Sofroniou, 1999). For these samples the KMO
value is 0.732, which falls into the good range, so we are
confident that FA is appropriate for these data.
Bartlett’s test of sphericity tests the null hypothesis that the
original correlation matrix is an identity matrix. For a
satisfactory FA to proceed, some relationships between variables
are needed; in other words, we want this test to be significant,
since a significant test tells us the matrix is not an identity
matrix. For these data, Bartlett’s test is highly significant
(p < .001); therefore, FA is appropriate (Field, 2005).
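The two adequacy checks above can be sketched in Python. This is an illustrative helper using the standard textbook formulas (KMO from anti-image partial correlations; Bartlett’s chi-square from the determinant of the correlation matrix), not the authors’ code, and the synthetic data are a stand-in for the real samples.

```python
import numpy as np
from scipy.stats import chi2

def kmo_and_bartlett(X):
    """KMO statistic and Bartlett's test of sphericity for data X (samples x variables)."""
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)

    # Partial (anti-image) correlations from the inverse correlation matrix.
    R_inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(R_inv))
    A = -R_inv / np.outer(d, d)

    # KMO: off-diagonal squared correlations vs. correlations + partials.
    off = ~np.eye(p, dtype=bool)
    r2 = (R[off] ** 2).sum()
    a2 = (A[off] ** 2).sum()
    kmo = r2 / (r2 + a2)

    # Bartlett: chi-square statistic against the identity-matrix null.
    stat = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    p_value = chi2.sf(stat, df)
    return kmo, stat, p_value

# Stand-in data with a clear two-factor structure plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.5 * rng.normal(size=(200, 6))
kmo, stat, p_value = kmo_and_bartlett(X)
```

For factor-structured data like this, the KMO value lands well above chance and Bartlett’s test is strongly significant, mirroring the decision rule used in the text.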
We adopt principal component analysis to extract factors and the
varimax rotation method to perform factor rotation. Eigenvalues,
the percentage of variance attributable to each factor, and the
cumulative variance of the first 14 factors are shown in Table 2.
From Table 2, we can see that the cumulative variance of the
first 14 factors reaches 91.129%. Retaining only the factors whose
eigenvalues are greater than one, so that adequate information is
maintained, 12 factors ultimately replace the original 24 variables
while 88.769% of the total information is preserved.
According to the rotated component score coefficient matrix,
every factor can be expressed as a linear combination of the
original variables. The main indices (loadings greater than 0.2)
explaining the 12 factors are shown in Table 3. Further, if we use
each factor’s percentage of variance explained as a weight and
sum the 12 weighted factor scores, the composite score of each
sample can be obtained.
From the above analysis, only 12 factors are needed to reflect
88.769% of the information in the original 24 variables, thus
greatly reducing both the dimensionality of the evaluation and
the correlation between variables.
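The extraction and scoring steps can be sketched as follows. This is our own numpy illustration of eigenvalue extraction with the eigenvalue-greater-than-one retention rule and a variance-weighted composite score; the varimax rotation used in the paper is not reproduced, and the data here are synthetic stand-ins for the 92 samples by 24 variables.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(92, 24))             # stand-in: 92 samples x 24 variables
Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize before PCA

R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]         # sort eigenvalues descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

keep = eigvals > 1.0                      # retain factors with eigenvalue > 1
scores = Z @ eigvecs[:, keep]             # factor (component) scores per sample
weights = eigvals[keep] / 24.0            # each factor's share of total variance
composite = scores @ weights              # weighted sum -> composite score
```

The eigenvalues of a correlation matrix sum to the number of variables (24 here), so `weights` is each retained factor’s fraction of total variance, matching the weighting scheme described in the text.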
Back Propagation Neural Networks
Back propagation neural networks (BPNN) are multilayer
feed-forward neural networks (NN) trained with the back-propagation
algorithm. The nonlinear processing ability of BPNN can effectively
process cognitive judgments in complex environments involving
vague, incomplete and conflicting information. It is the most
widely used NN model.
BPNN is a supervised learning algorithm. It is necessarily a
multilayer perceptron (with an input layer, hidden layers and an
output layer). The network structure of BPNN is shown in Figure 1:
the input vector is (x1, x2, ..., xm) and the output vector is
(y1, y2, ..., yn). The learning process can be divided into two
phases: (a) forward propagation, in which information flows through
the input layer, hidden layers and output layer; (b) error back
propagation, in which, if the NN model does not produce the
expected output value at the output layer, the error signal
propagates backward along the original pathway layer by layer,
adjusting the weights and threshold values.
The Kolmogorov theorem for neural networks proves that a fully
trained three-layer BPNN can approximate any continuous function.
Therefore we choose a three-layer BPNN with only one hidden layer.
There is currently no theoretical guidance for selecting the number
of hidden layer nodes. Too many nodes increase the training time
and weaken the network’s generalization and predictive ability,
while too few nodes cannot reflect the correlation between the
follow-up value and the previous value, leaving the model
insufficient. The number of hidden layer nodes can be chosen by
the formula m1 = sqrt(m + n) + a, where m1 is the number of hidden
layer nodes, m is the number of input layer nodes, n is the number
of output layer nodes, and a is a constant between one and ten
(Rafael, 2004).
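The hidden-layer sizing rule m1 = sqrt(m + n) + a can be checked quickly; the example sizes (12 inputs, 1 output) are our own illustration of the admissible range the formula yields.

```python
import numpy as np

# Candidate hidden-layer sizes from m1 = sqrt(m + n) + a for a = 1..10.
m, n = 12, 1                          # e.g. 12 input nodes, 1 output node
candidates = [int(np.sqrt(m + n) + a) for a in range(1, 11)]
print(candidates)                     # admissible hidden-layer sizes
```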
Sample Classification and Neural Networks Design
Choose 10 testing samples from the 92 samples as a test set; in
the test set, each of the five states (not-at-all, a little bit,
moderately, quite-a-bit, extremely) has two samples. The remaining
82 samples serve as the training set.
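The split described above (two test samples per state, 82 left for training) can be sketched as follows; the state labels here are synthetic stand-ins, since the real sample labels are not reproduced in the text.

```python
import numpy as np

rng = np.random.default_rng(3)
states = ["not-at-all", "a little bit", "moderately", "quite-a-bit", "extremely"]
labels = rng.choice(states, size=92)      # stand-in labels for the 92 samples

# Pick two test samples per state (10 total); the rest form the training set.
test_idx = []
for s in states:
    idx = np.flatnonzero(labels == s)
    test_idx.extend(rng.choice(idx, size=2, replace=False))
train_idx = np.setdiff1d(np.arange(92), test_idx)
```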
To examine the difference made by carrying out FA before BPNN,
we design two BPNN structures. We name the former
factor-analysis-back-propagation (FABP) neural networks and the
latter non-factor-analysis-back-propagation (NFABP) neural
networks.
FABP: after FA, 12 factors are obtained. Take the values
of the 12 factors in the training set as input, so the number
of input layer nodes is 12. Take the corresponding composite
scores of the training samples as output, so the number of
output layer nodes is one. According to the formula
m1 = sqrt(m + n) + a and experimental results, we set the
number of hidden layer nodes to eight. The maximum number of
training epochs is 1500.
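As an illustration of the FABP design (12 inputs, 8 hidden nodes, 1 output, up to 1500 training epochs), here is a minimal numpy sketch of the two-phase forward/back-propagation loop. It is not the authors’ implementation: the training data are synthetic, and the sigmoid hidden layer, linear output, learning rate and weight initialization are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

m, m1, n = 12, 8, 1                      # input, hidden, output layer sizes (as in FABP)
X = rng.normal(size=(82, m))             # stand-in training inputs (82 samples)
y = X.sum(axis=1, keepdims=True)         # stand-in training targets

W1 = rng.normal(scale=0.1, size=(m, m1)); b1 = np.zeros(m1)
W2 = rng.normal(scale=0.1, size=(m1, n)); b2 = np.zeros(n)
lr = 0.01                                # assumed learning rate

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

losses = []
for _ in range(1500):                    # maximum training epochs, as in the paper
    # (a) forward propagation through input, hidden and output layers
    h = sigmoid(X @ W1 + b1)
    out = h @ W2 + b2                    # linear output layer
    err = out - y
    losses.append(float((err ** 2).mean()))
    # (b) error back propagation: gradients flow backward layer by layer
    g_out = 2 * err / len(X)
    g_W2 = h.T @ g_out
    g_h = (g_out @ W2.T) * h * (1 - h)   # chain rule through the sigmoid
    g_W1 = X.T @ g_h
    W2 -= lr * g_W2; b2 -= lr * g_out.sum(axis=0)
    W1 -= lr * g_W1; b1 -= lr * g_h.sum(axis=0)
```

The mean squared error recorded in `losses` falls steadily over the 1500 epochs, which is the behavior the two-phase description above is meant to achieve.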
NFABP: take the original data of the training samples as
input, so the number of input layer nodes is 24; take the
corresponding composite scores of the training samples
Table 2.
Main variance explained.

Factors                            F1      F2      F3      F4      F5      F6      F7
Eigenvalues                        3.025   2.874   2.226   2.100   2.067   1.994   1.917
Variance explained (%)             11.882  11.016  9.531   8.885   8.123   7.337   7.089
Cumulative variance explained (%)  11.882  22.898  32.429  41.314  49.437  56.774  63.863

Factors                            F8      F9      F10     F11     F12     F13     F14
Eigenvalues                        1.882   1.843   1.756   1.087   1.052   0.876   0.853
Variance explained (%)             6.782   6.275   6.368   2.931   2.550   1.224   1.136
Cumulative variance explained (%)  70.645  76.920  83.288  86.219  88.769  89.993  91.129