This paper presents the application of recurrence plots (RPs) and recurrence quantification analysis (RQA) in the diagnostics of various faults in a gear-train system. For this study, multiple test gears with different health conditions (a healthy gear, and defective gears with a root crack on one tooth, multiple cracks on five teeth, and a missing tooth) are studied. The vibration data of the gear-train is measured by a triaxial accelerometer installed on the test gearbox. Two different support vector machine classifiers are trained and compared. Mutual information is used to rank the extracted features in order to select an optimal subset that provides as much information as possible about the intrinsic dynamics of the system. Results indicate that our approach is quite efficient in diagnosing the health status of the gear system and characterizing its dynamic behavior.

Machine condition monitoring techniques have generated considerable recent research interest due to their important role in preventing consequential damages before they develop into a catastrophic failure. Furthermore, condition monitoring techniques increase lifespans of systems and decrease maintenance costs by shifting maintenance from time-based to event-driven procedures, which offer significant economic benefits. Hence, it is important to develop efficient algorithms and techniques to monitor the status of machines and identify abnormalities.

A major current focus in fault diagnostics of gears is vibration and acoustic methods due to the valuable information they contain about the condition of rotating machines such as gears [

Various techniques have been developed to study gear fault detection and diagnostics [

In previous work [

The present work investigated vibration data of a helicopter gearbox mock-up system (5 m long). For this study, multiple test gears with different health conditions such as healthy gears (H) and defective gears with root crack on one tooth (SCD), multiple cracks on five teeth (MCD) and missing tooth (MTD) are studied. The vibrational signals are recorded using a triaxial accelerometer installed on the test gearbox.

The rest of this paper is organized as follows. In Section 2, the mathematical details of the RPs and RQA are introduced. Section 3 presents the experimental setup of the gear-train and the measurement of data. Section 4 discusses the analysis process of the RPs and RQA. In Section 5, different fault classification models are compared and the feature ranking technique is explained. Finally, Section 6 summarizes and concludes the paper.

The recurrence plot analysis for time series is based on the analysis of a matrix R whose elements are defined as:

$$R_{ij} = \begin{cases} 1, & \Phi_i \approx \Phi_j, \\ 0, & \Phi_i \neq \Phi_j, \end{cases} \qquad i, j = 1, \cdots, N, \tag{1}$$

where $\Phi_i = (\phi_1^i, \phi_2^i, \cdots, \phi_m^i)$ is an $m$-dimensional state vector, $N$ is the length of the time series, $i$ and $j$ are the row and column indices of the matrix, respectively, and $\Phi_i \approx \Phi_j$ means equality up to an error $\epsilon$. In this paper, the vibrational acceleration of the test gearbox in the x, y and z directions is considered as the state vector as follows:

$$\Phi_i = \left( a_x(i), a_y(i), a_z(i) \right) \tag{2}$$

The elements of the matrix $R$ are thus obtained by comparing the states of the system at times $i$ and $j$ with a threshold precision $\epsilon$. Formally, one has:

$$R_{ij} = \theta\left( \epsilon - \| \Phi_i - \Phi_j \| \right), \tag{3}$$

where $\| \cdot \|$ is the Euclidean norm ($L_2$-norm) and $\theta(y)$ is the Heaviside function defined as:

$$\theta(y) = 1 \ \text{for} \ y > 0 \quad \text{and} \quad \theta(y) = 0 \ \text{for} \ y < 0$$

The threshold precision $\epsilon$ is a crucial parameter in RP analysis. As a rule of thumb, the threshold is set to 5% of the maximal phase space diameter.

Once the $R$ matrix is constructed, the RP graph is obtained by plotting the $R_{ij}$ points in the $(i, j)$ plane with different colors. By definition, RP graphs are always symmetric ($R_{ij} = R_{ji}$) and always have a central diagonal.
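As an illustration of Equations (1)-(3) and the 5% rule of thumb, a minimal sketch is given below. It assumes the state vectors are supplied as a list of (a_x, a_y, a_z) tuples; the function name `recurrence_matrix` and the `fraction` parameter are hypothetical and not taken from the original implementation. The brute-force pairwise distance computation is only practical for short segments.

```python
import math

def recurrence_matrix(states, fraction=0.05):
    """Return (R, eps) for a list of state vectors.

    eps is `fraction` times the maximal phase space diameter, taken here
    as the largest pairwise Euclidean distance between states (assumption).
    """
    n = len(states)
    # Pairwise Euclidean distances ||Phi_i - Phi_j||.
    dist = [[math.dist(states[i], states[j]) for j in range(n)] for i in range(n)]
    diameter = max(max(row) for row in dist)
    eps = fraction * diameter
    # Heaviside step of Equation (3): R_ij = 1 when eps - distance > 0.
    R = [[1 if eps - dist[i][j] > 0 else 0 for j in range(n)] for i in range(n)]
    return R, eps

# Hypothetical toy trajectory: three nearby states and one far away.
states = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
R, eps = recurrence_matrix(states)
```

Because the distance is symmetric, the resulting matrix is symmetric, and whenever the threshold is positive the main diagonal is filled with ones.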

In order to go beyond the qualitative impression given by RPs, complexity measures have been developed that quantify the structures of RPs and are called recurrence quantification analysis (RQA) [

・ Recurrence rate (RR)

The recurrence rate is the simplest RQA parameter and measures the density of recurrence points in a recurrence plot. In other words, RR indicates the percentage of the plot occupied by points.

$$RR = \frac{1}{N^2} \sum_{i,j=1}^{N} R_{i,j}(\epsilon) \tag{4}$$

・ Determinism (DET)

The determinism is the percentage of recurrence points that form diagonal lines of minimal length $l_{\min}$ in the recurrence plot.

$$DET = \frac{\sum_{l=l_{\min}}^{N} l \, P(l)}{\sum_{l=1}^{N} l \, P(l)} \tag{5}$$

where $P(l)$ is the frequency distribution of the lengths $l$ of the diagonal lines. In this work, $l_{\min} = 2$ is used. This measure is critical in determining the nature of the process (deterministic vs. stochastic). Recurrence plots of a deterministic process usually contain more and longer diagonal lines compared to a stochastic process.

・ Laminarity (LAM)

In the same way, the amount of recurrence points forming vertical lines can be quantified by laminarity.

$$LAM = \frac{\sum_{v=v_{\min}}^{N} v \, P(v)}{\sum_{v=1}^{N} v \, P(v)} \tag{6}$$

where $P(v)$ is the frequency distribution of the lengths $v$ of the vertical lines, which have at least a length of $v_{\min}$. In this work, $v_{\min} = 2$ is used.

・ Longest Diagonal Line (LMAX)

LMAX is the length of the longest diagonal line.

$$LMAX = \max\left( \{ l_i ;\ i = 1, \cdots, N \} \right) \tag{7}$$

・ Trapping Time (TT)

The trapping time measures the average length of the vertical lines.

$$TT = \frac{\sum_{v=v_{\min}}^{N} v \, P(v)}{\sum_{v=v_{\min}}^{N} P(v)} \tag{8}$$

・ Entropy (ENTR)

The probability that a diagonal line has exactly length $l$ can be estimated with $p(l) = P(l) / \sum_{l=l_{\min}}^{N} P(l)$. ENTR is the Shannon entropy of this probability, which reflects the complexity of the RP with respect to the diagonal lines.

$$ENTR = - \sum_{l=l_{\min}}^{N} p(l) \ln p(l) \tag{9}$$

Two RQA parameters were excluded from the study. The divergence parameter, which is the inverse of the longest diagonal line ($DIV = 1 / LMAX$), was not included in order to avoid redundancy, because of its direct relation with the LMAX parameter, which is included. The trend parameter was also excluded: trend mainly carries information about the stationarity of the system, which is already accounted for by the constant operating speed condition in this study.
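The six measures of Equations (4)-(9) can be sketched from a binary recurrence matrix as follows. This is a hedged illustration, not the authors' code: the function names are hypothetical, and the main diagonal is skipped when collecting diagonal lines, which is a common convention but an assumption here since the text does not state it.

```python
import math
from collections import Counter

def _run_lengths(bits):
    """Lengths of maximal runs of 1s in a 0/1 sequence."""
    runs, count = [], 0
    for b in bits:
        if b:
            count += 1
        elif count:
            runs.append(count)
            count = 0
    if count:
        runs.append(count)
    return runs

def rqa_measures(R, l_min=2, v_min=2):
    n = len(R)
    # Histogram P(l) of diagonal line lengths; the main diagonal (k == 0)
    # is skipped (assumption, see lead-in).
    diag = Counter()
    for k in range(-(n - 1), n):
        if k == 0:
            continue
        line = [R[i][i + k] for i in range(max(0, -k), min(n, n - k))]
        diag.update(_run_lengths(line))
    # Histogram P(v) of vertical line lengths, one column at a time.
    vert = Counter()
    for j in range(n):
        vert.update(_run_lengths([R[i][j] for i in range(n)]))

    def ratio(num, den):
        return num / den if den else 0.0

    long_diag = {l: c for l, c in diag.items() if l >= l_min}
    long_vert = {v: c for v, c in vert.items() if v >= v_min}
    n_long = sum(long_diag.values())
    return {
        "RR": sum(map(sum, R)) / n ** 2,
        "DET": ratio(sum(l * c for l, c in long_diag.items()),
                     sum(l * c for l, c in diag.items())),
        "LAM": ratio(sum(v * c for v, c in long_vert.items()),
                     sum(v * c for v, c in vert.items())),
        "LMAX": max(diag, default=0),
        "TT": ratio(sum(v * c for v, c in long_vert.items()),
                    sum(long_vert.values())),
        "ENTR": -sum((c / n_long) * math.log(c / n_long)
                     for c in long_diag.values()),
    }

# Hypothetical 4 x 4 tridiagonal recurrence matrix.
R_demo = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 1, 1, 1], [0, 0, 1, 1]]
measures = rqa_measures(R_demo)
```

In this toy matrix the only off-diagonal lines are two diagonals of length 3, so DET is 1 and ENTR is 0 (a single line length carries no uncertainty).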

The current work involved investigating a mock-up of a helicopter gearbox system. All of the test data was acquired in collaboration with the United Technologies Research Center (UTRC). The gear-train experimental setup (shown in

The vibrational signals were recorded using a triaxial accelerometer installed on gearbox number 3. The vibrational data was measured at a sampling frequency of 102,400 Hz. The rotational speeds of shafts A, C, and B were measured using two encoders and a tachometer. Two encoders were installed at shaft “A” (input shaft) and shaft “C” (output shaft) to measure their rotational speeds with a 360 pulse/rev resolution. The tachometer on shaft “B” was used to measure the shaft rotational speed at a rate of 1 pulse/rev. Due to the gear teeth ratio, the test gear shaft operates at the same speed as shaft “B”. In this study, the motor was operating at a rotational speed of 900 rpm while the test gear shaft was running at 94 rpm for the different gear conditions. The vibrational signals of gearbox number 3 were recorded for 64 seconds for healthy, single crack tooth and multiple crack teeth conditions, and for 3.2 seconds for the missing tooth condition. Samples of the measured vibration for different gear conditions in the three directions are shown in

An overview of the fault detection method used in this paper is summarized in

The red vertical lines mark the locations of the pulses in the tachometer signal and their corresponding locations in the acceleration signal.

The segment $x_k(i)$, for $i = 1, 2, \cdots, N$, is the vibration data of revolution $k$ out of a total of $K$ revolutions. In total, 341 data segments were obtained: 98 segments each for the healthy, single crack defect, and multiple crack defect conditions, and 47 segments for the missing tooth defect condition.
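The per-revolution segmentation described above can be sketched as follows. This is only an illustration under stated assumptions: the tachometer pulses are located with a simple rising-edge threshold, and the helper names `rising_edges` and `segment_by_revolution` are hypothetical. Segments obtained this way may differ slightly in length if the shaft speed fluctuates.

```python
def rising_edges(tach, threshold):
    """Sample indices where the tachometer signal crosses `threshold` upward."""
    return [i for i in range(1, len(tach)) if tach[i - 1] < threshold <= tach[i]]

def segment_by_revolution(signal, pulse_indices):
    """Split `signal` into one segment per revolution.

    Each segment spans the samples between two consecutive pulses
    (1 pulse/rev); samples before the first and after the last pulse
    are discarded.
    """
    return [signal[start:stop]
            for start, stop in zip(pulse_indices, pulse_indices[1:])]

# Hypothetical toy data: three tachometer pulses, hence two full revolutions.
tach = [0, 0, 5, 0, 0, 0, 5, 0, 0, 5, 0]
accel = list(range(11))  # stand-in for one acceleration channel
pulses = rising_edges(tach, threshold=2.5)
segments = segment_by_revolution(accel, pulses)
```

The same slicing applies to a list of (a_x, a_y, a_z) tuples, so that each segment directly provides the state vectors of Equation (2).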

In order to obtain the recurrence matrix and plots, we need to reconstruct the state space from the acceleration time series. As discussed earlier, the vibrational acceleration of the test gearbox in the x, y and z directions is considered as the state vector, as indicated in Equation (2). By applying the method explained in Section 2, the RP of the state vector for each data segment is constructed. Samples of the RPs of the same state vector length (6000 points) for different gear conditions are presented in

By inspecting the RPs, the following observations are made. First, diagonal segments of varied lengths that are parallel to the main diagonal (line of identity, LOI) are the main pattern of the plots. A diagonal line occurs when the trajectory visits the same region of the phase space at different times. This is also true for diagonal segments that are orthogonal to the LOI but with reversed time sequence. These orthogonal segments appear in the multiple crack (state points within the range of 5000 - 6000) and missing tooth (randomly distributed) gear conditions. Second, small vertical segments are observed in some recurrence plots, e.g., for the gear with the missing tooth. This indicates that the state at that location changes slowly or does not change. Third, white bands with different widths appear in the RPs at different locations, e.g., in the missing tooth condition (state points within the range of 4500 - 5000). These white bands appear to form a rectangle in the center of the healthy RP and an upper right edge of a rectangle in the single crack RP, which usually develops when some states occur rarely. Finally, the missing tooth condition is easily distinguished from the other gear conditions, since it has a higher density of points. Other than that, the plots consist of complicated patterns which are hard to interpret. Hence, quantitative measures are necessary to investigate the plots more objectively.

instance, the following can be observed. The healthy gear condition has positive RQA parameters relative to other gear conditions, which indicates higher RQA parameters than the average of all four gear conditions. On the other hand, the multiple crack condition has negative RQA parameters, implying lower RQA parameters than average. A closer inspection of the figure reveals interesting characteristics of the individual RQA parameters. First, RR for the missing tooth condition has the highest magnitude compared to other gear conditions. This supports the results discussed previously when the RP was analyzed (the missing tooth RP is the densest). Second, inspecting the trapping time and the entropy leads to the same conclusion: the lowest values of TT and ENTR correspond to a gear system with multiple cracks and the highest values correspond to a healthy gear system.

In the previous section, we illustrated how the condition of the gear system can be detected using the RQA parameters. We were able to measure and represent these influences by quantitative criteria. The diagnostics procedure is the inverse problem, where the system status is predicted from its dynamic response. To do that, a support vector machine is used as a classifier.

Support vector machine (SVM) is a supervised learning model used for classification and regression. This classification technique exploits training data in order to define a hyperplane that maximizes the separation distance between different classes in the feature space [

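$$\max \ Q(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j \gamma_i \gamma_j K(x_i, x_j) \quad \text{subject to} \quad 0 \le \alpha_i \le C \ \ \text{and} \ \ \sum_{i=1}^{N} \alpha_i \gamma_i = 0 \tag{10}$$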

where $\alpha_i$ is the Lagrangian multiplier, and $x$ and $\gamma$ are the training feature set and the desired output set, respectively. $N$ is the number of training samples and $K(x_i, x_j)$ is the kernel function, which maps the input features to a higher-dimensional space in order to change the feature representation and to capture nonlinear patterns. For this study, two kernel functions are investigated: the linear kernel, which can be expressed as:

$$K(x_i, x_j) = x_i^{T} x_j$$

and the radial basis (Gaussian) kernel, defined as:

$$K(x_i, x_j) = \exp\left( - \zeta \| x_i - x_j \|^2 \right)$$

where $\zeta > 0$ is the kernel scale, which determines the width of the kernel function. A small $\zeta$ value defines a wide kernel function. Finally, $C$ is the box constraint that controls the values of the Lagrangian multipliers. A higher value indicates a higher misclassification cost, leading to a stricter separation.

After solving the optimization problem and obtaining the optimal parameters $\alpha_i$, a new test example can be classified using:

$$g(x) = \sum_{i \in \sigma} \alpha_i \gamma_i K(x, x_i) + b \tag{11}$$

where $\sigma$ denotes the set of indices of the support vectors. Note that only the support vectors play a role in making predictions for new data points. This stems from the fact that $\alpha_i = 0$ for data points that are not support vectors. The bias $b$ in Equation (11) can be calculated as follows:

$$b = \frac{1}{N_{\nu}} \sum_{n \in \nu} \left( \gamma_n - \sum_{m \in \sigma} \alpha_m \gamma_m K(x_n, x_m) \right) \tag{12}$$

where $\nu$ denotes the set of indices of the points that have $0 < \alpha_n < C$, and $N_{\nu}$ is the number of such points.
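To make Equations (11) and (12) concrete, the sketch below evaluates the two kernels, the bias, and the decision function for multipliers that are assumed to have been obtained already from the dual problem (10). All function names are hypothetical. The bias is averaged over points strictly inside the box constraint, which is the usual choice for this formula.

```python
import math

def linear_kernel(xi, xj):
    """K(xi, xj) = xi^T xj."""
    return sum(a * b for a, b in zip(xi, xj))

def gaussian_kernel(xi, xj, zeta=1.0):
    """K(xi, xj) = exp(-zeta * ||xi - xj||^2)."""
    return math.exp(-zeta * sum((a - b) ** 2 for a, b in zip(xi, xj)))

def bias(alphas, labels, X, C, kernel, tol=1e-9):
    """Equation (12): average residual over points with 0 < alpha < C."""
    support = [m for m, a in enumerate(alphas) if a > tol]
    margin = [n for n in support if alphas[n] < C - tol]
    if not margin:
        raise ValueError("no multiplier lies strictly between 0 and C")
    total = 0.0
    for n in margin:
        total += labels[n] - sum(alphas[m] * labels[m] * kernel(X[n], X[m])
                                 for m in support)
    return total / len(margin)

def decision(x, alphas, labels, X, b, kernel):
    """Equation (11): only support vectors (alpha > 0) contribute."""
    return sum(a * y * kernel(x, xi)
               for a, y, xi in zip(alphas, labels, X) if a > 0) + b

# Hypothetical two-point problem whose dual solution is known in closed form:
# x = +1 (label +1) and x = -1 (label -1) give alpha = (0.5, 0.5), b = 0.
X_demo = [(1.0,), (-1.0,)]
y_demo = [1, -1]
alpha_demo = [0.5, 0.5]
b_demo = bias(alpha_demo, y_demo, X_demo, C=10.0, kernel=linear_kernel)
g_pos = decision((2.0,), alpha_demo, y_demo, X_demo, b_demo, linear_kernel)
g_neg = decision((-3.0,), alpha_demo, y_demo, X_demo, b_demo, linear_kernel)
```

The sign of `decision(...)` gives the predicted class; here the separating boundary sits at the origin, midway between the two training points.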

Originally, SVM was formulated for binary classification problems [

The RQA parameters, which represent the feature vector, are used as input to the SVM classifier. For each set of data, six parameters are used in the feature set for training. Various SVMs were trained using 60% of the data samples (202 cases) and a cross validation algorithm with five folds. 40% of the data samples (139 cases) were used for testing the classifier. To find an optimal hyperplane, an optimization problem needs to be solved, as mentioned above. However, another optimization problem is encountered in tuning some of the SVM parameters such as box constraint and kernel scale. To solve this problem, the k-fold cross validation loss was minimized by searching on a given range of each optimized parameter. Box constraint was optimized for the two SVMs (linear and Gaussian). Meanwhile, kernel scale parameter was optimized for Gaussian SVMs.
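The tuning step described above can be sketched as an exhaustive search that minimizes the mean k-fold validation loss. This is an assumption-laden illustration: the text does not state which search strategy was used, so a plain grid search is swapped in here, and `loss_fn` is a hypothetical user-supplied callback that trains a classifier on the training indices and returns its loss on the validation indices.

```python
import itertools

def k_fold_indices(n_samples, k=5):
    """Partition range(n_samples) into k nearly equal validation folds."""
    idx = list(range(n_samples))
    return [idx[f::k] for f in range(k)]

def cv_loss(params, n_samples, loss_fn, k=5):
    """Mean validation loss of one parameter setting over k folds."""
    losses = []
    for val in k_fold_indices(n_samples, k):
        held_out = set(val)
        train = [i for i in range(n_samples) if i not in held_out]
        losses.append(loss_fn(train, val, **params))
    return sum(losses) / k

def grid_search(grid, n_samples, loss_fn, k=5):
    """Return the parameter combination with the lowest k-fold loss."""
    names = list(grid)
    candidates = [dict(zip(names, values))
                  for values in itertools.product(*(grid[name] for name in names))]
    return min(candidates, key=lambda p: cv_loss(p, n_samples, loss_fn, k))

# Hypothetical stand-in loss with a known minimum at C = 1, zeta = 0.1;
# a real callback would fit an SVM on `train` and score it on `val`.
def toy_loss(train, val, C, zeta):
    return (C - 1.0) ** 2 + (zeta - 0.1) ** 2

best = grid_search({"C": [0.1, 1.0, 10.0], "zeta": [0.01, 0.1, 1.0]},
                   n_samples=202, loss_fn=toy_loss)
folds = k_fold_indices(10, k=5)
```

For the linear SVM only the box constraint would appear in the grid, while the Gaussian SVM would search over both the box constraint and the kernel scale.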

The fault classification process is divided into three parts: 1) anomaly detection, 2) defect classification, and 3) the effect of ranked RQA parameters. A detailed description of these procedures is provided in the sequel.

Anomaly detection is a technique used to identify abnormal classes or irregular behavior from what is defined as a normal standard. This has a lot of practical advantages in system condition monitoring applications such as the ability to distinguish defective classes from healthy. In this part of the study, two-class (normal or fault) detection was performed where all gear defects were grouped in one class (D) and the healthy gear condition was used to define the second class (H). Two SVMs were optimized (Linear and Gaussian) and trained using all RQA parameters. The effectiveness of the classification models for the training data is presented by means of confusion matrix plots. The test confusion matrices of the linear and Gaussian SVMs are shown in

In the confusion matrix, the diagonal cells show the number and percentage of correct classifications by the trained classifier, while the off diagonal cells represent the misclassified predictions. For example, in

Analyzing the confusion matrix is an important step in building a classification model. It gives strong clues as to where the classification model is going wrong. However, the number of misclassifications is not adequate to evaluate the performance of the SVM classifier. Taking only one performance metric is sometimes misleading. For example, a classifier with a relatively low misclassification rate might predict some of the classes fairly accurately but performs poorly for other classes, which might be critical to a certain application. Thus, to describe the performance of the classifier for the fault detection application, some metric rates are calculated from the confusion matrix such as precision (P), recall (R) and overall accuracy.

For each class, precision measures the rate of correct predictions out of all predictions that were made by the classifier. In other words, when the classifier predicts a class, precision indicates how often the prediction is correct. Recall measures the correctly predicted rate of the actual samples for a given class. If the classifier has high recall and low precision for a certain class, this means that the classifier is biased toward that class. A high precision and low recall for a given class indicates that the classifier is too conservative: when the classifier predicts that class, it is usually correct, but it rarely predicts it, as reflected by the low recall rate. Ideally, a classifier with high recall and high precision is what we seek. Finally, the overall accuracy of the classifier is the rate of correct predictions.

For a binary confusion matrix, healthy condition is designated as negative class and defective condition is designated as positive class. True positive (TP) is correctly classified as a defective condition, false positive (FP) is incorrectly classified as a defective condition. In contrast, true negative (TN) is correctly classified as healthy condition and false negative (FN) is incorrectly classified as healthy condition. For example in the confusion matrix (

The generalized formulas for the false negatives $FN_i$, false positives $FP_i$, true negatives $TN_i$, true positives $TP_i$, precision $P_i$, recall $R_i$ and overall accuracy for a given class $i$ are presented below. Note that $M_{ij}$ is the element $(i, j)$ of the confusion matrix. This generalization is presented here so it can be applied later to multi-class confusion matrix problems.

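$$FN_i = \sum_{\substack{j=1 \\ j \neq i}} M_{ji} \tag{13}$$

$$FP_i = \sum_{\substack{j=1 \\ j \neq i}} M_{ij} \tag{14}$$

$$TN_i = \sum_{\substack{j=1 \\ j \neq i}} \ \sum_{\substack{k=1 \\ k \neq i}} M_{jk} \tag{15}$$

$$TP_i = M_{ii} \tag{16}$$

$$R_i = \frac{TP_i}{TP_i + FN_i} \tag{17}$$

$$P_i = \frac{TP_i}{TP_i + FP_i} \tag{18}$$

$$\text{Overall Accuracy} = \frac{TP_{\text{all}}}{\text{Total test samples}} \tag{19}$$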

where $TP_{\text{all}} = \sum_{j=1} TP_j$ is the total number of true positives and represents the summation of the confusion matrix diagonal. Now, by inspecting the confusion matrices in

This result is remarkable for several reasons. First, employing RQA parameters as SVM features can provide valuable information in characterizing the dynamics of various gear faults. This helps to discriminate healthy gear condition from defective conditions. Second, both SVMs have high recall and high precision for each gear condition, which indicates an accurate prediction for detecting the health status and identifying any abnormality including the three defective gear conditions (SCD, MCD, and MTD). Finally, no a priori knowledge of the system was included in the features. This implies that the RQA approach can be conveniently applied to diverse dynamical systems in an automated process, with minimal need for adaptation and reliance on expert knowledge about the system.
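For reference, the per-class metrics of Equations (13)-(19) can be computed from a confusion matrix as in the sketch below. It follows the index convention implied by Equations (13) and (14), where row $i$ collects the predictions of class $i$ and column $i$ collects the actual samples of class $i$. The function name and the example matrix are hypothetical and are not the values reported in this study.

```python
def class_metrics(M):
    """Per-class TP, FN, FP, TN, recall, precision, plus overall accuracy.

    M[i][j] counts samples predicted as class i whose actual class is j.
    Recall or precision is None when its denominator is zero.
    """
    n = len(M)
    total = sum(map(sum, M))
    per_class = []
    for i in range(n):
        tp = M[i][i]
        fn = sum(M[j][i] for j in range(n) if j != i)  # Equation (13)
        fp = sum(M[i][j] for j in range(n) if j != i)  # Equation (14)
        tn = total - tp - fn - fp                      # equals Equation (15)
        per_class.append({
            "TP": tp, "FN": fn, "FP": fp, "TN": tn,
            "recall": tp / (tp + fn) if tp + fn else None,
            "precision": tp / (tp + fp) if tp + fp else None,
        })
    accuracy = sum(M[i][i] for i in range(n)) / total   # Equation (19)
    return per_class, accuracy

# Hypothetical binary confusion matrix (class 0 and class 1).
M_demo = [[50, 2],
          [1, 47]]
per_class, accuracy = class_metrics(M_demo)
```

Returning `None` for an undefined precision makes it explicit when a classifier never predicts a given class, a case in which accuracy alone would hide the failure.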

In this subsection, we studied the effectiveness of RQA parameters as features to classify different gear conditions including healthy gear and defective gears with single crack, multiple crack and missing tooth conditions. Two SVMs (linear and Gaussian) were trained as indicated previously by using optimal parameters then tested on a new set.

The test confusion matrices for the linear and Gaussian SVMs are shown in

Performance metrics such as recall, precision and overall accuracy were calculated for the linear and Gaussian SVMs. First, linear SVM shows a more balanced precision/recall trade-off for healthy and single crack defect conditions. Meanwhile, Gaussian SVM is more biased to the single crack condition over the healthy condition. Second, both linear and Gaussian SVMs have virtually perfect recall and precision for multiple crack and missing tooth conditions. Finally, the overall accuracies of linear SVM (97.1%) and Gaussian SVM (96.4%) are fairly close. The linear SVM is chosen to continue further analysis in the next subsection.

In general, RQA parameters and optimized SVMs seem to achieve significant results in identifying various gear conditions in a mock-up of a helicopter gearbox. Outstanding performance was achieved with 100% accuracy, 100% recall and 100% precision in detecting multiple crack and missing tooth conditions. The most challenging problem was distinguishing between healthy and single crack defect conditions where RQA parameters were capable of capturing the differences. A balanced classifier with 95.0% recall and 95.0% precision was achieved. In summary, the classifier is extraordinarily effective in predicting all gear conditions using the RQA parameters as features.

As previously indicated, different kinds of information were extracted from the RPs using the RQA parameters. For each gear condition, some of the RQA parameters are more correlated and informative than others. A robust and accurate prediction is achieved by extracting an efficient set of features that can characterize the system response in a unique way. The feature set should also provide as much information as possible about the intrinsic dynamics of the system. This motivates ranking and selecting a subset of relevant features to maximize the correlation between the extracted features and the predicted classes. Furthermore, this aids in understanding the contribution of each ranked feature to the classification process and to which gear condition the feature information contributes. To do so, mutual information is used as a feature ranking technique. This technique was developed in our past work [

The features are ranked based on the mutual information content between the feature subset and the gear condition using a greedy search algorithm [

_{1} score are calculated for each subset. F_{1} score is the harmonic mean of recall and precision, which compares both measures in one metric and is calculated as follows:

$$F_1 \ \text{score} = \frac{2 P_t R_t}{P_t + R_t} \tag{20}$$

where $R_t$ and $P_t$ denote the overall recall and precision, which can be calculated by averaging all the individual values of recall and precision.
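A minimal sketch of Equation (20) is shown below, assuming the per-class recall and precision values are already available as lists. The function name `f1_from_averages` is hypothetical, and returning `None` when any per-class value is undefined is a design choice of this sketch rather than something stated in the text.

```python
def f1_from_averages(recalls, precisions):
    """F1 score from class-averaged recall R_t and precision P_t.

    Returns None if any per-class value is undefined (None), since the
    averages in Equation (20) cannot then be formed.
    """
    if any(v is None for v in list(recalls) + list(precisions)):
        return None
    r_t = sum(recalls) / len(recalls)
    p_t = sum(precisions) / len(precisions)
    if p_t + r_t == 0:
        return 0.0
    return 2 * p_t * r_t / (p_t + r_t)

# Hypothetical two-class values, not taken from the study.
f1_demo = f1_from_averages([1.0, 0.5], [0.5, 1.0])
f1_undefined = f1_from_averages([1.0, 0.0], [0.5, None])
```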

_{1} score for each feature subset. A detailed discussion of the effect of adding each of the ranked RQA parameters to the feature subset is presented below.

・ The effect of laminarity (LAM)

Laminarity provides a high correlation to the multiple crack condition with high recall and precision rates. Detection of the healthy condition is in an acceptable range but it does not give decisive information. Furthermore, laminarity’s information content regarding the single crack condition can be considered unreliable. It has low precision, which makes the classifier biased toward that condition. Finally, laminarity does not help in detecting the missing tooth condition because of its poor recall rate, which makes the classifier too conservative to make any missing tooth predictions. Even though the overall accuracy is 75.0%, this case illustrates the danger of relying only on the accuracy of the classifier: the classifier failed to predict an entire gear condition, i.e., the missing tooth condition. Moreover, the F_{1} score cannot be determined due to the lack of the precision metric for the missing tooth condition (no prediction was made).

・ The effect of entropy (ENTR)

Entropy adds evident value for detecting the missing tooth condition. For example, 100.0% precision and 78.0% recall were achieved for the missing tooth condition. Also, entropy contributes strongly to the prediction accuracy of the single crack condition by increasing the precision. This leads to a reduction in the classifier’s bias toward the single crack condition. Adding the entropy feature has enhanced the classifier’s ability to distinguish both the healthy condition and the multiple crack condition. The quality of detection has improved due to the relevant information contributed by the entropy, with 92.0% overall accuracy and a 92.0% F_{1} score.

・ The effect of recurrence rate (RR)

The recurrence rate’s effect is significant due to a notable improvement in detecting the multiple crack and missing tooth conditions. The quality of classification of the multiple crack and the missing tooth conditions is virtually perfect. It can be seen that the recall of the healthy condition decreased while the precision increased. This is reflected in the improvement in detection of the single crack condition.

・ The effect of the longest diagonal line (LMAX)

No change is noticed on the performance metrics by adding the LMAX parameter. However, changing the order and trying different RQA parameters affects the performance negatively. This supports the order of features that was determined by the mutual information ranking technique. Additionally, features work together in a nonlinear fashion, where a certain feature might not give enough information but combining it with another can add more value.

・ The effect of trapping time (TT)

Trapping time has a positive effect on the healthy condition, represented by an increase in the recall rate. It also has a positive effect on the single crack condition, represented by an increase in the prediction precision. This is reflected in the overall accuracy and the F_{1} score, which reach the highest values thus far with 97.8% and 98.0%, respectively.

・ The effect of determinism (DET)

In this particular configuration, determinism gave no information. Moreover, the recall and the precision of the healthy and the single crack conditions are decreased.

The optimal subset is subset number 5, which contains all RQA parameters except determinism; determinism adds no additional information for classifying the gear conditions in the study. Laminarity provides valuable information about the multiple crack condition. The entropy parameter correlates to the missing tooth condition. Using only three features (LAM, ENTR and RR), an accuracy of 95.6% is achieved (0.16% from using all features).

In this paper, a mock-up of a helicopter gearbox system was studied in order to detect and identify various gear faults. Four gear conditions including healthy, single crack, multiple cracks, and missing tooth were investigated under a constant operating condition. The RP method, which is based on visualizing high-dimensional dynamical systems using a two-dimensional plot, was applied. The RQA parameters were then used as an input into two SVMs. The fault classification process was divided into three parts: 1) anomaly detection, 2) defect classification, and 3) the effect of ranked RQA parameters. Results indicate that RQA parameters provide valuable information in characterizing the dynamics of various gear faults in order to discriminate the healthy gear condition from defective conditions. Also, an outstanding performance was achieved using RQA parameters to identify various gear conditions, with 100% accuracy, 100% recall and 100% precision in detecting multiple cracks and missing tooth conditions. In general, the classifier is extraordinarily effective in predicting all gear conditions using the RQA parameters. Finally, mutual information was used to rank the extracted features. An optimal feature subset was determined using LAM, ENTR, RR, LMAX, and TT. A correlation between the RQA parameters used in the study and the different gear conditions was discussed.

This work is supported by the US Office of Naval Research under the grant ONR N00014-15-1-2311 with Capt. Lynn Petersen as the Program Manager. We deeply appreciate this support and are humbled by ONR’s enthusiastic recognition of the importance of this research.

Mohamad, T.H., Chen, Y., Chaudhry, Z. and Nataraj, C. (2018) Gear Fault Detection Using Recurrence Quantification Analysis and Support Vector Machine. Journal of Software Engineering and Applications, 11, 181-203. https://doi.org/10.4236/jsea.2018.115012