Robust Algorithm to Learn Rules for Classification - a Fault Diagnosis Case Study

Machine learning algorithms are used for building classifier models, and rule-based decision tree classifiers are among the most popular. However, the performance of a decision tree classifier varies with its hyperparameter settings, and the optimum hyperparameter values are obtained using either optimization algorithms or trial-and-error methods. The present study utilizes the MODLEM algorithm to overcome this drawback of decision tree algorithms. By eliminating hyperparameter tuning while producing results close to those of standard decision tree algorithms, MODLEM serves as a robust classification algorithm. Its robustness is illustrated with a fault diagnosis case study: the diagnosis of faults in an automobile suspension system using vibration signals acquired under various fault conditions.


INTRODUCTION
Machine learning algorithms are widely used to build classifiers. Amongst them, automation engineers are specifically interested in decision tree-based classifiers. Decision trees give simple if-then rules that can be easily implemented in real-time systems (microcontrollers/processors). Decision tree algorithms are relatively simple to use. However, one has to tune their hyperparameters for better performance. Parameter tuning is a tricky task, especially when the parameters are many, and it is challenging for classifier designers. Hence, there is a need for a new algorithm that builds a decision tree without requiring hyperparameter tuning. This paper illustrates the use of the MODLEM algorithm for building a decision tree that can be used for classification problems.
To illustrate the capability of the MODLEM algorithm, a case study of the fault diagnosis of an automobile suspension system is taken up and presented. The fault diagnosis method involves acquiring raw signals, feature extraction, feature selection, and building a classifier model.
Researchers have proposed many rule-based classifiers to make the classification process understandable [1]. As a result, plenty of rule-based algorithms are available, such as J48, OneR, JRip, and PART [2,3]. Among these classifiers, J48 is used in many applications owing to its ability to classify with a limited dataset and its high classification accuracy.
The J48 algorithm is used in various medical applications for disease identification because the generated tree is understandable by the user and rules are easy to derive from it [4]; for the same reason, the use of the rule-based J48 classifier for identifying heart diseases is proposed in [5]. This ability to create understandable rules has paved the way for using rule-based classifiers in mechanical industries, particularly in fault diagnosis using vibration signals.
Subrata et al. proposed a method to identify faults in gearboxes (transmission devices used to increase torque), which are widely used in industry, using the J48 decision tree classifier with an accuracy of 92% [6]. Praveen et al. proposed a gear fault diagnosis method for automobile applications in which the J48 algorithm distinguishes faults with an accuracy of 99.70%.
The authors in [7] used the decision tree algorithm model to detect faults in centrifugal pumps using vibration signals with an accuracy of 100%. Similarly, the J48 algorithm is used in many other applications [8,9,10].
Based on the literature survey, the authors have identified the following research gap: only a limited number of research studies have been conducted on identifying faults in suspension components. The available research work has primarily focused on finding faults in specific components of the suspension system, such as the lower arm bush worn out, strut mount fault, and tie rod ball joint fault. Even so, these specific faults have received little attention. The existing research in this field has predominantly employed a data-driven approach using a vibrating platform to induce forced vibrations. Additionally, several studies have utilized standard algorithms to classify faults in rotary machinery like suspension systems. However, little information is available regarding the tuning of hyperparameters in these algorithms. Moreover, these parameters vary depending on the type of signal and the specific problem under consideration.
The authors propose a robust rule-based classifier called MODLEM to address these challenges. This classifier aims to overcome the abovementioned difficulties associated with parameter tuning by providing a more effective approach for fault classification in suspension components.
The MODLEM classifier is a Meta classifier that uses an understanding of dataset characteristics to improve the algorithm performance and uses this acquired knowledge to assist in selecting a learning algorithm based on the dataset's characteristics.
The MODLEM classifier outperforms the other rule-based classifiers because it can automatically generate rules that are easy to understand. Hence, the author in [11] used the MODLEM algorithm to identify liver diseases. Similarly, the authors in [12] conducted a comparative study of various rule-based classifier algorithms, in which MODLEM produced the second-highest classification accuracy. Hence, in the present study, the advantage and robustness of the MODLEM classification algorithm for fault diagnosis of rotary machinery are illustrated by benchmarking it against the J48 algorithm, using classification accuracy and the confusion matrix for various suspension component faults. As only a few studies have been conducted on condition monitoring of vehicle suspension systems [13,14,15], this paper illustrates the robustness of the MODLEM algorithm over the J48 algorithm for fault diagnosis of automobile suspension systems.

MODLEM ALGORITHM
In rule-based classifiers, when the same attributes are used again to describe some other object, the problem of inconsistency in deriving the rule arises [15], as shown in Figure 1.
Stefanowski introduced the MODLEM (Modified Learning from Examples Module) algorithm to overcome the abovementioned drawback when inducing rules. It works through sequential covering and heuristically produces a small set of decision rules for every decision concept. Moreover, each rule can cover all the positive examples and omit the negative examples, as shown in Figure 2.

Figure 1. The function of rule-based classifiers and MODLEM
The author used a relabeling filter and the MODLEM algorithm to improve classification accuracy on a dataset with class imbalance (a minority class with fewer instances to learn from). The author also noted that a key advantage of MODLEM is that the rules it induces proved to be the best single classifier. Furthermore, it can handle various data properties and generates rules at a reasonable computational cost. The algorithm can be used for two purposes: descriptive analysis and predictive analysis. In descriptive analysis, the algorithm describes the relationship between an attribute and an object by determining their level of dependency. In predictive analysis, the algorithm learns from experience to identify an object from the given rules on the attributes [16,17].

Figure 2. Rule induction algorithm for knowledge acquisition
The main procedure for rule induction is as follows:
Step 1: Formulate the first rule by the optimum selection of primary conditions based on the selected criteria.
Step 2: Store the rule.
Step 3: Remove all the positive examples covered by the stored rule.
Step 4: Continue removing examples until all the decision concepts are identified.
Step 5: If some of the positive examples remain uncovered, repeat Steps 1 to 4 sequentially for the succeeding decision concept.
In the MODLEM algorithm, the process of rule induction involves handling numerical attributes and generating elementary conditions for the rules. These conditions take the form (ar < vl) or (ar ≥ vl), where 'ar' denotes the attribute and 'vl' denotes the value. While building a rule, if the same attribute (ar) is selected twice, the result may be written as (ar = [vl1, vl2]), which is the intersection of the two conditions (ar < vl2) and (ar ≥ vl1) such that vl1 < vl2. For nominal attributes, the conditions take the form (ar = va) or may be extended to a set of values.
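The sequential covering idea and the elementary conditions (ar < vl) and (ar ≥ vl) can be sketched in a few lines of Python. This is a simplified illustration, not Stefanowski's MODLEM: it assumes numerical attributes only, places cut points midway between consecutive distinct values, and scores candidate conditions by a plain purity ratio rather than an entropy-based measure. All function names are hypothetical.

```python
# Simplified sequential covering sketch (assumption: NOT the full MODLEM).
# A condition is (attribute_index, operator, value) with operator "<" or ">=".
# An example is (feature_values, class_label).

def satisfies(x, cond):
    attr, op, value = cond
    return x[attr] < value if op == "<" else x[attr] >= value

def covers(x, rule):
    # A rule is a conjunction of conditions; the empty rule covers everything.
    return all(satisfies(x, c) for c in rule)

def candidate_conditions(examples):
    # Cut points midway between consecutive distinct values of each attribute.
    conds = []
    for a in range(len(examples[0][0])):
        values = sorted({x[a] for x, _ in examples})
        for lo, hi in zip(values, values[1:]):
            cut = (lo + hi) / 2
            conds.extend([(a, "<", cut), (a, ">=", cut)])
    return conds

def grow_rule(examples, target):
    # Greedily add conditions until no negative example is covered.
    rule, covered = [], list(examples)
    while any(y != target for _, y in covered):
        best, best_score = None, None
        for cond in candidate_conditions(covered):
            subset = [(x, y) for x, y in covered if satisfies(x, cond)]
            pos = sum(1 for _, y in subset if y == target)
            if pos == 0:
                continue
            score = (pos / len(subset), pos)  # purity first, then coverage
            if best_score is None or score > best_score:
                best, best_score = cond, score
        if best is None:  # inconsistent data: no condition separates the classes
            break
        rule.append(best)
        covered = [(x, y) for x, y in covered if satisfies(x, best)]
    return rule

def sequential_covering(examples, target):
    # Steps 1 to 3 repeated until every positive example is covered.
    rules, remaining = [], list(examples)
    while any(y == target for _, y in remaining):
        rule = grow_rule(remaining, target)
        rules.append(rule)
        remaining = [(x, y) for x, y in remaining
                     if not (y == target and covers(x, rule))]
    return rules

def as_intervals(rule):
    # Merge (ar >= vl1) and (ar < vl2) on the same attribute into [vl1, vl2).
    bounds = {}
    for attr, op, value in rule:
        low, high = bounds.get(attr, (float("-inf"), float("inf")))
        if op == ">=":
            low = max(low, value)
        else:
            high = min(high, value)
        bounds[attr] = (low, high)
    return bounds

# Toy one-attribute data set (made-up values, for illustration only).
data = [([1.0], "good"), ([2.0], "good"), ([5.0], "fault"),
        ([6.0], "fault"), ([9.0], "good"), ([10.0], "good")]
rules = sequential_covering(data, "fault")
print([as_intervals(r) for r in rules])
```

On this toy data the sketch selects the same attribute twice, and the two conditions collapse into a single interval, which mirrors the (ar = [vl1, vl2]) form described above.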
The algorithm for MODLEM is given in Figure 3 (Stefanowski, 2007). The pros and cons of the MODLEM classifier are as follows.
Pros.
2. Ability to handle various data properties, such as numerical attributes, without pre-discretization.
3. Ability to work as an ensemble classifier.
4. Ability to handle an imbalanced data set and provide optimum classification accuracy.
5. No need to tune numerical hyperparameters, unlike other rule-based classifiers.
6. Consistent performance.
7. The classification process is easy to understand from the rules.
8. Provides more insight into the data set.
Cons.
1. Requires additional computation time and resources compared to other rule-based classifiers.
2. Less popular among researchers.
Hence, in order to test the performance of the MODLEM classifier, the following study was conducted to benchmark it against the J48 algorithm.

EXPERIMENTAL DESCRIPTION
The experiment aims to identify faults at the earliest stage to prevent further damage. Early detection is necessary to maintain the safety and reliability of the vehicle, since the suspension maintains uniform contact between the wheels and the road surface [18,19].
The experiment was carried out on a test setup, and the vibration signals were acquired while simulating the real-time working of the suspension at different load conditions. The experimental test setup used to acquire the data is shown in Figure 4a. The test setup consists of suspension system components (strut, lower arm, knuckle, tie rod, wheel, and loading system). A piezoelectric sensor was mounted on the lower arm to measure the vibration signals.
The vibration signals were acquired at a sampling frequency of 20 kHz using a data acquisition system (National Instruments) and LabVIEW software. The acquired signals for each fault condition are shown in Figure 4b. The collected data consist of eight states of the suspension system, namely, Good, lower arm bush worn out (Labwo) [20], strut mount fault (Stmf), lower arm ball joint fault (LABJF) [21], strut worn out (Stwo) [22], strut external damage (Sted), tie rod ball joint fault (Trbjf) [23], and low wheel pressure (Wlp) [24]. The image of the faulty components considered for the study is given in Figure 5.
A detailed data set description is given in Table 1.

FEATURE EXTRACTION
Feature extraction was performed to obtain useful information by recognizing patterns in the raw signal. The time-domain plots of the fault signals show that the acceleration amplitude varies from class to class. Histogram plots can bring out such variations in amplitude and pattern in the vibration signals of the various conditions. The bins of the histograms can be chosen such that the amplitude differences of the vibration signals form a unique pattern for a particular condition.
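As a minimal sketch of histogram feature extraction, the following Python function counts how many samples of a signal fall into each equal-width amplitude bin. The function name, the bin range, and the number of bins are assumptions made for illustration; the same range and bin count are assumed to be shared by all classes so that the resulting feature vectors are comparable.

```python
def histogram_features(signal, low, high, n_bins):
    """Count samples per equal-width amplitude bin over [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for sample in signal:
        if low <= sample <= high:
            # The upper edge `high` is assigned to the last bin.
            index = min(int((sample - low) / width), n_bins - 1)
            counts[index] += 1
    return counts

# Made-up acceleration samples, for illustration only.
segment = [-0.9, -0.2, 0.1, 0.3, 0.95]
print(histogram_features(segment, low=-1.0, high=1.0, n_bins=4))
```

Each bin count becomes one feature of the signal segment, so a signal whose amplitude spreads more widely fills the outer bins more heavily than a signal concentrated near zero.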

DIMENSIONALITY REDUCTION
Feature selection is a crucial step in the classification process as it involves removing irrelevant features that have no impact on the classification outcome, thereby reducing computational load. This process is also known as dimensionality reduction. Figure 5 illustrates the feature engineering process, which encompasses feature extraction and selection. During the feature engineering process, the J48 algorithm's decision tree was utilized to identify significant features [30][31][32][33]. The decision tree consists of multiple nodes that follow a top-to-bottom flow. The topmost node is known as the root node, and branching occurs until a leaf node (class) is reached. Figure 6 and Figure 7 depict sample decision trees generated by the J48 algorithm.
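One common way to use a decision tree for feature selection is to keep only the features that appear in its decision nodes. The source does not state the exact selection criterion, so the sketch below rests on that assumption. The tree representation, feature names, and thresholds are hypothetical.

```python
from collections import deque

# Assumed representation: an internal node is
# (feature_name, threshold, left_subtree, right_subtree); a leaf is a class label.

def features_used(tree):
    """Return the features found in decision nodes, in top-to-bottom order."""
    used = []
    queue = deque([tree])
    while queue:
        node = queue.popleft()
        if isinstance(node, str):  # leaf node: class label, nothing to collect
            continue
        feature, _threshold, left, right = node
        if feature not in used:
            used.append(feature)
        queue.extend((left, right))
    return used

# Hypothetical tree with made-up feature names and thresholds.
sample_tree = ("kurtosis", 3.1,
               ("std", 0.4, "Good", "Stmf"),
               ("kurtosis", 5.0, "Labwo", "Trbjf"))
print(features_used(sample_tree))
```

Features that never appear in the tree are treated as irrelevant and dropped, which reduces the dimensionality of the data passed to the classifier.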

FAULT CLASSIFICATION USING J48 AND MODLEM CLASSIFIER
In this section, the performance of the MODLEM classification algorithm at various load conditions is benchmarked against the J48 algorithm to verify the advantage of MODLEM in the fault diagnosis process. For that, the selected statistical features and histogram features were used in the classification process, and two performance measures, namely classification accuracy and the confusion matrix, are considered. The hyperparameter settings for the J48 algorithm and MODLEM at the time of the fault classification process are described in Table 2.
Table 2. Hyperparameter settings for J48 and MODLEM
J48
Batch size: 100
C (confidence factor): varied from 0 to 1 in steps of 0.1 (shown in Figure 8)
M (minimum number of objects): varied from 5 to 100 in steps of 5 (shown in Figure 9)
MODLEM
Classification strategy: m-estimate
Conditional measure: conditional entropy
Rules type: lower approximation of certain rules
Figure 9 and Figure 10 display the plots drawn between the confidence factor and J48 classifier performance. From the plots, one can observe that the optimum confidence factor varies with the load condition. From Figure 8, the optimum confidence factor for statistical features is 0.1, 0.1, and 0.2 for the no load, half load, and full load conditions, respectively. Similarly, for histogram features, a confidence factor of 0.1 produces the maximum classifier performance in all three load conditions. Figure 11 and Figure 12 illustrate the relationship between the minimum number of objects and classifier performance. From these plots, one can understand that the minimum number of objects to be considered for rule-making is five in all load conditions, for both statistical and histogram features.
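The tuning effort that J48 requires can be made concrete by enumerating the two sweeps described for Table 2. The sketch below only builds the value grids and picks the best value under a scoring function. The scoring function is an arbitrary placeholder that stands in for training and evaluating a classifier, so its output has no relation to the reported results. All names are hypothetical.

```python
# Value grids for the two J48 sweeps described for Table 2.
confidence_factors = [round(i * 0.1, 1) for i in range(11)]  # 0.0, 0.1, ..., 1.0
min_objects = list(range(5, 101, 5))                         # 5, 10, ..., 100

def best_setting(values, evaluate):
    """Return (value, score) with the highest score; ties go to the first value."""
    return max(((v, evaluate(v)) for v in values), key=lambda pair: pair[1])

def placeholder_score(c):
    # Arbitrary stand-in for a classifier accuracy measurement (assumption).
    return 1.0 - abs(c - 0.3)

print(len(confidence_factors), len(min_objects))
print(best_setting(confidence_factors, placeholder_score))
```

Each sweep has to be repeated for every load condition and feature set, which is the overhead that a classifier without numerical hyperparameters avoids.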

Verification of classification performance of MODLEM at different load conditions
The first performance measure discussed here is classification accuracy. From Figure 12, one can observe that the accuracy of MODLEM is significantly higher than that of the J48 algorithm in all load conditions. The improved performance can also be seen with the different data sets of histogram features. The second performance measure, the confusion matrix, is used in many applications to show how the instances were actually classified. Each row of the confusion matrix indicates the actual class, and each column indicates the predicted class, as shown in Table 3. From Figures 13 and 14, one can understand that the MODLEM algorithm has higher classification accuracy than J48 for both statistical and histogram features. Also, from the confusion matrices given in Tables 3 and 4, the classification capability of MODLEM is similar to that of the J48 algorithm. Table 5 provides a breakdown of the class-wise accuracy of the MODLEM algorithm, representing the performance of the MODLEM classifier in terms of true positive rate (TP), false positive rate (FP), precision (Pr), recall, and F-measure. For a given class (for example, "Good"), TP measures the proportion of instances of that class that are correctly classified, while FP represents the proportion of instances of other classes that are mistakenly assigned to it. In an ideal classifier, TP should be close to one, and FP should be zero.
From Table 5, the average TP value is greater than 0.8. Hence, the rule-based classifier is well suited to this type of problem. Precision (Pr) refers to the probability that an instance retrieved for a specific class is correctly classified [32]. It is calculated as the ratio of true positive (TP) instances to the sum of true positive and false positive instances (TP+FP). Precision is also known as the positive predictive value and serves as a measure of accuracy or quality.
Recall, also known as sensitivity, represents the ability of the classifier to correctly classify instances (TP) out of all the instances that actually belong to the class (TP+FN). False-negative (FN) instances are considered type 2 errors, indicating cases where the classifier misclassifies the actual category. The F-measure is defined as the harmonic mean of recall and precision, which equals the square of their geometric mean divided by their arithmetic mean. When the recall and precision values are close, it is approximately their average. The F-measure is expressed as
$$F = \frac{2 \cdot \mathrm{Pr} \cdot \mathrm{Recall}}{\mathrm{Pr} + \mathrm{Recall}}$$
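The class-wise measures described above can be computed directly from a confusion matrix whose rows are actual classes and whose columns are predicted classes. The following Python sketch uses the standard definitions. The function name is hypothetical, and the three-class matrix contains made-up counts for illustration only, not the values of Tables 3 to 5.

```python
def per_class_metrics(matrix, labels):
    """matrix[i][j] = number of instances of actual class i predicted as class j."""
    total = sum(sum(row) for row in matrix)
    results = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                  # actual class k, predicted otherwise
        fp = sum(row[k] for row in matrix) - tp   # other classes predicted as k
        tn = total - tp - fn - fp
        recall = tp / (tp + fn) if tp + fn else 0.0      # true positive rate
        fp_rate = fp / (fp + tn) if fp + tn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f_measure = (2 * precision * recall / (precision + recall)
                     if precision + recall else 0.0)     # harmonic mean
        results[label] = {"tp_rate": recall, "fp_rate": fp_rate,
                          "precision": precision, "recall": recall,
                          "f_measure": f_measure}
    return results

# Made-up counts for three of the suspension states, for illustration only.
labels = ["Good", "Stmf", "Wlp"]
matrix = [[8, 2, 0],
          [1, 9, 0],
          [0, 0, 10]]
for name, values in per_class_metrics(matrix, labels).items():
    print(name, {key: round(val, 3) for key, val in values.items()})
```

In a multi-class problem each class is scored in a one-versus-rest manner, so a class that is never confused with another one reaches a true positive rate of one and a false positive rate of zero.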

CONCLUSION
This study proposes the use of the MODLEM classifier to identify rules for classifier design. A comparative study was conducted using a dataset acquired from a specially designed suspension test setup, aiming to classify multiple faults in suspensions. The results demonstrate that the MODLEM algorithm outperforms the J48 algorithm, which required parameter tuning for each dataset. In the current case study, the MODLEM algorithm achieves an average classification accuracy of 91.42% for statistical features and 84.67% for histogram features when classifying faults. Furthermore, the experiment highlights the applicability of the MODLEM classifier in various industrial applications due to its robust performance compared to standard tree-based classifiers used for fault identification in rotary machinery based on vibration signals.