The Detection of Sensor Signal Attacks in Industrial Control Systems

To improve productivity and efficiency in industrial manufacturing, the fourth industrial revolution leads to the implementation of Cyber Physical Systems (CPS) and Internet of Things (IoT) in the industrial environment. Ubiquitous communication makes CPS susceptible to external influences, which can have a negative intention; for instance, CPS are prone to various attacks and malicious threats by different adversaries. The impact of an attack on the system can lead to anomalies and serious consequences for system parts or the system as a whole. Security mechanisms must be developed in order to timely detect different attacks and to keep the system safe and protected. In this paper, a method for sensor signal attacks detection in a continuous time controlled systems has been proposed. The method is based on Support Vector Machines (SVM) and tested on the data obtained from the Secure Water Treatment (SWaT) testbed, a scaled-down plant that produces purified water.


INTRODUCTION
To meet the requirements concerning efficient and flexible production, manufacturing enterprises have to embrace forthcoming technologies -Internet of Things (IoT) [1] and Cyber Physical Systems (CPS) [2]. The utilization of these technologies in industrial processes leads to significant changes that are characterized as the fourth industrial revolution -Industry 4.0 [3,4].
CPS integrate cyber capabilities into physical elements through their augmenting by cyber modules with integrated computation and communication abilities. The devices within Industry 4.0 plant (e.g., sensors and actuators) represent CPS, i.e. smart devices that can perform local processing and communicate with each other. The control of industrial plant is carried out through intensive communication between the central control system and smart devices in the plant (Figure 1) or through mutual communication between smart devices in the case of distributed control. This implies fully implemented internet communication, usually wireless [5], which allows such systems to be controlled and monitored online. The widespread communication opens up a new space for potential cyber attacks. These attacks can have fatal consequences; they can disrupt or completely disable the functioning of the system, or even influence the safety. Therefore, the development of defense techniques with a high level of protection is necessary in order to make systems attack-resilient and keep its normal functioning. Timely detection of anomalies can be crucial for preventing possible negative consequences. This represents a challenging task since, as a rule, adversaries aim to remain stealthy during the attack.
Generally, malicious cyber attacks can be classified as Denial of Service (DoS) or deception attacks [6]. Making data temporarily or permanently inaccessible, DoS attacks obstruct data from reaching their destinations and in essence they are similar to failures in communication. DoS attacks are not adapted to the system on which they act, they have a general goal to disrupt system functioning. Unlike DoS attacks that do not aim to be stealthy, deception attacks are more difficult to recognize. Deception attacks are able to manipulate data at a higher level, sending corrupted data to the system components.
In network controlled continuous time systems, controller closes control loop over the communication network through intensive communication with smart sensors and actuators. On the controller sensor input deception attack, adversary changes real plant outputsensor signal x(t) to corrupted controller sensor input x*(t), whereas actuator output deception attack changes the real value of the controller output signal u(t) with corrupted value u*(t) (Figure 1). Deception attacks detection represents a challenging task, and a number of different techniques for solving this issue were recently proposed. All these techniques can be classified into two groups: (1) data-centric and (2) design-centric [7]. Data-centric techniques use collected system data, while design-centric techniques are based on system analytical model and its control algorithms. In [8], attack detection and identification issues are considered through the design-centric technique that uses the proposed CPS model. Different data centric techniques were developed as well. For example, an anomaly detection method based on convolutional neural networks is proposed in [9]. Research work from [10] presents methods based on deep neural networks (DNN) and one-class support vector machines. Autoregression modeling and control limits techniques deployed for PLC security monitoring in [11], also belong to the data-centric group. A comparison of the performances of linear classification methods (Logistic Regression, Lasso, Support Vectors based classification with linear kernel), decision trees, and fully connected neural networks in cyber attacks detection in networked industrial control systems is presented in [12]. The analysis has shown that among the considered techniques fully connected neural networks presented the best accuracy (around 80%).
In this work, we consider sensor signals deception attacks and we propose a method based on ε-insensitive Support Vector Regression (ε-SVR) for their detection. The performances of the method are evaluated using publically available Secure Water Treatment (SWaT) dataset [13].
The reminder of the paper is organized as follows. Section 2 briefly outlines the ε-SVR, whereas in Section 3 we present the developed method for sensor signal attacks detection. Section 4 refers to implementation and experimental evaluation of the proposed method using SWaT dataset. Finally, the conclusions and directions for future research are discussed in Section 5.

SUPPORT VECTOR REGRESSION
This section provides a short overview of ε insensitive support vector regression (ε-SVR) that represents a basis of the proposed method for sensor signal attacks detection. For a given training data set {(x 1 , y 1 ), (x 2 , y 2 ), ..., (x l , y l )}, where x i , i∈ [1, l] denotes input variables vector, and y i represents the corresponding response value, the aim of ε-SVR is to find a function f(x) which has a maximum ε deviation from y i , and simultaneously tends to be as flat as possible. Errors that are less than the value of ε are not considered. The function f(x) in the case of linear dependence is described as follows: ( ) The flatness of the function is achieved when w is small. Thus, given considerations can be formalized through the following minimization problem: which minimizes the norm of w while keeping all the data within ε tube around f(x i ). Expression (2) implies an approximation of all pairs (x i , y i ) with ε precision. As this case does not frequently happen in reality, deviations can be allowed by introduction of slack variables ξ i , ξ i * , in (2) leading to the following form of the optimization problem [14]: where constant C corresponds the tradeoff between function flatness and the number of points that fall outside the ε-tube. The optimization problem (3) can be represented in its dual form [14]: In (4), α i and α i * represent Lagrange multipliers. If the condition |f(x i )-y| ≥ ε is fulfilled, the Lagrange multipliers are non-zero. Otherwise, if the vectors are within the ε-tube, α i and α i * vanish (equal zero). Vectors corresponding to non-zero α i and α i * are called support vectors. The solution of the problem described in (3) is given by: where ns represents the number of support vectors, and f(x) becomes: ε-SVR can be extended to non-linear regression. This is achieved by mapping the input vectors into highdimensional space where the regression is linear. Since in (4), training input vectors are only present in inner product the space of higher dimension can be implicitly defined, i.e. it is only necessary to know inner product in this space and not the explicit transformation of data. Inner product can be defined using kernel K(x, x i ) (a function that satisfies the conditions of Mercer's theorem [14]) leading to the following form of relation (4) Accordingly, the function f(x) becomes: Some of the commonly used kernels are polynomial kernel, radial basis kernel, wavelet kernel, sigmoid kernel, etc. It is necessary to select the kernel that is effective for the given application. The kernel chosen for the application in this work is radial basis function defined as follows: where γ determines the width of the bell-shaped curve.

SENSOR SIGNAL ATTACKS DETECTION METHOD
The method for sensor signal attacks detection that we propose in this paper consists of offline ε-SVR training and online attack detection. During training, ε-SVR model of the sensor signal for the system operating under normal conditions is generated, whereas online attacks detection is based on the discrepancy between sensor signal values estimated using generated model and measured values ( Figure 2). In our approach, the current value of the sensor signal x i is estimated from the buffer of previous k values x i-k , ...x i-1 generating the independent variables vector Thus, in the training phase, ordered pairs of independent and response values are defined as follows: The buffer length k is one of the parameters that should be tuned during training in order to get as good a model as possible. Furthermore, ε-SVR parameters (ε, C, and radial basis function parameter γ) need to be set. Each combination of the parameters leads to one ε-SVR model.

Figure 2. Overview of the method for sensor signal attacks detection
For the selection of the most appropriate model, two criteria are employed: (1) number of support vectors (SV), and (2) the model accuracy over the whole dataset including test data. To keep the model as simple as possible and to reduce the computational complexity and the online estimation latency thereof, the number of the support vectors should be as low as possible. On the other hand, the reduction of the number of support vectors can lead to the inaccurate model. Thus, to preserve the quality of the estimation, we also consider the number (∆ out ) of estimated values over the whole dataset whose absolute errors with respect to response values exceeded the threshold m. The threshold is based on the mean absolute deviation between the real x i and the esti- Once appropriate model is selected during offline training phase, the online attack detection is carried out based on the difference of estimated and measured values. If the absolute error between measured sensor value and estimated value exceeds the detection threshold, the attack or another anomaly in the system performance is present. The attack detection threshold (∆ max ) represents the maximum absolute error between estimated and measured signal values on the whole dataset increased by 10%: Thus, during online detection, the attack occurs when the absolute difference between measured (x j ) and estimated ˆj x sensor signal value crosses over ∆ max :

IMPLEMENTATION AND EXPERIMENTAL EVALUATION OF THE PROPOSED METHOD
To test the performances of the proposed method, we have used the publicly available SWaT dataset. In this section, before presenting the results of the method implementation, we briefly overview the dataset and the system that was used as a platform for its generation.

Secure Water Treatment (SWaT) testbed
SWaT is functional scaled down water treatment plant at the Singapore University of Technology and Design, which is capable to produce 5 gallons of purified water per minute. It was built to provide data from the real system useful for research mainly in the field of cybersecurity [15]. The whole process of water purification is divided into 6 serially interconnected stages (P1 to P6), as shown in Figure 3. Each of the stages contains a number of sensors and actuators, from which the required signals are obtained. Sensors and actuators are connected via wired or wireless links to the appropriate PLCs (Programmable Logic Controller). Further, all PLCs within the SWaT are connected to the SCADA (Supervisory Control and Data Acquisition) system. Data from all sensors and actuators were recorded every second for 11 days. The data is generated in 2 cases: (1) under normal conditions (without attack), and (2) when the attacks are present. A total of 41 attacks have been created, of which 36 have a physical impact on the system. Regarding the location of the attack, they can be simultaneously focused on a single or on multipoints within one or more stages, and each of the attacks is labeled along timeline. Attacks are of different types, duration, intensities, and therefore impact the parts of the system and/or the system as a whole.
In the focus of this paper are anomalies/attacks affecting the LIT101 sensor, a water level sensor on raw water tank, whose location is at the first stage, as shown in Figure 3. On the LIT101 sensor signal attack, the water level in the tank can come out of the permitted range (underflow/overflow), or other damages to the system such as pump P101 damage and UF Feed Tank water level underflow/overflow can occur.

Experimental results
Described method for attacks detection was implemented in MATLAB and tested using LIT101 sensor signal data, a part of which (for the normal system performance) is shown in Figure 4. Before using SWaT data in offline training, it is necessary to cut off the part that relates to the operation of the system before establishing a stable operating mode as presented in red line in Figure 4.
In the case of LIT101 sensor, this means that the tank should get into the continuous mode of charging and discharging. Therefore, the first 15400 data records were removed, leaving a normal operation dataset with a total of 481400 records. 10% of normal operation dataset are used for ε-SVR training. A number of models for different parameters' values were tested. The experiments have shown that the error cost parameter C and kernel parameter γ do not have significant influence on the model accuracy. On the other hand, we have varied the buffer size k in the range of 2 to 10 and the ε between 0.03 and 1. The best result for both criteria (707 support vectors and ∆ out =0) was achieved for parameter values ε=1 and k=2. The online application of the method employing the selected model was simulated using the data obtained during system performance under attack. This dataset contains 449919 records. Our method was able to appropriately detect all five attacks on the LIT101 sensor, as presented in Figure 5. In Figure 5, the signal values for LIT101 from the SWaT dataset during attacks on LIT101, and their predicted (estimated) values are shown in blue and red line, respectively.  (Figure 5c).
In addition to the LIT101 sensor signal attacks, the method detected two attacks on the adjacent devices that had the influence on the water level of raw water tank ( Figure 6).
In particular, single stage single point attack on raw water inlet motor valve MV101 (attack 1 - Figure 6a) and multi point multi stage attack on P101 and LIT301 1 Attacks are labeled as presented in [13] (attack 26 - Figure 6b) were detected. On the other hand, the method presents false positives neither on dataset with attacks nor on dataset obtained during normal system functioning.

CONCLUSSION
This paper proposed a method for sensor signals attack detection in remotely controlled continuous time systems that is based on the prediction of sensor signal values using ε-insensitive SVR. The method is data centric and it is based on ε-SVR model obtained from the sensor signals values acquired during normal system operation. For the evaluation, we used the raw water tank level sensor -LIT101 signals from SWaT testbed dataset. As presented in the paper, the method was able not only to efficiently detect the attacks on the sensor signals, but also some of the attacks on the adjacent actuators, without false positives. It is worth noting that there is a low latency of the detection with respect to the moment of the attack occurrence in the signal -it is at the level of 1 sample. During implementation, a number of parameters should be set. In this paper we chose their values by trial and error. A method for optimization of these parameters represents an avenue for future work. Furthermore, future work will address the attacks on actuators as well as the methods for the classification of the attacks according to the point of the attack.