PREDICTING THE TYPE OF PHYSICAL ACTIVITY FROM TRI-AXIAL SMARTPHONE ACCELEROMETER DATA

The development of statistical learning methods and their implementation in mobile device software enables moment-by-moment study of human social interactions, behavioral patterns, sleep, physical mobility, and gross motor activity. Through supervised Machine Learning, Human Activity Recognition (HAR) has recently found numerous applications in biomedical engineering, especially in the field of digital phenotyping. With this in mind, and in order to quantify human movement in situ using data from portable digital devices, we developed code that uses a Random Forest classifier to predict the type of physical activity from tri-axial smartphone accelerometer data. The code was written in Python using the Anaconda distribution of data-science packages. Raw accelerometer data were collected with the Beiwe research platform, developed by the Onnela Lab at the Harvard T.H. Chan School of Public Health. Hyperparameters were tuned by defining a grid of hyperparameter ranges and using Scikit-Learn's RandomizedSearchCV method to sample randomly from the grid, performing K-Fold cross-validation (CV) with each sampled combination of values. The results obtained will support the development of more robust models for predicting the type of physical activity, involving more subjects, different hardware, various test situations, and different environments.


INTRODUCTION
In order to demonstrate the usefulness of Human Activity Recognition (HAR) methods based on supervised Machine Learning, the objective of this research was to create code that predicts the type of physical activity (e.g., walking or climbing stairs) from tri-axial smartphone accelerometer data. Besides the accelerometer, smartphones contain numerous other sensors, such as a gyroscope, GPS, and magnetometer, that can be used to classify movement. All of these sensors can be exploited for HAR through supervised Machine Learning. As a result, smartphones can be used for various purposes, such as remote health monitoring of disabled and elderly patients while they perform regular activities throughout the day, or simply for reminding us to be more active. Furthermore, the use of these devices for monitoring our actions could help us make better decisions about our future activities [1]. Quantification of behavior in situ using data from portable digital devices is also important for phenomics (the systematic study of phenotypes on a genome-wide scale), since digital phenotyping can be used to correlate mobility and quality of life in patients with spine disease [2], to capture novel recovery metrics after cancer surgery [3], and to predict, diagnose, monitor, and develop treatments for brain disorders [4]. The problem of predicting human activity by classifying sequences of accelerometer data from smartphones and wearable devices has been analyzed by numerous researchers [5][6][7]. The actions carried out by one or more subjects can be identified by gathering and analyzing context information about the user's state and surrounding environment, using environmental and on-body sensors together with distributed computing resources.
This can be a challenging problem because of the large number of observations produced each second, the temporal nature of the observations, and the lack of a clear way to relate accelerometer data to known movements [8]. Machine Learning models that have already been used for activity recognition include the Movelet Method [9], Support Vector Machines (SVMs) [10], Decision Trees [11], Naive Bayes [12], and Markov chains [13]. It should be noted that although Machine Learning models can be fit to training data, they may not generalize with sufficient accuracy to data from subjects not included in the training set. The resulting loss of accuracy may lead potential users to mistrust, ignore, or remove the phone app in question. Taking this into account, our approach first compares the accuracy of a Random Forest classifier and Multinomial Logistic Regression as classification models, and then uses the Random Forest classifier to predict the type of physical activity from tri-axial smartphone accelerometer data.

METHODS
Taking into account that smartphone accelerometers are very precise, and that different physical activities give rise to different patterns of acceleration, we used sensor-based HAR to build a model for predicting the type of the user's physical activity. Sensor-based HAR collects motion data from smart sensors such as accelerometers [14], while the model for predicting the activity assumes a predefined activity set
$$A = \{A_i\}_{i=1}^{m} \tag{1}$$
where $m$ denotes the number of activity types. In order to capture activity information we can use a sequence of sensor readings
$$s = \{d_1, d_2, \ldots, d_t, \ldots, d_n\} \tag{2}$$
where $d_t$ denotes the sensor reading at time $t$. If the true activity sequence is
$$A^{*} = \{A^{*}_j\}_{j=1}^{n}, \quad A^{*}_j \in A \tag{3}$$
where $n$ denotes the length of the sequence, then building a model $F$ which predicts the activity sequence from the sensor readings,
$$\hat{A} = \{\hat{A}_j\}_{j=1}^{n} = F(s), \quad \hat{A}_j \in A \tag{4}$$
means learning $F$ by minimizing the discrepancy between the predicted activity $\hat{A}$ and the true activity $A^{*}$.
In order to predict the type of physical activity, we developed the code in Python using the Anaconda distribution of data-science packages. The input data used for training consisted of two files. The first file (train_time_series.csv) contained the raw accelerometer data, while the second one (train_labels.csv) contained the activity labels used to train the model. Because accelerometers are sampled at high frequency, the labels in train_labels.csv were provided only for every 10th observation in train_time_series.csv. Raw accelerometer data were collected using the Beiwe research platform, which was developed by the Onnela Lab at the Harvard T.H. Chan School of Public Health in order to collect and analyze raw sensor and phone use data for biomedical research. It contains three cloud-based components, for collecting data, managing studies, and performing data analysis, primarily at the Harvard Medical School. Each row of the raw accelerometer data file has the following format: time stamp, UTC time, accuracy, x, y, z.
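The file layout described above can be turned into a single labelled table roughly as follows. This is a minimal sketch rather than the code used in this research: the column names `timestamp` and `label`, the helper `attach_acceleration`, and the small synthetic frames standing in for train_time_series.csv and train_labels.csv are all assumptions made for illustration.

```python
import pandas as pd


def attach_acceleration(labels: pd.DataFrame, series: pd.DataFrame) -> pd.DataFrame:
    """Add the x, y, z readings to each labelled row by matching timestamps.

    Assumes both frames share a 'timestamp' column (hypothetical name).
    """
    return labels.merge(
        series[["timestamp", "x", "y", "z"]],
        on="timestamp",
        how="left",
        validate="one_to_one",  # fail loudly if a timestamp is duplicated
    )


# Synthetic stand-in for train_time_series.csv: 30 consecutive readings.
series = pd.DataFrame({
    "timestamp": range(1000, 1300, 10),
    "x": [0.01 * i for i in range(30)],
    "y": [0.02 * i for i in range(30)],
    "z": [1.0] * 30,
})

# Synthetic stand-in for train_labels.csv: one label per 10th reading.
# Which reading within each block of ten carries the label is an assumption.
labels = (
    series.iloc[9::10][["timestamp"]]
    .assign(label=[1, 2, 2])
    .reset_index(drop=True)
)

labelled = attach_acceleration(labels, series)
print(labelled)
```

Matching on the timestamp rather than on row position avoids having to know exactly which of every ten readings carries the label.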
Since x, y, and z correspond to measurements of linear acceleration along each of the three orthogonal axes, they can be used to separate different physical activities, such as standing, walking, going downstairs, or going upstairs. The activities were encoded as integers (1 = standing, 2 = walking, 3 = stairs down, 4 = stairs up). First, the measurements of linear acceleration (along each of the three orthogonal axes x, y, z) from every 10th observation in train_time_series.csv were appended to the train_labels data frame. Then, to predict the type of physical activity from tri-axial smartphone accelerometer data, we compared the accuracy of the Random Forest and Multinomial Logistic Regression models for classifying our type of data. Multinomial Logistic Regression generalizes Logistic Regression to multiclass problems, with the predictions transformed using the logistic (softmax) function. Random Forest, in contrast, is an ensemble learning method for classification and regression based on constructing a multitude of decision trees at training time, where the prediction is either the mode of the classes (classification) or the mean prediction (regression) of the individual trees.
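A comparison of the two model families described above might be set up along the following lines. The sketch uses synthetic three-column data in place of the real accelerometer readings, and the fold strategy (stratified, shuffled), the number of trees, and all variable names are assumptions rather than the settings used in this research.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for (x, y, z) readings with four activity labels.
rng = np.random.default_rng(0)
y = np.repeat([1, 2, 3, 4], 50)
X = rng.normal(size=(200, 3)) + 0.5 * y[:, None]

# 10 folds; stratification and shuffling are assumptions of this sketch.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

models = {
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    # With the default lbfgs solver, scikit-learn fits a multinomial model
    # when the target has more than two classes.
    "logistic_regression": LogisticRegression(max_iter=1000),
}

fold_scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    for name, model in models.items()
}

for name, scores in fold_scores.items():
    print(f"{name}: mean accuracy {scores.mean():.3f} over {len(scores)} folds")
```

Using the same `cv` object for both models means each fold's accuracies come from identical splits, so they can be compared pairwise.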
In order to find out which model, Random Forest or Logistic Regression, is better suited to our classification task, the cross-validated accuracy of the two models was calculated. After the better model was selected, its hyperparameters were tuned. Tuning was done by defining a grid of hyperparameter ranges and using Scikit-Learn's RandomizedSearchCV method to sample randomly from the grid, performing K-Fold CV with each sampled combination of values. Then, in order to test the accuracy of the model before applying it to the final testing dataset, the data from the previously modified train_labels data frame were divided into train and test subsets. The train subset contained 80% of the data in the train_labels data frame, while the test subset contained the other 20%. This separation was performed so that overfitting could be detected on data not used for fitting. Although the intention of our article is to demonstrate the usefulness of the presented method for general use, it should be emphasized that if we had optimized the model on the whole train_labels data, the model would have scored very well on this set but would not necessarily have generalized to the new data used for final testing. The data used for final testing of the code consisted of two files. The first file, test_time_series.csv, contained the accelerometer data, and the goal was to predict the corresponding type of activity for every 10th observation in this time series. Class predictions were made using the measurements of linear acceleration from test_time_series.csv as input to the Random Forest classifier. The predictions were then added to the second file, test_labels.csv, which consisted of the timestamps corresponding to every 10th observation in test_time_series.csv. The code is presented as supplementary material to this report (provided at the end of the report).
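The tuning and hold-out steps described above can be sketched as follows. The hyperparameter names match those reported in this paper, but the candidate values, the number of sampled combinations, the number of folds, and the synthetic data are assumptions chosen to keep the example small. The sketch also tunes on the 80% subset only, which is one possible ordering of the two steps.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in for the labelled (x, y, z) data.
rng = np.random.default_rng(0)
y = np.repeat([1, 2, 3, 4], 50)
X = rng.normal(size=(200, 3)) + 0.5 * y[:, None]

# 80% / 20% split; stratifying by label is an assumption of this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Grid of candidate values (illustrative, not the ranges used in the paper).
param_grid = {
    "n_estimators": [20, 50, 100],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 3, 5],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_grid,
    n_iter=5,            # number of combinations sampled from the grid
    cv=5,                # K-Fold CV for each sampled combination
    scoring="accuracy",
    random_state=0,
)
search.fit(X_train, y_train)

# score() evaluates the refit best estimator with the accuracy scorer.
held_out_accuracy = search.score(X_test, y_test)
print(search.best_params_, f"held-out accuracy: {held_out_accuracy:.3f}")
```

Keeping the 20% subset out of the search means the reported accuracy is measured on data that influenced neither the fitted trees nor the choice of hyperparameters.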

RESULTS
Physical activity, usually defined as "any bodily movement produced by skeletal muscles that requires energy expenditure" [15], offers an attractive target for the application of acceleration sensors, which cover an amplitude range of −12 to 12 g and a frequency range of at least 0 to 20 Hz [16]. The basic mechanism of acceleration measurement can be explained by a mass-spring system, in which a mass is displaced when acceleration is applied, generating a force in a spring connected to the mass [17]. Depending on the method of signal transduction, accelerometers can be classified as piezoresistive, piezoelectric, or differential capacitive [18]. High-accuracy accelerometers, such as ActiGraph (ActiGraph LLC) and StepWatch (modus health llc), have been considered some of the most promising tools for the assessment of physical activity under free-living conditions. It should be mentioned that accelerometers, when used alone, can produce significant errors in motion analysis, since they cannot measure rotation around the vertical direction [17,18]. Activity recognition is an interesting problem for Machine Learning because each user performs the same activity slightly differently. It is therefore important to find a method that can generalize the important features of an activity rather than the specifics of how a particular user performs it. In order to verify which model, the Random Forest classifier or Logistic Regression, predicts the type of physical activity more accurately, the ratio of the 10-fold cross-validation testing results of the two models was calculated. If the Random Forest classifier has better accuracy, this ratio is greater than 1, and the testing results occupy the area above the diagonal line in Figure 1, which compares the results of the 10-fold cross-validation testing for both models.
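The ratio criterion described above amounts to a fold-by-fold comparison, sketched below. The function name `accuracy_ratio` and the three pairs of fold accuracies are hypothetical and are not results from this research; they only illustrate how a ratio above 1 corresponds to a point above the diagonal.

```python
import numpy as np


def accuracy_ratio(rf_scores, lr_scores):
    """Per-fold ratio of Random Forest accuracy to Logistic Regression accuracy."""
    rf = np.asarray(rf_scores, dtype=float)
    lr = np.asarray(lr_scores, dtype=float)
    if rf.shape != lr.shape:
        raise ValueError("both models need one score per fold")
    if np.any(lr <= 0):
        raise ValueError("ratio is undefined for non-positive accuracies")
    return rf / lr


# Hypothetical fold accuracies, for illustration only.
rf_scores = [0.60, 0.64, 0.58]
lr_scores = [0.50, 0.64, 0.60]

ratios = accuracy_ratio(rf_scores, lr_scores)
# A fold lies above the diagonal exactly when its ratio exceeds 1.
above_diagonal = ratios > 1
print(ratios.round(3), above_diagonal)
```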
It can be seen that all the testing results lie in the area above the diagonal line, which leads to the conclusion that the Random Forest classifier has better accuracy than Logistic Regression. This model is therefore better suited for predicting the type of physical activity from tri-axial smartphone accelerometer data. For hyperparameter tuning, we performed many iterations of the entire K-Fold CV process, each time using different model settings. The best parameters obtained in this way are the following: 'n_estimators': 300, 'min_samples_split': 10, 'min_samples_leaf': 3. Using the train subset of the train_labels data frame for fitting and the test subset for testing the model, the accuracy obtained was 62.67%. One of the factors that influences the accuracy of the model is the distribution of the data in the training dataset. As can be seen from Figure 2, the acceleration data are not distributed equally, which implies that some kind of renormalization of the data should be performed. Furthermore, the train_labels data do not represent all the activities equally. The unequal distribution of the physical activities through time is shown in Figure 3, where timestamps from the train_labels dataset are used as the time variable on the x-axis and the types of activity are represented on the y-axis by their numeric labels (1 = standing, 2 = walking, 3 = stairs down, 4 = stairs up). The majority of the data (56.8%) corresponds to walking, while 7.2% corresponds to standing, and about 12.53% and 23.47% correspond to upstairs and downstairs, respectively. This might be the reason why predictions for walking occur more often than predictions for going downstairs, going upstairs, or standing. This is even more evident when the whole train_labels data are used for fitting the Random Forest model and the final predictions are then made for the test_labels dataset.
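The class shares quoted above can be computed from a label column as in the sketch below. The label counts are hypothetical: they were chosen only so that the resulting percentages match the ones reported here, and the column name `label` is likewise an assumption.

```python
import pandas as pd

ACTIVITY_NAMES = {1: "standing", 2: "walking", 3: "stairs down", 4: "stairs up"}

# Hypothetical counts chosen to reproduce the reported percentages.
labels = pd.Series([1] * 27 + [2] * 213 + [3] * 88 + [4] * 47, name="label")

# Share of each activity, in percent, ordered by numeric label.
share = labels.value_counts(normalize=True).sort_index() * 100

for code, pct in share.items():
    print(f"{code} ({ACTIVITY_NAMES[code]}): {pct:.2f}%")

# Accuracy of a trivial model that always predicts the most frequent class.
majority_baseline = share.max()
print(f"majority-class baseline: {majority_baseline:.1f}%")
```

On data with this distribution, a classifier that always predicts walking would already reach about 56.8% accuracy, which is a useful reference point when interpreting the accuracy figures reported for the fitted model.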
In that case, the accuracy of the model drops to 45.6%. This decrease in accuracy might also be due to significant differences between the subjects from whom the train_labels and test_labels datasets were collected. All this implies that, in order to improve the accuracy of the model, it is important to score the model quantitatively in a way that encourages accuracy across all activities. It should also be noted that several studies have suggested two major problems with the validity of signals obtained from accelerometer sensors. The first problem is the location of the sensors, while the second is the reliability of the detected events. As a result, when detecting abrupt changes in frequency or attenuation in amplitude, both of which are characteristic features of freezing, the sensitivity and specificity of event detection might be less than satisfactory [19]. Therefore, further improvements of the applied model may be achieved by incorporating other public datasets to create a more robust model with more subjects, as well as with a wider range of hardware, test situations, and environments.
Figure 3: Unequal distribution of different physical activities through time
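One way to score a model so that every activity counts equally is to use class-averaged metrics instead of plain accuracy. The sketch below uses scikit-learn's balanced accuracy and macro-averaged F1 on a small hypothetical label set; these metrics are suggested here as an illustration and were not part of the evaluation in this research.

```python
from sklearn.metrics import accuracy_score, balanced_accuracy_score, f1_score

# Hypothetical ground truth dominated by walking (label 2).
y_true = [2] * 6 + [1, 3, 3, 4]
# A degenerate model that predicts walking for every observation.
y_pred = [2] * 10

acc = accuracy_score(y_true, y_pred)
# Mean of the per-class recalls: only walking is ever recovered.
bal = balanced_accuracy_score(y_true, y_pred)
# Unweighted mean of per-class F1; never-predicted classes count as 0.
macro_f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)

print(f"accuracy={acc:.2f} balanced_accuracy={bal:.2f} macro_f1={macro_f1:.4f}")
```

Plain accuracy rewards the degenerate model for the frequent class, while the class-averaged scores expose that three of the four activities are never recognized.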

CONCLUSION
In this research, in order to demonstrate the usefulness of supervised Machine Learning methods in the field of digital phenotyping, code for predicting the type of physical activity from tri-axial smartphone accelerometer data was created and tested. Comparison of the accuracies of the Random Forest and Logistic Regression models indicated that the Random Forest classifier might be the better solution for HAR applications. Therefore, code based on Random Forest as the classification method for predicting four types of physical activity (standing, walking, stairs down, stairs up) was created. Raw accelerometer data were collected using the Beiwe research platform. The results of the final code testing showed that the accuracy of the model is 45.6%. This accuracy might be the result of the unequal distribution of the different physical activities in the training dataset, and also a consequence of the differences between the subjects from whom the training and testing data were collected. In order to obtain better predictions than those achieved by the presented model, deep learning methods such as Recurrent Neural Networks and Convolutional Neural Networks might be suggested.

ACKNOWLEDGEMENT
The author would like to express her gratitude to Harvard University for launching the online learning platform