Solving Product Allocation Problem (PAP) by Using ANN and Clustering

Proper planning of a warehouse layout and of the product allocation within it constitutes a major challenge for companies. This paper presents a new classification approach to the problem. The authors used real picking data from a Warehouse Management System (WMS), collected during the peak season from September to January. An Artificial Neural Network (ANN) and automatic clustering with the Calinski-Harabasz criterion were used to develop the new classification approach. Client orders were reconstructed from the picking lists and analyzed, and these orders served as input data for the ANN and the clustering. Three variants were analyzed: a reference variant representing the current state, a variant with product relocation based on the ANN, and a variant with relocation based on automatic clustering. The research used over 380000 picks for almost 1600 locations. The architecture of the system module for solving the PAP is also presented. The research shows that multi-criteria clustering can increase the efficiency of the order picking process.


INTRODUCTION
The Product Allocation Problem (PAP) is one of the major challenges for many warehouses and logistics centres. It may appear to be a fairly simple problem; however, for many companies the current way of storing products is not sufficient [1]. The modern market forces companies to compete not only on a local but also on a global scale [2]. From the perspective of the global market, it matters less where a product was produced than how much time and money the transport and logistics operations cost. Transport costs are often a significant part of the total cost of a product, especially for products of relatively small value [3]. In the transport system, a warehouse is very often a bottleneck that raises total delivery costs. Consequently, if product allocation is planned randomly, the distance travelled during picking may increase, because high-demand products will be located far from the picking and packing zone. This can lead to lost time and a need for more employees and means of transport. Various item classification methods are used to plan product allocation in the warehouse. Items are identified and clustered, then distributed in the warehouse area so as to ensure the shortest access time to the products of highest importance [4]. Typical item classification methods are the ABC, XYZ, EIQ, AHP and COI index analyses [5][6][7][8]. Although these methods are still widely used, they do not satisfy the requirements placed on warehouses. These analyses are most often performed for just one criterion, sometimes repeated for another criterion, with the results then averaged. This approach makes it possible to take several criteria into account at the same time, but it does not capture correlations between the classified items [9][10][11][12][13][14].
Because the existing item classification methods are insufficient, companies are increasingly looking for new methods by combining or refining those already in use. Each company, by its nature, may have different expectations [15]. An example is a company producing bulk or granular goods packed in cartons and bags. For this type of business, it is important for products to be picked according to the type of packaging and weight, which determine stacking resistance. This avoids having to rearrange products on the order picking truck when small items could be damaged by heavier and larger items placed above them [16,17]. Therefore, if products were distributed only according to demand, picking according to packaging type and weight would cause a number of problems: it would be necessary to cross the warehouse several times and revisit places in the same area [2]. On the other hand, the importance of demand, seasonality and product value cannot be ignored [18]. One might also ask about the relationships between products, e.g. how often certain product groups appear on picking lists. Because of the amount of input data of this type, such problems cannot be solved by classical classification methods. For that reason, artificial intelligence methods such as Machine Learning and Artificial Neural Networks were examined in order to find a new method that takes many criteria into account at the same time.
The Product Allocation Problem (PAP) plays a significant role in the effectiveness of the product picking process in warehouses. Product allocation is strongly related to the efficiency of warehouse operation and depends on:
− required storage conditions,
− type of load unit,
− storage technology,
− product picking density.
Products can be allocated by the static storage method or the free storage method.
In the static storage method, every product has a precisely defined place in the warehouse, i.e. allocated shelves intended only for storing products of a certain type. The number of racks for one type of product corresponds to its maximum demand. The benefits of this method are the transparency of the warehouse and the ease of finding goods. Its main disadvantages are the need for a large storage area and longer picking routes.
Two cases of static storage locations can be distinguished:
− dedicated storage: a shelf for permanent storage of only one type of product,
− class-based storage for product groups: a shelf for permanent storage of any products in one group.
In the free storage method, there is no need to reserve space for a given type of product. Goods are placed on an ongoing basis on any currently available shelf, which is usually the nearest location. The size of the storage space in this case is set at the level of average stock, with an additional area that makes it possible to store more goods if demand increases. The free storage method reduces the required storage area by about 30% compared with the static storage method.
The main problem for warehouses using the static storage system is their low sensitivity to fluctuations in demand. Of course, ABC/XYZ methods can be applied cyclically throughout the year at a specified time interval, but the efficiency of warehouse operation always decreases outside these periods. For this reason, more complex methods of planning product allocation have been developed, allowing results to be obtained in real time. These methods are based on tools such as fuzzy sets, artificial neural networks and genetic algorithms. Modern methods usually do not rank products in order; instead, they directly check selected variants of product allocation from the set of acceptable solutions. Such methods include heuristic methods. The modern tools used for this purpose are:
− Genetic Algorithms (GA), which generate new solutions as combinations or mutations of current solutions. Genetic algorithms do not verify the entire set of solutions, but they can yield a better solution with each step of the search (generation). They are sometimes combined with other methods, e.g. combining product picking paths,
− Fuzzy Logic (FL), most commonly used when dividing products into categories. It allows the membership of a product in a given category to be adjusted when the value of the feature for that product is close to the boundary separating categories,
− Artificial Neural Networks (ANN), most often used where it is difficult to define an algorithm that achieves the desired result. In this case, usually only the input data and information about the desired effect are available. Artificial neural networks help find correlations between input and result. ANN operation can be divided into three phases: creating the network structure, training the network (acquiring knowledge), and operating the network.
Neural networks are used more and more often because of their speed of operation, the growing quality of results as new data is processed, and their very high adaptability. Artificial neural networks are most often used to predict demand, classify products and determine picking routes. Tools for automatic analysis, Big Data and machine learning are developing rapidly, and these methods can also be used to solve logistics problems in warehouses and logistics centres.

INPUT DATA ANALYSIS
The case study is based on real data from the warehouse of an e-commerce company stocking electronic home appliances, computers and computer accessories. Picking volume by hour of the day is presented in Figure 1. The storekeepers work in two shifts, from 8 am to 4 pm and from 4 pm to midnight. Products are dispatched through 12 gates, and most transports are sent at 3 pm.
In the analysed period, many products were sold only once or a few times. This could mean that:
− demand for those products was too low, and they should be removed from the company's offer,
− the products were ordered only for a specific client order and the warehouse treats them as cross-docking products; such products should not be analysed and clustered,
− the product ID was changed although the product is the same; this can happen when the producer changes the packaging or makes a small modification to the product,
− products with different IDs are similar, for example they differ in colour while the key parameters and functionality are the same; such products should be analysed as one.
The total area of the warehouse is about 5400 m² (90 × 60 m). The warehouse has 96 forklifts and picking trolleys for small orders.
The warehouse is divided into two main areas: an area for small products, marked with the prefix "T" in the location number, and an area for large products stored on pallet units, marked with the prefix "P".
The number of shelves in the "T" area is 22 000 for an average of 30 tiers; the number of shelves for the standard tier is 740. The number of pallet sockets in the racks is 3060 for an average of 3 tiers; the number of sockets is 1160. The pick-by-order policy is generally used, but for big orders the picking list is divided among a few storekeepers and picked simultaneously. Product relocation is done during normal daily picking. The picking process is as follows:
− product picking is done using mobile terminals, which suggest the collection path,
− the terminal indicates, among other things, locations, goods and quantity, and requires confirmation by scanning the location barcode on the rack and the barcode of the product,
− products are put on the proper level of a multi-picking trolley, and the storekeeper scans the Serial Shipping Container Code (SSCC),
− after picking all products from the list, the storekeeper goes to the shipping area, packs the products and prints the shipping letter.
The analysis was done for the period from October to December, the busiest period of the year. In the analysed period, 22586 product indexes were in stock (12095 picked in area P, 16889 picked in area T). In total, 106651 client orders were picked (54767 orders picked in area P, 83106 orders picked in area T). The data come from the busiest period of the year, from September to January. Most orders are shipped during this period, often with a small number of products. The number of requests and the type of products ordered are not easy to predict.

METHODOLOGY AND GENERAL ASSUMPTIONS
Raw data exported from the system are hard to understand and hard to analyse. The WMS records every operation, so if the storekeeper takes six pieces of a product from a rack in which ten pieces were stocked, the system records taking ten pieces from the shelf, putting six into containers and returning four pieces to the rack. Because of this, the data from the WMS were cleaned of useless information. After that, the client orders were reconstructed from each storekeeper's activities. In the next step, general statistics were computed. Based on the warehouse plan (AutoCAD), the location of each shelf and pallet rack was measured.
Matlab and Tableau software were used to build the product classification model. After a few iterations, the most promising model was chosen. The training process for the ANN was based on the data from October and November, and the data from November to December were used for simulation. The results were visualised in Tableau.
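The cleaning step described above can be illustrated with a minimal sketch. The record layout (order ID, location, product ID, signed quantity) is hypothetical, since the actual WMS export format is not given here, and the sketch assumes that the container movements have already been filtered out so that only shelf movements remain.

```python
from collections import defaultdict

# Hypothetical WMS log rows: (order_id, location, product_id, signed_quantity).
# A negative quantity is a removal from the shelf, a positive one a return to it.
raw_log = [
    ("ORD-1", "P-01-02", "SKU-A", -10),  # whole stock taken off the shelf
    ("ORD-1", "P-01-02", "SKU-A", +4),   # unused pieces put back
    ("ORD-1", "T-11-05", "SKU-B", -1),
    ("ORD-2", "P-01-02", "SKU-A", -2),
]

def net_picks(log):
    """Collapse take/return movements into one net picked quantity per order line."""
    totals = defaultdict(int)
    for order_id, location, product_id, qty in log:
        totals[(order_id, location, product_id)] -= qty
    # Keep only lines where something actually left the shelf.
    return {key: picked for key, picked in totals.items() if picked > 0}

picks = net_picks(raw_log)
```

In this toy log, the "take ten, return four" pair collapses into a single pick of six pieces, which is the quantity that matters for the order analysis.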
The method of evaluating the Product Allocation Problem taking into account the developed predictive model of the technical condition is shown in Figure 2.

Classification based on the Artificial Neural Network
A feed-forward network was used for the simulation. The structure of the artificial neural network was selected by the method of successive approximations: 10 different structures were examined and compared on the basis of the mean square error (MSE). The resulting artificial neural network (ANN) consists of one hidden layer with 8 neurons and one output layer (Fig. 4). The hyperbolic tangent sigmoid function was used as the activation function, and Levenberg-Marquardt backpropagation was used as the learning algorithm. As the input data, 22 586 cases were used. The data were divided into three sets: 15810 samples (70%) for training, 3388 samples (15%) for validation and 3388 samples (15%) for testing. During data preparation the following Matlab functions were used:
− mapminmax: normalizes input values to the range [-1, 1] (to accelerate calculations),
− removeconstantrows: removes input vectors consisting of identical values.
For the ANN structure with the best results, an MSE of 0.00095738 was achieved at epoch 12. After epoch 12 the results get worse. This can also be seen in how the learning factors change over the epochs. The gradient, the damping factor (training gain, mu) and the validation checks are presented in figure 6.
The results achieved in the training, validation and testing of the neural network are presented in Fig. 7. The ANN returns a floating-point value instead of an integer, so the prediction of the state is similar to fuzzy logic, and these results can be processed in a second stage. It can be seen that the average error is less than 0.08. This shows that the chosen ANN structure fits the real data cases well. The circles in figure 7 compare the target values with the predicted ones. If the model reflects real cases well, the regression line is tilted 45 degrees to the axis, and the results (circle markers) are located near the regression line. If the model is overtrained, the results are usually above the line. If the model is not trained enough, the results are below it. If the results lie on both sides of the line at a considerable distance, despite the inclination of 45 degrees, the analyzed cases are characterized by high uncertainty. The regression for the presented ANN structure is given by formula (2). The error analysis is presented by the histogram in figure 8. Figures 5 and 6 show the good quality of the model results.
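The data preparation described above can be sketched in a few lines. This is a plain-Python approximation of what the two Matlab functions do (row-wise rescaling to [-1, 1] and removal of constant rows), not the Matlab implementation itself; the function names and the toy feature matrix are illustrative assumptions. The split helper only reproduces the 70/15/15 sample counts quoted in the text.

```python
def remove_constant_rows(rows):
    """Drop feature rows whose values are all identical."""
    return [row for row in rows if max(row) != min(row)]

def map_min_max(row, lo=-1.0, hi=1.0):
    """Linearly rescale one non-constant feature row to [lo, hi]."""
    r_min, r_max = min(row), max(row)
    scale = (hi - lo) / (r_max - r_min)
    return [lo + (v - r_min) * scale for v in row]

def split_sizes(n, train=0.70, val=0.15):
    """Return (train, validation, test) sample counts that sum to n."""
    n_train = round(n * train)
    n_val = round(n * val)
    return n_train, n_val, n - n_train - n_val

# Toy feature matrix in Matlab orientation: one row per feature, one column per sample.
features = [
    [10.0, 40.0, 25.0, 10.0],  # e.g. number of orders
    [5.0, 5.0, 5.0, 5.0],      # constant row, carries no information
    [1.0, 3.0, 2.0, 5.0],      # e.g. number of items
]
normalized = [map_min_max(row) for row in remove_constant_rows(features)]
sizes = split_sizes(22586)  # the 22 586 cases mentioned in the text
```

Removing constant rows before rescaling matters here, because a constant row would otherwise cause a division by zero in the rescaling step.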

Classification model based on Machine Learning
To build the classification model, 22586 input records were used. The following criteria were defined for each product index: the number of orders in which it occurs, the number of items, and the difference between the current and the previous month in the number of orders and items.
To avoid empty clusters and clusters containing just one product, an appropriate number of clusters was determined: 6 clusters. The Calinski-Harabasz criterion (3) was used to assess cluster quality and to find the number of clusters.
The diagnostics for the model generated by Machine Learning are as follows:
− between-group Sum of Squares: 58.673,
− within-group Sum of Squares: 35.912,
− total Sum of Squares: 94.585.
The k-means algorithm was used for the ML classification model. For a given number of clusters k, the algorithm partitions the data into k clusters. Each cluster has a centre (centroid) that is the mean value of all the points in that cluster. K-means locates centres through an iterative procedure that minimizes distances between individual points in a cluster and the cluster centre.
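A minimal sketch of how the four criteria could be derived from cleaned pick lines is given below. The tuple layout, the product and order identifiers, and the reading of "difference between months" as current month minus previous month are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical cleaned pick lines: (month, order_id, product_id, quantity).
pick_lines = [
    (10, "ORD-1", "SKU-A", 2),
    (10, "ORD-2", "SKU-A", 1),
    (11, "ORD-3", "SKU-A", 4),
    (11, "ORD-3", "SKU-B", 1),
]

def product_features(lines, current_month, previous_month):
    """Per product: (orders, items, change in orders, change in items)."""
    orders = defaultdict(set)  # (product, month) -> distinct order ids
    items = defaultdict(int)   # (product, month) -> picked pieces
    for month, order_id, product_id, qty in lines:
        orders[(product_id, month)].add(order_id)
        items[(product_id, month)] += qty
    features = {}
    for product_id in sorted({p for _, _, p, _ in lines}):
        n_orders = len(orders[(product_id, current_month)])
        n_items = items[(product_id, current_month)]
        features[product_id] = (
            n_orders,
            n_items,
            n_orders - len(orders[(product_id, previous_month)]),
            n_items - items[(product_id, previous_month)],
        )
    return features

features = product_features(pick_lines, current_month=11, previous_month=10)
```

Each product index ends up with one four-element vector, which is the kind of input a clustering step can consume directly.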
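For orientation, the Calinski-Harabasz index in its standard form is the between-group sum of squares divided by (k - 1), over the within-group sum of squares divided by (N - k); higher values indicate better separated clusters. The sketch below computes it in plain Python for a toy data set. It illustrates the standard definition only and is not the authors' Tableau or Matlab procedure; the points and labels are made up.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index: (SSB / (k - 1)) / (SSW / (N - k))."""
    n, dim = len(points), len(points[0])
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    ssb = ssw = 0.0
    for members in clusters.values():
        centre = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        # Between-group part: cluster size times squared distance to the overall mean.
        ssb += len(members) * sum((c - o) ** 2 for c, o in zip(centre, overall))
        # Within-group part: squared distances of members to their own centre.
        ssw += sum(sum((x - c) ** 2 for x, c in zip(p, centre)) for p in members)
    return (ssb / (k - 1)) / (ssw / (n - k))

points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
good = calinski_harabasz(points, [0, 0, 1, 1])  # separates the two visible groups
poor = calinski_harabasz(points, [0, 1, 0, 1])  # mixes them
```

Choosing the number of clusters then amounts to computing the index for each candidate k and keeping the k with the highest value, subject to any extra constraints such as avoiding empty or single-product clusters.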
K-means requires an initial specification of cluster centres. Starting with one cluster, the method chooses a variable whose mean is used as a threshold for splitting the data in two. The centroids of these two parts are then used to initialize k-means to optimize the membership of the two clusters. Next, one of the two clusters is chosen for splitting and a variable within that cluster is chosen whose mean is used as a threshold for splitting that cluster in two. K-means is then used to partition the data into three clusters, initialized with the centroids of the two parts of the split cluster and the centroid of the remaining cluster. This process is repeated until a set number of clusters is reached.
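The splitting initialization described above can be sketched as follows. The text does not say which cluster or which variable is chosen for a split, so this sketch assumes the cluster with the largest within-cluster sum of squares and the variable with the largest spread inside it; both rules, and all function names, are illustrative assumptions. The refinement step is plain Lloyd iteration with squared Euclidean distance.

```python
def mean(points):
    return tuple(sum(col) / len(points) for col in zip(*points))

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def lloyd(points, centres, max_iter=100):
    """Lloyd's algorithm with squared Euclidean distance; returns (centres, labels)."""
    centres, labels = list(centres), None
    for _ in range(max_iter):
        new_labels = [min(range(len(centres)), key=lambda j: sq_dist(p, centres[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster ends up empty
                centres[j] = mean(members)
    return centres, labels

def split_cluster(members):
    """Split at the mean of the highest-spread variable (assumed rule).

    Requires at least two distinct points in the cluster."""
    centre = mean(members)
    spread = [sum((p[d] - centre[d]) ** 2 for p in members) for d in range(len(centre))]
    d = spread.index(max(spread))
    low = [p for p in members if p[d] <= centre[d]]
    high = [p for p in members if p[d] > centre[d]]
    return mean(low), mean(high)

def bisecting_kmeans(points, k):
    centres, labels = [mean(points)], [0] * len(points)
    while len(centres) < k:
        # Assumed rule: split the cluster with the largest within-cluster sum of squares.
        sse = [sum(sq_dist(p, centres[j]) for p, lab in zip(points, labels) if lab == j)
               for j in range(len(centres))]
        j = sse.index(max(sse))
        members = [p for p, lab in zip(points, labels) if lab == j]
        low, high = split_cluster(members)
        seeds = centres[:j] + centres[j + 1:] + [low, high]
        centres, labels = lloyd(points, seeds)
    return centres, labels

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
centres, labels = bisecting_kmeans(points, 3)
```

Because no random seeding is involved, running the sketch twice on the same data gives the same clusters, which is the property the splitting procedure is meant to provide.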
Lloyd's algorithm with squared Euclidean distances was used to compute the k-means clustering for each k. Combined with the splitting procedure that determines the initial centres for each k > 1, the resulting clustering is deterministic: the result depends only on the number of clusters. An analysis of variance was carried out for the model; the results are presented in Table 2. The F-statistic for one-way (single-factor) ANOVA indicates how much of the variance is explained by a variable. It is the ratio of the between-group mean square to the within-group (error) mean square. The larger the F-statistic, the better the corresponding variable distinguishes between clusters.
The p-value is the probability that the F-distribution of all possible values of the F-statistic takes on a value greater than the actual F-statistic for a variable. If the p-value falls below a specified significance level, then the null hypothesis (that the individual elements of the variable are random samples from a single population) can be rejected. The degrees of freedom for this F-distribution are (k - 1, N - k), where k is the number of clusters and N is the number of items (rows) clustered. The lower the p-value, the more the expected values of the elements of the corresponding variable differ among clusters.
The Model Sum of Squares is the ratio of the between-group sum of squares to the model degrees of freedom. The between-group sum of squares is a measure of the variation between cluster means. If the cluster means are close to each other (and therefore close to the overall mean), this value will be small. The model has k-1 degrees of freedom, where k is the number of clusters.
The Error Sum of Squares is the ratio of the within-group sum of squares to the error degrees of freedom. The within-group sum of squares measures the variation between observations within each cluster. The error has N-k degrees of freedom, where N is the total number of observations (rows) clustered and k is the number of clusters. The Error Sum of Squares can be thought of as the overall Mean Square Error, assuming that each cluster centre represents the "truth" for each cluster. To compare the results, product allocation was simulated for the ANN and clustering results. To visualise the results on the warehouse layout, the coordinates of every location ID were identified. The warehouse plan in AutoCAD format was flattened to a JPG file. Following the topology of the warehouse, the position of each rack and shelf was determined, including its coordinates on the JPG file.
To calculate the access time from the packing area (located near gate no. 1, with coordinates X=150, Y=0 on the warehouse layout) to the picking racks and shelves, the distance between these points was calculated. The distances were summed for each order. The Minkowski model (4) was used to calculate this distance.
The picking time for the order was calculated by formula (5). The simulation results were compared using heat maps for both areas: the small products "T" area and the pallet unit "P" area. The visualisation results are presented in figures 10a-10c.
To confirm that the clustering result is better, statistical comparisons were carried out. The first comparison was for order picking based on the real cases from the analysed period. The second comparison was for direct picking of products.
In the ideal model, every product with high pickup density (red points) is located near the packing area (dock no. 1). For the reference variant (figure 10a), it can be seen that products with a high rotation factor (often picked) are picked from different places in the warehouse.
For the variant with product allocation based on the ANN method, there are more red points near the packing area. For the clustering method, most of these products are located near this area, so this result is close to the ideal.
The comparison of total order picking time is presented as a box-and-whisker plot in figure 11. Picking was simulated by marking the route between all products on the picking list. The results are also presented in table 3 (comparison for each area and product allocation method) and in table 4 (comparison for each area and product allocation method). Both comparison methods show that the best results were achieved with clustering. The products with the highest pickup density, for both order picks and product picks, are located in the area nearest to order packing. For the actual product allocation (reference variant), products with similar pickup density are placed far apart, which is an unfavourable situation. The ANN method helps to find a better solution, but not as good as clustering.
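As a side note to the analysis of variance described above, the two mean squares and the resulting F-statistic can be computed for a single variable as in the sketch below. The values and cluster labels are made-up toy data, and the p-value is left out because it requires the F-distribution, which is not in the Python standard library.

```python
def one_way_anova(values, labels):
    """Return (model mean square, error mean square, F) for one variable."""
    n = len(values)
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    k = len(groups)
    grand_mean = sum(values) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    ss_within = sum(sum((v - sum(g) / len(g)) ** 2 for v in g)
                    for g in groups.values())
    ms_model = ss_between / (k - 1)  # between-group SS over model degrees of freedom
    ms_error = ss_within / (n - k)   # within-group SS over error degrees of freedom
    return ms_model, ms_error, ms_model / ms_error

values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = [0, 0, 0, 1, 1, 1]
ms_model, ms_error, f_stat = one_way_anova(values, labels)
```

A variable whose cluster means lie far apart relative to the scatter inside each cluster, as in this toy example, produces a large F-statistic.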
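The distance calculation can be sketched as below. The Minkowski distance of order p between two points is the p-th root of the sum of the absolute coordinate differences raised to the power p. The order p used in the study is not stated in this section, so it is left as a parameter; the shelf coordinates are hypothetical, and only the packing-area position (150, 0) is taken from the text.

```python
PACKING_AREA = (150.0, 0.0)  # near gate no. 1 on the warehouse layout (from the text)

def minkowski(a, b, p=1.0):
    """Minkowski distance of order p (p=1 is Manhattan, p=2 is Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def order_distance(locations, p=1.0, origin=PACKING_AREA):
    """Sum of distances from the packing area to every location on one order."""
    return sum(minkowski(origin, loc, p) for loc in locations)

# Hypothetical layout coordinates of the shelves visited for one order.
order = [(150.0, 40.0), (120.0, 0.0), (153.0, 4.0)]
manhattan = order_distance(order, p=1.0)
euclidean = order_distance(order, p=2.0)
```

With p = 1 the measure follows aisle-like, axis-parallel movement, while p = 2 gives straight-line distance; which of the two is closer to real travel depends on the rack layout.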

CONCLUSION
The effectiveness of picking is strongly related to the picking route (distance). This distance depends on product allocation. The effectiveness of picking can be increased by proper product allocation, one that corresponds to the present demand for products and to the correlation of products on picking orders. The presented research shows that it is possible to improve the warehouse picking process by shortening the picking distance and reducing the time of this process.
The research is based on real data (over 380000 picks for almost 1600 locations, from the period from October to December, the busiest period of the year) and shows that the achieved results are useful and can be implemented by the company. In the presented case, it was possible to reduce the picking time in the pallet unit "P" area by about 10%. For the small products "T" area, the presented method does not improve the picking time. The reasons are the short distances in this area and the fact that the present product allocation there already roughly follows demand.
In future research, a comparison with the classic methods will be carried out. Moreover, the presented product clustering method will be extended to take into account the relation between products on the same picking list. Research with the authors' own product Correlation Searching Algorithm (CSA) gave promising results. For this reason, the CSA algorithm will be combined with the clustering method in the future.