Planning, analyzing and optimizing experiments

Research and Development (R&D) encompasses all the innovative activities of organizations engaged in the development of new products (recycled materials, hardware, software, services) or the enhancement of existing ones. In the US, the ratio of R&D cost to revenue averages 3.5% but can reach levels of up to 40%. R&D applies scientific methods in iterative and cyclical processes through which information is constantly updated and replaced by means of experimenting (Design of Experiments, DOE). Experimenting is an empirical way to arbitrate between competing models or hypotheses in order to test existing theories or new hypotheses. Experimenting does not consist only of the design, as is commonly thought, but includes the sequential stages: creating designs, analyzing and constructing the plans of the experiments, optimizing the plans of the experiments, and optimizing the response of the experiments. Experimental designs are of many different types, but today factorial designs, response surface designs, mixture designs, and Taguchi designs are mostly used. This paper demonstrates the application of the methodology of experimentation, with the creation, analysis, and development of experimental plans, the optimization of the experimental plan, and the optimization of the experimentation response, on one specific example of a chemical reaction from practice, using the computer program Minitab®.


INTRODUCTION
Today, every major industrial production in the world is based on Research and Development (R&D), which encompasses the innovative activities that organizations undertake to develop new products (processed materials, hardware, software, services) or to improve existing ones. Typically, two basic organizational models are encountered: R&D departments, tasked with developing new process results, and industrial research departments, tasked with applied research in scientific or technological fields that facilitates the development of process results. These departments differ from the vast majority of others in that they do not aim at immediate profit, because they carry greater risk and uncertainty regarding return on investment, but they are key to gaining large market benefits from selling new products. In the US, the ratio of R&D cost to revenue averages 3.5% but reaches considerably higher levels in some organizations (14.1% Merck & Co, 15.1% Novartis, 24.9% Ericsson, 43.4% Allergan). Research and development applies scientific methods in iterative and cyclical processes through which information is constantly updated and replaced by means of experimenting (Design of Experiments, DoE). Experimenting is an empirical way of arbitrating between competing models or hypotheses in order to test existing theories or new hypotheses. The major problems in experimenting include achieving randomization, reliability, and replication in order to obtain the required statistical power, sensitivity, and orthogonality. Considerable advances in experimentation occurred in the early 20th century, with contributions from statisticians such as Ronald Fisher, Jerzy Neyman, Oscar Kempthorne (1919-2000), Gertrude Mary Cox (1900-1978), and William Gemmell Cochran (1909-1980) (Popović & Ivanović, 2018, 2019).
Experimenting does not consist only of planning, as is usually thought, but also includes the sequential stages: creating designs, analyzing and constructing the plans of the experiments, optimizing the plans of the experiments, and optimizing the response of the experiments. Creating Designs is the stage of forming the conceptual model and scheme of the experiment plan, with the number and sequence of individual experiments and with validity, reliability, and replicability established. Analyzing Designs involves considering the experimentation results obtained, with repetitions, preliminary results, and deviations. Displaying Plots is the stage of automatically creating diagrams of the analyzed experiment plans, which facilitate the consideration and interpretation of the results obtained. Optimal Design is the process of refining a plan by reducing or increasing the number of experimental cycles with factors, surfaces, or mixtures. Response Optimization is the process of obtaining the optimum result of experimentation on an initially constructed optimization diagram. Experimenting involves many different types of designs, but today the most widely applied are factorial designs, response surface designs, mixture designs, and Taguchi designs (Bisgaard, 2008; Box, Hunter, & Hunter, & Stuart, 2005; Levin & Ramsey & Smodz, 2018; Montgomery & Runger, 2010; Nelson, 2004). This paper presents the application of the methodology of experimentation, with designing, analyzing, and constructing experimental plans, optimizing experimental plans, and optimizing the results of experimentation, on one specific example of a chemical reaction from practice. A specially recommended selection of experimentation procedures was used, depending on the set requirements and needs, together with the commercial computer software Minitab®.

METHODOLOGY
The experimentation methodology covers the known basic elements of experimentation, the main types of experimentation, and the author's recommended choice of the type of experimentation required. The basic elements of experimentation are based on known concepts: input and output quantities, levels of the input quantities, interactions of the input quantities, the number of experiments required, the scheme of the planned experiment, the model and the matrix of experimentation, as well as variance and regression analyses. The input quantities of experimentation are the independent factors ($x_i$, $i = 1, 2, \ldots, n$) with main effects A, B, ..., N, e.g. process temperature (A), each of which takes several different values, e.g. factor levels of 90 ℃ and 100 ℃. The output quantities of experimentation are the dependent quantities, e.g. Response (y) 98.7 ℃ with a confidence risk of α = 0.05. The interactions of the input quantities are the interaction effects of particular factors, e.g. the interactions AB, AC, BC. The number of experiments required depends on the number of factors and their levels, e.g. the total number of experiments for 3 two-level factors is $2^3 = 8$. The design point scheme is a 2- or 3-dimensional representation of the experiment points. The model is a mathematical representation of the relationship between the experimentation results and the factor values. A design matrix is a matrix description of a plan used to analyze the experimentation.
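The run count and the design matrix described above can be illustrated with a minimal Python sketch. It is not part of the Minitab workflow used in this paper; the function names `full_factorial` and `with_interactions` are hypothetical, and the coded levels -1 (low) and +1 (high) are the usual convention, assumed here.

```python
from itertools import product


def full_factorial(n_factors):
    """Return all 2**n_factors runs in coded units (-1 = low, +1 = high)."""
    return [list(run) for run in product((-1, 1), repeat=n_factors)]


def with_interactions(run):
    """Extend one three-factor run with its two-factor interaction columns."""
    a, b, c = run
    return {"A": a, "B": b, "C": c, "AB": a * b, "AC": a * c, "BC": b * c}


# Three two-level factors give 2**3 = 8 runs.
design = full_factorial(3)
matrix = [with_interactions(run) for run in design]
n_runs = len(design)

for row in matrix:
    print(row)
```

Every column of such a matrix sums to zero over the 8 runs, which is the balance property that allows main effects and interactions to be estimated independently of one another.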
The main types of experimentation are experimentation with factors, with response surfaces, with mixtures, and Taguchi experimentation. Factorial designs give the estimated results of experimenting with individual values. Response surface designs give the estimated results of experimenting with a series of values, which are located on the response surfaces. Mixture designs give the estimated results of experimenting with individual components of mixtures. Taguchi designs provide estimated experimentation results for individual input quantities, according to the findings of the engineer and statistician Dr. Genichi Taguchi (1924-2012). The recommended choice of experiment type in Figure 1 includes all the major phases of experimentation (designing, analyzing, constructing a plan, optimizing the plan, and optimizing the experimentation response), as well as the practical experimentation procedures. It starts from the required type and the successive phase of experimentation in order to select the necessary practical experimentation procedure (Burman, Robert & Alm, 2010; Hani, 2009).

CREATING DESIGNS PHASE
In the Creating Designs phase, an ideal scheme of the experimental design model is constructed, with the number and sequence of individual experiments and with established validity, reliability, and reproducibility. E.g. when considering the effect of 3 factors at two levels on a certain chemical reaction, two more factors need to be added, using the following steps: Step 1 - program selection (Stat > DOE > Factorial > Create Factorial Design), step 2 - selection of the 2-level factorial design type with specified generators and 3 factors, step 3 - selection of the plan with $2^3 = 8$ experiments and full resolution, with 0 center points per plan and 1 replicate, step 4 - selection of (Generators) and entry of the generators of the added factors (D = AB, E = AC), and step 5 - analysis of the results obtained. The obtained Pareto diagram, shown in Figure 6, shows three significant influencing effects, which exceed the limit of standardized effects (2.36): time (A), temperature (B), and the time-temperature interaction (AB) (Ghosh & Rao, 1996; Montgomery, 2013; Sifri, 2014; Spall, 2010).
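How the generators D = AB and E = AC turn the 8-run base plan into a five-factor fractional plan can be sketched in a few lines of Python. This is only an illustration in coded units (-1, +1), not the Minitab procedure itself; the function name `fractional_factorial_2_5_2` is hypothetical.

```python
from itertools import product


def fractional_factorial_2_5_2():
    """Build the 8-run plan for 5 factors from a 2**3 base design.

    The added factors are defined by the generators D = AB and E = AC,
    so their columns are products of the base-design columns.
    """
    runs = []
    for a, b, c in product((-1, 1), repeat=3):
        runs.append({"A": a, "B": b, "C": c, "D": a * b, "E": a * c})
    return runs


runs = fractional_factorial_2_5_2()
for run in runs:
    print(run)
```

Because the column for D is identical to the AB column, the main effect of D cannot be separated from the AB interaction. This aliasing is the price of running 8 experiments instead of the 32 that a full five-factor plan would need.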
Screening design results are obtained after performing the full factorial experiments, when the variability of the results can be analyzed and certain factors are believed to have a stronger influence. E.g. in the example shown, the chemical reaction data were processed using the following steps: Step 1 - data collection, step 2 - program selection (Stat > DOE > Factorial > Preprocess Responses for Analyze Variability), step 3 - selection of the type of standard deviation (Compute for repeat responses across rows) in the field (Standard deviations to use for analysis), step 4 - entry ('Yield_1'-'Yield_8') in the field (Repeat responses across rows), step 5 - entry (StdYield) in the field (Store standard deviations in), step 6 - entry (NYield) in the field (Store number of repeats in), and step 7 - analysis of the results obtained.
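The pre-processing step above reduces the repeat responses of each run to one standard deviation and one repeat count. A minimal Python sketch of that reduction follows; it assumes the sample standard deviation (n - 1 in the denominator) is the statistic stored, and the yield values are invented purely for illustration, not taken from the experiment.

```python
from statistics import stdev


def row_std(repeats):
    """Return (standard deviation, number of repeats) for one run.

    Missing repeats are passed as None and skipped. With fewer than
    two values no standard deviation can be computed, so None is returned.
    """
    values = [v for v in repeats if v is not None]
    if len(values) < 2:
        return None, len(values)
    return stdev(values), len(values)


# Hypothetical repeat responses for two runs (illustrative numbers only).
yield_repeats = [
    [45.1, 46.3, 44.8, 45.9],
    [48.2, None, 49.5, 47.7],
]

std_yield = []
n_yield = []
for repeats in yield_repeats:
    s, n = row_std(repeats)
    std_yield.append(s)
    n_yield.append(n)
```

The two lists play the role of the (StdYield) and (NYield) columns named in the steps.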
The results of the experimental design in Figure 7 show: a) the standard deviation of the yield repeats in the column (StdYield), b) the number of repeats in the column (NYield), and c) 8 standard deviations and 8 numbers of repeats, with one standard deviation and one number of repeats for each combination of factor settings, in the order in which that combination first appears, the remaining rows being filled with the missing-value symbol (*) (Bisgaard, 2008; Montgomery, 2013; Pronzato, 2008). Analyzing the variability of the results obtained is applied when certain factors are believed to have a stronger influence and is performed in two stages: 1) calculating least-squares regressions to fit the reduced model and 2) analyzing the reduced model using maximum likelihood estimation to obtain the final regression coefficients. E.g. the example shown gives the following analysis of the variability of the results in phase 1, using the following steps: Step 1 - program selection (Stat > DOE > Factorial > Analyze Variability), step 2 - input (StdYield) in the field (Response, standard deviations), step 3 - selection (Terms) and input 2 in the field (Include terms from the model up through  The obtained Pareto diagram shown in Figure 9 shows the significant effects of the factors (A, B, C), which exceed the limit of standardized effects (12.71). At this point, it is observed that the model should be reduced using the least-squares regression method to obtain the factors (Time, Temp, Catalyst) that should be retained in the model. Of course, this model is just one of the possible reduced models to choose from, so several models may need to be fitted to find the right one.
In phase 2 of analyzing the experiment plan, the reduced model is analyzed using maximum likelihood estimation to obtain the final model coefficients, using the following steps: Step 1 - program selection (Stat > DOE > Factorial > Analyze Variability), step 2 - entry (StdYield) in the field (Response, standard deviations), step 3 - selection (Options) and entry (Maximum likelihood) in the field (Estimation method), step 4 - selection (Terms) and moving (BC) from (Selected Terms) to (Available Terms), step 5 - selection (Graphs), deselection of (Pareto, Normal, Half Normal) in the field (Effects Plots) and selection of (Three in one) in the field (Residual Plots), and step 6 - analysis of the results obtained (Hazewinkel, 2001; Sen & Srivastava, 2011).
The obtained overview of the complete plan of experiments in Figure 10 shows: a) the estimates (Effect), (Ratio Effect), (Coef), (SE Coef), the Z- and P-values, and VIF; b) the factor (Time) has the strongest effect (2.0365), and its ratio effect shows that the standard deviation increases by a factor of (7.6636) when (Time) changes from low to high level; c) the factor (Temp) has the next strongest effect (1.1491), with the standard deviation increasing by a factor of (3.1552) when (Temp) changes from low to high level; d) the factor (Catalyst) has the smallest main effect (0.4300), and its ratio effect shows that the standard deviation increases by a factor of (1.5373) when (Catalyst) changes from low to high level until the interactions are statistically significant; and e) the regression function and structure of factor replacement. The obtained diagram of the main effects in Figure 11 shows the following results: the factors (Time) and (Temp) have similar effects, which increase with the transition from low to high level.
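The (Effect) and (Ratio Effect) values reported for Figure 10 appear to be linked by the exponential function, which is what one would expect if the variability model is fitted on the natural logarithm of the standard deviation. That link is an assumption here, not something stated in the Minitab output quoted in this paper. The short Python check below uses the effect magnitudes as given in the Results section.

```python
import math

# Effect magnitudes for the variability model, as reported in the paper.
effects = {"Time": 2.0365, "Temp": 1.1491, "Catalyst": 0.4300}

# Assumption: effects are on the ln(standard deviation) scale, so the
# multiplicative change in standard deviation from low to high level
# is exp(effect).
ratio_effects = {name: math.exp(effect) for name, effect in effects.items()}

for name, ratio in ratio_effects.items():
    print(f"{name}: standard deviation changes by a factor of {ratio:.4f}")
```

The computed factors agree with the reported ratio effects (7.6636, 3.1552, 1.5373) to about three decimal places, which supports reading each ratio effect as a multiplier of the standard deviation rather than as its new value.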
The obtained diagram of factor interactions in Figure 12 shows the following results: the increase in the effect of the factor (Temp) is larger when the factor (Time) is large (50), and the effect decreases when (Time) is small (20). not possible or desirable to select a model in advance. In the example of considering the influence of 3 factors with 2 levels on a certain chemical reaction, there is no such need, so a boundary contour plan diagram can be constructed. E.g. in the example shown, there is a significant interaction between reaction time and temperature, so the dependence of the experiments on the input factors, with the maximum reaction yield and the minimum cost, is determined by applying the following steps. In phase 1, for catalyst A: Step 1 - data collection, step 2 - program selection (Stat > DOE > Factorial > Overlaid Contour Plot), step 3 - input (Cost) and (Yield) in the field (Selected), input (Time) in the field (X-Axis) and input (Temp) in the field (Y-Axis), step 4 - selection (Contours), entry of the preferred cost (28, 35) in the fields (Low) and (High) and entry of the preferred yield (35, 45) in the fields (Low) and (High), and step 5 - selection (Settings) and entry (A) in the field (Setting) for (Catalyst). In phase 2, for catalyst B: step 6 - program selection (Stat > DOE > Factorial > Overlaid Contour Plot), step 7 - entry (Cost) and (Yield) in the field (Selected), entry (Time) in the field (X-Axis) and entry (Temp) in the field (Y-Axis), step 8 - selection (Contours), entry (28, 35) in the fields (Low) and (High) for cost and entry (35, 45) in the fields (Low) and (High) for yield, step 9 - selection (Settings) and entry (B) in the field (Setting) for (Catalyst), and step 10 - analysis of the results obtained (Chernoff, 1972; Fedorov, 1972; Goos, 2002; Goos, Jones & Bradley, 2011; Kôno, 1962; Pronzato, 2008; Pukelsheim, 2006; Shah, Sinha, & Bikas, 1989).
The obtained optimization diagram for catalyst A in Figure 13 shows

RESPONSE OPTIMIZATION PHASE
In the Response Optimization phase, the optimal experiment result is obtained on the initially constructed optimization diagram, where the result lines are changed interactively to obtain the desired experiment results. The optimization of the experimentation results is based on determining the estimated values using certain weights (Weight), according to the diagrams for minimization, maximization, and reaching a certain target value in Figure 15, with lower and upper bounds. In the example shown, the results of the experiment are optimized for the maximum reaction yield and the minimum required cost, using the following steps: Step 1 - data collection, step 2 - program selection (Stat > DOE > Factorial > Response Optimizer), step 3 - input (Minimize) in the field (Goal) for (Cost) and (Maximize) in the field (Goal) for (Yield), step 4 - for (Cost), input of the values (-, 28, 35) in the fields (Lower, Target, Upper) and, for (Yield), input of the values (35, 45, -) in the fields (Lower, Target, Upper), and step 5 - analysis of the obtained results.
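The role of the goals, bounds, and weights in the steps above can be illustrated with desirability functions of the Derringer-Suich type, which are commonly used for this kind of response optimization. The Python sketch below is an assumption-laden illustration, not the optimizer's actual implementation: the function names are hypothetical, both responses get equal importance, and the predicted values 31.5 and 40 are invented to show the arithmetic.

```python
def desirability_minimize(y, target, upper, weight=1.0):
    """Desirability of a response to be minimized (1 = ideal, 0 = unacceptable)."""
    if y <= target:
        return 1.0
    if y >= upper:
        return 0.0
    return ((upper - y) / (upper - target)) ** weight


def desirability_maximize(y, lower, target, weight=1.0):
    """Desirability of a response to be maximized (1 = ideal, 0 = unacceptable)."""
    if y <= lower:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - lower) / (target - lower)) ** weight


def composite_desirability(values):
    """Geometric mean of individual desirabilities (equal importance assumed)."""
    result = 1.0
    for d in values:
        result *= d
    return result ** (1.0 / len(values))


# Bounds from the example: Cost (target 28, upper 35), Yield (lower 35, target 45).
# The predicted responses below are hypothetical.
d_cost = desirability_minimize(31.5, target=28, upper=35)
d_yield = desirability_maximize(40.0, lower=35, target=45)
d_total = composite_desirability([d_cost, d_yield])
```

With a weight of 1 the desirability falls linearly between the target and the bound; weights above 1 put more emphasis on reaching the target, and weights below 1 put less. Because the composite is a geometric mean, a factor setting that makes either response unacceptable gets a composite desirability of zero.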
The obtained optimal experimentation results in Figure 16 with

RESULTS AND CONCLUSIONS
All the necessary results of the experimentation were obtained. In the design phase of the experiment plan, for the case of a certain chemical reaction with 3 factors (A, B, C) at 2 levels each, resolution (III) was applied and two more factors were generated (D = AB, E = AC), so that a fractional plan (1/4) with 8 experiments was obtained. In the analysis phase of the experiment plan, the following were obtained: a review of the analysis of variance with suitable F- and P-values for all model sources, the model results, and favorable coefficients of determination of the obtained experiments (R-sq = 98.54%, R-sq adj = 96.87%, R-sq pred = 92.26%), and it was observed that the three significant influencing effects, time (A), temperature (B), and the time-temperature interaction (AB), exceed the limit of standardized effects (2.36). The results of the experimental design gave good model coefficients (R-sq = 99.98%, R-sq adj = 99.83%, R-sq pred = 98.42%), the coded factor coefficients, and the regression function, and the results of the analysis of variability showed that the factor (Time) had the strongest effect (2.0365), followed by the factor (Temp) (1.1491), and that the factor (Catalyst) had the smallest effect (0.4300). The diagrams show that the factors (Time) and (Temp) have similar effects, which increase with the transition from low to high level, and that the increase in the effect of the interaction (Time*Temp) is larger when the factor is larger and the effect decreases when the factor is smaller. The example of a chemical reaction with 3 factors at 2 levels does not require a reduction or increase in the number of experimental cycles. In the phase of optimization of the experimental plan, diagrams of boundary contour plans were constructed, which in the white area contain the optimal factors (Time = 25) and (Temp = 150), which satisfy both the results of experimenting Maximum (Yield) and Minimum (Cost