DATA ANALYSIS TECHNIQUES IN SERVICE QUALITY LITERATURE: ESSENTIALS AND ADVANCES

Academic and business researchers have long debated the most appropriate data analysis techniques for conducting empirical research in the domain of services marketing. On the basis of an exhaustive review of the literature, the present paper attempts to provide a concise and schematic portrayal of the data analysis techniques generally followed in the service quality literature. Collectively, the extant literature suggests that there is a growing trend among researchers to rely on higher-order multivariate techniques, viz. confirmatory factor analysis, structural equation modeling etc., to generate and analyze complex models, while at times ignoring very basic yet powerful procedures such as the mean, t-test, ANOVA and correlation. The marked shift in the orientation of researchers towards sophisticated analytical techniques can largely be attributed to competition within the community of researchers in the social sciences in general, and those working in the area of service quality in particular, as well as to the growing demands of journal reviewers. From a pragmatic viewpoint, it is expected that the paper will serve as a useful source of information and provide deeper insights to academic researchers, consultants, and practitioners interested in modelling patterns of service quality and arriving at optimal solutions to increasingly complex management problems.

A review of select studies on service quality published from 1985 to 2013, employing varied data analysis techniques and scale refinement measures in this domain, has been carried out by the researchers, and the same are summarized in Table 1.
In the majority of empirical studies in the area, the first section of the published research deals with data exploration methods, moving on to preliminary data analysis. The next section generally presents results of tests of differences, such as the t-test, ANOVA (Analysis of Variance) etc., while quite a few studies, in subsequent sections, highlight results of confirmatory factor analysis and structural equation modeling techniques. These trends are summarized in Figure 1.
Data exploration and the detection of outliers are essential because outliers have a strong influence on the estimates of the parameters of the model being tested. Moreover, the preliminary data analysis presents results related to (1) the validity and reliability of the instrument, based on the internal consistency of the measures as assessed through Cronbach's alpha together with inter-item and item-total correlations, (2) exploratory factor analysis (EFA), and (3) descriptive analysis of the respondents' demographic data. Generally, the purpose of the t-test is to examine significant differences in the constructs of the study across demographic variables such as gender and marital status, whereas ANOVA is usually used to examine significant differences between respondents based on demographic variables viz. age, work experience, income, occupation and educational qualifications. Confirmatory Factor Analysis (CFA) shows how well measured variables represent a smaller number of constructs and allows researchers to test hypotheses about a particular factor structure or measurement model. The CFA procedure starts with tests of (a) convergent, (b) discriminant, and (c) nomological validity of the constructs, followed by first-order and second-order factor models. CFA places substantively meaningful constraints on the factor models by specifying the effect of one latent variable on observed variables, while Structural Equation Modelling (SEM) is used to assess the relationship between predictive variable(s) and criterion variable(s).

DATA EXPLORATION
Very often, the statistical procedures (viz. parametric tests) used in social sciences research are based on the normal distribution. A parametric test is one that requires data from one of the large catalogue of distributions that statisticians have described, and for data to be parametric certain assumptions must be true (Field, 2005).

Figure 2. Skewness and Kurtosis
Researchers have been found to report testing the assumptions of the statistical procedures used in their studies, including checking for the presence of outliers, only 8% of the time. Given what we know of the importance of assumptions to the accuracy of estimates and error rates, this in itself is alarming. There is no reason to believe that the situation is different in other social science disciplines (Osborne & Overbay, 2004).
Statistical outliers are unusual observations in a sample that differ substantially from the bulk of the sample data (Lavrakas, 2008). An outlier could be different from other points with respect to the value of one variable (Karioti & Caroni, 2002; Karioti & Caroni, 2003) and can substantially distort parameter and statistic estimates. Data can become both skewed and show kurtosis when there are extremely influential scores at one end of the distribution, resulting in an asymmetrically shaped distribution (refer to Figure 2).
Coleby & Duffy (2005) suggest that outliers can be identified, and their influence reduced, with the help of box plots. A box plot graphically summarises much of the numerical data, as it exhibits the median, the inter-quartile range, outliers, and the maximum and minimum values. The inter-quartile range shows where the bulk of the data lies as well as the dispersion of the data (Brochado, 2009). Though there is a great deal of debate as to what should be done with identified outlier data points (Osborne et al., 2004), it is only common sense that illegitimately included data points should be removed from the sample (Judd & McClelland, 1989; Barnett & Lewis, 1994).
To further confirm the normality of the remaining data, skewness, kurtosis and Kolmogorov-Smirnov (K-S) tests can also be carried out. Skewness refers to the unequal distribution of positive and negative deviations from the mean, while kurtosis is a measure of the relative peakedness or flatness of the curve (Malhotra, 2003). In a normal distribution, therefore, both skewness and kurtosis happen to be zero (Field, 2005).
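By way of illustration, the sketch below (not drawn from the reviewed studies; the data are simulated) shows how the box-plot (inter-quartile range) rule for flagging outliers and the skewness, kurtosis and K-S checks described above could be run with pandas and SciPy. Note that the K-S test here uses parameters estimated from the sample, which is a common but slightly liberal shortcut.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical item scores; replace with the actual survey data.
scores = pd.Series(np.random.default_rng(42).normal(loc=4.0, scale=0.8, size=200))

# IQR rule used by box plots: points beyond 1.5 * IQR from the quartiles are flagged.
q1, q3 = scores.quantile([0.25, 0.75])
iqr = q3 - q1
outliers = scores[(scores < q1 - 1.5 * iqr) | (scores > q3 + 1.5 * iqr)]
print(f"Flagged {len(outliers)} potential outliers")

clean = scores.drop(outliers.index)

# Skewness and excess kurtosis (both close to zero in a normal distribution).
print("Skewness:", stats.skew(clean))
print("Excess kurtosis:", stats.kurtosis(clean))

# One-sample K-S test against a normal distribution with the sample's own parameters.
ks_stat, p_value = stats.kstest(clean, "norm", args=(clean.mean(), clean.std(ddof=1)))
print(f"K-S statistic = {ks_stat:.3f}, p = {p_value:.3f}")
```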

Homogeneity of Variance
When conducting assessments or evaluations in the social, psychological or educational context, it is often required that groups be compared on some construct or variable (Nordstokke et al., 2011). Nordstokke & Zumbo (2007; 2010) posit that when conducting these comparisons, typically using means or medians, we must be cognizant of the assumptions that are required for validly making comparisons between groups. It was further highlighted by these authors that the assumption of homogeneity of variances is of key importance and must be considered prior to conducting these tests.
The assumption of equality of variances is based on the premise that the population variances on the variable being analysed are equal for each group. The assumption of homogeneity of variances is essential when comparing two groups, because if variances are unequal the validity of the results can be jeopardized, i.e. the Type I error rate increases, leading to invalid inferences (Glass et al., 1972; Nordstokke et al., 2011). When there is reasonable evidence suggesting that the variances of two or more groups are unequal, a preliminary test of equality of variances, i.e. Levene's test, is conducted prior to conducting the t-test or ANOVA (Nordstokke et al., 2011). If Levene's test is non-significant (viz. p > 0.05), one may conclude that the difference between variances is negligible, i.e. the variances are roughly equal.
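As a brief illustration (hypothetical groups; any two-group comparison from a service quality survey would follow the same pattern), Levene's test is available in SciPy:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(3.8, 0.7, size=120)   # e.g. one respondent group
group_b = rng.normal(4.1, 0.7, size=110)   # e.g. the comparison group

stat, p = stats.levene(group_a, group_b, center="median")
if p > 0.05:
    print(f"Levene p = {p:.3f}: variances roughly equal (assumption holds)")
else:
    print(f"Levene p = {p:.3f}: variances unequal; use the unequal-variance alternative")
```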

Independence of Data
The assumption is that data from different participants are independent (Field, 2005), which means that the behavior of one participant does not influence that of another (Malhotra, 2003). Thus, due care needs to be taken to ensure that respondents do not interact with each other while responding to the questionnaire or an interviewer.

Reliability and Validity
A multi-item scale should be evaluated for accuracy and applicability (Kim & Frazier, 1997). Malhotra (2003) posits that scale evaluation involves an assessment of the reliability, validity and generalizability of the scale. Reliability refers to the extent to which a scale produces consistent results if measurements are made repeatedly (Peter, 1979; Perreault & Leigh, 1989; Wilson, 1995; Malhotra, 2005; Hair et al., 2006), while validity signifies the extent to which differences in observed scale scores reflect true differences among objects on the characteristic being measured, rather than systematic or random error (Malhotra, 2003). Merriam (1988) and Wenning (2012) argue that validity does not ensure reliability, and reliability does not ensure validity; thus, a study can be valid but lack reliability, and vice versa. Of the differing approaches for assessing scale reliability, the simplest is split-half reliability (Field, 2005). This method randomly splits the dataset into two halves, and a score for each participant is then calculated from each half of the scale. However, the method is not free from criticism: a set of data can be split into two in several ways, and hence the results vary with the way in which the data are split. To overcome this limitation, Cronbach (1951) proposed an alternative measure that is loosely equivalent to splitting the data in two in every possible way and calculating the correlation coefficient for each split. The average of these values is equivalent to Cronbach's alpha (α), which is the most common measure of scale reliability (Hair et al., 2006). An α value of 0.7 or more (in certain cases even 0.6) is often employed as the criterion for determining the reliability of a scale (Hair et al., 2006), which basically indicates that the items are positively correlated to one another (Sekaran, 2003). A large number of empirical studies in the domain of service quality have employed Cronbach's alpha to assess the internal consistency of items (Cronin & Taylor, 1992; Singh & Smith, 2006; Seth et al., 2008; Sohail & Shaikh, 2008; Adil, 2011a; Adil, 2011b; Khan & Adil, 2011; Khare, 2011; Adil, 2012; Adil & Khan, 2012a; Adil, Khan, & Khan, 2013a; Adil, Akhtar, & Khan, 2013).
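The following sketch shows, for a hypothetical six-item scale, how split-half reliability and Cronbach's alpha could be computed directly from the definitions given above; the item names and data are purely illustrative.

```python
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical six-item scale built around a common underlying score.
rng = np.random.default_rng(1)
base = rng.normal(0, 1, 300)
data = pd.DataFrame({f"item_{i}": base + rng.normal(0, 0.8, 300) for i in range(1, 7)})

# Split-half reliability: correlate the scores obtained from the two halves of the scale.
half1 = data.iloc[:, :3].sum(axis=1)
half2 = data.iloc[:, 3:].sum(axis=1)
print("Split-half correlation:", round(half1.corr(half2), 3))

# Cronbach's alpha; values of 0.7 or more (sometimes 0.6) are the usual criterion.
print("Cronbach's alpha:", round(cronbach_alpha(data), 3))
```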
Having ensured that the scale conforms to the minimum required values of reliability, researchers usually make one more assessment, i.e. of scale validity. Usually, all forms of validity are measured empirically by the correlation between theoretically defined sets of variables (Hair et al., 2006). The most common validity that needs to be tested at this point is predictive validity.
Predictive validity establishes whether a criterion, external to the measurement instrument, is correlated with the factor structure (Nunnally, 1978).

Exploratory Factor Analysis (EFA)
In the social sciences, a researcher often tries to measure things that cannot be measured directly. This brings into the picture the relevance of exploratory factor analysis (EFA), a technique for identifying groups or clusters of variables. According to Malhotra (2003), factor analysis refers to a class of procedures primarily used for data reduction and summarization, while Hair et al. (2006) define factor analysis as an interdependence technique whose primary purpose is to define the underlying structure among the variables in the analysis. According to Malhotra (2003) and Field (2005), this technique may be used to (a) understand the structure of a set of variables, (b) construct a questionnaire to measure an underlying variable, (c) reduce a data set to a more manageable size while retaining as much of the original information as possible, and (d) identify a new, smaller set of uncorrelated variables to replace the original set of correlated variables in subsequent multivariate analysis.
Prior to selecting the appropriate extraction method from those available, such as principal component analysis (PCA), principal axis factoring [principal factors analysis (PFA)], unweighted least squares, generalized least squares, maximum likelihood, alpha factoring and image factoring, two things need to be considered: (a) whether the researcher's aim is to generalize the findings from a sample to a population, and (b) whether s/he is exploring the data or testing a specific hypothesis (Field, 2005). Where a researcher is interested in exploring the data and applying the findings to the sample collected, or in generalizing the findings to a population, the preferred methods are PCA and PFA, as these methods usually tend to result in similar solutions; the remaining methods are complex and are not recommended for beginners (Malhotra, 2003; Field, 2005). Normally, researchers consider PCA with varimax rotation to derive factors that contain a small proportion of unique variance and, in some instances, error variance.
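A minimal EFA sketch is given below using the third-party factor_analyzer package (an assumption on our part; any routine offering factor extraction with varimax rotation would serve). The items and the two underlying service quality dimensions are hypothetical, and the package's default minimum-residual extraction is used here rather than the PCA extraction discussed above.

```python
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer  # third-party package, assumed available

# Simulate six items driven by two hypothetical dimensions (tangibles, assurance).
rng = np.random.default_rng(7)
n = 400
tangibles = rng.normal(0, 1, n)
assurance = rng.normal(0, 1, n)
items = pd.DataFrame({
    "tan_1": tangibles + rng.normal(0, 0.6, n),
    "tan_2": tangibles + rng.normal(0, 0.6, n),
    "tan_3": tangibles + rng.normal(0, 0.6, n),
    "ass_1": assurance + rng.normal(0, 0.6, n),
    "ass_2": assurance + rng.normal(0, 0.6, n),
    "ass_3": assurance + rng.normal(0, 0.6, n),
})

# Two-factor solution with varimax rotation.
fa = FactorAnalyzer(n_factors=2, rotation="varimax")
fa.fit(items)

loadings = pd.DataFrame(fa.loadings_, index=items.columns, columns=["Factor1", "Factor2"])
print(loadings.round(2))                       # tan_* and ass_* should load on separate factors
print("Proportion of variance:", fa.get_factor_variance()[1].round(2))
```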

Correlation
Correlational or convergent analysis is one way of establishing construct validity. Correlational analysis assesses the degree to which two measures of the same concept are correlated; high correlations indicate that the scale is measuring its intended concept (Hair et al., 2006). It is recommended that inter-item correlations exceed 0.30 (Robinson et al., 1991). In fact, reliability and validity are separate but closely related conditions (Bollen, 1989). More importantly, a measure may be consistent (reliable) but not accurate/valid (Merriam, 1988); on the other hand, a measure may be accurate but not consistent (Holmes-Smith et al., 2006). Thus, the results of correlational analysis also support the results of reliability analysis.
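The inter-item correlation criterion can be checked directly from the correlation matrix of the items; the snippet below uses a small hypothetical data set purely for illustration.

```python
import pandas as pd

# Hypothetical responses to three items of the same scale.
items = pd.DataFrame({
    "item_1": [5, 4, 4, 3, 5, 2, 4, 3],
    "item_2": [5, 5, 4, 3, 4, 2, 4, 2],
    "item_3": [4, 4, 5, 3, 5, 1, 5, 3],
})

corr = items.corr()
print(corr.round(2))

# Flag any pair falling below the recommended 0.30 threshold (Robinson et al., 1991).
for i, a in enumerate(corr.columns):
    for b in corr.columns[i + 1:]:
        if corr.loc[a, b] < 0.30:
            print("Low inter-item correlation:", a, b, round(corr.loc[a, b], 2))
```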

t-TEST
The t-test is based on the t-distribution and is considered an appropriate test for judging the significance of a sample mean, or of the difference between the means of two samples (Snedecor & Cochran, 1989; Trochim, 2006). However, before applying the t-test, Levene's test for equality of variances needs to be applied in order to re-check the assumption of homogeneity of variance. If Levene's test result is significant (i.e. p < 0.05), the 'equal variances not assumed' result should be used; otherwise, the 'equal variances assumed' result can be used for the analysis (Field, 2006).
The American Psychological Association (APA) has recommended that all researchers report effect sizes in the results of their work. Field (2006) argues that just because a test statistic is significant does not necessarily mean that the effect it measures is meaningful or important. The solution to this criticism is to measure the size of the effect being tested in a standardized way. Effect sizes are useful because they provide an objective measure of the importance of an effect. The size of the effect can be calculated from the t statistic and its degrees of freedom as (Field, 2006):

r = √(t² / (t² + df))     (1)

Cohen (1988) has suggested that r = 0.10 is indicative of a small effect, while r = 0.30 and r = 0.50 represent medium and large effects, respectively.
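Putting the pieces together, the sketch below (simulated data) runs Levene's test, chooses between the 'equal variances assumed' and 'not assumed' t-tests accordingly, and converts the resulting t statistic into the effect size r of Equation (1).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
group_1 = rng.normal(3.9, 0.7, 140)   # hypothetical perceived-quality scores, group 1
group_2 = rng.normal(4.2, 0.7, 160)   # hypothetical perceived-quality scores, group 2

_, levene_p = stats.levene(group_1, group_2, center="median")
equal_var = levene_p > 0.05           # non-significant Levene -> "equal variances assumed"

t_stat, p_val = stats.ttest_ind(group_1, group_2, equal_var=equal_var)

# Effect size r = sqrt(t^2 / (t^2 + df)); df here is for the equal-variance case
# (the Welch-corrected df would differ slightly when equal_var is False).
df = len(group_1) + len(group_2) - 2
r = np.sqrt(t_stat**2 / (t_stat**2 + df))
print(f"t = {t_stat:.2f}, p = {p_val:.3f}, r = {r:.2f} (0.10 small, 0.30 medium, 0.50 large)")
```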

ANALYSIS OF VARIANCE (ANOVA)
Analysis of Variance (ANOVA) is an extremely useful and powerful technique used where multiple samples are involved (Choudhury, 2009), viz. where one wishes to compare more than two populations at a single point in time. ANOVA partitions the total variation in the responses into components corresponding to the magnitude and source of variation (Miller, 1997). The procedure for conducting one-way ANOVA, as described by Malhotra (2003), involves (a) identifying the dependent and independent variables, (b) decomposing the total variation, (c) measuring the effects, (d) significance testing, and (e) interpretation of the results.
Using ANOVA, previous researchers have investigated significant differences across a number of demographic factors such as age, educational qualifications, occupation, work experience and monthly/annual income. Allil (2009), for instance, investigated the differences amongst various categories within a particular demographic factor. Where a factor has a large number of categories, researchers gain better insights with the help of post-hoc analysis; normally, post-hoc multiple comparisons are performed using either Gabriel's or Scheffé's post-hoc tests.
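A minimal one-way ANOVA sketch with simulated age groups is shown below. Note that the post-hoc routine readily available in statsmodels is Tukey's HSD, used here as a stand-in for the Gabriel and Scheffé tests named above.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(5)
young = rng.normal(3.7, 0.8, 90)     # hypothetical service quality scores by age group
middle = rng.normal(4.0, 0.8, 110)
senior = rng.normal(4.3, 0.8, 80)

# Omnibus one-way ANOVA across the three groups.
f_stat, p_val = stats.f_oneway(young, middle, senior)
print(f"F = {f_stat:.2f}, p = {p_val:.4f}")

# Post-hoc pairwise comparisons (Tukey's HSD) to locate where the differences lie.
scores = np.concatenate([young, middle, senior])
groups = np.array(["young"] * 90 + ["middle"] * 110 + ["senior"] * 80)
print(pairwise_tukeyhsd(scores, groups, alpha=0.05))
```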

CONFIRMATORY FACTOR ANALYSIS
Confirmatory factor analysis (CFA) is theory or hypothesis driven and places substantively meaningful constraints on the factor model (Albright & Park, 2009; Mihajlovic et al., 2011).

Construct and Predictive Validity
Construct validity involves the measurement of the degree to which an operationalization correctly measures its targeted variables (O'Leary-Kelly & Vokurka, 1998; Seth et al., 2008). According to O'Leary-Kelly & Vokurka (1998), establishing construct validity involves the empirical assessment of unidimensionality, reliability, and validity (i.e. convergent and discriminant validity). Thus, in order to check unidimensionality, a measurement model should be specified for each construct and CFA should be employed. Individual items in the model should be examined carefully to see how closely they represent the same factor. A comparative fit index (CFI) of 0.90 or above for the model implies that there is strong evidence of unidimensionality (Byrne, 1994).
Once the unidimensionality of a scale is established, it is further subjected to validation analysis (Ahire et al., 1996). The three most commonly accepted forms of validity are convergent, discriminant and nomological validity (Peter, 1981; Boshoff & Terblanche, 1997; Hair et al., 2006; Auken & Barry, 2009; Lin & Chang, 2011; Wymer & Alves, 2012). Convergent validity assesses the degree to which two measures of the same concept are correlated, while discriminant validity is the degree to which two constructs are conceptually distinct (Hair et al., 2006). Finally, nomological validity refers to the degree to which the summated scale makes accurate predictions of other concepts in a theoretical model (Hair et al., 2006).

Convergent Validity
Convergent validity is the degree to which multiple methods of measuring a variable provide the same results (O'Leary-Kelly & Vokurka, 1998). Convergent validity can be established using the Bentler-Bonett coefficient (Δ) (Seth et al., 2008); scales with Δ ≥ 0.90 show strong evidence of convergent validity (Bentler & Bonett, 1980). Further, convergent validity can also be examined through the criteria suggested by Fornell & Larcker (1981), i.e. (a) the standardized loadings should be statistically significant, and (b) the Average Variance Extracted (AVE) for each of the dimensions should be greater than 0.50.

Discriminant Validity
Discriminant validity is the degree to which the measures of different latent variables are unique; it ensures that a measure does not correlate too highly with other measures from which it is supposed to differ (O'Leary-Kelly & Vokurka, 1998). It can be evaluated in accordance with the procedure described by Fornell & Larcker (1981), i.e. the AVE for each dimension in a pair should be greater than the squared correlation for the same pair.
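The Fornell & Larcker (1981) checks for convergent and discriminant validity can be computed directly from the standardized CFA loadings and the latent correlations; the sketch below uses hypothetical loadings for two dimensions.

```python
import numpy as np

# Hypothetical standardized loadings from a CFA for two service quality dimensions.
loadings = {
    "reliability": np.array([0.78, 0.81, 0.74, 0.69]),
    "empathy":     np.array([0.72, 0.76, 0.80]),
}
latent_corr = 0.55   # hypothetical correlation between the two latent factors

def ave(lams: np.ndarray) -> float:
    """Average Variance Extracted: mean of the squared standardized loadings."""
    return float(np.mean(lams ** 2))

for name, lams in loadings.items():
    print(f"{name}: AVE = {ave(lams):.2f} (convergent validity requires AVE > 0.50)")

# Discriminant validity: each AVE should exceed the squared inter-construct correlation.
squared_corr = latent_corr ** 2
supported = all(ave(lams) > squared_corr for lams in loadings.values())
print(f"Squared correlation = {squared_corr:.2f}; "
      f"discriminant validity {'supported' if supported else 'not supported'}")
```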

Testing of Measurement Theory Model
A measurement theory specifies how sets of measured items represent a set of constructs, i.e. the key relationships between constructs and variables and between one construct and other constructs (Hair et al., 2006). The various stages involved in examining a measurement theory are depicted in Figure 4.
(1) Model specification: Before estimating the model, the researcher first sets the assumed relationships between variables and establishes the initial theoretical model based on theory and past research results.
(2) Model identification: One essential step in CFA is determining whether the specified model is identified. If the number of unknown parameters to be estimated exceeds the number of pieces of information provided (the non-redundant elements of the covariance matrix), the model is under-identified; when more information is available than there are parameters, the model is over-identified. Without introducing some constraints, no confirmatory factor model is identified. The problem lies in the fact that the latent variables are unobserved and hence their scales are unknown. To identify the model, it therefore becomes pertinent either to (a) set the variance of each latent variable to one, or (b) fix one factor loading per latent variable to one.
(3) Model estimation: Estimation proceeds by finding the parameters λ (lambda), Φ (phi), and Θ (theta) such that the implied covariance matrix Σ (sigma) is as close to the sample covariance matrix (S) as possible. Several different fitting functions exist for assessing the closeness of the implied covariance matrix to the sample covariance matrix, of which maximum likelihood estimation (MLE) is the most common (Albright & Park, 2009).
(4) Model evaluation: Unlike EFA, CFA produces many goodness-of-fit measures to evaluate the model but does not calculate factor scores (Albright & Park, 2009). After obtaining the parameter estimates, one must evaluate model fit and compare the obtained values with the recommended fit indices (e.g. Kline, 1998; Byrne, 2001; Hair et al., 2006). Assessing whether a specified model fits the data is one of the most important steps in CFA (Yuan, 2005). While assessing model fit, it is neither necessary nor realistic to include every index provided in the output. As there are no golden rules for the assessment of model fit, because different indices reflect different aspects of fit, reporting a variety of indices is necessary (Crowley & Fan, 1997). In a review, McDonald & Ho (2002) found that the most commonly reported fit indices are the Comparative Fit Index (CFI), Goodness of Fit Index (GFI), Normed Fit Index (NFI) and the Non-Normed Fit Index (NNFI). Furthermore, Kline (2005) and Hayduk et al. (2007) assert that the χ2, along with its degrees of freedom and associated p value, should be reported at all times.
Moreover, Hooper et al. (2008) suggest that it is sensible to report the χ2 statistic, its degrees of freedom and p value, and the Root Mean Square Error of Approximation (RMSEA) together with its associated confidence interval. A brief code sketch of steps (1) to (4) appears after this list.
(5) Model modification: If the model does not fit the data well, the researcher should check a number of diagnostics which suggest ways to further improve the model or point to some specific problem area (Hair et al., 2006).
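A minimal sketch of steps (1) to (4) is given below using the third-party semopy package (an assumption on our part; lavaan in R or AMOS would follow the same logic). The construct names, item names and data file are hypothetical. Under maximum likelihood, the estimation step minimizes the discrepancy F_ML = log|Σ(θ)| + tr(SΣ(θ)⁻¹) − log|S| − p between the implied and sample covariance matrices.

```python
import pandas as pd
import semopy  # third-party SEM/CFA package, assumed available

# (1) Model specification: two correlated first-order factors, lavaan-style syntax.
model_desc = """
Tangibles =~ tan_1 + tan_2 + tan_3
Assurance =~ ass_1 + ass_2 + ass_3
"""

# Hypothetical item-level survey data.
data = pd.read_csv("servqual_items.csv")

# (2)-(3) Identification and ML estimation: the latent variables are given a scale
# and the ML discrepancy between the implied and sample covariance matrices is minimized.
model = semopy.Model(model_desc)
model.fit(data)

# (4) Model evaluation: parameter estimates plus the usual fit statistics
# (chi-square with df and p value, CFI, GFI, NFI, TLI, RMSEA, AIC/BIC).
print(model.inspect())
print(semopy.calc_stats(model).T)
```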

First-Order Factor Model
A First-Order Factor Model (FOFM) is one in which the covariances between the measured items are explained by a single layer of latent factors (Hair et al., 2006). Empirically, a first-order factor accounts for the covariation between observed variables (Babin et al., 2003). Usually, the covariance terms are left free (Hair et al., 2006), while the factor loading of the first item of each construct is fixed to 1 (Anderson & Gerbing, 1988; Narayan et al., 2008).

Second-Order Factor Model
Researchers are increasingly employing higher-order factor analyses (Kerlinger, 1986), as they impose a more parsimonious structure to account for the interrelationships among the factors identified by the lower-order CFA (Brown, 2006) and tend to perform better on indices such as PNFI and RMSEA. A Second-Order Factor Model (SOFM) consists of two layers of latent constructs, modeled such that a higher-order construct causally impacts a number of first-order factors (Roy & Shekhar, 2010), i.e. a second-order latent factor causes multiple first-order latent factors, which in turn explain the measured variables (Hair et al., 2006).
Both theoretical and empirical considerations are associated with second-order CFA, which requires two criteria to be taken into account: first, the second-order factor must be identified with at least three first-order factors, and secondly, each individual first-order factor must possess a minimum of two observed variables (Kline, 2005). In this regard, Bagozzi (1995) states that a second-order model is useful when the first-order factors are distinct yet share a significant amount of variance. The number of higher-order factors that can be specified is dictated by the number of lower-order factors. Unlike first-order CFA, higher-order CFA tests a theory-based model for the patterns of relationships among the first-order factors (Roy & Shekhar, 2010). These specifications assert that higher-order factors have direct effects on lower-order factors; these direct effects, and the correlations among higher-order factors, are responsible for the covariation among the lower-order factors (Brown, 2006).
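In the same lavaan-style syntax used in the earlier sketch, a hypothetical second-order specification would add one higher-order factor causing three first-order factors, the minimum needed for identification as noted above:

```python
# Hypothetical second-order CFA specification (construct and item names are illustrative).
second_order_desc = """
Tangibles =~ tan_1 + tan_2 + tan_3
Assurance =~ ass_1 + ass_2 + ass_3
Responsiveness =~ res_1 + res_2 + res_3
ServiceQuality =~ Tangibles + Assurance + Responsiveness
"""
# Fitting then proceeds exactly as in the first-order sketch:
# semopy.Model(second_order_desc).fit(data)
```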

STRUCTURAL EQUATION MODEL
Several researchers have suggested that the causal relationships between factors and behavioural intention can best be analyzed using the Structural Equation Model, i.e. SEM (Schumacker & Lomax, 1996; Hair et al., 2006). In fact, the available service quality literature provides evidence in support of the use of SEM by researchers in the area to generate and analyse theorized models (MacCallum & Austin, 2000; Byrne, 2001; Holmes-Smith, 2001; Al-Hawari et al., 2005; Dunson et al., 2005; Mostafa, 2007; Khan & Adil, 2010; Lai et al., 2010; Seiler et al., 2010; Adil, 2012; Adil & Khan, 2012b, 2012c; Adil et al., 2013a). The SEM technique provides more realistic models than standard multivariate statistics or multiple regression models alone. By using SEM, researchers can specify, estimate, assess, and present the model in the form of an intuitive path diagram showing the hypothesized relationships among variables. The proposed research model can then be tested by carefully comparing the obtained model values with the recommended fit indices. In fact, assessing whether a specified model fits the data is one of the most important steps in SEM (Yuan & Bentler, 1998).
An exogenous construct is the latent, multi-item equivalent of an independent variable; it is not affected by any other construct in the model. An endogenous construct, on the other hand, is the latent, multi-item equivalent of a dependent variable; it is a construct that is affected by other constructs in the model (Sharma, 1996; Hair et al., 2006). A latent construct is determined indirectly by measuring one or more variables, and these measured variables are used as the indicators of the latent construct (Hair et al., 2006).
Anderson & Gerbing (1988) have proposed a two-step approach for analyzing the data, in which the measurement model is assessed first and the structural model is then estimated. By using this two-step approach, the typical problem of not being able to localize the source of poor model fit, associated with the single-step approach, can be overcome (Kline, 1998). The single-step approach involves assessing the measurement and structural models simultaneously (Singh & Smith, 2006).
The critical point in SEM is the assessment of model fit. A large class of omnibus tests exists for assessing how well the model matches the observed data. The conventional overall test of goodness-of-fit assesses the discrepancy between the hypothesized model and the data by means of a χ2 test (Srinivas & Kumar, 2010). However, the χ2 value is widely recognized to be problematic (Jöreskog, 1970) and sensitive to sample size (Vigoda, 2002). In large samples, the χ2 test detects even trivial differences between the data and the hypothesized model, leading to rejection of the model (James et al., 1982; Bollen & Long, 1992; Browne & Cudeck, 1992; Hayduk, 1987, 1996; Srinivas & Kumar, 2010). The χ2 test may also be invalid when distributional assumptions are violated, leading to the rejection of good models or the retention of bad ones (Brown, 2006). Due to these drawbacks of the χ2 test, a number of alternative fit indices, viz. the Root Mean Square Error of Approximation (RMSEA), Goodness of Fit Index (GFI), Adjusted Goodness of Fit Index (AGFI), Comparative Fit Index (CFI), and Normed Fit Index (NFI), have been proposed by researchers (Bollen & Long, 1993; Hu & Bentler, 1999; Arbuckle, 2003; Hooper et al., 2008; Srinivas & Kumar, 2010) and used successfully in social research (e.g. Alkadry, 2003; Vigoda, 2002; Adil, 2012; Lii & Lee, 2012; Adil et al., 2013a; Adil et al., 2013b). In contrast to the χ2 test, which provides a strict yes-or-no decision regarding acceptance of the model, most of these alternative indices focus on the degree of fit (Hu & Bentler, 1999). Although each of these indices has its own advantages and disadvantages, they are relatively less affected by sample size (Albright & Park, 2009).
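Following the two-step logic, once the measurement model is acceptable a structural path can be added and the alternative fit indices listed above read from the model's statistics table. The sketch below again assumes the semopy package and uses hypothetical constructs, items and data.

```python
import pandas as pd
import semopy  # third-party SEM package, assumed available

# Measurement part for both constructs plus one structural path: the exogenous
# ServiceQuality construct predicting the endogenous BehaviouralIntention construct.
sem_desc = """
ServiceQuality =~ sq_1 + sq_2 + sq_3 + sq_4
BehaviouralIntention =~ bi_1 + bi_2 + bi_3
BehaviouralIntention ~ ServiceQuality
"""

data = pd.read_csv("servqual_survey.csv")   # hypothetical data file
model = semopy.Model(sem_desc)
model.fit(data)

# Fit statistics: look for chi-square (with df and p), CFI, GFI, AGFI, NFI and RMSEA.
# Values of CFI/GFI/NFI around 0.90 or above and RMSEA around 0.08 or below are the
# benchmarks most often cited, though Hu & Bentler (1999) suggest stricter cutoffs.
print(semopy.calc_stats(model).T)
print(model.inspect())                      # path coefficients for the structural model
```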

CONCLUSIONS
The aim of this study was to assemble and assimilate the different data analysis techniques that have been widely used and reported in the services marketing literature. The article provides a concise list of select studies on service quality, covering the domain from conventional services to internet-enabled services, and offers varied perspectives on the schematic presentation of data analysis techniques. Collectively, the extant literature suggests that there is a growing trend among researchers to rely on sophisticated quantitative analytical techniques to generate and analyze complex models, while at times ignoring very basic yet powerful procedures such as the t-test, ANOVA and correlation. It is quite noticeable that from the year 2000 onwards there has been a significant shift in the sophistication of methods employed for data analysis, i.e. a shift from basic measures such as the mean, standard deviation, correlation and regression to higher-order multivariate techniques viz. confirmatory factor analysis, structural equation modeling etc. Lately, attempts by researchers to study the effects of mediating and moderating variables have further added to this complexity. The marked shift in the orientation of researchers towards sophisticated analytical techniques can largely be attributed to competition within the community of researchers in the social sciences in general, and those working in the area of service quality in particular, as well as to the growing demands of journal reviewers. Thus, researchers, practitioners and consultants need to continually update themselves on advances in data analysis tools and techniques in order to gain deeper insights and arrive at optimal solutions to increasingly complex management problems.

Figure 4. Steps Involved in Testing Measurement Theory Model

Table 1. Select Studies on Service Quality from 1985 till 2013