Using scale reduction techniques for improved quality of survey information



Introduction
Surveys have been used since time immemorial to collect raw data for the production of information in a variety of contexts. The earliest use of the survey technique can be traced back to ancient Egyptian rulers, who conducted censuses to help them administer their domains. Even the Old Testament of the Bible refers to the Lord asking Moses and Eleazar to 'take a census of all the congregation of the people of Israel...' The age of this research technique is perhaps indicative of its prominence today as a tool of both the modern scientist and of industry-based information managers.
The survey method is considered the single most important approach in empirical social research (Kuechler 1998:178) and one that most frequently underpins research designs (Van Staden and Visser 1991). Hinkin (1995) posits that over the past several decades, hundreds of scales have been developed to assess various attitudes, perceptions, or opinions of people in all walks of life. It is also common practice for surveys to be used by business managers as a source of information for decision making.
The application of surveys can be found in a diverse range of scientific journals in fields such as political science, psychology, education, computer science, medicine and informatics, as well as in mass media, industry and government research. There are various reasons for the prominence of the survey method among researchers. Possibly the most important, though not always acknowledged, are the positivist influences of philosophers such as Auguste Comte and Emile Durkheim since the first half of the nineteenth century.
In the 1990s, the introduction of a standardized Internet protocol for the transmission of information, and hence the advent of the World-Wide Web, facilitated a proliferation of surveys for both scientific and market research (Couper 2000; Porter and Whitcomb 2003). The main tool for conducting surveys has been questionnaires, mainly because they are relatively easy to administer (Zhang 2000) and can efficiently gather sufficient data at a low cost. In the current era, both e-mail and Web-based questionnaires have been firmly entrenched as a tool of choice for researchers (see, for example, Andrews, Nonnecke and Preece 2003; Couper 2000; Stanton, Sinar, Balzer and Smith 2002), given the ease of administering surveys using Internet protocols.
However, given the popularity of surveys as a data collection tool, it is incumbent upon researchers to apply stringent measures to ensure the validity and reliability of the research instrument and hence improve the quality of the results. The appositeness of approaches used in questionnaire design, especially during the pilot phase, is one aspect of design that can enhance survey results. This is especially important in light of human computer interaction studies, which indicate that the reading of text on lengthy Web-based questionnaires is problematic for the average user as compared to filling out paper-based questionnaires (see, for example, Dillon, Kleinman, Bias, Choi and Turnbull 2004; Forsyth, Grose and Ratner 1998; Stanton et al. 2002). Therefore researchers and businesses that employ surveys need to take into account that a parsimonious set of items will be more conducive to ensuring that the respondents are able to focus on the questions and consequently provide valid and reliable responses. In addition, other studies have suggested that the number of items in a scale has an effect on responses, and that shorter scales are an effective means of minimizing response biases (Schmitt and Stults 1985). Moreover, Hinkin (1995) states that scales with as few as three items can have adequate internal consistency reliability.
More importantly, survey results usually provide the raw data for processing data into useful information, thereafter knowledge and eventually wisdom (Ackoff 1989; Zeleny 1987). If one had to take a supply chain view of Ackoff's Data Information Knowledge Wisdom (DIKW) hierarchy (Figure 1), it stands to reason that the quality of data produced by surveys is the bedrock upon which quality information and useful knowledge are produced. From the foregoing, a number of issues regarding the use of surveys to collect data have been highlighted, namely that the use of lengthy questionnaires is problematic, that a large number of items in a scale could have a negative effect on response rates and that shorter scales minimize response biases. Consequently the problem that this research aimed to address concerns how to arrive at an ideal parsimonious set of scale items so as to provide both focused and reliable responses for scale development. The principal research questions pursued in this research were therefore: Does the application of statistical techniques aid in condensing the number of items in questionnaires that make up a scale? Will the reduction of the number of items in a scale contribute to improved reliability of survey results? Pursuant to the above, this research explored the application of a statistical technique, namely item analysis, as a means to develop a more parsimonious set of items in survey instruments.

Background to the research problem
The motivation for this study initially emanated from an analysis of post-graduate research conducted within a post-graduate facility at a South African university in the period 2005 to 2007. In the analysis of the empirical work conducted by 30 students it was found that the majority of students used a survey design (57%); Likert scales were used by 33%; only 15% had piloted the questionnaire; and Web-based surveys were used by 18%.
As a means of further investigating the notion that researchers are not using statistical techniques to improve the design of survey instruments, the authors also analysed peer-reviewed articles that were published in the South African Journal of Information Management (SAJIM) over a two-year period (2006 to 2007). In this analysis we found that, of the studies that reported on empirical findings, 63% used questionnaires to collect data. Of these, only 24% reported some form of instrument refinement. The main purpose of the latter was to improve wording or remove superfluous items. For example, O'Brien and Kok (2006) reported that they used 'three independent respondents… to determine whether the questions are easy to understand'. In another article, Van Zyl, Amadi-Echendu and Bothma (2007) conducted a pilot study in which respondents were asked to 'comment on issues such as wording, ambiguity, layout, logic and coherency'. This demonstrates that researchers were mainly concerned with improving the face validity of their instruments. The other 76% of the articles were silent in respect of whether any instrument refinement technique was used.
From the foregoing analyses, the authors found that the survey technique is indeed a popular method to collect data. However, none of the studies in either of the analyses described above reported on the application of statistical techniques to refine the instruments, or on any specific attempts to improve the reliability of results during design. Rather, the few articles that did report on conducting instrument refinement did so with the primary objective of improving the readability, wording and layout of the questionnaires, that is, improving the face validity of the instruments.

Brief overview of the epistemological influences on survey research
Survey research, which is based on quantitative methodologies, draws on notions of positivism. Positivist or logical positivist research is based on the notion that research can be objective, that the researcher is independent and that the results are valid, reliable and generalizable (Remenyi and Pather 2006). This type of research, which is often directly associated with the scientific method (Galliers 1992), draws on the notions of reductionism, determinism and falsification. In the physical and life sciences, positivism is regarded as being the research paradigm that has delivered scientific and engineering successes such as the electric motor, the internal combustion engine, manned flight, a heart transplant and a robot on Mars.
When scholars began to turn their attention to how organizations and the individuals within them functioned, they looked towards the scientific method. This led to a new scientific community that addressed what was then referred to as the social sciences. One of the initiators of this idea, Auguste Comte (Comte 1975), is regarded as the founder of positivist thought. The main thrusts of Comte's philosophy are summarized by Babbie and Mouton (2001:22) as follows: 'For Comte the ultimate idea is to establish a society founded on scientific principles. This ideal could only be realized if the social sciences obtain the same control over its domain as is the case in the natural sciences. It is therefore only logical that the best strategy for the social scientist is to follow the same methodology as that of the natural scientist. This means that in both domains the aim is to establish universally valid, causal laws of human behaviour.' Exponents of positivist traditions utilize mainly quantitative research methods. Roode's (2003) view of positivism is: 'The positivist regards the objects of her study as exactly that viz. objects, and applies the methods and practices of the scientific method in investigating these objects with the aim of reaching valid and truthful conclusions about them, thereby contributing to knowledge that attempts to uncover universal laws to be used for predictive purposes.'
Although today there are a number of other epistemological influences on research, such as interpretivism and critical realism, positivism is still held in high regard by many researchers, whether they are in the physical and life sciences or in the social sciences. In the current era we are witnessing an increasing number of studies that adopt non-positivist approaches, and we concur that such approaches will continue to have relevance within particular research settings. However, given the dominance of positivist research, quantitative survey research will in all probability continue to dominate the research agenda, especially in the social sciences. This article therefore underscores the need for increased rigour in the design of surveys in order to fortify our quests to 'establish universally valid, causal laws of human behaviour' (Comte 1975).

Overview of some of the prominent statistical techniques used for survey design
According to Schwab (1980), there are three stages in the development of questionnaires, namely item development, scale development and scale evaluation. Given that the objective of this article is to propose a method for reducing the number of items constituting a scale, the focus of the techniques discussed here is limited to those associated with scale development.
Several studies in the literature have investigated and reported on scale reduction methods as specific strategies in survey design. An overview of these is presented in Table 1. Adequate internal consistency, that is, reliability, can be obtained with as few as three items.
Adding items indefinitely makes progressively less impact on scale reliability. Hinkin (1995:973) also discusses sample size and reports that item-to-response ratios should range from 1:4 to 1:10 (Rummel 1970; Schwab 1980). Kline (1994:49, 73) states that for the results of factor analysis to be reliable, samples of at least 100 respondents are needed. If all of these considerations are to be taken into account in scale development, it would mean that pilot sample sizes should be of that magnitude. Another avenue that may be followed is to use a technique such as item analysis that works successfully with smaller pilot samples. The following section further explains this method.
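The diminishing effect of adding items can be illustrated with the Spearman-Brown prophecy formula. This formula is not named in the sources cited above; it is used here only as a hedged illustration, and it assumes that the added items are parallel to the existing ones. The function name and the starting reliability of 0.60 are hypothetical.

```python
def spearman_brown(reliability, factor):
    """Predicted reliability when a scale is lengthened by `factor`.

    Assumes the added items are parallel to the existing items
    (an assumption of the Spearman-Brown formula, not of the article).
    """
    return factor * reliability / (1 + (factor - 1) * reliability)


# Hypothetical starting point: a three-item scale with reliability 0.60.
base_items, base_reliability = 3, 0.60
for n_items in (3, 6, 12, 24):
    predicted = spearman_brown(base_reliability, n_items / base_items)
    print(n_items, round(predicted, 3))
```

Under these assumptions, each doubling of the scale length adds less than the previous one (roughly 0.60 to 0.75 to 0.86 to 0.92), which is consistent with the observation that adding items has a progressively smaller impact on reliability.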

Item analysis
Item analysis has many contexts in the literature, for example descriptive statistical analysis of individual items (De Man, Gemmel, Vlerick, Van Rijk and Dierckx 2002) and item difficulty in cognitive tests (Collins 2003). Nunnally (1967) developed a technique called item analysis that measures the correlation of each item with the underlying construct. In this context, items should correlate highly with the constructs they intend to measure. Cooper and Schindler (2003: 252), on the other hand, describe item analysis as a technique to differentiate between respondents having high total and low total scores on summated Likert scale items (Cooper and Schindler 2003: 262).
In this article, we refer to item analysis as a strategy for attitude and opinion scale development and reduction as defined by Edwards (1957). Likert scales, as used in the study of attitudes and opinions, are most often evaluated by this method. According to Edwards (1957), a mean score is calculated for each item. A t-test for the equality of means is applied to find significant differences between the means of high scorers (the highest 25 per cent of total scores) and low scorers (the lowest 25 per cent) on the scale (Cooper and Schindler 2003:263). The items that have the highest t-values, that is, the most significant differences, are selected for inclusion. Those with a significance greater than 0,05 are believed not to add further insight to the construct and may be eliminated from the scale (Cooper and Schindler 2003). Unlike the situation in split-sample reliability, in item analysis the responses are not divided into two random groups (Karavas-Doukas 1996).
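The procedure described above can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a pooled-variance t-test with equal group sizes, complete responses, and respondents ranked by their summed score. The function names and the parameters `sig_level` and `tail` are hypothetical.

```python
import math
from statistics import mean, variance


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value for Student's t, by Simpson integration of the density."""
    t = abs(t)
    if t == 0.0:
        return 1.0
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps  # steps is even, as Simpson's rule requires
    area = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    central = area * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def item_analysis(responses, sig_level=0.05, tail=0.25):
    """Return indices of items that separate high scorers from low scorers.

    responses: one row per respondent, each row a list of item scores.
    """
    n_group = max(2, int(len(responses) * tail))
    ranked = sorted(responses, key=sum)  # rank respondents by total score
    low, high = ranked[:n_group], ranked[-n_group:]
    keep = []
    for j in range(len(responses[0])):
        lo = [row[j] for row in low]
        hi = [row[j] for row in high]
        pooled = (variance(lo) + variance(hi)) / 2  # equal group sizes
        if pooled == 0:
            # No spread within either group: keep only if the means differ.
            if mean(hi) != mean(lo):
                keep.append(j)
            continue
        t = (mean(hi) - mean(lo)) / math.sqrt(pooled * 2 / n_group)
        if t_two_sided_p(t, 2 * n_group - 2) <= sig_level:
            keep.append(j)
    return keep
```

Items whose indices are not returned correspond to those with a significance greater than 0,05, which the procedure treats as candidates for elimination.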

Research method and study results
To demonstrate the usefulness of item analysis during scale development, four completed studies were selected as cases. The criteria for selecting the studies were that the research had utilized the survey technique; that questionnaires using Likert scales had been used; that the researcher/s had not utilized any statistical technique for item reduction or instrument testing; and that the raw data was available to the authors.
After having selected the four cases, the authors then experimented with the data as follows: (1) simulated a pilot study of the instrument by isolating a first set of responses, according to the size of the original sample of responses; (2) applied item analysis to the responses of the simulated pilot study; (3) used the resultant reduced number of items to simulate the actual survey, that is, responses to items which were excluded through our application of item analysis were not part of the simulation; (4) applied the statistical tests used by the original researchers; and (5) compared the reliability results of our simulated survey with the original results of the researchers.
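The data handling in this experiment can be sketched as follows. The sketch only illustrates isolating the simulated pilot sample and removing the excluded items; the function names and the placeholder data are hypothetical, and the retained item indices are an assumed example rather than results from any of the cases.

```python
def split_pilot(responses, n_pilot):
    """Treat the first n_pilot responses as the simulated pilot sample."""
    return responses[:n_pilot], responses[n_pilot:]


def keep_items(responses, kept):
    """Drop every item whose index is not listed in `kept`."""
    return [[row[j] for j in kept] for row in responses]


# Placeholder data: 100 respondents, 14 Likert items scored 1 to 5.
responses = [[(i + j) % 5 + 1 for j in range(14)] for i in range(100)]

pilot, validation = split_pilot(responses, 40)

# Assumed outcome of item analysis on the pilot: two items are excluded.
kept = [j for j in range(14) if j not in (11, 13)]
reduced_validation = keep_items(validation, kept)
print(len(pilot), len(validation), len(reduced_validation[0]))
```

The reduced data set would then be analysed with the same tests the original researchers used, so that the reliability results can be compared like for like.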
The results of the foregoing are reported in the sections that follow.

Case one
In this study, the researcher investigated the perceptions of social status and its correlation with buying habits. The scale was made up of 14 items, each measured on a Likert scale. The sample consisted of 100 respondents. For the purpose of this article, the first 40 respondents listed were included in the pilot sample and the remainder (60 respondents) were included in the validation sample. The item analysis strategy was applied to the pilot sample. All items showing a p-value greater than 0,05 were eliminated from the scale, as can be seen in Table 2. Before using the t-tests, the items relating to construct A were listed as in Table 3. The Cronbach's alpha reliability coefficient for the construct in Table 3 was 0,460. After applying item analysis, as explained by Edwards (1957), two of the items, X12 and X14, associated with the scale for construct A, were eliminated in the experimental example, increasing the Cronbach's alpha to 0,584. The second row in Table 3 shows the remaining items in the construct. In applying the same analysis to the full sample, the Cronbach's alpha increased to 0,623. In this case the application of item analysis yielded a favourable result, that is, an increase in the scale reliability.
Table 3 Comparative Cronbach alpha results for items in A
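Cronbach's alpha, the reliability coefficient compared in each case, can be computed directly from the raw item scores. The sketch below uses the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the small data set are hypothetical and are not taken from any of the four cases.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(responses[0])
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses to a two-item construct.
example = [[1, 1], [2, 3], [3, 2]]
print(round(cronbach_alpha(example), 3))
```

Recomputing the coefficient before and after removing the items flagged by item analysis gives the kind of comparison reported for each construct.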

Case two
The second case is taken from an online survey. The researcher in this case used both reliability and factor analyses in his investigation of the loyalty of online customers of small, micro and medium enterprises (SMMEs). The scale in this questionnaire consisted of 49 items describing 12 constructs. The first 80 responses were included in the pilot sample and the remaining 96 responses were included in the validation sample. Seven items were eliminated for not having significant differences between the low and high scorers of the pilot sample. Four of these eliminated items constituted a construct, which then was eliminated completely. The other three items belonged to three other constructs respectively. The first item to be eliminated was one of the items describing the third construct. In the original analysis, the Cronbach's alpha for the third construct, of which the items are listed in Table 4, was 0,700.
Table 4 Items in the third construct with original and new factor loadings
After applying item analysis, item X10 was eliminated from the scale, and the Cronbach's alpha increased to 0,793. The component matrix of exploratory factor analysis also showed higher loadings of the items onto the construct.
The second item to be eliminated, X27, was listed in the seventh construct. The Cronbach's alpha in the original analysis for this construct was 0,677. Table 5 shows the original and resulting construct. The Cronbach's alpha in this case decreased to 0,611, but the factor loadings of the items on the construct increased.
Table 5 Items in the seventh construct, with original and new factor loadings
The tenth construct, consisting of four items, was completely eliminated from the scale. The original Cronbach's alpha for this construct was 0,633, and all the factor loadings were below 0,57. The 49th item was the last item to be eliminated from the scale. This item was listed in the 12th construct. The Cronbach's alpha for this construct was 0,668 originally. The Cronbach's alpha value decreased to 0,633, but the factor loadings of the remaining three items increased.
Thus the application of item analysis in this case also yielded improved results, that is, improved factor loadings, even though the Cronbach's alpha value (reliability) decreased for the seventh and 12th constructs.
Table 6 Items in the 12th construct, with original and new factor loadings

Case three
The third study investigated industry design perceptions of the respondents. The scale used in this survey consisted of 30 items, making up a possible six constructs. It was decided to assign the first 79 complete cases of the 338 cases to the pilot sample. In this study, factor analysis was used to determine the underlying constructs.
Item analysis was applied to the pilot sample and only one of the 30 items was eliminated. This item belonged to the fifth construct. Table 7 shows the results of the original and the new analyses. In summary, the application of item analysis during the pilot phase of this case resulted in an increase in the Cronbach's alpha value for the fifth construct and no change to any of the other five constructs. Once again this points to improved results.

Case four
In the fourth case the researcher investigated student drop-out rates and the factors influencing them.
The number of respondents in this case was 490, of which the first 60 were included in the pilot sample. The remaining 430 were included in the experimental sample. The scale in this case consisted of 52 items describing a possible seven constructs. Only eleven items had significant differences between the low and the high scorers of the pilot sample and were thus included in the resulting factor analysis. Of the seven original constructs only four factors remained, of which two consisted of only one item each.
The first construct consisted of four items in the original factor analysis. The Cronbach's alpha value for this construct was 0,686.
Table 8 Items in the first construct with original and new factor loadings
After item analysis was applied, X3 was eliminated from the scale, the Cronbach's alpha value increased to 0,752 and the factor loadings showed some increases as well.
In the second construct, which originally consisted of five items, item analysis eliminated X39 from the scale. In the original analysis the Cronbach's alpha value was calculated to be 0,369, as can be seen in Table 9. After item X39 was eliminated, the factor loadings of the remaining items increased somewhat and the Cronbach's alpha value increased to 0,768. In this case, the application of item analysis reduced the scale from 52 to 11 items, and increased the factor loadings as well as the reliability coefficient in both constructs that contained more than one item.

Summary
The results of our experiment with the four cases above can be summarized as follows:
Table 10 Summary of the item analysis
Table 10 indicates that item analysis was used successfully in all four cases to reduce the scale. The reliability coefficients showed some improvement in all four cases. Although two of the four cases only showed a partial improvement, this does not negate the overall value of applying item analysis to refine the questionnaires. Importantly, in all of these cases, the researchers would have had more concise instruments to deal with. The implications of this are twofold. First, the respondents would have taken a shorter time to complete the survey (especially in case four, which showed a scale reduction of 79% and an absolute improvement in reliability). Second, there is an increased probability of an improved response rate and minimization of response bias (Schmitt and Stults 1985).
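The scale reduction percentages can be checked with simple arithmetic. The sketch below uses only the item counts stated in the case descriptions; case one is left out because the text does not state the total number of items eliminated from its 14-item scale. The function name is illustrative.

```python
def reduction_pct(original_items, retained_items):
    """Percentage of items removed from a scale."""
    return 100 * (original_items - retained_items) / original_items


# (original items, retained items) as stated in the case descriptions.
cases = {
    "case two": (49, 42),    # seven of 49 items eliminated
    "case three": (30, 29),  # one of 30 items eliminated
    "case four": (52, 11),   # eleven of 52 items retained
}
for name, (original, retained) in cases.items():
    print(name, round(reduction_pct(original, retained), 1))
```

For case four this gives 41 of 52 items removed, which rounds to the 79% reduction reported above.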

Conclusion
In this article the authors have highlighted a concern that there is insufficient attention being paid to questionnaire design by researchers. This is especially a concern since the raw data provided by surveys are the bedrock of good information and useful knowledge. In addition, the importance of designing instruments with a parsimonious set of items, so as to account for the 'survey-fatigue' syndrome associated with reading text on electronic media, was highlighted. The authors argue that the use of statistical techniques, such as item analysis, not only aids in condensing scale items but may also improve the reliability of survey results.
The latter was demonstrated through four cases, in which the authors evaluated data from completed surveys. Through the statistical processes described above, the authors have shown that the use of item analysis as a technique can improve survey results. The application of item analysis in all four surveys produced a more condensed set of items, and the reliability coefficients showed an absolute improvement in two of the cases and a partial improvement in the other two. Consequently, the authors do not argue with absolute conclusiveness that item analysis can enhance survey results each and every time it is applied. Rather, this study does demonstrate that, through more diligent survey design, 'survey fatigue' can be countered by not only reducing the length of scales, but also improving survey results.
Finally, the authors urge researchers and practitioners to consider the use of techniques such as item analysis during pilot testing so as to improve the quality and reliability of information that is eventually produced through analysis.

Table 1
Scale reduction methods

Table 7
Items in the fifth construct, with original and new factor loadings

Table 9
Items in the second construct, with original and new factor loadings