Original Article

The use of numerical value of adverbs of quantity and frequency in the measurement of behavior patterns: transforming ordinal scales into interval scales

The use of the numerical value of adverbs of quantity and frequency in measuring behavior patterns: from ordinal scales to interval scales

The use of adverbs of quantity and frequency in measuring behavior: transforming ordinal scales into interval scales

Artur Parreira
Universidade Lusófona de Humanidades e Tecnologias, Portugal
Faculdade Paraíso, Brasil
Ana Lorga da Silva
Universidade Lusófona de Humanidades e Tecnologias, Portugal
Conservatoire National des Arts et Métiers, França

Ensaio: Avaliação e Políticas Públicas em Educação, vol. 24, no. 90, pp. 109-126, 2016

Fundação CESGRANRIO

Received: 9 September 2015

Accepted: 26 November 2015

Abstract: This paper presents research on rating scales used in response to different situations. It aims to improve the significance and accuracy of ordinal scales by transforming them into interval scales. To reach this objective, the scales presented combine quantitative and qualitative perspectives, joining the ease of the Likert scale with Thurstone’s procedure. A sample of subjects was asked to indicate the numerical value of adverbs with reference to a numerical scale. The results were subjected to statistical analysis to assess their validity. By combining the qualitative dimension with a quantitative evaluation, this procedure can meet the biopsychosocial specificities of subjects, as required by the complexity paradigm. The results of this study suggest an affirmative response to the questions about validity and reliability, and about the practicality of this procedure.

Keywords: Questionnaire, Behavioral assessment, Value of adverbs, Evaluation tool.

Abstract (translated from Portuguese): The article presents research on rating scales, with the aim of improving the meaning and precision of ordinal scales. The study sought to identify the numerical meaning attributed to adverbs, which combine a qualitative dimension (meaning) and a quantitative one (quantity). A sample of subjects was asked to indicate the numerical value of the adverbs with reference to a numerical scale. The results were analyzed statistically to assess their validity and reliability. Combining the qualitative and quantitative dimensions in the evaluation meets the biopsychosocial specificity of the subjects, as the complexity paradigm requires. The objectives of the study are believed to have been achieved, and it may be useful to other scientists who study behavior in the area of educational policies and practices and their evaluation methodologies.

Keywords (translated from Portuguese): Questionnaire, Evaluation behaviors, Numerical value of adverbs, Evaluation instrument.

Abstract (translated from Spanish): The paper presents research on scales, with the aim of improving the meaning and accuracy of ordinal scales. It focuses on the numerical value of adverbs, which combine the qualitative dimension (meaning) with the underlying quantitative one. A sample of subjects was asked to indicate the value of the adverbs, and the results were subjected to statistical analysis in order to improve their validity and reliability. By integrating the qualitative and quantitative dimensions, the procedure addresses the biopsychosocial specificity of the subject, as the complexity paradigm requires. The objectives of the study were achieved, and the results may be useful to other scientists in the field of educational policies and practices and their evaluation methodologies.

Keywords (translated from Spanish): Questionnaire, Behavior assessment, Numerical value of adverbs, Evaluation tool.

1 Introduction

Studies on scales measuring attitudes and behavior have a relatively long history, dating back to 1928, when Thurstone proposed a theory of attitude measurement based on psychophysical models (THURSTONE; CHAVE, 1928). Thurstone’s recourse to psychophysics is understandable, since in scientific thinking any observation must be translated into a quantitative result to be considered precise and measurable (BUNGE, 2000). In fact, the classical scientific concept of measurement is the assignment of numerical values to objects and events according to defined rules (KERLINGER; LEE, 2002).

It is generally accepted, however, that psychological objects have specificities that must be taken into account when assigning numerical values to their specific expression, human behavior. Every psychic phenomenon – self, consciousness, expectation, attitude, motivation, etc. – comes from interaction with the biological context of the human individual; but this interaction varies throughout life: it is not the same in childhood, in adolescence, in adulthood or in old age. This diversity undoubtedly increases complexity; yet there are other dimensions that must be taken into account, namely social, cultural, educational and economic conditions (SULBARAN, 2009). Any attempt to explain – and predict – this behavior must combine all these and perhaps other dimensions. The measurement of behavior therefore cannot be confined to the simple classical paradigm: the explanation of behavior must rely on the paradigm of complexity (MORIN; LE MOIGNE, 1999) and must try to combine these different dimensions, in particular the qualitative and quantitative dimensions of behavior. The paradigm of complexity leads us to understand the subject as a living system whose structures, processes and behavior occur at an established level of complexity. This level of complexity is determined by the system’s position on four factors:

  1. Level of thought and information processed in the system: The higher the level of thought and information in the human system, the greater its complexity (LE MOIGNE, 2011).
  2. Internal variety of the system: The more diverse the experiences and fields of reality constructed and cognitively interpreted by the human system, the greater its complexity (SIMON, 1987).
  3. External variety of the system: The greater the variety of the entities with which the system has continuous relationships, the more complex the human system is on this criterion (VAZ, 2003).
  4. Integration of the informational variety: The integration of different structures, processes, and patterns of behavior leads to enlargement of the conceptual boundaries and to richer meaning of the constructed reality. The living system develops the ability to deal with uncertainty in all domains of knowledge, and its reasoning becomes probabilistic, not simply deterministic (LE MOIGNE, 2011).

2 Theoretical Framework

Complex thinking recommends that the interpretation of human behavior takes account of all the data collected through the measuring instruments, and that each type of data is specifically analyzed and interpreted in the light of its level of complexity:

The study aimed to address both concerns above, seeking to combine the qualitative and quantitative dimensions and relating such knowledge to the level of complexity at which the subject is positioned.

The researchers adopted the scale format proposed by Likert, as these are the most widely used scales today and the easiest to construct (BOZAL, 2006). They are, in general, ordinal scales, as Stevens points out (1946, quoted in BOZAL, 2006). Stevens categorized scales according to the statistical operations they allow, and this categorization has become classic. This article focuses on ordinal and interval scales.

Instruments for capturing data about behavior should adapt to the specificity of psychological phenomena; only in this way can we create instruments with higher levels of validity and reliability. In the psychological field, the specificities that scales must address are mainly three:

To achieve this closer adjustment to the reality of the human subject, we start from the spontaneous evaluative behavior of subjects, who commonly use adverbs of quantity and frequency. This study continues an earlier one carried out in 2003 and published in 2006 (PARREIRA, 2006), whose objective was to replace ordinal scales with interval scales, in order to reach an adequate mix of qualitative and quantitative factors and to create scales of greater validity and reliability. The results obtained then are quite similar to those found for the same adverbs in the present study, as can be confirmed by comparing Tables 1 and 5 below.

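Table 1
Statistical results from 2003 study

| Quantity adverbs | N | Mean | SD | Min | Max | Range |
|---|---|---|---|---|---|---|
| Extremely | 240 | 9.443 | 0.747 | 9 | 10 | 1 |
| Quite enough | 240 | 7.418 | 0.850 | 5 | 9 | 4 |
| Medially | 240 | 4.765 | 0.860 | 2.5 | 8 | 5.5 |
| Little | 240 | 2.207 | 0.868 | 0.5 | 5 | 4.5 |
| Nothing at all | 240 | 0.249 | 0.531 | 0 | 3 | 3 |

Source: Parreira (2006).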

These similar scores show that the proposed evaluations are highly stable: even samples that differ widely in composition and are separated in time (10 years) produce very close numerical scores. Thus, the use of these qualitative/quantitative scales seems both quite reliable and characterized by solid empirical validity.

On this basis, the authors decided to conduct similar research, aiming to gain more extensive evidence on the following questions:

3 The Methodology

In this study, the authors started by choosing a set of quantity and frequency adverbs commonly used by people when they want to express the quantitative dimension of a cognition or an emotion. The starting hypothesis is this: a scale based on a stable measurement of the numerical value of these adverbs will present consistent and known distances between the various positions; thus it will effectively be an interval scale.

Any interpretation or action based on it will thus be more precise than mere ordering and better adjusted to what the subjects express; that is, its validity will be improved. A list of adverbs of quantity/intensity (those most frequently used in Likert scales) was presented to a sample of people. They were asked to attribute a numerical value to each adverb or adverbial phrase on a scale of 0 (meaning 0% intensity) to 10 (meaning 100% intensity), so that it could be used as a quantitative scale measuring attitudes, emotions and behavioral patterns.

This procedure is similar to Thurstone’s, who also asked respondents to evaluate the numerical value of a sentence (the difference being that in this case, it is a quantity or frequency adverb or adverbial expression). Its advantage is the possibility of being used with any sample of people who will answer a test or questionnaire, as the scale is independent of the content of the evaluated sentence. The same procedure was used with frequency adverbs; in this case, however, the numerical scale used frequency reasoning, evaluating the frequency adverbs within a continuum from 0% frequency to 100% frequency.
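As a minimal sketch of how ratings collected in this way could be summarized, the Python snippet below computes the statistics reported for each adverb (mean, median, mode, standard deviation, minimum, maximum and range). The ratings and the helper name `summarise` are hypothetical and serve only as an illustration; they are not the study's data. The use of the sample standard deviation is also an assumption, since the paper does not say which estimator was used.

```python
import statistics


def summarise(ratings):
    """Return the descriptive statistics reported per adverb."""
    return {
        "n": len(ratings),
        "mean": statistics.mean(ratings),
        "median": statistics.median(ratings),
        "mode": statistics.mode(ratings),
        # Sample standard deviation (assumption: estimator not stated in the paper).
        "sd": statistics.stdev(ratings),
        "min": min(ratings),
        "max": max(ratings),
        "range": max(ratings) - min(ratings),
    }


# Hypothetical 0-10 ratings given by eight respondents to a single adverb.
hypothetical_ratings = [9, 10, 10, 9, 10, 8, 10, 10]
summary = summarise(hypothetical_ratings)
print(summary)
```

The same function could be applied to frequency adverbs rated on the 0% to 100% continuum, since only the numerical range of the input changes.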

3.1 The Sample

The sample, originally composed of 219 subjects, consisted mainly of university students, some of whom also worked. Sample treatment: evidently incongruent subjects and participants with missing data were excluded from the original sample using the listwise method. Subjects who could not correctly understand the Portuguese language, that is, the meaning of some adverbs, and detected outliers were also excluded. The final sample was composed of 198 subjects, although the following tables are based on 219 observations (Tables 2, 3 and 4).
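The listwise method mentioned above can be illustrated with a short Python sketch: a respondent is kept only if every adverb has a rating. The records, the adverb keys and the function name `listwise_delete` are hypothetical; the criteria the authors used to detect incongruent subjects and outliers are not described in enough detail to be reproduced here.

```python
def listwise_delete(records):
    """Keep only respondents who rated every adverb (no missing values)."""
    return [r for r in records if all(v is not None for v in r.values())]


# Hypothetical respondents; None marks a missing answer.
records = [
    {"totally": 10, "medially": 5, "little": 2},
    {"totally": 9, "medially": None, "little": 3},
    {"totally": 10, "medially": 4, "little": None},
    {"totally": 8, "medially": 5, "little": 1},
]

complete = listwise_delete(records)
print(len(records), "respondents before,", len(complete), "after listwise deletion")
```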

Table 2
Frequency observed by gender

| Gender | Frequency | Percent |
|---|---|---|
| 0 | 146 | 66.9 |
| 1 | 72 | 32.9 |
| NA | 1 | 0.5 |
| Total | 219 | 100 |

0 = Male; 1 = Female.
Source: Authors' research (2013).

Table 3
Frequency of age by classes

| Classes | Frequency | Percent |
|---|---|---|
| [19,34[ | 114 | 52.1 |
| [34,47[ | 62 | 28.2 |
| [47,60[ | 35 | 16 |
| [60,73[ | 7 | 3.2 |
| NA | 1 | 1.5 |
| Total | 219 | 100 |

Source: Authors' research (2013).

Table 4
Level of education by groups

| Classes | Frequency | Percent |
|---|---|---|
| 1- Basic (5 years) | 3 | 1.4 |
| 2- First cycle (6 years) | 4 | 1.8 |
| 3- Second cycle (9 years) | 4 | 1.8 |
| 4- Secondary level (12 years) | 20 | 9.1 |
| 5- University level | 168 | 76.7 |
| 6- Post-Graduate and PhD | 14 | 6.4 |
| NA | 6 | 2.8 |
| Total | 219 | 100 |

Source: Authors' research (2013).

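Table 5
Quantity adverbs and their observed statistics in this sample

| Original adverbs | Translated adverbs | Mean | Median | Mode | SD | Min | Max | Range |
|---|---|---|---|---|---|---|---|---|
| A- Totalmente | Totally | 9.52 | 10 | 10 | 0.833 | 7 | 10 | 3 |
| B- Completamente | Completely | 9.34 | 10 | 10 | 0.892 | 7 | 10 | 3 |
| C- Perfeitamente | Perfectly | 9.36 | 10 | 10 | 0.822 | 7 | 10 | 3 |
| D- Inteiramente | Entirely | 9.44 | 10 | 10 | 0.821 | 7 | 10 | 3 |
| E- Extremamente | Extremely | 9.25 | 9 | 10 | 0.844 | 7 | 10 | 3 |
| F- Muito | Very much | 7.62 | 8 | 8 | 1.541 | 4 | 10 | 6 |
| G- Bastante | Quite enough | 7.08 | 7 | 8 | 2.121 | 2 | 10 | 8 |
| H- Medianamente | Medially | 4.62 | 5 | 5 | 0.841 | 3 | 6 | 3 |
| I- Moderadamente | Moderately | 5.28 | 5 | 5 | 0.898 | 4 | 7 | 3 |
| J- Razoavelmente | Reasonably | 5.42 | 5 | 5 | 0.899 | 4 | 7 | 3 |
| K- Mais ou menos | More or less | 4.18 | 5 | 5 | 1.211 | 1 | 6 | 5 |
| L- Pouco | Little | 2.27 | 2 | 2 | 1.072 | 0 | 5 | 5 |
| M- Nada | Nothing at all | 0.54 | 0 | 0 | 0.873 | 0 | 3 | 3 |

Source: Authors' research (2013).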

3.2 Procedure

Two adverb lists were presented to the subjects of the study: one about quantity adverbs or adverbial expressions, and another about frequency adverbs or corresponding adverbial expressions. Subjects were asked to evaluate the numerical significance of the adverbs in the list, in accordance with the following.

3.3 The results

3.3.1 Quantity adverbs

As can be seen in Table 5, the adverbs that indicate extreme positive or negative intensities are evaluated more precisely, with less dispersion; those in the middle of the scale are less precise and more dispersed. The same occurred with the frequency adverbs. This result is compatible with theories on psychological judgement:

The study is complemented by the graphs presented in Figures 1, 2 and 3, obtained with Principal Component Analysis for ordinal data as described in Borg and Groenen (2005) (procedure CATPCA in IBM SPSS Statistics); they support the idea that it is possible to build an interval scale based on quantity adverbs, as could be expected.

Figure 1
Main component analysis and discrimination measures of different categories of quantity adverbs

Figure 2
CATPCA - Quantity adverbs: a synthesis
Source: Authors' research (2013).

Figure 3
Main component analysis and discrimination measures of different categories of frequency adverbs

3.3.2 Frequency adverbs

In the case of frequency adverbs, the procedure was similar to that used for quantity adverbs. Subjects were presented with an example of a sentence using a frequency adverb and were asked to evaluate it on a scale like the one shown in the box.

Table 6 shows the results obtained with this procedure.

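Table 6
The evaluated frequency adverbs and their observed statistics in this sample

| Adverbs | Mean | Median | Mode | Std | Min. | Max. |
|---|---|---|---|---|---|---|
| Always | 88.387 | 90 | 100 | 14.39 | 50 | 100 |
| Extremely frequent | 83.785 | 90 | 90 | 13.33 | 50 | 100 |
| Frequently | 68.536 | 70 | 80 | 16.85 | 30 | 100 |
| Many times | 67.177 | 70 | 70 | 16.19 | 30 | 100 |
| Quite enough times | 68.491 | 70 | 70 | 15.83 | 30 | 100 |
| Sometimes | 39.625 | 40 | 40 | 16.59 | 10 | 80 |
| Rarely | 16.012 | 10 | 10 | 9.71 | 10 | 40 |
| Never | 1.283 | 0 | 0 | 3.36 | 0 | 10 |

Source: Authors' research (2013).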

3.4 Practical Applications

The results obtained enable the construction of different equivalent scales, which the researcher can adapt to the sample of respondents, the issue and the situation, according to the research objectives. This facilitates the transfer of Thurstone's perspective to different situations and instruments, enhancing the researcher’s flexibility.

Two examples of these different scales (with 5 or 6 degrees based on the two categories of adverbs) can be found below (Tables 7 and 8).

Table 7
First example: a six position scale
A six-position scale, adjusted to a questionnaire with items, is presented below (responses and their treatment serve only as an example):

Response positions: Extremely important (E); Very important (M); Quite important (B); Medially important (md); Little important (P); Not at all important (N).

| Item | E | M | B | md | P | N |
|---|---|---|---|---|---|---|
| 1. Is it important that the teacher gives incentives, praise and demonstrations of personal esteem to his pupils, to motivate them? | | x | | | | |
| 2. Is it important that the teacher gives a permanent example of openness to new ideas? | x | | | | | |

These items focus on the skill of motivating students to develop more complete and open perceptions of reality. The respondent evaluates item 1 at level M = 7.62 and item 2 at level E = 9.25.
Source: Authors' research (2013).

Table 8
Second example: a five position scale
A five-position scale, adjusted to a questionnaire with items, is presented below:

Response positions: Totally true (T); Quite true (B); Medially true (md); Little true (P); Not at all true (N).

| Item | T | B | md | P | N |
|---|---|---|---|---|---|
| 1. My interactions with other people are not very positive. | | | x | | |
| 2. My productive activities effectively create the resources for my subsistence. I am indeed an effective worker. | x | | | | |

The subject has a positive and assertive self-image as a productive worker, with a score of 9.52; this positive image relies mainly on economic productivity. The relational dimension is not so positive: its distance from the top of the scale is quite large (numerical score 4.62). Current computational devices make these scales easy to use.
Source: Authors' research (2013).
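The scoring illustrated in Tables 7 and 8 amounts to replacing each marked position with the mean numerical value of its adverb. The Python sketch below shows one way this could be done. The dictionaries take their values from the means in Table 5; pairing B with "Quite enough", P with "Little" and N with "Nothing at all" is an assumption based on the adverb labels, and the function name `score` is hypothetical.

```python
# Mean numerical values of the quantity adverbs, taken from Table 5.
# Assumed pairing: E = Extremely, T = Totally, M = Very much,
# B = Quite enough, md = Medially, P = Little, N = Nothing at all.
SIX_POSITION_SCALE = {"E": 9.25, "M": 7.62, "B": 7.08, "md": 4.62, "P": 2.27, "N": 0.54}
FIVE_POSITION_SCALE = {"T": 9.52, "B": 7.08, "md": 4.62, "P": 2.27, "N": 0.54}


def score(responses, scale):
    """Map each item's marked position to the numerical value of its adverb."""
    return {item: scale[mark] for item, mark in responses.items()}


# Responses as described for the two examples (item number -> marked position).
table7_scores = score({1: "M", 2: "E"}, SIX_POSITION_SCALE)
table8_scores = score({1: "md", 2: "T"}, FIVE_POSITION_SCALE)

print(table7_scores)
print(table8_scores)
```

Because the values are means on a common 0 to 10 scale, scores from the five-position and six-position versions can be compared directly, which is the practical benefit the text attributes to these scales.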

3.4.1. Scale examples from quantity adverbs

3.4.2. The Frequency adverb scales

The scales based on frequency adverbs are presented below. A first example of this scale, showing the numerical values of the adverbs and the distances between positions, can be seen in Table 9.

Table 9
First example: an example of a frequency adverbs scale

| Adverbs | Numerical values | Distances |
|---|---|---|
| A- Always | 88.387 | d4 = 19.898 |
| E- Enough times | 68.491 | d3 = 28.766 |
| F- Sometimes | 39.625 | d2 = 23.613 |
| G- Rarely | 16.012 | d1 = 14.729 |
| H- Never | 1.283 | - |

Source: Authors' research (2013). d1 = G-H; d2 = F-G; d3 = E-F; d4 = A-E.
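The distances in Table 9 are differences between the mean values of adjacent positions, as defined in the table note. The Python sketch below recomputes them from the means reported in Table 6; the variable names are hypothetical. Recomputing from the rounded means reproduces d1 and d2 exactly, while d3 and d4 come out slightly different from the tabulated figures, which may reflect rounding or transcription in the source; the tabulated values are left as published.

```python
# Mean numerical values of the selected frequency adverbs (Table 6).
values = {
    "Always": 88.387,
    "Enough times": 68.491,
    "Sometimes": 39.625,
    "Rarely": 16.012,
    "Never": 1.283,
}

# Order the positions from the lowest to the highest mean value.
ordered = sorted(values.items(), key=lambda pair: pair[1])

# d1 = Rarely - Never, d2 = Sometimes - Rarely, and so on up the scale.
distances = {
    f"d{i}": round(upper[1] - lower[1], 3)
    for i, (lower, upper) in enumerate(zip(ordered, ordered[1:]), start=1)
}

for name, value in distances.items():
    print(name, value)
```

Knowing these distances is what allows the scale to be treated as an interval scale: the gap between "Rarely" and "Sometimes" is a known quantity rather than a mere rank difference.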

A second example: the frequency adverbs scale applied to a motivation test (see Note 1)

In its first part, the motivation test confronts value objects and situations in sets of three. The purpose of this part is to confront the person with affectively guided choices (motivational dilemmas), which is the way motivation works according to the motivational theory on which the test is based. In this part, the scale is composed of quantity adverbs.

The test

First part: the subjects mark the sentence they consider as the most important in the set; then, they choose the second one in importance and mark its position in the scale; finally, they mark the third one in relevance.

The questionnaire includes 21 sets like the one presented above, covering the most important areas of daily life: Personal life and family (11 sets); Work (5 sets); Leisure and free time (2 sets); Friends and friendships (3 sets).

Second part: the test measures the affective tone of the subject’s life, by registering - in the frequency adverbs scale – the frequency of emotions felt in his personal life and at work (results shown in Table 10).

Table 10
Frequency adverbs: results from the motivation test

| Situations | Frequency in personal life (%) | Frequency in work (%) |
|---|---|---|
| A- Situations that generate feelings of warmth, sympathy, friendship | 71.39 | 59.69 |
| R- Situations that generate feelings of anger, irritation | 26.09 | 35.97 |
| C- Situations that arouse feelings of curiosity, desire to know, thirst for information | 57.12 | 65.54 |
| N- Situations that arouse feelings of rejection, revulsion, disgust | 9.04 | 23.99 |
| M- Situations that arouse feelings of fear | 23.30 | 27.87 |
| D- Situations that arouse feelings of relaxation and certainty | 58.88 | 46.95 |
| S- Situations that arouse feelings of satisfaction, joy, enthusiasm | 62.17 | 46.96 |
| T- Situations that arouse feelings of sadness, abandonment, depression | 19.51 | 26.24 |
| O- Situations that arouse feelings of pride and a sense of personal importance | 63.69 | 86.44 |
| V- Situations that arouse feelings of guilt and shame | 9.47 | 11.14 |

The items with substantial mean differences are those listed in Table 11. Source: Authors' research (2013).

Using the scale below, please mark the frequency of each feeling in your personal life and in work situations;

Scale for this part:

Table 11 shows that this procedure adequately evidences the contribution of interval scales to a more precise evaluation of behavioral factors and variables.

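Table 11
Paired samples test: the variables with substantial mean differences

| Pair | Mean difference | Std. deviation | 95% confidence interval of the difference: lower | 95% confidence interval of the difference: upper | t | df | Sig. (2-tailed) |
|---|---|---|---|---|---|---|---|
| Pair A | 11.713 | 16.963 | 3.9918 | 19.434 | 3.164 | 20 | 0.005 |
| Pair 4 N | -14.957 | 23.151 | -25.2222 | -4.6923 | -3.03 | 21 | 0.006 |
| Pair 6 D | 11.936 | 28.377 | -0.6454 | 24.5181 | 1.973 | 21 | 0.062 |
| Pair 7 S | 15.211 | 29.525 | 2.1204 | 28.3022 | 2.416 | 21 | 0.025 |

Source: Authors' research (2013).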
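The t values in Table 11 can be checked from the other columns with the standard paired-samples formula, t = mean difference / (standard deviation / sqrt(n)), where n = df + 1. The Python sketch below applies it to the mean differences, standard deviations and degrees of freedom as read from Table 11; the function name `paired_t` and the dictionary layout are hypothetical, and the sketch does not reproduce the authors' SPSS output.

```python
import math


def paired_t(mean_diff, sd_diff, df):
    """t statistic of a paired-samples test from summary statistics."""
    n = df + 1                        # number of pairs
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error


# (mean difference, standard deviation of the differences, df) per pair, from Table 11.
pairs = {
    "A": (11.713, 16.963, 20),
    "N": (-14.957, 23.151, 21),
    "D": (11.936, 28.377, 21),
    "S": (15.211, 29.525, 21),
}

t_values = {name: round(paired_t(*stats), 3) for name, stats in pairs.items()}

for name, t in t_values.items():
    print(name, t)
```

The recomputed statistics agree with the tabulated t values to the reported precision, which supports the reading of the table columns used here.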

3.2.1 Internal consistency

The reliability coefficient for internal consistency (SIJTSMA, 2009) is 0.861 for the 61 items, which means the motivation test presents internal consistency.
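The paper does not name the coefficient behind the 0.861 figure. Assuming it is Cronbach's alpha, a common internal-consistency coefficient, the Python sketch below shows how such a value is obtained from item scores: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The three items and four respondents are hypothetical toy data, not the 61 items of the motivation test.

```python
import statistics


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per item,
    with respondents in the same order in every list."""
    k = len(items)
    sum_item_variances = sum(statistics.variance(scores) for scores in items)
    totals = [sum(per_respondent) for per_respondent in zip(*items)]
    total_variance = statistics.variance(totals)
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical scores of four respondents on three items.
toy_items = [
    [2, 4, 6, 8],
    [3, 4, 7, 8],
    [1, 5, 5, 9],
]

alpha = cronbach_alpha(toy_items)
print(round(alpha, 3))
```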

4 Conclusion

This study opens a path to the elaboration of interval scales suited to several types of psychological and sociological questionnaires, as they can be adapted to different ways of speaking, ages, professional experiences, and cultural settings. It will certainly be useful for the researcher to have different scale options, adjusted to the issue and to the sample under study. If those scales combine the words used by the subjects in their daily life, and if they can be quantified as true interval scales, we obtain a reliable and valid instrument for behavioral research.

This procedure thus allows us to answer the first question affirmatively: the use of adverbs is an interesting basis for constructing scales that combine qualitative and quantitative approaches in a valid and reliable way. It avoids the inaccuracies sometimes seen in Likert scales (LIKERT; ROSLOW; MURPHY, 1993) used by the authors, as shown in the two examples below:

In the first example, the position undecided does not contain an intensity position: it actually means a refusal to express a position, and it is far from expressing frequency reasoning. The second example contains three positions that fall outside the intended scale: agree and disagree express the qualitative meaning of the position but do not indicate its quantitative meaning, and the word neutral is even more explicitly removed from the scale than undecided. They are expressions outside scaling reasoning.

Accordingly, it is believed that the use of adverbs and of frequency scale construction increases accuracy by eliminating ambiguities.

The results of this study also constitute an affirmative answer to the specific question about validity and reliability, as shown throughout the paper and especially in the examples presented.

In behavioral studies, it is quite inappropriate to speak of a zero position; but we hope that this study helps to define a minimum position (equivalent to zero in each population to which the scale is applied). Actually, this is what is used in all behavioral reasoning: there is a zero point for each population in each variable. In view of these results, the objectives of this study have been reached, and it could be useful to other behavioral scientists.

References

BORG, I.; GROENEN, P.J.F. Modern multidimensional scaling: theory and applications. New York: Springer, 2005. (Springer Series in Statistics).

BOZAL, M.G. Escala mixta Likert-Thurstone. Anduli: Revista Andaluza de Ciencias Sociales, n. 5, p. 81-96, 2006.

BUNGE, M. La investigación científica. Barcelona: Siglo XXI, 2000.

CARDELLI, D.T.; ELLIOT, L. Avaliação por diferentes olhares: fatores que explicam o sucesso da escola carioca em área de risco. Ensaio: Avaliação e Políticas Públicas em Educação, v. 20, n. 77, p.769-98, out.-dez. 2012. doi:10.1590/S0104-40362012000400008

KERLINGER, F. N.; LEE, H. B. Foundations of behavioral research. 4th ed. Fort Worth: Harcourt College, 2000.

LE MOIGNE, J. L. L’exercice de la pensée complexe permet l’intelligence des systèmes complexes (interview by Jacques Perrault, Stephanie Proutheau, Edouard Kleinpeter and Alfredo Pena Vega). Hermès, n. 60, p. 157-163, 2011-2012.

LIKERT, R.; ROSLOW, S.; MURPHY, G. A simple and reliable method of scoring the Thurstone attitude scales. Personnel Psychology, v. 46, p. 689-90, 1993.

MORIN, E.; LE MOIGNE, J. L. L’intelligence de la complexité. Paris: L’Harmattan, 1999.

PARREIRA, A. Gestão do stress e da qualidade de vida. Lisboa: Monitor, 2006.

SIJTSMA, K. Reliability beyond theory and into practice. Psychometrika, v. 74, n. 1, p. 169-73, 2009. doi:10.1007/s11336-008-9103-y

SIMON, H.A. CMU as an anti-entropic organization. Focus, v. 17, n. 2, p. 7-8, 1987.

SULBARAN, D. Medición de actitudes. Caracas: Escuela de Psicologia, Universidad Central de Venezuela, 2009.

THURSTONE, L.L.; CHAVE, E.J. The measurement of attitudes. Chicago: University of Chicago Press, 1928.

VIANNA, J.A.; SOUSA, S.M.; REIS, K.P. Bullying nas aulas de Educação Física: a percepção dos alunos no ensino médio. Ensaio: Avaliação e Políticas Públicas em Educação, v. 23, n. 86, p. 73-93, jan./fev. 2015. doi:10.1590/S0104-40362015000100003

Notes

1 This test was fully studied in a paper presented at SMTDA Congress, Lisbon, 2014. Here only applicable results are considered.

Author notes

Author information

Artur Marecos Parreira e Moreira Gonçalves: PhD. Full Professor at Universidade Lusófona de Humanidades e Tecnologias, researcher at CPES, Coordinator of NESC. Contact: arturmparreira@gmail.com

Ana Lorga da Silva: PhD. Associate Professor at Universidade Lusófona, researcher at CPES and at CEDRIC - CNAM. Contact: ana.lorga@ulusofona.pt
