The non-existent link between the logic of reflex and the ideology of natural selection: comments on Leão and Neto (2018)

Emilio Ribes-Iñesta

Received: 24 June 2018

Accepted: 02 October 2018

DOI: https://doi.org/10.5514/rmac.v44.i2.68541

Introduction

Leão and Neto attempt to show some possible links between the reflex conception of the operant in Skinner and his later proposal of selection by consequences. Their analysis may be characterized as an exegetic and hermeneutic essay, stressing possible “link” concepts such as shaping, differentiation, and probability, in an extensive, although not exhaustive, review of Skinner’s writings before 1957. I will argue that there are no logical or conceptual links between the reflex conception of the operant and the notion of selection by consequences, and that the identification of indirect or direct mentions of selective effects of reinforcement does not justify such a theoretical possibility. Additionally, I will show that selection is not descriptive of a process or mechanism, but rather of an outcome. First, I will examine the logical limitations of the concept of the operant and why the reinforcer, as a component of the operant class, cannot exert any differential effect on the class itself. Second, I will provide evidence of the limited effects of reinforcement on response differentiation, which, at the very least, calls into question on empirical grounds the assumption of strong “selective” effects of the reinforcer. In this connection, I will argue against the loose use of the concept of probability in operant theory. Finally, I will deal with conceptual, logical, and empirical problems related to the concept of natural selection and its ideological foundations and implications.

The logic of the reflex and the distinction between respondents and operants

The foundational papers by Skinner on the operant-respondent distinction (Skinner, 1931, 1935, 1937, 1938) were unequivocally based upon the logic of the reflex. The reflex concept was borrowed from the physiology of the nervous system (Sechenov, 1863/1978; Bekhterev, 1913/1953; Pavlov, 1927; Fearing, 1930; Canguilhem, 1955), incorporating the logical analysis of physical movement postulated by Cartesian mechanics. Skinner himself argued for the adequacy of the reflex arc notion as a logical model for the analysis of correlations between stimuli and responses, irrespective of any neural structure being considered. Since I have previously examined the influence of the Cartesian mechanics paradigm on the formulation of the reflex concept and on conditioning theory in general (Ribes, 1996, 1999; Ribes & López, 1985), I will limit my comments here to showing why the concept of the operant cannot be logically related in any way to the notion of natural selection.

Skinner formulated his research program by arguing that the reflex concept could be stripped of any neural content. From an operational point of view, a reflex consisted of the covariation or correlation of changes in stimulus conditions and corresponding changes in response conditions. To identify a reflex simply meant to identify a stimulus-response covariation, irrespective of the neural structures involved. Since the particular stimulus event or response event could vary in some properties without affecting the correlation, Skinner proposed the concepts of stimulus and response classes to cope with the punctate and unrepeatable nature of both kinds of events (Schoenfeld, 1972, 1976; Schoenfeld & Farmer, 1970). On an operational basis, two kinds of reflexes were distinguished: respondent, when a previous stimulus “elicited” the response, and operant, when the response was spontaneous or emitted and a stimulus could be presented as a consequence of its occurrence. This distinction was based solely on whether or not the stimulus provoking or determining the occurrence of the response could be identified, although it was assumed, as in other two-factor theories of that time, that each kind of reflex was mediated by a different nervous subsystem. Skinner (1938) shared this assumption when reporting that he had been unable to condition the pupillary reflex with operant techniques, ignoring that adaptation reflexes cannot be conditioned in any way (Sokolov, 1963).

While respondent reflexes consisted of a covariation or correlation of a stimulus class antecedent to the response class, operant reflexes consisted of a covariation or correlation of a stimulus class subsequent to the response class. Each class was identified by a defining property which made the correlation possible, irrespective of variations in other non-defining properties in the instances of the stimulus and response classes. The defining property shared by both classes was that specified by the contingency relation constituting the reflex as a necessary covariation or correlation. A severe restriction of the defining properties led to the point of the “natural fracture” of the reflex. Skinner (1938) acknowledged, in a footnote of The Behavior of Organisms, Kantor’s observation (following Dewey, 1896) that both components of the reflex are mutually dependent, in such a way that they cannot be considered in isolation from each other. This means that in the operant reflex, the reinforcing stimulus is a component of the correlation and not an external factor affecting the correlation. An operant consists of a correlation or covariation of a given response class and a given stimulus class identified as “reinforcing”. Both classes of events make up the operant correlation, in such a way that it is out of place to identify the operant just in terms of some class of responses as the dependent variable affected by reinforcers (or reinforcement) as the independent variable, as Skinner asserted (1953, 1957). Because of this, it is nonsensical to argue that the reinforcer, or reinforcement, selects operant responses, since reinforcers themselves are defining components of any operant. The operant cannot be differentiated or selected by itself. Second variables affecting the reflex cannot include the reinforcing stimulus. It is a logical flaw to attribute selective properties on responding to the reinforcer.
In any case, since the response and stimulus instances of the operant must necessarily correlate, the covariations in some properties of responding and the presentation of reinforcing stimuli should be seen as the outcome of predetermined operations. On the other hand, empirical evidence does not support an interpretation of “reinforcement” as having differential effects on specific dimensions of responding. Morse (1966) analyzed intermittent reinforcement in terms of the interactions of differential and quantitative effects of reinforcement on responding, effects which are not necessarily symmetrical in spite of the controlling operations established. Given the spatial and response restrictions related to deprivation or noxious stimulation in the operant chamber, rats and pigeons do not have many options for behaving besides pressing the bar or pecking the key. Response patterns under different reinforcement schedules result from persistent behavior directed at one or two operanda, usually identical and proximal to the reinforcer dispenser. Studies of the differential effects of reinforcement have dealt with the patterning of response frequency or some other dimensional property of the bar-pressing response itself. Most studies show that, as Morse commented, quantitative effects on persistence interfere with “clean” differential effects. Two outstanding examples can be cited. The first is performance on differential-reinforcement-of-low-rate (DRL) schedules, in which reinforcement after a pause is usually followed by response bursts (e.g., Holz, Azrin, & Ulrich, 1963). The second is provided by experiments reinforcing specific classes of long inter-response times (IRTs), which show an increase in the frequency of IRTs shorter than those being reinforced (Anger, 1956; Malott & Cumming, 1964), or a decrease in the frequency of short IRTs when these are reinforced (Ferster & Skinner, 1957).
When pauses between responses are reinforced according to a delay procedure (Wilson & Keller, 1953), response frequency is also higher than reinforcement frequency, with performance resulting from the interaction of local periods of extinction and reinforcement. Similarly, the differential reinforcement of duration and effort properties of bar pressing under a continuous reinforcement schedule (CRF) results in rats responding below the prescribed criteria and exposing themselves to intermittent reinforcement (Notterman & Mintz, 1965). I do not attempt to review all the evidence on this issue, but it is questionable, at the very least, to assume that reinforcing stimuli are differential in their effects on responding, even in a situation as restricted as the operant chamber. The conception of the operant as a correlation of classes and the empirical evidence on differential reinforcement do not seem to support any possible “selectionist” view in Skinner’s foundational contributions.
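
For readers unfamiliar with the procedure, the DRL contingency mentioned above can be sketched operationally. The following is only an illustrative sketch, not code from any of the studies cited: the function name `drl_reinforced`, the parameter `min_irt`, the response times, and the convention of timing the first response from session start (t = 0) are all assumptions made for the example.

```python
from typing import List


def drl_reinforced(response_times: List[float], min_irt: float) -> List[bool]:
    """For each response, report whether a DRL contingency would reinforce it.

    A response is reinforced only if at least `min_irt` seconds have elapsed
    since the previous response. Every response, reinforced or not, restarts
    the timer. The first response is timed from session start (t = 0), which
    is an assumption of this sketch.
    """
    reinforced = []
    previous = 0.0
    for t in response_times:
        reinforced.append(t - previous >= min_irt)
        previous = t
    return reinforced


# Hypothetical session: each spaced response is followed by a short burst.
times = [12.0, 12.5, 13.0, 25.0, 25.4, 40.0]
outcome = drl_reinforced(times, min_irt=10.0)

n_responses = len(times)
n_reinforcers = sum(outcome)
```

In this made-up record, six responses produce three reinforcers: the contingency specifies which responses meet the criterion, but it does not by itself fix the pattern of responding that actually occurs.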

Reinforcement and natural selection

The preoccupation with natural selection is not indigenous to operant theory. Darwin (1859) thought that natural selection, the outcome of the struggle for survival, was one of the three factors accounting for evolution, the other two being sexual reproduction and the inheritance of acquired characters (following Lamarck). Natural selection was a concept borrowed from Malthus’s (1798) conception of the negative effects of population growth relative to subsistence resources, used to explain meritocracy in social stratification. Malthus thought that alimentary resources were limited and that population increased in a geometric proportion relative to such resources. This asymmetry in the growth of population and resources led inevitably to misery, illness, hunger, perversions, and destruction. Only the fittest, those in the upper social classes, were able to survive these demographically induced crises. For both Malthus and Darwin (as well as Alfred Russel Wallace), the fittest were those that biologically or socially survived and were able to live in the best circumstances. The struggle for existence is seen as the drive moving biological evolution and social progress. The fittest survive and at the same time reproduce other individuals that are equally able to survive, a statement later formulated by Ronald Fisher (1930) as the genetic theory of natural selection, the cornerstone of the so-called New or Modern Evolutionary Synthesis. This conception never described or explained how such a process could work. It was only the statement of a plain fact: some individuals survive or progress, and some do not. To say that they survived because they were the fittest is completely circular and redundant. Natural selection explains neither why this occurs nor how “nature” selects the best individuals and species.
At least in the case of social formations, the justifications offered by economic, political, and legal systems seem to be more explicit regarding the criteria responsible for the establishment of social classes, meritocracy, and inequity. Malthus’s and Darwin’s selectionist viewpoints were not foreign to the establishment of economic liberalism, which stressed the role of individual entrepreneurs in social development during the first and second industrial revolutions in England. The same may be said of Neo-Darwinism and present-day economic neo-liberalism, and the dominant ideology about the wisdom of markets and the fairness of meritocratic progression. In both cases, racial differences have been posited as explanations of social differences (Galton, 1889; Jensen, 1973; Herrnstein, 1971).

Skinner proposed selection by consequences as an intermediary link between natural selection and what he called survival of cultures (Skinner, 1961, 1966), an unfortunate and oversimplified analogy based on the biological “principle” of struggle for existence.

Reinforcement-based theories have dealt, implicitly or explicitly, with the problem of the backward effect of the reinforcing stimulus, and with its differential or selective correlation or “association” with the varying flux of behavior taking place. Several solutions were offered to these two problems. The backward effect was a crucial issue, since present events cannot affect absent events: responses are not taking place when the reinforcer occurs. Thorndike (1911) assumed that reinforcement (reward) and punishment effects, as response-stimulus connections, had to do with facilitating or interfering neural impulses. Hull (1943) posited the afferent stimulus trace, which, owing to neural transmission speed, could be simultaneously associated with the occurrence of the reinforcer. Guthrie (1935) had no problems at all, either with the backward effect of the reinforcer or with its assumed selective effect: the reinforcer closed the functional episode in which effective responding took place, preventing any other behavior from following. Finally, Skinner dealt with the backward effect problem by postulating the operant as a correlation of response and stimulus classes, in such a way that the reinforcer affected not only one instance of the class but the response class itself. Reinforcement, therefore, affected the recurrence of any instance of such a class in the future. Nevertheless, no account was given of how the response class was organized before the reinforcer presentation, especially when the same operandum was used to establish different operants through the development of reinforcement schedules.

This unsatisfactory solution to the backward effect of reinforcement led Skinner to incorrectly equate recurrence frequency with probability of responding. The same physical instance of responding could be assigned to different operant classes by distinguishing different patterns and frequencies of occurrence. A careful analysis of these criteria shows that, in real-time and long-term periods, it is actually impossible to identify what “kind” of operant is taking place without looking into the schedule operation. Probability is not a measure of events. Probability is an estimate of occurrence, and as a concept it belongs to fields in which phenomena are assumed to be stochastic in nature (which is not the case in psychology), or to calculations about the possible occurrence of an event, such as accidents, storms, lottery outcomes, etc. Basic science deals with probability only in reference to the relative frequency of events which it manipulates as random experimental variables, but not as a measure of their effects. Measures of the relative or absolute frequency of responses cannot be equated with probability of responding. The latter would be an estimate of the occurrence of particular responses, whereas the former are measures of the number of responses that took place in a given period.
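
The two frequency measures referred to above can be made concrete with a short sketch. It is offered only as an illustration under stated assumptions: the function names, the bin width, and the response times are hypothetical and do not come from any study cited here. Both quantities are computed from responses that have already occurred in a given period.

```python
from collections import Counter
from typing import Dict, List


def response_rate(response_times: List[float], period: float) -> float:
    """Absolute frequency measure: responses per unit time in the period."""
    return len(response_times) / period


def irt_relative_frequency(response_times: List[float],
                           bin_width: float) -> Dict[int, float]:
    """Relative frequency of inter-response times (IRTs) by class.

    IRT classes are bins of width `bin_width`; bin 0 covers [0, bin_width),
    bin 1 covers [bin_width, 2 * bin_width), and so on. Assumes at least
    two responses, so that at least one IRT exists.
    """
    irts = [b - a for a, b in zip(response_times, response_times[1:])]
    counts = Counter(int(irt // bin_width) for irt in irts)
    return {bin_index: n / len(irts) for bin_index, n in sorted(counts.items())}


# Hypothetical record of response times (seconds) in a 20-second period.
times = [2.0, 3.0, 4.0, 10.0, 11.0, 20.0]
rate = response_rate(times, period=20.0)
freqs = irt_relative_frequency(times, bin_width=5.0)
```

Here `rate` is a count divided by elapsed time and `freqs` is a tabulation of observed IRTs. Reading either value as the probability of a future response is a further interpretive step that the computation itself does not supply.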

Behavior is not a random process or phenomenon demanding the use of probability to describe its properties, at least in operant theory. Intrinsic variability is not a property of behavioral interactions. Therefore, probability, as an estimate of occurrence, is a concept incompatible with the notion of selection as a differential effect. An inadequate strategy has been to move the selection process backwards by appealing to reinforcement history. The concept of reinforcement history, as it has been used in operant psychology, involves three different problems. First, it is not clear how specific reinforcement histories work to facilitate the occurrence of corresponding operant classes in new or different situations. The simple recurrence of responses in the same situation does not need to be accounted for in terms of their history of reinforcement. Such a use would be equivalent to the concept of memory in cognitive approaches. Second, in contrast to Hull (1943) and Herrnstein (1970), Skinner identified the strength of an operant in terms of its rate of occurrence or performance pattern (Skinner, 1938; Ferster & Skinner, 1957). The previous number or frequency of reinforcers is not an index of operant strength. Therefore, the amount of previous reinforcement cannot be identified with reinforcement history and its assumed facilitating effect on responding in a new situation. And, third, the concept of history cannot be attributed “causal” functions as a replacement for actual events in a situation (Kantor, 1924-1926; Popper, 1957-1961). History, of any kind, does not account for or explain the happening of present events. To say that a given behavior occurred due to its reinforcement history is tantamount to saying that we ignore the circumstances affecting its occurrence.

Skinner was aware of the loose meaning he himself gave to probability in regard to reinforcement. In Verbal Behavior (1957) he argued that frequency of responding was not a significant measure, and that probability, in this case, referred to a specific response occurring to a stimulus with specific properties in the presence of a listener. It is obvious that this is an implicit acceptance of the inadequacy of the notion of probability to account for operant behavior, in humans at least. Given the functional specificity of linguistic interactions, it is nonsensical to conceive their occurrence in terms of probability. Probability, in the case of single events, can only describe the occurrence or nonoccurrence of such an event, since the frequency range is restricted to 1 or 0. In contrast to the restrictions imposed on the spatial dimensions and properties of behavior in the experimental operant chamber, the morphological and episodic properties of verbal behavior cannot be neglected and labeled as non-defining properties. To assume that the verbal community exerts the function of selecting verbal behavior is a gross oversimplification. So-called verbal behavior is nothing other than individual participation in a common, shared social practice. Social behavior in language, to use Walter Benjamin’s expression (1996), cannot be conceived as a series of punctate episodes involving operant interactions between individuals. Language, as a form of life (Wittgenstein, 1953), is beyond the individuals’ behavior.

On the contrary, it provides functional sense to human behavior (Ribes, 1993). The logical grounding of the variation-selection pair is that selection can take place only when there are at least two possible instances among which to choose. But variation is not the same as variability in the sense of randomness, as the genetic theory of natural selection has stated. Variation means changes along identifiable dimensions, and should also be distinguished from variety and variable, the former pointing to an assortment of different things or properties, and the latter to an event that can change in magnitude. Variation, strictly speaking, cannot be considered the condition in which a selection process may take place. Selection may occur only under variety, not under variation or variability. For the reinforcer to “select” responses, at least two different responses would have to be taking place and available simultaneously, and this is physically impossible.

But selection does not consist of a special kind of backward action. Selection is the outcome of actions taking place in the present time. Because of this, the extension of the Darwinian and Malthusian notions of biological and social survival to behavior is especially unfortunate and, to some degree, contrary to the logic of the concepts being used. Operant behavior, if words have any meaning, has to do with acts that affect environmental conditions (including the acts of other individuals). If anything characterizes operant behavior, it is the fact that individuals select consequences, that is, stimulus changes in the environment, by some kind of pertinent acting in each circumstance. Operant behavior is not a case of selection by consequences, but rather it consists of the selection of consequences. Changes in the environment are the outcome, not the antecedent, of operant behavior. Operant behavior selects the environment, in contrast to so-called respondent behavior.

Revisiting Skinner’s foundational papers may represent a unique opportunity to appraise his contributions, contradictions, and omissions. The best acknowledgement of his scientific legacy would be a critical analysis that could help to pave the way for new proposals and perspectives, which, in sum, is the aim of science.

References

Anger, D. (1956). The dependence of interresponse times upon the relative reinforcement of different interresponse times. Journal of Experimental Psychology, 52, 145-161.

Bekhterev, V. (1913/1953). La psicología objetiva. Buenos Aires: Paidós.

Benjamin, W. (1996). On language as such and on the language of man. In M. Bullock & M. W. Jennings (Eds.), Walter Benjamin selected writings. Cambridge, MA: The Belknap Press of Harvard University Press.

Canguilhem, G. (1955). La formation du concept de réflexe aux XVIIe et XVIIIe siècles. Paris: Presses Universitaires de France.

Darwin, C.R. (1859). On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life. London: John Murray.

Dewey, J. (1896). The reflex arc concept in psychology. Psychological Review, 3, 357-370.

Fearing, F. (1930). Reflex action: A study of the history of physiological psychology. Oxford, England: Williams and Wilkins.

Ferster, C.B., & Skinner, B.F. (1957). Schedules of reinforcement. New York: Appleton Century Crofts.

Fisher, R.A. (1930). The genetical theory of natural selection. Oxford: Oxford University Press.

Galton, F. (1889). Natural inheritance. London: Macmillan.

Guthrie, E.R. (1935). The psychology of learning. New York: Harper.

Herrnstein, R.J. (1970). On the law of effect. Journal of the Experimental Analysis of Behavior, 13, 243-266.

Herrnstein, R.J. (1971). I.Q. Atlantic Monthly, 228, 43-64.

Holz, W.C., Azrin, N.H., & Ulrich, R.E. (1963). Punishment of temporally spaced responding. Journal of the Experimental Analysis of Behavior, 6, 115-122.

Hull, C. L. (1943). Principles of behavior. New York: Appleton Century Crofts.

Jensen, A.R. (1973). Educational differences. London: Methuen.

Kantor, J.R. (1924-1926). Principles of psychology. New York: Alfred Knopf.

Malott, R.W., & Cumming, W.W. (1964). Schedules of interresponse time reinforcement. Psychological Record, 14, 211-252.

Malthus, T.R. (1798). An essay on the principle of population. London: J. Johnson, in St. Paul’s Church-Yard.

Morse, W.H. (1966). Intermittent reinforcement. In W.K. Honig (Ed.), Operant behavior: Areas of research and application (pp. 52-108). New York: Appleton Century Crofts.

Notterman, J.M., & Mintz, D.E. (1965). Dynamics of response. New York: John Wiley.

Pavlov, I.P. (1927). Conditioned reflexes: An investigation of the physiological activity of the cerebral cortex. Oxford: Oxford University Press.

Popper, K.R. (1957-1961). The poverty of historicism. London: Routledge and Kegan Paul.

Ribes, E. (1993). Behavior as the functional content of language-games. In S.C. Hayes, L.J. Hayes, H.W. Reese, & T.R. Sarbin (Eds.), Varieties of scientific contextualism (pp. 251-276). Reno, NV: Context Press.

Ribes, E. (1996). Cartesian mechanics, conditioning theory, and behaviorism: Some reflections on behavior and language. Mexican Journal of Behavior Analysis, 22, monographic issue, 119-138.

Ribes, E. (1999). Teoría del condicionamiento y lenguaje: un análisis histórico y conceptual. Ciudad de México: Taurus.

Ribes, E., & López-Valadez, F. (1985). Teoría de la conducta: un análisis de campo y paramétrico. Ciudad de México: Trillas.

Schoenfeld, W.N. (1972). Problems of modern behavior theory. Conditional Reflex, 7, 33-65.

Schoenfeld, W.N. (1976). The ‘response’ in behavior theory. Pavlovian Journal of Biological Science, 11, 129-149.

Schoenfeld, W.N., & Farmer, J. (1970). Reinforcement schedules and the behavior ‘stream’. In W.N. Schoenfeld (Ed.), The theory of reinforcement schedules (pp. 215-245). New York: Appleton Century Crofts.

Sechenov, I. (1863/1978). Los reflejos cerebrales. Barcelona: Fontanella.

Skinner, B.F. (1931). The concept of reflex in the description of behavior. Journal of General Psychology, 5, 427-458.

Skinner, B.F. (1935). The generic nature of the concepts of stimulus and response. Journal of General Psychology, 12, 40-65.

Skinner, B.F. (1937). Two types of conditioned reflex: A reply to Konorski and Miller. Journal of General Psychology, 16, 272-279.

Skinner, B.F. (1938). The behavior of organisms. New York: Appleton Century Crofts.

Skinner, B.F. (1953). Science and human behavior. New York: The Free Press of Glencoe.

Skinner, B.F. (1957). Verbal behavior. New York: Appleton Century Crofts.

Skinner, B.F. (1961). Design of cultures. In B.F. Skinner, Cumulative Record (pp. 36.01-36.12). New York: Appleton Century Crofts.

Skinner, B.F. (1966). The phylogeny and ontogeny of behavior. Science, 153, 1205-1213.

Sokolov, E.N. (1963). Perception and the conditioned reflex. New York: Macmillan.

Thorndike, E.L. (1911). Animal intelligence. New York: Macmillan.

Wilson, M.P., & Keller, F.S. (1953). On the selective reinforcement of spaced responses. Journal of Comparative and Physiological Psychology, 45, 190-193.

Wittgenstein, L. (1953). Philosophical investigations. Oxford: Basil Blackwell.