Abstract: The present experiment examined whether identical manipulations of identical parameters of positive and negative sound reinforcement influenced human response allocation in different ways. Three undergraduate students participated. Progressive-ratio schedules were used to identify preferred and aversive sounds whose contingent presentation (or removal) had similar reinforcing value. These reinforcers then were incorporated into concurrent-operant parameter sensitivity assessments (PSAs) to evaluate whether participant sensitivity to dimensions of certain parameters (i.e., rate, magnitude, delay) differed across positive and negative reinforcement procedures. Sensitivity was identical across the two procedures for two participants, but not for the third. These results demonstrate that positive-reinforcement PSAs do not always predict differential sensitivity to parametric manipulations of negative reinforcement. The implication, that positive and negative processes can have functionally distinct effects, is discussed.
Keywords: choice, concurrent operant, negative reinforcement, parameters of reinforcement, humans.
Research Articles
Evaluating functional differences between positive and negative reinforcement through preference for parameters of sound manipulation
Received: January 22, 2019
Approved: September 18, 2019
Reinforcement is a term that describes a dynamic interaction between environment and behavior in which behavior increases the probability of an event which, in turn, increases the probability of said behavior. Although the reinforcement process is primarily functionally defined, that is, identified by the effect this interaction has on both event and behavior (i.e., acceleration in both cases), behavior-analytic textbooks (e.g., Alberto & Troutman, 2013; Catania, 2013; Cooper, Heron, & Heward, 2007) have distinguished subcategories of reinforcement based on formal properties of the process (i.e., positive reinforcement when contingencies entail stimulus presentations and negative reinforcement when they entail stimulus removals).
The basis for this distinction, however, has been questioned. Specifically, it has been argued that there is little evidence to suggest that the stimulus-presentation process (in and of itself) influences behavior in ways that are different or distinguishable from the stimulus-removal process. The argument is that the inclusion of descriptive qualifiers in an otherwise functionally defined concept can lead to false attribution of functional or ethical significance to properties of reinforcement that are relative, volatile, and likely irrelevant. Most pointedly, it has been argued that this practice leads to an incomplete analysis of environment-behavior interactions that impedes scientific progress (e.g., Baron & Galizio, 2005, 2006a, 2006b; Hineline, 1984; Michael, 1975; Perone, 2003).
Michael (1975) proposed a definition of reinforcement that describes environmental events in terms of contingent context changes (generally), rather than contingent stimulus presentations or removals (specifically), thereby making descriptions of positive and negative reinforcement symmetrical and rendering the distinction meaningless. Although adopting this definition purportedly encourages a deeper appreciation for the complexity of variables at play during the reinforcement process, consensus across the field about the merits of abandoning the positive/negative distinction has not been achieved (e.g., Baron & Galizio, 2005, 2006a, 2006b; Chase, 2006; Iwata, 2006; Lattal & Lattal, 2006; Marr, 2006; Michael, 2006; Nakajima, 2006; Sidman, 2006; Staats, 2006), inviting additional empirical justification for either position (Critchfield & Rasmussen, 2007; Lattal & Lattal, 2006; Magoon, Critchfield, Merrill, Newland, & Schneider, 2017).
One potential difficulty with distinguishing the effects of stimulus presentations from stimulus removals in positive and negative reinforcement processes is that reinforcement is already functionally defined. That is, the classification already isolates and distinguishes events based on their common effect on behavior. Thus, experiments designed to evaluate whether positive and negative processes can have functionally distinct effects on behavior must identify, as dependent variables, secondary factors across which differences might be observed. Proposed secondary variables include whether discriminations develop differently when positive versus negative reinforcement is employed, whether positive and negative processes operate differently on an organism’s physiology, whether specific feelings, or other emotional responses, are consistently evoked by one process or the other, and whether prerequisite antecedent events (i.e., the presence of aversive stimuli) consistently evoke competing responses in ways not true of the alternative, among others (Baron & Galizio, 2005; Catania, 1973; Hineline, 1984; Magoon et al., 2017; Michael, 1975).
Another set of variables across which these processes might be functionally distinguishable could be reflected in their relative effect on behavioral sensitivities to changes in various parameters of reinforcement (e.g., rate, magnitude, delay). For example, variations in dimensions of parameters of reinforcement such as rate or delay can alter response allocation in concurrent-operant choice procedures (Baum, 1974; Herrnstein, 1961, 1970). When preferred dimensions of different parameters of reinforcement are pitted in direct competition with each other, behavior is often consistently biased toward some parameter manipulations over others (e.g., Kunnavatana, Bloom, Samaha, Slocum, & Clay, 2018; Mace, Neef, Shade, & Mauro, 1996; Neef & Lutz, 2001; Neef, Mace, & Shade, 1993; Neef, Mace, Shea, and Shade, 1992; Neef, Shade, & Miller, 1994; Perrin & Neff, 2012).
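The relation referenced here is commonly formalized as the generalized matching law (Baum, 1974); the equation is reproduced below for orientation only, with symbols in their standard usage rather than values specific to the present procedures:

\[ \log\!\left(\frac{B_1}{B_2}\right) = a\,\log\!\left(\frac{R_1}{R_2}\right) + \log b, \]

where \(B_1\) and \(B_2\) denote behavior (responses or time) allocated to two options, \(R_1\) and \(R_2\) denote reinforcers obtained from those options, \(a\) indexes sensitivity to the reinforcement ratio, and \(b\) indexes bias toward one option for reasons other than that ratio.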
If positive processes operate on behavior differently than negative ones, a way to detect those differences may be to measure whether there are process-specific biases toward high-quality dimensions of specific parameters of reinforcement when choices between high-quality dimensions of various parameters are offered. If, for example, responding is biased toward high-quality dimensions of magnitude when positive reinforcement is manipulated, but toward high-quality dimensions of rate when negative reinforcement is manipulated, this would constitute evidence that positive and negative processes can alter the impact of reinforcement in functionally distinct ways. Thus, the purpose of this investigation was to evaluate whether process-specific biases occur in a concurrent-operant choice experiment targeting relative response allocation.
Participants
Participants were three college students, two females, Krista (age 18 years) and Lucy (age 19), and one male, Mike (age 18). Each was paid US $7.50 per hour for time spent in this study. The average duration of participation was 19 hours (range 15 – 21).
Apparatus
Sessions occurred in rooms containing a table and chairs (room size varied by appointment [according to availability] and ranged between approximately 2.4 m × 3 m and 3 m × 3 m). Sound manipulations across all phases of this experiment were made with laptop computers. During parameter sensitivity assessments (PSA), these computers contained an electronic sketchbook designed in Processing®, which timed reinforcement intervals, signaled the onset and offset of periods of reinforcement, and tracked participant-response allocation. For conditions incorporating variable-interval (VI) schedules of reinforcement, schedule interreinforcer interval distributions were generated using the equation of Fleshler and Hoffman (1962) with 8 intervals. Those intervals were randomly selected with replacement. Facilitators always sat next to data collectors and across from participants.
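Because the Fleshler and Hoffman (1962) progression is cited but not reproduced, the following sketch illustrates how an 8-interval VI distribution of the kind described could be generated and then sampled with replacement. It is a minimal illustration based on the published formula, not the authors’ Processing code, and the mean interval shown is a placeholder.

```python
import math
import random

def fleshler_hoffman(mean_interval, n_intervals=8):
    """Return the Fleshler-Hoffman (1962) progression of interreinforcer
    intervals for a VI schedule with the given mean (same units as input)."""
    intervals = []
    N = n_intervals
    for n in range(1, N + 1):
        # The (N - n) * ln(N - n) term is taken as 0 when n == N.
        tail = (N - n) * math.log(N - n) if n < N else 0.0
        t = mean_interval * (1 + math.log(N) + tail
                             - (N - n + 1) * math.log(N - n + 1))
        intervals.append(t)
    return intervals

# Example: a VI 15-s schedule (placeholder value), with individual intervals
# selected randomly with replacement, as described for the PSA options.
vi_15 = fleshler_hoffman(15.0)
next_interval = random.choice(vi_15)
```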
To ensure that sound manipulations did not harm participants, all sounds presented in all phases of this study were played at 80 dB. This is the approximate volume of an active vacuum cleaner 3 m away and is 5 dB quieter than the minimum volume (85 dB) typically associated with hearing damage after prolonged exposure, according to the World Health Organization (“Making Listening Safe,” n.d.). The volume of all sounds was measured using a decibel reader app (Decibel 10th®) on an iTouch® handheld computer from a distance of 0.5 m from the sound source.
Measurement System and Dependent Variables
Trained observers, who were informed of the study’s objectives, used paper-and-pencil data sheets to collect data during sound assessments, PR reinforcer assessments, and post-test preference probes. The laptop computers were used to collect data during PSAs. During sound assessments, preferences were recorded as selections of one of two concurrently available cards labeled “sound” and “silence.” Similarly, during post-test preference probes the dependent variable was the choice between positive- and negative-reinforcement PSAs.
During all PR reinforcer assessments, PSAs, and post-test preference probes, participants could solve double-digit addition problems by picking up a 12.7 cm × 17.8 cm flashcard from a stack, writing a numerical value in a blank space below a printed equation, and handing the completed flashcard to the facilitator. All responses were free operant. During PR reinforcer assessments, the dependent variable was the breakpoint (averaged across three sessions) produced by each response-contingent sound manipulation. Breakpoints were defined as the schedule requirement for the final reinforcer delivered during a given session, that is, the last reinforcer earned before a 1-min pause in responding occurred (after the first 5 min of the session had elapsed). Average breakpoints were calculated by summing session breakpoints and dividing by the total number of sessions conducted for each sound manipulation (i.e., three).
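As a concrete illustration of this summary statistic, the following sketch averages breakpoints from three PR sessions for one sound manipulation; the values are hypothetical placeholders, not data from the study.

```python
def average_breakpoint(session_breakpoints):
    """Average the breakpoints (final schedule requirements completed before a
    1-min pause) obtained across the sessions run for one sound manipulation."""
    return sum(session_breakpoints) / len(session_breakpoints)

# Hypothetical example: breakpoints of 18, 18, and 27 across three PR sessions
# yield an average breakpoint of 21.
print(average_breakpoint([18, 18, 27]))  # 21.0
```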
During PSAs, the primary dependent variable was the relative percentage of time spent working on flashcards available at each of two options. Working time started when one of the two concurrently available stacks of identical flashcards was touched and ended when a completed flashcard was handed to the facilitator. Percentages were calculated as (amount of time at one option / total time at both options) × 100.
Interobserver Reliability
Point-by-point agreement between two observers about the occurrence of dependent variables was scored during 67% of sound-assessment sessions, 54% of PR reinforcer-assessment sessions, and 67% of post-test preference probes. Proportions of agreement (mean-count per interval [10-s bins]) between two observers were scored during 31.9% of all PSAs. Mean interobserver agreement (IOA) for sound assessments and post-test preference probes was 100%. Mean IOA for PR reinforcer assessments was 99.2% (range 89.5% to 100%). During PSAs, mean IOA for “timer on” was 93.8% (range 73.3% to 100%). Mean IOA for “timer off” was 94% (range 76.7% to 100%).
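For orientation, the two agreement metrics reported in this section could be computed as in the sketch below; this reflects the standard formulas as we understand them rather than the authors’ scoring procedures, and the example data are hypothetical.

```python
def point_by_point_ioa(obs1, obs2):
    """Percentage of trials on which two observers recorded the same event."""
    agreements = sum(a == b for a, b in zip(obs1, obs2))
    return 100 * agreements / len(obs1)

def mean_count_per_interval_ioa(counts1, counts2):
    """Mean count-per-interval agreement: within each interval (e.g., a 10-s
    bin), divide the smaller count by the larger (1.0 if both are zero), then
    average across intervals and convert to a percentage."""
    ratios = []
    for c1, c2 in zip(counts1, counts2):
        if c1 == c2 == 0:
            ratios.append(1.0)
        else:
            ratios.append(min(c1, c2) / max(c1, c2))
    return 100 * sum(ratios) / len(ratios)

# Hypothetical 10-s bins of "timer on" events scored by two observers.
print(mean_count_per_interval_ioa([1, 0, 2, 1], [1, 0, 1, 1]))  # 87.5
```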
Procedure
Participants completed 1 to 12 sessions during each appointment; the duration of appointments depended on participant availability. Sound assessments were conducted first to identify a pool of preferred and aversive sounds for use during subsequent progressive-ratio (PR) reinforcer assessments. Next, the PR reinforcer assessments were conducted to identify a matched pair of positive and negative reinforcers (i.e., reinforcers with similar reinforcing values) for use during PSAs. Then, PSAs were conducted to determine whether process-specific biases toward specific parametric manipulations of reinforcement occurred. Finally, post-test preference probes were conducted to determine whether participants described preferring to work with positive- or negative-reinforcement procedures. Because human responding may be insensitive to schedule changes without adjunct procedures to increase discriminability (e.g., Mace et al., 1996), arbitrary and unique stimuli were assigned to each dimension of each parameter of reinforcement manipulated. These stimuli are shown in Table 1.
Sound assessments. Each sound was evaluated during a single session (i.e., a block of three trials). All sessions were conducted at a table with two cards; one card contained the word “sound” and the other contained the word “silence.” Card location was counterbalanced across trials. To minimize extended exposure to aversive sounds, sessions from preferred-sound assessments were interspersed with sessions from aversive-sound assessments.
Preferred-sound assessments. Prior to conducting these assessments, participants were asked to name their favorite song. The reported song was entered into the “new station” bar of the Pandora® website (Pandora® is a website that creates personalized radio stations by compiling a playlist of songs that have similar musical properties to preferred songs reported by the listener). The first five songs that followed the favorite song on the new radio station then were selected. Thus, the sound pool consisted of each participant’s favorite song and an additional five songs possessing some of the favorite song’s musical properties, as defined by Pandora®.
All sessions were started in silence. Prior to each session, the instruction, “when you touch this,” was delivered while the “silence” card was simultaneously touched by the facilitator. Then the instruction, “nothing will change” was delivered. Afterward, the facilitator remained quiet for 30 s. Next, the instruction, “when you touch this,” was delivered while the “sound” card was simultaneously touched by the facilitator. Then the instruction “you get sound” was delivered. Afterward, the relevant sound was turned on for 30 s, and then turned off. Finally, the first trial of the session was initiated with an instruction to “pick one.”
If the “silence” card was touched during any trial of a session, the sound remained off for 30 s and then the session was terminated. When “silence” choices were made, the relevant sound was immediately discarded and a different sound was introduced during the next session. If the “sound” card was touched three consecutive times within a session, that sound became eligible for evaluation during subsequent PR positive reinforcer assessments.

Table 1. (top) Summary of specific values and correlated stimuli associated with high- and low-quality dimensions of reinforcement parameters: rate, magnitude, and immediacy. (middle) Summary of reinforcement parameters available at each response option during PSA baselines. (bottom) Summary of reinforcement parameters available at each response option, along with summaries of proportions of time spent in reinforcement (assuming exclusive responding toward a given option), during PSA tests.
Shaded regions indicate when parameter values differed across reinforcement options. SR = reinforcement; PSA = parameter sensitivity assessment

Sound-escape assessments. All sessions were started with sound. A continuous loop of each potentially aversive sound (e.g., a crying baby, a honking horn, a fire alarm, a variety of tones, static) was played at all times, except when the “silence” card was touched. Prior to each session, the sound was turned on and the instruction, “when you touch this,” was delivered while the “sound” card was simultaneously touched by the facilitator. Then the instruction, “nothing will change,” was delivered. Afterward, the sound continued to play for 30 s while the facilitator remained quiet. Next, the instruction, “when you touch this,” was delivered while the “silence” card was simultaneously touched by the facilitator. Then the instruction, “you get silence,” was delivered. Afterward, the sound was turned off for 30 s, and then turned back on. Finally, the first trial of the session was initiated with an instruction to “pick one.”
If the “sound” card was touched during any trial, the sound played for 30 s and then the session was terminated. When “sound” choices were made, the relevant sound was immediately discarded and a different sound was introduced during the next session. If the “silence” card was touched three consecutive times within a session, that sound became eligible for evaluation during subsequent PR negative reinforcer assessments.
PR reinforcer assessments. The purpose of PR reinforcer assessments (adapted from procedures described by Hodos, 1961; and Knighton, Bloom, Samaha, & Clark, 2012) was to identify preferred and aversive sounds with “matched” average breakpoints (i.e., average breakpoints that fell within 1 PR step of one another) for use during test conditions of PSAs (described below). Three PR reinforcer sessions were conducted for each of three preferred sounds (identified during preferred-sound assessments), three sounds from which the participant consistently escaped (identified during sound-escape assessments), and a “no-consequence” control (to ensure that correct responding was not maintained by automatic reinforcement). Thus, 21 total PR-reinforcer sessions (three for each preferred sound, three for each aversive sound, and three for the no-consequence control) were completed by each participant. PR positive-reinforcer sessions were randomly rotated with PR negative-reinforcer sessions and control sessions to minimize prolonged exposure to aversive sounds. Sounds were selected arbitrarily from the pool of sounds generated by sound assessments described above.
Each PR reinforcer session began with the instruction, “you don’t have to do anything you don’t want to do. If you’d like, you can work to earn (remove) sound. Otherwise, you can interact with this (while a low-preferred item identified via a multiple-stimulus without replacement preference assessment [DeLeon & Iwata, 1996] was indicated), or do nothing at all.” Afterward, the participant could solve math problems from a single stack of flashcards. If an incorrect response was made, the participant was prompted to, “try again.” If a second incorrect response was made, the flashcard was discarded and the participant was prompted to “pick another one.”
Correct responses were reinforced with 30 s of access to (or escape from) sound, according to a PR schedule in which the response requirement increased by 50% (i.e., was multiplied by 1.5 and rounded) following every reinforcer delivery. That is, the first reinforcer was delivered after the first correctly solved math problem. Then, a second reinforcer was delivered after two additional math problems were solved. A third reinforcer was delivered after three correctly solved math problems, a fourth after five, a fifth after eight, and so forth. The session was terminated once no attempts to solve a math problem occurred for 1 min (after the first 5 min had elapsed).
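Read this way, the progression of response requirements can be reproduced with a short sketch; the rounding rule is our assumption, and the code is illustrative only.

```python
def pr_requirements(n_steps, ratio=1.5, start=1):
    """Generate successive progressive-ratio response requirements; each
    requirement is the previous one multiplied by `ratio`, rounding halves up."""
    requirements = [start]
    for _ in range(n_steps - 1):
        requirements.append(int(requirements[-1] * ratio + 0.5))
    return requirements

print(pr_requirements(9))  # [1, 2, 3, 5, 8, 12, 18, 27, 41]
```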
Parameter sensitivity assessments. Each participant was exposed to baseline and test conditions of a concurrent-operant PSA (adapted from procedures described by Neef & Lutz, 2001). During PSAs, a participant could respond and obtain reinforcement in the same manner described for PR reinforcer assessments, with a few variations. First, reinforcers were delivered according to a VI schedule instead of a PR schedule. Second, a participant could respond on either of two concurrently available stacks of flashcards (as opposed to one in the PR reinforcer assessment) and could change the stack from which he or she worked at any time during any session (each option was under the control of an independent VI schedule and there was no changeover delay). Third, reinforcement parameters (along with schedule-correlated stimuli) varied by response option within each session, as well as across conditions. Fourth, session duration was 10 min. Finally, if a participant was not actively engaged in a task for 10 s, he or she was prompted to work.
Prior to starting each session, schedule-correlated stimuli (see Table 1) were arranged under two identical stacks of flashcards approximately 0.5 m in front of the participant, 30 cm apart. Specifically, two 21.5 cm × 28 cm mats, with distinctive symbols printed on their front left and front right corners, were placed under each stack of flashcards. Mat-color was consistently paired with VI schedules (rate), left-symbols were consistently paired with durations of reinforcement (magnitude), and right-symbols were consistently paired with reinforcement delays (delay). After stimuli were arranged, the participant was instructed, “You can work on either option to earn (or escape) sound. During each session, try to be sensitive to differences between the two options while you work to produce desirable consequences.” In a fashion similar to pre-exposures described above (e.g., “when you do this… you get this…”), the participant was exposed to contingencies of reinforcement available at each option. Contingencies at either option during these pre-exposures were not described verbally for the participant. Following pre-exposures, the instruction “pick one” began the session.
During sessions, correct responses made before a scheduled interval had elapsed produced the vocal prompt, “pick one.” Correct responses made after a scheduled interval had elapsed produced the programmed reinforcer (i.e., sound presentation or removal). When a reinforcer was earned, flashcards were covered at both response options, timing of VI intervals at both response options stopped, and target sounds were delivered or removed according to programmed reinforcement parameters (which varied by condition). This included time spent in 30-s delays between reinforced responses and reinforcer deliveries, which were programmed into some sessions of PSA baseline and test conditions, as well as some post-test preference probes. Following reinforcement, the VI interval timers restarted, response options again were available, and participants again were instructed to “pick one.”
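The contingency logic in this paragraph can be summarized schematically. The sketch below is a minimal reading of that description (independent VI timers per option, no changeover delay, both timers suspended during reinforcer delivery and programmed delays); it is illustrative only, is not the authors’ Processing sketchbook, and all names and parameters are ours.

```python
import random

class ConcurrentVIOption:
    """One response option in the concurrent arrangement: an independent VI
    timer plus that option's programmed reinforcement parameters."""

    def __init__(self, vi_intervals, magnitude_s, delay_s):
        self.vi_intervals = vi_intervals  # e.g., a Fleshler-Hoffman distribution
        self.magnitude_s = magnitude_s    # duration of sound access (or escape)
        self.delay_s = delay_s            # programmed delay before the reinforcer
        self.elapsed = 0.0
        self.current_interval = random.choice(vi_intervals)

    def tick(self, dt):
        """Advance this option's VI timer. A session loop would call this only
        when reinforcement is not being delivered, because both timers stopped
        during reinforcer delivery (and during any programmed delay)."""
        self.elapsed += dt

    def correct_response(self):
        """Return True if the scheduled interval has elapsed (a reinforcer is
        due); otherwise the session simply prompts 'pick one' again."""
        if self.elapsed >= self.current_interval:
            self.elapsed = 0.0
            self.current_interval = random.choice(self.vi_intervals)
            return True
        return False
```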
Baseline. The purpose of baseline was to screen for differential sensitivity to high-quality dimensions of each parameter of reinforcement (i.e., immediate, high rate, or high magnitude). During baseline, arbitrarily selected sounds (not matched) available from pools generated through preferred-sound and sound-escape assessments were delivered contingent upon responding to evaluate whether high-quality dimensions could control greater than 50% response allocation when alternative options delivered low-quality (i.e., delayed, low rate, or low magnitude) dimensions of the same parameter, when dimensions of all other parameters were constant (see Table 1).
The order of baseline conditions varied by participant. Each condition (e.g., SR+ rate) was completed before subsequent conditions (e.g., SR+ magnitude) were initiated. Sessions within a condition continued until a participant allocated more than 50% of her or his responding to the high-quality option of the relevant parameter of reinforcement across three consecutive sessions. The relative location (i.e., left versus right) of high-quality dimensions of each parameter (as well as schedule-correlated stimuli) was counterbalanced across sessions within each condition.
Test. High-quality dimensions of two parameters of reinforcement were pitted against one another (e.g., high rate versus high magnitude) in a concurrent schedule. Correct responding on either of two concurrently available (identical) stacks of math flashcards was reinforced according to the conditions shown in Table 1. Contingencies with high-quality dimensions of one parameter (e.g., rate) contained low-quality dimensions of the alternative (e.g., magnitude). Dimensions of the third parameter (e.g., delay) were constant across options. The relative location (i.e., left versus right) of high-quality dimensions of each parameter (as well as schedule-correlated stimuli) was counterbalanced across sessions. Test conditions terminated when stable patterns of responding across three consecutive sessions were observed. The entire assessment ended when high-quality dimensions of each parameter had been pitted against high-quality dimensions of all other parameters using both positive- and negative-reinforcement procedures.
Positive-reinforcer PSAs were completed before negative-reinforcer PSAs were initiated. Likewise, test conditions were completed in the same order (i.e., rate versus magnitude, delay versus rate, then magnitude versus delay). The values selected for each parameter of reinforcement (i.e., rate, magnitude, & delay) were chosen so that exclusive responding at either option would produce the same proportion of reinforcement across equivalent periods of real time (see Table 1). This was done to increase the probability that noted biases (if observed) would be a product of differential sensitivity to parametric manipulations, rather than simple maximization of reinforcement.
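As a hypothetical illustration of this constraint (the actual values used appear in Table 1): if exclusive responding at one option produced, on average, 30 s of reinforcement for every 15 s of interreinforcer interval (VI 15 s), and exclusive responding at the other produced 60 s of reinforcement for every 30 s (VI 30 s), both options would yield the same proportion of time spent in reinforcement:

\[ \frac{30}{15 + 30} = \frac{60}{30 + 60} = \frac{2}{3} \approx 67\%. \]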
Post-test preference probes. These probes were similar to PSA test sessions, with important modifications. First, participants were given the choice to work with either positive- or negative-reinforcement procedures. Prior to each choice, the specific parameters of reinforcement available at each option were described while correlated stimuli corresponding to each dimension were indicated (e.g., “when you work at this option for an average of 15 s, you will get 30 s of reinforcement following a 30-s delay”). Next, the participant was asked if she or he wanted to work to produce positive reinforcement (i.e., the preferred song) or negative reinforcement (i.e., escape from the aversive sound). If negative reinforcement was chosen, the aversive sound was turned on and the instruction, “pick one,” was delivered. If positive reinforcement was chosen, the instruction, “pick one,” was delivered. Participants then were allowed to work at either response option. Probes ended after a single reinforcer at either response option was earned. Following reinforcement, a new probe was initiated. Probes were conducted for each comparison until three consecutive choices for positive or negative reinforcement were made for each of the comparisons arranged during PSA tests (e.g., rate vs. magnitude). Comparisons were presented in the same order as they were during PSA tests (i.e., rate vs. magnitude, delay vs. rate, and magnitude vs. delay).
Procedural Fidelity
During PSAs, an independent observer evaluated procedural fidelity across 25.2% of all sessions using a “yes/no” checklist that identified important session components (e.g., stimuli manipulated, schedules of reinforcement, reinforcer delivery, session duration). Session fidelity was calculated by dividing the number of components scored “yes” by the sum of “yes” and “no” scores and multiplying by 100. Mean fidelity during PSAs was 99.1% (range 91% to 100%).
Results
Results of PR reinforcer assessments are summarized in Table 2. Results of PSA baseline and test conditions, as well as post-test preference probes, are summarized in Table 3. The choice performance of each participant during the final three sessions of each PSA baseline condition is shown in Figure 1. During baseline, more than 50% of responding was allocated toward response options with high-quality dimensions of the three parameters of reinforcement targeted in this study (i.e., rate, magnitude, delay) when the alternative produced low-quality dimensions of those same parameters. The criterion of differential sensitivity to high-quality dimensions of each parameter thus was achieved, advancing each participant to the PSA tests. Because positive and negative reinforcers used during baseline were not matched, their relative effects on response allocation were not compared.
Results of the final three sessions of each PSA test condition for each participant are shown in Figure 2. For Mike, matched positive and negative reinforcers used during PSA tests produced average breakpoints of 21 (range 18-27) and 22 (range 12-27), respectively. With both positive and negative reinforcer PSAs, Mike’s choices were most sensitive to high-quality magnitude manipulations, followed by rate, then delay. When rate competed with magnitude, the average relative response allocation toward magnitude across the final three sessions of the condition was 68% (range 57-82%) in the positive reinforcer PSA and 65% (range 63-69%) in the negative reinforcer PSA. When delay competed with rate, the average relative response allocation toward rate across the final three sessions of the condition was 100% in the positive reinforcer PSA and 94% (range 82-100%) in the negative reinforcer PSA. When magnitude competed with delay, the average relative response allocation toward magnitude across the final three sessions of the condition was 99% (range 97-100%) in the positive reinforcer PSA and 100% in the negative reinforcer PSA. During post-test preference probes, positive reinforcement was selected to the full exclusion of negative reinforcement across all test conditions.


Table 3. Results of baseline and test conditions of positive and negative reinforcer PSAs for Mike, Krista, and Lucy (respectively).
Bolded, underlined text indicates the preferred parameter during PSAs. Italicized text indicates where outcomes differed by process. PSA = parameter sensitivity assessment; R = rate; M = magnitude; D = delay; SR+ = positive reinforcement; SR− = negative reinforcement


Figure 2. Results of participants’ PSA tests. Positive-reinforcement tests for each participant are displayed in the first panel. Negative-reinforcement tests for each participant are displayed in the second panel. Parameter rankings are indicated by numbered text to the right of each graph, based on relative influence over behavior during relevant tests. Shaded text indicates cases in which rankings differed by reinforcement process. R = rate. M = magnitude. D = delay. V = versus.
For Krista, matched positive and negative reinforcers used during PSA tests produced average breakpoints of 7 (range 2-12) and 6 (range 3-8), respectively. In both positive and negative reinforcer PSAs, Krista’s choices were most sensitive to high-quality delay manipulations, followed by magnitude, then rate. When rate competed with magnitude, the average relative response allocation toward magnitude across the final three sessions of the condition was 59% (range 57-63%) in the positive reinforcer PSA and 81% (range 79-82%) in the negative reinforcer PSA. When delay competed with rate, the average relative response allocation toward delay across the final three sessions of the condition was 76% (range 61-80%) in the positive reinforcer PSA and 86% (range 85-88%) in the negative reinforcer PSA. When magnitude competed with delay, the average relative response allocation toward delay across the final three sessions of the condition was 81% (78-83%) in the positive reinforcer PSA and 80% (79-80%) in the negative reinforcer PSA. During post-test preference probes, positive reinforcement was selected to the full exclusion of negative reinforcement across all test conditions.
For Lucy, matched positive and negative reinforcers used during PSA tests produced average breakpoints of 5 (range 1-12) and 5 (range 0-12), respectively. In positive reinforcer PSAs, Lucy’s choices were most sensitive to high-quality magnitude manipulations, followed by rate, then delay. Conversely, in negative reinforcer PSAs, her responses were most sensitive to high-quality delay manipulations, followed by magnitude, then rate. When rate competed with magnitude, the average relative response allocation toward magnitude across the final three sessions of the condition was 77% (62-93%) in the positive reinforcer PSA and 73% (59-92%) in the negative reinforcer PSA. When delay competed with rate, the average relative response allocation toward rate across the final three sessions of the condition was 70% (range 60-83%) in the positive reinforcer PSA. In the negative reinforcer PSA, the average relative response allocation toward delay across the final three sessions of the condition was 95% (range 90-97%). When magnitude competed with delay, the average relative response allocation toward magnitude across the final three sessions of the condition was 76% (70-87%) in the positive reinforcer PSA. In the negative reinforcer PSA, the average relative response allocation toward delay across the final three sessions of the condition was 83% (78-87%). As was the case for Mike and Krista, positive reinforcement was selected to the full exclusion of negative reinforcement across all test conditions during Lucy’s posttest preference probes.
Discussion
One barrier to comparing the effects of positive and negative reinforcement is that it is difficult to design valid procedures for making such comparisons. Commentators on the matter (e.g., Lattal & Lattal, 2006; Magoon & Critchfield, 2008; Magoon et al., 2017) have suggested that empirical inquiry may be possible if researchers hold constant, across reinforcement processes, the intensities of motivating operations, stimulus qualities, and experimental procedures. In response, this experiment targeted the same intensity (i.e., 80 dB) of biologically relevant stimulation (as opposed to tokens or points; for a rationale, see Alessandri & Riviere, 2013; Alessandri & Cancado, 2015) from a single stimulus class (i.e., sound) whose contingent manipulation could produce both positive and negative reinforcement. To avoid difficulties with interpretation, all stimuli selected were likely to operate on participants’ physiologies in similar ways. Further, contingencies were asymmetrical; that is, the presence of either manipulated stimulus did not automatically entail the absence of the alternative (and vice versa). Finally, dimensions of stimuli were manipulated across ranges that precluded the introduction of colloquial baggage (e.g., transitions from loud to quiet or hot to cold).
For two participants (i.e., Krista & Mike), hierarchies of influence produced by positive and negative PSAs corresponded perfectly, suggesting no differences in effect as a function of whether the reinforcing stimuli were presented or removed. For the other participant (Lucy), hierarchies of influence differed by PSA, suggesting the impact of reinforcement contingencies might be modified by process-specific variables. Specifically, when behavior was maintained by positive reinforcement, Lucy’s response allocation was most effectively controlled by reinforcement magnitude (followed by rate, then immediacy). Conversely, when behavior was maintained by negative reinforcement, Lucy’s response allocation was most effectively controlled by variations in the immediacy of reinforcement (followed by magnitude, then rate).
Two limitations of this study should be noted. First, VI intervals were programmed with replacement, and rates of obtained reinforcement were not tracked. Thus, it is neither possible to evaluate the degree to which obtained rates of reinforcement conformed to programmed rates nor to conduct matching analyses (e.g., Baum, 1974). Second, PSA comparisons were conducted using an AB design (SR+ PSA, then SR− PSA). Thus, we did not establish experimental control over Lucy’s differential sensitivity to reinforcement parameters across processes.
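For readers interested in what the forgone matching analyses would involve, the sketch below fits the generalized matching law introduced earlier by ordinary least squares on log ratios; the data shown are hypothetical placeholders, since obtained reinforcement rates were not recorded in this study.

```python
import math

def fit_generalized_matching(behavior_ratios, reinforcer_ratios):
    """Least-squares fit of log(B1/B2) = a*log(R1/R2) + log(b).
    Returns the sensitivity parameter a and the bias parameter b."""
    xs = [math.log10(r) for r in reinforcer_ratios]
    ys = [math.log10(b) for b in behavior_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, 10 ** intercept  # a (sensitivity), b (bias)

# Hypothetical session data: ratios of time allocated and reinforcers obtained.
a, b = fit_generalized_matching([2.0, 1.2, 0.5], [1.8, 1.0, 0.6])
```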
Notwithstanding these limitations, Lucy’s results may still be important because they appear to represent an exception to a rule whose certainty is prerequisite to the definitional shift proposed by Michael (1975). Specifically, Lucy’s positive-reinforcement PSA did not accurately predict her sensitivity to manipulations of parameters of negative reinforcement. This finding challenges the premise upon which Michael’s alternative definition can be justified. Specifically, a definition which marginalizes the value of the positive/negative distinction, disincentivizes the discrimination, and discourages distinct lines of empirical inquiry can only be warranted if the effect of the stimulus-presentation process is truly functionally indistinguishable from the effect of the stimulus-removal process. Lucy’s results introduce a degree of uncertainty to this assumption.
Although the purpose of this study was to examine whether positive and negative reinforcement processes can operate on behavior in distinguishable ways, our findings may also have applied implications. Through considerations of individuals’ idiosyncratic differential sensitivities to certain parameters of reinforcement, applied researchers have identified ways to adapt function-based treatments in a manner that effectively suppresses problem behavior without the use of extinction (e.g., Athens & Vollmer, 2010; Kunnavatana et al., 2018). Lucy’s results, however, should lead to scrutinizing the circumstances under which this is possible. For example, future research might evaluate the extent to which treatments for negatively reinforced problem behavior (e.g., escape from demands) can be effective when informed by PSAs conducted using positive reinforcers (e.g., preferred sounds, tangibles). Until this research is conducted, practitioners might take care not to overgeneralize the results of positive-reinforcement PSAs when making treatment decisions about problem behavior maintained by negative reinforcement.
When choice was available (i.e., during post-test preference probes), positive-reinforcement PSAs were exclusively chosen over negative-reinforcement PSAs. This is noteworthy because PR reinforcer assessments indicated that the specific positive and negative reinforcers manipulated during PSAs had near-identical effects on behavior (i.e., they were “matched”). Because preference and reinforcement are related constructs, and because the amount of behavior a reinforcer supports often correlates with preference (cf. DeLeon, Frank, Gregory, & Allman, 2009; Glover, Roane, Kadey, & Grow, 2008; Penrod, Wallace, & Dyer, 2008), one might be tempted to equate the relative impact of a reinforcer with relative preference for the conditions under which said impact was possible. However, this would be ill advised. In this experiment, functionally matched positive and negative reinforcers did not produce indifference for the contexts in which those reinforcers were delivered.
Selecting negative reinforcement during post-test preference probes would have entailed selecting a context in which aversive stimulation was frequently and consistently presented for the sole purpose of shaping and reinforcing behavior meant to remove it. That no one chose this should not be surprising. However, it remains a significant finding. To expedite the learning process, applied behavior analysts will often contrive instructional contexts that artificially establish consequences as reinforcers (e.g., Tiger, Hanley, & Bruzek, 2008). Thus, plausible applied analogues to the post-test preference probes are likely to exist (e.g., cases in which independent choices to enter classrooms or therapy rooms are not made, inside of which effective reinforcement contingencies actively maintain responding).
Certainly, aversive situations are not the exclusive domain of negative reinforcement (Baron & Galizio, 2006a, 2006b; Michael, 2006; Perone, 2003). However, by definition, contact with them is prerequisite to it. Thus, the probability of conditioning instructional contexts with aversive qualities seems relatively high if practitioners consistently artificially contrive negative reinforcement opportunities. The effect of this conditioning may not be apparent when analysis is constrained to primary effects on targeted responses (cf. PSA results for Mike & Krista). However, it may become clearer when effect on collateral behavior (e.g., approach or avoidance responses for the instructional context) is taken into consideration (Staats, 2006). Future researchers might further explore the short- and long-term direct and collateral effects of applied programming that exclusively employs functionally matched positive or negative reinforcement procedures. They might also explore parametric analyses of systematic combinations of positive and negative reinforcement (e.g., 70% positive, 30% negative) most likely to produce optimal outcomes across both short- and long-term time frames.
An earlier draft of this manuscript served to fulfill the dissertation requirements of the first author. Correspondence concerning this article should be addressed to Joseph M. Lambert, Department of Special Education, Vanderbilt University, Nashville, TN 37203. Phone: (615) 343-0824. Fax (615) 343-1570.
We thank Katie Blakely who assisted in conducting the study and Thomas Higbee, Timothy Slocum, and Michael Twohig for their comments on an earlier version of this manuscript.
