    Current applications and potential future directions of reinforcement learning-based Digital Twins in agriculture
    Citations: 3 | References: 74 | Related papers: 10
    Recent research findings suggest that the initial reductive effects of noncontingent reinforcement (NCR) schedules on destructive behavior result from the establishing effects of an antecedent stimulus (i.e., the availability of "free" reinforcement) rather than extinction. A number of authors have suggested that these antecedent effects result primarily from reinforcer satiation, but an alternative hypothesis is that the individual attempts to access contingent reinforcement primarily when noncontingent reinforcement is unavailable, but chooses not to access contingent reinforcement when noncontingent reinforcement is available. If the satiation hypothesis is more accurate, then the reductive effects of NCR should increase over the course of a session, especially for denser schedules of NCR, and should occur during both NCR delivery and the NCR inter-reinforcement interval (NCR IRI). If the choice hypothesis is more accurate, then the reductive effects of NCR should be relatively constant over the course of a session for both denser and leaner schedules of NCR and should occur almost exclusively during the NCR interval (rather than the NCR IRI). To evaluate these hypotheses, we examined within-session trends of destructive behavior with denser and leaner schedules of NCR (without extinction), and also measured responding in the NCR interval separate from responding in the NCR IRI. Reductions in destructive behavior were mostly due to the participants choosing not to access contingent reinforcement when NCR was being delivered and only minimally due to reinforcer satiation.
    Extinction (optical mineralogy)
    Stimulus (psychology)
    An attempt was made to modify a socially desirable response of mental patients. It was found that instructions to the patients had no enduring effect unless accompanied by reinforcement. Also, it was found that reinforcement was not effective unless the reinforcement procedure was accompanied by instructions that specified the basis for the reinforcement. Maximum change in behavior was obtained when the reinforcement procedure took advantage of the existing verbal repertoire of the patients. A significant methodological finding was that substantial modification of the behavior of psychotics could be achieved by briefly delaying, rather than withholding, food reinforcement.
    Citations (208)
    Using 3 levels of pre-experimental set (high, low, none) and 3 levels of noncontingent external E reinforcement (75%, 25%, none), 2 types of self-reinforcement were investigated. 90 students responded to problem statements and then were asked to indicate those responses which they felt were helpful (self-reinforcement) and those which would be particularly helpful and deserving of reward. No differences were found in either self-reinforcement or “deserving of reward” rate for the 3-set groups. E-reinforcing rate produced significance in self-reinforcing rate with those receiving 75% E-reinforcement giving the greatest number of self-reinforcement and those receiving 75% E-reinforcement the least. There was no sex difference for self-reinforcement. The only significant difference in the “deserving of reward” rate was for sex, with males feeling more of their responses were deserving of reward than did females.
    Citations (4)
    Three experiments were conducted to investigate the theoretical reduction of rate and duration of reinforcement to their product, rate of reinforcement‐time, under concurrent chain schedules. In Exp. I, rate of reinforcement‐time was varied by varying rate of reinforcement delivery, holding duration of reinforcement availability constant; in Exp. II, rate of reinforcement‐time was varied by holding rate of reinforcement delivery constant and varying duration of reinforcement availability; in Exp. III, rate of reinforcement‐time was held constant by varying both rate and duration of reinforcement simultaneously and inversely. For all three experiments, both relative rate of responding and relative time spent in the initial link were found to match approximately the relative rate of reinforcement‐time arranged in the terminal link. These data were interpreted as support for the notion that rate and duration of reinforcement may be functionally equivalent and reducible to a single variable, rate of reinforcement‐time.
    Operant conditioning
    Constant (computer programming)
    Citations (37)
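    The concurrent-chain abstract above treats rate and duration of reinforcement as reducible to their product, rate of reinforcement-time, and reports that relative responding approximately matched the relative value of that product. A minimal sketch of that arithmetic follows, assuming a strict matching relation; the function names and the numeric schedule values are hypothetical illustrations, not taken from the study.

```python
def reinforcement_time(rate_per_hour: float, duration_s: float) -> float:
    """Rate of reinforcement-time: reinforcers per hour times seconds of access each."""
    return rate_per_hour * duration_s


def predicted_relative_responding(rate_a: float, duration_a: float,
                                  rate_b: float, duration_b: float) -> float:
    """Share of initial-link responding predicted for alternative A,
    assuming it strictly matches A's share of reinforcement-time."""
    a = reinforcement_time(rate_a, duration_a)
    b = reinforcement_time(rate_b, duration_b)
    if a + b == 0:
        raise ValueError("at least one alternative must deliver reinforcement")
    return a / (a + b)


# Hypothetical values: rate varied, duration held constant (as in Exp. I).
vary_rate = predicted_relative_responding(60, 3, 20, 3)

# Hypothetical values: rate and duration varied inversely, product constant (as in Exp. III).
constant_product = predicted_relative_responding(60, 2, 30, 4)

print(vary_rate, constant_product)
```

    Under these assumed values the first call gives a 3:1 split in favor of the richer alternative, while the second gives an even split because both alternatives have the same product. The abstract reports only approximate matching, so real data would deviate from these idealized predictions.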
    Patterns of operant emission produced by intermittent reinforcement schedules have been explored extensively by numerous investigators (3). The result has been a general conclusion that variable ratio reinforcement schedules are more applicable to social behavior and, in addition, produce higher and more stable operant emissions (2). While a rationale has been provided for the applicability of variable ratio reinforcement to social behavior, it is significant that there has not been an adequate explanation for the higher efficiency of variable ratio reinforcement. Such an explanation seems imperative in view of the general acceptance of the effects of differential reinforcement, which suggest that the emission of a particular operant rather than others which would produce the same reinforcer is some function of that operant's ability to produce the reinforcer in a greater amount, at a higher frequency and with a higher probability. If differential reinforcement is logically faithful to the basic assumptions underlying operant theory, it seems that conclusions drawn about the efficiency of any intermittent schedule must be altered. Illustratively, given several different operants all of which produce the same reinforcer, and all linked to different intermittent reinforcement schedules, the future occurrence of any one of the operants should be a function of whether its production of the reinforcer more closely approximates continuous reinforcement than the others. Thus, under certain conditions, e.g., when intervals between reinforcements are sufficiently attenuated, an FI schedule might be much more efficient than VR schedules, VI schedules or DRL schedules. Succinctly, regardless of the reinforcement schedule employed, that operant linked with the schedule producing the greatest amount, frequency, and probability of reinforcement should be the operant most likely to occur, with the schedule linked to that operant defined as the most efficient.
Commensurately, the most important variable to be considered regarding reinforcement schedules is not the generic type, but rather, the degree to which the schedule employed provides reinforcement on a continuous basis prior to the onset of satiation.
    Operant conditioning
    Differential reinforcement
    Citations (0)
    Positive reinforcement was more effective than negative reinforcement in promoting compliance and reducing escape‐maintained problem behavior for a child with autism. Escape extinction was then added while the child was given a choice between positive or negative reinforcement for compliance and the reinforcement schedule was thinned. When the reinforcement requirement reached 10 consecutive tasks, the treatment effects became inconsistent and reinforcer selection shifted from a strong preference for positive reinforcement to an unstable selection pattern.
    Extinction (optical mineralogy)
    Citations (77)
    This study examined the effects of reinforcement and reinforcement plus information on both appropriate and inappropriate behavior in subjects provided with direct reinforcement and those seated adjacent to them. Four female kindergarten subjects who were of average intelligence were chosen on the basis of engaging in a relatively high percentage of inappropriate behavior. The subjects were randomly assigned to one of two pairs and within each pair, one subject was randomly designated as the one to be administered direct reinforcement (target subject). The remaining subject in each pair (non-target subject) received no direct reinforcement but was seated adjacent to the target subject. Each pair of subjects was then exposed to seven experimental conditions: baseline, reinforcement for appropriate behavior, reversal, reinforcement for inappropriate behavior, reinforcement for appropriate behavior with information about the contingencies, reinforcement for inappropriate behavior with information about the contingencies, and reinforcement for appropriate behavior with information about the contingencies. Changes in the non-target subjects were observed as a function of witnessing a target subject receive reinforcement for appropriate behavior. When inappropriate behavior was reinforced in the target subjects, only slight changes were observed in the non-target subjects. Information about the contingencies increased the effectiveness of reinforcement in all subjects. This was particularly relevant to inappropriate behavior. The results are discussed with regard to the vicarious reinforcement literature and with regard to the efficacy of providing information along with reinforcement in order to augment it.
    Citations (0)
    A differential-reinforcement-of-other-behavior (DRO) schedule with trials and delayed reinforcement was investigated. Periodically a wheel was briefly available to rats, followed six seconds later by brief availability of a bar. Variable-ratio food reinforcement of wheel turns was adjusted to give 95% turns. After variable-ratio-five reinforcement of bar presses produced 100% pressing, then separate ratio schedules were used for presses following turns (turn presses) and presses following nonturns (nonturn presses). Increasing nonturn-press reinforcements decreased turns, even though total reinforcements increased. Reversal by decreasing nonturn-press reinforcements raised turns, though with hysteresis. Thus food reinforcement increased nonturns even though delayed six to ten seconds after nonturns, a delay that greatly reduces response reinforcement. Those and other results indicate that the turn decrease was not due to reinforcement of competing responses. Evidence against other alternatives, and the reduction of responding by increased reinforcement, indicate that the term inhibition is appropriate for the phenomenon reinforced. Response-specific inhibition appears appropriate for this particular kind, since its effects are more specific to particular responses than Pavlovian conditioned-inhibition. Response-specific inhibition seems best considered a behavioral output comparable to responses (e.g., both reinforcible) but with important properties different from responses (e.g., different reinforcement-delay gradients).
    Differential reinforcement
    Bar (unit)
    Citations (9)
    To understand the bonding properties between reinforcement and concrete and to make full use of the material properties, a large body of research has been carried out on reinforced concrete. Existing studies mainly address the reinforcement-concrete bond, reinforcement lap, and reinforcement anchorage. The reinforcement-concrete bond test mainly measures the bond-slip curve between the two materials to determine the bond strength between reinforcement and concrete. The reinforcement lap test is mainly used to study the anchorage length of reinforcement in concrete; depending on whether the lapped bars are in contact with each other, laps are divided into two forms: contact lap and indirect lap. The anchorage test of reinforcement is conducted to study how far the connection length between reinforcement and concrete can be reduced while still meeting the force requirements. According to a large number of tests, the bond strength of the reinforcement is affected by the shape of the mixed reinforcement, the thickness of the protective layer of the diameter concrete, the spacing of the reinforcement, the transverse reinforcement restraint, and the material properties of the reinforcement and concrete. This paper discusses the test methods, the influencing factors, and the gaps in existing research on reinforcement-concrete bond performance and on lap and anchorage properties.