In the past few days, I’ve often come across misunderstandings of negative reinforcement, so I want to write a few words to define the term and its use in operant conditioning. These terms have clear definitions, so there is no need for personal interpretation. They simply describe a natural learning mechanism that relates behavior to its effect on the environment.
The “negative” in negative reinforcement is derived from the mathematical “minus”, which stands for the removal of a stimulus. However, it is not enough to understand negative reinforcement as merely taking something away, because the definition goes further. For negative reinforcement to work, the stimulus you remove has to be unpleasant (aversive), because the reinforcing effect comes from the relief felt when it is taken away. So negative reinforcement training doesn’t owe its name to a value judgment, as is often thought, but it nevertheless involves something unpleasant (an aversive stimulus) as a trigger. If the removal of something doesn’t cause relief, it’s not negative reinforcement by definition. In that case the behavior wouldn’t be strengthened at all, because the proof of negative reinforcement is that the reinforced behavior increases, meaning it becomes more frequent.
If the removal of something feels unpleasant for the learner (and the learner wants to avoid that), it’s negative punishment. This means that what is removed has to matter to the animal: we remove something pleasant. Negative punishment is not defined only by the fact that we take away something pleasant; the punished behavior also has to be shown less frequently as a result.
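The two definitions above boil down to two questions: was a stimulus added or removed, and did the behavior become more or less frequent afterwards? Here is a minimal Python sketch of that lookup. The function name and the string labels are made up for illustration, and the two “positive” cases (adding a stimulus instead of removing it) follow the standard quadrant scheme rather than anything defined in detail in this post.

```python
def classify_quadrant(stimulus_change: str, behavior_change: str) -> str:
    """Name the operant-conditioning quadrant (illustrative helper).

    stimulus_change: "added" or "removed" (what we did with the stimulus)
    behavior_change: "increases" or "decreases" (what the behavior did afterwards)
    """
    # "positive"/"negative" is the mathematical plus/minus, not a value judgment.
    signs = {"added": "positive", "removed": "negative"}
    # The label depends on the observed effect on behavior, not on our intention.
    effects = {"increases": "reinforcement", "decreases": "punishment"}

    if stimulus_change not in signs:
        raise ValueError(f"unknown stimulus change: {stimulus_change!r}")
    if behavior_change not in effects:
        raise ValueError(f"unknown behavior change: {behavior_change!r}")

    return f"{signs[stimulus_change]} {effects[behavior_change]}"


# Something aversive is removed and the behavior becomes more frequent.
print(classify_quadrant("removed", "increases"))
# Something pleasant is removed and the behavior becomes less frequent.
print(classify_quadrant("removed", "decreases"))
```

Note that the sketch only takes observable facts as input. Whether the removed stimulus was aversive or pleasant is inferred from what the behavior does afterwards, which is exactly why the change in frequency counts as the proof.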
If a stimulus announces no consequence at all (the stimulus is not aversive enough, or the consequence is not worth enough for behavior to occur), we speak of extinction. This doesn’t mean that the stimulus is no longer perceived. The stimulus can continue to be aversive or unpleasant for the animal (learned helplessness or blunting), but it no longer leads to a reaction. The cue (discriminative stimulus) is weakened and becomes less important for the animal because it’s no longer followed by reinforcement. If a stimulus is presented without direct physical contact (e.g. sounds), it can even end up being completely ignored by the organism. A human example is the sound of the fridge: once you get used to it, you no longer notice it until something draws your attention to it, for example because the sound changes or stops, which suggests a broken fridge and therefore a consequence.
Our behavior towards the learner can always be placed within the quadrants, and from the animal’s resulting behavior we can see which form of reinforcement has been applied, as long as we work with just one type of reinforcement (positive OR negative reinforcement). As soon as we start to mix the quadrants, it becomes very difficult to identify clearly which quadrants have been used, and therefore which kind of motivation is present in the learner (and which emotion is attached to the behavior). From this point on, we drift more and more into interpretation without actual evidence. We can certainly build a hypothesis based on body language and various other factors, but we can no longer provide evidence. The interpretation is therefore always made from the trainer’s point of view and is often based on individual feeling. That is not always wrong or even bad, but the problem with feeling is that you can only feel for yourself, not for your learner. And because we cannot ask the animal and are often led by our own emotions, it’s recommended to use positive reinforcement as much as you can, to make sure your learner feels comfortable with everything you do.