Rewards or Punishments: What to consider?
Part I

By: Pamela “PJ” Wangsness, CPDT

Which is better when training a dog: rewards or punishments? This is a complicated topic, and when it is charged with emotion the discussion can quickly deteriorate. So before we debate the fallout of punishments or the successes of rewards, or vice versa, it is important to understand and define the terms so that the message serves its intended function. The question is complex, so it will be addressed in a series, condensed for this newsletter.

First, consider how humans and dogs interact, learn, communicate and train. To answer the question of which is better, we need to briefly review some background so that we can make an informed decision based on valid information.


The ability of humans and dogs to interact and bond socially depends on shared sociobiological similarities and on the ability to exchange socially significant information. In other words, both humans and dogs depend as young on a mother for nursing, warmth, and protection; they exhibit similar distress vocalizations when cold, hungry, or separated from siblings or maternal contact; and they go through a long developmental period involving playfulness. Group-coordinated behavior is another shared feature of human and canine social organization. It is equally important to remember that dogs do not speak English. The good news, however, is that dogs have the ability to learn 200 words through association.1 Once these associations are developed, we have the ability to communicate between species.



What is interspecies (human and dog) communication? At the most basic level, communication can be defined as the reciprocal exchange of information between two or more individuals. A communication exchange consists of at least three components: a sender, a reciprocating receiver, and a signal.2 So how does a human communicate with a dog? When a human trains a dog, the human is attempting to influence the dog’s behavior. Think about it: consequences influence behavior. Humans and dogs do things because they know other things follow. Therefore, depending on the type of consequence that follows a behavior, humans and dogs will offer some behaviors while learning through consequences to avoid others.



The term reinforcer is more precisely defined in terms of the measurable effect it has on behavior. When a behavior is reinforced, then, other things being equal, the reinforcing event increases the future probability of that behavior.

The technical distinctions between reward and reinforcement are clear. However, that technical language becomes unclear when it comes to the term punishment. Put simply, both terms are scrutinized and, in common usage, are typically directed toward the agent of the behavior rather than the behavior itself. The term punishment appears to suffer the same sort of ambiguity as the word reward. In practice, this is not a very serious conceptual problem, since punishment is operationally defined as an event that lowers the probability of the behavior it follows. The term reward can be given a similar operational definition or can be used as a synonym for reinforcer. So for future reference, the terms reward and punishment will be used in the more technical sense of events that differentially increase or decrease the future probability or frequency of the behavior they follow. For ease, let’s take this one step further and agree that there are two ways in which the probability or frequency of a behavior is affected by the consequences it produces:

Reinforcement increases the relative probability or frequency of the behavior it follows.

Punishment decreases the relative probability or frequency of the behavior it follows.

There are two ways in which behavior is reinforced or strengthened:

  1. Positive Reinforcement (R+ or S R+) occurs when a behavior is strengthened by producing or prolonging some desirable consequence.
  2. Negative Reinforcement (R- or S R-) occurs when a behavior is strengthened by terminating, reducing, or avoiding some undesirable consequence.

There are two ways in which behavior is decreased or weakened:

  1. Negative Punishment (P- or S P-) occurs when a behavior is weakened by omitting the presentation of the reinforcing consequence.
  2. Positive Punishment (P+ or S P+) occurs when a behavior is weakened by presenting the previously escaped or avoided consequence.

In combination, these basic reinforcing and punishing contingencies provide four ways for modifying behavior.
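
For readers who like to see the four contingencies laid out as a lookup, here is a minimal sketch in Python. It is only an illustration of the definitions above, not a formal model of learning. The function name classify_contingency and the labels "presented", "removed", "increases" and "decreases" are assumptions made for this example; "removed" is used as shorthand for a consequence that is terminated, reduced, avoided, or withheld.

```python
def classify_contingency(consequence: str, effect: str) -> str:
    """Name the contingency from two observations.

    consequence: "presented" if the behavior produces something,
                 "removed" if it terminates, avoids, or loses something.
    effect:      "increases" if the behavior becomes more likely,
                 "decreases" if it becomes less likely.
    (These labels are hypothetical, chosen for this sketch.)
    """
    table = {
        ("presented", "increases"): "positive reinforcement (R+)",
        ("removed", "increases"): "negative reinforcement (R-)",
        ("removed", "decreases"): "negative punishment (P-)",
        ("presented", "decreases"): "positive punishment (P+)",
    }
    key = (consequence, effect)
    if key not in table:
        raise ValueError(f"unrecognized combination: {key}")
    return table[key]


# A treat is presented after a sit, and sitting becomes more likely.
print(classify_contingency("presented", "increases"))
# Leash pressure is removed when the dog sits, and sitting becomes more likely.
print(classify_contingency("removed", "increases"))
```

Note that the label depends on what actually happens to the behavior afterward, not on what the trainer intended.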

Dogs gain practical information about the physical and social environment through the consequences of their behavior. These experiences teach them how to control or manipulate significant events vital to their interests. Learning is a cognitively organized pattern that must be mastered before complex behavioral skills can be acquired.

Typical reinforcement events will satisfy some physiological or psychological need. To hungry dogs, the opportunity to acquire a flavorful treat is worth effort and work. So, if food is made contingent on a dog sitting when asked, the dog will quickly learn that sitting on cue results in the desired food or treat (positive reinforcement). After numerous experiences, the probability of the dog sitting on cue increases as long as the dog remains motivated and the sit is reinforced or rewarded. Through this very simple lesson a dog learns that its actions can control the environment, which makes learning itself rewarding.

At the same time, negative reinforcement occurs when a dog discovers that a particular response terminates or avoids the presentation of an aversive stimulus (anything the dog does not like). Traditional obedience training makes liberal use of negative reinforcement. For example, a sit is often taught by pulling upward on the leash and collar while applying pressure on the rump. Most dogs will first struggle and attempt to resist the pressure, but after several trials they usually learn to escape the pressure by following the applied forces in the correct direction, and so learn to sit under compulsion (being compelled, coerced, constrained).4 If the cue, sit, is presented before the onset of the pressure, the dog will learn to avoid the negative event by sitting in response to the cue alone. After several trials, the dog will begin to recognize a causal link between the presentation of the avoidance cue and avoidance of the anticipated aversive outcome (pulling on the collar). This learning depends on anticipatory signals that reliably predict certain outcomes. This pattern is confirmed (acquisition) or disconfirmed (extinction) by repeating those experiences.

Additionally, you can add another layer: intrinsic reinforcement (part of the task itself) and extrinsic reinforcement (external to the task). For example, intrinsic positive reinforcers are inherent to behaviors such as playing ball, chasing a car, or jumping on guests; these behaviors are enjoyed and maintained without additional external reinforcement. Intrinsic negative reinforcers are inherent to the relief provided by behaviors that avoid or terminate situations that are annoying in themselves, for example growling or snapping when the dog is threatened, or escaping confinement when the dog is left alone. Both intrinsic and extrinsic incentives are an important part of dog training and behavior modification.

Now add the Premack Principle to this training. It basically states that a behavior occurring at a high frequency or probability tends to reinforce a behavior occurring at a lower frequency or probability. Accordingly, whether a particular behavior acts as a reinforcer or a punisher depends on its relative probability with respect to the behavior it follows. For example, if coming when called were paired with an even more exciting and reinforcing opportunity such as playing ball or tug (of course, the dog must enjoy these activities), the dog is more likely to come. Likewise, if you trained a sit-front to a high degree of proficiency, making it a high-probability behavior, and then chained it to the less probable ‘come’, the result would be a more reliable recall.



We will discuss this further in the next part. To get you started thinking, consider this: Steven Lindsay writes that two complementary motivations drive instrumental learning: the maximization of positive outcomes and the minimization of aversive ones. These complementary motivations correspond to the notions of positive and negative reinforcement. If a response becomes more probable as a result of its producing a desirable consequence (e.g. petting or food), then the potentiating effect is referred to as positive reinforcement. Likewise, if a response becomes more probable by its terminating or avoiding an aversive stimulus (e.g. a leash correction), then the effect is referred to as negative reinforcement. Positive and negative reinforcement are the two primary ways in which a goal-directed behavior is acquired and maintained.

