Friday, 08 April 2016

Errorless Learning versus the use of No Reward Markers


Errorless learning is a type of training that sets humans or animals up with the goal of a 100% success rate while learning. Today, zoos, marine parks, and dog trainers use errorless learning, and so do teachers of children and of people with learning disabilities, who use it with their pupils instead of trial and error learning.
This type of training was first introduced by Herbert Terrace in 1963 in a discrimination experiment with pigeons. Terrace was trying to find a way to reduce the emotional behavior that interferes with operant behavior when an animal makes an error in discrimination training. He trained pigeons to discriminate between two squares of color. With one group he used errorless learning, creatively setting the pigeons up to offer the correct behavior right from the start, while with the other group he used trial and error learning. The pigeons set up for errorless learning offered an average of 25 incorrect behaviors during the testing period, while the pigeons trained by trial and error offered the incorrect behavior between 2,000 and 5,000 times. His astounding results have paved the way to more precise learning procedures with fewer unwanted side effects, benefiting a wide variety of learners, from people suffering from amnesia to bomb-sniffing dogs and performing killer whales.

Errorless learning, as opposed to trial and error learning, has been scientifically proven with animals and humans to:

*Minimize the number of errors in the training session
*Decrease time spent learning a skill
*Reduce future errors, as they have never been practiced
*Create less frustration, stress, and aggression
*Not inhibit behavior
*Not create a conditioned emotional response associated with punishment to any part of the behavior or task
*Not create a conditioned emotional response associated with punishment to the trainer or the training environment

An example of errorless learning:

Perhaps you have taught your dog to touch a target with his nose, and also to step on a target with his paw. After repeating the nose-touch cue with the target 1 foot from the ground, you then put the target on the ground. Unless they have worked on stimulus control for both behaviors, most dogs will be highly likely to offer foot targeting as well as nose targeting because of the situational cue of the target being on the ground. Instead of using a no reward marker or another type of punishment for an incorrect behavior, you can simply set the dog up for success from the start. You could do this by lowering the target gradually, shaping approximations of the final behavior so that nose targeting continues successfully until the object is on the ground, or you could prevent errors by having the dog stand with his paws on a stool to keep them in place when you put the nose target on the floor. Plan and think creatively to create precise, reliable, and highly reinforced behaviors using errorless learning!

Why do dogs make errors in training?

Behaviors can deteriorate because of incorrect criteria, timing, and/or reinforcement.  Animals naturally vary behavior and so it is impossible to achieve no errors.  Regression is also a natural part of learning in all creatures.  A context shift can also affect behavior, as dogs do not generalize well.  For example, if your dog “knows” sit in the kitchen, your dog might not “know” sit in the yard on the grass, sit while another dog is playing Frisbee next to you, or sit in the dog park.  So if the trainer wants stimulus control over the behavior (a reliable behavior in all the situations the trainer asks for it), the behavior must be proofed and reinforced to the degree the trainer wishes in all the scenarios he wishes.

Errors may also occur if your animal is over-aroused, sick, tired, full, injured, overweight, out of shape, fearful, nervous or stressed. The environment and distractions could also be disrupting your training session. Your reinforcement could also be to blame if it is not of high enough value or is too predictable. Reinforcement, in scientific terms, increases behavior, so if the behavior is not increasing, it is not being reinforced.

What do you do when errors start popping up? 

When training using errorless learning, a warning sign that your plan needs to be modified is when your animal starts offering too many incorrect behaviors.  Instead of punishing the dog by using a no reward marker to give the dog information that he was wrong, modify your training plan to set your dog up for continued success.  You can use shaping to reinforce approximations of the desired behavior.  

When proofing and adding new criteria, you must lower the level of existing criteria.  You can use the environment, props, cues, previous training, as well as reinforcement placement to set your dog up for faster success.  If your training plan is not yielding results, stop doing it and think creatively!

If your dog is failing in the middle of a behavior chain, go back and reinforce the behaviors that are faltering to create a stronger chain. All behaviors in a behavior chain need to be equally reinforced or the chain could fall apart at its weakest link. The part of a chain that falls apart the fastest tells you which part is the weakest and needs to be reinforced the most.

To use errorless learning not just in training sessions but also in everyday life, you can follow these guidelines:

Reinforce- the behaviors your dog is already doing that you find desirable and they will increase.
Train- new behaviors as alternate behaviors to replace the ones you don’t like.
Interrupt- behaviors you find undesirable so they don’t attain a reinforcement history. You can do this by using a recall, attention noise, or leave it cue that was previously trained with positive reinforcement, or by asking your dog for a different behavior, to stop the undesirable behavior from continuing.
Prevent- your dog from practicing unwanted behaviors by using management.

For information on solving behavioral problems and interrupting undesirable behavior inside and outside of training sessions without using physical or psychological intimidation, read the Progressive Reinforcement Training Manifesto here:
www.dogmantics.com

What is a No Reward Marker?

A No Reward Marker (NRM) is a trained Secondary Punisher, or in other words a Conditioned Punisher that predicts no reinforcement is to follow. With enough conditioning of a word or sound as the predictor of no reinforcement, the word itself will create a conditioned emotional response in the animal similar to the disappointment of not being given the reinforcement he was expecting. After conditioning, when this word is used during training, it will make the animal less likely to repeat in the future the behavior he was doing (if conditioned correctly and if the behavior isn’t self-reinforcing). Trainers use NRMs to punish, or in other words suppress, behavior in the hope that the behavior will be less likely to be repeated in the future. Examples of NRMs are “no”, “eh-eh”, “oops!”, “wrong”, “sorry” and “try again”.

The problems with using No Reward Markers:

*NRMs can cause frustration, stress and even aggression.
*They can inhibit behaviors you dislike, but also inhibit behaviors you had wanted to keep.
*They can create a conditioned emotional response associated with punishment to a cue or a behavior (known as a poisoned cue) if used often.
*They can create a conditioned emotional response associated with punishment to the trainer and/or the training environment if used often.
*They can give the trainer the idea the dog is to blame rather than a faulty training plan.
*If your dog is over-aroused, stressed, confused, fearful or sick your dog might perform a behavior incorrectly, and punishment will only mask the underlying problem.
*Using NRMs is positively reinforcing for the trainer, meaning that a trainer might unconsciously start using them more often in training sessions as they give a feeling of instant gratification. This makes the trainer less likely to modify the training plan and more likely to punish the dog instead.

Look at the dog in the picture.  Imagine the trainer had said “Oops!” the moment the dog sat down in front of her, because the dog sat too slowly.

The next time the trainer gives the cue, the dog could offer an even slower sit, or perhaps offer another learned behavior like a down, or an alternate dog behavior like jumping up, whining, barking or growling. There is the possibility that the dog could offer a faster sit, but what if the dog doesn’t?

Perhaps the dog understands the concept of a NRM but superstitiously responds as if it was the eye contact that was incorrect. Perhaps the dog associates the punishment with being too close to the fence, or concludes that he should not be in front of the trainer. Perhaps it was a combination? Perhaps the dog concludes that the trainer does not want him to sit ever again, since when he had jumped on the trainer the NRM meant never to do that behavior again.

Instead of using a NRM, the trainer could reinforce the dog’s fastest sits to build the muscle memory and a reinforcement history of the desired speed of sitting. Instead of having the dog guess about what he shouldn’t be doing, the trainer could reinforce him for doing what she wants him to be doing, and build a stimulus response association of only the correct behavior. The trainer could set the dog up for success by making him more likely to sit fast: playing tug and getting the dog excited before giving the cue, not asking for the behavior when the dog has just woken up from a nap, and luring the dog into a fast sit with a treat until the dog is sitting at an appropriate speed before giving the cue.

Classical Conditioning occurs in your training whether you like it or not.

If you say “down” and your dog sits, and then you say “wrong”, a secondary punisher follows the behavior of a sit. This not only punishes a sit offered in response to the cue “down” but also causes the behavior of sitting to be conditioned with the secondary punisher. This means that the next time you say “sit”, your dog’s brain might activate the memory of the NRM associated with the behavior in the past, and this could lead to confusion down the line as well as elicit a conditioned emotional response associated with punishment if NRMs are often used in training.

In the video below Tedd Judd, PhD, Board Certified in Clinical Neuropsychology by the American Board of Professional Psychology, shares a great example of how using trial and error learning as opposed to errorless learning with an amnesia patient caused the incorrect behavior to be more likely to occur in the future, rather than the desired one:

In the video Tedd Judd gives the example of a patient with amnesia, in the hospital. The doctor asks the patient, “Do you remember my name?” The patient says “No” and the doctor replies “Well, take a guess”, and the patient answers “Dr. Smith?”. The doctor then answers, “No, it’s Dr. Judd”. The next morning the doctor asks the same question, “Do you remember my name?”, and the patient replies “No”. The doctor says “Can you take a guess?”, and the patient replies “Was it Dr. Smith?” The doctor replies, “No, it’s Doctor Judd”. Then the next time the doctor goes past the patient, the patient says “Oh, hi Dr. Smith!!!” This happened because the patient was remembering their mistake, instead of the appropriate response.

This same scenario can happen with dogs: a dog can remember and build muscle memory for the incorrect response even if a NRM was given. With errorless learning, where your goal is to shape successful approximations of the final behavior, the dog will not have the opportunity to think of, learn or practice incorrect responses.

An example of this is using trial and error training with No Reward Markers while teaching a dog to weave through agility poles. During trial and error training the dog could zoom through the poles incorrectly, you could say “Whoops!” and try again, and then the dog gets it right. Perhaps you do 10 repetitions: the first time the dog was incorrect, then correct, then had 3 more errors, but then was successful the last 5 times. It could seem that your dog has learned from his errors; however, there is a higher possibility that the dog will repeat the mistakes he just made 4 times in 10 trials than if you had done 10 trials using errorless learning in which the dog only made a mistake 1 out of 10 times. This is because the dog has practiced the error more times.

Using a NRM in the middle of a behavior chain can not only punish the behavior in the chain, but can also punish the behaviors previously done in the chain, and can cause the cue to become poisoned (create a conditioned emotional response associated with punishment to the cue or the behavior). 

If you used a NRM for the dog exiting in the middle of the weave poles instead of completing the weaves correctly, and for some reason you had to use the NRM multiple times in this exact area of the weaves, your dog could start to have a conditioned emotional response associated with punishment when reaching the area of the weave poles where he has been continually punished, and your dog’s behavior could change because of this conditioned response.

As Ted Turner, an internationally renowned animal behaviorist and marine mammal trainer, says regarding the use of punishment in training, when you reinforce your dog for something “you are putting money in a reinforcement account.   If you put a punishment in there, you drain your savings.  If you put too many punishments in there, there will be nothing to draw from.”

In my opinion, it is easier to compete with the environment and distractions and be the most reinforcing option for your dog when you do not use punishers or conditioned punishers, as you have not “drawn from your reinforcement savings”. To condition a behavior as a secondary reinforcer (which means the animal will more readily do it without primary reinforcement in the future), stronger conditioning occurs if the behavior is only paired with reinforcement and never with punishment, such as a NRM. After many repetitions using errorless learning, the cues and behaviors your dog does should elicit a conditioned appetitive emotional response. In other words, when the dog hears the cue and completes the behavior, he feels something similar to the feeling of being reinforced.

No one said training with errorless learning is easy.  It is much easier to watch an animal and say ‘yes’ when you like what they are doing and ‘no’ when you don’t like what the animal is doing.  It is much harder to create a training plan and adjust the plan using creative thinking when things go wrong.

In my opinion, only the most talented trainers should implement a method as complex as No Reward Markers into their training plans, and if the trainer is that talented, then they shouldn’t be making so many errors in the first place that they need NRMs.
