Thursday, December 30, 2010

Verbal Discrimination and Differential Reinforcement

 Cue Discrimination Work with Elo

Since THIS post, I've been working really hard on Elo's verbal cue discrimination. In the past couple of days it actually seems like we're making some progress!! The discussion came up on Crystal's blog, so I thought I would post an update on how we've been working this.

I started with just three commands: sit, down, and stand. I tried to work just a few repetitions of each command a few times a day--maybe counting out ten kibbles and just working until I ran out, with a treat for each successful response. I picked a different location to work each time--the living room in different corners, the kitchen, a bedroom. I also rotated between sitting and standing. I tried to just mix up the commands randomly so that he would be doing sits from downs, down from stand, stand from down, and all the possible variations. This actually went really well and I didn't have too many problems. He seemed to know "stand" better than I thought he did.

So I added in a fourth command, everything went to hell, and I gave up.

But only for a little while. We've been back at it after doing some significant work on taking treats nicely. It's a lot easier to be patient when your fingers aren't bleeding! I basically just fed him the majority of his dinner by hand for a while and required him to take each kibble nicely without being told. Once he started doing well with this, I started adding in commands, since every time I gave a command he got a little bit amped up and started snapping again. We worked through this as well, and it kind of progressed back into verbal discrimination exercises.

I've increased the number of repetitions. I try to stop before he gets confused, but if he's doing well I don't arbitrarily limit the number of reps I will do. We are working just four commands--sit, down, stand and bummer (the first trick he learned--head on paws)--and varying location and position as above. I've found that he does best when he is very calm, so I usually wait for and reward eye contact a few times between each command and make sure he is paying attention before I give a command. Instead of cycling through all commands randomly, I am working just a couple per session. For example, for the past few nights we've been working sit and bummer primarily. I also add in a few stands and lie downs randomly so it doesn't get to be too much of a pattern.

I am consciously giving physical cues in some ways. For bummer I will lower my head slightly and look at the ground. I will also empty-hand lure a response if he fails to respond to a verbal cue (e.g., for stand or sit from down). So this probably isn't pure verbal discrimination work, but I am honestly more concerned about correct responses to cues than what he is cuing off of. I am also trying to pay attention to the pitch of my commands and keep it consistent.

And a Couple Thoughts on Differential Reinforcement

Differential reinforcement is another thing I've been thinking about lately, after a couple of other blogs I read brought it to my attention. Differential reinforcement, as I understand it, essentially means reinforcing some instances of a behavior and not others; specifically, reinforcing only increasingly better instances of that behavior.

I am terrible at fading treats. I have previously had this idea that once a behavior is learned you fade treats by simply reinforcing every other response, then every third, and just increasingly spacing out rewards. That's not to say I still treat my dogs for every correct response to every cue, and I don't generally carry around a treat bag unless I'm training something new. But I can't say I've ever really understood the concept of differential reinforcement or ever applied it successfully.

For the past couple of nights I've been experimenting with it for Jun's response to a down from a stand--she's been turning it into a "bow" first, since that's a trick we've been working on. I c/t the first bow-down after her butt hit the ground. She offered me the same thing for the next rep. No treat. The next rep was a bit better, so I c/t. The following rep was the same. No treat. The next rep was slightly better but I wanted a bit more, so no c/t. The next rep was nearly perfect, c/t. I don't know if this worked because I applied (or tried to apply) differential reinforcement, or because she realized that we were working down and not bow, but the behavior seemed to improve faster than I would normally expect it to. This is really interesting and I plan to keep playing with it. It sure seems like a more purposeful way to treat than just random reinforcement.

ETA: If I'm totally off track with this, please let me know. It's a concept I'd really like to understand and learn to use.

2 comments:

  1. Absolutely, that's how it works. You are "shaping" a behavior using successive approximations. What once earned a reward is not enough now. I also use a no-reward marker sometimes. So for Kate's contacts, if they're slow I say "yuck-get off" very plainly, and she knows that wasn't what I was looking for. It gives the dog information about the response. Anyway, you might have seen this article already, but it has lots of information on schedules and reinforcement.

    http://www.clickersolutions.com/articles/2001/ratios.htm

  2. I don't think I have seen that article. I will check it out.

    That makes sense. I didn't think of it as shaping, but I guess it is! I think once a dog "knows" a behavior, I don't really think about "improving" it very often. But DR would give me a logical way to reward some instances of a known behavior and not others.
