Transcript for:
Operant Conditioning Basics

All right, we move on now to operant conditioning, and of course we'll see the difference between operant conditioning and classical conditioning as we learn about it. This chapter is laid out like most of them are: we start with a little background to make sure we're clear on the theoretical concepts we're talking about, then go into some brain substrates — look inside the brain at what parts are implicated in operant conditioning — and then finish up with hopefully at least a little clinical perspective.

One of the names often mentioned in association with the early development of operant conditioning is Thorndike, who is most well known for experiments he did with cats and puzzle boxes. Basically, he put cats in puzzle boxes — that's probably one of them in some psychology museum — and in order to get out, the cat had to do something like put its paw in a little hole and pull a string, and the door would open. What he did was put the cat in the box and measure how long it took for the cat to get out, and watch it to see what it did. Then he'd put it in the same box again and measure how long it took the cat to get out, and he'd do it again and again and again. That's the data you see in the top right here: time to escape, in seconds, on the vertical y-axis, by trial on the x-axis. You can see that by the 23rd time he put the cat in the box, the cat got out almost immediately.

His point was kind of this: what we see is the end product of learning. If I just showed you my cat doing that, you would say, dang, you've got a smart cat — that cat figured it out, it understands the box now. And his point was, well, cats don't understand or figure out the box the way humans do. Rather, this learning can be explained by a very simple learning rule he called the law of effect, which goes like this: behaviors with positive outcomes are more likely in the future, and behaviors that have negative outcomes are less likely, or decrease, in the future. That's it. And in fact you don't need a conscious mind, or any kind of a-ha moment, or cognitive figuring-anything-out for this rule to apply. So it applies to lots of behaviors in many different types of organisms, including humans, and it can happen rather automatically, with much less sophisticated neural machinery than humans have.

Now why do these data support that? Well, his point was: look at this function. The first time, it took the cat 160 seconds — so this isn't that hard a puzzle box. The next time it got out quicker, but then it jumped back up and only eventually settled down. Cats don't like being in boxes, so presumably, if after the first time they had figured it out, this function should drop right down to zero seconds on the second trial. But still, three or four trials in, this cat is taking a minute to get out of the box — 60 seconds on the fourth trial, for example. With a human, once you'd figured it out — looked around, looked at the pulleys, tried some stuff, and gotten out the first time — if I put you back in the box, what would you do? One student once said they would scream, which I thought was funny. Sure, you might scream, but you might also just go, oh, I know how this works, and get out immediately. So from the second trial on, if there were a figuring-it-out happening the way humans do it, this function should drop from however high it is on the first trial down to close to zero and just stay there the rest of the time. But that's not what happened.

Observing the cat, and looking at these data, he said: here's what's really happening. The cat is trying a bunch of random stuff. Some of it is not doing anything, and some of it is having a positive outcome — getting out of the box. The behaviors that aren't working are having a negative outcome, so to speak, which means keeping you trapped in the box, and so they're becoming less likely to happen in the future; whereas those that have positive outcomes — getting out of the box — become incrementally more likely in the future. So it's this simple rule that is guiding behavior, and it turns out this is a very powerful rule. Like I said, there doesn't have to be any of what we think of as a conscious mind behind it.
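If it helps to see the law of effect as nothing more than a mechanical rule, here is a minimal sketch in Python of a toy "puzzle box" learner — not Thorndike's actual procedure. The behavior names, starting weights, and update factors are all made-up assumptions for illustration; the point is just that the number of attempts before escape drifts down gradually, with bumps, and there is no "insight" step anywhere in the code.

```python
# A toy puzzle-box learner that follows only the law of effect.
# Behavior names, weights, and update factors are illustrative assumptions.
import random

random.seed(1)

behaviors = ["scratch door", "meow", "push floor", "pull string"]  # only pulling the string works
weights = {b: 1.0 for b in behaviors}                              # all equally likely at first

def run_trial():
    """Keep sampling behaviors until the effective one occurs; return attempts needed."""
    attempts = 0
    while True:
        attempts += 1
        choice = random.choices(behaviors, weights=[weights[b] for b in behaviors])[0]
        if choice == "pull string":
            weights[choice] *= 1.5   # positive outcome (door opens): more likely next time
            return attempts
        weights[choice] *= 0.9       # negative outcome (still trapped): less likely next time

for trial in range(1, 21):
    print(f"trial {trial:2d}: escaped after {run_trial():3d} attempts")
```

Run it and the escape curve looks Thorndike-like: noisy, gradually improving, never a sudden drop to zero.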
All right, so you may have noticed on the previous slide this S-R-O configuration, and the point of it is this. In classical conditioning, what was being learned? An association between the conditioned stimulus — the bell — and the unconditioned stimulus — the meat powder. They learned that those went together: the animal heard the bell, it expected meat powder, yada yada yada. So you would say it's a CS-US association that's learned. In operant conditioning it's a little different: we say it's an S-R-O association that is learned. The S stands for stimulus, the R stands for response, and the O stands for outcome. The basic idea is that what is learned is that, given the presence of a certain stimulus — in the example before, with Thorndike's cat box, it would be the box, the visuals, and all the cues associated with the box — a certain response, whatever it was, say pressing a lever or pulling a string, will generate or cause a specific outcome: the door opening and you getting out. Sometimes these are called stimulus-response-outcome contingencies, because a response generates an outcome, but only contingent upon the presence of some stimulus.

The outcomes can be either positive or negative, as we talked about on the previous slide. If they're positive, the behavior right before them is more likely in the future; if they're negative, the behavior right before them is less likely in the future. But behaviorists don't love the words "positive" and "negative" here — or saying that the cat likes getting out of the box or doesn't like being trapped in the box — because they don't want to get into what a cat likes or doesn't like, or get inside the head of animals at all and talk about subjective states. We'll never really know what a cat likes and doesn't like; we can guess based on its behavior, sure, but you can't really prove it scientifically. So they use the terms reinforcement and punishment to describe the different types of outcomes — basically good ones and bad ones — in a way that gets around having to make any guesses about what cats like or don't like. They don't really care. Listen: if a cat does something, and then there's an outcome after it, and that outcome causes the behavior to be more likely in the future, that outcome is a reinforcer. If there's a behavior followed by an outcome, and that behavior becomes less likely in the future, then that outcome is a punisher. So here we can all objectively look at the data and identify stimuli as reinforcers or punishers that way, without ever having to talk about what a cat likes, or what feels good to someone else, or anything like that.
So that's two big classes of outcomes, reinforcers and punishers, and you can probably put pretty much any outcome in most operant conditioning paradigms into one of those two categories. We combine those terms with "positive" and "negative" — this is probably stuff you've learned before, but you need to brush up on it. An important thing to note here is that positive doesn't mean good and negative doesn't mean bad in this context. Positive just means a stimulus was introduced — a spanking was introduced, or a five-dollar bill was introduced. You might say one of those would be a punisher and one would be a reinforcer, but they're both positive, because a stimulus is introduced. Negative means a stimulus has been removed. If I take away your iPad, that might be a punisher; but I can also take away chores, and that might be a reinforcer — but they'd both be negative. So you can have positive punishment and negative punishment, and you can have positive reinforcement and negative reinforcement. If you just think of it as two separate steps — determining whether it's reinforcement or punishment in one step, and then determining whether it's positive or negative in another step — and keep them separate, you should have no trouble identifying, if I gave you a scenario, which of those four categories it falls into.

So for example, if I give you $5 for doing your chores, that is positive reinforcement. If I take away chores to get you to do homework, that is negative reinforcement. Why is it reinforcement? Because there's a behavior I'm basically trying to increase; that makes it reinforcement. I can do this by adding something good or taking away something bad — it can be positive or negative, it doesn't matter. With punishment, the goal is for me to reduce some behavior. So if I wanted to reduce the frequency of temper tantrums, I could give cookies for being good — positive punishment... wait, whoops, no, that's not right, that's confusing, I'm sorry, I switched the target behavior to not having tantrums. What I want to focus on here is having tantrums; I want to decrease their frequency, and so I'm going to use punishers. I could give a spanking for having a tantrum — that would be positive punishment; I can introduce a spanking every time you have a tantrum (we'll get into the benefits of, and the bad things about, positive punishment later, of course). Or, if you have a tantrum, I can take away your Xbox or your iPhone or whatever — that would be negative punishment. Okay, so that little graph down there at the bottom might help you sort this out, but just remember there are four categories: positive reinforcement, negative reinforcement, positive punishment, and negative punishment. And I think if you listen to what I just said, it should be pretty clear how to put a given outcome into one of those four categories.
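If you want that two-step procedure as something concrete, here is a minimal sketch in Python. The function name, the argument labels, and the example scenarios are my own illustrations; only the four category names come from the lecture.

```python
# A sketch of the two-step classification: first reinforcement vs. punishment
# (are we increasing or decreasing the behavior?), then positive vs. negative
# (was a stimulus added or removed?).

def classify(goal: str, stimulus_change: str) -> str:
    """goal: do you want to 'increase' or 'decrease' the behavior?
    stimulus_change: was a stimulus 'added' or 'removed' after the behavior?"""
    kind = "reinforcement" if goal == "increase" else "punishment"    # step 1
    sign = "positive" if stimulus_change == "added" else "negative"   # step 2
    return f"{sign} {kind}"

print(classify("increase", "added"))    # $5 for doing chores      -> positive reinforcement
print(classify("increase", "removed"))  # chores taken away        -> negative reinforcement
print(classify("decrease", "added"))    # spanking for a tantrum   -> positive punishment
print(classify("decrease", "removed"))  # Xbox taken away          -> negative punishment
```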
Okay, so let's look at the R, the response, a little bit. We just looked at the outcome; we'll look at the response, and then next we'll look at the stimulus — we're just working our way backwards through this S-R-O contingency, or association, that's learned. So what is the response that's learned? Initially it was theorized that it's an automatic, almost reflexive, specific motor program that fires off. That was probably partly because these paradigms frequently involved pigeons — not particularly clever animals, as far as we know, relatively speaking. So if the light's on, they can peck a disk and get a pellet, and they learn this, and the thinking was that the light triggers an automatic motor program of pecking that fires off, like I said, almost reflexively, and then they get pellets — the light just triggers that response automatically.

To suggest that it was this sort of dumb, automatic firing of one motor program, they cited data like what's shown there in the upper right-hand corner. It shows that if you train a pigeon to peck to get food — say, when the light turns on, you can peck the disk and get a pellet — then every time the light comes on, they peck the disk; that's classic operant conditioning. And if you then put a cup full of food in the cage, the pigeon still responds just as strongly to the light being on as when there's no food available. The point being: if they were really doing this to get the food, why would they keep doing it when there's a big pile of food right next to them? They don't need to peck the disk anymore; they could just eat that pile of food. So they said, look, it's this automatic, reflexive motor program that fires off; they can't really help it; it's not even really connected to the food anymore; the light just causes the response to happen automatically. They called this the Protestant ethic effect — which is kind of inappropriate, but it's referring to the Protestant work ethic: that work is good and you should just do it in and of itself, never mind the goals; work is good because it's work. Of course that's not what the pigeon is thinking. The point here is just that the response seems almost unconnected to the outcome in this case, because, as you can see in that graph in the upper right, the pecking responses per day on the disk are just as high — actually a little bit higher — when there's food readily available as when there isn't. So it doesn't seem to be connected to the food anymore, in that specific case.

However, that was a fairly early theory, and further work discovered that if the motor program is blocked, the animal will use other methods to achieve the same ends. Here are a couple of examples. Rats were trained to wade through a maze partially filled with water — not so deep that they had to swim. Later they were put in the same maze, but flooded deep enough that they had to swim; they couldn't wade anymore. Nevertheless, they swam to the goal without a problem. The point here is that if it were really a specific motor program, how does walking instantly translate into swimming? Those are two very different motor movements for a rat. So that indicates it's not quite as simple as a single, specific motor program that fires off. Another example demonstrating the point: if a rat is trained to press a lever with its paw to get a pellet, and then you tie its paws to its body, it will figure out a different way to press that lever, and right away — it'll crawl over and press it with its nose to get the pellet. The point is that the response is not literally a single behavior, a single motor program. Maybe it's pressing the lever with your paw, maybe it's pressing it with your nose, maybe it's pressing it with your butt. It's a class of behaviors that produce an effect — behaviorists called it a behavioral unit. So it's not quite as simple as light, peck; light, peck — because if you can't peck, then the light triggers some other thing. It's closer to "the light means I need to get that lever pressed, however I can." Cognitive psychologists might call this a goal or an intention; of course behaviorists don't want to worry about internal representations or anything like that, so they just call it a behavioral unit.
All right, so let's talk about the S a little bit, the stimulus. You might also think of this as the context, and it matters. Operant conditioning is about contingencies: if I make this response, then I get the outcome. But that's not always true — contingencies change. For example: if I cry, mom picks me up, but only if mom is sober. Mom being sober or not, in this example, is the discriminative stimulus. If sobriety is there, then the crying gets me picked up; if it's not, then the contingency is not in effect. I know it's a rather dark example, but all right.

So there's an introduction to the S-R-O contingencies that are learned in operant conditioning. The paradigm that is probably most often associated with operant conditioning — paradigm meaning, roughly, experimental setup — is the Skinner box, shown on the right here. You've got a box that can hold a rat or a pigeon or whatever, and usually there are methods for delivering stimuli — discriminative stimuli — like a speaker for noises, or lights, or maybe a little screen; there's a lever for collecting responses; and there's a food pellet tray for delivering rewards, and maybe an electric grid in the floor for delivering punishers. This had a lot of advantages over Thorndike's cat box, because you could run trials and collect data automatically — you don't have to pick up the cat and put it back in the box at the end of every trial, and when a trial starts can just be up to the animal: it can press the lever whenever it wants, or wait. And also, like I said, you have a wider array of outcomes and contexts: instead of just the one box you're trapped in, you've got different lights you can present, sounds, shocks, etc.

In the classic Skinner box paradigm there's, say, a light, and when the light is on, pressing the lever gets you a pellet. Of course it could be a shock as well, which would be a punisher — it would reduce lever-pressing behavior. So if you condition an animal to press the lever for a reward, and then the reward switches to a punisher, that will decrease the lever pressing at a faster rate than just ceasing the reward, obviously. But here the S is a light or a tone or some stimulus that signals that the lever is "live" or active — that the R-O contingency is in effect. That's the discriminative stimulus. The behavior — the lever pressing — is the response, and you can measure the rate of it, and they do. And the outcome is whatever the outcome is: the food or the shock.

All right, so this slide just shows some schematic data of the way we might look at a typical operant conditioning experiment. Let's say a rat is learning that when the light is on, pressing the button gets you some food: light, stimulus; press button, response; food, outcome. The data are shown in two different ways here. Let's focus on the top graph first, which shows the behavior in terms of response rate, meaning how many responses the animal is making per minute; that's what's shown on the y, or vertical, axis — obviously higher means a faster rate of responding, lower means less responding — and the x-axis on the bottom shows time in minutes. The left half of each graph, the part that's kind of yellowish, is during acquisition, when the lever is "on," so to speak — the S-R-O contingency is in effect.
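Just to make the role of the discriminative stimulus concrete, here is a tiny sketch in Python of a Skinner-box-style contingency: the pellet shows up only when the response happens and the light is on. The press probability and the alternating light schedule are made-up assumptions for illustration.

```python
# An S-R-O contingency gated by a discriminative stimulus: pellet delivery
# requires both the response (lever press) and the stimulus (light on).
import random

random.seed(2)

def timestep(light_on: bool, press_prob: float = 0.7) -> str:
    pressed = random.random() < press_prob
    if pressed and light_on:
        return "press -> pellet (contingency in effect)"
    if pressed:
        return "press -> nothing (light off, contingency not in effect)"
    return "no press -> nothing"

for t in range(6):
    light_on = t % 2 == 0   # alternate the discriminative stimulus
    print(f"t={t}, light={'on ' if light_on else 'off'}: {timestep(light_on)}")
```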
Then starting at about minute 13, where the graph turns purple, the lever is "off" — the S-R-O is no longer in effect. You can press the lever, but you're not going to get a pellet anymore. Not surprisingly, as you can see, when the whole thing starts the rat doesn't press the button — why would it? It has never gotten anything for it before. But eventually it presses it when the light is on, just by chance, because rats explore, and a pellet pops out, and gradually the animal acquires, or learns, this contingency: light's on, press the lever, get a pellet. I wouldn't say that it discovers it the way we do, of course, because we'd go "oh!" and then press it immediately. You can see that the rate of responding gradually increases, because of the law of effect: behaviors that have positive outcomes are more likely to happen in the future. There's no need to posit that the rat has figured anything out; in fact, the gradual nature of the increase in these data suggests that's probably not the case, at any rate. Then, as you can see, the rate of responding peaks out at about 10 presses per minute, and then they stop delivering the pellets, and the rate of responding gradually tapers off because it's no longer being rewarded. That is called extinction. In this paradigm, acquisition is when the animal is learning the S-R-O contingency, and extinction is when the S-R-O contingency is no longer in effect. So we can look at the pattern of behavior in these two phases, and it makes pretty good sense.

The bottom graph is showing the exact same thing; it's just showing cumulative responses instead of response rate, so the only thing that's different is the y-axis. They show data like this because they used to collect data from these experiments with a piece of paper being dragged under a pen that ticked up every time the animal pressed the lever. So the more up-steps you see here, the more the animal is responding; if it's a flat line, that means the animal isn't responding at all. It's the same idea, just measuring the cumulative total responses the animal has made, and those obviously just go up and up and up — they never go down; you can't undo a lever press. But you can see that the steeper this is, the faster the rate of responding, and of course it's steepest during acquisition, and then during extinction it tapers off to eventually no responding at all.
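If it helps, here is a minimal sketch in Python that generates schematic data like this: press probability drifts up while pellets are being delivered and decays once they stop, and we track both the per-minute rate and the cumulative record. Every number in it (the 13-minute switch, the learning rates, the 12 opportunities per minute) is a made-up assumption for illustration, not a model fit to real rat data.

```python
# Schematic acquisition/extinction data: responding strengthens while it is
# reinforced, then tapers off once the reward stops. All parameters are invented.
import random

random.seed(3)

press_prob, cumulative = 0.02, 0
for minute in range(25):
    acquisition = minute < 13                                        # pellets available only here
    presses = sum(random.random() < press_prob for _ in range(12))   # ~12 chances per minute
    cumulative += presses
    if acquisition and presses:
        press_prob = min(0.9, press_prob + 0.08 * presses)   # rewarded presses strengthen responding
    elif not acquisition:
        press_prob = max(0.01, press_prob * 0.8)             # no reward: responding tapers off
    phase = "acquisition" if acquisition else "extinction"
    print(f"min {minute:2d} ({phase:11s}): rate = {presses:2d}/min, cumulative = {cumulative:3d}")
```

The printed rate column mimics the top graph on the slide, and the cumulative column mimics the bottom one: steps pile up fast during acquisition and flatten out during extinction.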
All right, so just so we're clear here, let's talk a little bit about the differences between classical and operant conditioning. In classical conditioning, of course, we talked about learning conditioned stimulus-unconditioned stimulus associations: bell, meat powder. In operant conditioning, we talked about learning stimulus-response-outcome contingencies. Just on the face of it, one difference is that in classical conditioning, the outcome — if you want to call it that, the meat powder, the unconditioned stimulus — is delivered no matter what, regardless of behavior, every time. In operant conditioning, of course, the outcome — the food pellet — is delivered only if the operant behavior, the lever press (which is what it's called, the operant behavior), actually happens. The animal doesn't have to press the lever. In the classical conditioning paradigm there's really no choice about it: a stimulus is delivered that triggers a reflexive, automatic response, which happens, and then a conditioned stimulus is associated with that unconditioned stimulus that triggers the automatic response. But in operant conditioning, a discriminative stimulus is presented, and then the animal can press the button or not — there's a degree of, call it, choice; it's not a reflexive, automatic response that has to happen. And the delivery of the pellet doesn't happen every time either; it depends on the behavior of the animal, which is flexible. The upshot of all this is that with classical conditioning you can kind of get control of simple, reflexive, automatic behaviors, but with operant conditioning you can actually shape more complex, non-reflexive — I don't know if I want to call them voluntary, but — things more interesting than drooling.

Let's take a look. Illustrating probably the pinnacle of psychology's impact on the world today, you can see that, yes, this isn't photoshopped: we have gotten squirrels to water-ski. I know, I know — pretty much everything else doesn't even seem to matter anymore, does it? How do we do this amazing thing? Through shaping and chaining, where you basically use the principles of operant conditioning. For shaping, you break a complex behavior, like waterskiing — which would be tough for a squirrel to do — into simple pieces, and then you reinforce each one of those, eventually shaping a much more complex behavior. If you put a squirrel in a pool, with another squirrel at the helm of a motorboat kind of looking back, waiting for his friend to get on the water-ski and say "punch it," and put on a little vest — you'd be waiting a long, long, long, long time for that to happen on its own. So you have to use shaping. With shaping, you put the squirrel in the pool with the idling motorboat, and the squirrel ignores it and tries to get out, because he's trapped in a pool. But eventually he might swim over near the surfboard, just by chance, and if so, you give him a nut or some reinforcer. You do this until, as soon as you put him in the pool, he swims over to the surfboard. Then you stop rewarding that, and you reward him for putting a little paw on the surfboard; then another paw; then you stop rewarding that, and he has to actually climb up on the surfboard to get the nut. You get the picture. Eventually, as soon as you put the squirrel in the pool, he'll swim over to the surfboard, jump on, throw on a little safety vest, grab the thing, say "punch it," and water-ski around. It's truly amazing. The important thing here is that you understand the idea of shaping: you basically reward little bits of a complex behavior, one at a time, and this vastly speeds up the rate of learning. It also probably allows you to condition behaviors that you could never condition if you just waited for the animal to, say, water-ski on its own before rewarding it at all. So initially a contingency is introduced for simple behaviors — swimming over to the surfboard — and as that gets better, the contingency is moved to a more complex version of the response — swimming over and climbing on — and so on and so forth.
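Here is a minimal sketch in Python of the logic of shaping as successive approximation, under made-up assumptions: the behavior is summarized as a single number, anything that beats the current criterion gets reinforced, and the criterion then creeps toward the full target behavior. Real animal training obviously isn't one-dimensional like this; the point is just the moving criterion.

```python
# Shaping as successive approximation: reinforce anything that clears the current
# criterion, then nudge the criterion toward the full target behavior.
import random

random.seed(4)

target = 10.0      # ~ "climbs on and water-skis"
criterion = 1.0    # ~ "swims near the board" -- good enough to earn a nut at first
skill = 0.0        # current strength of the behavior

for step in range(1, 41):
    attempt = skill + random.uniform(0.0, 2.0)    # behavior varies around current skill
    if attempt >= criterion:                      # close enough to the current approximation
        skill += 1.0                              # reinforcement strengthens that behavior
        criterion = min(target, criterion + 0.5)  # ...and now we demand a slightly better version
    if criterion >= target:
        print(f"full behavior reached after {step} steps of successive approximation")
        break
else:
    print("still shaping...")
```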
Chaining is very similar — basically it's the same idea. You build complex response sequences by linking together S-R-O contingencies. The way of viewing it is that each response results in a change in context, which serves as the stimulus for the next S-R-O in the chain. The animal does something and is rewarded, and that something has an effect on the environment — it changes the contingencies; literally, the stimuli that were there are now different — and those become the S for the next S-R-O contingency that's learned. So this is similar to shaping, but it's thought of in a little bit different way, where each response changes the context, which serves as the stimulus for the next S-R-O in the chain. They both have in common that you're basically stringing together a bunch of what you might call simpler S-R-O associations so that you can condition much more complex behaviors. I guess one difference is that in chaining you could theoretically string together a bunch of totally unrelated behaviors, whereas shaping is considered to be more like one complex behavior — but I don't know if that difference is all that meaningful, to me anyway.

An interesting thing you can do is backwards chaining. For example, you can start with a puzzle completely made. Normally, making a puzzle requires a huge number of steps, and it would be hard to condition anyone — an animal in particular — to do it. But it's sometimes easier to go backwards: you start with the completed puzzle but for one missing piece, and when they put that piece in, they're rewarded. Then that leads to the stimulus where you've got two pieces missing, and when they get the first of those in, that produces the stimulus — the puzzle missing one piece — that triggers the S-R-O they've already learned, which they then do automatically. So you only have to learn one puzzle piece at a time, so to speak, and it's easier than starting with a table full of random puzzle pieces and trying to wait to reward them for each successive piece added.
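Here is a minimal sketch in Python of the backward-chaining idea, using the puzzle framing from the lecture; the step names and the "mastered" list are my own illustrative assumptions. Each training round adds one new step at the front, and everything after it is an already-learned chain that runs off on its own.

```python
# Backward chaining: teach the last step of a sequence first, then the step
# before it, so the learner always finishes on already-mastered ground.

steps = ["place piece 1", "place piece 2", "place piece 3", "place piece 4"]
mastered = []   # the chain the learner already knows, in order

for new_step in reversed(steps):      # teach from the end of the sequence backwards
    mastered.insert(0, new_step)      # the learner now starts one step earlier
    # Only the newly added step is really new; its outcome is the stimulus that
    # triggers the next, already-learned S-R-O in the chain.
    print(f"train '{new_step}', then run the known chain: {' -> '.join(mastered)}")
```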
Shaping and chaining can be used to do truly amazing things. I was taking it lightly when we talked about the squirrel waterskiing, but seriously: this is, for example, the way they get rats to search for landmines in parts of the world where wars happened and ordnance was left behind, and now people are trying to farm there and getting their legs blown off. They need to find all of these thousands and thousands and thousands of bombs and mines that were dropped and planted in these countries, because some of them are still live. What they can do is train rats, using shaping, to walk ahead of someone and search — and they're actually good at it, through smelling and scratching; they've been rewarded for finding mines, and so they actually do this. And they're small enough that typically, I don't think, they set them off — if you're thinking how cruel this is: they find them, they get a reward; no one's trying to blow up rats here. The point is that this is pretty amazing and relatively low-tech, a way to do something that would be really hard to figure out how to do otherwise, using shaping and chaining — and, of course, using reinforcers.

So let's talk a little bit about these outcomes, starting with reinforcers. Reinforcers have been categorized as primary or secondary. Primary reinforcers are things that satisfy a basic biological need or drive — Hull's drive reduction theory; don't worry too much about that theory. Food, for example, or sex — these are primary reinforcers. Secondary reinforcers might be able to be traded for primary reinforcers, but they don't actually satisfy any biological need in and of themselves — like money, or tokens you can trade in later for snacks or whatever. You can't eat it; it's just a symbol, so to speak. So reinforcers have their different effects, and of course how effective a given reinforcer is depends on the state of the animal: how hungry are they? Food's not going to work if an animal is gorged; water will work great if an animal is very thirsty. And maybe it's not surprising to learn that many animals will press more for cocaine than for food.

So different reinforcers, not surprisingly, have their different effects, and these effects can differ depending on the state of the animal. For example, what's shown here is how much a baby will suck for different types of reinforcer: plain water — that's a primary reinforcer if you're thirsty — and sweetened water — that's a reinforcer too, and maybe a little yummier. If we look at what's going on in the graph on the lower right: there are three groups here. One has sweet water for both sessions — that's the top, darker green line; in the first session they suck a lot, and in the second session they suck a lot. Another group has plain water both times — that's the lower green line; in the first session they suck a little less, and in the second session they suck a little less. Why? Plain water is not quite as yummy as sweet water. But the important group is the red one: they got sweet water for the first session and then plain water in the second session. Obviously their first session looks just like the sweet-water-the-whole-time group, because they both just got sweet water. But in the second session, when they got plain water, they do not look like the plain-water group's second session. Why? Because of their history: they had sweet water, and that was delicious, and now this plain water is a letdown. So the point is that you can't ever think of a reinforcer as having one fixed value; it totally depends on the organism's history and the current state of the organism. And the other point, of course, is the distinction between primary and secondary reinforcers.

The other big category of outcomes, punishers, is, like I said before, any outcome that decreases the frequency of the behavior that happened right before it. Initially, Skinner and Thorndike noted, based on their observations, that punishment did not seem to be as effective at controlling behavior as reinforcement. Their view, based on a lot of careful observation, was that punishment obviously works, but in sum it wasn't as effective as reinforcement. With further investigation, we now know punishment actually can be very effective, but it is trickier to use than reinforcement. Probably Skinner and Thorndike didn't fully appreciate that how you use it matters a lot more with punishment, and there are a lot more things you have to be careful about with punishment than with reinforcement. In that sense reinforcement is better, but if you are aware of all these things and careful, punishment can be very effective.

So here are four examples I'd like you to know of things that make punishment a little trickier to use, because they can act as barriers to it actually being effective. The first one, cheating, has to do with the fact that it can be really hard to keep a contingency constantly in effect. For example, if we want to keep people from speeding, we want to punish that behavior, which we do by giving them fines or tickets or whatever. The problem is you can't always catch every speeder, so the contingency can't always be in effect. And the problem is that people — animals altogether, really — learn very subtle little discriminative stimuli, even ones you're not aware of, that tell them when a contingency is in effect and when it's not; they will find them and learn them. For example, humans know that if there are no police around — a big stretch with no U-turns, no little turnarounds, nowhere for a cop to hide — then speeding is not punished, and they can kind of cheat the contingency.
If you know that when you hit your sibling while your parent is in a bad mood you're going to get punished — great, that will reduce hitting, hopefully. But if you can learn that when your dad is a little happier than normal, hitting doesn't get you punished, then you've found a way to cheat the contingency, and hitting behavior isn't decreased, which is the whole point of the punishment here. So for effective punishment, it's really important that the contingency is always in effect, or animals will find ways to learn when it is and when it isn't, take advantage of that, and do the behavior at the times when they see it won't be punished. And sometimes keeping a contingency always in effect is really hard.

The second one here, called concurrent reinforcement, is the idea that you might not be aware of all the punishers and reinforcers that are happening in a complex environment. For example, talking in class: you want to decrease that behavior, you want to punish it, and it may be punished by the teacher — you get sent to the principal's office or whatever — but concurrently it might be rewarded by classmates: they might laugh at your jokes or think you're cool or whatever. It's just an example of how it's not always clear what all the consequences of a behavior are, so even though you may be punishing a behavior, there may be other outcomes, especially in complex environments, that are working against you by rewarding that behavior.

Number three: punishment leads to more variable behavior. This makes sense, but I probably wouldn't have thought of it if it hadn't been pointed out to me. Reinforcement increases the frequency of a very specific behavior: one specific behavior is rewarded, and what does that mean will happen in the future? More of that specific behavior. Punishment, however, takes one specific behavior and decreases its probability, and the problem is that it's less predictable what's going to happen in the future. All I did was make one behavior less likely; what takes its place? Who knows — maybe it's going to be something worse. We really have no control over that, or any idea what it might be, because all punishment does is remove something from the future. Reinforcement creates a future, or reduces the uncertainty about what's going to happen in the future, because you know there's going to be more of this one predictable, certain thing. Punishment reduces the predictability of the future, because you take something that has been happening and remove it; what takes its place is anyone's guess.

And finally, initial intensity matters: for punishment to work, you really need to start strong. So for a first speeding offense, someone should be fined $500, or have their license taken away for a month, or something — if you want punishment to work, this is what the science tells us. Unfortunately, that often violates our sense of fairness. It makes more sense to us to say, wow, this was your first time, so I'll take it easy on you — you get three strikes. And that may make more sense from a fairness, ethical standpoint, but in terms of controlling behavior — reducing some behavior you don't want to happen, using punishment — that is not the most effective way to do it. So there's a conflict there between what works and what feels fair.
Of course, another problem with punishment, which probably all of you have thought of, is the emotional baggage. A lot of times you have to impose something painful or unpleasant on another organism, whether it's a rat or your child or whatever, and there are emotional consequences to this that throw all sorts of complicated, unpredictable monkey wrenches into play here. I don't have to tell you about the questionable nature of spanking or whipping kids: it can produce fear, anxiety, and rage that can obviously cause problems down the road for people, but it can also literally impair the effectiveness of the punishment. If you're trying to decrease some behavior by punishing it with positive punishment — spanking, for example — that can impair behavior in ways beyond just reducing the frequency of the specific thing you're trying to get rid of; it can produce generalized behavior disruptions. So, I don't know, maybe you stop someone from peeing on the floor, but now they're banging their head on the wall. Also, organisms are hardwired to respond to being attacked or insulted, so punishment can produce aggression — another example of emotion coming into play and messing up your cold, hard, calculated efforts to reduce the frequency of temper tantrums or whatever. Punishment can cause aggression, which is certainly a disruptive behavior. And therapists still disagree over the utility of punishment in therapy and in child-rearing, I would say. This is a discussion topic — you guys probably have as much or more valuable perspective on this as I do, some of you — and we can talk about it in the Zoom meeting. I would say clearly reinforcement is preferable, all things being equal, but don't be fooled into thinking punishment never works and shouldn't be used; it can be very effective.

All right, so here's just an interim summary. As usual, I suggest you pause this, read through it, and go "yeah, yeah, okay, okay — wait, what?" I won't go through these points; it would just be redundant. And the second half of the interim summary is shown here.