Transcript for:
Timing and Schedules of Reinforcement in Operant Conditioning

All right, so we've looked at the S, the R, and the O, and mentioned some things that influence how well operant conditioning works, like the history of the animal, whether you use punishment or reinforcement, etc. There's some more stuff that affects the effectiveness of operant conditioning. Timing is a big one that people can study in the lab: you can control it very well and study its effects on behavior, as well as the relationship between the behavior and the consequence, and that gets into schedules of reinforcement, which we'll also talk about. So, timing. It's pretty straightforward; it's the same idea as timing when we discussed it in terms of classical conditioning: the closer in time the behavior and the consequence, the better the learning, in general (not necessarily, it depends on the stimuli and the organism). Here's just some data, I don't know if we already saw this graph or not, showing how much of an increase in responding to the stimulus you get, how much responding to the S, which is what cumulative responses is measuring. It's not rate here, this is cumulative, so each time there's a response the line ticks up; it's never gonna go back down, it's just adding up responses, and the steeper it goes up, the faster those responses are happening. Anyway, if you've got the light on, lever press, pellet comes out right away with no delay, that's a zero-second delay, and as you can see the increase in responding is faster in the zero-second delay than in the four-second delay. As time goes by the responding is increasing (the rate actually looks pretty steady, but it's definitely happening as time goes by). With a four-second delay, and especially a ten-second delay, you get very little responding: after an hour it looks like maybe there have been 10 lever presses. So one thing to know about timing is that it's helpful: learning is faster if the consequence, the outcome, comes right after the behavior, the R, the response. However, that is not always the case.
Sometimes an animal has to press the lever a bunch of times to get an outcome, a pellet; other times an animal has to maybe wait a certain amount of time. Those are ratio or interval schedules of reinforcement, and we'll get into those. Just to back up: in operant conditioning, the contingency that's learned is "if S, then R leads to O," right? If the light is on, then a response, the lever press, gets me a pellet, the outcome. Skinner reasoned that the form of this contingency would control the pattern of behavior. And what do we mean by form of the contingency? Well, like I said: if S, then five Rs get an O; or if S, then 20 Rs; or if S, then sometimes five, sometimes 20, sometimes 15. There are a bunch of different ways to set up the contingency, and these are called schedules of reinforcement. I mean, you could conceivably punish as an outcome here, but typically these are looked at with reinforcement, not punishment. For the different schedules of reinforcement you can have fixed ratio, variable ratio, fixed interval, or variable interval, and Skinner and others went to town exploring how using these different schedules affects behavior, because his whole deal was: look, if we can make psychology a science about predicting behavior, we've really got something here. I mean, who can argue with the worthwhileness of that, and also who can argue with the objectivity of it? Everyone can see and measure behavior. So that's what he's going for here; let's look at these different schedules of reinforcement a little more closely. All right, let's start with ratio schedules of reinforcement, then we'll move on to interval schedules. First, fixed ratio schedules, wherein every X Rs produces one O. That's confusing; it just means every certain number of responses, lever presses for example, produces one outcome. So for example, an FR 1 schedule is a fixed ratio where the ratio is one, so basically every single response gets a pellet. That's called continuous reinforcement, meaning it happens every time, which is pretty uncommon; it's only one of many ways. Most of the ways we'll talk about here to set up the contingencies would be called intermittent schedules of reinforcement, because not every single lever press gets a pellet. FR 2 would be a fixed ratio of two, where every second lever press gets a pellet; FR 3 means you have to press the lever one, two, three times, and then you get a pellet, and then again you have to press it one, two, three times. Okay, so these are fixed because after you get a pellet it's always the same number of presses that will get you another pellet, so the schedule is fixed; and it's ratio because it's about the number of presses that gets you the pellet, not the amount of time that's passed. Okay, here's a variable ratio, where sometimes the pellet comes after three presses, sometimes after seven presses, sometimes after one, sometimes after ten, whatever. In this particular example it looks like the first one came after three presses, then it took seven, then one, then one, then 18. This is a variable ratio because the number of lever presses, the button presses, that gets you a pellet varies from trial to trial. It is still a ratio schedule, though, because it's all about how many times you press the button; sitting around and waiting, time elapsed, has nothing to do with it. And this demonstrates a VR 6 schedule, because even though it may have never actually taken exactly six presses to get a pellet, on average it did: if you add up how many presses it took to get a pellet on each trial over the whole training session and take the average of all those, that's the number that goes after the VR. So this is a variable ratio 6 schedule demonstrated here. Okay, and then of course we can look at how these two different schedules of reinforcement affect behavior, that is, how often and at what rate the animals respond, press the button, you might say press the lever. Here's the data.
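As a quick aside, the two ratio rules just described are simple enough to sketch as counters in a few lines of code. This is a hypothetical illustration, not anything from the lecture; the function names and the choice of a uniform random draw for the variable schedule are my own assumptions.

```python
import random

def fixed_ratio(n):
    """FR n: reinforce every nth response since the last reinforcer."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True   # pellet delivered
        return False
    return respond

def variable_ratio(mean_n):
    """VR n: reinforce after a random number of responses that
    averages mean_n over many trials."""
    required = random.randint(1, 2 * mean_n - 1)  # mean is mean_n
    count = 0
    def respond():
        nonlocal required, count
        count += 1
        if count >= required:
            count = 0
            required = random.randint(1, 2 * mean_n - 1)
            return True
        return False
    return respond

fr3 = fixed_ratio(3)
# presses 1 and 2 give nothing, press 3 gives a pellet, and so on
print([fr3() for _ in range(6)])  # [False, False, True, False, False, True]
```

The FR output is perfectly predictable, which is exactly the predictability the pauses in the behavioral data get attributed to; a `variable_ratio(5)` object pays off just as often in the long run but on any given press you can't know.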
These are cumulative responses, right? So for the variable ratio 5 schedule, if you look at the graph, those orange arrows are the time points at which a reinforcer was administered, and you can see that the variable ratio 5 is a line going steadily, and pretty steeply, up over time: basically every time the animal presses the button it ticks up, and it's ticking up at a steady, pretty rapid rate, without much of any pauses. Now, over the whole course of the sessions they're gonna get the same number of reinforcers in both cases, but in one case, the variable ratio 5 schedule, they respond their brains out. For the fixed ratio 5 schedule, which means that every five presses they get a pellet, the rate of responding, the pattern of responding, is very different. First of all, it's not steady: they take a little pause after each reinforcer. Why don't they do this with variable ratio? I don't know. If you just showed me the fixed ratio I'd say, you know, they're pausing to eat their pellet, but for some reason they don't do that with variable ratio; they just respond steadily through the whole thing. And not only that, another difference is that even if they didn't pause, the variable ratio line is steeper, which indicates a faster rate of responding. So it's just kind of interesting; again, breakthroughs in science often are about ways of visualizing data that make you go, what does that mean? And this is just a good example of that. It's not necessarily intuitive that you get pauses with a fixed ratio but not with a variable ratio; I don't think it's even intuitive that a variable ratio would necessarily result in a higher rate of overall responding, but it does. Why? What does it mean? Can you think of real-life examples of these two different schedules? Well, a variable ratio, that's pretty easy, right? Slot machines, gambling, it's all variable ratio behind
those: every however many pulls you might hit the jackpot, but you never know how many it's gonna be before you win. And fixed ratio, what's a good example of a fixed ratio? I don't know, for every ten questions you complete you finish a homework, or something like that. I don't know if that's a bad example, but you can think of a fixed ratio schedule of reinforcement in the real world on your own. Here's just a slide demonstrating fixed ratio and variable ratio again. At the top, every three presses you get a cookie; that would be an FR 3 schedule. Below that you have a variable ratio: the first time it took four presses, the next time two presses, maybe the next time three, then six, and so on; it will average out to something at the end. In this case there's only two trials, but it would average out to three presses, so that would be a VR 3. Interval schedules are different, okay, because it's not about the number of presses that happens; it's about the amount of time that has passed since the last reinforcement. So an FI 5-minute schedule, for example, that would be a fixed interval 5-minute, and it would mean that after you get a cookie the lever just doesn't work for five minutes, and then once five minutes has passed, the first press after that five-minute period gets you another cookie, and then you have to wait another five minutes. The variable interval is the same idea, except it's not always five minutes: sometimes you have to wait one minute, sometimes nine minutes, sometimes three, and so on; it would average out to five if it were a VI 5. So for interval schedules of reinforcement it's not about the number of presses, it's about having to wait a certain amount of time before the button, the lever, even works, and how much time can change from trial to trial if it's a variable schedule, or it can be the same number on each trial if it's a fixed interval schedule. So let's look at how responding differs.
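In the same spirit as before, the interval rule can be sketched too; here the gate is elapsed time rather than a press count. Again this is a hypothetical sketch of my own (the function names and the simulated clock are assumptions, not anything shown in class).

```python
def fixed_interval(interval, clock):
    """FI schedule: the first response at least `interval` time units
    after the last reinforcer earns one; earlier responses do nothing.
    `clock` is any zero-argument function returning the current time."""
    last = clock()
    def respond():
        nonlocal last
        now = clock()
        if now - last >= interval:
            last = now
            return True   # cookie delivered
        return False      # the lever "doesn't work" yet
    return respond

# A simulated clock keeps the example deterministic.
t = 0
fi5 = fixed_interval(5, lambda: t)
results = []
for t in [1, 3, 5, 6, 9, 11]:   # press at these time points
    results.append(fi5())
print(results)  # [False, False, True, False, False, True]
```

Only the presses at t=5 and t=11 pay off; all the pressing in between is wasted effort, which is why the scalloped slow-down right after each reinforcer costs the animal nothing. A variable interval version would just redraw `interval` at random after each reinforcer, the same way the VR sketch redraws its press count.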
For the two interval schedules of reinforcement, fixed and variable, we see a similar pattern, in that the variable schedule results in a steady rate of responding, whereas with the fixed schedule you get these kind of pauses. One difference to note here, though: the green line and the purple line, the overall slopes of those are about the same, so you get about the same rate of responding with an interval schedule whether it's variable or fixed. The pattern is different, right? The fixed interval schedule, where after every 10 seconds the first press gets you a cookie, then it doesn't work for 10 seconds, etc., gives you this scalloped pattern, which is kind of unique, those curves. It's kind of like they get the cookie, or the pellet, and then they stop for a second, and then they start slowly responding, and they speed up and speed up and speed up, up until they hit the 10 seconds, they get the pellet, and then they stop for a second and slowly ramp up again. This is the only schedule of reinforcement, fixed interval, where we see that kind of unsteady rate of responding that starts slow and gets faster and faster. Again, why is that exactly? It's not entirely clear. It's not like these animals are figuring things out and responding based on the kind of reasoning you would apply to explain why these responding patterns look like this; these are just kind of automatic learning mechanisms that shape behavior, because they don't have the same conscious reasoning experience that we do, presumably (and, you know, you can argue with me about that). Note, for these interval schedules, responding before the interval has elapsed has no consequence, and those bullets just describe the scalloped pattern you see for fixed interval. So let's look at them all together: here are all the patterns of responding put together on one slide, for fixed ratio, fixed interval, variable
ratio, and variable interval. As you can see, the variable ratio schedule produces the highest rate of responding; that is, it's the steepest line. The one over on the far right does have some little pauses in it, but whatever, they're minimal. Unlike fixed ratio and fixed interval, where there's some predictability to it and, for some reason, the functions show responding pauses after each reinforcer, with the variable schedules it's kind of steady throughout. So again, the variable ratio results in the highest rate of responding. If you want to maximize responding, which is exactly what people who own casinos want to do (I want to literally just maximize the number of times you put a quarter in my slot machine), then it's a no-brainer to use a variable ratio schedule of reinforcement: it has to do with how many quarters you put in, how many times you pull the lever, but you never know how many is going to result in a reward. That is, across many species and very reliably, the schedule of reinforcement that produces the highest rate of responding. All right, so as you're probably aware, behaviorists get way into studying this kind of stuff. Anyone who's taken a class with Dr.
Bullock knows this; he's a very accomplished and excellent scientist and a behaviorist. And they do things like discover the matching law. Say you've got two concurrently available variable interval schedules of reinforcement, for example a VI 2-minute schedule for smiling and a VI 4-minute schedule for note-taking. That basically means if you smile you get praise, a reward, a gold star, whatever, some kind of reinforcement; and then you can smile all you want and it won't do anything until two minutes have passed, and then your first smile after that gets you another gold star. That's a VI 2-minute for smiling. The VI 4-minute for note-taking is the same deal: you get a reward for taking notes, and then you don't get anything for the next four minutes, but the first note you take after four minutes gets you another reward. And then you look at the relative frequency of these two behaviors as a function of their different schedules of reinforcement, and the matching law basically says that the response rates to two concurrent VI schedules correspond to the rates of reinforcement for each schedule. In other words, in that example, VI 2 for smiling and VI 4 for note-taking, you basically get rewarded twice as much for smiling as you do for note-taking, right, because you only have to wait half as long to get reinforced. So how does this affect the relative amount of smiling and note-taking? In general, the amount of smiling will be twice the amount of note-taking, because the rate of reinforcement is twice that. So the point here is that the ratio of responding matches the ratio of reinforcement; that's basically what the matching law of choice behavior says. And that graph on the right, which is a little bit confusing because you have to read the axes and understand them: the vertical y-axis is the rate of responding on A, which is one of the two tasks, let's say smiling, in percent, and then it's shown as a function
of the rate of reinforcement, in percent. So the point is, if you're reinforced, you know, 80 percent of the time for smiling and only 20 percent of the time for note-taking, you're gonna see 80 percent of the time spent smiling and 20 percent of the time spent note-taking. The gold line is just a perfect diagonal showing what you'd see if the rate of behavior perfectly matched the relative rate of reinforcement for that behavior, and the green line is what you actually see, and the point is they're almost the same, showing that the relative proportion of time spent on a given behavior matches the relative rate of reinforcement that is in effect for that behavior. So this whole field of studying behavior when there are multiple schedules of reinforcement, maybe with different reinforcers, simultaneously available, and looking at how organisms allocate their limited resources, their time and energy, to certain behaviors depending on the schedules of reinforcement and the types of reinforcer, this is called behavioral economics. Right, so how do we spend our behavioral capital, where do we spend our time, what do we do? Another thing they talk about is the bliss point. This is where you've got two alternative reinforcers available, and an organism can allocate their resources all to one, all to the other, or any ratio, any combination, you know, half to one and half to the other, or 80% to one and 20% to the other, and you can look at an individual organism and get an idea of the relative subjective values they assign to these outcomes by looking at their bliss point. So let's look at the graphs on the bottom, because this is a little complicated. Let's say you've got $100 a week to spend; these are your resources. This is just an example: it could be your time, right, you've got 24 hours in a day, how are you going to spend them? But in this example you've got a hundred dollars a week. How are you going to spend it?
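One quick aside before the spending example: that matching-law prediction from a minute ago is really just one line of arithmetic. A hypothetical sketch (the function name and the rate numbers are mine, chosen to match the smiling/note-taking example):

```python
def matching_share(rate_a, rate_b):
    """Matching law: the proportion of responding allocated to behavior A
    equals A's share of the total reinforcement rate."""
    return rate_a / (rate_a + rate_b)

# VI 2-min smiling -> up to 1/2 reinforcer per minute;
# VI 4-min note-taking -> up to 1/4 reinforcer per minute.
share = matching_share(1 / 2, 1 / 4)
print(round(share, 3))  # 0.667, i.e. twice as much smiling as note-taking
```

Two thirds smiling to one third note-taking is exactly the 2:1 ratio of the two reinforcement rates, which is the whole content of the law.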
As we know, there are really a million things we could spend our money on, but in this tiny little lab example here you've got two possibilities: dinners are $20 each and albums are $10 each, and that purple line basically shows every possible way you can allocate your $100. At the top left, if you don't give a crap about albums but you love dinners, you could get five dinners and no albums; or you could get four dinners and two albums, or three dinners and four albums, or two dinners and six albums, or one dinner and eight albums; or if you just don't care about food and really love albums, you could get ten albums and no dinners. (Presumably we mean dinners out here; I assume you're still eating.) Any individual, if we put them in this situation, and again this is just an example, typically this is done in a laboratory with animals and multiple schedules of reinforcement for different reinforcers, every animal, including humans, is gonna fall somewhere on this line, and where they fall is that individual's bliss point. So I might choose all albums, you might choose half albums and half dinners, doesn't matter; the point is everyone's got their individual bliss point, which reflects how much they subjectively value the two reinforcers. So it's kind of interesting: you can figure out whether rats prefer carrots or broccoli. And of course the line shifts if you change the cost of the two outcomes: on the left dinners are twenty dollars each, on the right they're fifty dollars each, while the price of albums hasn't changed. That changes the possible ways you can allocate, and of course it can also change your bliss point. So this is the kind of stuff behavioral economists do, and behavioral economics is probably at the cutting edge of behaviorism out there; I think they're doing a lot of really interesting stuff.
If you're interested in this, Dr. Bullock can make why it's important much clearer than I can; it's really cool stuff, but this is the kind of stuff that they do. All right, so what do you need to know? What behavioral economics is, roughly; again, it's defined right at the top of the slide. And the bliss point: for a given individual organism, it is the way they allocate their resources to two or more alternative reinforcers, and it reflects the relative subjective value they give to each of those reinforcers. Someone who's at the bottom right of either of these curves values albums a lot more than dinners. Here's a couple more examples of the kinds of things behavioral economists have discovered. The Premack principle says that if a high-frequency activity, and by high frequency we just mean a behavior an organism does more often just on its own, of its own volition, for whatever reason, like watching TV (if we all won, you know, millions of dollars, we'd all just watch TV all day every day, because we know it's high frequency... maybe not, but you get the point), so if a high-frequency, presumably more desirable (but we're talking about behaviorism, so we're not getting into desirable; high frequency just means it's done a lot) and thus more often engaged-in activity like watching TV is made contingent upon a lower-frequency one, one that is not done as much, it can be used as a reinforcer to increase the amount of time spent on the lower-frequency one. It basically just means you don't have to worry about what an animal likes or doesn't like: if there's something an organism, including the human animal, does a lot, and you say, all right, whatever that thing is, you can't do it unless you do this other lower-frequency thing, you can use access to that high-frequency behavior, watching TV, to reinforce and thus increase performance of the lower-frequency behavior, doing math homework. That's the Premack principle. The response
deprivation hypothesis says the relative frequency of behaviors is not the critical factor, so there it kind of disagrees with the Premack principle; a lot of times one theory will compete with another and they'll go back and forth. It says that an activity can be used as a reinforcer, a reward, even if it is relatively low frequency, if you just deny access to it. So the point of this one is, look, they're kind of saying you could even use doing math homework as a reinforcer, even though the organism, that is, an 11-year-old in its natural habitat, that is, my house, will hardly do any math homework on its own; it is a low-frequency activity. If you restrict access to it, and I say, that's it, you can't do math homework at all, then after a few months of no math homework I could even use it as a reinforcer. That doesn't hold any water in this case, because if I said you don't have to do math homework anymore for the rest of your life, there would be nothing but dancing, and zero-frequency math homework would occur. But the point of the response deprivation hypothesis is that if you just withhold access to a behavior, a response, even if it's not a particularly high-frequency one, you can use it as a reinforcer by allowing access to it. For example, maybe this is a little more realistic: if I forbid you from eating toast for a month, you will work to have toast, even though normally you're like, toast isn't all that great, I don't care about it, I don't eat it that much. If I forbid you from having toast, it becomes more desirable, basically, is the idea, for whatever reason; and again, behaviorists don't care about what's going on inside your head, they just care about the effects of contingencies on behavior. Behavioral economics: how can I control how you allocate and spend your time, allocate resources to certain behaviors? How can I shape your behavior to guide you into this or that specific act? Okay, here's the interim summary; again, pause it and read through this, and now
let's look at the brain substrates just a little bit, some of what we've learned about which brain regions support operant conditioning and different aspects of operant conditioning. You will recall the basal ganglia, a set of subcortical structures that link sensory and motor cortices, among other things; they are important for movement, and they consist of (it doesn't even list them here, but it's) the caudate, the putamen, and the globus pallidus. That kind of curly blue thing is the caudate, and then there are some nuclei being pointed to as dorsal striatum in the figure; underneath the curly thing, the putamen and the globus pallidus are kind of near each other down there. What you need to know is that they're subcortical, right, these are deeper in the brain, beneath the cortex, kind of on either side of the thalamus, and you may recall they're important for initiating voluntary movement, among a bunch of other things. Without worrying too much about the details, it has been found that these structures seem to be important in S-R learning, stimulus-response learning, in particular, as part of operant conditioning. For example, the Protestant ethic effect, you'll remember, refers to the fact that once an organism learns an S-R-O association or contingency, for example red light means press, I get a pellet, right, red light press, red light press, then even if you give them a pile of pellets they'll still keep responding when the red light turns on. That's the Protestant ethic effect: they keep working even though they've already got the outcome and don't need any more, because they've learned this kind of automatic light-press, light-press, light-press; that's the S-R association. Lesions to the dorsal striatum result in a preserved ability to learn simple response-outcome associations, like just pressing in any case means food: press, food, press, food, response-outcome. But if you lesion this part of the brain and then you start adding discriminative
stimuli, like red light means press gets food, green light means press gets no food, learning is impaired. So this is why we think the dorsal striatum is important for that S-R association, the stimulus-response, the light-press, light-press: because when we introduce changes to that contingency, like okay, now we're gonna make red light mean don't press and green light mean do press, that sort of new learning about S-R relationships, light-press relationships, is impaired if the dorsal striatum is lesioned, and thus we surmise that part of the brain is important for learning particularly those S-R associations, stimulus-response associations. On the other hand, a different brain region, the orbitofrontal cortex, seems to be more important for the R-O, the response-outcome piece of the S-R-O chain that is learned in operant conditioning. The orbitofrontal cortex, shown in that little green zone there, is not a subcortical region; it's an actual part of the cortex that sits right above the eyes ("orbital" refers to the eyes). So that's the orbitofrontal cortex. If we lesion that part of the brain, animals can still learn S1-R-reward, S2-R-no-reward contingencies, like, okay, green light means press: you've got a green light, you press, you get a pellet; red light, press, you get no pellet. But if the outcomes are then switched, so green light, press, no pellet; red light, press, yes pellet, learning is impaired. The striatal-dependent stimulus-response associations still seem to be intact, they still have the light-press, light-no-press learning, but the connection between the presses and the rewards in those two cases cannot be relearned. So new learning about the response-outcome contingencies is impaired, and thus we think it might involve, or require, or depend on processing in the orbitofrontal cortex. Also, if you use single-unit recording to monitor the activity of neurons in that part of the brain during a delay, right, neurons there fire differently depending on the identity of an outcome and whether
it is a reward or a punishment. So if you see S-R, red light, press, and then there's a delay before the outcome happens, activity in the orbitofrontal cortex is different depending on whether the outcome is a cookie or an ice cream cone or a shock or whatever, again supporting the hypothesis that this part of the brain is important for learning about the R-O piece of the S-R-O chain. All right, another piece of understanding how operant conditioning happens in the brain has to do with, you know, what is pleasure in the brain of an animal anyway? What makes it want to do something? One thing we know is that dopamine is involved in what is often called the reward system. There are midbrain structures, indicated in this drawing as those two little blue dots in the midbrain, which is right on top of the brain stem there, right above that bulge which is the pons. The ventral tegmental area, or VTA, and the substantia nigra pars compacta, or SNc, are two midbrain nuclei that project dopaminergic innervation to other parts of the brain. For example, the ventral tegmental area sends, you know, axons literally up into the frontal cortex, and at the ends of those axons the terminal buttons release dopamine into various other areas. We know this circuit is important for reward because rats will respond for electrical stimulation of the VTA until they literally collapse from exhaustion. If they've got food available on one lever, and on the other lever a press gets a little electrical stimulation of the VTA, which results in the release of dopamine to these distributed areas in the frontal cortex, they will ignore the food lever and press for that dopaminergic innervation to their frontal cortex until they pass out and starve. So this is a very strong reward circuit. Right, we know there are reward circuits in place for eating, right, and sex; organisms evolved mechanisms in their brain to drive them to do
these things that they must do in order to survive and reproduce, because really their behavior is shaped by their DNA, their genes, which have the goal of reproducing themselves, and you can kind of co-opt these drive-reward circuits, as demonstrated here, by simply administering electricity into the VTA. So this is one way we know dopamine is important for drive and reward circuits. Okay, but let's dig a little deeper. This dopaminergic innervation going to the orbitofrontal cortex from the ventral tegmental area, and to the basal ganglia from the substantia nigra (those are the two midbrain nuclei that send dopamine to other brain areas), is clearly important for motivating animals to act. But I think I said on the previous slide that rats love stimulation of the ventral tegmental area; that's based purely on seeing that they will respond until they pass out to get stimulation of the ventral tegmental area. Does that mean they would subjectively judge it as pleasing? Hard to tell with a rat, but you can do experiments that actually separate out this kind of wanting, motivating stimulation, which is what we actually see with this dopaminergic circuit, versus liking, that is, the ability to judge that something is more pleasing than something else. So: drugs that interfere with the dopamine system we just talked about disrupt instrumental conditioning; that is, the animals are not motivated to pursue reinforcers. This graph on the lower right just shows an example. If you give a dopaminergic antagonist (that's what pimozide is, a drug that blocks the action of dopamine) and then you try to condition an animal, get it to respond for food or maintain a reasonable rate of responding for food, that responding falls off. Does it still need food? Sure, it's still an animal, but we've interfered with the neural hardware that motivates organisms to pursue rewards, and as you can see here, the rate of responding, or decrease in
responding rather, after some conditioning: basically, when you give an animal pimozide, which blocks the action of dopamine, it's as if you're looking at extinction. Their behavior, their rate of responding, lever presses, falls off the same way as in a group who aren't on any drug but aren't getting reinforced at all. So the red line, the pimozide group, is still getting reinforced, but they're on a drug that blocks this motivating reward circuit, the dopamine circuit, and you can really see that the drug interferes with the normal effects of reinforcement: it's as if they're not being reinforced anymore; they look just like a group that isn't being reinforced. But if you look closer, animals don't seem to enjoy VTA stimulation. It's not like ecstasy; they're anxious, aroused, motivated, anticipatory, right? They need it, they want it, but do they enjoy it, like it? That's another question. It may actually be that there are other circuits more involved in liking. Even when these animals (or humans, for that matter) have the dopaminergic circuit blocked and are not motivated to seek rewards, they're still able to judge the relative pleasingness of certain things, suggesting that yes, we've interfered with the motivation here, but we have not impaired their ability to judge or understand goodness, relative pleasure. That's why we say that whereas dopamine may be involved in wanting, endogenous opioids, which we haven't mentioned yet, may be involved in liking. Okay, so actual pleasure may involve a different circuit, and notice the word opioid in there; obviously people are also motivated to stimulate opioid receptors. Endogenous opioids, you know, if you get a runner's high, or you suffer a broken leg or something, this system kicks into gear to keep you from feeling terrible; basically it's a natural sort of pleasure circuit that's built in. And drugs that affect opiate receptors affect the hedonic value of primary reinforcers like food and pain, that is, the
That is, the subjective, pleasurable experience of outcomes and stimuli is affected by messing with the opioid system, whereas messing with the dopamine system affects the motivation and drive to obtain reinforcers. That's why this slide is set up as wanting versus liking: the dopamine system may be more involved in motivating, wanting, whereas the opioid system is more involved in liking, goodness judgments, pleasure. All right, so here's the brain substrates recap slide. There are a couple of things on here I should probably mention, because they might be a little confusing if I don't. The basal ganglia seem to be important for S-R associations, stimulus-response, and these are the kinds of strong, automatic associations you see when an animal keeps responding to the stimulus even when the connection to the goal seems a little shaky. For example, a rat learns a maze: the visual cues of the maze are the stimulus, running the maze is the response, and the outcome is getting the food at the end. You can actually put a pile of food along the rat's path, and it will walk right over the pile to get to the food at the end of the maze. That's the strength of the S-R relationship, and that's what the basal ganglia have been associated with. The orbitofrontal cortex, on the other hand, has been associated more with learning response-outcome associations: if you lesion that part of the brain and change the response-outcome contingencies, you see impaired learning. Then the dopaminergic innervation to the frontal cortex from the ventral tegmental area and to the striatum from the substantia nigra is associated with wanting, while liking is associated with opioid innervation, which isn't pictured here because it's kind of all over the central nervous system. A relative decrease in opioid release for an animal that has been getting sugar water but now only gets regular water may explain the negative contrast effect. There was a slide a while back showing that babies will suck more for sugar water than for regular water, but if they get regular water in session one and regular water in session two, they'll suck about the same amount in both sessions. If they get sugar water in session one and regular water in session two, they'll suck a lot less in session two than the group that got regular water both times. That's the negative contrast effect; I don't think I labeled it on that slide, but don't worry about it. The point here is that this has been associated with a relative decrease in opioid release, meaning that if you compare, in session two, the babies (or rats, whatever) that got regular water in the first session with the ones that got sweet water in the first session, having gotten sweet water before causes the regular water to produce a smaller opioid release in the brain than if they had never had sweet water. The point being, the neurophysiological underpinning of feeling like "this sucks, I had sweet water and now this regular water sucks, I don't want it" is that the regular water literally results in a lesser release of endogenous opioids in the cortex.
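One way to sketch that negative contrast idea in code (purely illustrative; the linear form, the baseline, and the numbers are my own assumptions, not data from the lecture): treat opioid release as a baseline plus the difference between the current reward's value and the value the previous session led the animal to expect.

```python
# Toy sketch: opioid release as baseline plus the gap between the
# current reward's value and the expected value from last session.
def opioid_release(current_value, expected_value, baseline=1.0, gain=0.5):
    # release drops below baseline when the outcome is worse than expected
    return max(0.0, baseline + gain * (current_value - expected_value))

# arbitrary units: plain water = 1.0, sugar water = 2.0
plain_after_plain = opioid_release(current_value=1.0, expected_value=1.0)
plain_after_sweet = opioid_release(current_value=1.0, expected_value=2.0)

# Same water, smaller release after the sweet experience:
# that is the negative contrast.
```

The design choice here is that release depends on a comparison, not on the stimulus alone, which is exactly what the sucking data suggest.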
Right, here are just a couple of clinical perspectives on operant conditioning: we'll look at schizophrenia in a little bit of depth, and then drug addiction. So, schizophrenia. We haven't talked about it, and I'm sure you've all heard of it, probably with varying levels of familiarity. It affects about 1% of the population at most, onsets usually around age 20 or so, and it's kind of the stereotypical madness: severe schizophrenia is associated with hallucinations, usually auditory but they can be full-blown, and with delusions, believing things like "aliens are trying to read my thoughts, and if I don't put this piece of chocolate up my nose at 10:30 every morning...", just bizarre stuff like that. A more clinical description would be that the symptoms are diverse. They include hallucinations, delusions, and also flattened affect, which I didn't mention: patients lack what seem like normal human emotional responses. And, not surprisingly, social impairment; if you're dealing with all these things, you're going to have impaired social interactions. It's a tragic disease, really. The neural mechanisms are not well understood, and for something this complex, honestly, that's no surprise. Consciousness is such a complex result of an incredibly sophisticated symphony of information processing, happening at just the right time, in just the right way, across so many different brain areas, the whole being so much greater than the sum of the parts, that it's not surprising that every now and then one little part of the system, or the way several parts interact, can be off and throw the whole thing out of kilter. I'm surprised it doesn't happen more often. But it's one of those diseases where we're dealing with hallucinations and delusions, really high-order strangeness. It's not that the person can't see the color red or has a specific deficit in short-term memory or language production; it's really high-order, so it's not surprising we don't understand the neural mechanisms behind it very well at all. We have noticed certain physiological differences that schizophrenics seem to exhibit, one being that the medial temporal lobes, where the hippocampi are, are abnormally shaped. You can see on the bottom there that they're not laterally balanced on the left and right, and it looks like the left one is a little smaller. The point is just that one hallmark of schizophrenia is abnormally shaped hippocampi and medial temporal lobes, and the hippocampus is important for memory (I think I've covered that, and we'll cover it more later), maybe even more specifically for the formation of associations between things, events, concepts, etc. So the possibility that these hippocampal abnormalities affect relational processing in schizophrenics is supported by data like these, from the acquired equivalence task, which is not super easy to map onto an operant conditioning paradigm, but basically it's a task where you learn to associate certain stimuli. You see a picture of a brunette, a blue fish, and a green fish, and you're asked to choose which of the two fish you think this woman would like. You pick one, and then you're told you're either correct or incorrect. The operant piece here is that, given this stimulus, if you make the correct choice you are rewarded with a "correct!" or five points or whatever. That's how it's an operant conditioning paradigm; in its full context it's really a more subtle learning paradigm. Through feedback you learn that certain women (in this example) like certain things: you learn that the brunette likes the blue fish, and then, when the blonde's face comes up, you learn that she goes with the blue fish as well. The acquired equivalence comes in like this: let's say we also learn that the brunette likes the red fish, but we never learn what the blonde likes between red and yellow. When the blonde comes up with those two choices, normal people show some transfer of equivalence between the blonde and the brunette: because both women liked the blue fish, and the brunette liked the red fish, people reason, "well, if the blonde and the brunette liked similar fish in the blue/green case, they'll probably like similar fish in the red/yellow case as well." So when they see the blonde and are asked which fish she would like, they pick the red fish, because the brunette liked the red fish and the two women have liked similar things in the past. All right, that's acquired equivalence, and schizophrenics are impaired on it. They are unable to make that transfer: "these women are similar in general, so if they're similar in this other way, I'll bet they're similar in this red-fish preference as well." So we say that schizophrenic participants are impaired at generalizing earlier acquired equivalence between stimuli (the blonde's and brunette's shared liking of blue fish over green) to new situations. This presumably has to do with hippocampal processing, that is, forming higher-order associations: not literal learned associations like "the blonde likes the red fish" (we got no feedback on that), but "because she's similar to the brunette, and the brunette likes the red fish...". That probably involves hippocampal functioning, and it's supported by the fact that schizophrenics have abnormal, smaller hippocampi and are impaired on this task. These are the data from that task. On the left you'll see that the schizophrenics learned the basic associations just as well as non-schizophrenics; that is, after a few trial-and-error trials with feedback they learned just as quickly who likes what: when you see the blonde, pick the fish she likes, and when you see the brunette, pick the fish she likes. The schizophrenics learn that just as well as anybody else, and that's shown in the left graph, during acquisition. In the bars, SZ is short for schizophrenia and HC is healthy controls, that is, normal people, and the vertical axis is showing errors. You can see that the schizophrenics make a few more errors, but the two groups are about the same. Over on the right-hand graph, the left two bars are after a retention interval, so this is basically just testing memory: you learned yesterday who liked what (and obviously in the real study there are more than just two women and two types of fish), and performance is just about the same again for schizophrenics as for normal people after the delay. So they're able to form basic, first-order associations and retain them just about as well as everybody else. You might say, wouldn't that require the hippocampus too? And, I don't know, yes, probably, though maybe not at as sophisticated a level; remember, they do still have hippocampi, they're just abnormal. Where you really see the performance decrement is in the far right two bars: the schizophrenics perform way more poorly on the transfer trials, that is, guessing whether the blonde woman would like a certain type of fish you've never gotten any feedback on, based purely on her similarity to the brunette. The brunette likes the red fish, so you might assume the blonde lady does too, because they both liked the blue fish. Schizophrenics don't make that assumption; they don't make that leap, that higher-order association. Interestingly, antipsychotics, the drugs used to treat the symptoms of schizophrenia, partially treat this impairment, so there may be some action in hippocampal circuits that alleviates (not fixes) some of the dysfunction behind the terrible things associated with schizophrenia, and also alleviates some of the association problems observed in this acquired equivalence task.
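The transfer logic of the acquired equivalence task above can be sketched as follows (a toy illustration with made-up labels, not the study's actual stimuli or analysis): learn (face, choice) → answer triples from feedback, then answer an untrained pair by borrowing from a face that agrees on some trained choice.

```python
# Trained (feedback) associations: (face, choice) -> preferred option.
learned = {
    ("brunette", "blue_vs_green"): "blue",
    ("blonde", "blue_vs_green"): "blue",
    ("brunette", "red_vs_yellow"): "red",
    # ("blonde", "red_vs_yellow") was never trained: a transfer trial.
}

def transfer_guess(face, choice, learned):
    """Answer an untrained (face, choice) pair by finding another face
    that agrees with `face` on some trained choice and borrowing its
    answer. Returning None models failing to make the higher-order leap,
    as the schizophrenic group does."""
    own = {c: a for (f, c), a in learned.items() if f == face}
    for (other, c), answer in learned.items():
        if other != face and own.get(c) == answer and (other, choice) in learned:
            return learned[(other, choice)]
    return None

print(transfer_guess("blonde", "red_vs_yellow", learned))  # -> red
```

Note that the answer "red" is never directly trained for the blonde; it comes only from the equivalence with the brunette, which is exactly the step the patients miss.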
Then the application of operant conditioning to addiction is probably even easier to make. Pathological addiction can be defined as a strong habit maintained despite harmful consequences. That's when you really need to start worrying about your addiction, because we all have mild addictions to all sorts of stuff all the time, but if it starts making your life worse, it's starting to tip into the realm of pathological addiction. In some cases, of course, it's just flat-out obvious: if someone is homeless, has lost their family, and is in poor health because they're addicted to crystal meth, that's a pathological addiction. The reason I say it's pretty easy to map addiction onto operant conditioning is that something is haywire with the normal stimulus-response-outcome wiring: the connections between the response and the outcome are not normal, or have been twisted. Seeking pleasure involves positive reinforcement; avoiding withdrawal involves negative reinforcement; it's a double whammy. Obviously, when you do your heroin, you are positively reinforced with the wonderful feeling of heroin, which comes not from endogenous but from exogenous, externally introduced opiates. We talked about how the endogenous opioid system is involved in feelings and judgments of pleasure, so not surprisingly, squirting a bunch of opiates into your blood results in a pleasurable feeling: the squirting of the heroin into the blood is reinforced. At the same time, after you've done it a few times, you start to feel withdrawal. We talked about how classical conditioning can explain withdrawal and craving and all of that, as your body tries to maintain homeostasis: you see your works, or the pack of cigarettes, or whatever, your body produces a conditioned compensatory response, which makes you crave the drug even more, and then you do it. Now doing the drug is also negatively reinforced; that is, it removes the awful feeling of craving and withdrawal. Your body is out of homeostasis and needs the drug to get back to it, so taking the drug is negatively reinforcing because it removes the unpleasant sensations of needing and craving the drug. So drug use is supported by a kind of double whammy of positive reinforcement plus negative reinforcement. Although liking a drug may help initiate addiction, the incentive salience hypothesis, which we talked about before (the dopaminergic system is important for wanting and motivation, while opioids are more involved in the sensation and experience of pleasure), suggests that addiction is maintained more by wanting the drug than by the pleasure. At first it's about seeking that positive reinforcement, but after a while it's more about seeking that negative reinforcement, and it probably involves the dopaminergic wanting system more than the pleasure system. If you interview cocaine addicts, after a while they don't even really report pleasure from a dose of crack or cocaine. They want it, it's arousing and exciting, they're motivated to get it, but is it really a pleasurable experience anymore? Not so much. The same is true for severe alcoholics: are they really happy to be drunk? Now, all addictive drugs cause the release of dopamine from the ventral tegmental area, so this circuit is certainly a major player in drug addiction. This slide just demonstrates a couple of ways different drugs affect the dopaminergic system in the brain. Here's a close-up of a synapse: you've got the terminal button of one neuron at the end of an axon, the presynaptic neuron, drawn at the top, with little vesicles full of dopamine in it. When the action potential reaches the end of the neuron, those vesicles bind with the cell membrane and dump their contents into the synapse, their contents being dopamine, which drifts across the gap and binds to receptors; channels open, the next neuron depolarizes and becomes more likely to fire, and so on. Then the dopamine drifts away, or is broken down by enzymes, or sucked back up by the presynaptic neuron, and the process can happen again. Amphetamine actually increases the amount of dopamine released into these synapses, whereas cocaine interferes with the reuptake of dopamine into the presynaptic dopaminergic neuron. In both cases the end result is that there's more dopamine floating around in the synapse, which results in increased action at dopamine receptors.
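A minimal way to see why both drugs raise synaptic dopamine even though they act at different steps (a back-of-the-envelope sketch with invented rate constants, not real pharmacokinetics): at steady state, release must balance reuptake, so the dopamine level is the release rate divided by the reuptake rate.

```python
# At steady state, release_rate = reuptake_rate * level, so the
# synaptic dopamine level (arbitrary units) is:
def steady_state_dopamine(release_rate, reuptake_rate):
    return release_rate / reuptake_rate

baseline = steady_state_dopamine(release_rate=1.0, reuptake_rate=1.0)
amphetamine = steady_state_dopamine(release_rate=3.0, reuptake_rate=1.0)  # more release
cocaine = steady_state_dopamine(release_rate=1.0, reuptake_rate=0.25)     # blocked reuptake

# Different mechanisms, same direction: more dopamine in the synapse.
```

Amphetamine raises the numerator, cocaine shrinks the denominator, and either way the level goes up, which is the point the slide is making.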
Because this increased dopamine action happens in the evolved, hardwired reward and motivation circuit that we all have, it causes a chasing of whatever behavior produced that increase in dopamine. It's co-opting the part of how nature wired us to be motivated to do anything; unfortunately, it motivates us to snort cocaine. This slide is just illustrating that, of course, addictions don't have to be to drugs. You can be addicted to behaviors: gambling, eating, sex, internet use, shopping, exercise. Does anybody know somebody who just runs too much? I know a guy who runs so much his bones are starting to fracture, but he just can't stop; he's addicted to the endorphins. There are worse addictions to have, but an addiction to anything is bad once it gets excessive, and like I said, what makes it pathological is that it starts messing with your life and making you unhappy. Drug addictions are maybe a little easier to study, because you're actually hitting the system with a particular psychoactive chemical, which affects a neurotransmitter system somehow, whereas with a behavioral addiction that's not necessarily the case. It's true that gambling causes some release of dopamine, I'm sure, but learning how these circuits work in drug addictions can help us learn how they work in behavioral addictions as well. Okay, speaking of treatments: how does any of this help us? Well, we want to try to treat schizophrenics and addicts, and drugs are one way. Naltrexone is a drug that blunts this reward circuitry and can help treat heroin addicts and compulsive gamblers by reducing that reinforcing spurt that makes it so hard to quit. Does interfering with these systems have other effects on cognition and life, side effects so to speak? Absolutely, but if heroin addiction or compulsive gambling is ruining your life, sometimes they're worth it. You can also, of course, use cognitive and behavioral non-drug therapies.
Some of these we mentioned when we talked about drug addiction in the context of classical conditioning. Extinction, right: breaking the response-outcome link. The stimulus is, I don't know, the sight of your cigarettes; the response is lighting up a cigarette; the outcome is the delivery of the nicotine. You can smoke herbal cigarettes that don't have any nicotine: stimulus, response, no outcome. There are plenty of ways to break the response-outcome link, and that's what extinction would mean in the case of operant conditioning. It might seem like a no-brainer, but thinking of it from this perspective can sometimes help you think of creative ways to break that link, or to reduce the maladaptive behavior. For example, reinforcing alternative behaviors: eat ice cream instead... well, that's a bad example, because you could get addicted to ice cream, but reinforce not smoking. If you reinforce the alternative to the maladaptive behavior, research shows that over time you will increase the relative frequency of the non-smoking behavior compared to the smoking behavior. Or delayed reinforcement: the timing research has shown that the longer the wait between the response and the outcome, the weaker the S-R-O linkage becomes. So if you have a craving to smoke and you can wait an hour before you act on it, the research suggests that over time this will weaken the stimulus-response-outcome chain that is causing so much trouble. Distancing is a term that basically just means get away from the S's, the stimuli. If the stimulus is your party buddies, the response is drinking, and the outcome is being hammered, avoid those stimuli. Again, that seems like a no-brainer, but there's power in knowing that the data really show these strategies help: even though they might seem like common sense, they are all backed up by data and based on conditioning principles that can be observed in rats in the lab. Of course, in reality you don't pick just one of these things; you basically throw everything you've got at these problems. I don't know if anyone here has done any drug counseling, but you do whatever you can that works, and while these methods may seem kind of obvious or silly, being aware of the principles and biology of conditioning, and a little bit of how we know what we know about how neural networks learn in the context of classical and operant conditioning, can make you a better addiction therapist.
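The timing point from the start of the lecture, that increasing the delay between response and outcome weakens learning, can be sketched with a simple exponential decay (my own toy form and decay constant; the real delay gradient depends on the organism and the stimuli):

```python
import math

# Toy model: the associative strength gained per reinforcement decays
# exponentially with the response-outcome delay (in seconds).
def learning_effectiveness(delay_s, decay_rate=0.4):
    return math.exp(-decay_rate * delay_s)

no_delay = learning_effectiveness(0)   # immediate pellet
four_s = learning_effectiveness(4)
ten_s = learning_effectiveness(10)

# Matches the ordering in the cumulative-response graph:
# 0 s delay > 4 s delay > 10 s delay.
```

At a 10-second delay the per-trial gain in this sketch is close to zero, which lines up qualitatively with the handful of lever presses seen after an hour in that condition.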