All right, let's get started. Today we're going to talk at some length about what I mean by this idea of Marr's computational theory level of analysis. It's a way of asking questions about mind and brain, and we're going to work through it in the case of color vision. That will take a while: we'll go down and do the demo, we'll come back and talk about color vision, how we think about it at the level of computational theory, and why that matters for mind and brain. Then, in the second half, we're going to start a whole session, which will roll into next class, on the methods we can use in cognitive neuroscience to understand the human brain. We'll illustrate those with the case of face perception: we'll talk about computational theory very briefly for face perception, what you can learn from behavioral studies, and what you can learn from functional MRI, and then we'll go on to other methods next time. Everybody with the program?

All right, so to back up a little: the biggest theme addressed in this course, the big question we're trying to understand in this field, is how does the brain give rise to the mind? That's really what we're in it for; that's why there's lots of cognitive science. We're trying to understand how the mind emerges from this physical object. For the last few lectures you've been learning some things about the physical basis of the brain, what it actually looks like. Some of you got to touch it; I hope you thought that was half as awesome as I did. We got a sense of the basic physicality of the brain and some of its major parts. But now the agenda is: how are we going to explain how this physical object gives rise to something like the mind?

The first problem you encounter is, what is a mind, anyway? I drew it as a weird, big, amorphous cloud because it's just not obvious how you think about minds. It feels like one of those things where you wonder whether you could even have a science of the mind. What is a mind? It's all kind of nervous-making. So our field of cognitive science, over the last few decades, has come up with a framework for how we can think about minds. This isn't even a theory; it's more meta than that. It's a framework for thinking about what a mind is, and the framework is the idea that the mind is a set of computations that extract representations.

Now, that's pretty abstract. You can think of a representation in your mind as anything from a percept, like "I see motion right now" or "I see color"; as you learned before, you might see motion even if there isn't actually motion in the stimulus, but that representation of motion in your head, that percept, is a kind of mental representation. Or if you're thinking, "why is Nancy going through this really basic stuff, she's insulting our intelligence," if something like that is running in the background as I'm lecturing, that's a thought, a mental representation of a sort. Or if you're thinking, "oh my god, it's after 11 and I'm not going to get to eat until 12:30, I'm going to starve," whatever thoughts are going through your head, those are mental representations too. So the question is, how do we think about those? This idea that mental processes are computations and mental contents are representations implies that ideally, in the long run, if we really understood minds, we'd be able to write the code to do everything that minds do, and that code would work, in some sense, in the same way. Now, that's a tall order.
Mostly we can't do that yet, not even close; in a few little cases in perception, kind of, sort of, maybe, but mostly we can't do that yet. But that's the goal; that's the aspiration. So the question is, how do we even get off the ground in trying to launch this enterprise of coming up with an actual, precise computational theory of what minds do? The first step is to think about what is computed and why. That is the crux of David Marr's big idea, from the brief reading assignment I gave you: he's talking about how to think about minds and brains, and step number one is, what is computed and why? So we're going to focus on that for a bit.

Let's take vision, for example. You start with a world out there that sends light into your eyes; that blue thing in the back of the diagram is my icon of a retina. The world casts an image onto the back of your eye, then some magic happens, and then you know what you're looking at. That's what we're trying to understand: what goes on in there, in a sense, what is the code that takes this as an input and delivers that as an output?

More specifically, we can ask, as we did in the last couple of lectures, about the case of visual motion. Suppose you're seeing a display like this, somebody jumping on a beach, so there's visual motion information; that's your input. What are the kinds of outputs you might get from that? Well, to understand that, we need to know what is computed and why. So what is computed? Lots of things. You might see the presence of motion. You might see the presence of a person; in fact, you can detect people just from their pattern of motion. We should have done this in the demo; write me a note to think about that next time. If we stuck little tiny LEDs on each of my joints in a totally black room and I jumped around, so all you could see was those dots moving, you would see that it was a person; it would be trivially obvious. So motion can give you lots of information beyond "something is moving" and "what direction is it moving": you can see that someone is jumping, and you can infer something about the health of this person, or even their mood. There's a huge range of kinds of information we glean from even a pretty simple stimulus attribute like motion.

So if we're going to understand how we perceive motion, we first need to get organized about what the input is and which of those outputs we're talking about, because the code that goes on in between, in your head, or in a computer program if you ever figured out how to write one, will probably be quite different for each of those things. That's the way you need to be thinking about minds: what are the inputs, what are the outputs? And as soon as you pose that challenge, say it's just moving dots and you're trying to tell whether that's a person, think about what code you would write. From just those moving dots, how the hell are you going to detect whether they're on the joints of a person who's moving around versus on something else? That's how you think about what the computational challenges are. I'm never going to ask you to write that code; we're just going to consider it as a thought exercise, to see what problem the brain is facing and solving. So Marr's big idea is this whole business of thinking about what is computed and why,
what the inputs and outputs are, and what the computational challenges are in getting from those inputs to those outputs. All of that is a prerequisite for thinking about minds or brains: we can't understand what brains are doing until we first think about this. That's why I'm carrying on about it at some length. And Marr writes so beautifully that I'm just going to read some of my favorite paragraphs, because paraphrasing beautiful prose is a sin. Marr says: trying to understand perception by studying only neurons is like trying to understand bird flight by studying only feathers; it just cannot be done. To understand bird flight you need to understand aerodynamics; only then can one make sense of the structure of feathers and the shape of wings. Similarly, you can't reach an understanding of why neurons in the visual system behave the way they do just by studying their anatomy and physiology; you have to understand the problem that's being solved. Further, he says the nature of the computations that underlie perception depends more on the computational problems that have to be solved than on the particular hardware in which their solutions are implemented. He's basically saying we could have a theory of any aspect of perception that would be essentially the same theory whether you write it in code and put it in a computer or whether it's implemented in a brain. Marr was many things: a visionary, a visionary who studied vision, a truly brilliant guy with a very strong engineering background. And this now pervades the whole field of cognitive science: people take an engineering approach to understanding minds and brains, to try to really understand how they work.

So, to better understand this, we're now going to consider the case of color vision. In this case we start with color in the world, which sends images onto the back of your retina; some magic happens; and we get a bunch of information out. The question we're going to consider is, what do we use color for? And we're going to use the same strategy we used in the Edgerton Center: trying to understand some of the things we use color for by experiencing perception without color. What are the outputs? To do that, we're going to head over right now to the imaging center for a cool demo by Rosa. It will be faster to leave your stuff here; we'll lock the room. How long will we be? Ten minutes, something like that. And I need everyone to boogie, because there's a lot I want to get through today. Let's go.

All right, so what do we use color for, when we have it? It's not a trick question; it's supposed to be really obvious now. Yes, what's your name? Chardon? Hi. Choosing what to eat, yeah. What else, related to that but different? What did you notice that you could identify better? Besides identifying and choosing, what else, much more generally? Bringing things into our awareness, like the reds in particular, the strawberries. Did you find them easier to find? No, much harder without that light, exactly. What else? Driving: you need color to read the traffic lights. Totally; that's a modern invention, but a really important one. What else, very generally, do we use color for? We use it to figure out what to eat,
because one of those strawberries isn't actually a strawberry, so you use color for that too. And the bananas: did anybody notice it's sometimes hard to tell? Say more. Totally. Did you feel like people's faces looked a little sickly? Absolutely.

So this is just to show you that a lot of computational theory starts with common sense, just reasoning about what we use this stuff for; it helps to not have it, just to reveal what we use it for. You have just reinvented the key insights of the early field of color vision. The standard story is that color helps us find fruit. If you ask yourself how many berries are here, take a moment, get a mental tally. OK, ready? Now how many berries? You see more. In fact, there's a long literature showing that primates who have three cone types (we're not going to go through the whole physiological basis of cones and such) have richer color vision because of the number of different color receptors in their retina, and they're better at finding berries. A paper came out a couple of years ago studying wild macaques on an island off Puerto Rico called Cayo Santiago. The macaques there have a natural genetic variation in which some of them have two color photoreceptor types instead of three. The researchers followed them around, and the monkeys with three photoreceptor types are better at finding fruit than the ones with only two. So that story, which had just been a story for a long time, turns out to be true. And, as you've already said, color is useful not just for finding things but for identifying their properties: you can probably tell whether you'd want to eat those bananas on the bottom; on the top it's hard to tell which ones you'd like, and yet that's exactly what you need to know.

So those are just a few of the ways we use color and why it's important. But there is a very big problem once we try to figure out what code could go between the wavelengths of light hitting your retina and a judgment about what color that thing is. Here's the problem. We want to determine a property of the object, of its surface: its color. That's a material property of the thing. We'll call it reflectance, R; it's really a function of wavelength, but for now you can think of it as a single number, a property of that surface. But all we have is the light coming from that object to our eyes; that's called luminance, L. I'm not going to test you on these particular words, but you should get the idea. That's our input. And here's the problem: the light coming off the object is a function not just of the object but of the light shining on the object, which is called the illuminant, I. So we have this equation: the light coming from the object to our eyes is the product of the properties of the surface and the incident light, L = R times I. And our problem is that we have to solve for R, the property of the object, given only L. That's a problem. It's like if I said, "a times b is 48, please solve for a and b." That's known in the field as an ill-posed, or underdetermined, problem: we don't have enough information to solve it uniquely. That's a very deep problem in perception and in a lot of cognition; we are often, in fact most of the time, in this boat. So the implication is that when we want to infer the reflectance, the property of the object, from L, we must bring in other information: we must have some way to make guesses about the illuminant, the light shining on that object.
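Just to make the arithmetic concrete, here's a minimal sketch of the underdetermination, a toy illustration (not from the lecture slides) that treats reflectance and illuminant as single numbers rather than functions of wavelength:

```python
# Toy illustration of why recovering reflectance is ill-posed.
observed_luminance = 48.0   # L: the only thing the eye actually measures for this surface

# Every one of these (reflectance, illuminant) pairs produces exactly the same L,
# so L alone cannot tell us which is correct; it's "a times b is 48, solve for a and b".
for reflectance in (0.2, 0.4, 0.6, 0.8):
    illuminant = observed_luminance / reflectance
    print(f"R = {reflectance:.1f}, I = {illuminant:5.1f}, R * I = {reflectance * illuminant:.1f}")

# To pick one answer, the visual system has to bring in extra assumptions about I,
# the illuminant, which is exactly the point the lecture comes back to with the car demo.
```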
So the big point is that many, many inferences in perception and cognition are ill-posed in exactly this way. Here are two other examples of ill-posed problems in perception. In shape perception you have a similar situation: you have stuff in the world that's making an image on the back of your eyes; that's optics. What we're trying to do as perceivers is reason backwards from that image: what object in the world caused that image on my retina? That's sometimes called inverse optics, because you're reasoning the opposite way, and that's basically what we're doing in vision. So here's the problem; it's a crappy diagram, but if you can see it, there are three very different surface shapes here that all cast the same image, for example, on a retina; you could do this with cardboard and cast it as a shadow. Does everybody get what this shows? It means that if you start with this image and have to reason backwards to the shape that caused it, that's an ill-posed problem, big time: it could be any of those things, and the image doesn't constrain it. Does everybody see the problem? So that's another ill-posed problem.

Here's a totally different example of an ill-posed problem that's big in cognition: learning the meaning of a word, especially as an infant trying to learn language. In the classic example the philosophers like (god knows why philosophers like weird stuff, but never mind), somebody points to that and says "gavagai," and your job is to figure out what "gavagai" means. It could mean all kinds of different things. It could just mean rabbit, if you already have a concept of a rabbit. It could mean fur; it could mean ears; it could mean motion, if the rabbit is jumping around; or, in the example the philosophers love, it could mean undetached rabbit parts. Weird, but philosophers like that kind of thing. The point is, it's ill-posed: we don't know from this alone what the correct meaning of the word is. Does everybody see how this underdetermines the meaning? We don't have enough information to solve it. There's a whole literature on the extra assumptions infants bring to bear to constrain that problem so they can make a damn good guess about the actual meaning of the word; a whole big literature, quite fascinating. But for now I just want you to understand what an ill-posed problem is and why it's central to understanding perception and cognition.

So, back to the case of color. As I said, the big point is that lots of inferences, including determining the reflectance of an object, are ill-posed, and so we have to bring in assumptions and knowledge from other places: our knowledge of the statistics and physics of the world, our knowledge of particular objects, all kinds of other things must be brought to bear. All of that, again, is considering the problem of color vision at the level of Marr's computational theory. Notice we haven't made any measurements yet; we've just thought about light and optics, what the problem is, and what we use it for. What is extracted and why? The reflectance of an object, which is useful for characterizing objects and finding them. What cues are available? Only L, and that's a problem, because it's ill-posed.
OK, next question. Obviously we get around in the world and we can figure out what colors things are, so what are the other sources of information that we might use in principle, and that humans do use in practice? And all of that can be worked out without making any measurements; we're just thinking about the problem itself.

All right, so next: Marr's other levels of analysis, algorithm and representation, and hardware, are more standard ones you will have encountered, which is why I'm making a big deal of computational theory; it's really his major novel contribution, but it's better understood by contrast with these. At the level of algorithm and representation, the question is: what is the code you would write to solve that problem? We can ask how the system does what it does, whether we can write the code to do it, and what assumptions, computations, and representations would be entailed. So how would we find out how humans do this? One of the ways is a slightly more organized version of what you just did, and that's called psychophysics. Psychophysics just means showing people stimuli and asking them what they see, or playing them sounds and asking them what they hear. You can do it in very sophisticated, formalized ways, or you can do it the way we just did: talk to us about what the world looks like. Usually psychophysics means the slightly more organized version.

So here's an example; in fact, it's another cool demo from Rosa. I'm going to show you a bunch of pictures of cars, and your task is to shout out, as fast as you can, the color of the car. They're going to appear on the screen. Everyone ready? As fast as you can, shout it out. Here we go: what color? Interesting. Here's another one. Interesting. Ready, here we go; another one. And another. Ah, you caught on to that pretty fast. Good job; nice consensus, although I noticed a little bit of a transition there, which is very interesting. But here's the thing: all of those cars are the exact same color. The body of the car is exactly the same in all of them, and if you don't believe it, I'm going to occlude everything except a patch. Here we go: boom. They're all gray. I know, it's awesome. It's Rosa who's awesome, not me; I just borrowed this because it's so awesome. Rosa spent months designing these stimuli to test particular ideas about vision, but the basic demo is simple and straightforward, and you get the point.

So what's going on here? What's going on is that the algorithm running in your head, the one trying to figure out the color of that car, is trying to solve the ill-posed problem, and it's using other information than just the luminance of the light coming from the object. It's using information from the rest of the image; it's making inferences about the illuminant, the light hitting the object. In particular, when you look at that picture up there, what is the color of the light shining on that car? Right: officially known as teal in the field, though some of you shouted out green first, because the first thing you saw was the color of the light. What's the color of the light hitting that car? Purple, magenta, yeah. And over there? Yellow, orange, yeah. So basically what your visual system did is quickly figure out the color of the incident light, the illuminant, and use that to solve the otherwise ill-posed problem of solving for R, the color of the car.
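One classic, very simple strategy for estimating the illuminant (not necessarily the one your visual system actually uses) is the "gray-world" assumption: assume the average reflectance of the scene is neutral, so the average color of the image is an estimate of the illuminant, and divide it out. A minimal sketch under that assumption:

```python
import numpy as np

def gray_world(image):
    """Toy color-constancy sketch under the gray-world assumption.

    image: H x W x 3 array of linear RGB luminances (R * I at each pixel).
    Assumes the scene's average reflectance is neutral, so the mean color of
    the image estimates the illuminant I; dividing it out estimates R
    (only up to overall brightness).
    """
    illuminant_estimate = image.reshape(-1, 3).mean(axis=0)  # guess I from the whole image
    reflectance_estimate = image / illuminant_estimate       # solve L = R * I for R
    return illuminant_estimate, reflectance_estimate

# A neutral-gray patch (equal energy in R, G, B) lit by a strongly teal illuminant:
patch = np.full((4, 4, 3), 0.5) * np.array([0.3, 0.9, 0.9])
I_hat, R_hat = gray_world(patch)
print(I_hat)        # proportional to the teal illuminant
print(R_hat[0, 0])  # equal across channels: the patch is judged achromatic (gray)
```

The point of the sketch is just that the "other information" has to come from outside the patch itself, which is why the isolated gray patches look so different from the same patches seen in the context of the whole car image.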
In this case, the demo shows that if you just change the color of the illuminant and hold constant the actual wavelengths coming from that patch, you can radically change the perceived color of the car. Everyone got that?

Question: if I ran this through a computer and asked it for the intensity of the pixels on the hood of the car, would that correspond to yellow? Well, it depends exactly what you're asking the computer. If you hold up a spectrophotometer, which just measures the wavelengths of light, they're all gray; right there on top of those cars, they're all the exact same neutral gray. That's just the raw physical light coming from that patch. But if you coded the computer up to do something smart, to take other cues from the image, try to figure out what the illuminant is, and thereby solve for R, you might be able to get it to do the right thing. You mean if you just look at the pixels in the image matrix, would the color on the car be yellow? It's gray; they're all gray. That's what I was trying to show you here: they really are gray. The cars are underneath there, and you can see they're all exactly the same, and there's no color in them. Everyone got that?

So all of that is a little baby example of psychophysics, of what we do at the level of trying to understand the algorithms and representations used by the mind, trying to figure out the strategies we use to solve problems about the visual world. Behavior, or psychophysics, or just seeing, as you did, can reveal those assumptions and some of the tricks the human visual system uses to solve those ill-posed problems. In this case it was assumptions about the illuminant that enabled us to infer the reflectance from the luminance.

The third level Marr talks about is the level of hardware implementation; in the case of brains, that's neurons and brains. We won't cover this in any detail here, but there's lots and lots of work on the brain basis of color vision; we'll mention it briefly next time. This is some of Rosa's work showing those little blue patches on the side of the monkey brain that are involved in color vision, and some work Rosa did in my lab showing the bottom surface of the human brain, with a very similar organization: little patches in there that are particularly sensitive to color. So you can study the brain regions that do this; if it's a monkey, you can stick electrodes in there, record from individual neurons, and see what they code for, and you can really tackle the neural basis of color vision at multiple levels as well. So the big general point is that we need lots of levels of analysis to understand a problem like color vision, and accordingly we need lots of methods.

All right, so what I want to do next is launch into this whole discussion of the different methods we can use in the field; this part of the lecture will continue next time, but let's get going. Everybody good so far? We're going to use the case of face perception to think about the different kinds of questions and the different levels of analysis. So let me start by saying why face perception. It's not just that I've worked on it for 20 years, although I'll admit that's relevant; there are lots of other good reasons beyond that to care about face perception.
I don't have a demo that lets me put you in a situation where you can see everything but faces; that would be cool and informative if we could do it. Failing that, I can tell you about somebody who is in that situation. This is a guy named Jacob Hodes; this is a recent picture of him. I met him around a decade ago, when he was a freshman at Swarthmore. He sent me an email and said, "I've just learned about face perception and the phenomenon of prosopagnosia, the fact that some people have a specific deficit in face recognition, and it explains everything in my life, and I want to meet you," because he knew I worked on face perception. I said, "That's awesome, I would love to meet you, but I have to tell you I'm not going to be able to help. If you're interested in chatting, please come by, but I don't want you to feel like I'm going to be able to do anything useful." He said, "No, I don't care, I just want to understand the science." So he comes by.

By the way, one of the things people have wondered for a while is whether people who have particular problems with face recognition are just socially weird, maybe a little bit on the spectrum, so they don't pay attention to faces and therefore don't process them very well, or whether they can be totally normal in every other respect except face perception. So I was very interested; I had only emailed with this guy, and when he showed up in my office, within about 15 seconds it was clear this is the nicest, most normal kid you could ever meet: socially adept, smart, thoughtful, a lovely, lovely person. I chatted with him for a long time. He was then halfway through his freshman year. He grew up in Lynn, Massachusetts, and went off to Swarthmore, and he had been having a really rough time of it, because in his hometown he was with the same group of kids all the way from first grade through high school. He just can't recognize faces at all; he never could. When he was a little kid his mom used to drive him to the practice field, and they would sit there and come up with cues: this is how you tell that's Johnny, he's got this weird thing here; this is how you tell that's Bobby. They would practice and practice, and he developed these clues to figure out who was who in the small cohort of kids he knew all the way through high school. Then he goes off to college, it's all new people, and he's screwed. He told me he was devastated, because he would go to a party, meet someone, and think, "wow, this is a really nice person, I would really like to be this person's friend," and then realize he would have no way to find that person again. And it's kind of oversharing, when you've met somebody for ten minutes, to say, "by the way, I'm not going to be able to find you; you'll have to find me." You just don't want to have to go there yet. So there are all kinds of things that make it a real drag not to be able to recognize faces.

Now, having said all that, a surprisingly large percentage of the population is in Jacob's situation: about two percent. It would be unsurprising if there are one or two of you in here, and if there are, you can tell me later; I'd love to scan you.
So about two percent of the population routinely fails to recognize family members, people they know really well. Interestingly, this is completely uncorrelated with IQ or with any other perceptual ability: your ability to read, or to recognize scenes, or anything else. Question: is it the kind of thing where you either have it or you don't? Good question. No, it's a gradation. It's not that the bottom two percent are really impaired and everyone else is up at the top; it's a hugely wide distribution. The point is that the bottom end of that distribution is really, really bad: they just can't do it at all. Similarly, the top end of the distribution is weirdly good. Those people are so good at face recognition that they have to hide it socially, because otherwise people feel creeped out. They're called super recognizers, and a bunch of them have recently been hired by investigative services in London as part of their crime-solving unit. We scanned a few of these people, and one of them recounted standing in line for movie tickets and realizing that the person in front of her had been sitting at the next table over at a café four years before. She said, "if I share this information with that person, they'll be creeped out, so I've just learned to keep it to myself, but I know that was the same person." So there's a huge spread.

You had a question a while back? Could Jacob, looking at a person, describe them? Absolutely. He knows it's a face; he can tell if they're male or female, happy or sad. It looks like a face to him; it just doesn't look different from anyone else's. Is there any difference between situations? For example, my father can recognize faces in person just fine, but when he watches videos of people he cannot recognize faces at all; is there a difference there? Well, there are lots of cues, and it's a very interesting exercise to think about what cues you have in person. There's lots of constraining information: all kinds of things you know about where you are and who that might be, which help. So yes, there are many different cues to face recognition that might be engaged there. My point is just that face recognition matters: you can get by if you can't do it, but it's really hard.

More questions? Do they see the structure of a face? Yes, they see a proper face; if the eye were in the wrong place, they would know. They absolutely know the structure of the face; the faces just all look kind of the same. By the way, we don't have time to talk about this in detail, but there's a well-known effect that probably many of you have experienced, called the other-race effect: if you have less experience looking at a given group of people, you're less able to tell them apart. I have this problem teaching all the time. I grew up in a rural, lily-white community; my face recognition is not so good to begin with, and it's really not good for non-Caucasian faces. It's embarrassing as hell; it feels disrespectful; I hate it.
I fault myself, but actually it's just a fact of the perceptual system: your perceptual system is tuned to the statistics of its input, and it's not so plastic later in life. So a way to simulate a version of this, which some of you may have experienced, is that whatever faces you have less experience with, if you find those people hard to distinguish, it's not that you can't tell it's a face, and it's not that you couldn't tell if the nose were in the wrong place; it's just hard to tell one person from another. It's a lot like that. I really need to get going, so I'll take one more question. Could you use an analogy: is it like trying to tell people apart by their hands, to the point where you usually just can't, so it's as if all you had to go on was a body? Yeah, probably. And by the way, there's an interesting literature on that: if you show people photographs of their own hand among a bunch of other hands, they can't pick out their own hand. So you're right, we're not so good at that.

OK, I'm going to go ahead. If you're interested, there's a whole fascinating literature here I could post; actually, I got dinged last year for talking too much about face recognition and prosopagnosia, which you all heard about in 9.00, so I took most of it out, and now you're asking me about it, so I don't know what the right thing is. I'm going to go on, and I will put some optional readings online, especially if you send me an email telling me to.

So the point is, faces matter a lot. They matter for quality of life, and they're important because they convey a huge amount of information: not just the identity of the person, but also their age, sex, mood, race, and direction of attention. If I'm lecturing like this right now and I start looking over there, you're going to wonder what the hell is going on over there; I saw a few heads turn, I'm just doing a little demo. We're very attuned to where other people are looking, and that's just one of many social cues we get from faces. There's an incredibly rich bundle of information in a face. We even read aspects of people's personality from the shape of a face, and it's been shown, in some interesting recent studies, that there's absolutely nothing you can infer about a person's personality from the shape of their face; yet we all do it, and we do it in systematic ways. Another reason this is important: faces are some of the most common stimuli we see in daily life, starting from infancy, where for about 40 percent of waking time there's a face right in front of an infant's eyes. And these abilities to extract all this information have probably been important throughout our primate ancestry. So that's just to say there's a big space of face perception, and now we're going to focus in on just face recognition: telling who a person is.

All right, so what questions do we want to answer about face recognition, and what methods do we want to use? Let's start with some basic questions. First, as usual, we want to know the structure of the problem: what are the inputs, what are the outputs, why is it hard, just as we've been doing for motion and color. That's Marr's computational theory level.
We also want to know how face recognition actually works in humans: what computations go on, what representations are extracted, and whether the answer is different, whether we're running different code in our heads when we recognize faces than when we recognize toasters and apples and dogs. Another facet of that: do we have a totally different system for face recognition than for the recognition of all those other things? If so, we might want different theories of how face recognition works from our theories of how object recognition works. How quickly do we detect and recognize faces? That will help constrain what kinds of computations might be going on. And, of course, how is face recognition actually implemented in neurons and brains? Those are just some of the big, wide-open questions we want to answer.

So now let's consider our tools for addressing these things, and you should all know what tools are available for thinking at the level of Marr's computational theory: basically, just thinking. You can collect some images too, but basically, to understand this, we just think. As I keep saying, at that level we want to know what problem is being solved: what is the input, what is the output, and how might you get from that input to that output? So, for example, here's a stimulus that might hit a retina, then some magic happens, and you just say "Julia." We want to know what's going on in that magic. And if a different image hits your retina, you go, "oh, Brad." I live in a cave, I barely get out of the lab, but I understand these are people most people recognize; that's why I use them. So that's the question: what goes on in the middle?

Your first thought might be, "well, duh, easy: we could just make a template," a kind of stored copy of the pixels in that image, and take the incoming image and see if it exactly matches. Is that going to work great? Louder. Right, absolutely not; that's not going to work at all. The problem is that we don't have just one picture of Julia to match against. There are loads and loads of totally different pictures of Julia, all of which we look at and immediately go, "Julia," no problem. So what does that mean about what we're doing in our heads? If we're storing templates, we'd have to store a lot of them. Memorizing lots of templates has long been taken as the reductio ad absurdum, the ridiculous hypothesis: how could there be room in here to store lots of templates of each person, and furthermore, how would that work for people we don't know? The other idea, which is very vague right now, is that maybe we extract something that's common across all of those images: maybe something like the distance between the eyes, something about the shape of the mouth, other properties that might be invariant across those images, information you could pull out of any of them. It sounds very vague because it is vague; nobody knows what those properties would be. But the idea is that maybe there are some image-invariant properties of a face that you can extract, store, and then use to recognize faces.
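Just to make the template idea concrete enough to see why it fails, here's a minimal sketch of literal pixel-template matching, a toy illustration rather than anyone's actual model:

```python
import numpy as np

def template_match(image, templates, threshold=0.9):
    """Toy pixel-template recognizer: compare the incoming image to each stored
    snapshot using normalized correlation and return the best-matching identity."""
    best_name, best_score = None, -1.0
    x = (image - image.mean()).ravel()
    for name, stored in templates.items():
        t = (stored - stored.mean()).ravel()
        score = float(np.dot(x, t) / (np.linalg.norm(x) * np.linalg.norm(t) + 1e-9))
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score > threshold else "unknown"

rng = np.random.default_rng(0)
stored_photo = rng.random((64, 64))
templates = {"julia": stored_photo}
print(template_match(stored_photo, templates))       # "julia": the exact stored pixels match
print(template_match(rng.random((64, 64)), templates))  # "unknown": a different image no longer
                                                         # matches, and a new photo of Julia in a
                                                         # different pose or lighting fails the same way
```

The trouble is exactly what the lecture points out: you would need a separate snapshot for every viewpoint, lighting, and expression, and the scheme says nothing about faces you have never stored at all.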
So, to think about this, we can step back and ask how it's done in machines. Machine face recognition didn't work well at all until very recently, and then all of a sudden, a couple of years ago, it did. Here's another paper, a different one from the one I showed you before: this one is VGG-Face, one of the major deep-net systems for face recognition. It's widely used; there was another one the year before; all of this is since 2014 or 2015, hugely cited, widely influential, and on all your smartphones. Boom, it all happened nearly overnight, with the availability of lots of images to train deep nets. These systems are now extremely effective and accurate, and so in some sense those networks are possible models of what we're doing in our heads when we recognize faces. It doesn't mean we do it the same way, but it's a possibility, a hypothesis we could test.

Question: what's the current state of the literature on getting other information from people's faces, like mood? Lots. There are conferences and machine-vision competitions on extracting personality properties, mood properties, every possible thing you can imagine; a lot of people care about this, and it's a huge field in computer vision. There's also a huge field in cognitive science asking what humans pull from faces. How good is it? Others would know better than I do, but I bet a lot of it is pretty damn good; these systems are suddenly extremely effective. And by the way, later in the course my postdoc Katharina Dobs, who knows that literature much better than I do, will talk about deep nets and their application in human cognitive neuroscience; she knows a lot about the various networks that process face information.

So this is progress: now we have some kind of computational model. The trouble is that nobody really has an intuitive understanding of what VGG-Face is actually doing. We know how to train one up, and there it is, but we don't really understand what it's doing, and further, we have no idea whether what it's doing is anything like what humans are doing. So it's progress that we have a model we didn't have five years ago, but all these questions remain open. On this first question, then, what we've discovered at the level of Marr's computational theory is that a central challenge of face recognition, maybe the central challenge, is the huge variation across images, which you know just by thinking about it or by trying to write the code.

OK, I'm going to race along, and Anja is going to tell me in five minutes to switch. I want to talk just a little bit about behavioral data; I'll run out of time and we'll roll the rest into next time, because I want to include functional MRI, which you need for the assignment. So how are we going to figure out what humans represent about faces? We've considered the possibility that one way to solve this problem is essentially to memorize lots of templates for each person, and another possibility, this vague, inchoate idea that maybe there's some abstract representation that's the same across all of those images. How are we going to figure out which one humans use? Well, if we really recognize people in all their different guises by memorizing lots of templates for each person, that strategy wouldn't work for people we don't know: you wouldn't be able to take two different photographs of an unfamiliar person and tell whether they show the same person, because you could only do the task by memorizing. Does everybody get that idea? Whereas whatever this other idea is, it should work, at least somewhat, for novel individuals you don't already know.
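To make the contrast concrete: an invariant-representation account looks less like matching stored snapshots and more like mapping every photo into a feature space and comparing distances there, which in principle works even for unfamiliar people. Here's a minimal sketch; `embed` is a hypothetical stand-in for whatever feature extractor you like (a deep net such as VGG-Face, or the hand-crafted invariant measurements), not a real library call:

```python
import numpy as np

def embed(image):
    """Hypothetical placeholder: map an image to a feature vector that is
    (ideally) stable across viewpoint, lighting, and expression for one person."""
    raise NotImplementedError("stand-in for a trained face-embedding network")

def same_person(photo_a, photo_b, threshold=0.7):
    """Decide whether two photos show the same (possibly unfamiliar) person by
    comparing their embeddings rather than their raw pixels."""
    a, b = embed(photo_a), embed(photo_b)
    cosine = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return cosine > threshold

# Unlike the template scheme sketched earlier, nothing here requires having stored
# any previous images of these particular people; that is exactly what the sorting
# task described next is testing in humans.
```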
Here are two photographs: same person or different person? So now let's ask whether humans can do this. Do we store lots of templates for individuals, or can we do something more abstract? If we simply deal with this problem by storing lots of templates for each individual, maybe not literally pixel templates but some kind of literal snapshot, then the key test is that we shouldn't be able to do this matching task for people we don't know. Everybody get the logic? So let's try it. A paper a few years ago by Jenkins et al. asked exactly that question. Here's what they did: they collected a whole bunch of photographs of Dutch politicians, with multiple images of each politician. They gave them to people on cards and said, "There are multiple images of each person, and I'm not going to tell you how many different politicians are in this deck. Just sort them into piles, one pile per person."

I'm going to show you a low-tech version of this: a whole bunch of pictures, all in one array, and you're going to try to figure out how many people are there. Everybody ready? I'm going to leave it up for just a few seconds; your task is, how many different individuals are depicted here? Here we go. Write down your best guess; just look around. Everybody got a guess? Write it down. How many people think there are over 10 different individuals? One. How many think over five? Probably half of you. Over three? Most of you. There are two. What does that mean? It means you can't do it: you can't match different images of the same person if you don't know that person. Pretty surprising, isn't it? We think we're so awesome at face recognition because most of the time what we're doing is recognizing people we know, people we've seen from all different viewpoints and in different arrangements. If you don't have lots of opportunity to store all of that, and it's a novel face, we're really bad at it.

Question: but there was a time constraint. Yes, I was trying to make the demo fit, but in the way they actually run this task, people have unlimited time; they just sort the cards. The mean number of piles that people made in this experiment was seven and a half. The correct answer was two. Now, you might say those are shitty photographs. So here's the control: those are Dutch politicians, and they then ran the same experiment on Dutch people, who look at those photographs and in about two seconds say, "two, duh." So there's nothing wrong with the photographs; it's just a matter of whether you know those people or not.

So the point of all of this is the somewhat crazy story that a lot of what we're doing in face recognition (I'm simplifying here), a lot of the way we deal with all this image variability, is not that we have some very abstract, fancy, high-level representation of each individual face; it's that we have lots of experience with particular faces and we use it, so that with a novel face, for which we don't have all that experience, we're not so good. I'm going to run out of time, so I'll take one question and go on. How did they control for the issue you mentioned, about not having experience with certain races? I'm sure that whenever you do face recognition experiments you make sure that,
if your dominant subject pool is Caucasian, you use Caucasian faces, or whatever matches. Unless there's something you didn't understand, hold the rest of your questions: I'm going to hang around after class and you can ask me then, or if you have to go, email me, because I really want to get through this next bit.

OK, so there we are. What this suggests, kind of, sort of, is that whatever we're doing, it's something that benefits enormously from lots and lots of experience with that individual. Maybe it's not literal memorization of pixel-like snapshots, but it's something more like that than anybody would have guessed before this experiment. All right, I'm going to skip this awesome stuff here, and I'll come back and do that slide next time too; we're going to cut straight to functional MRI. I'm sorry about this, but I really want you to have this background in case you don't have it, though you probably do.

So, functional MRI: another cool method in cognitive neuroscience. How would it be useful here? First, what is it? Functional MRI uses the same scanners as regular MRI, which is in probably tens of thousands of hospitals around the world. The big advances behind functional MRI came when some physicists in the early 90s figured out how to take those images really fast, and how to make images that reflect not just the density of tissue but the activity of neurons at each point in the brain. That was big stuff, early 1990s. The reason it's a big deal is that it is the highest-spatial-resolution method for making pictures of human brain function non-invasively, that is, without opening up the head. That's why there are lots and lots of papers using it, and why we're going to spend a lot of time on it.

The bare basics: the functional MRI signal that's used is called the BOLD signal, which stands for blood oxygenation level dependent. What that means is that the basic signal is blood flow. The way it works is that if a bunch of neurons someplace in your brain start firing a lot, that's metabolically expensive, and so more blood has to be sent to that part of the brain. It's just like going for a run: the muscles in your legs need more blood delivered to them to supply the increased activity, so blood flow to your leg muscles increases. Similarly, blood flow increases to active parts of the brain. Now, the weird part is that, for reasons nobody completely understands, the blood-flow increase more than compensates for the oxygen use, so the signal is actually backwards: active parts of the brain have less, not more, deoxygenated hemoglobin compared to oxygenated hemoglobin. The relevance of that is that oxygenated and deoxygenated hemoglobin are magnetically different in a way the MRI signal can see. So the basic signal you're looking at is how much oxygen there is in the blood in that part of the brain, and hence how much blood flow went there, and hence how much neural activity there was. Did that more or less make sense? I'm not going to test you on which is paramagnetic and which is diamagnetic; I never remember, and I couldn't care less. But you should know what the basic signal is: a magnetic difference that results from oxygenation differences, which result from blood-flow differences, which result from neural activity.
And because the blood-flow increase overcompensates for the metabolic use by the neurons, the active parts of the brain that you see with the MRI signal have more oxygenated hemoglobin. So that's the basic signal, and because that's the basic signal, there are a bunch of things we can already tell. I'm going to skip over this slide; the key point is that because it's all based on blood flow, the signal is extremely indirect: neural activity, then a blood-flow change, then overcompensation, then a magnetic difference, then an MRI image. You would think that with all those steps you would get a really weird, nonlinear, messy signal out the other end, and it is one of the major challenges to my personal atheism that you actually get a damn good signal out the other end; it's pretty linear with neural activity, which seems like a freaking miracle given how indirect it is. And that has empowered this whole huge field to discover cool things about the organization of the brain.

Nonetheless, there are many caveats. Because it's blood flow, the signal is limited in spatial resolution; people fight about the exact number, but it's around a millimeter. There are cowboys in the field who think they can get below a millimeter; maybe, I don't know, it's debated. The temporal resolution is terrible: blood-flow changes take a long time. Think about it: you start running; how long does it take before blood flow increases to your calves? If you're really fit it's probably fast, but it still takes a few seconds. It takes about six seconds for those blood-flow changes to happen in the brain after neural activity, and they're spread over a big, sloppy chunk of time, so you don't have much temporal resolution with functional MRI. Does that make sense? Also, because it's such an indirect signal, when we see a change in the MRI signal we don't know exactly what's causing it: is it synaptic activity, actual neural firing, one cell inhibiting another, a cell making protein? It could be any of these things, and that's a problem. Another problem is that the number you get out is just the intensity of the detection of deoxyhemoglobin; it doesn't translate directly into an absolute amount of neural activity. The consequence is that all you can do is compare two conditions. You can never say "there was exactly this much metabolic activity right there"; you can only say it was higher in this condition than in that condition. So those are the major caveats; nonetheless, we can discover some cool stuff.

So, to get back to face recognition: suppose you wanted to know whether face recognition is a different problem in the brain from object recognition. If it were, you might want to write different code to understand it than the code you'd write for object recognition; it's something you'd want to know. So here's an experiment I did, god, 20 years ago anyway; it's the simplest possible thing, and the easiest way I can explain the bare bones of a simple MRI experiment. You pop the subject in the scanner and you scan their head continuously for about five minutes while they look at a bunch of faces for 20 seconds, stare at a dot, look at a bunch of objects, stare at a dot, and so on. It's a five-minute experiment, and you're scanning them the whole time. Then you ask, for each three-dimensional pixel, or voxel, in their brain, whether the signal was higher in that voxel while the subject was looking at faces than while they were looking at objects.
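In spirit, the analysis is about that simple; here's a toy sketch with made-up numbers (a real analysis would also model the roughly six-second hemodynamic lag, head motion, signal drift, and so on):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Toy data: 150 timepoints (one every 2 s, about a 5-minute scan) for 1000 voxels.
n_timepoints, n_voxels = 150, 1000
bold = rng.normal(size=(n_timepoints, n_voxels))

# Block design: alternating 20 s epochs of faces, fixation, objects, fixation, ...
block = ["face"] * 10 + ["fix"] * 10 + ["object"] * 10 + ["fix"] * 10
labels = np.array((block * 4)[:n_timepoints])

# Pretend a small patch of voxels really does respond more during face epochs.
bold[labels == "face", :20] += 1.0

# For every voxel: was the signal higher during face epochs than object epochs?
t, p = stats.ttest_ind(bold[labels == "face"], bold[labels == "object"], axis=0)
face_selective = np.where((t > 0) & (p < 0.001))[0]
print(f"{face_selective.size} voxels respond more to faces than to objects")
```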
When you do that, you get a blob. I've outlined it in green here, but there's a little blob there; this is a slice through the brain, like this, and that blob is right in here on the bottom of the brain. The statistics are telling us that the MRI signal is higher during the face epochs than during the object epochs. Everybody with me? Which implies, very indirectly, that the neural activity in that region was higher when this person was looking at faces than when they were looking at objects. Now, whenever you see a blob like that, you really want to see the data that went into it, so here are mine. This is the raw average MRI signal intensity in that bit of brain over the five minutes of the scan. You can see the signal is higher in that region when the person is looking at faces, these bars here, than when they're looking at objects, there. Everyone get that? That's what the stats are telling us; this is just the reality check on the data that produced those stats. And you can see something like that in pretty much every normal person: I could pop any of you in the scanner, and in ten minutes we'd find yours.

Now, here's the key question. Suppose you find this in anyone you scan, you do all the stats you like, and it's as robust as you could possibly want. Do these data alone tell us that that region is specifically responsive to faces? No. Why not? Maybe it's responding to a certain arrangement of features; maybe it's reacting to the light reflecting off the face. Good, keep going, what else? It might still be faces, but it might be different for human faces versus any faces, and we'd kind of want to know; the code would be different. The face is a part of something, whereas the object is the whole thing; absolutely. What else? Maybe objects are simpler, or easier: maybe it's just hard to distinguish one face from another, so you need more blood flow. Maybe that thing is really a generic object recognition system, but it has a harder time distinguishing faces from each other because they're so similar, so there's more activity. Everybody get that? What else? I'm going to go two minutes over, so if people have to leave, that's OK; I'll try not to go more than two minutes over. What else? Yes, as I just said, there's all this other stuff we get from a face, not just who it is but whether they're healthy, what mood they're in, where they're looking, and all of that.

What you just did is basic common sense, but it's also the essence of scientific reasoning, and we'll do a lot of it in this class. The crux of the matter: here are some data, here's an inference, and your job is to ask whether there's any way that inference might not follow from those data. How else might we account for those data? You just did that beautifully. The essence of good science is that whenever you see some data and an inference, you ask yourself how that inference might be wrong, how else those data might be explained. I had previously made a list of alternatives: the region could respond to anything human (you said any kind of face, but it could also be anything human), maybe it responds to hands, or any body part, or anything we pay more attention to, or anything that has curves in it, or any of the suggestions you made.
So the crux of the matter, and how you do a good functional MRI experiment or make a strong claim about a part of the brain based on functional MRI, is to take all these alternative accounts seriously. As just one example, what we did in our very first paper was say: there are lots of alternative accounts, so let's tackle a bunch of them. We scanned people looking at three-quarter views of faces and of hands, and we made them press a button whenever two consecutive hands were the same, or whenever two consecutive faces were the same; that's called a one-back task. By design, that task is harder for the hands than for the faces, so we were forcing our subjects to pay more attention to the hands than to the faces. And what we found is that you get the same blob, still responding more to faces than to hands. The idea is that by showing that, we've ruled out a bunch of those alternatives: it's not just any human body part, because it doesn't respond to hands, so it's not just anything human; it's not just any body part; it's not anything we pay attention to, because we made them pay more attention to the hands; and it's not anything with a curvy outline. So that's just a tiny example of how you can proceed in a systematic way to analyze what is actually driving this region of the brain: you come up with a hypothesis, you think of alternatives, you come up with more hypotheses, and then you think of ways to test them. We'll do a lot of that in here. And I'll just say that there's lots of data since then showing that this region of the brain really does strongly prefer faces, and that it's present in everyone. Next time we'll talk about how that seems to suggest that we have a different system for face recognition than for object recognition, but we haven't yet nailed that case, and you should all think about what remains. Thank you; sorry I was racing. I'll hang around if you have questions.