Transcript for:
Lecture: Conversation with Oren Etzioni on AI and the Allen Institute

Oren, hello, good morning. So for those listening in, I'm here with Oren Etzioni. He is the founder and CEO of the Allen Institute for Artificial Intelligence, and the acronym for that is AI-AI, so it's often shortened to AI2. Oren, how long have you been doing this? Well, it all started for me about nine years ago. There wasn't an AI2, but the late Paul Allen's team reached out to me and said he wanted to create an Allen Institute for AI, and it was up to me to come in, write a plan, and make it happen. It sounded like an incredible challenge. I remember people asking me, why are you doing this? You're fat, dumb, and happy as a professor at the University of Washington, a tenured professor, you've got a good thing going, why would you do something crazy like this? And I said, the sky's the limit. With Paul Allen's vision, with his resources, with his commitment to AI, we could do amazing things. And fast forward nine years, I feel like there's a lot still to do, but we have done some good things. So I want to dig deep into what the Allen Institute has been up to, because it's kind of amazing how much you've accomplished in nine years, just the impact and the unique way you've done it. But first I want to give people listening in a sense of who you are. So I had my Oren moment, I think it was 2018, 2019 at the latest. You came and gave a talk in San Francisco, and you did a live demo, which is pretty unusual already, of natural language processing, NLP. The room was packed, and I was one of many people at startups trying to make this stuff work and hoping to make it big. And this legend walks in and gives a presentation, and you had it live on the AI2 website, a bunch of models doing stuff. We take that for granted today, that you would have models just running live, powered by GPUs. To my knowledge you were the first doing it. And what was absolutely mind-blowing was you would do what we all do, which is, you'll show some
cherry-picked examples. But then you did something that no one ever does, not then, not now: you broke it in front of us. You would just say, yeah, looks impressive, right? But let me just show you what happens when you change the wording of the input just a little bit. And you said, this stuff barely works. And those words seared into my brain, and they did a lot of good, because they made me very skeptical and pragmatic, so that when you actually got something to work, you didn't overhype it. So that is you in a nutshell, a straight shooter. Just tell me what you were feeling back in 2018, 2019. You were halfway through this grand project, you had laid the groundwork, and you had contributed a ton to it. People don't realize that this whole obsession with Muppet names really began with ELMo, right? Right. Well, John, thanks for remembering this, and the demo and so on. I do feel like the principles that we have are what guide our work through hype and turmoil and ups and downs, and that's the history of AI, but it's also the future of AI. So let me take the rich things you talked about and break them down. First of all, we now have phenomenal language models that do quite remarkable and certainly very impressive things. But our colleague Cade Metz of the New York Times says, never trust an AI demo. So you're absolutely right that if you don't get to kick the tires, if you don't get to ask the right questions, then you don't really know how well it does. And by the way, the best demonstration of that is actually in all of our living rooms, or in our phones, with Siri and Alexa and so on. You ask Alexa for something, you can get a phenomenal answer, and then you change the wording slightly and it says, I don't understand that. So we have to be very careful with what we impute to these systems. And of course there was a recent brouhaha about whether this Google AI system is sentient
and so on, and of course it's not. So I do think it is very important, as you say, to be a straight shooter. Another favorite saying of mine, because sometimes people tune in and they're like, wow, this is amazing, is: our overnight success has been 30 years in the making. So if you look at a model like GPT-3 or LaMDA or the latest of the bunch, they do have a long history. It goes back to BERT, goes back to ELMo, which you kindly remembered, which was invented at AI2 and won a best paper award in 2018, goes back to word2vec, which came out of Google, and actually goes all the way back to the 50s, where a linguist, if I recall correctly named Harris, said, you shall know a word by the company it keeps. Right, it's almost biblical in the way he phrased it, right? And it explains most of NLP today. Exactly, exactly. That's the underlying principle: that we can understand the meaning of words, and from there the meaning of sentences and even beyond, simply by looking at their context, and looking at a large number of contexts. So in some sense, if I were told, stand on one foot and explain all of NLP today, I would say, ye shall know a word by the company it keeps, but multiply that by a billion or 10 billion companies, contexts, and you'll have NLP today. But it's pretty revolutionary, right? Because there was a whole period where we thought grammar mattered, like encoding the rules of grammar, we thought that was really important. I think that grammar does matter, but the remarkable thing about this technology, particularly when it's played out with large amounts of data, right, a billion, 10 billion sentences, and a large amount of CPU power, is that that data processing can recover the rules of grammar, nuances of semantics, etc. So it's not that grammar doesn't matter, it's that this technology is remarkably good at at least approximating those rules very, very well. And of course, by the way, we
know that people only approximate those rules too, right? We often say things and write things that are ungrammatical but kind of sound right. So it's really doing probably a better job modeling language than the rules of grammar. Before you got into institution building, how would you describe yourself as a practitioner, a scholar, in the lens of today? You weren't an NLP guy necessarily. How would you describe yourself? I've always been fascinated with two questions. The first one is one of the most fundamental intellectual questions across all of science and philosophy: what is the nature of intelligence, how do we build an intelligent machine? Over time I've also added the ethical question, which maybe we'll have a chance to get into: should we build an intelligent machine, and what would that mean for humanity, what would it mean for society? But that's one part. And the second part of me, that's a lot more practical, the part that's founded startups and delights in technologies, has asked: how can we use AI to build valuable technology, in search, in software agents, in natural language processing? What was that conversation like, that early conversation with Paul Allen? Were you making this pitch, or was he making this pitch, did you come to it together, how did it come about? And for those in the audience who don't know, Paul Allen is the co-founder of Microsoft, sadly passed away pretty recently, but an intellectual maverick. He absolutely was an idea man, and that is the title of his autobiography, which I really recommend to people, it's really worth reading. And I think that his key role in Microsoft, particularly early on, was to have that vision of the PC revolution and what it would mean. It's hard to imagine now, right, where you've got a computer in every pocket and in our eyeglasses and, you know, 200 computers in our car. But back then, computers were far from ubiquitous, and the idea that we'd
have a computer on every desk was completely revolutionary. So Paul Allen was a visionary, and I found talking to him incredibly inspiring. And I'm not paid to say that; the man has passed away. But he is and will always remain one of my absolute heroes, and not idols but inspirations, mentors, for his relentless focus on what you might call the prize, and the prize not being a billion dollars or a trillion dollars, the prize being: how do we understand intelligence? And of course he had a whole other institute, the Allen Institute for Brain Science, that was dedicated, that is dedicated, to understanding the brain. It's like the wet lab side of this. Exactly. Somebody once asked him, do you think that the neuroscience approach, the wet lab, is going to be successful in the long run, or is it going to be the more software-oriented approach that we use in AI? And he said, look, to me it's a horse race, and I've got a bet on both horses. So what was the race, though? Did he want artificial general intelligence? Did he want to just crack the scientific mystery of what it is? Did he want to harness it? What did those pitch meetings look like? Paul was fascinated, and I continue to be fascinated, by two related questions. The first one is absolutely the most hairy, audacious, big question you could ask, which is: what is the nature of intelligence, of human-level intelligence? You know, no consolation prizes, the real thing. And so he was always asking us about that. He was always relentlessly looking to the future and saying, okay, what would it take to get there, how can I help you, does this scale? The second thing, and I think it comes from his fascination with human knowledge, his mother was a librarian, so he was fascinated with: how do we collect human knowledge, and how do we get a computer to understand it? Back in the 70s, I believe the early 70s, he said, look, it's one thing to take a book, collect all the words in the book, and put them in an index,
effectively what today we call a search engine, and it's quite another thing to understand the meaning of the book and answer the questions at the back of the book. Think of the exercises at the end of each chapter in a textbook. So even in the 70s, before a lot of this technology was around, he understood that meaning, understanding the meaning of text, of knowledge, was very, very tough for a computer. I mean, have we even gotten closer, though, or are we fooling ourselves? What I think about is, I use semantic search all the time, just at work. It's a great tool, it's really powerful, but it's so easily fooled. You sort of crack through the shell and you realize, if this thing understands, scare quotes, what it's reading, it's doing it in a very different way from me, because I can change a single word, inconsequentially to me, and it just falls apart. It clearly doesn't understand it the way I do. So are we barking up the wrong tree when we say we're chasing text understanding, or is it all just performance-based? We don't really care if it understands; we care about getting jobs done. Find me documents that are about, you know, four-legged animals that love to bark. If I didn't know the word dog, but I could describe what I was after, we would love a computer system that would just find the right stuff, even if it had no idea, and I don't care if it has any idea, let alone feelings, about what I'm searching for. Well, John, you're asking the most profound question at the heart of this field. I'm not sure I can answer it in 25 words or less, but let me take some shots on goal, and it'll be more of a dialogue. So to the question, is it performance-based, the first answer is that our performance has gone way up. If you take any objective measure, and there are many, and back in the day we were interested in a computer answering an eighth-grade test, right, the Regents science exam, the SATs, initially the answer
was a resounding no. It did little better than chance on multiple-choice questions; it was getting close to 25 percent. And fast forward to now, it can get 80 or 90 percent; it's better than most high school students. And I really wonder how it's doing it. I really wonder. Well, so we have a lot of insight into that, and I'll get to that in a second, but to take the performance question: we can check the box, we now have exceptional performance. But now we're debating, as you're raising the question, okay, so what does that performance mean? And there's a famous saying from Hubert Dreyfus, the late philosopher from Berkeley, who said, look, we've run up to the top of the tree and we're shouting that we're on our way to the moon. Right, it doesn't scale, and it's not really a way to do space flight; we've just climbed a tree. So, right, again, that metaphor is not drawn to scale; it's more than a tree. But still, the technology that will get you to the space station, to kind of riff on this metaphor, may not be the technology that gets you to Mars, and certainly not the technology that gets you out of the solar system. So I think that when we talk about competence, when we talk about genuine understanding, there's a real debate in the field, and there are some people, like Gary Marcus, who are brilliant at pointing out how this technology falls short. And we can see that these large language models do things that are called hallucination. You ask questions that are meant to trip it up, like, who was the president of the United States in 1492, and it'll answer something like Columbus. It won't realize that the United States didn't exist in 1492, didn't have a president. So there's hallucination, there's lack of robustness. You paraphrase the question, and if you ask me the same question in different words, most likely I would say, hey John, that's the same
question, I'm going to give you the same answer. But AI technology will not. And by the way, that was a perfect demonstration of ye shall know a word by the company it keeps. The machine sees the string 1492, it basically has seen enough, it knows you want to look for a person, and Columbus pops right up. And so that's a case of the dumb pet trick with data failing you. That's exactly right. The remarkable thing is, every time we identify a trap like this, a phenomenon, a place where AI trips up, well, our colleagues who are deep learning gurus just get more training data, just modify the training regime, and they solve that one. Exactly. So is it a game of whack-a-mole, or is there a fundamental paradigm that goes all the way to human-level intelligence? I would say that that's the question of the age, and I would look to people who are a lot deeper into deep learning, pardon the inadvertent pun, like Geoff Hinton and Yann LeCun, right, the Turing Award winners. And I would say that they themselves, while they're very much enamored of deep learning and this kind of paradigm, say that the current underlying algorithms, I should say backpropagation, supervised learning, current neural network architectures, don't take us all the way there. They see the limitations of the current technology, but they do see that the paradigm, this distributed computing with simple computing elements and weight updates on edges between them, is the foundation for a much more sophisticated architecture that will get us all the way there. And of course, if we look to the brain, right, neural networks are a gross, gross simplification of the brain, but we do have an existence proof, right? We do know, an n of one. Exactly. So here's a great quote from you that is a good segue to dive into the AI2 impact. You said AI2 is the place to do work that companies won't do and universities can't. So I think that really, to me,
captures the weirdness of this thing you built. It is neither a university nor a company. What is it? Well, first, to give credit where credit is due, which is always extremely important in academia, right, we don't do it for the money: that is a quote that I repeat from my colleague Noah Smith, who's a professor at UW and a leader at AI2, but it's a wonderful characterization of what we do. And, sorry John, so repeat the question: how do we do it, why do we do it? I wasn't sure. No, more like, what is it? It's this strange hybrid. It's had all this impact, but without any of the benefits or problems of being a company, nor any of the benefits or problems of being a university. I don't know many things like it. Also, you've tagged on an incubator to it now, so it's definitely unique. So, I'm a product of the university system. I was a grad student, I was a professor for more than 20 years, and I love that system, for intellectual exploration, for intellectual freedom, for the kind of debate and surprises that it produces. But it does fall short when you're trying to build systems. Some problems require a sustained effort over some number of years, require engineering sophistication, and it's hard to do that with students who need to graduate. And actually, it's not even fair to ask students to play engineer for years on end, right? Because you have to worry about their education. Exactly, that's the primary goal. So over time, over my 20-plus years at the university, I did rue sometimes, gosh, we really want some things to go into the real world and get more sustained investment than just tossing them over the transom, writing a paper, writing a research prototype, and hoping that somebody will pick up the ball. So at AI2 we have researchers and engineers working shoulder to shoulder, and that's an important part of it. It's an egalitarian community. It's not the case that the researchers are up on top of Mount Olympus and
they're cracking the whip, to mix the metaphors, telling the engineers, do this, do that. It's much more the case that they're collaborating; the engineers are telling them, look, here's what you need to do to build a working system. We have Semantic Scholar, which, John, of course, you're intimately familiar with. You've written about it, and you were one of the folks to first announce it to the broader community when you were writing for Science, which was wonderful for us. So, something like Semantic Scholar, which, for those who don't know, is a free search engine for scientific content. It's approaching 100 million users a year, it has 200 million papers in its corpus, that sort of scale. And running AI at that scale requires a lot of engineering, and we have a very strong engineering team, folks who came out of Amazon and Google and other places, to be able to do that. And you just could not build Semantic Scholar, build it, sustain it, iterate on it, in a university; you could build a prototype. Actually, you know what a good comparison point is? arXiv. So arXiv grew out of the university system, it's sustained by the university system, and you can see how far you can get with such a system. arXiv and Semantic Scholar are worlds apart. Semantic Scholar is a full product with incredible amounts of hand engineering in it, and maintenance, and users, and I think arXiv is about as far as you can get: a preprint server that puts PDFs on a website. And even arXiv is an exception, right? It's rare. It's a gem. Yes, absolutely, but it's rare, as you say, that you have a university system that can operate at a very large scale. And then, on the other hand, we have no profit motive. Semantic Scholar does not have a business model. I don't think there's a good business model in that space, because it's meant to be free. I know you have opinions about academic publishing. Yes, yes.
Well, we can get into that if you want, but yeah, I think too much money is made over things that really ought to be free, to benefit humanity. And maybe to bring this full circle to the late Paul Allen, our mission is AI for the common good. So that's why I say universities can't, but companies won't. Companies, appropriately, have their for-profit mission. But Paul Allen was a major philanthropist, and he won a philanthropist of the year award a few years before he passed away. He wanted to make the world a better place. The Allen Institute for Brain Science released the free Brain Atlas, which was a tremendous resource, catapulting research in that realm. And our mission has always been to bring to the fore and release systems, data sets, and open-source software that help to bring the field forward. So what's up with this incubator, then? From what little I know about it, you have added a startup incubator to AI2, so that ideas, I presume, can spin out and have a chance to be nurtured. Is that you hedging against, like, sometimes companies actually are the right way to solve problems, or is it for the future health of AI2? Not really. From day one we had an incubator. Our very first startup, Kitt.ai, was actually started in, you know, 2014. Oh, how did I miss this? It's just not well known. Well, because it was very small, and then once we got the right leaders in place, it's grown and grown, and now we're approaching three quarters of a billion dollars in the total valuation of companies founded and acquired. Our company Xnor.ai, which was a computer-vision-at-the-edge company, was acquired by Apple, and we've done now more than 20 companies at the pre-seed stage. The analogy is to think of a university and its commercialization center. The University of Washington, where I was, has one, and
Stanford has one, of course, which famously created Google and Yahoo and other major companies. And it's actually, in my mind, a natural part of the life cycle of universities: some ideas and technologies are created in a very nascent, incipient form in the university, and it makes sense to transfer them to a for-profit context. And that's where you both get the resources to make them shine, right, to take them to the next level, and you also get the opportunity to create value. And value creation, in my mind, is a great thing. I'm not in any way a socialist thinking, oh, we should all just be working for the common good. I think some of us should be working for the common good, and I feel very privileged to be in that position, and some of us should be in startups figuring out how to revolutionize the world and make a killing at it. Even though I disagree very strongly with, say, Elon Musk's views about AI, I very much am blown away by his success with Tesla. We wouldn't have Tesla if we didn't have for-profit startups. Hang on, which part do you disagree with, the robots are going to kill us? Oh yeah. Elon Musk is famous for having said, with AI we're summoning the demon, and I think that's just hype, and actually the worst kind of hype: it's hype from somebody who you'd think would know better. He's such a brilliant man, yet he gives a lot of credence to statements that are just not rooted in any data. Although he's not alone; there are a lot of cautious voices, he's just the biggest on Twitter. He's the biggest, and he's, you know, the most articulate. But I do agree with you, there's an interesting conversation around this issue. I feel very strongly that we don't have any basis for some of these fears about AI. I've written about this, and like you mentioned earlier, you've had your own experience. Anybody who's built an AI system knows just how much blood, sweat, and tears we put in to eke out
the modest level of performance that we get, let alone this AI that's free-form, you know, taking over humanity, can't be turned off, that we see in Hollywood movies. So I think it's really important to distinguish science from science fiction, and hype and Hollywood from reality. Other people just extrapolate more strongly into the future. They have ideas like hard takeoff: sure, AI is not very powerful now, but what if you turn your back on it, what if all of a sudden there's a sharp increase, you know, you leave for the weekend and you come back in on Monday and this, you know, AI is in charge. Exactly, it's smoking a cigar and saying, I've been expecting you, Dr. Etzioni. And it's just not realistic, but to understand that, you have to get a lot more technical. I do want to share two metaphors that I think help with that. One is: these technologies that we're talking about, where we tune the, sorry, the weights on edges in the neural network, which is what our deep learning technology is doing, that technology is the moral equivalent, if you're not a technical person, of adjusting the gain and the equalizer and various buttons on your stereo. Except you have billions of dials. In this case you have billions of dials, and you're adjusting them automatically, but after you've adjusted them really well, it's still just going to be a stereo. There's no way that you find the right adjustment on lots of dials on your stereo and it becomes the Death Star; it's still going to be a stereo. The same with these large language models, where again, as I mentioned, there's a lot of, are they sentient, and so on. Those large language models are basically mirrors. By collecting this corpus of all these words and the company they keep, they mirror that collected discourse back at us, and when we look at the mirror, we can see glimmers of intelligence, because we see a reflection of our own discourse. The thing
that's important to realize about a mirror technology is that you can scale the mirror. You can have a very large mirror, but a very large mirror is not going to turn into a Death Star. I definitely agree on the second point. I think of these large language models as data telescopes; they're just amazing devices to look back on all this amazing data we ourselves created with language, which is its own mystery, so really we're just looking at our own mystery. But on the first one, I would say, so you said, hey, it's just a big stereo, it may be impressively large and it may be twiddling its own dials, but it's still just a stereo. And I think a lot of biologists would say, well, I can show you a cell, an amoeba, that is really just running around trying to gobble up food and make more amoebas, and it's not that different from a neuron. It's really just a lot of very strange natural history that led to the job of being a neuron as a cell. And yet, when you add them all together, you get walking, talking goofballs like you and me. And so either you admit that we're not that special, or you admit that there's something special in the system. But that's basically what's kept philosophy grad students in business for all time. I do want to address your comment, though, because I think it's an important one, and where I would take exception is with the word "add." So cells are the basic building block of life, guaranteed; neurons are the basic building block of the brain. We have neural networks; the units in neural networks are actually very, very much simplified relative to a neuron, but never mind that. I would accept that perhaps we have discovered some of the basic building blocks, but it's not the case today that I can give you a cell and say, here's a cell, make me a human. Right, far, far from it, in ways that we understand. Unless, of course, that cell is a fertilized egg. Sure, sure, but it turns out to be
pretty easy. Well, that's the natural process, right, and we're going to keep this PG-rated, John. So what I'm saying is that we don't know how to artificially produce a cell, and even if we did, we wouldn't know how to turn that cell into a human. And so even if we had a neuron, even if we could build, simulate, a neuron in a computer, we don't know how to turn that into a brain, or into human-level intelligence. So the organizing principles are still what's lacking. And one last point, because this is something I'm so passionate about, and it actually gets lost sometimes in all the hype and all the excitement about the technology: even if somehow we came up with a recipe, a mechanical process, to produce a human by cloning, right, to produce an intelligence by doing the AI equivalent of cloning, we still want to understand. We want to understand the organizing principles of how you build a human; we want to understand the organizing principles of how you build an intelligence, so we can fix problems, so we can go beyond it, so we can use these technologies for the common good, right, to cure diseases. And also to know ourselves in a deep way. Exactly. All right, let's swerve for a sec. I really don't want to run out of time before we dig into some of the cool stuff that's actually happening at AI2. So, back to that point about AI2 being a place where you could do things that companies won't do and universities can't, let's dig into a couple of them. Over the years you've been absolutely swinging for the fences with, for example, attempts to make an AI system that can solve math problems, where, to your point about what companies won't do, you're not going to make a buck, at least directly, out of an AI system that can pass eighth-grade math tests. It's not relevant; no one's going to pay you millions of bucks for that. But a university can't do it, because, having taken a look at the papers you guys are producing, the infrastructure required to get there is monumental. So what are some of the big swings for the fences that you've been taking that excite you lately? Well, Semantic Scholar is the biggest one, where we have a scientific search engine built from the ground up. We are in the process of releasing a sub-project of that, headed by Dan Weld, who was a professor for many years at the University of Washington and has now joined us to lead this project. It's called the Semantic Reader, and it's basically for when you're reading papers. I feel like if you want to think about the history of reading scientific papers, okay, we had the cave wall, then we had the printed page, then we had PDFs you can read online, and not much progress since then, right? We still kind of labor over PDFs. Well, the Semantic Reader allows you to seamlessly look at citations while you're in the text, to look up definitions for terms inline, to do a lot of things I don't have time to describe, skimming, things that make the process of reading a scientific paper that much more efficient. So is this like a machine reading over your shoulder and taking notes for you? Not that sophisticated. It's much more of a tool; think of Acrobat Reader++, souped up to make it easier for you to read. So here's a very concrete example, something that we're very proud of: we've used language models to create TLDRs, one-sentence summaries of papers, that are really quite high quality. These have been published and measured, and they're really quite good. So often, as you're reading a paper, there are references to other papers, and you're like, what do I do? Do I click on that, and suddenly I'm reading another paper, and it has a reference, and I go down some kind of infinite rabbit hole? Exactly, an infinite rabbit hole. Or do I note them but then forget about it? Well, with the Semantic Reader, you can hover over that
reference and get a TLDR that says, okay, that's what that paper is about, and you can make a quick decision: hey, with one click I'll save that in my library for future reference, or no, that's not really relevant, I'll ignore it. So just little affordances, little tricks that are enabled by AI, that allow you to focus better and just be more efficient at reading the paper. And is the secret mission here to make AI researchers better at doing AI with the help of AI, in a flywheel? Is this a virtuous cycle? It's meant to be, but it's really to make scientists across all disciplines better at their job. So if we can make scientists across biomedicine, people working on climate change, what have you, if we can make them ten percent more efficient, that is significant. And potentially we can make them a lot more efficient, right? If I give you a TLDR that, you know, saves you an hour of groveling through the text, or even better, allows you to pursue something that you might have missed. Let me actually give you another example. We all now use adaptive feeds; we just don't call them that. Our Twitter feed, right, is automatically organized by an AI that has studied us, as is the Facebook feed some people use. Of course, in that case the motivation behind the algorithm is to make you click on ads. Exactly, exactly. You're doing the same thing but with a higher purpose. That's exactly right. So we have a feed for scientific papers, and you can train it. It'll show you new papers that you might have missed, that might result in amazing breakthroughs, and you'll tell it: I like this, I don't like that, yeah, that's interesting to me. And it'll automatically compute tomorrow's feed when new papers come up on arXiv and elsewhere, to help you in what's ultimately a needle-in-a-haystack search, right, finding that key result that you, with your human intelligence, connect with another result to have this amazing breakthrough. So yeah, that's a case in point.
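The trainable feed described here is, at heart, a relevance ranker learned from like/dislike feedback. A minimal sketch of the idea in Python, not AI2's actual system; the papers, tokenizer, and scoring scheme below are all illustrative placeholders:

```python
from collections import Counter

def tokens(text):
    """Naive lowercase bag-of-words tokenization."""
    return Counter(text.lower().split())

def train(liked, disliked):
    """Aggregate token counts from papers the user liked and disliked."""
    pos, neg = Counter(), Counter()
    for t in liked:
        pos += tokens(t)
    for t in disliked:
        neg += tokens(t)
    return pos, neg

def score(paper, pos, neg):
    """Higher score = more like the liked papers than the disliked ones."""
    return sum(pos[w] - neg[w] for w in tokens(paper))

# Hypothetical feedback collected from yesterday's feed
liked = ["neural common sense reasoning benchmark",
         "language model summarization of scientific papers"]
disliked = ["stock market prediction with technical indicators"]

pos, neg = train(liked, disliked)

# Rank today's new (made-up) papers to build tomorrow's feed
new_papers = ["common sense knowledge graphs for language models",
              "market volatility prediction"]
feed = sorted(new_papers, key=lambda p: score(p, pos, neg), reverse=True)
```

A production system would use learned embeddings rather than raw word overlap, but the loop is the same: user feedback updates the model, and the model re-ranks each day's new papers.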
And to your point, the entertainment feeds we have have a profit motive, but who has the motive to help you be a better, more successful scientist in whatever your field of study is? AI2 does, with AI for the common good. Cool. All right, give me a second big bet, something crazier. Well, we have scientists working to fight illegal fishing using computer vision. So there's a lot of satellite data, but smaller countries don't have the resources to analyze that data and identify the illegal fishing boats that are impacting their country's livelihood and so on. So we've saddled up to help solve that problem. It's called Skylight, and we just won a national competition, actually run out of the government, on who's got the best tools for analyzing the satellite data, and AI2 came in first in the U.S. We're very proud of that; it just happened a few months ago. We're also engaged in using deep learning for climate modeling. We're very interested in the problem of how precipitation, rain, will change as the climate changes in an unprecedented fashion. That's incredibly important for agriculture, for irrigation, for making decisions about the sort of infrastructure you need to keep us fed as the climate changes. Well, we're using the same types of models to help make these sorts of predictions. But really the craziest... Sorry, John, go ahead. I was just going to ask: I had recently heard that deep learning had found its way into weather modeling, and I didn't read enough into it to understand how. It kind of baffles me why you would use a neural network to build such a model, but at the end of the day it is just prediction, and deep learning is the ultimate prediction engine. That's exactly the answer. Whenever you have a lot of data and you want to make a prediction, we've learned that deep learning models are almost invariably really, really strong. But I want to get to the craziest
project, and maybe this is what you're alluding to, and that's the problem of common sense. So that's a problem that's been a holy grail for AI: how do we build a machine that has common sense? It's been a holy grail of AI for decades, but there really hasn't been much progress on it until recently, where Yejin Choi, who's a professor at the University of Washington and shares her time with AI2, is leading a team that works across both organizations to figure out how to endow computers with common sense. You know, if I ask you, can an elephant fit through a doorway, you would say probably not. If I ask you what's bigger, a nickel, right, the coin, or the sun, you would say, you're being silly, why ask me these questions? But if you ask those questions of most computers, they don't know, right? They don't have the kind of human experience you have. I think it actually goes deeper than that. I think that's just a great demonstration of the lack of common sense, but the thing that those of us in the devilish details of NLP see every day is: you change one inconsequential word and the model suddenly has no clue. It all maps back to a lack of common sense. And I want to highlight again, to go back to this fundamental question about whether we should be worried about AI: I think that common sense and common-sense ethics are actually really important here. So one of the fanciful scenarios that people love is the notion that you tell your computer to produce paper clips and it goes crazy, kind of a Sorcerer's Apprentice type of scenario, and it, you know, takes over all of humanity's resources to maximize paper-clip production and we all die in the process, right? There's no food, there's no energy, there's just paper clips. Well, what is that if not a tremendous lack of common sense and of ethical sense? So if we want to work towards having machines play a better role in our lives, it makes sense to start working on these problems now, but in a constructive fashion, not in a philosophical
fashion, or in a, you know, Chicken Little the-sky-is-falling fashion, but to say: okay, how do we build into computers the sense to not cause harm? And this is the alignment problem that people often talk about: how do we align AI with what we should be caring about for our own good? Yes, although with an important twist. The alignment problem really comes from traditional reinforcement learning, where ethics and values are reduced to a number, and you say, you know, I've got the number 15 for some world, John's got the number negative 15; how do John and I, or how do the computer and I, align our numbers? But that, in my mind, is actually a gross oversimplification, because how do you build something that figures out what the right actions are, that figures out how to evaluate a situation? Right, we often find ourselves in moral quandaries; we often make mistakes and then recover from them. So you'd say common sense is the first mountain to climb before all others? It's certainly a necessary mountain to climb. You know, I never want to say that the problem I'm working on takes primacy over other people's problems, but I would say that traditional value alignment in reinforcement learning is grossly oversimplified and ultimately inadequate for common sense and for moral reasoning. And so Yejin is tackling common sense. What are the angles of attack on this? Well, one of the huge questions that we touched on is: are neural networks enough, or do you also need to create symbolic knowledge? You know, "thou shalt not kill," right, does that have any value? Can you just use sentences from the internet, which can be, as we know, toxic, full of sexism, racism, xenophobia, anti-gay sentiment... And also mutually exclusive claims about everything. Exactly. So is our moral sense going to come from just a large and arbitrary collection of sentences, or do we have different ways to build a moral sense in a more responsible fashion? And so
those are some of the questions she's studied. And again, it's a very rich project. Is language enough? What about robots, should we put in computer vision, can we learn from videos on YouTube? Right, there's a lot to learn; language is just a limited data stream, and a lot of the work is now becoming multimodal. So what do you think is the best bet we have today for making any progress on common sense? I mean, so far I'd say the most impressive work has just been in creating better benchmarks to reveal how far we are from true common-sense understanding. That's actually been a great project across the world: airing our laundry with benchmarks that are actually challenging enough to show that no, we really are miles away; we're at the top of the tree, nowhere near the moon. So I think there is a lot of value in that, and I think that continues. There is a funny phenomenon: when you build a benchmark that's large enough, and the community kind of demands that, right? We humans arguably learn from relatively few examples, but here they say, hey, if you don't have, I don't know, at least a hundred thousand examples in your benchmark, it's not worth thinking about. But then the benchmark becomes kind of its own narrow task, and then you find we train a deep learning system on, you know, ninety percent of that data, we test it on the remaining ten, and lo and behold it does well on that kind of narrow task, and you're still left with this kind of doubt. Yeah, so we solved the task, we solved the benchmark, we solved the dataset, but did we actually solve the underlying problem? And often we find the answer is no, right? It's brittle: we make a little change and all of a sudden it falls apart. So I do think we need to go beyond this one-dataset, one-problem-at-a-time approach, to build something that cuts across multiple problems.
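The 90/10 protocol being critiqued here is the standard random holdout split. A minimal sketch, with a placeholder benchmark; the point in the comments is the one made above, that train and test come from the same distribution:

```python
import random

def holdout_split(examples, test_frac=0.1, seed=0):
    """Shuffle a benchmark and hold out a fraction for testing.

    Because the train and test portions are drawn from the *same*
    distribution, doing well on the held-out 10% shows we solved
    the dataset, not necessarily the underlying problem.
    """
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_frac))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

# Placeholder benchmark of 100,000 "examples"
benchmark = list(range(100_000))
train_set, test_set = holdout_split(benchmark)
```

Stress tests that probe out-of-distribution brittleness (perturbed wording, adversarial examples) are precisely what this in-distribution split cannot measure.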
But where are we going beyond benchmarks? Who's actually doing something that you think has a possible chance of being part of this near-future system that will have common sense, or something approximating it? You know, I take it you're skeptical that it's going to be a bigger language model. So, again, Yejin Choi's team, in a project called Mosaic, is building a massive resource of common-sense knowledge, a repository, so you don't have to relearn it every time. Is this like Doug Lenat's big collection of statements? So it's analogous to Doug Lenat's Cyc project, which went on over many decades, but there are several key differences. First of all, Cyc was a heavily logical system, and this is a much more modern system with elements of crowdsourcing, text, and model generation. But it is still a big collection of common-sense statements, right? It is, it is, so in that sense it's analogous. The second thing is that the Cyc project, at some point, I think it was in the 90s, gave up on the academic community, on careful experimental measurement, whereas the Mosaic project continues to produce new algorithms and innovations and to be both measurable and open. Another thing about Cyc: it was always hidden from view, a little bit like the Wizard of Oz. This thing is amazing, trust me, trust me, but, you know, you can't look behind the curtain. And I realize these are strong statements, but I do just want to give a nod to the fact that it was the right idea, at least in your mind: to collect common sense as, in a very literal sense, statements about the universe. Absolutely true. I think Doug Lenat and his team, the Cyc team, deserve a lot of credit for their courage in tackling this holy-grail problem back in the 80s, and they did it with the methodology of the time. I think they kind of lost their way over the years, and so we've picked up the baton, along with other people in the community. I also want to just mention another dataset that we have, which is called, I think, the Norm Bank: a dataset of little kind of vignettes or
snippets, with questions like: is it okay to mow your lawn at five o'clock in the morning? Or, you know, is it okay to kill a bear? Is it okay to kill a bear to save your child? Is it okay to kill a bear to amuse your child? All kinds of little scenarios like this, with a label that says yes, it's okay; it's not okay; it's not desirable; etc. And where do the labels come from? So they've come from people, and also from collection efforts done by other people. We're always trying to amalgamate and bring in resources created by others and then, of course, give them back to the community. So we've created the most powerful resource for training, or starting to train, ethical AI systems. So let's dig into that a little bit; this is really interesting. So I can imagine what you're generating is gold-label data, you know, like we know and love across all of AI, but it has an unusual property, which is that at the decision boundary there are going to be ambiguities where people disagree, and there's no amount of consensus that will get you to agreement. There are statements that people simply disagree on, and they always will. What do you do with that? That's actually a really unique kind of data; it has built-in, permanent ambiguity. You're exactly right. With a science question or math questions there's typically one right answer, certainly when we're dealing with grade-level science. Not the case here. And actually the system that we built on this, which is called Delphi, is available as a demo at delphi.allenai.org. So again, it's open, and you can see, with some effort... It's quite easy to trip it up, to get it to say the wrong thing. Well, when you ask Delphi a question, it can actually relativize its answers. It can say: if you're a conservative, you would think this, and if you're a liberal, you would think that. So it's starting... Yes, yes. So, right, it tells you that, and you can pose the question.
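One simple way to keep that built-in ambiguity visible, rather than collapsing it into a single gold label, is to retain every annotator's judgment per vignette and measure agreement. A sketch with made-up vignettes and judgments (the data, labels, and 0.7 threshold are all illustrative assumptions, not the Norm Bank's actual scheme):

```python
from collections import Counter

def agreement(labels):
    """Return the majority label and the fraction of annotators choosing it."""
    majority_label, count = Counter(labels).most_common(1)[0]
    return majority_label, count / len(labels)

# Hypothetical annotator judgments for Norm-Bank-style vignettes
vignettes = {
    "mowing your lawn at 5 a.m.":        ["not ok"] * 9 + ["ok"],
    "killing a bear to save your child": ["ok"] * 8 + ["not ok"] * 2,
    "a genuinely contested situation":   ["ok"] * 5 + ["not ok"] * 5,
}

for situation, labels in vignettes.items():
    label, agree = agreement(labels)
    if agree < 0.7:  # arbitrary threshold: flag permanently ambiguous items
        print(f"{situation}: contested ({agree:.0%} agreement)")
    else:
        print(f"{situation}: {label} ({agree:.0%} agreement)")
```

Items flagged as contested are exactly the ones where no amount of additional annotation will produce consensus, which is the unusual property of this kind of data.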
I don't want to get into controversial or painful topics, but take abortion, right? It has learned a model of the conservative view of abortion, and of the liberal view. Again, it has a long way to go, but it's exactly a platform to study the ambiguity you were talking about. So I'm about to ask you a question, knowing full well that you've been sort of dragged through hell and back in relation to the Delphi project, but zooming out just a little bit: how do you make productive progress on areas like this that you know are just fraught? You know that people are going to be upset; anything where you have a language model saying things are right or wrong according to whether you're a liberal or a conservative, someone's going to get upset. How do we make it okay to do that research, knowing that you're treading into a bit of a minefield? Like, I can imagine one extreme is we just don't ever touch that stuff, but I know how you feel on that topic: you're leaving gold on the floor. Yeah, and it's not just gold. I think that science is really hampered if there are questions that are third rails, right? We're not allowed to study how to build ethical AI systems because people will get upset? I think that's highly problematic. And you're right that when we released Delphi to the public, we probably could have done better in terms of putting warning labels on it: make sure you know that this is not the be-all and end-all, that this is a research prototype meant for open inquiry and so on. But people did get upset, and I would say two things. First of all, this is a great illustration of the adage "where companies won't." If we were a Microsoft, a Google, an Amazon, worried about our brand, we wouldn't do that. Look what happened with Tay, right? It was taken down, and there hasn't been, you know, a Tay 2.0 and so on. Microsoft hasn't touched that. Well, they
have a brand to protect, and I respect that. Our brand does not need to be protected; it needs to be the spirit of honest and open inquiry. And if we are alarming people, actually, I think there's value to that. If you look at what neural networks do and you conclude, hey, this really needs to be controlled better, then we've done part of our job, right? That's a good thing. So I don't think we court controversy, but we are steadfast in our support of open inquiry, as opposed to some kind of "cancel this, don't do it, it's too fraught." I do want to remind people in the audience, whatever their perspective is on the technology and on the effort, to remember that behind this technology there are people, grad students, researchers, and those people have feelings. And I have to say, when all the negative energy towards Delphi came across, I felt bad, but not because I was involved in releasing the project, or because people were upset. I felt bad for the people at AI2 who were the recipients of all this energy, energy that I think could have been more constructively targeted. I think anyone nowadays is very cautious about putting a language model out behind, you know, a text input portal anywhere on the internet, and maybe that's one of the practical outcomes: you just have to be very careful, because it's all too easy to elicit offensive stuff from these language models. They're a mirror of ourselves, and, you know, we are offensive to each other, and that's all baked into the language they learned from. And so, yeah, it just seems like there's a lot of caution around doing what you do, which is to just be open with your work and put it out there as a prototype, warnings and all. I think fewer and fewer entities in this space are willing to take that risk. Well, I really hope that we remain willing to do that over the years. It needs to be done appropriately, it needs to be done right, but there are people for whom it almost seems like a sport, right, to use
your phrase from a previous conversation, a sport to come and bash these sorts of efforts. And it's all too easy. I don't think it's sporting, and I would not recommend to the Olympic Committee that they include large-language-model bashing in the next Olympics. I would instead encourage the people who are worried about this to engage with building better models, with building better controls, because these models are being built, and AI is taking an increasingly participatory role in society and in decision making. So we need to figure this out, not bury the issue because it's too fraught. Last question, Oren. If you could time travel back those four years, to when you, you know, said in that packed room to all those people, including me, "this stuff barely works," what would you say today to this packed room? There are some tens of thousands of people listening right now, a lot of them hopeful and, you know, taking part in this revolution, which is absolutely underway. I mean, it's unbelievable what you can do today compared to four years ago, and who knows about four years hence. Has your advice changed? I would say that I, along with many people, have been surprised by the progress of the technology. So I would still say "this stuff barely works," but I would add the proviso: it's moving super, super fast. And then I would still add the cautionary notes: never trust an AI demo, and even if this looks very impressive, think about what's under the hood and what the implications are for society. Don't get caught up in the hype, not the negative hype of the sort that Elon Musk spouts, but also not the positive hype of the sort "okay, we have achieved sentience." We have not. Thanks, Oren, great talking with you. Thank you, John, a real pleasure.