Good afternoon, everyone. Welcome to the class on China's perspective on the Global Development Initiative. I've heard that this seems to be the last class of your course, right? So it's my honor and my pleasure to give this lecture and share some progress in artificial intelligence with you. Today my topic is artificial intelligence.
I believe everyone has heard these words, and you must have some thoughts about them. Maybe you also have some practice in AI research and applications. What I will talk about includes three parts.
The first is the fundamental concepts of AI. Whether you already know them or not, let's take a very brief overview of what is really important about these concepts.
Then I will briefly discuss the challenging topics we are thinking about today. I have summarized eight of them. Let's check how many of them you have ever thought about.
And lastly, I will spend almost half of the course on recent progress. I will share with you some recent AI research and application progress in China, and especially at Tsinghua University. My name is Ming Zhang, and I am from Computer Science and the Artificial Intelligence Institute.
So firstly, let's talk about the fundamental concepts of artificial intelligence. And yes, I'm sure this one does not work, so I have to check it. Okay, so...
Firstly, let's talk about the different types of knowledge. The first one is precise knowledge. In our world, we have quite a lot of precise knowledge that we can confirm, and in most cases it has unique answers.
For example, one plus one equals two, and the multiplication tables. Everyone, starting from childhood, begins by learning this precise knowledge. Precise knowledge does exist in the universe, but there is not so much of it. Quite a lot of knowledge belongs to the next type: common sense knowledge. Well, what is common sense knowledge?
Oh, this is not good, because actually I had a hidden surprise for you, but now you can see everything. For example, as common sense knowledge, we know that birds can fly.
But how about the ostrich? That is what I want to discuss with you. We know that the ostrich is a special bird: it is a bird, but it can't fly. So that is an exception to the common sense knowledge.
And that does not only happen with this kind of exception. For example, how about a dead bird? If a bird is dead, it can't fly. And in an even more general case, an injured bird cannot fly.
So there are quite a lot of exceptions to this common sense knowledge. That is pretty difficult, maybe for children: when we tell them that birds can fly, and later he or she meets an ostrich, they find out it is not the truth. Well, it is still common sense knowledge, but there are different cases, and sometimes even we human beings do not know the exceptions.
We human beings don't know all the exceptions ourselves. And yes, it's really difficult for an artificial intelligence, an AI agent. And then the third type is uncertain knowledge.
Well, what is uncertain knowledge? For example, we say that if it is cloudy and humid, then it is likely to rain. But it is just likely to rain, maybe with 80% probability, or maybe 60%.
Even today, a weather forecast will never, or will seldom, say that it will definitely rain today or tomorrow. It will say there is a really high probability of rain; sometimes a 90% or even 100% probability that it will rain tomorrow. That is uncertain knowledge, and you could say this uncertain knowledge governs our life yesterday, today, and tomorrow.
So let's translate that into the language of computer science, or of mathematics: we say probability. That's why probability-based approaches have achieved such a big success in the recent 20 years.
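To make this concrete, here is a minimal sketch of uncertain knowledge expressed as a conditional probability, applying Bayes' rule to the rain example above. All of the numbers are invented for illustration; they are not real weather statistics.

```python
# A minimal sketch of uncertain knowledge as a conditional probability,
# via Bayes' rule. All numbers are invented for illustration only.

def p_rain_given_evidence(p_rain, p_evidence_given_rain, p_evidence_given_dry):
    """P(rain | cloudy and humid) by Bayes' rule."""
    p_dry = 1.0 - p_rain
    numerator = p_evidence_given_rain * p_rain
    denominator = numerator + p_evidence_given_dry * p_dry
    return numerator / denominator

prior = 0.3             # assumed prior belief that it rains on a given day
seen_if_rain = 0.9      # how often "cloudy and humid" precedes rain
seen_if_dry = 0.2       # how often it appears on dry days anyway

posterior = p_rain_given_evidence(prior, seen_if_rain, seen_if_dry)
print(f"P(rain | cloudy and humid) = {posterior:.2f}")  # about 0.66
```

The point of the sketch is only that the answer is a degree of belief between 0 and 1, not a yes or no, which is exactly what makes this kind of knowledge different from precise knowledge.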
That's because we are trying to bring in more and more types of knowledge. In the very early years, for example 60 years ago, in the very early stage of artificial intelligence, when we talked about AI we talked about expert systems, and people were working on precise knowledge.
Then, for some common sense knowledge, people put the known information into the expert system and tried to do some inference, to help us do some reasoning. But I have to say that in the early stage we had quite a lot of hope, yet finally we were all disappointed by the real progress of AI technology, because we couldn't handle so much uncertain knowledge. Today, when more and more uncertain knowledge can be handled, or can be learned, we are getting closer.
We are approaching success, but to be frank, we are far from full, complete success. We are far from the final targets. We will come back to this point later.
Now that you understand what types of knowledge we have in real life, let's take a look at the bottom line of knowledge acquisition. Take riding bicycles as an example. How many of you are able to ride a bicycle?
Please raise your hand. More than half, right? Okay, thank you.
Then can you answer my question: how do you keep balance when riding a bicycle, in one or two sentences? Would anybody like to try? How do we keep balance when we are riding bicycles?
Yes, please, have a try. Oh yes, that is a very important tip: don't watch your feet, just look somewhere far away, right?
Yes, it's a very good point. But actually, even if we are looking far away, we can still fall down, right? So yes, that is one of the really important tips. Anything else? Yes, please.
Keep what? Keep falling? Oh yes, that is pretty important. But actually that is not how to keep balance; that's how to learn to ride the bicycle.
We will come back to this point later; it's really important. So maybe everyone feels this is something we know, a skill we have mastered. And you know that for human beings, once you learn how to ride a bicycle, you will most probably never forget it. Even if you don't ride any bicycle for, say, 20 years, when you get on one 20 years later, you can still ride it really smoothly.
But how do we do it? It is really difficult to put into words, and really difficult to summarize. That is the bottom line of knowledge acquisition.
We might think that's not so difficult. So, think about how children get to know cats.
How do you describe a cat? What is a cat? Suppose you have a daughter or a son, or a very young brother or sister.
Yes, please, have a try. How would you first try to define a cat? Yeah.
Thank you. That is also a very good answer.
But actually, that is how to learn what a cat is, just like learning how to ride a bicycle. It is not how to describe a cat, because that is pretty difficult. Maybe biologists could give it a scientific definition, but we can't give very precise information on how to describe one. So that is the challenge AI is facing too, because we can't tell the machines exactly what we want them to know.
So how do we learn it? Just like our wonderful student said: learn by example. And actually it's learning from quite a lot of examples, not only one or two. We show them different things, like cats and dogs, with all kinds of different dogs. Even if, for example, a dog is wearing glasses, that's no problem.
We could even show a child all kinds of animals: hey, this is a cat; this other one is also a cat; these are cats from different viewpoints. And then the child learns what a cat is. Another way is learning by practice, just like our other student said previously.
After you have fallen down quite a lot of times, suddenly you feel that you can ride a bicycle. That is what we human beings are doing, and it is what AI, what computers, are doing too.
So that is not easy. We have machine learning technology, which belongs to artificial intelligence; for example, it learns from features. Well, I know quite a lot of you are not native speakers of Chinese.
So I'd like to show you these two examples. How many of you have learned some Chinese during your studies here on campus? Can you recognize what the two characters are? For the non-native speakers, what is the first one?
"Two", yes. And this one? Yes, that's perfect. Actually, this would be difficult for non-native speakers, because the two characters are too similar to each other, especially in handwriting. So how can we teach computers to tell the two characters apart? For the non-native speakers, even for those who don't know Chinese: can you say what the difference is between these two characters?
Have you found it? Anyone? Yes, please. Yes, that's perfect.
Okay, thank you. It took us around five to ten seconds to find the differences, the distinguishing characteristics, of these two characters.
We call them features in machine learning. Other things are not the key point, for example whether a stroke is perfectly horizontal or at a slight angle; the key point is the length of the two strokes. That is a feature. So for quite a long time in artificial intelligence and machine learning, people worked like this: first find very useful and helpful characteristics of what we want to learn, and then teach the computer, "hey, these are the distinguishing features, and you need to use these features to do the character recognition."
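As a toy sketch of what such a hand-crafted feature looks like, the rule below separates two hypothetical characters purely by the relative length of their two horizontal strokes. The character names, the stroke measurements, and the 0.8 threshold are illustrative assumptions, not a real character recognition system.

```python
# A toy sketch of a hand-crafted feature for character recognition.
# Names, measurements, and the threshold are illustrative assumptions.

def stroke_length_ratio(top_len: float, bottom_len: float) -> float:
    """Hand-crafted feature: top stroke length relative to bottom stroke."""
    return top_len / bottom_len

def classify(top_len: float, bottom_len: float) -> str:
    # If the top stroke is clearly shorter than the bottom one,
    # guess the character whose top stroke is short; otherwise the other.
    if stroke_length_ratio(top_len, bottom_len) < 0.8:
        return "short-top character"
    return "long-top character"

print(classify(1.0, 2.0))  # top stroke half the bottom stroke
print(classify(2.0, 1.0))  # top stroke longer than the bottom stroke
```

For decades, most of the engineering effort went into inventing features like this ratio by hand; the learning algorithm only saw the numbers the humans chose to compute.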
That went on for around 40 to 50 years. Then people started thinking: can those features be found automatically? Well, that is a great question.
If we could find those features automatically, we would save quite a lot of human effort in teaching the computer, in teaching the artificial intelligence. And yes, later people used neural networks.
Although everyone here has probably heard about deep learning, deep learning is essentially about computation, about more layers of neural networks; it is not true that neural networks were proposed only in the most recent 10 years. Actually, neural networks were proposed in the very early stage of artificial intelligence history, at least 50 years ago. But at that time, we didn't have such fast computers, and we didn't have such big storage resources.
So we didn't have enough resources to support the calculations for complex, multi-layer, or even more-than-one-thousand-layer neural networks, and we didn't have much success on the final results; we could only do simple tasks. But in the most recent years, when we have supercomputers and much more computation power, thanks to Moore's law and the fast development of computer hardware technology, we can finally train very deep neural networks.
For example, this is a figure I found on the internet; you can find all kinds of such images there. What we submit to the neural network is just raw pixel information, and the middle layers can automatically learn the key features. For example, you can see that these respond to faces.
And these are the key features for cars: you can see wheels here. Also elephants: they have a distinctive shape with a long trunk. And for chairs, different types of chairs.
All of these middle-level features and higher-level concepts are automatically learned by the models, by the neural models. That is big progress for artificial intelligence, because we have had quite a lot of success in automatically learning features, although it is far from perfect, and although neural networks are not the only helpful and useful machine learning algorithms; they are just among the strongest models currently. I believe there will be more later. So after talking about all this, let me summarize: when we are talking about artificial intelligence, what should we think about first? What are the most important things?
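To show what "features learned automatically" means in the smallest possible setting, here is a sketch of a two-layer network trained on XOR, a task no single linear layer can solve, so the hidden layer is forced to invent its own intermediate features. The layer sizes, learning rate, and iteration count are arbitrary illustrative choices, not a recipe.

```python
# A minimal two-layer network that must learn its own hidden features
# to solve XOR. Pure NumPy; all hyperparameters are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)   # hidden layer: learned features
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)   # output layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):
    h = sigmoid(X @ W1 + b1)          # hidden "features": learned, not hand-made
    out = sigmoid(h @ W2 + b2)
    # Backpropagate the squared error through both layers.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ d_out; b2 -= 0.5 * d_out.sum(0)
    W1 -= 0.5 * X.T @ d_h;   b1 -= 0.5 * d_h.sum(0)

print(np.round(out).ravel())  # typically converges to the XOR pattern 0, 1, 1, 0
```

Nobody tells the network which combinations of the two inputs matter; the hidden layer discovers them during training, which is exactly the shift from hand-crafted features to learned ones.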
The first is the input. The input may be rules or tips, like "look somewhere far away" when you are learning to ride a bicycle, or different images of cats. That is what we give the AI agent, the AI computer program, to learn from. The second is how to describe it, that is, how to represent it. We call this representation.
For a cat, we can represent it as just an image composed of pixels, for example 32 by 32, which is 1,024 pixels. Or we can give it some description or definition: it has two ears and one long, soft tail, and it can live with human beings as a pet. What information do we use, and how can we represent it in the system?
That is the second question. And the third one is: what is the target output? The target output is, for example, whether it is a cat or not, or which animal it is.
The difficulty of those two questions is really different, right? For "whether it is a cat or not," you just say yes or no; even if you don't know what the animal is, you could still score well on that question. In other cases, not. And also, importantly, for uncertain knowledge the output is not "will it rain tomorrow, yes or no"; the output could be the probability that it will rain, 20% or 80%. Or it could be a precise answer for precise knowledge, or common sense knowledge with its exceptions. These all shape the output. And the last one is the algorithm: what kind of model, what kind of computer algorithm, do we use to learn to generate this output, or to search for it? The neural networks that I just mentioned are one branch of the algorithms, but that's not all.
And I have to say that in the past 60 years, people have proposed, I believe, thousands of different algorithms in artificial intelligence. Some have already been proved to be failures, some achieved great success, and some that people previously thought were failures are now great successes, just like the neural models.
Finally, what is the performance? How do we evaluate it? I will show you some examples later, in several minutes, or several tens of minutes.
From our results, let's check how the evaluation can be made and what the performance looks like. So that is the first part of my talk. The second part is about the challenges.
Well, actually, if we had enough time, I could teach the challenges for around three to four classes, but I only plan to take around 10 to 15 minutes on them. Before that, let's have a warm-up discussion. Take the personalized recommender system as the example, because all of you must have used some kind of recommender system. For example, when you buy something on an e-commerce website, on Amazon, eBay, Jingdong, Taobao, anywhere you shop online, you must have used a personalized recommender system.
And when you are watching videos or short videos, listening to music, or reading news feeds, all of these involve automatic, personalized recommendation systems, even when you are planning some travel. Okay, so think about your experience.
Which of the following factors is more important to you: fairness, diversity, or accuracy? I believe you understand what those factors mean, right?
Fairness, diversity, and accuracy. Do you mostly pay attention to one of them with high priority, or do you care about all of them? And a third question, in addition: do you think there are any other factors that influence how you evaluate a personalized recommender system? Take two or three minutes for discussion. Who would like to share with me and with all of our classmates?
What are the important factors for you? You could speak about only one question, or about all of them. Anyone who would like to share?
Yes? Yeah. So in most cases, apps, that's for me, but.
Yeah, thank you. I mean, in many cases, you probably don't know what you are trying to get, or you have a very vague sense of what spectrum of merchandise you want. Then if the algorithm contains
a mechanism for diversity that is triggered once you log in, you'll be more likely to browse it like a brochure of merchandise. And the more you browse, the easier it is to make up your mind. It's the process of finalizing what's in your mind: narrowing down the spectrum, then pinning down exactly what you want.
I think for online shopping, that's a strong suit. Okay, thank you. That does make sense: diversity is really important when you are browsing different categories, trying to find out what fits you best.
Thank you. Great. Could anyone else share, or does anyone have any other factors in mind?
Yes, please, the one in gold over there. Thank you. I think fairness and diversity are subjective, whilst accuracy is not.
So in the case of online shopping, I would say accuracy should come first. You want to know that the price you see is right, that the address of the shop is correct, and that the tax you're going to pay is the right number. Only after that would you be interested in the fairness and diversity, and each person has different preferences on those.
Somebody might think fairness means a product that is environmentally friendly, and that might be their priority. Another person might be more concerned that the person who produced the product was paid a fair wage, and that might be their idea of fairness. And then in terms of diversity, some people might want to buy local products, some might want to see products produced by different ethnic groups, and so on.
So because it's subjective, I think the machine would need to be more sensitive on this; it would need to know that particular person's preferences. But accuracy, I think, would be true across the board. Yeah, thank you, perfect. That's coming from another viewpoint: subjective versus objective.
So actually, that's great. You also give an explanation of why recommender systems basically use accuracy as the gold standard to evaluate everything. Firstly, they use the accuracy-related metrics, and then they can add some other factors like fairness, diversity, and novelty. That might be because of this subjective-versus-objective distinction: yes, the personalized factors do have quite a lot of impact on people's choices, but accuracy gives us something relatively objective to measure against. That's a very good point. Yeah.
Thank you. Any other ideas, or any other factors you care about? Yes, please.
Okay, thank you. I would fold fairness into accuracy, because when I shop online, I want to make sure that the claimed quality of what we are buying matches the actual quality.
So in my own definition, fairness in that sense also leads back to accuracy. Yeah, thank you. That is a very good point.
So actually, take the recommender system as an example. In the early years, people only cared about accuracy. Later, people raised the diversity factor.
And in the most recent years, within the last six to seven years, people have started to pay more attention to fairness. When people talk about fairness, some of them, just like you, ask whether, for example, the price is fairly linked to the quality.
That is item-quality fairness. And sometimes people also think about user fairness, because starting from some publicly available datasets, like MovieLens and other general-purpose public datasets, people found that the performance the recommender system provides to female users is worse than to male users.
And, for example, the results provided to the old or the young are worse than those for the middle-aged. That's not because the system or the algorithm is deliberately biased, saying "hey, I don't want to provide good results for the female or for the young and old." Maybe it is because of the bias in the data: we have less data about the elderly and teenagers, but more data about the middle-aged.
Or, for example, people sometimes find that males are more willing to leave comments, and females are, in general, a little bit more shy about providing their comments to the system. But that's only one part of the explanation.
So it is not that the system wants to do this; the observation is simply that unfairness exists in all kinds of recommender systems. So people are working on the fairness issue, and what you are talking about is very important: fairness is sometimes not an independent factor. Sometimes it links to the quality, links to the accuracy, and maybe links to the diversity. Sometimes people also think diversity is part of fairness.
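To make the factors from this warm-up discussion measurable, here is a toy sketch of an accuracy metric and a simple diversity proxy computed on a single recommendation list. The item IDs, relevance labels, and categories are all invented for illustration; real systems use many more refined metrics.

```python
# Toy measurements of accuracy and diversity on one recommendation list.
# Item IDs, relevance labels, and categories are invented examples.

recommended = ["a", "b", "c", "d", "e"]
relevant = {"a", "c", "e"}                      # ground-truth liked items
category = {"a": "books", "b": "books", "c": "music",
            "d": "film", "e": "books"}

# Accuracy: precision@5, the fraction of recommended items that are relevant.
precision = sum(item in relevant for item in recommended) / len(recommended)

# Diversity: fraction of distinct categories covered by the list.
coverage = len({category[i] for i in recommended}) / len(recommended)

print(precision, coverage)  # 0.6 0.6 on this toy list
```

Notice that the two numbers can move in opposite directions: padding the list with more "books" items might raise precision while lowering coverage, which is the tension between accuracy and diversity discussed above.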
So that's really wonderful. You have provided very valuable comments and notions on those factors, and now you can understand that it is not easy. So, what are the challenging topics we are facing in artificial intelligence?
The first one is confidence. Here confidence includes, on one side, the system's confidence in its own results, and on the other side, how confident we human beings are in the AI system.
Do AI systems face a crisis of confidence? Can we believe in an AI system when it gives us some suggestions, or even some decisions?
Take automatic driving, for example. Let me ask: how many of you would trust the automatic driving system to drive the car fully out of your control, where you don't do anything at all?
How many of you would trust autonomous driving to do that now? Please raise your hand. Yes, good, we have around ten. Now, for those of you who did not raise your hand: how many of you believe that you will trust the autonomous driving system to fully control the car some day, for example, maybe not now, but in 10 years? Please raise your hand. Okay, also around ten, a little bit fewer. And how many of you think, "maybe they could help me control the car, but I will never trust them to fully control it"? Please raise your hand. Yes, a similar number again. So for the three questions, we have almost three equal groups of students. That shows the crisis of confidence, the confidence challenge to AI systems.
The second one is ethics. I know that whether or not you like science fiction novels or movies, you must have thought or talked about this. What are the ethics principles for AI systems?
Should they be the same as for human beings? Or should they be set to always put human beings as the first priority, just like the Three Laws of Robotics in the famous early science fiction novels, right?
Or could an AI system have its own ethics? Do we need to do ethics alignment with human beings, with our society, with us? That is one of the challenging issues we are discussing right now, especially in the most recent year, and I believe the discussion will continue for the next several years or even longer. And by "we" I don't only mean those sitting in this classroom, or at Tsinghua University, or in China; I mean almost everyone on earth: people, researchers, and governments from different countries are trying to discuss this ethics issue of artificial intelligence.
And the third one is fluency and accuracy. This is a technical issue; actually, even the previous two are closely related to technical issues. For fluency and accuracy, you must have used some kind of large language model, for example ChatGPT, or Bing with OpenAI, or others like Baidu's Wenxin Yiyan or Baichuan, all kinds of large language models, LLMs.
On the first day, when they jumped out at everyone early this year, people were so surprised by their performance. But actually, in the view of AI researchers, that is not because of their strong power of accuracy; no, it's because of their strong power of fluency.
I will show you our survey results. Before this class, I asked your teachers to send you some survey questionnaires. Thank you, quite a lot of you submitted your answers, so we could do some statistics, and I will show you the results later. Actually, what we call hallucinations in large language models is exactly this: the output is not accurate. Large language models can give us wrong results, hallucinations, quite a lot of ridiculous results, but fluently.
So that is one of the challenges we are facing today.
So, boys and girls, let's continue with the challenging issues. We just talked about the first three, and now we come to the fourth one: explainability and interpretability.
What is the difference between the two concepts? Explainability is the explanation given to the users. For example, for a recommender system: why did the system recommend this item to me, and not the others?
That is explainability. Interpretability is about monitoring the system itself: why did the system produce this result, this ranking list, and not another? Why does the system have, for example, a fairness issue? Why couldn't it rank the freshest result at the top? Why does it have this popularity bias? So interpretability is the explanation of why the system got these results, while explainability is usually related to the users' information needs and their understanding. And yes, the next challenge is fairness.
Well, when we talk about fairness, starting from the very early point, maybe six to seven years ago, people began with: what definition should we give to fairness? What is fairness? People borrowed concepts from social science, like the Gini score, to measure the fairness of a system.
And people talk about group fairness with respect to some sensitive attributes, like male and female users, or high-price and low-price items. Sometimes people also talk about performance-based fairness. When we talk about fairness, it doesn't mean we should provide identical results to everyone; that would actually be essentially unfair. What we need to do is provide equally helpful and useful items to different users, because different users have different information needs.
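Since the Gini score borrowed from social science comes up here, this is a short sketch of computing it over how much exposure each user or item group receives from a system. The exposure values are invented; a real study would define exposure carefully (impressions, ranking positions, clicks, and so on).

```python
# A sketch of the Gini coefficient as an (un)fairness measure over how much
# exposure each group receives. The exposure values are invented examples.

def gini(values):
    """0 = perfectly equal; values near 1 = exposure concentrated on few groups."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    # Standard closed form using rank-weighted values.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n

print(gini([10, 10, 10, 10]))  # equal exposure -> 0.0
print(gini([0, 0, 0, 40]))     # all exposure on one group -> 0.75
```

A system could then report this number alongside accuracy, which is one way the fairness definitions discussed above become something concrete to optimize or constrain.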
So people spent around two to three years discussing what fairness is, how to evaluate it, and what the definitions are. Then we started thinking about how to improve fairness. And around three to four years ago, when we tried to improve fairness, we had to sacrifice accuracy.
So that was a trade-off: if you want to be fairer to the users, you have to hurt some accuracy, some performance. But the good news is that in the most recent three years, we have started to find win-win strategies: we can have a win-win situation and increase both accuracy and fairness. And we are still thinking: is there anything more we need to do? Can we take more factors in and make it not a trade-off, but a multi-win situation?
So that is the fairness issue. Next is privacy. I don't need to say a lot about privacy, but it is really important today, when quite a lot of information is shared between people and with the systems. People are paying more and more attention to the privacy issue. We will come back to this point later.
And then effectiveness and efficiency. We know that large language models do help in some cases, but we also know that training a large language model is not a trivial thing. Sometimes a single training run needs on the order of millions or even tens of millions of US dollars. So that is related to the resources, the money, and the time; it is an efficiency issue.
So can we do some improvement on that? And the last challenge is about philosophy, philosophy and methodology.
Just two days ago, I attended a conference on AI governance, AI ethics, and AI technologies. I had discussions with social scientists, philosophers, people from the legal field, people from companies, scientists, and even historians doing historical research. So, when we talk about philosophy, think about the airplane, which everyone here must have taken at some point.
There is a really important philosophical question behind the airplane. When people first tried to build flying machines, should they be like birds flying? That would mean using flapping wings to fly. In the early days, several hundred years ago, people did try to fly with big flapping wings, and they failed. Then people discovered aerodynamics. Even today we cannot fully explain why such heavy airplanes stay in the sky, but it works. And an airplane does not fly the way birds do: it does not flap its wings; it uses fixed wings with small adjustments. So that is the philosophy. Similarly, when we talk about artificial intelligence, well, I know that every science fiction movie or novel imagines something like a human being: robots like human beings, or robots that will replace human beings. But what should AI really be?
Should it be brain-like? When we think about thinking and reasoning, should it work the way human brains work? Actually, we don't know how our brains work; even today we don't know clearly. Or should it be an artificial brain: should we build something like a huge number of neurons with a huge number of connections? Yes, neural models work today, but they are still far from the human brain. Or should it be brain-inspired, brain-inspired computing? Here, brain-inspired means we don't need to build an AI system exactly like human beings, or always learning from human beings, because AI is something different from human beings. So maybe when you get back home after today, if you write a science fiction novel or design a movie, you could design something different.
The AI in the future world does not necessarily look like a human being made of metal with strong muscles. Maybe not. Maybe it looks like a water drop, or a leaf, or even something fully transparent, like a flower, like water. It's possible. And actually, today people are already building liquid-based intelligent equipment, and using that kind of liquid-based AI device to help doctors perform operations. That is helpful. So it is not necessary for AI to look like a human being.
Okay, so those are the eight challenges. And one of the biggest is the philosophy behind what we are trying to do, with all kinds of different possibilities.
And now the progress. For this part, I am really proud and really glad to show you the progress of AI at Tsinghua University. First I will show you an introduction to the AI Institute and what researchers there are doing now. Actually, I believe this video was created around two years ago,
so it does not talk about large language models, but I will cover those a little later. (Video plays.) An era of artificial intelligence has emerged quietly. When it comes to AI, you should never miss either China or Tsinghua University.
Hi, I'm Liu Yang. Welcome to the Institute of Artificial Intelligence, Department of Computer Science and Technology, Tsinghua University. Here we have a superb research environment and a tradition that encourages innovation. Professors and students conduct exciting research together, which has significant influence in the world. Our research team is divided into eight groups, focusing on eight different directions.
Now I would like to invite teachers from these groups to introduce some research achievements that they take pride in. Our lab aims to teach computers to understand human languages, and to speak and communicate with people. Especially, we want to make AI understand the human culture in languages, and develop AI poets to write classical Chinese poems.
Today, text generation is a hot topic in artificial intelligence. AI can generate fiction and essays from given keywords or sentences. I am working on text generation, for instance generating stories using knowledge, planning, and other deep learning techniques. Besides poems and stories, we can also let AI generate music.
Specifically, we use AI technologies to generate melodies, chords, and accompaniment arrangements. Meanwhile, we are studying neuroscience and cognitive science, trying to bridge artificial intelligence and brain intelligence. Don't underestimate the difficulty of having a robot play music. We first model the individual observation model, the gesture database, and the knowledge base, and establish the mapping relationship between the gestures and the database.
Through behavior learning, we can use the knowledge base to compute the fingering, and then the robot can play the music. We also do research related to intelligent service robots. Our robot can show you around our lab. In banks or hospitals, our robot can answer the questions you may ask.
To realize flexible interactions, we also do research on the key technologies related to multi-modal information processing. In addition to such amazing robots, we also have autonomous driving cars.
In recent years, we have focused on visual deep-learning-based methods for environmental perception, autonomous navigation, and decision making. There is a lot more you may be interested in from our information retrieval technology. We have proposed a series of advanced web search and explainable recommendation methods based on user modeling and user satisfaction.
Our technology has already been adopted by many international and domestic companies, and we have served billions of users. All the above systems rely heavily on machine learning techniques. My group has been working on various topics in machine learning, including probabilistic machine learning, adversarial machine learning, and decision making under uncertainty. Now you have seen some of our achievements, but what you see is only the tip of the iceberg of artificial intelligence.
We sincerely welcome you to Tsinghua University to work together with the best students from all over the world. Okay, so that was a very brief introduction to the AI progress, and I will take some examples to give a few further updates on what we have been doing in the two years since that video was made.
First, the fundamental theory of AI and machine learning, from the Statistical Artificial Intelligence and Learning group. They work on basic algorithms such as Bayesian methods and deep learning, which is what we just showed as an example; efficient machine learning, where efficient means taking the cost into consideration; and adversarial machine learning, which is something like a battle: one side defends while the other attacks.
And reinforcement learning, which has become really popular in the most recent two years together with large language models and is receiving more and more attention; and brain-inspired AI, which reflects our philosophy that AI does not need to be brain-like, but brain-inspired. Here is a brief journey through this line of machine learning work. In the early years, around 10 to 15 years ago, people worked on probabilistic graphical models, which are still really powerful techniques today, together with Bayesian methods, scalable learning and inference algorithms, and deep learning.
Well, we started deep learning around 10 years ago, and we also have quite a lot of achievements there, as well as in probabilistic programming and decision making. The lab also shares open-source resources on GitHub; for example, ZhuSuan is a GPU-enabled library for probabilistic deep learning.
It has received quite a lot of stars, and it does help. And here are the most recent advances in diffusion models. These are images generated from the prompt "there are two yellow birds in a big green tree." They are the generated results, really beautiful and wonderful ones.
And here is an example of replacement. This is the original image; we do some extraction and then replace the silver dress with a gold dress. And this image of a hamburger and chips is purely generated, shown from different angles; it is really wonderful. These are generated with dreamlike colors and styles. So that is the diffusion model. Next, adversarial attack and defense in deep learning. For example:
For us human beings, this is a panda, right? And for the machine it is also a panda, with 57% confidence.
What this group did is add this kind of noise, which to us looks like nonsense, just noise, and then generate this new picture. The new picture still looks mostly like a panda to us, but an AI algorithm will recognize it as a gibbon with 99% confidence.
So AI is not a human being: given this modified picture, the same algorithm decides it is not a panda but a gibbon. This is called an attack, the adversarial attack. Think about it: you show some picture to the AI, someone adds some noise to it, and the AI produces a totally wrong result.
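The attack behind the panda example is, in spirit, the fast gradient sign method: nudge every input dimension by a small epsilon in the direction that increases the classifier's loss. Here is a minimal sketch on a toy logistic-regression "classifier"; the model, weights, and data are made up for illustration and are not the actual system from the talk:

```python
import math

# Toy "classifier": logistic regression, p(class 1 | x) = sigmoid(w.x + b).
w = [2.0, -3.0, 1.0]
b = 0.1

def predict(x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 / (1 + math.exp(-z))

def fgsm(x, true_label, eps):
    """Fast-gradient-sign-style perturbation for the logistic loss.
    For this model d(loss)/dx_i = (p - y) * w_i, so step eps along its sign."""
    p = predict(x)
    grad = [(p - true_label) * wi for wi in w]
    return [xi + eps * math.copysign(1.0, g) for xi, g in zip(x, grad)]

x = [0.5, -0.2, 0.3]                     # confidently classified as class 1
p_before = predict(x)                    # high confidence before the attack
x_adv = fgsm(x, true_label=1, eps=0.5)   # sign-only perturbation (eps is
                                         # exaggerated here for a tiny model)
p_after = predict(x_adv)                 # confidence collapses after it
```

On a real image classifier the same idea uses an epsilon small enough that the perturbation is invisible to people, which is exactly the panda-to-gibbon effect described above.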
That is terrible. There was a competition on this: at NeurIPS 2017 there was an AI security competition, held by Google, with hundreds of participating teams, and this group won first place in all tracks.
So this is really important for today's AI safety and security. They have released quite a lot of this work open source on GitHub, and if you need it you can use it, such as the diffusion tools we showed previously. The next progress is in another direction; it is a kind of interdisciplinary research.
Can we use cognitive science to help us model users when they watch short videos? For example, when you watch short videos on TikTok or videos on YouTube, what are our brains doing? What is the brain activity, how does the mood change, what are the biological signals, and what is the user behavior? We try to understand that, and we use a concept from psychology called immersion.
Maybe you are not so familiar with immersion, but some of you may have heard of flow. Flow means you are in a really pleasant state: you are happy, peaceful and quiet, and you just focus on what you are doing. You forget about other things, forget about your troubles and the environment around you. You focus on what you are doing right now, and you enjoy it.
That is the state of immersion. Well, I hope you can be led into an immersion state in my course. There are four characteristics of the immersion state: lack of time awareness, a sense of transportation to another reality, emotional involvement, and captivation.
We recruited more than 200 respondents for the survey, and the participants wore intelligent wearable devices. They did a field study for one week: we gave each of them a mobile phone with a newly registered account in the short-video app to track their status. We also invited them into the lab for a lab study with EEG equipment,
collecting their EEG signals. All of this was approved in advance by the ethics committee at Tsinghua University, so there was no harm to the users. So what did we find?
First, we found that as people watch short videos and time passes, the immersion score keeps increasing, and people can enter the immersion state. But after 50 or 55 minutes, and certainly after 60 minutes, the immersion score drops. So here is a kind reminder: if you are watching short videos or online videos, please limit your session to within one hour, because with longer time you lose your immersion. What is also really interesting: with personalized video recommendations, people reach a better immersion state, and with randomly selected results, people are less likely to enter the immersion state. On the scale, one means less immersion and five means more immersion. But what is really important is this result.
If we mix the personalized recommendation results with random results, it shows that we do not have to always show what you liked before. We can also randomly add something new, some new type of video, something we don't know whether you were ever interested in before, and the immersion state does not suffer. That gives us an opportunity to give you better, more diverse results,
to break the information bubbles. Because if the purely personalized setting were always better, the system would always recommend what you have liked, seen, and watched before, and what you see would be narrowed down. But our study shows that this is not necessary: we can merge these results with random ones.
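The mixing idea can be sketched in a few lines. This is a hypothetical illustration, not the study's actual system: fill most of a recommendation slate from the personalized ranking, reserve a small exploration quota for random items from the rest of the catalog, and shuffle so the random items are not all at the bottom.

```python
import random

def mixed_slate(personalized, catalog, slate_size, explore_ratio, seed=0):
    """Build a slate that is mostly personalized picks plus a few random
    items from the rest of the catalog, to break the filter bubble."""
    rng = random.Random(seed)
    n_explore = max(1, int(slate_size * explore_ratio))
    n_personal = slate_size - n_explore
    slate = list(personalized[:n_personal])
    pool = [item for item in catalog if item not in slate]
    slate += rng.sample(pool, min(n_explore, len(pool)))
    rng.shuffle(slate)                 # don't bury the exploration items
    return slate

catalog = [f"video{i}" for i in range(20)]
ranked = catalog[:10]                  # pretend this is the personalized top-10
slate = mixed_slate(ranked, catalog, slate_size=8, explore_ratio=0.25)
```

With `explore_ratio=0.25`, two of the eight slots go to items the model would not have chosen, which is the kind of mixture the immersion result says users tolerate well.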
We also looked at the EEG signals. For example, the prefrontal area, roughly here, has a positive correlation with immersion. And this part, the frontal area, around here on the head, has a negative correlation with the immersion state. Also, the delta band is related to something like sleepiness, while this one is related to a high-arousal, high-cognition state. So imagine if everyone were wearing such EEG equipment,
and we could show a color on top of your head: we could check whether you are in the immersion state. How does this notion of immersion help? We found that immersion can be predicted reasonably well. And this point is really important: currently, almost all video or short-video recommender systems use the view ratio as the accuracy signal for whether a recommendation result is good or not. Sometimes people use liking: whether you clicked "like" on that video.
If yes, it is taken to mean you are satisfied with it. But actually, the correlation between liking and satisfaction is only 0.6. Yes, it is a positive correlation, and it is acceptable, but not so good.
But if we use immersion, the correlation can be 0.80. Even if we do not use the true immersion the user reports, but a predicted immersion with some noise, because we cannot guarantee 100% correctness of the prediction, the correlation still reaches 0.74. That is much higher, about 30% higher than before. So it means: please do not just use how much of this video I watched as the metric; use the predicted immersion state of the user instead.
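The comparison of the two correlations (0.6 for liking versus 0.74 to 0.80 for immersion) is a plain Pearson-correlation computation. A sketch on made-up numbers, just to show the mechanics; the values below are not the study's data:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up per-session data: reported satisfaction vs. two candidate signals.
satisfaction = [3, 5, 2, 4, 5, 1, 4, 3]
liking       = [0, 1, 0, 0, 1, 0, 1, 1]   # binary "liked the video"
immersion    = [3, 5, 2, 5, 4, 1, 4, 3]   # predicted immersion score

r_like = pearson(satisfaction, liking)     # weaker signal on this toy data
r_imm = pearson(satisfaction, immersion)   # stronger signal on this toy data
```

The study's claim is exactly this kind of comparison at scale: the immersion signal tracks satisfaction more tightly than the like button does.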
And then we come to the really popular topic of large language models. I believe everyone was expecting to hear something about large language models today, but please be aware that they are not the only thing in artificial intelligence, although they have become one of its most important parts. Previously I planned to have this discussion with the students here in the classroom, but then I thought, why not do it before class, so I can show you the results directly? So, let's check what we learned. Have you tried large language models?
How frequently? What tasks do you use them for? And how do you evaluate your satisfaction as a user?
Thank you. I believe that 23 of you had submitted the survey by yesterday, so you helped me produce these statistical results for our class.
Frequency. Well, around one third of the students use large language models daily. That is much higher than I expected. And 22% weekly, only 5% monthly, and one third have tried them a few times.
So it shows that either we rely on them very much, every day or every week, or we have just tried them without really adopting them. That's interesting. Next, the usage:
what percentage of us use large language models for which tasks. More than half, 55%, use them to solve problems. And because this was a multiple-choice question, also 55% use them as a text assistant; for example, "please help me revise my summary." Actually, the summary I sent to you was revised by a large language model. I have to say that the English summary was first generated by a large language model from the Chinese version I provided, but that was not entirely helpful, because it made some mistakes on the professional keywords, and I had to revise it manually. So that is the text assistant: still, it saved my time. And 50% ask for ideas, 45% use them for information retrieval, and 41% for research purposes. The user satisfaction results are really interesting. The most satisfying task is the text assistant: 77% are satisfied or very satisfied (the yellow and light orange parts), and it also has almost the lowest dissatisfaction (the red and dark orange parts), only 9%. What is really interesting is that asking for ideas, asking about problems in a specific area, and research usage have not-so-good satisfaction, around 40 to 45%, and the highest dissatisfaction, 18% to 27%, nearly one third. And for leisure, it's okay.
For leisure, it is not so good, but not so bad either. So that's really interesting: we have different intents, so we need different metrics for large language models. And what kind of answers do we expect?
Mostly factual answers: for information retrieval, for search, and for research usage or problem solving, we expect factual results. For asking for ideas, for leisure, and for asking for advice, we expect more creative results; and for advice it depends on what kind of advice we can use. And the last one: areas to improve.
According to your experience, what are the most important things to improve? Although we have a top three, look at this pie chart: it is almost uniform across the factors.
It means everything needs to be improved greatly. The first one is hallucination: around 68%, nearly 70%, of people think hallucination is the thing they cannot bear, because it produces wrong content. And the second one is this:
multimodality. We hope the models can provide more multimodal information. And this one is privacy. Privacy, again; we already mentioned it among the challenging issues.
Also, 50% of us believe privacy is a really important issue, not only for AI in general, but specifically for the techniques of large language models. Okay, thank you for your survey responses, which helped us get this knowledge about large language models.
Then, what we have done: the Tsinghua KEG, the Knowledge Engineering Group, has built foundation models, which they call ChatGLM. They released different versions: a big model, a moderate-size model, and a smaller model. If you are interested, you can scan the QR code.
For example, there is the six-billion-parameter model. Six billion parameters is really huge, but for large language models it is a small model, a small large language model. It has received more than 50,000 stars on GitHub, because people find it really interesting and important, and more than 10 million downloads from HuggingFace. Within two weeks of release, it took the number-one download spot on HuggingFace. And here is the alignment of functionality between Tsinghua's GLM series and the GPT series: they have a chat version and we have ours; we have the Cog version, a version for code, a web version, and, on the way, the cognition and agent versions.
Here is the roadmap. You can see that although people came to know this area through ChatGPT, this group actually started working in 2021, before large language models became widely known around this time; before that, they kept working steadily on it. And now they are working toward their own ChatGPT-style system, using ChatGLM and the visual and cognition versions. Here are results of their story generation. Due to time limitations I won't read them sentence by sentence, but they can generate multi-language stories, do applied math, and also generate code.
All of this code was also generated by ChatGLM, and they also have an agent version of GLM. Ranked here, the agent model is better than GPT-3.5 but still worse than GPT-4. Here is the agent model on the out-of-domain list, where the GLM-based model shows better generalization scores. They also have different versions: more powerful ones, faster ones,
and some cheap ones, plus some that are still cheap but a little more expensive than the fast ones. So what can we do once we have these foundation models, these large language models? For example, consider the legal domain. This work was done in the THUIR group, which is my group, the same group as the earlier immersion work. We know that large language models do help, but they still have quite a lot of things that need improvement; as we said, they fail on some factual cases.
We want domain-specific knowledge, so consider legal cases. You can understand the risk: in the US, for example, judgment is case-based, precedent-based. If we give a lawyer a fabricated precedent, saying, "in this previous case, this person was put in prison for 10 years," when that case never happened, that is terrible. It could be a disaster in legal cases. So we need factual grounding.
We need legal knowledge, and we need to let the AI, the large language model, understand that knowledge. So there is a choice.
One way: we have a general foundation model, the LLM, and we have legal data, and we feed the legal data into the LLM to get a legal foundation model. That is the first way. The other way: we take the large language model and the legal data and build a hybrid legal LLM, a big model combined with a smaller expert model, and use this hybrid model to help with legal case understanding. Both ways actually work. Here is an example of the first way, the legal foundation model: the middle part is the combination of the large language model and the legal-related components, and this approach achieves better performance than either a general Chinese large language model or a Chinese legal model.
So this way gets better results. The second direction also helps: we take the large language model and use a legal prompting model to support legal pre-training and legal instruction tuning, then do some optimization and get better results. Here is an example in which the system retrieves legal case documents and asks clarifying questions.
The left part is the baseline with ChatGPT; the right part is the proposed approach, Likari, the hybrid model plus ChatGPT. You can see that if we do nothing with the specific expert knowledge, the model generates duplicate or useless results, but with the help of legal expert knowledge it produces much better results and much better reasoning: when we talk about drugs, it distinguishes buying drugs, collusion, and taking drugs.
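The hybrid setup described here is essentially retrieval-augmented generation: a small expert component retrieves statutes and precedents, and only then is the general model asked to reason over them. Here is a minimal sketch; the corpus, the toy overlap-based retriever, and the prompt format are all hypothetical illustrations, not the actual Likari system:

```python
# Sketch of the hybrid idea: an expert retriever supplies legal context,
# and the general LLM is asked to answer grounded in that context only.
# The corpus, retriever, and prompt below are hypothetical toy examples.

CORPUS = [
    "Article 347: trafficking in drugs is punishable regardless of quantity.",
    "Article 348: illegal possession of a large quantity of drugs.",
    "Precedent 2019-114: joint purchase for shared use treated distinctly.",
]

def retrieve(query, corpus, k=2):
    """Rank documents by naive word overlap with the query (toy retriever)."""
    q = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]

def build_prompt(question, corpus):
    """Stuff the retrieved expert knowledge into the general model's prompt."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, corpus))
    return (
        "Answer using only the legal materials below.\n"
        f"Materials:\n{context}\n"
        f"Question: {question}\n"
    )

prompt = build_prompt("What penalty applies to trafficking in drugs?", CORPUS)
```

A real system would replace the overlap scorer with a trained legal retriever, but the division of labor is the same: the expert part guarantees the facts, and the large model does the reasoning, which is what prevents the fabricated-precedent disaster described above.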
These are treated through different paths in the legal judgment. Well, the next direction is AI for science. Our natural language processing group works on this, for intelligent chemistry.
This is something I find really fascinating: can an AI system beat chemistry experts? This is question answering; they posed the questions to the AI system and also to chemistry experts, and they found that the AI system could learn chemistry knowledge in an unsupervised manner and scored higher than five chemistry-major students.
So maybe people think: yes, it is better than the students, the human experts. But can it further connect to the real world, not just question-answering examinations? Can it be linked to the real world?
Is that possible? Then we can use this kind of AI technology to help run automatic experiments, like this.
This work is really inspiring, and this year it was published in Nature Communications; that is a kind of foundational scientific improvement. There is also chemical science work; for example, this is a tool-learning demo for meta-analysis, asking for a comparison between three drugs for type 2 diabetes, which is commonly discussed today. The large language model can search the related academic literature, filter out the useful data, and draw evidence-based conclusions. So that does help. And there is another group, in the Tsinghua University Institute for AI Industry Research, that works on drug discovery.
Can we discover a new drug? You know that today producing a new drug takes quite a long time, a lot of money, and many failed trials, but they are trying to change this. How?
They use scientific large language models to do dry experiments. Then they build in physical principles and run wet experiments, where wet experiments means real laboratory experiments in drug production. Going from dry to wet is done by active learning, and the other direction uses reinforcement learning. In this way we can cut a lot of the time and money for a new drug. Maybe in the future it will be possible to find a new drug within several years, rather than the 10 or 20 years it takes today. Okay, quickly back to the robots. You can see their achievements: they ran a challenge with the rehabilitation group,
where muscle signals help disabled people perform really wonderful things through the RoboTag system. And during the COVID period, they built automatic throat-swab collection to run the tests, with very good results. Okay, I believe that is all, and then an open question. We don't need to discuss it here, but please think about it: do you think AI will rule the world and defeat human beings?
And why, or why not? Okay, thank you very much for your attention.
Hope you enjoyed the talk. Thank you. I hope that you entered the immersion state during my talk. Thank you, enjoy.