Transcript for:
IBM THINK 2023: AI and Business

>> Welcome to IBM THINK 2023!

>> AI-generated art, AI-generated songs. AI, what is that? It sure is a lot of fun. But when foundation models are applied to big business, well, you need to think bigger. Because AI for business needs to be held to a higher standard: built to be trusted, secured, and adaptable. This isn't simple automation that is only trained to do one thing. This is AI that is built and focused to work across your organization. This isn't committing to a single system. This is hybrid-ready AI that can scale across your systems. This isn't wondering where an answer came from. This is AI that can show its work. When you build AI into the core of your business, you can go so much further. This is more than AI. This is AI for business. Let's create. (MUSIC)

>> Please welcome Senior Vice President and Director of Research, IBM, Dr. Dario Gil. (Applause)

>> DARIO GIL: Hello. Welcome, welcome to the last session of THINK. And I understand some of you even had a drink. How special. So, I hope you've enjoyed the last two days with us. And what an incredible year it has been for AI. You can really feel the change that is happening all around us. And there's just no denying that the pace of this technology continues to be exhilarating, and that its implications are now so clear for all to see around the globe. I am just fascinated by AI. And as a technologist, this level of excitement really comes about only maybe once or twice every decade. And I am just thrilled to see all the possibilities that this technology is going to enable. Because it's really going to impact every industry: from customer care to transforming data centers and logistics, to medicine, to manufacturing, to energy, to the automotive industry, to aerospace, to communications, you name it. It's really going to impact every one of our businesses and touch every aspect of our lives. So, it's really exciting. And while sometimes the pace of this technology can feel, you know, daunting and scary, the opportunities to harness foundation models and generative AI with proper governance are immense.

The emergence of foundation models and generative AI is really a defining moment. We need to recognize its importance, and we need to capture the moment. And my advice is: don't be just an AI user. Be an AI value creator. Just think about it. As an AI user, you are limited to prompting someone else's AI model. It's not your model; you have no control over the model or the data. Think carefully about whether that's the world you want to live in. As an AI value creator, on the other hand, you have multiple entry points. You can bring your own data and AI models to Watsonx, or choose from a library of tools and technologies. You can train, or influence training, if you want. You can tune. You have transparency and control over the governance of your data and AI models. You can prompt it, too. Instead of only one model, you will have a family of models. And through this creative process you can improve them and make them your own, your own models. Foundation models that are trained with your data will become your most valuable asset. And as a value creator, you will own that, and all the value that they will create for your business. So, don't outsource that. Quite simply, you can control your destiny with foundation models. So, let me show you how we enable you to become a value creator with Watsonx. Watsonx is our new integrated data and AI platform.
It consists of three primary parts. First, Watsonx.data: this is our massive, curated data repository, ready to be tapped to train and fine-tune models, with a state-of-the-art data management system. Watsonx.ai: this is an enterprise studio to train, validate, tune, and deploy traditional machine learning and foundation models that provide generative capabilities. And Watsonx.governance: this is a powerful set of tools to ensure your AI is executing responsibly. Watsonx.data, Watsonx.ai, Watsonx.governance: they work together seamlessly throughout the entire lifecycle of foundation models. And true to our commitment to hybrid cloud architectures, Watsonx is built on top of Red Hat OpenShift. Not only does that provide seamless integration of the Watsonx components, it allows you to access and deploy your AI workloads in any IT environment, no matter where they are located. Watsonx is the AI platform for value creators. And look, I don't need to tell you that deploying these technologies is not easy at the enterprise level. But the platform changes that. So let's take a look now at how an entire AI workflow works end to end in the platform. The lifecycle consists of preparing our data, using it to train the model, validating the model, tuning it, and deploying it in applications and solutions.

So let's start with data preparation. Say you're a data scientist and want to access data: some that is in a public cloud, some that is on premises, some that may be in an external database or in a second public cloud, or anywhere else outside your hybrid cloud platform. So you access the platform from your laptop and invoke Watsonx.data. It establishes the necessary connections between the data sources so you can access the data easily. We have been building our IBM data pile, combining raw data collected from public sources with IBM proprietary data. We are bringing in data from different domains: the internet, code, academic sources, enterprise, and more. We have used Watsonx.data to collect petabytes of data across dozens of domains to produce trillions of tokens that we can use to train foundation models. And besides the raw data and our proprietary data, we allow clients to bring their own data to enrich and improve their purpose-built foundation models. It is all stored in .data, with granular metadata that provides traceable governance for each file or document.

So, now we take this and move to filter and process the data. First, we identify the provenance and the ID of the data. Then we categorize it, we classify it: for example, into a pile for different languages, let's say English, Spanish, German, and so on; a pile of code data that we then separate by programming language, Java, Ansible, COBOL, and so on; and any other category that we have. Now we filter it: we run analytics and get rid of duplicated data. We identify hate, abuse, and profanity in the data, and we remove it. We filter it for private information, licensing constraints, and data quality. By annotating, we allow data scientists to directly determine the right thresholds for their filtering. Having done all of that, the pile is now ready for the next step: we version and tag the data. Each dataset, after being filtered and preprocessed, receives a data card. The data card has the name and the version of the pile, specifies its content and the filters that have been applied to it, and records any other relevant details that make it easy to manage and track the exact subsets of the data we have used to develop the foundation models.
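To make that filter-and-tag step concrete, here is a minimal, illustrative sketch in Python. Everything in it is hypothetical: the function names, the `DataCard` fields, and the stand-in quality check are not the watsonx.data API, just the shape of the step described above.

```python
# Illustrative sketch of the filter-then-tag step described above.
# All names here are hypothetical; this is not the watsonx.data API.
import hashlib
import json
from dataclasses import asdict, dataclass, field

@dataclass
class DataCard:
    pile_name: str
    version: str
    contents: list            # e.g., languages or programming languages
    filters_applied: list = field(default_factory=list)

def dedup(docs):
    """Drop exact duplicates by hashing each document's text."""
    seen, unique = set(), []
    for doc in docs:
        digest = hashlib.sha256(doc["text"].encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

def passes_filters(doc, min_chars=200):
    """Stand-in for the HAP, PII, licensing, and quality classifiers."""
    return len(doc["text"]) >= min_chars and not doc.get("flagged_hap", False)

def build_pile(docs, pile_name, version, contents):
    docs = dedup(docs)
    docs = [d for d in docs if passes_filters(d)]
    card = DataCard(pile_name, version, contents,
                    ["dedup", "hap", "pii", "quality:min_chars=200"])
    return docs, card

docs = [{"text": "An example enterprise document ... " * 20},
        {"text": "An example enterprise document ... " * 20},  # duplicate, removed
        {"text": "too short"}]                                 # filtered out
pile, card = build_pile(docs, "english-enterprise", "v1.2", ["english"])
print(len(pile), "documents kept")
print(json.dumps(asdict(card), indent=2))  # the versioned, trackable data card
```

The point of the card is traceability: any downstream model can be traced back to the exact pile version and filters that produced its training data.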
Now, we can have multiple data piles. They coexist in .data, and access to the different versions of data for different purposes is managed seamlessly. So, we are now ready to take the pile and start training our model. This is step two in our AI workflow. We move from .data to .ai and start by picking a model architecture from the five families that IBM provides. These are the bedrock model families, and they range from encoder-only, encoder-decoder, and decoder-only to other novel architectures. Let's pick the encoder-decoder sandstone to train the model, and pick a target data pile version from the piles that are in .data. .ai allows training with computing resources across the hybrid cloud; in this case it runs on IBM Vela. Vela is the first-of-a-kind cloud-native AI supercomputer that we built last year. It gives you bare-metal performance in the cloud with a virtualization overhead that is less than 5%. And we are making it available as a service. Watsonx.ai auto-scales the resources for the training being done.

And the first thing that we need to do is to tokenize the data according to the requirements of the model. So, we first query the data using the version ID for the pile we want to use. That materializes a copy of the dataset on Vela for tokenization. What this means is that if, for example, we were building a large language model, the sentences in the data are broken into tokens, and this process can create trillions of them. We use the tokens to train the model. Now, training is a very complex and time-consuming task. It can require dozens, hundreds, even thousands of GPUs, and can take days, weeks, even months. Training in Watsonx.ai takes advantage of the best open-source technology out there to simplify the user experience. Built on CodeFlare, using PyTorch and Ray, it also integrates Hugging Face to bring you a rich variety of open formats.

Once training is done, the model is ready for validation. For each model we train, we run an extensive set of benchmarks to evaluate the model's quality across a wide range of metrics. Once the model passes all the thresholds across the benchmarks, it is packaged and marked as ready for use. For each model, we create a model card that lists all the details of the model. We will have many different models, trained on different piles, with different target goals. Next, we go to Watsonx.governance to combine the data card, which has the detailed provenance information for the data pile that was used for training, with the model card, which has the detailed information on how the model was trained and validated. Together, they form a fact sheet. This fact sheet is cataloged in .governance, along with the fact sheets for all the other models that we have available for use.
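Stepping back to the tokenization step above for a second, here is a minimal sketch using the open-source Hugging Face tooling that watsonx.ai integrates. The checkpoint is an arbitrary public encoder-decoder example, not an IBM model; it is only meant to show what "sentences are broken into tokens" looks like in practice.

```python
# Tokenization sketch with an open Hugging Face tokenizer.
# "t5-small" is an arbitrary public encoder-decoder checkpoint,
# standing in for whatever tokenizer the target model requires.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")

pile_sample = [
    "Foundation models are trained on trillions of tokens.",
    "Each sentence in the pile is broken into subword tokens like these.",
]

encoded = [tokenizer(text).input_ids for text in pile_sample]
print(sum(len(ids) for ids in encoded), "tokens in", len(pile_sample), "sentences")
print(tokenizer.convert_ids_to_tokens(encoded[0]))  # inspect the subword pieces
```

Run over petabytes of curated text instead of two sentences, this is the process that yields the trillions of training tokens mentioned above.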
Now let's go on to tune the model that we just created. What we mean by that is adapting it to new downstream tasks, which is the basis for the large productivity gains that are afforded by foundation models. So, say, in this case, you are a different person: you are the application developer. You can access Watsonx.ai and start by picking a model from the catalog to work with. We have a family of IBM models specialized for different domains. But we also have a rich set of open models, because we believe in the creativity of the global AI community and in the diversity of models it offers, and we want to bring that to you. In this case, we pick sandstone.3b from the IBM language models, which is the model that we just trained. We set up the options for tuning: the tuning approach, and the task; we pick summarization as an example, with sandstone.3b as the base model to use. Now, we can access and use proprietary business data to tune the base model for the task that we chose, wherever that business data is located in the hybrid cloud platform. So, now we send prompts and tuning data, and that's used to tune the model in .ai. You get the outcome of the prompt on the model. This process happens back and forth, back and forth, many times, and in the end, you end up with a set of ideal prompts to use. The model is now specialized and ready to deploy.

This is the final step in our AI workflow. The application where you want to use the foundation model can live in the public cloud, on premises, or on the edge. And you can really deploy and run foundation models efficiently wherever you need them. The deployed model can be used in many different applications. So, for example, we've embedded foundation models in Watson Assistant. For text generation in Assistant, you describe the topic that you want the assistant to handle, and it generates the corresponding conversational flow. We have an inference stack to scale the serving of the model in applications. It consists of state-of-the-art technology that has been field-tested for scalable model serving. This is how Watsonx allows us to go from data to a model that is trusted, governed, deployed, and ready to serve, and how we can scale that model to different applications.

Once models are deployed, we continuously monitor them and update them in both .data and .ai. We call this constant process our data and model factory. Watsonx.governance monitors the models for any change that may impact how a model can be used or how it performs, whether that is driven by new data that can be leveraged, or by a change in some regulation, law, or data licensing. Any change detected by the .governance process guides and drives the update to both the data and the model. This idea of the model factory is central to solid and proper governance of AI. Now, all of these updates need to happen without disrupting the underlying applications that are leveraging the foundation models. And this data and model factory is in production today. We have already produced over 20 models across modalities like language, code, geospatial, and chemistry, spanning different sizes of models from hundreds of millions to billions of parameters. We have infused these foundation models into IBM products, Red Hat products, and our partners' products. At IBM, over 12 foundation models are powering our IBM NLP library, which is used in over 15 IBM products and is available to ISVs. Granite models trained on code are part of IBM Watson Code Assistant, which has been applied in the Red Hat Ansible Automation Platform. And as you heard earlier in this event, SAP has partnered with us and is infusing foundation models into their solutions. So, Watsonx is really ready for you to create value with AI.
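Returning to the tuning step for a moment: Watsonx hides the mechanics behind the studio, but the back-and-forth prompt tuning described above can be sketched with the open-source Hugging Face PEFT library. A minimal, illustrative example; the `t5-small` checkpoint stands in for sandstone.3b, whose weights are not assumed here, and the full training loop is omitted.

```python
# Prompt-tuning sketch with the open-source Hugging Face PEFT library.
# "t5-small" stands in for sandstone.3b; only the soft prompt is trained.
from peft import PromptTuningConfig, PromptTuningInit, TaskType, get_peft_model
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

base_name = "t5-small"
tokenizer = AutoTokenizer.from_pretrained(base_name)
base_model = AutoModelForSeq2SeqLM.from_pretrained(base_name)

peft_config = PromptTuningConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    prompt_tuning_init=PromptTuningInit.TEXT,
    prompt_tuning_init_text="Summarize the customer interaction:",
    num_virtual_tokens=20,                    # the learnable soft prompt
    tokenizer_name_or_path=base_name,
)
model = get_peft_model(base_model, peft_config)
model.print_trainable_parameters()            # base weights stay frozen

# One (prompt, target) pair of hypothetical business data; in practice you
# iterate over many such pairs, optimizing `loss` with a PyTorch optimizer.
batch = tokenizer(
    ["Customer could not log in after a password reset; agent reissued token."],
    text_target=["Login failure resolved by reissuing the reset token."],
    return_tensors="pt",
)
loss = model(**batch).loss
print(float(loss))
```

The design point matches the talk: the base model stays untouched, and what you end up owning is a small set of tuned prompt vectors specialized to your task and your data.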
Now, to maximize what you can do and the innovations at your disposal, we believe that you should bet on community. Because, the truth is, one model will not rule them all. And with the innovations and models that it develops, the open community is supercharging the value that you will be able to create. To be true to our belief in the diversity and the creativity of the open AI community, we are proud to announce our new partnership with Hugging Face. So let's invite to the stage the cofounder and CEO of Hugging Face, Clem Delangue. (Applause)

>> CLEM DELANGUE: Hey, Dario.

>> DARIO GIL: Clem.

>> CLEM DELANGUE: Thanks for having me.

>> DARIO GIL: First of all, welcome to IBM THINK. We are just delighted to have you here. So let's begin: tell us a little bit about yourself, how and when you got interested in AI, and how Hugging Face got started.

>> CLEM DELANGUE: Yeah, thanks so much for having me. I actually started in AI almost 15 years ago. Looking at this room: at the time we couldn't have filled it. Maybe it would have been one person, two people in the room at most. As a matter of fact, we weren't even calling it AI at the time; we were calling it computer vision. I was working at a French company – I am French, as you can hear from my accent – and we were doing computer vision on device, on mobile. The company went on to get acquired by Google. But I never lost my passion and excitement for AI. So, seven years ago, with my cofounders Julien and Thomas, we gathered around this passion for AI and started Hugging Face – right, what you see on my T-shirt, basically. We started with something completely different: we worked on conversational AI for three years, and, as it sometimes happens for startups, the underlying platform and technology ended up more useful than the end product. When we started to release part of it on GitHub, we started to see open-source contributors joining us, we started to see scientists sharing models on the platform, leading to what Hugging Face is today.

>> DARIO GIL: So I mentioned the power and the creativity of the open community creating in AI. Just share with us some statistics: how big is it? How much energy is there in that community, and how much should we expect from the creativity available to all of us?

>> CLEM DELANGUE: Yeah, the energy in open-source AI is insane these days. Just a few weeks ago I was in San Francisco. I tweeted that I would be around and that we could do some sort of small get-together for open-source AI people. We thought we would get maybe a few dozen, a few hundred people. And the closer the day came, the more people ended up joining. We had to change locations three times, to something at the end almost as big as this; we had 5,000 people. People started calling it the Woodstock of AI. So now we are competing with that, the Woodstock of AI. It's just proof of how vibrant the open-source AI community is. We see the same thing on Hugging Face, right? Since we started the platform four years ago, we have grown to now having over 15,000 companies using it, including very large companies like Google, like Meta, like Bloomberg, all the way down to smaller companies like Grammarly, for example. And collectively they have shared over 250,000 open models on the platform, 50,000 datasets, and over 100,000 open demos. Just last week, 4,000 new models were shared on the platform. So, that shows you the magnitude and energy of the open-source AI community.

>> DARIO GIL: Just think about that: 4,000 models in one week. So, one of the myth-busting things that we were chatting about is that one model will not rule them all, right?
There's going to be a huge amount of innovation happening from so many sources. So, perhaps you could share with us: what are some examples of innovation that you see? We have seen the scale. But what are some examples that really caught your eye, or that you think were particularly powerful?

>> CLEM DELANGUE: Yeah, I mean, it's interesting, because since the release of ChatGPT, right – and some people have said, okay, ChatGPT is the model to rule them all – 100,000 new models have been added on Hugging Face, right? And, obviously, companies don't train models just to train models, right? They would prefer not to do it, because it costs money to train models. But the truth is, if you look at how AI is built, when you can build smaller, more specialized, customized models for your use cases, they end up being cheaper, they end up being more efficient, and they end up being better for your use case, right? Just the same way every single technology company learned how to write code, right, and to have a different code base than their competitors or than companies in other fields, we are seeing the same thing for AI, right? Every single company needs to train their own models, optimize their own models, learn how to run these models at scale. Every single company needs to build their own ChatGPT, because if they don't, they won't be able to differentiate, they won't be able to create the unique technology value that they have been building for their customers, and they lose control, right, if they start outsourcing it. That's what we are seeing on Hugging Face and in the ecosystem as a whole.

>> DARIO GIL: It's back to the philosophy of: don't be just a prompt user, right, be a value creator with all of this. So let's talk about our partnership for a minute. Why are you excited about bringing the power of all of this community into Watsonx, in the context now of the enterprise, you know, and meeting the needs of our clients who are here listening?

>> CLEM DELANGUE: Yeah, obviously, Hugging Face and IBM share a lot of the same DNA, right, around open source, open platforms, kind of, like, providing extensible tools for companies. For me, one of the most iconic collaborations and partnerships of the last decade is IBM plus Red Hat, and hopefully we are just at the beginning of it, but with this collaboration, we can do the same thing for AI. I think with this integration between Watsonx and Hugging Face, you kind of, like, get the best of both worlds, in the sense that you get the cutting edge, the community, and the numbers of models, datasets, and apps of the Hugging Face ecosystem, and you get the security and support of IBM, right? For example, you mentioned, we mentioned, all the models. The IBM consultants can help you pick the right models for you, at the time that is going to make sense for your company. So, you really get, kind of, like, the perfect mix to get to what we were saying, meaning every one of you being able to build your own internal ChatGPT.

>> DARIO GIL: This is just great. I am just delighted about those opportunities. So tell us a little bit about what's next for Hugging Face. When you look over the next year or so, what excites you the most?

>> CLEM DELANGUE: Many, many exciting things for us. We have seen a lot of adoption, a lot of companies using us for text, for audio, for image. And now we are starting to see that expand to other domains; for example, we are seeing a lot of video right now.
We are seeing a lot of recommender systems; we are seeing a lot of time series. We are starting to see a lot of biology and chemistry. We are excited about it; we think ultimately AI is the new default to build all features, all workflows, all products. It's kind of like the new default to build all tech. So we are excited for this expansion to other domains. Also, we are seeing a lot of work around chaining different models, and, in fact, at Hugging Face we released Transformers Agents today, which is a way to chain different models to build more complex systems that achieve, kind of, like, better capabilities. These are some of the things that we are the most excited about.

>> DARIO GIL: So, a lot there. Thank you, Clem. Thank you so much. Congratulations.

>> CLEM DELANGUE: Thank you so much. (Applause)

>> DARIO GIL: Thank you. So, you saw how the platform works to enable the foundation model creation workflow end to end. And we talked about data, we talked about model architectures, the computing infrastructure, the models themselves, the importance of the open community. So, now let me show you how you would use and experience Watsonx. We are going to go inside the studio, inside Watsonx.ai. From the landing page you can choose to prompt models, to fine-tune models, or to deploy and manage your deployed models. So here's an example of how you can use the prompt lab to do a summarization task. You give the model the text as a prompt, and the model summarizes it for you. In the case of a customer care interaction, it gives you the customer problem and the resolution, according to the transcript of the interaction. In the tuning studio, as we saw before, you can set the parameters for the type of tuning that you want to do and the base model, and you can add your data. The studio gives you detailed stats on the tuning process and allows you to deploy the tuned model in your application. It's that simple. We took the complexity of the process away so you only need to worry about creating value for your business.
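As a rough open-source analogue of that prompt lab interaction, the summarization call looks something like the sketch below. The checkpoint is an arbitrary public summarization model, not the one behind watsonx.ai, and the transcript is invented.

```python
# Open-source stand-in for the prompt-lab summarization demo.
# The checkpoint is an arbitrary public model, not a watsonx.ai model.
from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

transcript = (
    "Agent: Thanks for calling, how can I help? "
    "Customer: I was charged twice for my May invoice. "
    "Agent: I can see the duplicate charge. I have refunded it, and the "
    "credit will appear within five business days."
)

result = summarizer(transcript, max_length=40, min_length=10, do_sample=False)
print(result[0]["summary_text"])  # the customer problem and its resolution
```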
And here are some of our current AI value creators. SAP will use IBM Watson capabilities to power the digital assistant in its solutions. You have been hearing about Red Hat, and how it's embedding IBM Watson Code Assistant into the Ansible Automation Platform. BBVA is bringing their enterprise data to use with their own foundation model for natural language. Moderna is applying IBM's foundation models to help predict potential mRNA medicines. NASA is using our language models, together with the geospatial models we have created together, to improve our scientific understanding of and response to Earth- and climate-related issues. And Wix is using foundation models to gain novel insights for customer care as they meet the needs of their customers. So, what I encourage you to do is to join them and embrace the age of value creation with AI.

A year ago, I stood on a stage just like this, closing THINK. And I shared with all of the attendees that what was next in AI was foundation models. And maybe at the time it seemed a little bit abstract and, you know, sort of, like, an intellectual disposition about where things were going. But, boy, what a year it has been. And it has been a big year for AI at IBM. So, as we close our event this year, let me remind you of all of the things we have created and announced. We have announced Watsonx, a comprehensive platform that allows you to create and govern AI in real time, so that you can move with urgency and capture this moment. We announced a set, a family, of foundation models, including IBM models and open community models, and how you can even create your own models. We announced our data and model factory, using petabytes of data across multiple domains to create trillions of tokens to create our family of foundation models, and showed how the factory continuously updates them when conditions change and brings a regular cadence of models to ensure proper governance. We told you about the products where we have infused our foundation models, over 15 of them, including digital labor, Red Hat products like the Ansible Automation Platform, and our partners' products like SAP solutions. We announced important collaborations to advance AI and bring it to the enterprise: Hugging Face, and long-standing collaborations and initiatives like PyTorch and Ray. We showed you some of the organizations that have become AI value creators with us. We are bringing you IBM Vela, our cloud-native AI supercomputer, to train foundation models with bare-metal performance while giving us the flexibility of the cloud. And we announced that we are making it available as a service. Last year we launched Telum in z16. It's an engineering marvel and IBM's first processor to have an on-chip accelerator for AI inferencing. It can process 300 billion inference requests per day with one millisecond latency. This means you can now infuse AI into every transaction on z16, for applications like fraud detection and others, in real time. Using the same core architecture as Telum, we built the IBM Research AIU, which is optimized to give superior performance for foundation models and is enabled with a Red Hat software stack. And at IBM Research we are incubating powerful AIU systems designed and optimized for enterprise AI inference and tuning. So, a truly fantastic year, and this is just the start of all the amazing things that we are building and developing for you, and that we will be sharing with you in the coming years.

So, today more than ever before, it's important to have a business strategy for AI. And in closing, as you think about how to harness foundation models for your business, let me offer you some tips to consider. First, act with urgency. This is a transformative moment in technology; be bold and capture the moment. Second, be a value creator: build foundation models on your data and under your control. They will become your most valuable asset. Don't outsource that, and don't reduce your AI strategy to an API call. Third, bet on community. Bet on the energy and the ingenuity of the open AI community. One model, I guarantee you, will not rule them all. Fourth, run everywhere, efficiently: optimize for performance, latency, and cost by building with open hybrid technologies. And finally, be responsible. I can't stress this enough. Everything I have mentioned is useless unless you build responsibly, transparently, and put governance at the heart of your AI lifecycle. Continuously govern the data you use and the AI you deploy. And co-create with trusted partners; trust is your ultimate license to operate. If you map your AI business strategy against these recommendations, you will be in a prime position to do amazing things with foundation models and generative AI. We have built Watsonx so that you can do just that. And I hope you join us, because we cannot wait to get started on this journey with you. Thank you. (Applause)