Transcript for:
OpenAI Model Applications and Insights

So with OpenAI's new model release, many people have been wondering what you can actually use the model for. One of the most common questions I've been getting is: how do I actually use this model effectively? The reason this question is so prevalent right now is that this model is essentially a reasoning model, meaning that it's not quite like your traditional ChatGPT. This model is trained to reason through many steps across many different kinds of problems. It's so smart that I think the average person might not realize the use case for it. Looking at the evaluations and a lot of other metrics, you can see that this thing is incredibly smart, but that doesn't mean it has no everyday applications. That's what I'm going to get into today: how you can take advantage of a super smart model even if you don't need a model that ranks in the 89th percentile on competitive programming or performs at the level of a human PhD. So let's get into the real-world use cases that you can use on a day-to-day basis and that can actually help your life.

There are five main categories that I'm going to go through, so stick around to the end of the video to see if at least one of them helps you in some area of your life, because you might be surprised by what this model is capable of doing.

The first thing is that this model is a very competent coder. We already recently spoke about how this model manages to solve difficult problems in certain challenges, but I think most people are missing just how good this model is at coding. Here I'm going to walk you through a couple of examples where individuals with no coding ability at all were able to build functional programs, and then I'm going to show you why this is incredible in terms
of using it on a day-to-day basis. You can see this person here said, "o1-preview made a 3D FPS game fully in HTML. I have zero coding skills, so it took a few tries, but eventually it worked." And you can see right here that this is a game that was built with just a few prompts. Basically, with this model you can code things that you previously weren't able to code with other models like Claude. If you remember the jump to GPT-4, you know what that jump made possible with just a small improvement in overall coding ability: when the code is coherent and it works, people have been able to build many different things. So this shows us that once we have another jump, from Claude 3.5 Sonnet all the way up to o1-preview, this is going to be something that allows you to code extraordinarily well.

What this is doing is basically acting as a step-by-step explainer for individuals who don't have any coding experience. Now, I know some of you will have some coding experience and the majority probably won't, since most people aren't software developers. But I think most people aren't using this ability to at least understand, fundamentally, how certain programs work and how you can build your very own programs yourself.

Now, some people might be thinking, "Oh, I can use this model to code a 3D HTML game. What on earth is the point of this? This is not a AAA game that I can play or sell. This is just some fun experiment." You'd be correct in stating that this particular example isn't a game-changing thing. But that's not what you should be focusing on. What you should be focusing on is the fact that you can use a model that understands the major concepts around coding to understand, step by step, and build anything within reason. And I'm going to show you why those implications are more profound than you might think. So for example, what we
can see here is Ammaar Reshi stating that he just combined OpenAI o1 and Cursor Composer, which is an AI application that allows you to code really efficiently with AI, to create an iOS app in under 10 minutes. o1-mini kicked off the project because o1 was taking too long to think, and then he switched to o1 to finish off the details. And then, boom: a full weather app for iOS, with animations, in under 10 minutes.

So this is what I mean when I say we have a major paradigm shift on our hands. When we have a system that can code step by step and achieve far greater accuracy than prior models, this is going to bring some nice outcomes for the average user. Previously, if you wanted to build your own iOS application, it would take thousands and thousands of dollars and perhaps one or two competent software developers to get it done. Now we can prompt an AI system to figure out exactly what we want, and we can learn how these systems work even if we don't understand anything. We can literally just screenshot it, ask GPT-4o, and continually get feedback on what we're building and how to understand exactly what we are doing.

I don't think you understand how incredible this is. And the thing about it is that I wouldn't say this is primarily a coding model. I would say it's more of a reasoning model, in the sense that it thinks for quite some time and tries to solve really difficult problems that require a lot of different steps. Of course, coding definitely falls into that category. But I can't imagine what we're going to be able to build with GPT-5 and other future AI systems, which is why I believe that at least messing around with building certain applications with o1-preview right now is worthwhile, even if you have GPT-4o, because you're going to get a fundamental
understanding of how to use AI for coding. In the future, when it gets even better, you might have an app idea, or some kind of software that you want to build for your own company or your own management, and you'll be surprised at how much money you can make from that. There was also this example of someone using the o1 model to build a fully functional chess game that lets them compete against an AI opponent. The implications of this are staggering. But I will include two more examples from OpenAI's official documentation, where they talk about how they've been using it for coding in certain areas.

I want to show an example of a coding prompt that o1-preview is able to do but previous models might struggle with. The coding prompt is to write the code for a very simple video game called Squirrel Finder. The reason o1-preview is better at prompts like this is that when it wants to write a piece of code, it thinks before giving the final answer, so it can use the thinking process to plan out the structure of the code and make sure it fits the constraints. So let's try pasting this in. To give a brief overview of the prompt: the game Squirrel Finder basically has a koala that you can move using the arrow keys. Strawberries spawn every second and bounce around, and you want to avoid the strawberries. After 3 seconds a squirrel icon comes up, and you want to find the squirrel to win. There are a few other instructions, like putting "OpenAI" on the game screen and displaying instructions before the game starts, etc.

So first you can see that the model thought for 21 seconds before giving the final answer. You can see that during its thinking process it is gathering details on the game's layout, mapping out the instructions, setting up the screen, etc. Here's the code that it gave; I will paste it into a window and we'll see if it works. So you can see there are instructions, and let's try to play the game. Oh, the
squirrel came very quickly. But oops, this time I was hit by a strawberry. Let's try again. You can see that the strawberries are appearing, and let's see if I can win by finding the squirrel. Looks like I won.

All right, so the example I'm going to show is writing code for a visualization. I sometimes teach a class on Transformers, which is the technology behind models like ChatGPT. When you give a sentence to ChatGPT, it has to understand the relationships between the words and so on. It's a sequence of words, and you just have to model that, and Transformers use what's called self-attention to model it. So I always thought, okay, if I can visualize this self-attention mechanism, with some interactive components, it will be really great. I just don't have the skills to do that. So let's ask our new model, o1-preview, to help me out. I just typed in this command; let's see how the model does. Unlike previous models like GPT-4o, it will think before outputting an answer, so it has started thinking.

As it's thinking, let me show you some of these requirements. I'm giving it a bunch of requirements to think through. The first one is to use an example sentence, "The quick brown fox." The second one is, when hovering over a token, to visualize the edges, whose thicknesses are proportional to the attention score. That just means that if two words are more relevant to each other, they get a thicker edge, and so on. One common failure mode of the existing models is that when you give them a lot of instructions to follow, they can miss one of them, just like humans can miss one if you give too many at once. Because this reasoning model can think very slowly and carefully, it can go through each requirement in depth, and that reduces the chance of missing an instruction.

So, this output code: let me copy and paste it into a terminal. I'm going to use the editor of 2024, so Vim, HTML. I'm just going to paste this thing into that and just
save it out, and in the browser I'll just try to open this up. You can see that when I hover over this thing, it shows the arrows to "quick" and "brown" and so on, and when I hover out of it, they go away. So that's a correctly rendered version of it. Now, when I click on it, it shows the attention scores, just as I asked for. Maybe there's a little bit of a rendering issue, like it's overlapping, but other than that it's actually much better than what I could have done. So this model did really nicely. I think this can be a really useful tool for me to come up with a bunch of different visualization tools for my new teaching sessions.

So the next area is really profound, and it's one that I actually spoke about in my community, where I help people get the most out of AI. I think one of the most underrated uses of this model is business and management advice. The reason this model is really good here is that with business and management, many different factors weigh into any decision, and those decisions affect the choices you make on a day-to-day basis. I think this is quite underutilized, just because many people aren't business owners. I know that most people are employees, but if you ever wanted to start a side hustle, this could work as a really good business advisor.

When OpenAI o1 was released, I was watching a few videos, and you can see one of them here. This video is from Samar Hadad, and he spoke recently about how he was wrong, after doing in-depth testing on complex business problems. Initially he critiqued the o1 model for its performance on complex problems, but after feedback he decided to test the model again with a more detailed prompt. Basically, he decided to submit a business problem involving
a supply chain crisis for a smartphone manufacturer that relies heavily on semiconductor chips from a Taiwanese supplier facing geopolitical and environmental challenges. The prompt included details like financial impacts, production sites, market share information, supply chain details, and anticipated losses due to chip shortages. The task was to create an immediate crisis management plan. After he submitted this prompt, the model delivered a really comprehensive plan covering multiple strategies. I won't get into all of the details, such as negotiating priority supply or increasing inventory through stockpiling, but the model provided a detailed budget estimate for each strategy, including how it derived each of these numbers, along with many different insights that he didn't expect.

Now, of course, the reason this has gone under the radar is that most people aren't thinking, "I have a business that has these kinds of issues." But what I do know is that you can reason with these models using your own personal data. When I say personal data, I mean, for example, you could say, "I'm 37, I work a job in IT, I've got two kids, and I'm trying to start a side hustle in this niche or this industry. What would be the best steps to get started?" or "Can you validate this idea based on my personal circumstances?" Because the model is going to reason through many different steps, you're likely to gain some really nice insights in areas that you would most likely have missed. The model is going to draw on comparisons and insights that other models just wouldn't see. And I think the reason people are only now finding out that these models are a lot smarter than they thought is that these models are quite rate-limited. It's quite hard to figure out exactly what a model can do if you're only allowed to use it a certain number of times. So I'd still experiment with this model and different prompts, because this is
something you can use even if you don't have a business. If you want to start a side hustle, validate a side hustle, or even just help out a friend's business with advice, this can be remarkably helpful.

Now, the next one is, I guess you could say, a gray area: the healthcare area. In health science, we can see comparisons between GPT-4o and OpenAI o1-preview, and essentially OpenAI o1 performs really well when making different diagnoses. Once again, it's able to reason over many different factors and then come to the conclusion that is most likely. On the left-hand side, GPT-4o is trying to make a diagnosis based on the following report. You can see the phenotypes and the excluded phenotypes, and then it comes to a diagnosis, which unfortunately is wrong. But then we can see OpenAI o1-preview given the exact same prompt, and right here you can see a really extensive chain of thought that does diagnose the person with the right syndrome. I think this has decent implications, because if this model is able to do this level of reasoning based on phenotypes, I would say it could be remarkably effective at clinical diagnosis too.

So essentially this model can assist with personalized health plans for those who could be experiencing chronic issues, or issues that they feel aren't that visible. I know that everyone is a unique person, and people have lots of different pieces of data that might not fall into the traditional buckets when they are searching for certain issues. Of course, you should always go to the doctor to get diagnosed for anything, as they are likely to have your health record and to know a lot more than a large language model. But I do think
that it can be useful in suggesting certain lifestyle changes, or potentially suggesting diagnoses for things you may be dealing with that you haven't thought about. I'm not stating that this is just like a doctor. I'm stating that this is a really useful tool that, when combined with the specifics of your personal situation, could provide a more personalized report based on all the different pieces of data that you might not feel comfortable sharing with your doctor.

Now, if you watched my initial video on OpenAI o1, I spoke about how this medical scientist and immunologist said that OpenAI o1-preview completely outperforms GPT-4 and GPT-4o on AgentClinic-MedQA. Basically, he says that this model greatly outperforms GPT-4o, and that the ability to process complex medical information, deliver accurate diagnoses, and provide medical advice and recommended treatments will only accelerate. We can see here that this performance is quite outstanding. So for those of you who were having trouble with GPT-4, maybe you could try those same prompts. Or say you're having an issue with your pet and you don't have access to a vet: you could always ask o1-preview, and you might just get a response that leads to a solution.

Interestingly enough, he follows it up with this statistic, which I haven't fact-checked, but I do know that a lot of Americans end up in worse situations because dangerous diseases are often misdiagnosed. I know this happens in a variety of countries. You can be in a very first-world country and this can still happen due to human error. Humans make mistakes, and physicians and doctors are often overworked, which sometimes leads to subpar performance. That is of course not their fault; they're just trying to do the best they can. But sometimes these misdiagnoses do slip through the cracks, and then issues that were quite
benign become issues that are life-changing. So a third or fourth opinion from a PhD in your pocket, I would say, wouldn't hurt. I'm always going to preface that with the fact that you should always ask your doctor before you make any lifestyle changes. But with these models, I've found that the more detail you give them, the better they are at providing recommendations based on the current information.

Another area that most people might not need at the moment, but one that I've seen work pretty well, is legal work. I have actually used OpenAI for drafting some pretty standard agreements, pretty simple ones, nothing too incredible, and of course you always need these checked over by a lawyer to make sure they are airtight. But this article right here talks about the use of OpenAI o1 for legal work. It's titled Spellbook's first impressions from implementing o1-preview into legal workflows. OpenAI's System 2 thinking is something that allows the model to come to much better conclusions than prior models.

It says the number one thing that we're most excited about is o1's performance in document revision tasks. A lot of generative AI experiences spit out entirely new documents, but lawyers rarely draft from scratch. They typically have a precedent they want to modify. Contracts like share purchase agreements can be 100 pages long, and making significant revisions to them requires a lot of jumping around, consistency checking, and making sure numbers add up. System 1 thinking does not work well here, and it's a deep challenge to get these tasks performing well with models like… In this example below, we used o1 with Spellbook Associate to update a commercial lease based… With o1, we are seeing dramatic improvements for revision tasks across the board. One of our top predictions is that there will be a lot more workflows built on nuanced document revision launched over the coming year.

Now, here's where they talk
about o1 for legal math. They say that another consistent weakness with GPT-4o has been its ability to really understand the numerical content it's working with in an agreement, and whether it all adds up. Discrepancies between cap table spreadsheets and deal documents have cost shareholders many millions of dollars. While tools like Spellbook have been great for detecting legal issues in text, they've been blind to whether things like share prices and ownership percentages really add up. So with these calculations, they were basically able to check whether the numbers actually work out, and they're seeing an increased level of reliability from these models. If you're someone in the legal space, testing these models on a few internal benchmarks could give you insight into how useful they could be.

Then, of course, we have research. If you're researching something for your PhD, or even just doing business research, I think this is going to be very effective for those of you trying to explore new frontiers in many different scenarios. You can see that this user says, "The feeling when ChatGPT o1 accomplishes in 1 hour what took you about a year in your PhD." This video is one that I wouldn't say broke the internet, but it provided a clear demonstration of what this model is able to do. Essentially, the user expressed his amazement as he watched o1 successfully generate and run code that mirrors his PhD project's functionality. The code that ChatGPT produced seems to replicate the essence of what the user's original code does, despite being generated with synthetic data and its own functions. Now, the code generated by ChatGPT wasn't a perfect copy. It uses synthetic data and has some caveats. For example, it created its own inputs and omitted some manual steps that would normally require additional software and effort, such as fitting curves and managing edge effects in convolutions, and
the user does note that some fine-tuning and verification are still needed. But the overall point here is that, by his own words, it effectively accomplished what he struggled with for about 10 months in the first year of his PhD. And I'm excited to apply o1 to other use cases.

So hopefully this video helped you with using o1. I know that this model is rather smart, and it can be quite daunting to think of a prompt that effectively uses the model's capabilities, but hopefully this video helped you get to grips with how to actually use and engage with such a powerful model. If you have any ideas about how individuals can take advantage of these models on a day-to-day basis, don't forget to leave your comments below, because I'd love to hear your ideas.