Transcript for:
Rapid AI Chat App Development

In case you haven't seen it yet, I just put out a new app called T3 Chat, and I'm really proud of it. It's the fastest AI chat app I've ever used and, as far as I know, the fastest that currently exists. If you don't believe me, go try it or watch my other videos about it; it flies. I've been getting a lot of questions about how I built it, how it's so fast, and most importantly, how the hell I did this in 5 days. These are all great questions, and not all of them have great answers, but I want to do my best to clue you in on what it took to build something like this as quickly as Mark and I were capable of. Think of this as a devlog-in-retrospect type of video, where I go through each day, what I did, and how the process led to an app that we're actually proud of and a crazy deadline that we were able to hit.

Before we can do that, we need to hear a quick word from today's sponsor. If you're anything like me, you're probably pretty tired of these AI tools that claim they can replace your job. They're never any good; the ones that are good are the ones that complement your job. They take the tedious things, make them less tedious, and give you information you might not have had otherwise, things like code review. That's why I'm super hyped about today's sponsor, CodeRabbit. They make code review way easier by doing a first pass on your PRs and leaving a bunch of useful feedback: summarizing, drawing diagrams, and so much more. This is a real pull request where we're no longer allowing people to upload .exe files without paying (long story, go check out my Pirate Software video if you want to know more about that), and here's what CodeRabbit did. It summarized the pull request with a bunch of useful info, saying it introduces significant enhancements to file upload validation and error handling across multiple files in the ingest infrastructure. Here it's summarizing all the individual files and what they do, but where it gets really fun is once it starts reviewing the code directly. Here's a comment where it called out that we were returning a partial error and should give a full error. Here's somewhere it caught something that would not be a great experience for users: we were telling them, in bytes, how big the file should be and how big it actually was. Nobody knows how to read bytes; we should be giving this in megabytes and gigabytes, and since CodeRabbit called it out, we were able to change it before anyone on the team even had to touch the PR. Super handy. And when it has changes that are simple enough to propose a fix for, the suggestion appears inline and you can one-click add it to your pull request. It's free to get started, it's fully free for open source, and if you want a full month of the Pro plan for free, use my code THEO1MFREE. Check them out today at soydev.link/coderabbit.
Before we can get into T3 Chat, we should start with where I started, which was DeepSeek. DeepSeek had just put out a new open source model called DeepSeek V3, and I was blown away by what you could do with it. It was really fast, really cheap, and comparable in quality to what you'd expect from something like Claude. I played with it and was really impressed, but the chat app was awful; it was so annoying to navigate, my experience using it was garbage. But I wanted to really take advantage of this model, and I'd also been thinking about this for a while, because I've been frustrated with ChatGPT's and Claude's web applications for as long as I've been using them, and over the last six or so months I've been using them more and getting more and more frustrated. So I wanted to play with this model and have a better UI to play with it in.

I went and tried a couple of the open source starter kits for building an AI chat app and quickly realized they were all garbage. No offense to the people who made them; it's really hard to do these things, and they built everything correctly under the technical assumptions that existed when most of these tools were created. But I wanted to do something fundamentally different. I've been dodging local-first for a while, because for most of what we build it doesn't make sense; an app like UploadThing gets nothing out of being local-first. A chat AI app, on the other hand, actually benefits a lot from it, so it was disappointing to not see anyone take advantage of that.

So I decided to start scaffolding. I started with v0, and I bet we can even find the point that v0 got me to. Yeah, as you can see, it's pretty far from where we ended up; we've redone all of this since, but it gave us a rough starting point using the Vercel AI SDK. I went over all my limits on v0, had a UI that kind of worked, got that running in Next on my machine, got all the parts plugged in together, and had it streaming. I immediately had some things I wanted that Next wasn't going to help much with, though. Specifically, I wanted the whole navigation to be on the client. As such, I ended up spending most of the day on the routing layer, and you'll see something interesting here: this is the only page in the app on day one, because I moved all of the routing out of Next over to React Router with a catch-all route that would handle all the different URLs you went to, because I didn't want the server to be involved in navigation as you moved around the app.

This, combined with a sync layer I built entirely through React context, "worked". It meant you lost everything as soon as you refreshed, and my attempts to build persistence back in with sync and a KV were not going great, but it kind of worked. With a rough sync layer, all the pieces were coming together, and navigating it felt good, but it was far from where we wanted it to be. I can probably run it locally: I go to /chat, we get "launch chat", which creates a new chat with an ID, and I can say "solve Advent of Code 2022 Day 2 in TypeScript", and... yeah, it took a second, because I didn't have all my optimizations yet, and I'm so used to it being fast that I just assumed it was broken. But it worked. It doesn't have auto scroll, because I was fighting scroll constantly throughout, but I at least had a decent UI, I had hacked in syntax highlighting in a way that was okay, and I had something here that worked. The sync engine was not one of the parts that worked, but at the least I had all of this, and I was proud of where we were at.
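To make the catch-all routing setup described above concrete, here's a minimal sketch assuming the Next App Router; the file names, routes, and placeholder component are illustrative guesses, not the actual T3 Chat code.

```tsx
// app/[[...slug]]/page.tsx — optional catch-all, so every URL is served by this one page
"use client";

import dynamic from "next/dynamic";

// BrowserRouter touches window, so skip server rendering for the whole client app.
const ClientApp = dynamic(() => import("../client-app"), { ssr: false });

export default function Page() {
  return <ClientApp />;
}
```

```tsx
// app/client-app.tsx (assumed name) — from here on, navigation never hits the server
"use client";

import { BrowserRouter, Routes, Route, useParams } from "react-router-dom";

// Hypothetical placeholder; the real app renders the chat UI here.
function ChatPage() {
  const { threadId } = useParams();
  return <main>{threadId ? `Thread ${threadId}` : "New chat"}</main>;
}

export default function ClientApp() {
  return (
    <BrowserRouter>
      <Routes>
        <Route path="/" element={<ChatPage />} />
        <Route path="/chat" element={<ChatPage />} />
        <Route path="/chat/:threadId" element={<ChatPage />} />
      </Routes>
    </BrowserRouter>
  );
}
```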
I had also wasted a ton of time on random explorations. I tried multiple different ways of storing the data; I have a Neon instance here that I had a schema for, but I ended up going with a KV through Upstash Redis. That worked fine with SuperJSON or something; I've been using a lot of SuperJSON for this project. But yeah, it kind of worked. It proved that this could happen, but it was nowhere near where it needed to be, and it was also 5 in the morning, so I went to bed. But first I made a quick update; I think I have it in here, do I have my readme? Yeah, I wrote down the things that I needed to do and then passed out.

After waking up the next day, I felt like I was far enough along to bring my CTO Mark in. I always feel bad bringing him in on these projects when they're so early, but I knew I couldn't do this one alone and would need a lot of help, so I caved, brought him in, and did my best to notate all the things that needed to be done. I was also battling my hot water and spent most of the day with plumbers, funnily enough, but we made a ton of progress. First and foremost, we overhauled the UI. We now had tabs you could move between, as well as a chat box that wasn't anywhere near as cringe. It still had bugs, and I was still insistent on command-enter, which was wrong; enter is how you should submit. We made a lot of progress here, though; parts were starting to come together.

I had thrown away all of the sync, because that context I was using before was garbage, and at this point I had started moving over to Dexie, which is, funnily enough, kind of an ancient library. If you don't believe me, just look at their website; you can tell this is from the 2010s. It's awesome. They have support for all these cool new things, and the team works really hard and builds great stuff, but this library started in like 2011 and it has Internet Explorer 10 support. This is not a project I've seen anyone talk about, and I understand it's kind of old, but I don't care; it was awesome. It made so many things I was struggling with way, way easier, and I had a lot of fun with it.

So we started architecting things, projects, threads, messages, building a database layer where we could store all of this locally in IndexedDB on your machine. If you're not familiar with IndexedDB, it's a browser standard for storing a shitload of data in the browser. Pretty, pretty cool. I ended up with a couple of functions here for creating new messages and threads, and then the code for the actual chat uses the default hook they provide, useLiveQuery, which syncs by getting updates through signals whenever something happens in Dexie. This method of getting messages was really nice, especially after the hell I had dealt with trying to do all of this with the Vercel AI SDK. I don't want to dunk on them too hard, because the SDK is great, and the backend side is still what we're using for streaming in from the LLMs, but the client side was very limited. It worked great for a quick demo, but as soon as I wanted things like local sync or IDs... oh God, I was so frustrated with the message types and the way IDs worked in there. We'll have a whole tangent about that in a bit, don't worry. I ended up spending a lot of time hacking the data layer and dealing with weird client behaviors trying to get the state to behave, and I couldn't get it to behave, so I caved and moved everything on the client over to the Dexie layer I was increasingly invested in, which meant I could just hit a live query that would update whenever the message changed and stream the response straight into my local DB. It worked great.
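Here's a rough sketch of what a Dexie setup like the one described above can look like; the table shapes, index choices, and names are assumptions rather than the real T3 Chat schema.

```ts
import Dexie, { type EntityTable } from "dexie";

interface Thread {
  id: string;
  title: string;
  lastMessageAt: number;
}

interface Message {
  id: string;
  threadId: string;
  role: "user" | "assistant";
  content: string;
  createdAt: number;
}

// Dexie wraps IndexedDB; the strings below declare primary keys and indexes.
export const db = new Dexie("chat-db") as Dexie & {
  threads: EntityTable<Thread, "id">;
  messages: EntityTable<Message, "id">;
};

db.version(1).stores({
  threads: "id, lastMessageAt",
  messages: "id, threadId, createdAt",
});
```

The chat UI can then read from it with useLiveQuery, so every local write (including each streamed chunk of an assistant response) shows up immediately:

```tsx
import { useLiveQuery } from "dexie-react-hooks";
// assumes the `db` from the sketch above

export function MessageList({ threadId }: { threadId: string }) {
  // Re-runs (and re-renders) whenever Dexie writes to the messages table.
  const messages = useLiveQuery(
    () => db.messages.where("threadId").equals(threadId).sortBy("createdAt"),
    [threadId]
  );

  return (
    <ul>
      {(messages ?? []).map((m) => (
        <li key={m.id}>
          {m.role}: {m.content}
        </li>
      ))}
    </ul>
  );
}
```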
It did mean a lot of things re-rendered when they shouldn't have, even with React Compiler, but overall it worked pretty well, and we were a lot happier. We had decent UX and flows for submitting, we'd gotten some actual Tailwind written to make things kind of pretty, and I was at a point where I was happy enough to show it to people and get some feedback. Also worth noting: this is the point at which I stopped using Claude and ChatGPT to ask questions throughout dev and was just using T3 Chat for all of my dev work. We also picked the name T3 Chat that night; if you look at the commit logs, it was at 5:00 a.m. that I decided T3 Chat was the name and put it in the corner. The reason I picked the name is I snagged the domain, so I used it here, and I was really happy with it. So the T3 Chat name was day two, as well as all this overhauling.

And here we are in day three. You might notice things don't look that different, and there are good reasons for it. I spent the first half of this day at Vercel's office detailing my frustrations with the SDK. They were very happy to take me in, thankfully, and wrote down, I think, six pages of notes, and are making meaningful changes to the AI SDK as a result. Fine and dandy, awesome; by the time this video comes out, chances are building something like this will be much easier because of the changes Vercel is making. But I had to do it all myself, so I spent most of the day gutting the remaining pieces of the AI SDK floating around and moving everything over to my Dexie layer.

Sadly, when I got home and opened my laptop to get back to work, I got a notification from an UploadThing user that Malwarebytes was blocking their customers from accessing files on a service they had just released. If you follow me on Twitter you probably already saw this; it went pretty viral for me. I've been dealing with things like this for a while; my video about hiio and Pirate Software's complaints touches on a lot of this. Since UploadThing allows any developer to let their users upload files, we will inherently end up with people uploading malicious things. It's going to happen, and despite the fact that we've been aggressive about removing those files and banning the users who do it, a couple of companies in the threat security and antivirus space would block our domains weeks after the files had been deleted, because they weren't robust enough to check, and they never bothered to notify us. So I had to spend a lot of time fighting Malwarebytes in their stupid goddamn forum, because it's the only place to report false positives. After fighting this for a while we got it resolved, and we spent a decent bit of time on the tech side figuring out how we can prevent it in the future; we have some cool subdomain stuff coming up later. But I ended up spending probably four to five hours dealing with all of this, sadly, which meant I didn't get to spend as much time coding as I would have liked.

That said, I was able to finish the Dexie layer for the most part, not the sync part, just the local part, as well as get some startup credits from Anthropic. OpenAI still hasn't gotten back to me; it is what it is. There was one other thing I forgot, and I probably shouldn't have. Oh, actually, we had a homepage now too. It did not fit well on the screen; we ended up fixing that literally an hour or two ago. But I did finally get auth, kind of. It's probably going to break really badly now because of how much we've changed the auth layer since, and I wasn't running it here the way I am there, but we had a mostly
working auth layer with cookies and local storage. I spent a lot of time thinking about auth for this app because I wanted everything local. I didn't want you to have to hit a server and get a thumbs up from me every time you did something, and as much as I love Clerk, it very much leans you toward doing everything through middleware in your Next app, and I did not want to fight anything that would come from there. So instead I picked a worse battle, which was rolling my own auth, and it made me miss Clerk so much. I genuinely, genuinely wish I had spent the time to try and figure out how to make Clerk work here. I know they've been a sponsor for a while, but they've been a sponsor for a while for a reason: I like the company and I like the product. And I lost so much of this day and the next day to auth. You can see how bad this looks: between my time at the Vercel office, my time with Malwarebytes, my time fighting the AI companies, and my time trying to get auth all set up, the only actual UI we got done was the delete message button. Yeah, not great. So I was excited for the next day.

The problem being, the next day was stream day. If you've been around long enough you know stream days are long, so I tend to not get to code a whole lot during them, and at the end of stream day I actually had to go to the Vercel office again to hang out and do a little meetup. Actually, the day before, there was one other thing I forgot about: I also spent a decent bit of time hanging out with the Laravel team at the Vercel office, which was very fun. Got to hang out with them a bunch, give them feedback on Cloud early, hung out with Josh, he filmed a clip here. Great time, but more lost time, so sadly nowhere near as much code as I would have liked on day three.

Day four, stream day: finished up auth, and also had to go to the Vercel office for the meetup I had agreed to, and I had some friends I hadn't seen in a while, so I was at the Vercel office two days in a row, which is funny; I'm almost never there, it just worked out that way. I had also spent a bunch of time this day moving off of Next, changing my mind, and moving back. Yeah, I had a prototype version of all of this working with Vite and React plus Hono on Cloudflare, and all the hacks I had to do to make the streaming work on Cloudflare were enough for me to say screw it and go back to Next for now. In the future we'll move this to doing it the right way, but not yet. So day four was mostly polishing auth and streaming, and also setting up Linear so we could actually track our issues. Oh, I think I also turned on React Compiler that day, if I hadn't earlier. Yeah, pretty much no change in the UI; everything still behaves basically exactly how it did, auth was just the big thing. I bet if I go to /auth it'll work now. Yeah, it does. Cool, Google auth, look at that, all through OpenAuth. OpenAuth is a really good library, but it is not easy to set up.

Day five. And I know, you're probably seeing a day six in here and going "wait, five days?" Can we honestly say that these two days were both full days, considering how much of my time I lost to entirely unrelated things? Yeah. Also, day one started at like midnight, so I'm pretty sure it's five days in terms of the dates on the calendar, but, like, be flexible; the five days was closer to five and a half. Day five, I spent a lot more time on that sync layer, because I had the local DB working great with Dexie, but I had not cracked the cloud side. I tried a few things, wasn't happy, and decided to go back to exploring other options. I had also, on
stream, said I was going to talk about local-first and didn't get to it, because I had to end the stream early to go to the other Vercel event, but I had a lot of DMs from people I trust talking about local-first stuff. As much as I don't think local-first is something we should all be reaching for for everything, there are a lot of developers I really respect and look up to who care a lot about it and had a lot of things they wanted me to consider and look into.

We had already explored Zero; funnily enough, I forgot to mention this earlier, but I had Mark exploring Zero for most of day two, and we concluded that, as cool as it is, it's not quite ready. If you're not familiar, Zero is by the people who made Replicache. It's a way to set up a Postgres layer with a cache-like JavaScript server (actually, I think it might be in Go), so there's a server between your database and the client. I know, boring, typical, but you define all of the behavior for the app and the API in a TypeScript file, and that becomes a WebSocket connection between that cache and your client, so everything is done on the client and then synced up to the server, rather than the other way around. Really cool pattern, really crazy potential. Overall, though, it was a combination of hard to set up, not super flexible, a bad source-of-truth story where you had to write the same code in like five places and hope it all came together properly, and mandatory downtime when you upgraded. All of these things were enough to make me unsure, and I had also gotten so deep into the Dexie layer that I wanted to lean in further. So we ended up doing a Dexie sync layer on day five that I built myself, but not before trying Jazz.

Jazz (jazz.tools) seems super cool. I spent a bunch of time talking with the team, and we tried really hard to get things set up, but there were a couple of fundamental design decisions that ran very much against the way I was trying to build. The way I would shorten my issues: it's very focused on collaboration and collaborative values, as well as every user being fully authenticated before anything happens. Here's the PR where I tried moving over to Jazz. You have to wrap everything with a provider, as you would expect, but if the provider doesn't have a signed-in user, it will not render its children, so doing this actually broke the app entirely. I couldn't get it to render, and it was really unclear why; it turns out you have to be authed before the Jazz provider will even return its children. Yeah, it is what it is. I got it kind of working, but every time I thought things were working, five new things would break. Some of that was that I just hadn't wrapped my head around the data model, but a lot of it was that the data model was weird. Everything has to be structured through a "me" object.

So here's a schema I tried making with Jazz. The schema has a weird hierarchy. You have to globally register your account type in their jazz-react package in order to have the types work at all. Then you define an account; it's a class that extends their Account, so this is MyAppAccount. They recommend you don't assign values directly on it, but you need to be able to access them from it, so instead you assign it a root value, which you type out, so I made MyAppRoot. MyAppRoot is a child of MyAppAccount. These are properties on classes, and if you know me and my functional-programming brain, you know how angry this was starting to make me. I then had to make a ThreadList, which is an extension of a co-list of co-refs of Thread, and I have my Thread, which has a title, a lastMessageAt, and a co-ref to a MessageList;
a MessageList is a co-list of co-refs of Message, which is... yeah. What this all means is that I can't select messages by a thread ID; I have to do everything through "me". If I want to render from a list, I have to go to me.root.threads, select with the right ID, and then get those messages and render them, and I did not want this type of hierarchy in my app. I have my createMessage function. It takes the thread ID and the message the user wants to send, and it does all of the things: it creates their message in the right thread, it gets all the messages from the thread, it creates a tidied-up version of those messages to send to the server to start streaming the new message from the AI, and we start streaming it in. I said at the time, jokingly, very much as a joke, thinking there was no way in the world this was true: "haha, if I have to pass the me object to createMessage, my head's going to explode." To which they replied, "uh, about that."

To their credit, they were hyped about how many issues I ran into, they were super responsive, and they're taking the opportunity to fundamentally rethink the loading and data patterns around Jazz. If I were to move to a sync solution, Jazz is very high up on my list of things I would consider. But my realization throughout this was actually a confirmation of a theory I'd had in the past, which is that the needs of different local-first apps vary so much that if you try to build a generic solution for all the local-first apps, you're not building something anyone can actually use or wants to use. So these attempts at generic solutions all kind of sucked for me, and I could not find one that was even close to what we were trying to do. So, after spending probably three to four hours going back and forth on Jazz, I finally gave up and rolled my own instead. It ended up going way better than I expected, considering how much time I'd lost to everything else going on.

I also spent some time experimenting with other models. This is when I started playing with ChatGPT, and playing with Claude a bit more too, and the reason for that is actually kind of silly: I started paying more attention to the different performance characteristics of a handful of models.
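Since the rolled-my-own route is what stuck, here's a sketch of what a createMessage flow like the one described above can look like on top of the Dexie tables from the earlier sketch; the /api/chat endpoint, field names, and plain-text streaming format are assumptions, not the actual T3 Chat implementation.

```ts
// assumes the `db` (threads/messages tables) from the earlier Dexie sketch
async function createMessage(threadId: string, userText: string) {
  const now = Date.now();

  // 1. Write the user's message locally first so the UI updates instantly.
  await db.messages.add({
    id: crypto.randomUUID(),
    threadId,
    role: "user",
    content: userText,
    createdAt: now,
  });

  // 2. Gather the thread's history and tidy it into what the server expects.
  const history = await db.messages
    .where("threadId")
    .equals(threadId)
    .sortBy("createdAt");
  const payload = history.map(({ role, content }) => ({ role, content }));

  // 3. Create an empty assistant message that fills in as tokens stream back.
  const assistantId = crypto.randomUUID();
  await db.messages.add({
    id: assistantId,
    threadId,
    role: "assistant",
    content: "",
    createdAt: now + 1,
  });

  // 4. Stream the response straight into IndexedDB; useLiveQuery picks up every write.
  const res = await fetch("/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages: payload }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let content = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    content += decoder.decode(value, { stream: true });
    await db.messages.update(assistantId, { content });
  }
}
```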
This site, by the way, is super killer: artificialanalysis.ai. They benchmark every model every day to give you performance information, so if we throw 4o, the latest 4o mini, the latest Claude, and DeepSeek V3 in here (I love this site, although the scroll breaks when you do that), this was really useful for me to start getting info. You'll see DeepSeek's quality is absurd, but there's a catch, and the catch wasn't something I felt the first few days. When I started using it, the output speed was great: 90 tokens per second, which effectively means about 90 words coming in every second, and it felt great. As we got closer to launch, the speeds were going down significantly; they'd dropped to almost half of what they were before, and I was losing confidence quickly. I also looked at the alternative hosts for DeepSeek, because it is an open source model, which was exciting; I was going to throw it on one of the other providers, but saw that all of them were even slower. So I started obsessing over the performance of the model, probably a little too much, and spent a lot of time testing all the different models. After playing a bit and screwing with ChatGPT and GPT-4o, I ended up getting 4o mini set up on Azure in a way that was really, really fast, and that's what we're using right now. We're going to introduce the ability to select different models in the near future, but for now the goal was fast without killing our bank accounts, and I'm happy with where we landed. DeepSeek is still hilariously cheap, so if you're looking for the cheapest option that's still high quality, check them out, but 4o mini is also really cheap and really fast, so there are a lot of good options nowadays, which is really cool to see. Oh, and we finally had a real homepage, by the way. For a long time everything was on /chat, which meant if you just went to the site you got a blank page; this fixed it. Yeah, we ended up not changing much UI-wise. Oh, I think I added the collapse for the sidebar, which was cool, but that was not the focus.

The next day was grind day. This was yesterday, the day before launch, and Mark and I just spent the entire day, from when I woke up to when I went to bed, hacking: overhauling the UI and making a ton of other changes, most of which, to be fair, Mark was making, but we hadn't merged them just yet. We changed the input box to look more like Claude's, we changed the sidebar to have a better new chat button and reserved that area for your auth information, and, most importantly, Stripe and payments. I still hate setting up Stripe. There are a hundred ways to do it and none of them feel right. We have a solution I'm okay with, but we also had a couple of reports of people paying and not having their account correctly flagged as paid, which makes me want to go mad, so we'll be spending a lot of time tonight making sure it is as stable as possible. By the time you see this video, checking out should be fine, but, like, chat's already saying it: Stripe is hell, I'm afraid of Stripe. I have checked out Groq, I had a tab open for it earlier; the speed you can get things out of that is nuts. Yeah, if you're trying it and seeing how fast it is, it's really nuts. We spent a lot of time on Stripe.

I also did an onboarding flow that I was really proud of, where when you first open the app it creates three messages that describe what it is and what it does. I did that instead of a traditional homepage, and I think it's really, really cool.
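Going back to that onboarding flow, here's roughly the idea, reusing the hypothetical Dexie tables from the earlier sketch; the copy, flag name, and details are made up for illustration.

```ts
// Seed a welcome thread with three explanatory messages on first open,
// instead of showing a traditional marketing homepage.
const ONBOARDED_KEY = "onboarded"; // assumed flag name

export async function seedWelcomeThread() {
  if (localStorage.getItem(ONBOARDED_KEY)) return;

  const threadId = crypto.randomUUID();
  await db.threads.add({
    id: threadId,
    title: "Welcome to T3 Chat",
    lastMessageAt: Date.now(),
  });

  const intro = [
    "What is this?",
    "T3 Chat is a fast AI chat app. Your threads and messages live in your browser, so everything stays snappy.",
    "Type below to start your own thread.",
  ];

  await db.messages.bulkAdd(
    intro.map((content, i) => ({
      id: crypto.randomUUID(),
      threadId,
      role: i === 0 ? ("user" as const) : ("assistant" as const),
      content,
      createdAt: Date.now() + i,
    }))
  );

  localStorage.setItem(ONBOARDED_KEY, "1");
}
```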
I also spent a bunch of time with Aiden, you know, the Million.dev guy who made React Scan. React Scan is a library that lets you see when things re-render; I have a video all about it and React render patterns coming out soon, which might be out before this, hard to know, my schedule's chaotic. He is an expert, like an industry-leading expert, in all things React performance. He's also the CEO of Million, which was originally an alternative React runtime that would make your React apps way faster; now it's more focused on the linting side, where they catch performance regressions in your app. He is so locked in on performance, it's nuts, and we ended up making a bunch of really cool changes.

The biggest one was the markdown chunking. We start by identifying chunks; I think there's a regex in here, or... oh, it's the marked lexer, which splits the markdown into blocks so that we can memoize each block. When we get new text, we don't have to re-render the entire message; we only render the block that the new text is going into. This was a huge win, in particular for messages that have multiple code blocks in them. It took the performance from rough to pretty good. It's still not where I want it to be; I'm going to spend a lot of time fighting Prism, or moving to something else for the syntax highlighting, but we got it running way, way better, and I'm very happy with the result. I also added some fun functions to make it easier to test in dev with a lot of threads, to get this all working well.

I still actually have, let me safely open up my environment variables here, a React Scan environment variable set locally so that I can just go to the site and have React Scan running on it. You can see when I make a new message here, "solve Advent of Code day 8 2021 in vanilla JS"... oh, that's really funny, I'm going to just comment out the rate limit for now. Okay, second attempt, and you can see the block you're in renders, but none of the rest of the UI does anymore. The result is you can hit a locked 60 FPS even with decent CPU slowdown; it can do 120 FPS, which is what my MacBook usually runs at when I'm not streaming, but it can dip down to around 100 sometimes, which is why I want to go further. I'm happy overall, though; it's way better. You might have seen the chat itself was rendering, but those are memoized re-renders, so they're not actually recalculating; it's just checking and giving the thumbs up, like, hey, this is okay, we don't have to do it. If you look closely, I'll see if I can do another one, "now do it in Rust", you'll see there's a little star on these. The star means it's memoized, so it's not actually rendering; it's just being checked a whole bunch, and yes, the things in this given message are being checked a lot, but they're opted out of really early, so it's not a big deal for performance.

Here it is with the performance monitor on. "Now do Erlang." The error is just a React Scan thing, don't worry about it, but you'll see that during the code block, CPU utilization spikes a bunch, and as soon as you're out of the code block and doing other things after it, it drops to nothing; it's only the code blocks that have this level of CPU utilization. And now that I have the dev tools open, the CPU slowdown on, and I'm streaming at a really fast speed with React Scan in React dev mode, it's not going as fast; see how much faster it goes, and how quickly that CPU usage drops, once those are off. It's just the code blocks, so now you see why I want to optimize it further. But we've been to hell and back to make this as fast as possible, both by doing everything we possibly can locally on the machine, avoiding renders to the best of our ability, and streaming things through a data layer that actually makes sense.
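As a rough illustration of the markdown chunking described above: split the streamed text into top-level blocks with marked's lexer and memoize each block, so only the still-growing block re-renders. The renderer (react-markdown here) and component names are stand-ins, not the actual T3 Chat code.

```tsx
import { memo, useMemo } from "react";
import { marked } from "marked";
import ReactMarkdown from "react-markdown";

// marked's lexer splits the markdown into top-level blocks (paragraphs, code
// fences, lists, ...); each token's `raw` field is the original source text.
function splitIntoBlocks(markdown: string): string[] {
  return marked.lexer(markdown).map((token) => token.raw);
}

// Memoized per block: a block only re-renders when its own source text changes,
// so while a response streams in, only the last block does any real work.
const MarkdownBlock = memo(
  function MarkdownBlock({ source }: { source: string }) {
    return <ReactMarkdown>{source}</ReactMarkdown>;
  },
  (prev, next) => prev.source === next.source
);

export function StreamedMarkdown({ content }: { content: string }) {
  const blocks = useMemo(() => splitIntoBlocks(content), [content]);
  return (
    <>
      {blocks.map((block, i) => (
        // Index keys are fine here: finished blocks never change or reorder.
        <MarkdownBlock key={i} source={block} />
      ))}
    </>
  );
}
```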
We also built a routing paradigm that combines the things that work well in Next with the things I actually like about React Router. The result is, as far as I know, the fastest AI chat app that's ever been built.

There are a couple of other cool things I did. I'm not super proud of the state they're in, but they're getting to a state I'm really excited about. Like, I have this useQueryWithLocalCache function; it should really be named useActionQueryWithLocalCache, because I pass it a server action. The server action does something like get the user's subscription status, but I also store whatever the result is in local storage, so instead of showing a loading state I can show a default state, and from that point forward show whatever the server returned previously. Theoretically, what this enables is that if you're on the free tier, then upgrade to the paid tier and go back to the homepage, it'll show "free" for just a millisecond before it pulls in the updated value, so I never have to deal with loading states, ever. I never have animations anywhere; I have a couple of strong stances in the readme, like avoid animations as much as possible, and indicate changes as aggressively early as possible, things like acting on mouse down, stuff like that. The result is an app where, with a lot of work and thought put into every layer, every render, every piece of data touching everything, it's something that flies, and I'm really proud of it.

Hopefully, at the very least, this can help you understand that React isn't slow; it's just easy to use it in a slow way. Admittedly, we had a couple of times where one small sync resulted in things rendering in ways that caused performance issues, but for the most part it was just fine, and I'm genuinely really happy with the results. Have you had a chance to try T3 Chat yet, though? I'm curious if you feel the wins that we put the time into here. Do you actually feel the difference between Claude and T3 Chat? I can't imagine you wouldn't, but if you somehow don't, please come tell us. Hit up the feedback channel for T3 Chat in my Discord if you have any issues at all, especially performance-related ones, because we take them all very seriously. I hope you enjoyed this breakdown of how we managed to build the app in five days (five, asterisk, but you get the point). The goal here was to build something that felt better than every other chat app, and I'm proud to say Mark and I somehow managed to do it. Let me know what you think, and until next time, keep shouting.
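For reference, a minimal sketch of the useQueryWithLocalCache idea described above; the hook name comes from the video, but the signature, the storage key handling, and the getSubscriptionStatus usage are assumptions.

```tsx
import { useEffect, useState } from "react";

// Calls a server action, but renders the last known value (from localStorage)
// immediately so there is never a loading state, then updates once the fresh
// result arrives and caches it for next time.
export function useQueryWithLocalCache<T>(
  key: string,
  action: () => Promise<T>,
  defaultValue: T
): T {
  const [value, setValue] = useState<T>(() => {
    if (typeof window === "undefined") return defaultValue;
    const cached = localStorage.getItem(key);
    return cached ? (JSON.parse(cached) as T) : defaultValue;
  });

  useEffect(() => {
    let cancelled = false;
    action().then((fresh) => {
      if (cancelled) return;
      setValue(fresh);
      localStorage.setItem(key, JSON.stringify(fresh));
    });
    return () => {
      cancelled = true;
    };
    // eslint-disable-next-line react-hooks/exhaustive-deps
  }, [key]);

  return value;
}

// Usage (getSubscriptionStatus is a hypothetical server action):
// const plan = useQueryWithLocalCache("plan", getSubscriptionStatus, "free");
```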