In 2025, every business is facing one universal challenge: mountains of unstructured data that they can't use or make sense of. AWS has gone out of its way to solve this exact problem by offering ready-to-use AI services through its platform, democratizing access to the best AI models through a few simple lines of code. Today I'll explain exactly what the core AI services on AWS are, and then I'll show you a real project I've built that demonstrates how these AI services work together.

I'm Sleman, founder of Soulcurity, and over the past decade I've architected secure cloud and AI solutions for government agencies and startups processing millions of requests daily. In that time I've developed a framework that will help you see the entire AWS AI ecosystem through a simple lens: input, transformation, and output. In simple terms, your data goes in, AWS processes it, and business value comes out. I'll show you exactly how AWS has organized its AI services around this principle, and it really applies to AI in general: input, transformation, and output. Make sure you stay until the end, because I've built a real project, an AI product intelligence platform, which I'll walk you through step by step and explain how it leverages AWS's AI services.

To start with, we're going to focus on the transformation part, because this is where the real AI magic happens. AWS has organized this into three distinct categories: pre-trained AI services, Amazon SageMaker, and Amazon Bedrock. The genius of all this is that AWS handles the infrastructure for you, so all you have to do is choose whichever transformation fits your data type and specific business need.

So let's get started with pre-trained AI services. AWS has organized its pre-trained AI services by data type: vision, language, speech, conversational, and document. The way to see these services is as instant AI. We don't have to provision or deploy them as infrastructure like we would with EC2 instances or S3 buckets. All we're doing is writing a few lines of code in our project that call them via an API request. An easy way to understand an API request, or an API call, is: "Hey, I've got this query or this problem, and I need something to be tagged or recognized." Just like when you're using ChatGPT and prompting the AI to give you a solution, we're doing the exact same thing here, just not in the GPT interface. We're building a project in the IDE and bringing the pre-trained AI services into it through a simple API call.

So let's dive into vision services with Amazon Rekognition. What Rekognition does is actually quite remarkable: it takes something humans do naturally, like looking at a photo and instantly knowing what it is, and gives that same capability to our applications, meaning our software can now see and understand images just like a person would. For example, in the Rekognition console we have an image of a guy on a skateboard, and we can see all of the labels, or bounding boxes, that Rekognition has generated for us. It's identified the car, the person, the skateboard, the wheels, and the building. Then on the right we've got the results tab.
It will also give us a confidence score for what it thinks the image actually contains. Here is how this works from first principles, and I want you to see it through the three-step framework I mentioned earlier: input, transformation, and output. The input is that we give Rekognition an image or a video file. This could be a photo from our phone, a security camera feed, or whatever visual content we have stored digitally. Then the transformation happens. This is where AWS runs our image through what we call deep learning models. Deep learning models are essentially computer programs that learn patterns by studying massive amounts of data, in this case millions of photos with labels describing what's in them. What these models become is pattern-recognition engines that can identify objects, faces, and text simply by recognizing visual patterns they've learned from all of that training data. This is amazing, because we don't have to build something like this ourselves and gather all of that data; AWS has done the hard work for us. Finally, we get the output, and as a result we get back what we call structured data. We literally just send the image to AWS and within milliseconds we get back all of this structured data about what's in it, just like we can see right here.

Let me show you an example. If I upload an image, we can see it's instantly recognized that I'm a man. It's figured out these emojis and labeled this as a baby, so it's probably a bit wrong, but it likely decided it's a human. I take up a big part of the image and this takes up a little less, so maybe that's why it's calling it the baby. You can see it generated the labels instantly. What we can also do is filter on facial analysis, and Rekognition will use its deep learning models for us. If we try that again, it recognizes that it's a face, knows that I'm male, gives my age range (which is correct), knows that I'm not smiling, and returns all of this data, which is amazing. The great thing here is that there's no complex setup and no model training. All we need is a simple API call, and we get all of these capabilities directly in our applications. And don't forget, I've built an AI application that uses Rekognition through direct API calls, which I'll demo right at the end.
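To make that concrete, here's a minimal sketch of what those Rekognition calls can look like with the AWS SDK for JavaScript v3, written in TypeScript like the demo project at the end. The bucket and file names are placeholders, and it assumes your AWS credentials and region are already configured.

```ts
import {
  RekognitionClient,
  DetectLabelsCommand,
  DetectFacesCommand,
} from "@aws-sdk/client-rekognition";

const rekognition = new RekognitionClient({ region: "us-east-1" });

// Label detection: "what's in this image?"
const labels = await rekognition.send(
  new DetectLabelsCommand({
    Image: { S3Object: { Bucket: "my-product-images", Name: "skateboarder.jpg" } },
    MaxLabels: 10,      // cap how many labels come back
    MinConfidence: 80,  // only return labels Rekognition is at least 80% sure about
  })
);
labels.Labels?.forEach((label) => console.log(label.Name, label.Confidence));

// Facial analysis: age range, gender, smile, emotions, and so on
const faces = await rekognition.send(
  new DetectFacesCommand({
    Image: { S3Object: { Bucket: "my-product-images", Name: "portrait.jpg" } },
    Attributes: ["ALL"], // return the full set of facial attributes
  })
);
faces.FaceDetails?.forEach((face) =>
  console.log(face.Gender?.Value, face.AgeRange, face.Smile?.Value)
);
```

That's the whole integration: no model to train, just a request with an image and a structured JSON response back.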
Now, when we're building projects with Rekognition, there are three things we need to consider. The first is performance optimization. Rekognition processes images fairly quickly, but what matters is making sure that the S3 bucket (AWS's file storage service, where we'll keep our images) and our API calls are in the same AWS region. A region is basically a geographic location where AWS has data centers. This might seem obvious, keeping Rekognition and S3 in the same area, but I've seen projects where images are stored in one region and processed in another, adding unnecessary latency, a delay that makes the application feel slower to users. The next thing to be aware of is size limitations. For image analysis with Amazon Rekognition, images stored in S3 must be 15 megabytes or smaller, while images passed directly in the API call are limited to 5 megabytes. The default API rate is five requests per second, but we can request an increase if needed. S3 itself allows much larger files, but those aren't relevant to Rekognition's image analysis; Rekognition can process images that are 15 megabytes or less. The final thing to consider is security. Rekognition can accidentally expose personally identifiable information. What we want to do here is enable the content moderation features, which are built-in filters that detect inappropriate content. We also want to use VPC endpoints, which are private network connections that keep our data from traveling over the public internet, especially if we're dealing with sensitive data. We want to keep everything within the AWS network.

So now let's talk about language services with Amazon Comprehend and Amazon Translate. Amazon Comprehend uses natural language processing, which is essentially teaching computers to understand human language the same way we do: not just as individual words, but as complete thoughts with context and meaning. Think about how you can read a sentence and instantly know if someone's happy, angry, or being sarcastic. That's exactly the ability we're giving computers here. Here's what Comprehend gives you. First, sentiment analysis. This analyzes text and tells you if the overall tone is positive, negative, or neutral, which is great for reading customer reviews and automatically knowing whether they're happy or complaining about the product or service. Then we have entity recognition. This automatically finds and pulls out specific things like names of people, places, or companies, and even dates, without us having to search for them manually. It also gives us key phrase extraction, which identifies the most important topics or concepts in our text, essentially highlighting what the text is really about. You can see the key phrases here: company name, credit card information, payment, date, due date, all the key things from this batch of text. And finally, we have topic modeling. This takes large volumes of text and groups them into categories or themes, like automatically sorting thousands of support tickets by the issues they're actually about.

Now, a lot of implementations go wrong because Comprehend was trained on general internet text, news articles, and reviews, meaning it has learned from everyday language. So when you give it content with specialized terminology, accuracy will drop. For example, if you're in healthcare, legal, or any other highly technical field, you'll either need to build custom models or add validation layers on top, which means having humans double-check the work the AI is producing.

Then we have Amazon Translate. Translate uses algorithms, basically sets of rules and patterns trained on millions of text examples, to automatically convert text from one language to another. Think of Translate like having a fluent translator instantly available through a simple API call. You can see it's done that really well, and there are loads of languages we can choose from.
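Both language services follow the same call-and-response pattern as Rekognition. Here's a rough sketch, again assuming configured credentials; the review text is just an example.

```ts
import {
  ComprehendClient,
  DetectSentimentCommand,
  DetectKeyPhrasesCommand,
} from "@aws-sdk/client-comprehend";
import { TranslateClient, TranslateTextCommand } from "@aws-sdk/client-translate";

const comprehend = new ComprehendClient({ region: "us-east-1" });
const translate = new TranslateClient({ region: "us-east-1" });

const review = "I love this product, but the delivery took far too long.";

// Sentiment: POSITIVE / NEGATIVE / NEUTRAL / MIXED, plus confidence scores
const sentiment = await comprehend.send(
  new DetectSentimentCommand({ Text: review, LanguageCode: "en" })
);
console.log(sentiment.Sentiment, sentiment.SentimentScore);

// Key phrases: the important topics in the text
const phrases = await comprehend.send(
  new DetectKeyPhrasesCommand({ Text: review, LanguageCode: "en" })
);
console.log(phrases.KeyPhrases?.map((phrase) => phrase.Text));

// Translate the same review into German
const translated = await translate.send(
  new TranslateTextCommand({
    Text: review,
    SourceLanguageCode: "en",
    TargetLanguageCode: "de",
  })
);
console.log(translated.TranslatedText);
```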
Now, with Translate we need to understand the varying quality levels. The first tier is European languages like French, German, and Spanish. These have excellent quality because there's a lot of training data available, meaning there are millions of examples for the AI to learn from, so we can often use these translations directly in our production applications. In other words, if our application is live, we don't need a human review step, because the output is very likely to be accurate. The second tier is Asian languages like Mandarin, Japanese, or Korean. These are good quality for general content, but we'll want a human to review the output for anything customer-facing or business-critical, since there's a lot of cultural context and formal language structure that doesn't always translate cleanly. The literal translation might be correct, but Translate can miss cultural nuances or sound a little awkward to native speakers. The third tier is less common languages. Here we should expect lower accuracy, because there's far less training data available, and we always want to validate with native speakers, meaning someone who speaks the language fluently checks the translation coming out of Amazon Translate.

Also, Translate charges per character, not per request, meaning you pay based on how much text you translate, not how many times you call the service. We can work with this by batching our requests, which means grouping multiple translations together in a single API call instead of making lots of individual calls, and by caching common translations, which means storing frequently used translations so we don't pay to translate the same piece of text every single time.

Now, for speech services we have Amazon Transcribe and Amazon Polly, and these give our applications the ability to hear and speak, a little like adding ears and a voice to our software. What Transcribe does really well is convert audio files, like recordings, meetings, phone calls, or voice messages, into written text that you can read, search through, and analyze. Transcribe works with high accuracy when the conditions are right, for example in this room, but any background noise can absolutely kill its accuracy, so you have to make sure you're using clean audio with minimal background noise. That's because Transcribe uses machine learning models, computer programs trained to recognize speech patterns, and these models work best when the audio quality matches what they were trained on, which is typically clean, clear speech.

Amazon Polly does the exact opposite: it generates speech from text, meaning you can give it written words and it creates audio that sounds like a human speaking them. "Hi, I'm Ruth. I can read any text for you. Test it out." Polly has four different engines, and we can filter through them to see what they do. They support different languages and different voices. If we pick the neural one, for example, we still have plenty of options for the voice and basically all the languages, and as we go up, the engines get more sophisticated, but those engines don't support all of the different languages just yet. So, for example, if we switch to Australian English and run this: "Hi, I'm Olivia. I can read any text for you. Test it out." And then we switch to Indian English and hear the exact same thing: "Hi, I'm Kajal. I can read any text for you. Test it out."
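Calling Polly from code is the same pattern again. This is only a minimal sketch, and it assumes the Ruth voice and the neural engine are available in your region.

```ts
import { PollyClient, SynthesizeSpeechCommand } from "@aws-sdk/client-polly";
import { writeFile } from "node:fs/promises";

const polly = new PollyClient({ region: "us-east-1" });

const result = await polly.send(
  new SynthesizeSpeechCommand({
    Text: "Hi, I'm Ruth. I can read any text for you.",
    VoiceId: "Ruth",     // voice availability depends on the engine and region
    Engine: "neural",    // more natural sounding, but supports fewer languages than standard
    OutputFormat: "mp3",
  })
);

// The audio comes back as a stream; write it to a file so we can play it
if (result.AudioStream) {
  await writeFile("speech.mp3", await result.AudioStream.transformToByteArray());
}
```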
You can see how well this works. It's truly incredible that we have access to these powerful services through simple API calls. This is what we call text-to-speech technology, and it's what powers things like voice assistants and accessibility features for people who have difficulty reading. Now, here's a feature that's incredibly useful but that most people don't really know about, and that's SSML tags. SSML stands for Speech Synthesis Markup Language, and it's basically a way to give Polly instructions on how to pronounce things. These tags let you control pronunciation, meaning you can tell Polly exactly how to say specific words, which is perfect for brand names like AWS that you want pronounced as individual letters, A, W, S, and not as a word (in SSML that's a say-as tag, for example <say-as interpret-as="characters">AWS</say-as>). So let's test that out: "Welcome to AWS." "Welcome to AWS." "Welcome to AWS." You can hear it's reading and pronouncing it the correct way. That's what we can use the tags for.

Okay, then we have Amazon Lex, which is AWS's conversational AI for building chatbots. What makes Lex so powerful is that it uses the same underlying technology as Alexa, so we're essentially getting the same enterprise-grade conversational AI for our own applications. Recently, Lex has also been integrated with Amazon Bedrock to make conversational bots much more intelligent, and we can use pre-made bot templates from AWS as well. For example, for airline services, insurance, finance, or retail orders, we can just plug in templates that are already built for us, which makes prototyping and validating these features much faster. Here's how Lex really works: it's built around understanding what users are trying to accomplish. In chatbot terminology this is called an intent, which is basically what the user wants to do, like checking their order status or booking a meeting. Then we have utterances, which are the different ways people might express the same intent. And this brings us to the most critical limitation that trips up most developers: Amazon Lex needs multiple example utterances per intent to work well, and AWS's documentation recommends at least 10 per intent. So instead of just "where is my order", users might also ask "track my package", "order status please", or "when will my order arrive". There are dozens of ways to make the exact same request, and most developers underestimate this. They provide only one or two examples per intent and then wonder why their Lex bot seems a little dumb and can't understand what users are asking for.

Okay, and finally, moving on to document AI, the final category, where we've got Amazon Textract. Textract extracts text from scanned papers, PDFs, images, forms, and even photos of handwritten notes, and converts all of it into digital text. What makes Textract special is that it doesn't just read basic text; it understands document structure, so it can extract information from tables and forms. This is absolutely game-changing for paperwork automation, because we can take thousands of invoices, contracts, or application forms and automatically extract all of the important information without having humans manually type everything up.
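Here's roughly what a Textract call looks like when you ask for table and form structure on top of the raw text; the bucket and document names are placeholders.

```ts
import { TextractClient, AnalyzeDocumentCommand } from "@aws-sdk/client-textract";

const textract = new TextractClient({ region: "us-east-1" });

// Analyze a scanned invoice stored in S3, asking for tables and forms
// in addition to the raw text.
const result = await textract.send(
  new AnalyzeDocumentCommand({
    Document: { S3Object: { Bucket: "my-documents", Name: "invoice-0042.png" } },
    FeatureTypes: ["TABLES", "FORMS"], // each feature you request is billed separately
  })
);

// Everything comes back as "blocks": pages, lines, words, tables, cells, key-value pairs
for (const block of result.Blocks ?? []) {
  if (block.BlockType === "LINE") console.log(block.Text);
}
```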
There are some things to consider about pricing, though. Textract charges per page and per feature: basic text extraction costs one amount, but if you also want it to understand tables, that's an additional charge, and the same goes for forms, which is another separate charge on top of the base text extraction.

So, we've gone over the AI services in each category, and you're probably thinking, wow, that's a lot of information. To be honest, yes, it is. So what we're going to do is apply the 80/20 rule, which here means that 20% of the services cover about 80% of business use cases. That 20% is Rekognition, Comprehend, and Textract. Applied to those 80% of use cases, they'll get you around 85 to 90% accuracy with minimal effort compared to building custom models yourself. That's where you get the most bang for your buck. But there is a ceiling: pushing past that accuracy threshold, going from roughly 90% to 98% accuracy for critical applications where mistakes are really costly, is where you'll need a service like SageMaker, which we'll cover in just a moment.

There is also a time when you should skip pre-trained services completely. Don't use pre-trained services when you're dealing with highly specialized data, like medical imaging or legal contracts, where domain knowledge really matters, because these areas have specialized terminology and requirements that generic models simply weren't trained to understand. There's also a cost consideration: once you're processing millions of documents or images per month, training your own custom model typically becomes cheaper than paying per API call. So at scale we really want to be thinking about our own models, but for fast development and iteration we can use pre-trained services, which brings us to the smart evolution strategy.

I always recommend that our clients follow this strategic approach when building AI products: plan your architectural evolution from day one. What I mean by that is to design your system from the beginning to change and grow over time. Start with pre-trained services to validate your product idea and get to market fast, but architect your system so you can easily swap in custom models when you hit scale and accuracy limits. Most successful AI products follow this exact path: pre-trained services for the MVP, then custom models for production scale, when you have more users, more data, clear requirements, and maybe even real revenue. And remember, there is never a perfect solution in technology. There are always trade-offs, and you're constantly balancing factors like cost, speed, accuracy, and complexity.

So, we've covered the pre-trained services, which are great, but as we've said, they have a ceiling, and that's where Amazon SageMaker comes in. SageMaker is AWS's custom machine learning platform, and you can think of it as machine learning infrastructure as a service. Just like EC2 gives us on-demand servers and Lambda provides serverless functions, SageMaker gives us managed machine learning. You focus on your data and your specific business problem, and AWS handles all of the technical infrastructure underneath.
There are two parts to SageMaker. We have the next generation of SageMaker, also known as SageMaker Unified Studio, which is a comprehensive platform serving as a unified center for data, analytics, and AI workflows. It's a broader solution that integrates various AWS services into one single environment. Then we have SageMaker AI, a focused service specifically for building, training, and deploying machine learning models at scale. The next-generation SageMaker is the broader platform, and it includes SageMaker AI, which we've just covered, as one of its components, along with additional tools and services to complete the data and AI workflow.

From first principles, SageMaker addresses the fundamental challenge of custom machine learning development: building your own machine learning system is incredibly complex, resource-intensive, and requires specialized skills that most companies and startups simply don't have in-house. What AWS has done here is abstract away this complexity by providing managed services for each stage of the machine learning life cycle.

So how does SageMaker actually work, and what does the workflow look like? The SageMaker workflow is split into four phases. The first phase is data preparation. This is where we prepare our data for machine learning, and where SageMaker helps us label our images (adding tags that tell the AI what's in each picture), categorize our text, and organize everything properly. This phase typically takes most of the time, sometimes 70% of the entire project, because it builds the foundation we then train and deploy our model on. SageMaker provides tools to speed this up, including automatic labeling suggestions and workforce management for human labelers.

Then we have phase two, model training. This is where we feed our prepared data to SageMaker and it learns patterns from that data. For example, you might want to predict which customers are likely to cancel their subscription, detect defects in manufacturing, or forecast sales for the next quarter. SageMaker has pre-built templates for common business problems, or you can build something completely custom for your specific needs. The platform also automatically experiments with different approaches, which we call hyperparameter tuning, to find what works best for your particular data. So instead of manually trying hundreds of configurations, SageMaker does all of that testing automatically.

Then we have phase three, model deployment. Our trained model is now ready to make predictions on new data, and we can deploy it with a single click while SageMaker handles everything: it scales up when traffic increases, monitors performance to make sure it's working correctly, and lets us test new versions of a model without disrupting live service. This is called A/B testing, where you try a new model version on a small percentage of traffic. For example, if we have a new model, we could route 10% of users to it, monitor the results, and then slowly increase the share, rather than deploying the model to 100% of our customers, which is riskier and could impact live service and the customer experience.
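Once a model is deployed behind a SageMaker endpoint, using it from your application is just another API call. This is only a sketch: the endpoint name is hypothetical, and the request and response format depend entirely on how your model container was built.

```ts
import {
  SageMakerRuntimeClient,
  InvokeEndpointCommand,
} from "@aws-sdk/client-sagemaker-runtime";

const runtime = new SageMakerRuntimeClient({ region: "us-east-1" });

// "churn-predictor" is a made-up endpoint name for illustration.
const response = await runtime.send(
  new InvokeEndpointCommand({
    EndpointName: "churn-predictor",
    ContentType: "application/json",
    Body: JSON.stringify({ tenureMonths: 14, monthlySpend: 42.5, supportTickets: 3 }),
  })
);

// The endpoint returns raw bytes; decode them into whatever format your model emits
console.log(new TextDecoder().decode(response.Body));
```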
And finally, phase four: monitoring. Once our model is live, SageMaker keeps watch to make sure it's working well. Over time our data changes and our model can become slightly less accurate, so SageMaker alerts us when performance drops and helps us update the model when we have fresh data.

Okay, so when do we actually use SageMaker? If pre-trained services aren't achieving the accuracy you need, or if you need complete control over your model architecture, meaning you want to customize exactly how the AI makes its decisions, then SageMaker is a good option. It's also ideal when you lack machine learning infrastructure expertise inside your company. Small startups and small teams benefit from rapid deployment without having to build machine learning infrastructure from scratch; here the time-to-market advantage, meaning getting your product to customers faster, simply outweighs the cost premium you have to pay. Enterprise teams get integrated MLOps, basically the tooling to manage AI systems in production, plus governance features, without having to manage complex infrastructure; here the additional cost is justified by the reduced operational overhead, meaning far less work for your IT and cloud teams.

With all of that said, as always, there is no perfect solution, only trade-offs that we need to understand. The first is the cost premium. SageMaker typically costs about 20 to 25% more than running the equivalent infrastructure directly on EC2. For example, a GPU instance that costs $383 a month on EC2 might cost around $475 per month on SageMaker. So we're looking at roughly a 25% premium, but that reflects the added convenience, automation, and managed services SageMaker provides. Basically, we're paying extra for AWS to handle the complex setup and maintenance.

The second is scaling speed. SageMaker has significantly improved its autoscaling capabilities, meaning its ability to automatically add more computing power when we need it has got better, but it's still generally slower than launching a container, for example with Docker, which can start in seconds. We could use something like ECS for that, which we'll get to shortly. If you need to handle sudden, unpredictable traffic, you might need to plan for that slight delay.

The third trade-off is platform lock-in. SageMaker offers lots of integrations and workflows that are specifically set up to keep you within the AWS SageMaker ecosystem, so once you start using it, it's quite difficult to move away. You can export your trained models in standard formats that work elsewhere, but moving your entire machine learning pipeline to another platform will likely require some refactoring and adaptation.

Now I want to give you a pattern that works really well with our clients: start with SageMaker to validate your idea and get to market fast, because you have to prove that your AI solution actually works and that customers really want it before you invest heavily in optimizing cost and architecture.
As you scale and understand your usage patterns better, meaning you know how many predictions you make per day, when your peak traffic happens, and what your typical workload looks like, you can optimize by moving high-volume inference to containers running on ECS, AWS's Elastic Container Service. And don't forget, inference is the actual prediction-making part: after you've trained your model, inference is when you feed it new data and get predictions back. ECS is cheaper for predictable workloads because you're not paying for extra managed services you don't need at scale. And here is the key part: keep training and experimentation on SageMaker, because those activities are far less frequent but more complex, so the managed services are worth the premium in my opinion. This hybrid approach balances development speed with cost efficiency: you get convenience where you need it and optimize costs where you're doing high-volume, repetitive work. So, in summary, with SageMaker you don't need to manage the underlying technical infrastructure, but you still need to understand your data, your business problem, and the basics of how AI algorithms work. And for most companies, that trade-off is absolutely worth the premium you pay for SageMaker.

So now we've gone through the pre-trained AI services and we've covered SageMaker, but there's an entire new category of AI that's transforming what's possible: generative AI with Amazon Bedrock. While SageMaker is about building and training your own models, Bedrock gives you instant access to pre-trained foundational models through a simple API call. From a first-principles perspective, Bedrock solves a fundamental problem: foundational models are massive, expensive to train, and complex to deploy. AWS has abstracted away this complexity, letting you focus on building your applications instead of infrastructure. With Bedrock you get access to Anthropic's Claude, including Claude Sonnet 4, plus other models like Stability AI for generating images, Amazon Titan, which is AWS's own family of models, and the Bedrock Marketplace, which gives you access to over 100 specialized models for specific industries like healthcare and legal.

Here we've got the AWS Bedrock console, all of the different models in our model catalog, and the Marketplace deployments, where we can go and find those marketplace models. For example, if we select Anthropic's Claude, we have to request it: you go into Model access, choose the model you want, click next, and submit. Note that this is tied to the region you're in, so if you change regions, you have to re-request the model. Let me just switch regions, and you can see I already had access to Claude in my other region.

So what can we do with Bedrock? First is text generation. This is the bread and butter: we can create articles, emails, code, or summaries. The real power here is context understanding; feed it a 50-page contract and ask for the key risks, and it will handle that with ease. We can use this the same way we use ChatGPT, and just like you'd use Claude directly, you can do the same right here in your AWS account. There are obviously some pros and cons to this, which I'll get to in a second.
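Under the hood, that text generation is one call to the Bedrock runtime. Here's a minimal sketch using the Converse API; the model ID is just an example, so use whichever model you've enabled in your region, since access is granted per region as shown in the console above.

```ts
import { BedrockRuntimeClient, ConverseCommand } from "@aws-sdk/client-bedrock-runtime";

const bedrock = new BedrockRuntimeClient({ region: "us-east-1" });

const response = await bedrock.send(
  new ConverseCommand({
    modelId: "anthropic.claude-3-5-sonnet-20240620-v1:0", // example model ID
    messages: [
      {
        role: "user",
        content: [{ text: "Summarize the key risks in this contract: ..." }],
      },
    ],
    inferenceConfig: { maxTokens: 500 },
  })
);

// The model's reply comes back as structured content blocks
console.log(response.output?.message?.content?.[0]?.text);
```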
We can also use Bedrock for image generation, though I do have to request access to that model first. So if we request access to Amazon Titan, click next, and submit, it takes a couple of minutes, but it should give us access to the image generation model. Just like that. Here we can make visuals from text descriptions, for example product mockups or marketing assets, all from simple prompts. "Make me a headphone for my..." and we can run it, and it should create some sort of image for us from our very generic, honestly pretty bad prompt. Give it a couple of minutes, because image generation still takes time. You can see the quality is pretty good: we got these very nice headphones presented like a Shopify-style landing page image. Obviously I can tweak the prompt and change the configuration to make it better, but you get the idea.

But that's not all. With Bedrock you also have agents, flows, and knowledge bases. There's a lot you can do, so let me know in the comments if you want a deep dive into Amazon Bedrock. And there's one more thing you can do to make the most of Bedrock's capabilities, and it's all about building an intelligent AI layer. Let's say you're running a business and you've got a lot of customer data. If you're running on AWS, that data is probably already stored in AWS, and Bedrock lets you understand and reason over all of it to build real intelligence into your products and services. For example, when a customer asks, "What product should I buy?", you don't want a foundational model's general opinion. You want an answer based on that customer's purchase history, their browsing patterns, and what similar customers loved. We're tailoring the experience to the customer to increase the chances of them spending their money and buying our products.

Here are some examples of how you can bring this into your industry. E-commerce personalization: connect Bedrock to your product catalog, user reviews, and purchase histories, and the next time a customer asks something like "what's good for winter hiking", they get recommendations based on their size, their previous purchases, and what similar customers actually bought and loved. Enterprise knowledge: link your internal docs, communication history, and project data, so when an employee asks something like "how do we handle GDPR requests", they get answers from your actual policies and past resolutions rather than generic internet knowledge. Customer support: connect support tickets, product documentation, and resolution history, and your AI chatbot will know everything about a customer's specific setup, their past issues, and what has fixed similar problems, reducing the time it takes to resolve tickets and improving the customer experience.

And that's the thing: since your data already lives in AWS, integrating AI becomes far more seamless. Your S3 documents become searchable knowledge, DynamoDB customer data enables personalization, RDS analytics inform recommendations, and CloudWatch logs reveal behavioral patterns.
So you're not moving data around or managing complex pipelines, because Bedrock can securely access your existing AWS resources, and suddenly your AI understands your entire business context. From there, the value comes from finding insights across your data: identifying why customers churn by analyzing support interactions alongside usage patterns, discovering which features drive engagement by connecting behavior logs to user feedback, and predicting what content converts, for example by correlating marketing data with sales outcomes. Every customer interaction makes your AI smarter about your business. Over time this creates a compound advantage: the more you use it, the better it understands your customers, your products, and your operations, and ultimately the better it can personalize recommendations and experiences to drive further revenue.

And that's not all, because when you use foundational models through Bedrock, your data isn't shared with Anthropic to train their models, and I'll explain why that's huge in just a moment. If you use OpenAI, you are sharing your data with them. If you use Claude directly, you're also sharing your data, although you can opt out. But when you use foundational models with Bedrock, AWS has a specific agreement in place to never share your data with Anthropic. In fact, when you use a foundational model on AWS, Bedrock makes a copy of that model and makes it available to you and your account only, which you can then fine-tune with your own data, and none of your data is used to train the underlying foundational model. Everything happens within your own secure environment, which is why you don't want to be pasting customer information or internal business documents into ChatGPT or Claude directly, where your data can be exploited. You also get a complete audit trail, so you can see exactly what data was accessed and when. With AWS you have IAM-based access control, giving you fine-grained permissions over who can use what, and you have private model endpoints, meaning your AI requests don't travel over shared infrastructure. When you consider all of these factors combined, using Bedrock becomes virtually a no-brainer instead of going directly to the foundational model providers through something like ChatGPT.

Now, with that said, what about the costs? Because a lot of the time, that's what it comes down to; it's part of the reason DeepSeek took the world by storm. It wasn't only that they dramatically decreased model training costs, their tokens were also much cheaper than OpenAI's. With Claude, Opus 4 costs $15 per million input tokens and $75 per million output tokens, while Sonnet 4 costs $3 per million input tokens and $15 per million output tokens. A million tokens is equivalent to roughly 750,000 words. And when we say input versus output: input tokens are what you send to the AI, so your questions, documents, and prompts, while output tokens are what the AI sends back, its responses and the content it generates for you. Notice that output costs significantly more, $75 versus $15 for Opus 4, because generating new content requires much more computational power than just reading and understanding your input.
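As a rough back-of-the-envelope calculation using those Sonnet 4 prices (the token counts here are made up purely for illustration):

```ts
// Sonnet 4 pricing from above: $3 per million input tokens, $15 per million output tokens.
const INPUT_PRICE_PER_TOKEN = 3 / 1_000_000;
const OUTPUT_PRICE_PER_TOKEN = 15 / 1_000_000;

// Say we send a 2,000-token prompt (a few pages of context) and get an 800-token answer back.
const inputTokens = 2_000;
const outputTokens = 800;

const cost = inputTokens * INPUT_PRICE_PER_TOKEN + outputTokens * OUTPUT_PRICE_PER_TOKEN;
console.log(`$${cost.toFixed(4)} per request`); // prints $0.0180; the output tokens dominate the bill
```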
Now, just before we get to the demo of the AI product intelligence platform, it's not really a case of pitting Bedrock against SageMaker, because they have different use cases, and I think this is where people get confused. They both have their place, with the various trade-offs we've covered. But if you need foundational models, the real question is whether to use the direct APIs from OpenAI or Anthropic, because you're paying the same price per token either way. Going with AWS gives you security and compliance, it fits in well with your existing AWS infrastructure, the data is already there, and your applications are very likely running in AWS too.

So now let's see how easy it is to combine these AI services into a working application. I've built an AI-powered product intelligence platform that uses Rekognition to identify what's in an image, Comprehend to analyze customer feedback, and Bedrock to give us business insights using Claude as our foundational model, all with just a few lines of API calls. Let me show you how it works. I've built the front end using Next.js 14 with TypeScript. The input layer handles everything our users interact with: a drag-and-drop (or choose-file) upload component with built-in validation that checks file types, enforces size limits, and provides immediate feedback, plus a text input that I'll show you in just a moment. When we upload a product image, it gets stored in Amazon S3 using pre-signed URLs, and whatever we type into the text box then gets analyzed by AWS Comprehend for sentiment. So we can write, "I love my product. It works very well and I love sharing it and showing it to friends and family," and press "Analyze with AI". The product I've uploaded, by the way, is a Nike sneaker. You can see it's recognized footwear, shoe, and sneaker, the sentiment from Comprehend is very positive ("my friends", "my family"), and it has pulled out the key phrases we discussed earlier. Then it gives us the Bedrock-powered business insight: the summary says it's a fashionable, functional sneaker product that customers enjoy sharing with friends and family, which is fed in from Comprehend; we get our customer insights; and it also gives us a recommendation on how we could use this product to improve our service. So for product detection we're using Rekognition, for sentiment analysis we're using Comprehend, and for the AI-generated business insights we're using Bedrock, all in a clean, scannable format.

Now let's look at how this works under the hood and how the back end orchestrates these AWS services. The back end is where we orchestrate all of the AWS AI services. I used Next.js API routes because they're simple, they scale, and they deploy seamlessly. If I were building this into a real production-grade product, I'd use API Gateway with Lambda functions instead, but for the speed of this demo I opted for Next.js API routes. We have two endpoints: upload-url and analyze. The upload endpoint generates secure pre-signed URLs for our S3 bucket, which we have right here. This is important: instead of routing our image files through our server, users can upload directly to AWS S3.
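Here's a condensed sketch of what that upload-url route can look like; the bucket name is a placeholder, and in practice you'd validate the file name, type, and size before handing the URL back to the client.

```ts
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const s3 = new S3Client({ region: "us-east-1" });

// Generate a short-lived URL that lets the browser PUT one specific object into S3,
// so the image never has to pass through our own server.
export async function createUploadUrl(key: string, contentType: string): Promise<string> {
  return getSignedUrl(
    s3,
    new PutObjectCommand({
      Bucket: "product-intelligence-uploads", // placeholder bucket name
      Key: key,
      ContentType: contentType,
    }),
    { expiresIn: 300 } // the URL stops working after 5 minutes
  );
}
```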
Uploading directly like this is faster, more secure, and costs less, and we still validate everything server-side: the file name, the type, the size, and we enforce security policies. The analyze endpoint then orchestrates calls to our three AWS AI services: Rekognition for image analysis, Comprehend for text processing, and Bedrock for intelligence generation. You can see that step by step here. We could split these into separate API calls, but in this case we've kept it very simple. We also have our infrastructure code written with the CDK: an S3 stack for the bucket and an IAM stack for permissions and policies.

I chose these services specifically for their managed capabilities and enterprise-grade features. S3 gives us secure image storage with encryption at rest. Rekognition provides computer vision without training any models: it detects objects, extracts text from images, and includes content moderation. Then we have Comprehend for sentiment analysis, and Bedrock with Claude as our intelligence layer. Bedrock takes the structured data from the other services and generates human-readable business insights, competitive analysis, and actionable recommendations, which is exactly what it's done right here. What makes this architecture powerful is that each service is fully managed, scales automatically, and integrates seamlessly. No machine learning expertise is required, just API calls and business logic. The only infrastructure I had to provision was, like I said, the S3 bucket; the rest of the stack is purely API calls.

So let's see how this all works together in the data flow. The diagram shows the orchestration of our AWS AI services, transforming raw data into strategic business intelligence. Notice the numbered flow. Step one is the secure image upload to S3 using pre-signed URLs. Step two is true parallel processing: while Rekognition extracts objects, text, and moderation data from the image in S3, Comprehend simultaneously analyzes the sentiment, entities, and key phrases from the text I typed into the product UI. There's no waiting and no bottleneck, because Rekognition and Comprehend process the data in parallel. And the real magic happens at step three, where both AI streams converge at Bedrock: Claude doesn't just aggregate the data, it synthesizes the visual and textual intelligence into structured business insights. The output box at the bottom, the results, shows exactly what product teams and executives need: product summaries, customer insights, recommended actions, competitive analysis, and risk factors.

This takes a couple of seconds of processing time, which shows the power of managed services. What traditionally required a data science team, complex infrastructure, and months of development, we've achieved through intelligent API orchestration. There are no servers to manage and no models to train, just pure business value. This is cloud-native AI at its finest: turning the three fundamental principles of input, processing, and output into a competitive advantage through AWS's pre-trained, enterprise-grade AI services. With just a few lines of code and simple API calls, you've built an application that could help e-commerce companies analyze product feedback at scale.
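Condensed down, the analyze step looks roughly like the sketch below; error handling, the real prompt, and response parsing are omitted, and the model ID is just an example.

```ts
import { RekognitionClient, DetectLabelsCommand } from "@aws-sdk/client-rekognition";
import {
  ComprehendClient,
  DetectSentimentCommand,
  DetectKeyPhrasesCommand,
} from "@aws-sdk/client-comprehend";
import { BedrockRuntimeClient, ConverseCommand } from "@aws-sdk/client-bedrock-runtime";

const rekognition = new RekognitionClient({});
const comprehend = new ComprehendClient({});
const bedrock = new BedrockRuntimeClient({});

export async function analyzeProduct(bucket: string, imageKey: string, feedback: string) {
  // Step 2: vision and language analysis run in parallel; neither waits on the other
  const [labels, sentiment, phrases] = await Promise.all([
    rekognition.send(
      new DetectLabelsCommand({
        Image: { S3Object: { Bucket: bucket, Name: imageKey } },
        MaxLabels: 10,
      })
    ),
    comprehend.send(new DetectSentimentCommand({ Text: feedback, LanguageCode: "en" })),
    comprehend.send(new DetectKeyPhrasesCommand({ Text: feedback, LanguageCode: "en" })),
  ]);

  // Step 3: both streams converge at Bedrock, which turns structured data into insights
  const prompt = `Product labels: ${labels.Labels?.map((l) => l.Name).join(", ")}.
Customer sentiment: ${sentiment.Sentiment}.
Key phrases: ${phrases.KeyPhrases?.map((p) => p.Text).join(", ")}.
Write a short product summary, a customer insight, and one recommended action.`;

  const insight = await bedrock.send(
    new ConverseCommand({
      modelId: "anthropic.claude-3-5-sonnet-20240620-v1:0", // example model ID
      messages: [{ role: "user", content: [{ text: prompt }] }],
      inferenceConfig: { maxTokens: 600 },
    })
  );

  return insight.output?.message?.content?.[0]?.text;
}
```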
The same pattern works across all AWS AI services: you're just connecting inputs and outputs to solve real business problems. So, if you're interested in working with my consultancy directly, click the links in the description below. Or if you want to learn more from me, check out my Cloud Engineer Academy, where I teach you how to build these platforms and services and become a real cloud engineer in 2025. Alternatively, check out this video right here for my in-depth roadmap to becoming a full cloud engineer as a complete beginner. As always, thanks for watching, and I'll see you in the next one.