Transcript for:
n8n Scaling Lecture

Hello, hello, welcome to our live stream! How are all of you doing today? I have with me Omar again — thank you so much, Omar, for joining us. We'll be talking about a very interesting topic that we have seen a lot of questions about, and I think you mentioned it at the end of the last live stream as well: scaling n8n. So yeah, really excited to hear and see how we can scale your n8n instance step by step.

Definitely. Thank you for inviting me again, it's an honor to be here talking about n8n once more. It's amazing to share some thoughts with our audience and our viewers, and yes, we'll be talking about scaling n8n, which is very exciting — it's something we have been working on a lot lately, and we have some very interesting releases. It's very interesting because we'll be talking about some of the internals: how n8n works, how we're working on it, how it's evolving. So it's going to be an interesting talk for both technical and non-technical people. As always, we're trying to keep it as simple as possible for everyone to understand, but inevitably we'll get a bit more technical, and it's going to be interesting for everyone to learn a little. I hope it's insightful for everybody.

Awesome. So basically, today we'll be taking a peek under the hood to see how the engine functions, right?

Exactly. We're not necessarily looking at the code, but we'll be running stuff and understanding some of the concepts behind how n8n is designed: its internals, what the machine parts are, how they interact and how they behave. So there are going to be a lot of interesting learnings today.

Awesome. Before we get started, Omar, let me share your LinkedIn in the chat, so that people who want to connect with you can send a connection request as well. Omar and I are joining from Berlin today, and I'm really curious to know where you're joining us from — so please share in the chat, say hi! We'd love to know what you're looking to learn today and where you're joining from. I already see a message from Hershel, who says "hey hey hey" — he also says he's excited to learn. We're very happy you could join us today. And while we're looking at all this: Omar, can you tell us a fun fact about yourself?

Oh, you got me with this one — a fun fact. Well, there is one thing, actually. As developers, we usually drink a lot of coffee. I'm not the coffee type, because my stomach hurts sometimes, so I almost quit coffee. But the thing about coffee is the caffeine — and even if I drink a lot of caffeine, or any sort of energy drink, I can still fall asleep in like 30 seconds, usually. And that's something that drives my wife completely crazy, because she takes a long while to fall asleep. So we're usually talking in bed — maybe a thoughtful conversation, sometimes about politics and things like that — and then, okay, it's time to sleep for me: I just take off my glasses, turn to my side, and sometimes in six or seven seconds I'm snoring. She goes completely nuts, because it usually takes her almost two hours to fall asleep. And that's an interesting thing, because it's something I learned over time: it's all about thinking about my breathing. It's almost like meditating — I think about my breathing, I relax my body, and voilà, I'm sleeping.

That's awesome — that's a superpower. Being able to sleep at any time and not be narcoleptic.
Yeah, yeah. And I see a message from Teflon, who says "hey everyone, joining in from Canada" — thanks for joining us! Hershel is also joining us from Berlin. I see a message from No Code, who says they're joining us from India, and then we have a message from Isabel, who says greetings from Berlin. We have quite some Berliners joining us today. Omar — no coffee?!

Well, I love coffee, don't get me wrong — it's just about my stomach, so I can't have too much of it. But I love coffee: almost every day in the morning I'll have a cup, but I mix it with milk, just so that my stomach doesn't hit me so hard.

Nice, awesome. Cool — so we're going to talk about scaling n8n. How do we get started? I'm going to share your slides on the screen, and I guess one of my first questions is: what should we already know to get started with this?

Sure, awesome, let's do this. Okay, so: scaling n8n. Let's go to the next slide — oops, sorry — okay, next slide. What is scaling, actually? Let's set the ground for this. Scaling is the property of a system to handle a growing amount of work by adding resources to that system. In essence, what it means is: you need more stuff to be done, so you need more power. How can this be translated? Either you get a more powerful computer — you buy a newer laptop with a stronger, newer CPU, much more RAM, or something like that — and this is what we call vertical scaling, when you get a more powerful computer or more powerful servers. And then there's horizontal scaling, when you say: okay, I don't need a more powerful computer if I can split the work among multiple computers. That's what we call horizontal scaling — when you add more servers. This is a very basic concept, but it's interesting to know, because so far we were only able to scale n8n vertically, by getting a stronger computer — more CPU, more RAM. Now we're actually able to scale n8n horizontally, by adding more servers to our pool and making n8n able to process a larger amount of data.

So this is similar to distributed computing?

Yes, it's similar, because what we are doing right now is building a distributed system. The concept of "distributed" is that work is processed on multiple systems, in multiple parts. So that's exactly what we are doing.

Cool. Now let's talk a little bit about how n8n is structured and how it's designed. If you have already interacted with n8n, the first thing that you see is the interface: once you open your browser you see the n8n UI, which is what we call the editor interface. This is where you actually interact with n8n — it's the place where you draw your workflows, save them, check executions, add credentials. All of that happens in the editor. And every action that you take in the editor — for example, when you save a workflow, when you activate it, when you save a credential — has to be saved in a database system, and that is done by the internal API. The editor interface interacts with the API by sending data, which is then saved to the database. And there's also another service here that we call the webhook registration service — I'll talk about it just briefly.
Let's say that you created a workflow with Trello, and you want to take an action whenever a card is moved from one column to another. This means that n8n needs to tell Trello that it wants to be notified whenever an action on Trello happens, so that Trello sends a message to n8n and n8n can act on it. This is the service that does that registration part. It's something that had to be changed when we started scaling n8n, because the way n8n used to do it was: once a workflow was activated it would register the webhook, and once the n8n instance was stopped it would deregister the webhook. But now that we're in a distributed system, with other servers talking to each other, this needed to change. So it's more of an under-the-hood change, but it's one of the architectural pieces we have internally.

Okay, so that's the left column — the basic area where you interact with n8n. Then, in the center, we have what we call the initiators. These guys are responsible for understanding whether something should be executed — whether a workflow should run. Let's go back to our Trello example, where we have an action: I drag a card from column A to column B, and Trello notifies n8n. This is done via a webhook, which is in essence an HTTP call. So this is the webhook layer doing its part: it receives an HTTP request, looks at it and says — ah, okay, the card was moved, all right, I have to do something; hey, I need to execute this workflow. And the same thing happens, I don't know, for Asana when you complete a task, or something like that. This is all done by the initiators: they are the ones responsible for receiving information and checking whether the necessary conditions are met to execute a workflow, to take some action based on that event.

The initiators can be of three types. I already talked about the webhooks, which receive an HTTP request from a service — Asana, Trello, Gmail, Google Drive, Dropbox, whatever. Then there are the pollers, which are almost the inverse of a webhook. On Trello, when you take an action, Trello actively notifies n8n — but that is something Trello implemented. If a service doesn't support that, what we can do is the opposite: n8n asks the service — hey, do you have any updates? Was there any movement of cards, did something happen, was there any change? And when that happens, we also trigger a workflow, an execution.

And some of the trigger nodes actually use a webhook under the hood, like Trello, for example. And some of them — I can't remember exactly which ones, but the Airtable trigger, for example — are doing polling under the hood, right?

Exactly, yeah. We try to make what's happening under the hood transparent to you, but internally this is what we're doing: we need a way to see that there were updates, that there is a condition for us to execute a workflow, and this is done by the initiators. And the triggers — the ones at the top of the middle column — are everything that we cannot poll for and that is not actively delivered to us by an HTTP request. For example, the Interval node or Cron. These nodes are time-based, so there's no external event that happens and causes something to execute. That's an example of the trigger type of node. So this is the middle column — what we call the initiators.
These are the guys that check for the necessary conditions to start a workflow. And lastly, we have the guys doing the heavy lifting, which are the workers. We now allow multiple worker instances — you can have as many as you want, so you can spread the load. If we look at a workflow, you have the start and trigger nodes, and then you have all the others. The column in the middle deals with the trigger nodes, and all the other nodes — the ones that are not triggers — are processed by the workers. So that's how we separated it. I hope that was clear — if not, please feel free to share your questions in the chat.

Cool. So let's take maybe two minutes, because I think this is a crucial part of understanding this whole scaling setup. If you have any questions, please share them in the chat and we'll answer them right away. And while we wait for our audience to come in with questions, I had a few questions for you. What, essentially, are workers? Are these Node.js threads, are these processes — what are they?

Okay, so they are actually processes, and they can run on a completely separate server. Let's say you're using a cloud service — Amazon, for instance — and you have one server running what we call the main n8n process. You can then have multiple servers — say, EC2 instances, or even Docker containers if you're running on Kubernetes — that are worker instances, and they actually do the job. We'll be showcasing this in a minute, and I hope that makes it all a lot clearer. But what happens is that we can have one single main instance generating jobs for a multitude of workers. So yes, they're Node.js processes, and they can run anywhere, as long as they can share some information — and we'll be talking about that in a second, too.

Cool. I see a question come in from No Code, who asks: so triggers never involve an external service?

Not necessarily. There is one trigger in n8n, for example, for the AMQP queue system. That one is triggered by an external server — a queue system — but it's not handled by an HTTP call in any way: it's not something that we poll over HTTP, and it's also not a webhook. It's more of a socket thing, or a specific driver — I don't know exactly how it works — but in the end it's an example of a node that is a trigger node. It does happen based on an external event, but that's not necessarily always the case. So anything that is not related to an HTTP request — be it polling or a webhook — is a trigger.

Got it. And I remember, because I worked a bit on the Kafka node, and one of our colleagues worked on the Kafka trigger node. For that, we had to include a Node.js library — I can't remember the exact name, maybe it was Kafka Connect — and then we were talking with it, following the protocol that Kafka uses. So not an external HTTP service, but yeah.

Then I see another question coming in from Teflon Dude, who asks: how many workers does the standard installation of n8n have?

Okay, so for a regular installation you actually have everything bundled as one: you have the main server and all the initiators and the workers in a single process — everything bundled together. Let's say that you run it via Docker — docker run n8nio/n8n — you have the n8n instance running with everything built together. And the same when you start n8n on your computer, either via npx, or if you've downloaded the repository, installed the packages and started it: it all comes with everything bundled together. So you actually have only one worker there.
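(For readers following along: the everything-bundled setup described above is a single command. A minimal sketch — the flags are illustrative, not from the stream:)

```sh
# standard n8n: editor UI, API, initiators and one built-in worker, all in a single process
docker run -it --rm -p 5678:5678 n8nio/n8n
```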
Great. If there are any more questions, please send them in the chat — otherwise, let's move forward.

Yeah, let's do it. Okay, so: basic n8n, up and running — let's see how it works. I have a copy of n8n here, the repository.

I love this header, by the way.

Oh, thank you. I made sure that all the windows are the same size, so that we all feel comfortable — and we'll be using all of these windows in a moment. Okay, so this is what we usually see when we start n8n: the basic n8n screen. You have the editor interface here, and under the hood we have the API. I'll open the browser console here just to showcase that: if I click to open the workflows, it's actually talking to the n8n backend, and there's a lot of information here. We can open our workflows, see them, edit them. This is the basic n8n installation. What I did here is: I have the repository installed on my machine, I built the project previously, and I simply ran npm run start — which is actually the same as ./packages/cli/bin/n8n start. It's the same command in essence; npm run start is simply a shortcut for it. So this is what we see when we run n8n — no big deal here; if you have ever tried n8n, you have probably interacted with this already.

Okay, so let's get to the cool things. I won't spend too much time showing plain n8n right now, because you probably already know it. For scaling n8n, there are a few things we have to keep in mind. Now that we're working with a distributed system, the parts need to communicate. In standard n8n you have everything bundled together in a single piece of software, in a single process, so all the information is in there, just flowing internally. But now we're dividing this up, making it a distributed system, and with that in mind, information needs to flow from one side to another — from one peer to another — so they can share what's happening. n8n by default uses SQLite as the database. If you don't know what SQLite is: it's a database system that runs SQL queries — SQL being the standard language for inserting, updating and querying data — so it's a place where you can store data, and it's based on a file on your local computer. This is what n8n uses by default when you simply start it like I did earlier. But now we are running a distributed system with many peers talking to it, and you cannot have — well, you actually can, but it's not advisable — all of them interacting with the same file, because that would lead to concurrency problems. Files are not meant to be shared across distributed systems like this; it's not a best practice — you get problems with locks, you get performance issues. So we'll be using a "real" database for this — not saying that SQLite isn't one, but a more powerful one. We'll be using Postgres for this, and also Redis.
Why Redis? We're using an underlying library in n8n called Bull, which is built on top of Redis and acts as the message broker. When an initiator says it needs to run something, it posts a message to Redis, and the workers are all listening to Redis: they get the message, execute the workflow and report back.

Okay, so let's start setting it up — let's get cracking, because this is the fun part. One thing I'll be doing is just for this demo; you don't have to do it at home, but feel free to if you want. I will be running a lot of different containers on my computer, because that's the best way to simulate running on multiple different servers. Just to make sure they're all communicating properly, I'm adding all of them to the same Docker network. So let's pretend I have multiple computers: what I'm actually doing with this command is making sure we have cables connecting all of them.

I had one question, Omar. n8n ships with SQLite by default, so folks who follow along would have to migrate their database to Postgres. Would we be doing that as part of the demo today, or would we point them to somewhere in the documentation?

We'll be pointing them — it's not something we'll cover today. But if you go to the reference section of the n8n docs, there is a CLI section, and in the CLI section you have the import and export commands. In essence, what you would do is: export your information while still running with SQLite, then set up your instance to work with another database — Postgres, MySQL, MariaDB, your choice — start n8n, and then import the information back. You can import and export both workflows and credentials.

Okay, and I have added a link to that in the comments section as well.

Awesome, thanks. Okay, so let's create the network. This is very simple: docker network create, and I'm calling the network n8n. A very simple command — if you want to follow along at home, feel free; all of these commands should work. I'm just running them live with you, but I've tested them all before. Then, as I mentioned, we need Redis, so I'll be using the Redis container for this. This command spins up a Redis container, adds it to the network that we created, and exposes port 6379 to my host. Exposing the port is not really necessary; it's just something I always do, because if I want to check for information in Redis I can simply connect to the Redis instance running in Docker. But it's not needed, since we've added everything to the same network. So let's start Redis — I'm using the -d flag so it runs in detached mode, which means the terminal is free for me to use again. Then the same for Postgres: just like Redis, I'll run Postgres in a container. One important thing: I use the POSTGRES_PASSWORD environment variable and set it to mypasswd — this is exactly the password we'll be using. I'll just copy and paste this command, and if I run docker ps I can see that both the Postgres and Redis containers are running, named n8n-postgres and n8n-redis. So we have both containers running on the computer, ready for us to use.
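(A rough reconstruction of the commands just described — the network and container names are the ones Omar mentions; everything else is standard Docker:)

```sh
# one shared network, so the containers can reach each other by name
docker network create n8n

# Redis, the message broker; exposing 6379 is optional, just for poking at it from the host
docker run -d --name n8n-redis --network n8n -p 6379:6379 redis

# Postgres, the shared database; the password must match what n8n is configured with
docker run -d --name n8n-postgres --network n8n -e POSTGRES_PASSWORD=mypasswd postgres
```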
And regarding the database — is the n8n instance that you showed earlier already talking to the Postgres that was just spun up?

Not really, actually — I killed it before we started doing this. I stopped that n8n instance, because we'll now be starting another one, connected to these services.

Gotcha, sounds good.

Also, one note that I left here: if you're deploying for high scalability — a lot of executions, a very busy environment — we strongly suggest you use Postgres 13, because compared to version 12 it received a lot of performance upgrades, especially when it comes to counting. So if you have lots of executions, it works a little better; you get some performance gains there.

Okay, and now let's get to the cool part, which is running n8n itself. I'll copy this command and explain it in a second — oh, actually, I have to go back to my home folder first. So what happens now: we're using the Docker image to run n8n, we're loading an environment file with some environment variables — I'll show in a moment what those variables are — I'm adding it to the same network we had before, so they're all connected, I'm exposing port 5678 to my host computer, and this is the Docker image: n8nio/n8n. So now we can open it, and we have an n8n instance just like the one before. But if you recall, when I clicked — I used a shortcut; let me showcase — if you go to this menu and click open, you see there are no workflows here. In the previous instance I had a lot of workflows, because that instance was connected to my SQLite file; this one is connected to Postgres, which is a brand new, completely empty database — it was just born.

Let's go back to our presentation, where I'll tell you what's in that file. If you check this command, we're using an environment file. I put it in the .n8n folder — a hidden folder inside my home folder where n8n keeps its information. I just created a file there and added some settings — let me open it here.

And folks could equally create this — it doesn't ship with anything, does it?

What do you mean — Docker?

I mean the env file.

Oh no, this one I created myself.

Yeah — so let's say I want to run this setup: I could equally create this file on my desktop and run it from there as well?

Exactly — you just have to point to the file. That's why I switched to my home folder: that's exactly where the file is. I'm in my home folder, there's a folder called .n8n, and that's where I have the environment variables file. If I check what's inside, we'll see some information — I'll go back to the presentation, because the font's a little bigger there. Database type: I'm telling n8n that the database we'll be using is Postgres. I'm also telling n8n that the host is n8n-postgres — this is important, because that's the name we used in the earlier command; I named the container n8n-postgres. It's relevant because they're all on the same network, and on the same network the container names act as the links between them. The password is mypasswd — the same one we used in that command — then the username, and the database name is postgres, the default database that comes when you start Postgres.
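(The main-instance command he copies probably looks roughly like this — the container name n8n-main and the file name queue.env are assumptions for illustration:)

```sh
# main n8n process: editor UI, internal API and initiators; executions go to the queue
docker run -d --name n8n-main --network n8n -p 5678:5678 \
  --env-file ~/.n8n/queue.env n8nio/n8n
```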
I could create a database called n8n, for example, and switch to that one — I just kept the default settings. And there are two very interesting points here. The first is the execution mode: I'm telling n8n — hey, you're no longer running in that standalone, bundled mode where everything lives together; you are now relying on a queue. And to use the queue we need Redis, so I'm telling it that the Redis host — the Redis machine — can be found at n8n-redis. With that, it's able to connect to Redis and exchange information. So, a quick recap of what we set up so far: I started Redis, I started Postgres, and I started the n8n instance that we can see here, working just fine.

Awesome. I see a couple of interesting things happening in that slide. I think my takeaways are: we basically have to create this file, and if we're using a Docker network like you did — with the n8n instance also running inside that network — for the host I can just use the service name, like n8n-postgres. But let's say my Postgres database or Redis were somewhere else, on different servers: I would then point the host at something that looks more like a URL, right?

Exactly. Let's say you're using Amazon Web Services' RDS — the Relational Database Service — then the host would be something like, I don't know, rds-3023.amazonaws.com or similar. You change the host to point to the DNS name where it's hosted.

Exactly. Awesome.
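(Pulling together everything just described, the environment file would look roughly like this — variable names as in the n8n docs, values matching the demo:)

```sh
# ~/.n8n/queue.env  (file name assumed)
DB_TYPE=postgresdb
DB_POSTGRESDB_HOST=n8n-postgres    # container name doubles as hostname on the Docker network
DB_POSTGRESDB_USER=postgres
DB_POSTGRESDB_PASSWORD=mypasswd    # must match POSTGRES_PASSWORD above
DB_POSTGRESDB_DATABASE=postgres    # default database shipped with the Postgres image
EXECUTIONS_MODE=queue              # don't execute in the main process; publish jobs to the queue
QUEUE_BULL_REDIS_HOST=n8n-redis    # where Bull finds Redis
```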
So I see a couple of questions coming in, which is great. Let's start off with Teflon Dude, who asks: what do you consider to be "lots of executions"?

Okay, lots of executions. I'd say over a million a month would already be a lot — that's a good amount. I think that up to a million executions a month, as long as they're more or less evenly distributed, n8n should be able to handle. Naturally it depends on the number of nodes you have in your workflows, but for basic usage I think one million executions per month should be easily doable. That said, there are a lot of settings that you can tune in n8n. For instance — and this is something that already existed before — n8n can run executions in a separate process, or inside the main process itself: instead of spawning another process just to do the job and then report back, n8n can run everything built in. This is still without the queue system; it's just a setting in n8n, so n8n can change this behavior and be more or less performant, but it has drawbacks, for example in reliability. It's something heavy users should look at if they have a lot of usage. When you run in a separate process, you have more stability in your system, because a separate process is something you can control more easily — you have power over that child process — but it takes more memory, and it's a bit slower, because spawning a new process takes time; that's part of the operating system's job. Running in the main process requires less memory, because you don't spawn a new process, but you can't control it as well, because of how JavaScript works: you can't really tell a workflow to stop, for example, if it's already executing — whereas with a separate process you can simply kill the child process. Maybe I'm going too deep into this, but you really can tune n8n to improve performance depending on your needs. So I'd say it all comes down to testing and seeing how n8n behaves in your deployment, in your situation — because instance size matters, the amount of RAM matters; there are many factors to take into consideration.
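(The tuning flag he's describing, as named in the n8n docs — a sketch, separate from the queue setup:)

```sh
EXECUTIONS_PROCESS=own    # each execution in its own child process: more isolation, more RAM, can be killed
EXECUTIONS_PROCESS=main   # run inside the main process: lighter and faster, but less control
```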
Cool. I see another question from No Code, who asks whether that Postgres Docker command will take care of the version automatically — I wonder if he's asking about the "13 plus" that you mentioned.

Yeah, probably — and actually yes, because the latest stable version of Postgres is version 13. With this command I did not explicitly specify any version, so it picks up the latest one, and the latest is already Postgres 13. So yes, it's covered.

Awesome. I see another question from Amudhan, who asks: what are all the options that I can define using the env file?

Interesting. So, in the docs we have a lot of options, and you can also check them in our repository — there's a file called config.ts; maybe we can share that link in a minute — but they are also in the documentation, so you can find them on our website. There are plenty of options. For example, you can say that your database connection goes over SSL, so you can set a certificate for that. You can tell n8n which host it's running on, so it exposes the correct URL — okay, that was a bit vague: when you're running in a system with a DNS setup, like our cloud instances, for example, you have a DNS name, and you can inform n8n of the correct URL to use when connecting to external services. So there's plenty you can configure. And in regards to performance, it's the flag called executions process, which can be set to own or main, so you can switch between the two and see the differences. I hope that was clear — there are a lot of things to talk about, and I don't want to spend too much time, but I also don't want to give a vague answer. Let us know in the comments if something is still unclear.

And I've shared a link to the scaling documentation as well — that has a list of the different options you can set in the env file. Then I see one more question from Teflon Dude, who asks: since the env file is passing environment variables through to the Docker instance, could these values be set using export commands in the user profile?

Definitely, but then you'd have to proxy them to your Docker instance. Let's look at the terminal for a second. If I use export name=omar here, and then run env — env is a command that displays all my environment variables — we can see that name is set to omar. And when I use docker run, for any command, I can use — if I'm not mistaken — the --env flag. If I pass it with a value, say name=demo, the Docker container will have an environment variable called name with the value demo. But if I simply omit the value, it will use the name variable set on my computer. So you can set variables in your own profile and simply proxy them into the Docker container — that's also an option.

Awesome. And I see Amudhan wrote back "thank you", so I guess that answers his question as well. One last question from Teflon Dude: is the value of the queue Redis host dependent on a DNS name — could it be an IP?

Oh, it can be an IP, sure, no problem. I'm just using names because inside Docker the containers get internal IPs, and using names lets the Docker networking services resolve them to IPs internally, so I don't have to worry about it. I could look up the IPs myself and use those, but that's a lot more work. And also, if you have separate computers on the same network — I'm talking to you from my laptop, but I'm showcasing the presentation from my desktop computer, so those are two separate machines — I could use the internal network IP, for example. So yes, that works: it accepts DNS names, host names and IPs.

Awesome, sounds great. And that's all we have for questions right now.
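(A runnable sketch of the proxying trick just demonstrated — the NAME variable and the alpine image are just for illustration:)

```sh
export NAME=omar

# value given: the container sees NAME=demo
docker run --rm --env NAME=demo alpine env | grep NAME

# value omitted: Docker copies NAME=omar from the calling shell
docker run --rm --env NAME alpine env | grep NAME
```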
If you have any other questions, please drop them in the comments. So I guess we could continue then?

Sure, awesome. So, we can already see our n8n instance, and if we browse to localhost:5678, this is the one we have here — let me close this other tab. We can see that the instance is running, and I have already prepared this workflow for us. If you want to check it out, it's also available on our website in the workflows section — I think we'll add the link for you in a moment. It's a very basic workflow, containing two different types of triggers: one is interval-based — a time-based interval trigger that starts an execution every one second — and the other is a webhook: whenever it receives an HTTP request, it starts an execution as well. The interesting thing is that both of these lead to the same node, a Function node, and the output is pretty simple: we just return a simple object with name omar, age 34, city berlin. And there's one tricky and interesting thing here: console.log('Execution is working'). Why did I add this? Usually, if you do something like this — say, here in my browser console I do console.log hey — the information gets printed to my console. But this code will actually be executed in n8n's backend, and what is the console for the backend? It's the terminal. That's one interesting thing you'll see in a second — it will help us illustrate what's about to happen.

Okay, let me pause the interval trigger for a second, just so you can see this more clearly. I'll get the URL for this webhook and activate it — oh, I need to save it first... rename the workflow... oops, there's a typo... okay, sorry — and activate. So now the workflow is active, and it should be able to receive HTTP requests. So if I try to open this... ah, nothing happens. Why would that be? Well, it was actually expected, because we told n8n that it should be running on the queue system. So what happened is that it received the HTTP request — and if we go back to the presentation: something happened here, at the webhook layer. It received a request, looked at it and said — oh okay, this is valid, I have to run an execution — and it posted a message to Redis, and it's now waiting for a worker. But we never started a worker! So we'll be stuck waiting for the job to execute until we start one of the worker processes.

So let's do exactly that — now is the time to start the worker process. It's a command very similar to the one we used before: we're running the same Docker image, n8nio/n8n, I'm just naming the container differently — it's now called n8n-worker-1. It uses the same environment file — why? Because it needs to connect to the same Redis instance and the same database instance; these are resources shared by all of our processes, all of our containers — and it's connected to the same network. One slight difference is that I'm overriding the command that runs in this container: instead of the regular n8n start command — the one we know — it starts the worker process, so it does a slightly different job.
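(A sketch of the worker command as described — same image, env file and network as before; the container name is the one he uses:)

```sh
# a worker: no UI, no initiators — it only consumes jobs from the queue
docker run -d --name n8n-worker-1 --network n8n \
  --env-file ~/.n8n/queue.env n8nio/n8n worker
```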
How similar is the worker process to plain n8n? Is it something like n8n running in headless mode, without the editor, or does the worker look a bit different?

It's a bit different, but in essence that's pretty much it. We took one part of the n8n start command, split it out and created the worker. So it has a lot of similarities, but it doesn't contain the interface, the UI; it doesn't contain the initiator layers; it doesn't have the webhook registration, for example. It's only the working part that runs there.

Gotcha.

Okay, so let's run this worker. As you can see in this window, it started with ID 1 — this is the first message ever posted to this Redis instance. The first worker started, and what it did is find something in the queue: job ID 1. This is naturally a coincidence, because everything is at its first ID — so workflow ID 1 and execution ID 1. And then we get a cool message: "Execution is working". If you recall — what is this? Let's have a look: it's the output from the Function node. Now let's change it to say "hi", save, and refresh this screen: it says hi. I did this so you realize that when I was interacting with the n8n instance, I was actually talking to this container — the one running the n8n main process — but when the execution itself happens, it happens in this other container. What happened under the hood is: this guy sent a message to Redis; this guy received the message and said — okay, I know the workflow ID and the execution ID, let me get this information from the database; okay, I have the information, now I know how to do my job — it does what it has to do, saves the result, tells the broker (which is Redis) "hey, I finished my job"; this guy gets notified, and I finally get the information back in my browser, on port 5678, which is n8n's main process.

Now, the cool thing is that we can have another one. Let's start worker two — why not? There's another worker now; this one is also ready. And if I come here and refresh, the job went to this one; if I refresh again, it's now here. And the cool part — remember that I added an Interval node? Let me unpause it and save, and you'll see that this guy is generating a lot of jobs: it's posting messages to Redis, and these guys are actually doing the work. If you look at the numbers, this one is getting all the odd numbers and this one all the even numbers, because it's doing a round robin between them. And if you run a third one — no problem, there you go, another one in the pool. You can have as many workers as you want to split the load. And it's as easy as that: simply a docker run command — as long as it can connect to Redis and Postgres, it's able to run jobs.

There's one interesting thing I forgot to mention — let me deactivate this real quick. You see this message here: "User settings got generated and saved to config file". If we have a look at that config file, there is one interesting thing. Whenever you create a credential in n8n, for any service — let's say Airtable — whenever you save your API key, that key is stored encrypted in n8n's database, which is now Postgres. So the encryption key needs to be shared by all of these guys, so they can all decrypt this information. With the deployment I did — and this is something we won't cover right now — I'm not sharing that key, so each container that starts generates a new one. In essence, if I had any credentials here, it simply wouldn't work, because one instance would save them with its key, and each of the others has a different key, so they wouldn't be able to share the information. This is something you'll be able to find in the docs — actually, you can't at this moment, but we'll be adding it, so you can see how to share the encryption key. It's very important that it is the same one for all of the different processes. Was that clear, what happened here? Send me questions if you have any.
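(The sharing itself wasn't shown in the demo. One way to pin the key is the N8N_ENCRYPTION_KEY variable from the n8n docs — treat its availability at the time of this stream as an assumption — added to the shared env file so every container uses the same key instead of generating its own:)

```sh
# in the shared env file: same key everywhere, so all processes can decrypt credentials
N8N_ENCRYPTION_KEY=some-long-random-string
```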
Cool. So let's take a two-minute break and wait for our lovely audience to send in some questions. Meanwhile, I already see a comment from Teflon Dude, who says: awesome tip, using the console command.

Yeah, I use it a lot for debugging. There are some interesting things I do sometimes. Let's say I have a very complex workflow, with a lot of information flowing through it — say it posts data somewhere. One cool thing I like doing is adding a node with console.log(items), and letting the workflow continue. What that node does is print the incoming information to the console and simply return the same information — so it makes no change to what it received, but it lets me see, during the execution, what's happening under the hood. Naturally, after the execution finishes you can open the executions list and see what happened in each of the nodes, but this is another trick I use when I need to see what's happening internally, in between the nodes. So that's a nice use case for the console.log messages — naturally while developing, while changing stuff inside n8n core.
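(A minimal sketch of that Function node body — items is the variable n8n provides inside Function nodes:)

```js
// print the incoming items to the server terminal (the worker's logs, in queue mode)
console.log(items);
// pass the data through unchanged so the workflow keeps running normally
return items;
```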
I had one question: how many workers can you create?

Oh well, as many as your budget allows — if your credit card has a good limit, you can spin up the whole of AWS... no, I'm just kidding. There are some limitations, because right now we are using the main process to communicate with those workers: we have a broker in the middle, which is Redis, and we also have the database instance, and these need to be able to handle the load as you keep growing. In theory, with what we did, we can already scale the number of jobs that n8n can handle almost without limit — but remember that we still have a single main process. If we look at the design, we still have the triggers, the pollers and the webhooks — all of this still happens in the main process. So this guy is still not scalable: it's a single instance, doing the lighter part of the work, but the triggers and the pollers still run inside the main process, and that is not something we can scale at this moment. It's something we're working on — and we'll showcase in a moment how to scale the webhooks part. Also, increasing the workload will increase the amount of information being transferred through Redis and to your database, so you need to make sure those are scaling accordingly too, so that you can handle all the load.

Cool. I see a couple of questions coming in. One question from No Code, who asks: the server that you first ran with n8n start — doesn't that have a worker built in?

Yes, it does, actually. You mean this? Let me stop them all for a moment. If you run it like this, it's the very same thing we did with the Docker command, which is running the worker command. I'm not sure if that was the question, but yes — when we override the container command, if I don't pass anything, Docker would under the hood execute exactly this, and this n8n command translates to this file — so these two are actually the same. I simply override it, saying: hey, please execute the worker command instead. Please let me know if that was the question — I'm not sure I answered what you wanted.

Hmm — when we do n8n start and it spawns the editor UI, would it not spawn a worker itself as well, or is it required to explicitly add workers?

It's required to explicitly add workers, because — well, when you run npm run start, it starts all of the services, so it does start a worker, but it's built in. When I use this setting specifically — let's go back to the configuration file — EXECUTIONS_MODE=queue, it tells the n8n main process, let's call it that, that it is no longer running executions itself; it's sending them to a broker, because someone else will do the heavy job. So this is the setting that switches the behavior between executing workflows in the main process itself and sending them somewhere else.

Got it, that makes a lot of sense. And I see a message from him as well: thank you!

Perfect, no problem.

Then I see a question from Teflon Dude, who asks: theoretically, if you had an instance of Redis running on a cloud host, and the main n8n instance is also in the cloud, you could put workers behind a firewall and access data not publicly available, but expose it publicly via the main n8n instance — is that accurate?

Okay, let me just re-read it... they're both in the cloud, you can put your workers behind a firewall — yes — and access data not publicly available... I'm not exactly sure what you mean by "access data not publicly available", because — please, go on.

Yeah, I think it means: let's say I have a factory running here, which has sensitive sensor data, so I don't want to expose that to the internet — but maybe I'm fine with aggregating that information and letting it out through my firewall. Is that what you mean, Teflon?

Okay — if that's the case, then yes, definitely: you can have your servers behind the firewall, and you can have only the main n8n instance publicly available, if that's what you want. And now that you have a firewall, you can also restrict access to that one. The workers can definitely access private resources that are inside your network, safe behind your firewall. So yes, your workers can access that private data — and as long as they're able to connect to Redis and Postgres, they don't even have to be inside your private network.
They can be, for example, in another network that is not public — but yes, that can work.

Cool. Oh yes: "workers behind a firewall can access data behind the firewall" — yes, exactly; as long as they're reachable on the network, that should work fine.

Awesome. And then I see a question from Amudhan, who asks: is there an upper limit to the number of workers that can be used? How do I scale the editor UI? And along those lines, would I just be able to use a Redis and Postgres cluster for very large deployments?

Okay — an upper limit to the number of workers. To be honest, I've only tested up to seven or eight workers, and it all worked fine — but I tested on my own computer. I'm still planning to run some of these tests on a cloud infrastructure with plenty of servers, but so far I had no issues running seven or eight workers, and the CPU usage was really low, because they're so lightweight. About scaling the editor UI: that's actually not possible at the moment — but at the same time, that's one of the reasons we took this approach. When you think about scaling n8n, the heavy work happens in the workers and in the webhooks. That's the busiest part, because these guys do the job of connecting to third-party services, checking for conditions, waiting for things to happen — updating an Airtable record, posting something to Trello, uploading a file to Dropbox. All of those actions happen there, so that's where you need a lot of power. The same goes for the webhooks coming in from the network, because that's where the workload is generated: notifications come in, you have to look at them, decide whether there's something to do, and execute something based on the event. The pollers and triggers are also relevant, but they have a much smaller impact — and, if I'm not mistaken, we have only about nine trigger nodes in n8n and about seven pollers, so by node count they're a very small group. In the leftmost column, meanwhile, the internal API simply handles what the editor interface needs, and the editor interface is usually being used by, I don't know, one, two or three people. We still have improvements on our roadmap — for example user management, multi-tenancy and things like that — but right now, that part doesn't really need much scaling. The editor interface especially is really just a single HTML file with a few CSS and JavaScript files, which can be served via a CDN. So let's say you have a lot of people accessing your n8n instance: you can put a CDN in front of it to reduce the load from those static files, and then all you actually have to handle is the internal API calls — hey, give me the list of workflows, show me the credentials, open an execution, things like that — which probably won't need much scaling. That's why we decided to start with the workers. So, to answer you: scaling the editor is currently not possible, but it's something we're looking at as well.
And along these lines: would I just be able to use a Redis and Postgres cluster for very large deployments?

Yeah, most definitely — because we're taking the heavy-load part and making it scalable, and the part that doesn't do the heavy lifting is the part that's still not scalable at this moment. So we're pretty much safe for handling large amounts of data.

Cool. And I see a comment from Chris, who says: I'll have to try that console.log trick.

Let us know how it goes — it's a nice trick.

Awesome. So, before we get to the next question, shall we take a look at the next segment?

Sure, let's do it. Okay, so let's restart these. Now that I've already started the containers once — I just stopped them — I'll run docker ps -a, and pipe it through grep to filter the ones I want... exited eight minutes ago... I just wanted to remember the names. Where are they — worker 3, worker 2, worker 1, main. Okay: docker start for the main instance, same here — docker start n8n-worker-1 — and the same for worker two. I'm not starting the third worker, because I'll be using that window for something else. So we are back. The first time I ran these, I did not use the -d flag, because I wanted them attached to my terminal; now when I start them, they simply go off into the background, so I can use docker logs -f with the container name to see the logs again. Let's do the same here: docker logs -f — and another interesting trick that I like: I want that name, which is the last argument of the previous command, so I can use dollar-sign-underscore, $_, which expands to the last argument of the previous command — docker logs -f $_. Okay, let's go back to the workflow and activate it again, just to make sure everything is working — and we can see information flowing. Okay, we can deactivate this one now.

Next segment — okay, we're almost out of time, so I'll speed up a little, because this is the last segment. Worker processes: we've already done those. What we did so far is scale the rightmost column, the one doing all the heavy jobs. Now, the second busiest part of n8n is the webhooks, and this is the other thing we worked on. We can use pretty much the same file, and the command is almost identical to the one that starts the workers: instead of using worker at the end, I'm just using webhook. There is one important and interesting difference, though. This is a webhook process, so it's listening for HTTP requests from the external world — and when you're listening for requests, you need to listen on a specific port on that computer. The webhook processes spawn by default listening on port 5678, which is n8n's main port, so no changes there. The only thing I did is map it to a different port on my host machine: this -p 5677:5678 option says — okay, inside the container it thinks it's running on port 5678, and it will run like that, but on my computer I want it to take port 5677, because 5678 is already taken by our editor interface. So that's the only relevant change: the port mapping and the command at the end. Once we do this, we have another process running, and just like I explained: inside the container it thinks it's on port 5678, but externally that was translated to another port.
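(A sketch of the webhook-processor command as described — the container name is assumed; the rest mirrors the worker command:)

```sh
# dedicated webhook listener: inside the container it's port 5678, on the host it's 5677
docker run -d --name n8n-webhook-1 --network n8n -p 5677:5678 \
  --env-file ~/.n8n/queue.env n8nio/n8n webhook
```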
And if I check here and try calling port 5677... oh, I forgot to activate the workflow, I'm sorry — let me pause this one, activate the workflow, and... tada, there we go. What happened now: we got the error on the first call because the workflow was not active yet; once I activated it, it started execution 61, which began in this webhook process, and the actual job happened over here. So this request no longer goes through the main process — actually, the main process doesn't even know the request happened. If I refresh, you'll see 62 here and 62 here, but you don't see 62 in the main process: it's completely unaware of this execution, because we received the HTTP call in another place — another process, another container, completely separate from the main one. It did what it had to do, talked to Redis and the database, handed off the job, the workers did their part, and we never bothered the main process with any of it. So this is another layer of scaling n8n. And just as I invoked this URL by hand, it could be, for example, a webhook from GitHub or from Trello — that first example we discussed, moving a card from one column to another: that webhook could be fired and delivered to the webhook instance.

One interesting thing: this is a process that runs only webhooks, so if I try to open the interface here, there isn't one — it's intentionally a lighter process. It doesn't have the editor interface, it doesn't contain the internal API, it doesn't contain the webhook registration, and it doesn't execute jobs either. All it does is handle HTTP requests, make a decision about them, and start the workflows. And if I refresh both tabs, you see that we have 63 here, 64 here, 63 here, 64 here — they're almost perfectly paired. This is very interesting, because now we can also scale the amount of work n8n can take on from HTTP calls. So that's also very powerful.

That's awesome. I see another reason to get myself a Raspberry Pi cluster. [Laughter]

Yeah, that's going to be fun. Okay, let me go back to the slides for a bit. So, what we have achieved: we have scaled the worker instances — as many as we want — and we have also scaled the webhook processes, again as many as we want. To wrap things up a little — and we can go to the questions in a moment — some important considerations. We need to share the same encryption key among all the servers, all the containers: that's what I mentioned about the credentials — they're all stored encrypted in the database, so the key to decrypt them needs to be the same everywhere; we did not do this in this demo. The database and Redis need to be accessible by the n8n core, by the workers and by the webhook processes: all of them need access to this information. Sharing an environment variables file makes that easy — or, as Stefan (Teflon Dude) mentioned, adding the variables to your own user profile is another option; I used the file here so I could simply point every process I started at the same file and not have to replicate anything. And with all this information, I think you have almost everything necessary to deploy this to a Kubernetes cluster. I'm not an expert in Kubernetes myself, but definitely, with all we did so far, you should be able to deploy to a Kubernetes cluster if you have some experience with it.
Kubernetes is also something we'll hopefully create some interesting materials about. Awesome, cool. And without the encryption key, it still worked out because we didn't use any credentials, right? We didn't use any services. Exactly. If I had connected to a Trello account, for instance, that wouldn't have worked, because only one of the processes would have the credentials and it wouldn't work properly for the others. Awesome, good. I see a couple of questions coming in, so if you do have any more questions, please send them along in the chat. And while you're thinking about what you'd like to ask, I have another question for you: what would you like us to cover in the next live stream? We'd love to know, and if you have any special requests for a guest you want me to bring in, let me know that as well and I'll try my best to make it happen. Okay, while you think about that, let's take a look at the questions. I see Amit had replied to his previous question. Got it, thank you. And I see a question from teflon dude, which says: Redis contains jobs and Postgres contains workflows, is that correct? Exactly. The workflow information is all stored in Postgres, so whenever I unpause this and save it, it goes to Postgres; that is where the workflow information lives. And if I pause it back again, this is all saved in Postgres too. Also, when an execution is about to begin... let me stay with the workers for a second so I can show this. Let me unpause this, it's active, so let me just save it for a second. You see, the execution list will show... oh, did I... okay, yeah, it's okay. Executions... oh, I still have... oh yes, I did not really kill the workers, I just stopped watching the logs. docker ps... okay, now the workers are stopped, and there we go, now we should see them. There you go, they are now getting stuck, and this is exactly what I wanted to see. So, what we do is this: the workflow information is saved in the database, and the initial execution information is also saved to the database. What I mean by initial information is, let's say you've received an HTTP request: the query string, the relevant headers, the request body, all of this is saved to the database. Or let's say you received a webhook from GitHub about a new pull request, or a notification from Trello about someone tagging a card: all of this information is relevant to the execution that will happen. So we store all of it, as part of the execution, in the database, together with the workflow information. With this in hand, we post a message to the message broker saying: hey, please execute this workflow, execution ID, let's say, 104 here on my screen. That message goes to the queue so that one worker can pick it up, fetch the workflow data for it, glue the two together, and execute the workflow itself.
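If you want to peek at those queue messages yourself, here is a rough sketch with redis-cli. It assumes n8n's default Bull queue setup and key naming; the actual key names depend on your version and prefix configuration, so treat these commands as illustrative:

```sh
# list the keys the queue library maintains for n8n's job queue
redis-cli KEYS 'bull:*'

# show the IDs of jobs waiting to be picked up by a worker
redis-cli LRANGE bull:jobs:wait 0 -1

# the payload is deliberately small: essentially an execution ID pointing at
# the full data in Postgres, which the worker loads before running the job
redis-cli HGETALL bull:jobs:104
```

The design choice is worth noting: the queue carries only a pointer, while the heavy execution data stays in Postgres, so Redis stays light and the workers always read the authoritative state from the database.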
Awesome. Okay, I see another question from teflon dude, which asks: if a worker running a workflow spawns another workflow, does that workflow get spawned on the original worker, or does it get pushed to the job queue? It's executed in the same worker; it's not a new message posted to the queue, so this workflow inception works fine. There's a small caveat in this situation, though, because what happens under the hood (and this is very specific) is that this is a new execution that runs in the same process, and it's not something the queue knows about, and neither does the main process. It's only bound to that specific worker. So let's say I have a workflow execution running in this worker, and the worker started a sub-workflow via the Execute Workflow node: this new execution from the other workflow is only bound to this little guy, so n8n cannot really know where it is coming from. You might therefore see the status here as unknown; this is a new status that shows up sometimes, in situations where n8n cannot really tell if the execution is still running or if it crashed, because it's unaware of the actual status. So this happens, but it works fine: n8n just displays it as unknown while it's executing, and once it finishes, it will display correctly as success or error. Cool. I see another question from ahmedan, which asks: what does the "concurrency 10" that is displayed when a worker starts mean? Cool, that is a configuration of our queue system that says how many jobs a worker will take at a time. So if it's already handling 10 workflow executions, it will not take a new one from the queue. Let's go a bit deeper into some technical details here: Node.js, the technology n8n is built on, runs JavaScript, and JavaScript runs in a single process and a single thread. So when we say concurrency, it's not actually concurrent: two executions are never running at the same instant. What JavaScript actually does is pretend they run at the same time. It starts running one of them, and once that one needs to pause, say, to read something from a file or fetch some information, another one runs a little bit, and then another, so it's really interleaving multiple executions. That is what concurrency 10 means: the worker handles up to 10 jobs at a time, each of them making progress in turn, but never two of them working at the very same moment. If you're in an environment where you know there is a lot of I/O (I/O means input and output, like reading from files or making HTTP requests, actions that take time to complete while your computer sits idle just waiting for the answer), you can increase this number. Or, if you're in an environment where you're doing heavy computations and calculating a lot of stuff, then maybe decreasing this number can improve performance.
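As a concrete illustration, and assuming the worker command's documented --concurrency flag, tuning this when starting workers might look like the following sketch:

```sh
# default: the worker interleaves up to 10 jobs at a time
n8n worker

# I/O-heavy workloads (HTTP calls, file reads): raise it, since most jobs
# spend their time waiting rather than computing
n8n worker --concurrency=20

# CPU-heavy workloads (large data transformations): lower it, since the
# single JavaScript thread is busy computing anyway
n8n worker --concurrency=5
```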
I hope I clarified this question. Awesome. I see a comment from teflon dude, who says: yep, firing up my RPi cluster this weekend for this. Oh, that's going to be cool, let us know about it! Yeah, it would be really cool to hear what you come up with. Right, then we have a question from no code, who asks: how do we horizontally scale workers and webhooks across servers, like two instances? Interesting, yeah. So, if you use Docker containers, you can simply follow a procedure very similar to what we used today, where you have auto-scaling groups that are set to start containers on startup. You can have one auto-scaling group for the main process, which guarantees that one, and only one, main n8n instance is running (in our case, that's this first window). You can then have multiple workers, as many as you want; that's another auto-scaling group running this Docker image. On EC2 specifically, you would need some setup so that once an instance starts, it runs the Docker command, but you can also use, for example, ECS, the Elastic Container Service from Amazon. It's almost like Kubernetes, very simplified on the surface I'd say, but in the end it spins up EC2 instances on your behalf and manages all of them. So that is one way of making this work: you can have multiple EC2 clusters and multiple services running worker processes and webhook processes. I hope that answers your question, but my suggestion would be this: using bare EC2 instances would require a lot of work from you, and using ECS would make your life a lot easier. Awesome. Then we have one last question from teflon dude, who asks: is there a way to specify which worker executes a job? Currently, no, because the idea is to have a distributed system: a message is posted to the queue, and any of the available workers will pick it up, usually in a round-robin fashion. One interesting thing is that it's self-healing, so if one of the workers crashes, for example, or is unable to fulfill the request, the message remains in the queue and gets processed by another worker. That is something very interesting: as long as you have workers running, at least one, your jobs always get processed. But you cannot really route them at this time. If I'm not mistaken (we did a lot of research on multiple queue systems), I believe it might be possible, and our queue system provides something like this, but we are not using it in n8n at the moment. You could, for example, have separate pools: say I have an image-processing cluster, so messages for image processing are sent to that cluster, another cluster handles HTTP requests, another cluster handles something else. So you could have multiple clusters and pools of servers for specific workloads, but this is currently not the case.
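Coming back to the horizontal-scaling question for a second, here is a minimal sketch of the container side of that recipe. It assumes the shared env file from earlier and the stock n8nio/n8n image; the container names are just placeholders, and in practice an auto-scaling group would run the worker and webhook commands on each new server:

```sh
# main process: exactly one instance, serves the editor UI on port 5678
docker run -d --name n8n-main -p 5678:5678 --env-file n8n.env n8nio/n8n

# workers: run as many as you like, on any server that can reach Postgres and Redis
docker run -d --name n8n-worker-1 --env-file n8n.env n8nio/n8n worker
docker run -d --name n8n-worker-2 --env-file n8n.env n8nio/n8n worker

# webhook processors: scale the HTTP-handling layer independently of the workers
docker run -d --name n8n-webhook-1 --env-file n8n.env n8nio/n8n webhook
```

Because every container reads the same env file, they share the encryption key and point at the same Postgres and Redis, which is exactly what makes adding or removing instances safe.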
Gotcha, awesome. I think that's the end of our questions. And, you know, if you have any other questions about this topic, or if you go offline, try out these demos that Omar showed, and run into anything, please feel free to ask in our community forum. We have a very active community, and we'd be very happy to answer your questions over there. All right, Omar, any parting words? It was amazing spending this time with you and sharing this knowledge. I really loved answering your questions; it's great to see so much engagement. I'm happy to have been here with you again, and thank you for inviting me. We even went 20 minutes past our time, so I hope it was very useful for everyone. Awesome, yeah. Thank you so much, Omar, for presenting this in such an easy-to-consume way; I loved the demo setup as well, and I'm really excited to try this all out, hopefully on a Raspberry Pi cluster, soon. And thank you all for attending: I loved all the questions and the engagement we saw in the live stream. Excited to talk to you all again next week, and please drop by the community forum if you have any questions. Right, have a nice rest of your day. Thank you, everyone. Bye-bye, people. Thank you!