so I'm Greg we're gonna talk about our ends today I am a senior architect at state you now you can find me on Twitter or github all this code is up on github slides are in you can link to them from there so with that let's get going how many people have used an ORM how many people have ever built one or looked at the code for one or tried to awesome so for people who are familiar with an ORM there's three words so we're gonna clean all of them object is just what you think of in Python it's a in memory representation of some data relational refers to a relational database like my sequel approach res or sequel Lite or sequel server or cold whatever and then a mapper is something that translates between the two this is historically a somewhat challenging problem if enemies have a read there's a blog post Oh RMS or the vietnam of computer science some would explain some of the difficulty in building one this is from the Jeff Raines developer survey at the end of 2018 so not hot off the presses how many people have used sequel for me how many people have used the ginger around how many people have used anything else on this list awesome so as you can see there's a variety of em so they're all trying to solve the same problem they all do things slightly different ways this is intentionally small this is some of the things an ORM might be responsible for so we have creating a database defining tables creating tables inserting data foreign keys indexes stored procedures obviously getting data updating data deleting data as you get more advanced you have more like production grade system you'll need to do caching transactions support different database backends performance is important air handling logging all of that's important we only have 45 minutes and probably only about 40 minutes so we're gonna just handle a few things creating a database defining some tables and some fields in the those tables foreign keys which add a wrinkle to everything then we're gonna create some tables insert some and maybe even recent data just for sequel light so let's get going live coding is probably one of most dangerous things you can do in a talk so why not to hope with that though I wrote I developed something that I'm gonna call test-driven live coding so I have a bunch of test cases and they're gonna fail and we're gonna try to fix them but before we get into that here's roughly from the readme what we expect the RRM to do so we'll define well import a bunch of classes create a database point into a sequel Lite file define a couple tables create those tables in our database creat some objects of these types and then save them to the database we're able to query for all of the records of a particular type fetch a specific one create an object that references an existing object so here rather than if you're not remember the data so a foreign key allows you to reference one table from a reference one table from another easy to do in objects because you just have a memory reference but in a relational database you have strict table structures so you can't like embed all the data in there you can but it's about of you can't you there so creating an object that represent all still storing that in a relational database table is a challenge so we're gonna get to that we're going to save this object with the foreign key and then also show that you can fetch an object and then dereference it without having to manually say pull the ID out get the object with the ID so with that let's look at our first test case we're gonna create a database so here's the test case basically we're going to delete clean up anything from the last test run that failed create a database and make sure that we have a connection to it so let's run that so first error we have is cannot import name database so start with class database and like any good test-driven development that's all I need to get that part pass all right now we need to actually be some work so we're gonna start with this we need to take an argument that is the path of the file and we need to import sequel 8/3 which is built into the Python standard library I'm going to avoid pep 8 and not have a lot of extra lines so you can see all the code never you see this okay is the text big enough in the back perfect alright so we were able to create a database verify that we have a connection now we don't have any attribute tables so in Python there's a thing called property if you're not familiar with this it allows you to run code when you access an attribute so we're going to call this tables and just to get our test passing we're going to return an empty list first test passed one down tend to go we'll see if we get how many we get through let's look at the second test now want to define our tables so just like in the readme here's the def we're going to import some things define our two tables one of the challenges in an RM is you have different data types so we want the name of an author to be a string we want to record the author's lucky number because I need an integer down for the post we're gonna have a boolean for whether it's published or not and also a foreign key so let's start running the second test okay so we don't have these three things to find so that's pretty easy to fix all right so we got past the import statement now we're trying to create a column object and it's saying that the column object takes no parameters so we need to create a constructor for this after that everything worked up through the foreign key so for foreign key we need to record what it is a foreign key to so there should be another table if you ever worked with an ORM sometimes you have to lazily instantiate but using a string to deal with circular Gabriel in ships all right Aurum is not going to handle that so we had to define the author table first and then we will define another table that is a reference to that self-taught table or this table okay so like I was saying one of the problems in an ORM is you have to map the in-memory representation to the relational representation and in Python you have strings and integers and floats and boolean's and a lot of a primitive datatypes a database has many of the same ones but they're slightly different not all the same not always called the same thing so we want to make sure that we know for any particular column with a Python type what the corresponding sequel type is and so for this rather than typing I'm just going to cheat and grab a bunch of stuff and put it in here and so like I was saying a Python int we want to map to the integer sequel or the integer type in sequel Lite for floats there's a real data type for string we use text for bytes will use a blob and for a boolean we'll use an integer because there is no boolean type in sequel wait so now I need luck this test will pass column has no attributes equal type oh because we define that mapping but we did not actually set it as a property and because we're short on time and short on screen space I'm noting the error handling I'm just going to assume that the thing you passed in is one of the types we know about to test passing alright hey making progress okay checking my time here so for the third test we actually want to create these tables in the database if you ever like if you whether you have or haven't user I'm how many people have written sequel before how and obviously I know there's trade-offs between an ORM and writing raw sequel but in order to write know where I'm have to write sequel so we're gonna create these two tables friends test3 there is no attribute create so we can go into our or into our database class and define a function in order to do this I'm just gonna create a quick helper function execute so this is going to be our main entry point for running any sequel commands against our database connection so we can just do self cond execute sequel and for things like creating a table we don't care about the return value but I promise in the future we're gonna want to return things so we return the result of this so then when we create a table we can just run so I just made this up this is a function that is a table has to know how to create the sequel to create itself so again I'm gonna grab statement here I have here so this is a regular standard C will statement create table the name of the table and then the list of the field definition or the fields with their types and so we need to be able to translate the version station we have of a Python class with attributes into an analyst defense so with that perfect we're trying to call this get create sequel function on the table one of the tricky things in designing an ORM is that you want to have a table class but then you want to have subclasses that are the individual tables but sometimes you need to call for call functions on the class itself not on instances of the class so Python has a way of doing this it's called a class method and this is gonna be called get create sequel that is not what I meant to do and so a normal Python function or Python method will take self as the first argument for class methods the convention is to call that argument CLS and then we're just going to return [Music] create table sequel format because if you noticed it's a string with these curly braces in it which in Python lets you substitute values in and right now we're not substituting any values in so when we run this it's gonna raise an error there's no name that we've defined so to do that we're gonna real quickly make another class method called get name and there's lots of thought on how to translate a Python class name into a table I'm just doing something simple which is lower case ingots so in Python any class has an attribute called double under name and then like I said version of lowercase it there's no reason we have to do this you could have tables in a database that have capital letters and in most cases it's case insensitive but convention is generally at least I've seen is for lowercase table names so name is going to be class get name and then the other thing that we're gonna need is fields because this is I'm gonna steal some steal some code here so one thing I didn't mention yet is that every table generally has a primary key and that's to say you can uniquely identify a record in the database some rooms require you to explicitly declare the primary key we're not we're just going to always assume that for any table there's an integer primary key that automatically increases for every when you put in the database so by default we're gonna have a ID field that's an integer primary key how many people have seen inspect in Python so inspect is a really cool module we're gonna import and it lets you loop through or otherwise inspect various Python objects and so in this case we're saying give me all of the members which is all of the functions attributes anything of anything that is a member of this class and we're gonna look for any of the fields or foreign keys that are defined on that class and so if it's a column we want to what am I trying say we want to have specified the name of the field and its sequel type so that we can put it into that sequel statement for a foreign key like I said we can't actually put the entire object into the table so we're going to create a field which is the name of the field so in this case it's the author of a post followed by underscore ID and we're always going to use an integer for the foreign key because of all of our IDs or integers again this is a simplification we're making databases let you do things are a lot more complex but we don't have time for that and so now that we have this if you look at the test you notice that the name of the field and its type just have a space between them and then there's a comma between each each of the fields and so in Python there's a super convenient way to do this called join so if you have any list and in our case we have two nested we have a tuple inside a list so this is we're going to take a space and put it between the name and the datatype then we're do this for each of the fields that we have and then for our actual sequel we're going to join them together with a comma and with any luck this test is gonna pass or at least got a lot further except I have a syntax error I need to close that all right so we were able to create these two tables in the database the sequel that or I'm generated is what it should because I know at this test ahead of time now we're looking again at this DB tables attribute and so we want to be able to query the database for all of the tables that are defined in it and so the way we're going to do that is there's a so this is the syntax for sequel light like I'm saying we're only supporting sequel Lite for my sequel or Oracle it'd be a little bit different but there's another sequel command you can run basically this is like a metadata table in sequel Lite so you're looking for all of the tables that are defined so instead of just returning this empty list we're going to execute select table sequel and this is gonna be a little bit farther but it's still going to fail because we are returning a cursor and what we want to do is oh I know we need to actually fetch the results for my name is query and so if you notice this we're looking we're expecting the list to have both of these things in it the way that the see the sequel at API returns results is a list of rows and so pull out the names we just need to do a little bit of a list comprehension here there we go there are tests third pass at passing tests only about halfway through the die okay so the fourth test is going to be creating fourth test is going to be creating instances of these classes one of the tricky things in our ORM is that we don't have any idea of what the attributes are going to be called so we can't we can't hard code these names so in order to do that I'm going to steal another snippet because I want to make sure we get as far as we can so the way that so so far all of the functions in the table class have been class methods because they're things we call it on the table itself the tables name how to create the table here we're finally wanted to create an instance of this table or an instance of an object that we're going to put into this table so this yeah whatever reason copying I'm oh it's always losing the last character I don't know sure why but thank you for pointing that out so like I said every object we create is going to have an integer primary key called ID but before we actually insert it we don't know what that ID is gonna be so we're gonna store an internal private attribute called data to store all the values of the fields on this table we can't do self named self that lucky number because we don't know what those are gonna be in the context of the ORM code but we can do is create a dictionary and then put all of the keyword arguments into values in the dictionary again we're not doing the error checking so you could put whatever a data you want in here it's not gonna get saved but it's not it doesn't do as much error checking so we do this and so we were able to create the object it took the keyword arguments and put it in a dictionary but when you are looking at an object and you access this name property it actually that name is already defined on the class and it's the actual column object it's not the value of that column in that object so when do another tricky Python thing I really should have titled this talk tricky Python things because cool little features in Python let you do this in a somewhat elegant way there's a magic method in Python called get attribute which is used when you look up an attribute on an object property is one way that it hooks into that but there's a lot lower level and this is one example so this is basically a function that gets called on an instance of an object or an instead of a class and it gets passed the name of the attribute looking for as a key I'm gonna steal this because it'll be easier to do that than to type out and I can explain it the problem is that if you try to do this for every attribute you're gonna break all of the Python or sorry all of the regular Python machinery so if you try to do something like this data equals self dot data and then just get the key so if we try to short-circuit what I had already written and try to return this didn't know what's gonna happen something called recursion so when you try to get this data up or when you try to get this dictionary that we defined it's calling this function itself and it never actually gets it or it's never able to actually resolve it to an actual instance of an object so we call it on the base object class and we can call get attribute and pull the data out we check if it's a thing that we know about if it's one of the columns that we've defined in our fields table and if not we again fall back to the default Python implementation so this will let us actually get the attributes off of the objects that we've created so like I was saying just to reinforce this the table has an active called name it's the column the instance of the object we actually wanna return the value for that object I was gonna say this in front actually forgot the main takeaway from this talk should be hopefully writing Nora rhyme is complex but it's not magic it is just Python code using built-in features the Python language to do some pretty cool things that make an easy user-friendly abstraction on top of something that's pretty complex with that we have four tests passing and we'll move on to test five so we want to save now that we've created an object we want to save it because what uses a database if you can't actually write anything's into it so we're gonna run if we run test five it's gonna fail because we don't have a safe thing define our save method defined in our database we're going to so we've defined this function so that we'll pass that's interesting so it was looking up the catastrophe sequel but it's actually using our internal get attribute code to do this why is it doing this I'm actually sure but what we need to do is grab this insert sequel command see it did it again it's not grabbing the last room so this is our inter sequel and we're going to okay so here we have the author object has no attribute get insert sequel so in this case we actually have an instance of the object so we're not gonna use a class method like we did for the other the other sequel functions we're actually going to define a get insert sequel function on an instance of this class and we're going to return we're gonna return their sequel cool so we have the sequel statement if we go and look at it we need in the name of the table again we need a list of all the fields that we're gonna set on that table and also values that we're gonna put in to the table so we here we can just say name is in this case because this is an instance method we can't use class Doggett name but you can use double-under class to get the class of any particular pythor object in Python and I'm gonna do the same thing yeah name we could also use the dub under name then lower here but since already have that defined we'll use this and then for now I'm just gonna put fields is an empty list placeholders is a nebulous just cuz I know the formatting is gonna fail if I don't do that so here we see where these things are substituted in to our query and I'm gonna go steal some more code because again I want to make sure we get so much to the can so again we're gonna use inspect to we're gonna use inspect to loop over all of the columns and form keys that we defined in this class and for any of the columns that are defined in this class we're going to add a field so we need to say say what fields we're putting in the database we need to have a list of the values we want to put in but who's ever heard of sequel injection so one of the ways of if you have an or a sequel injection it's basically a way of a malicious user can set data and because of the nuances of the sequel language it will treat data as a command and so we don't want users to be able to run commands on our database so a way of doing this is parameterizing queries and you put a placeholder and you clearly say to the sequel light this is the command and here's the data and it won't it's written so that it won't ever end or put it in the wrong way sorry it's not a great explanation but so we have a list of fields we have a list of placeholders and list of values so placeholders we're just going to put a bunch of question marks and that's kind of the that's the symbol that the sequel ID engine uses for where it's gonna substitute values in so we just need a list of question marks for each field that we're putting in and then the fields is again and with this this should work oh ok yeah so we're returning the sequel statement which is like I said the first argument we passed the execute we also need to return the values and so up in our database class I'm actually gonna steal this out you can actually pass a sequel command and a set of data to the sequel Lite execute function and in order to handle that because we have some commands that just take sequel and some that take sequel and arguments I'm just gonna put this params in here it's a deep it's gonna have an evil value of none and if we define it we're gonna call execute with sequel and the parameters and otherwise we'll just return execute the raw sequel as we have been doing before it's that okay so we were able to ensure that our ORM is creating the right sequel and passing the right arguments to the database engine I said before we want to make sure that every row has an ID and that ID should be unique and we're gonna let the database calculate it for us so we can't do that in Python we need to rely on the database telling us what what ID was given to that I that row when it was inserted in the database and the way we do that is when you when you execute an insert statement you get back is a cursor and then the cursor has an attribute last row ID and so we can use that to set the ID of the instance that we're saving to the database there we go another chest the pass halfway there test 6 already passes if we go look at test 6 it's really just adding in some more data apologies to any Blue Jackets fans but I just moved from Nashville and so these are all predators players so we're gonna save some more authors into our database now we need to query the define this all function that will query the database and pull out all of the objects of a certain class again we're gonna skip a bunch of steps because I want to get as far as we can we have another sequel command that does select all you'll notice that there's a pattern to this so for every database operation there's a template sequel string and then ways that we determine based on the definition of the table and the definition of the particular instance the fields on the particular instance how we format that interesting that we pass to the database engine so freely select all sequel we just want to list of the fields and we want the name of the table so again I'm just gonna cheat a bunch and put some of this code in here so in this case if you noticed in the call we actually passed the name of the table so in this case they get select all sequel as a function we're gonna call on the table class itself just grab that too so when a so we know that we always want to get the ID like I said we're making an assumption that every table has an ID and then for the rest of the fields in the database we want to append that this is another fear that we might get in the database thank you so like I was saying this is a thing we call on the table itself so it needs to be class method thank you and we also need to return the field because once we at the I mentioned before when you execute a select statement you get a list of rows without knowing the names of the columns there are ways around that but for host of this we're gonna assume that we need the fields and so we return the sequel and the list of fields that we want when we execute we're only going to pass the sequel because it's select these fields from this table but in order to if you remember when we define the constructor for our table instances it's a bunch of keyword arguments and so this is another bit of like weird Python syntax it's actually powerful fields will be a list of the field names and then as we loop through each row row is a list of values for those fields we've kept them in order so that we can just zip them together which creates a list of pairs name and value the dict function in Python can take a list of those key value pairs and create dictionary out of them and then you can use this star star operator to unpack that into keyword arguments that get passed to our constructor it's a bunch of magic if you've never used it before but it's really powerful and let you regulate come back to code so in this case table is a class so we can instantiate a class using these keyword arguments and append this to a result set that we return and with that our test seven should come pretty close to passing and in fact it passes awesome I've got a few minutes left so I'm going to and keep going test eight is going to be getting a object with a specific ID so this is the first case of a where clause so we're able get all the objects now we want to be able to get a specific object where clauses are more complex that let you do comparisons greater than less than camera for more than one we're just gonna focus right now on getting a single object by ID in order to do that again we're gonna grab a select where a hard-coded sequel string and select where is going to be very similar okay so we're when we run this we need to define a get operator for Farrar I'm on the day base class it's gonna take the name of the class you want and the ID so we define that pretty quickly it's a lot quicker to copy and paste and actually write code and again so this is gonna take the table object itself and the ID going to again build a sequel string and in this case we need both the fields that we're getting because we need to use that to populate the entire object and we need to combine that with the placeholders approach that we used before to avoid sequel injection so which kind of building up layers here but it's a set of common patterns like I said the list of fields and the parameters to avoid sequel injection instead of using fetch all we're gonna use fetch one which will get a single object in this case that's okay because we know that the ID should be unique if you're doing a more complex where you would want to iterate over all the rows and return multiple ones kind of like we did with select all same pattern we're gonna zip up the list of fields and the row into a dictionary and then return a table that was created from those parameters I don't think we so need to do is this get select where sequel and this is gonna follow a very similar pattern that we've seen that's method and so here it takes a variable number of arguments because we eventually want to be able to support multiple values multiple multiple things you're filtering by but in this case we're just going to hard-code if you noticed up here we're just hard to actually close some of these we're just hard coding ID equals ID but it could potentially take different arguments that you pass in because there might be more than one filled so for each filter I can look at the test and we want this Clause to look like where the name of the variable equals a parameter and again we're gonna pass in the parameter to avoid sequel injection if there's multiple of these we join them together with and so ID is this and something and something like robusto arm allows you to do or queries and this boolean logic and so way more complex than we have time for but I'll see how close we're passing this test all right it passes running on time so I'm just to wrap it up right there all of the code including the three more tests are in the code on github and if you have any questions feel free to reach out or talk to me in the halls or whatever real quick I just wanna show you what the tests do Oh so the fun part is actually converting an object to get its ID and then putting it in into the ID field and then later actually when you pull out a row that has an ID column knowing to go in and look up that object by itself so you can dereference author from the post things that we didn't get to joins lots I mean all kinds of fun stuff around that but I'm hoping this was a useful kind of like demonstration of the fact that it's not magic it's using some common patterns and you can do it don't ever use an arm that you write people put a lot of time into I'm making it more bus and seek walk me and Django and all those are the two I have experience with I'm sure the other ones are also very good but go forth and write databases use Davis thank you [Applause]