Transcript for:
Lecture on Deleting Code

Okay guys, let's get started. What I'm going to be talking about today is a bit different from most of the talks at conferences. How many of you have been to a talk about writing code? How many have been to a talk about refactoring code? As we're going to learn, most people don't refactor, they refuctor. I want to talk to you guys about deleting code. How many have been to a talk before about deleting code? Oh, actually, apparently there is one. I've never seen one before. How many have put through a pull request that removed more code than it put in? And didn't you feel good? I spent my entire morning today actually building up an entire slide deck. One of the beautiful things about deleting code is it allows you to change your mind, and I actually changed my mind today about using my entire slide deck. I literally built 40 slides and decided I'm not going to use them, so you're only going to get this one. And this one, does anybody know where this comes from? And I really don't believe many of you are doing octal patches these days. This is actually from the waterfall paper, and the waterfall paper is one of the greatest ironies in our industry. How many have heard of waterfall before? How many have actually read the paper? One hand, two hands. Oh sorry, didn't see you, Cassia. The waterfall paper is one of the biggest ironies in our industry. So if you go and read the waterfall paper, and I really recommend people do it, it basically describes what you know as agile throughout the paper. The very first page describes what you currently call waterfall, and this is what happens when you turn from the first page to the second. What it basically states is that in his heart he truly believes that waterfall is the optimal way of building software; the only problem is it doesn't actually work. There's a paper that's like this, and it really gets into what we're going to be talking about today. Have you heard of the big ball of mud? Yeah, it's a nasty thing, you don't want it. How many have actually read the Big Ball of Mud paper? Oh,
again, we have like three or four. What's interesting is the Big Ball of Mud paper basically argues a big ball of mud is inevitable. In fact, it's optimal. You will be stuck with a big ball of mud due to the economics of software. If you are not ending up with a big ball of mud, perhaps you work in the craftsman industry; you're basically gold-plating outhouses. What the Big Ball of Mud paper basically starts talking about is that you need to go through and start making small pockets inside of your ball of mud. You cannot deal with one giant ball of mud; you have to use many, many little balls of mud. And this leads you towards the concept of writing code for the purpose of deleting it. The idea is I can walk into any one of these areas of code and I can burn it to the ground when I don't like it anymore. How many have had to put a feature into software before where your current model didn't quite work well with it? However, that current model is tangled throughout everything, and you end up spending two weeks on something that would have been really easy if you could have changed the model, but you can't, because you have no idea what this may affect. What if you were to optimize from the very beginning to be able to delete code? This is not a new concept. I really learned this lesson when I got into Erlang. When you build a system in Erlang, you conceive the system as a series of very small programs, and you'll find if you start working in Erlang you almost never go back and change code. If I have a feature that I need to put into one of these small programs, I normally rewrite the program. This is a wonderful thing to have. How many of you are afraid to delete code in your system? Maybe you have some unit tests around it, but unit tests only show that the tests pass; they don't show a lack of bugs. How many have had a system out there where you actually had people relying on your bugs? Microsoft is wonderful for this. They literally have bugs that are currently backwards
compatible from Windows 3. But they can't get rid of the bug because people depend on it. What's interesting for me is when we start getting into Erlang code, it's completely different from the code you guys are used to working with. And Erlang is becoming the hot new thing, and no, I don't think people will actually be coding in Erlang. What people are doing is they're taking the lessons of Erlang and applying them in other places. How many have heard of the newfangled thing called microservices? So there's another word for them: they're also called objects. And if you're doing proper object-oriented programming, and when I say proper I don't mean C++, I'm looking at going all the way back to Smalltalk. If you look at how people actually coded in Smalltalk, they were basically doing microservices. When we start talking about microservices, we are envisioning our system as a series of very small programs. If I have a proper microservice, I've got no problem going in and deleting it. It's very rare that you refactor. How many of you refactor today? How many use refactoring tools? So what's interesting for me is that most people don't actually understand what refactoring is. When I refactor something, the very definition of a refactor is I either change my test or my code, and I only change one out of the two. How many of you have refactored where you're changing tests and code at the same time? Okay, that's called a refuctor. You are screwing up. So the whole benefit of refactoring in TDD comes from the idea that one side stays stable and the other side pivots. Yeah, that's a really different view of TDD, isn't it? If I, let's say, refactor my test, I move something into the setup, my code stays stable. So if the test was green before and it's green after, I have a set of measurements that I did before and after I made my changes. That's good. If I refactor my code, my tests stay stable. My tests ran before, they run after. It's a measurement; I'm predicting what will happen. Most people aren't doing
this though, and the refactoring tools even push you to not do this. Most refactoring tools try to get you to change your code and your tests at the same time. I think I'm going to call this Greg's law: I can actually look at your software and I can know what tools you used, because your tools actually affect your code. How many of you have worked in a system where F5 step debugging was the only way of fixing problems? It was probably in VB.NET. If you go back and watch when that software was being written, guess how the original developers did it. And now there is no other way of dealing with the software, because that's the tool they were using, and it comes all the way out to the end software. This is common. People that are using big IDEs, I'm looking at all of you. I know of exactly three people in the room right now that are using Vim. If you use a big IDE, that affects the code that you're building. All of this stuff affects the code that you're building. If I were to go, for instance, and use a big IDE, and I'm sure most of you have used an IDE at some point, have you ever tried using that code outside the IDE, and it doesn't work very well? It's the same type of thing with the F5 debugging. Let's come back to Erlang again. So when people are using a tool such as Erlang, they no longer consider their problem to be one big program; they look at it as being many little programs. How many of you have used Linux before? Do you really think that we could have a 12 to 18 month project to rewrite ls? These are not new ideas. The Unix way of doing things is also the microservices way, which is also the Erlang way. How many would be afraid to delete all the code for grep and rewrite it from scratch? Okay, to be fair, grep has a lot of options; it's kind of lost its way. These are small programs, and we compose them. When we start wanting to talk about building software and optimizing for its deletability, what we focus on is making small programs. You would be amazed at how liberating it is to know at
any point in time that you can walk into a piece of code and you can delete it, rewrite it from scratch, and it's a one or two day problem. It's not a 12-month rewrite. The ability to rewrite code is extraordinarily valuable, because what's going to happen is over time you're going to get features being brought in, and sometimes a new feature doesn't fit well with what you were doing before. I actually just ran into this probably about two or three weeks ago working on Event Store. I've been working for like the last two or three months, and I feel really bad for James Nugent right now because it's a 10,000 line pull request that he has to review. He'll enjoy that, I'm sure. But what I've been adding is competing consumers, so basically how RabbitMQ works. And I wanted to add in some new features to competing consumers, and I found it did not work well in my model. Like, it just didn't fit. Now, you can imagine, I delivered the first version to our customers probably about two weeks before, and I decided I was going to rewrite the entire back end. That sounds risky, doesn't it? Over the course of two days I rewrote the entire back end of the system, and you know what? When I went to go put my features in, it worked perfectly and it was beautiful. My guess is it would have taken me two weeks to go back through and get the code that was there to have the new features inside of it. This is common. If you don't have the right model and you try to add functionality, you run into issues. You can oftentimes end up spending more effort trying to get your bad model to have the new features than just writing a new model from scratch. So the big question is, how do we optimize for deletability? And the way we optimize for deletability is we start moving away from monoliths. We try to optimize for decoupling. How many of you run coupling analysis on your software? There are a lot of great tools for doing this, Sonar and NDepend, and they give you a lot of valuable information. For me, the right size that I want, and I know
you guys all know microservices. How many have heard a definition of microservices, by the way? My big question is, can you separate object, microservice, and actor? I've yet to get a real answer to that question. But how do we find the Goldilocks zone? When we start talking about a microservice, you don't want them too big, you don't want them too small. I've heard rules of thumb: you should never have more than 200 lines of code in a microservice. This reminds me, in my first job we actually had a rule that you could not have more than 24 lines in a function. Do you know why? We were actually coding on VT terminals, so you could not have a function that was bigger than the screen. That's an absolutely insane rule, because I can give you really quickly a great system that will break it. What if I were doing options pricing inside of my microservices? I've got one method called price option, except I want to price it on a video card in CUDA. So I've got, I don't know, 30,000 lines of code behind the one method that actually runs on the video card and does the options pricing for you. Is that no longer a microservice? Finding the Goldilocks zone is really the hard part when we start talking about this. And my rule of thumb is: you should not end up with one of these microservices, or we can call them services, or we can call them actors, or we could just go back and call them objects, you should not end up with one of these that's more than a week's worth of work to rewrite. What that's saying is that my personal risk at any point in time that I want to do a full rewrite of this thing is one week. If later I have a better understanding of my problem, it's a one-week rewrite. How many of you have worked with a bad model? How many have gone and then said it's a three to six month project to fix it? How do you explain that to the business? So I need six months' worth of work that you're going to see no outside benefit from. We will not add any new features, we will not do anything, but we will lower our
technical debt. Because business people certainly understand technical debt. They understand this, it's a really clear concept for them. I can't explain that. So my goal is I want to optimize for my deletability. I want to optimize to rewrite things without a risk greater than one week. And keep in mind this is a rule of thumb. This is not a hard concrete rule that you should go off and say, but Greg said we should never have functions more than 24 lines. There are times where you'll end up with a single service that does actually have more than one week of work behind it, but in general you should not find yourself doing this. If you find that you've got more than one week's worth of work, this should be a flag to you: this is a high-risk area of code. This ability to burn what's there to the ground and start from scratch is massively valuable. When I start finding that my models are wrong, I delete the code. It is not a huge risk from my perspective. It also allows me to do a lot of other interesting things. How many of you have heard of technical debt before? Technical debt is bad, right? I've always loved people that said technical debt is bad. How many of them have a mortgage? Do you think there might be a reason why they called it technical debt? Is debt bad? Debt allowed you to buy your house even though you didn't actually have all the cash, and you're going to pay it back over time. Otherwise you'd still be living in an apartment. Debt is not inherently a bad idea. Of course, every once in a while you might find the 22-year-old that makes a hundred thousand SEK per year and drives a Ferrari. Yeah, that's probably not good debt. The same is true with technical debt. So if I go through, I can get something for the debt that I'm taking out. How many of you have done a crap job on something in order to put it to production fast? Happens, right? I need this feature tomorrow. Why? Because if I don't have this feature tomorrow, we're going to lose money to a competitor. And it happens. I may put technical debt into
the system by doing a completely crap job on it, but I get time to market as a return, as an example. Do I care about technical debt as much if I can delete the code? If I've made things to the point where we have microservices, and I really hate that term, you guys already know everything about microservices; if you want to learn more about how to get them right, go read some Alan Kay and Carl Hewitt. A small microservice, let's say that it's at maximum one week's worth of work: how worried about technical debt are you really going to be at this point? If technical debt starts accumulating, delete it. I cannot stress enough how important it is to optimize your code for deleting. When people talk with me, they always talk about their 18-month project that they want to get in on. No, try a one-week project, and if you can't rewrite this thing in a week, you've failed. So my next question would be, how can you make it so you can rewrite the thing in a week? This ability to delete code, it removes your fear. How many of you have spent three months trying to estimate an 18-month project just to figure out whether it's 18 months or 24?
Now, just keep your hands up if you've ever done that. How many of you actually were correct in your estimate within even 50 percent? It's waste. Any time that we're doing something that's a 24-month project, an 18-month project, a six-month project, it's waste. Try working on one-week projects. Try working in code that you've optimized so you can delete any of the code at any point in time and it's a one-week project. And you'd be amazed at how good we are at estimating a one-week project compared to how bad we are at an 18-month project. Optimizing your code for deletability will also improve your velocity when you get into a piece of code. A one-week rewrite is a piece of code that's basically manageable in your head. How many have come into a new system before, someone else's code, or even worse, it was a junior's code, the junior being you a week ago, and your first five days you spend reading and trying to understand what's connected with what and how? Some projects it might be a month. I've worked on projects where it was a year. How do you understand a large connected code base? If I had something that could take you one week to rewrite from scratch, how long would it take you to understand that program? At most one week. By definition, if you can rewrite it in a week, you can understand it in a week. What we're trying to do is get things to the point that they are small, manageable, and understandable. The moment that you start optimizing for the deleting of your code, this is where you end up. And we're not talking about microservices or SOA, or, by the way, does anyone here know what SOA means in Dutch? I used to have a slide that I would put up; it was soatest.nl. It's a website. I'll give you a hint: they don't test your services. SOA means sexually transmitted disease. To be fair, once the middleware vendors got a hold of SOA, it's basically become one. But with SOA it's the same, it's the same thing with objects, it's the same thing with actors. Okay, actors had a
concurrency model, but what we're really coming back to is the 1970s: write small, manageable programs that coordinate to get a job done, as opposed to writing one big lump of crap. This is what people meant inside of the Big Ball of Mud paper. Try to get small projects inside of a big project. Try to keep things manageable. Again, my rule of thumb is about one week. If you go into an area of code and you say this is going to take me six months to rewrite, what are the component pieces that would take you one week to rewrite each? And I'm not saying that everyone should go delete all their code all at once. When we start optimizing this way, we end up in a very different place, and it's a much nicer place. By the way, how many of you have really heard of people optimizing for deletability as an architectural quality? It's an interesting perspective on the problem. Almost never will you be going through and doing refactoring. Refactoring is also known as delete all the code and start from scratch. When we talk about things like technical debt, don't worry too much about it, because you can delete the code and you can rewrite it in a week. When do you know the most about your project? Is it at the beginning, when you're doing your planning and trying to figure out how to do things, or is it at the end, after you've done everything? At any point in time I'm willing to delete my code and rewrite it from scratch. This is the Unix philosophy, this is the Erlang philosophy, this is microservices, this is SOA, this is actors. The problem is most people are picking up these ideas and never understanding the fact that the goal is to delete your code. They pick up microservices or SOA as a concept, and you'll walk in and find that behind a single service they've got 17,000 lines of code. How many of you can rewrite seventeen thousand lines of code in a week? Okay, to be fair, if it's Java code we can probably rewrite it in a thousand lines of Clojure. I cannot understand seventeen thousand
lines of code in a week. And it's important to remember, because people will be reading your code. Believe it or not, some poor sucker is going to have to go through your code in the future, even if that poor sucker is yourself. Optimize to keep things small and manageable. Going through and doing this will also help us in a lot of other ways. Don't focus on things and technologies. Don't focus on things like microservices. This idea is not new, and if you understood the Unix philosophy 30 years ago, you understand microservices today. Try to find yourself in the Goldilocks zone and focus on your rewritability, the ability to just burn the entire section to the ground and start from scratch. I cannot even explain to you, without you guys having actually worked with it before, how liberating this is. Your fear goes away. Can you imagine working as a developer and not being afraid? Not having all these tools to try to tell you what is hooked to what, and if I rename this database column, what will break? It's a completely different experience to be in these kinds of systems, where I can bring in a junior or a new developer and they can be productive their first day without having gone through and watched 19 hours of videos explaining how our system actually works. But again, there's absolutely nothing new here. It's just a different perspective on the same ideas. Go through and focus on keeping everything as small, independent programs. And I mentioned before that tools will actually affect your code: depending on the tool that you use, your output will change. Go learn Erlang. I can't stress it enough: learn Erlang. If you really want to learn object orientation, go learn Erlang. I know it sounds weird, because it's not an object-oriented language, but you will understand much better how objects work when you start looking at them from the perspective of being processes in Erlang. At the same time, it will actually improve your object orientation. And when we talk about these things,
nothing here is new. By the way, how many of you have heard of an aggregate before, from domain-driven design? Does that sound familiar to, let's say, a small process where there's one process per document in your system? You have consistency inside of it, you have no consistency outside of it, and in order to talk to it you send it messages. Have any of you ever read Alan Kay? How many of you work in an object-oriented language now and you've never read Alan Kay? He had a lot of interesting things to say when he defined the term object orientation. In fact, if you go all the way back to his definition of object orientation, an object is a little computer that you send messages to, to tell it to do stuff. The beauty of that is it's a recursive model. So if an object is a little computer I send messages to and tell it to do stuff, what's a big computer? Well, it's a bigger computer I send messages to, to tell it to do stuff, and it routes them to little computers inside the big computer. Nothing that we're talking about here is new. Whether we talk about objects or actors or services or microservices or components, it's all the same idea. We're trying to make little programs inside of big programs. There's a lot of good academic research you can actually go back and read. To be fair, Alan Kay actually said the biggest mistake he made with object orientation was naming it object orientation. He should have called it message orientation, because people think about it now as the objects, not about the messages that go between them. When you take this fundamental view, this is the exact same thing we talk about inside of Erlang, correct? Except we can replace the word object with process. And okay, Erlang has a concurrency model because it's actor based, with a single thread inside of each one of them. But what we're getting back to is the same idea, that each object, or each process in Erlang, is a little program that I send messages to and I tell it to do stuff. If you want to hit the
Goldilocks zone of these programs, they should take roughly one week to rewrite. And the way we conceptualize our system is as a slew of little tiny programs. How many have written a shell script in Linux before? Isn't it beautiful compared to Windows? Okay, PowerShell's coming along. We compose small programs in order to get behavior out of them, and our focus should be keeping the programs small enough that we're willing to delete them at any point in time. And I've mentioned it before, but I cannot stress enough how liberating it actually is to delete code. You are taking the handcuffs off of yourself. You're allowing yourself to do things you wouldn't be able to do otherwise. And you can do this in object-oriented code; you can do this in any type of code that you want. Okay, functional is a little bit different, but at the end of the day what we're working with is small programs. And the trick, the one big secret from a Michigan mom that 80 percent of software developers don't know, the secret to great consulting, is to never build big programs. You can literally make a career as a consultant telling people nothing but that. There are a lot of consultants that actually do that, who understand nothing but that simple idea. And you can find it historically; it's happened over and over and over and over again. Why? Because people don't understand it. The difference between great code and sucky code is the size of the programs, nothing more. When I have a hundred thousand lines of code, I am shackled to it. I will never understand a hundred thousand lines of code at the same time. It's impossible; you cannot keep that much in your head. There will always be subtle details that are happening, and the 3,000 unit tests around it really won't help you that much, because you won't be able to keep in mind all of the things that the code actually does. I need to get things to be smaller. There are some other benefits to having very small programs. How many of you have deployed things before? The big bang release is
always scary, isn't it? If you release a hundred thousand lines of code all at once, are you sure that's going to work in production? So I refuse to release more than a few thousand lines of code at a time now. I will never do it. What scares the crap out of me about releasing this size of a program is not that I'm going to go release it and I'm going to run into problems and it fails, because what I'm going to do is roll it back at that point. That's no worries. How many have released something and it worked great when you released it, and it died four days later? How do you roll back four days? You guys all write SQL migration scripts. How many of you write a SQL migration script from your new data back to the old schema? So the running joke, and I've done this with a number of teams, is at this point you either get to wear the cowboy hat or the fireman hat. Oh come on, you've all worked on production issues before. And isn't it wonderful: you're knee-deep in a production issue and someone comes in and they want to talk with you about the Christmas party. The idea is, if you're wearing the cowboy hat or the fireman hat, someone will walk over to you and just go, oh, okay, I know what you're doing, and walk away. I actually recommend it for teams. If I'm releasing 3,000 lines of code to production, can I be reasonably certain that those 3,000 lines of code are going to work? And this will come back to Paul's talk that he was just giving. If I'm only releasing 2,000 lines of code at a time, why don't I do that 50 times a day? It's relatively low risk. And if I conceptualize my system as being a series of little programs, could I run two programs side by side? Could I have two little computers that I send messages to and tell them to do stuff, and I put half the load on this one and half, oh wait, this is called a blue-green deploy, isn't it? We can take all the things we've actually learned on these big systems and apply them at very, very small scales. I may want to take 10 percent of
my load and put it on the new program that I put out. This is normally how you actually change these kinds of programs. What you do is you go in, you delete all of the code inside of it, you write it from scratch, and you push it to production sitting next to the old program. Once you're comfortable that your changes have not really broken anything, and maybe you have some tests around it as well, then you delete the old software. Today, there's literally nothing new in anything that we're doing. Understanding these very basic ideas can turn you into a highly paid consultant. No, literally, what most consultants do is they just explain this one idea over and over and over again, because teams don't get it. Teams inherently want to build programs that are too big and too complex. As I've always liked to joke, developers have this wonderful habit of solving problems that nobody has. I could do it in a simple way, but if I did it the simple way it's not up to my level of intellect, so I need to make it more complicated to make it a problem worthy of me. Don't fall into these traps. Always focus on dealing with your code so that you will never have a project that's longer than one week. You will be able to fail your projects or succeed your projects within one week, every time. You are optimizing for your ability to rewrite your system, as opposed to planning for change that will happen in your system. How many of you can predict what use cases your system will have to handle a year from now? Okay, how about next week? Next week I'm normally pretty good with; I'm not a hundred percent. I've actually challenged a lot of teams to do this: every time that you add an abstraction, I want you to create an options pricing model on that abstraction, and I want you to forecast the probability that someone actually needs the abstraction in the future. And I want you to measure yourself, and then I want to compare you against a dartboard, in other words a null hypothesis. My guess is you will be slightly better, maybe, but you
will not be much better than the dartboard. Or we could get the German octopus. Don't try to plan for future changes. Focus on the ability to completely rewrite everything from scratch when that change actually occurs. And I understand that you'll be writing a lot of really simple stuff. That's not a bad thing. Don't forecast out what changes may happen and try to get your model today to accept those changes in the future. Focus on making little tiny programs, and lots of them, that are very easily rewritable. Make the decision at the last responsible moment, as they say in agile parlance. But again, these are not new ideas. You can find this going all the way back to the 70s, and apparently developers are just too dumb to actually realize this thing is happening over and over and over again. We have different ways of explaining the same thing, but almost nobody gets it. How many of you have a program that would take you more than one month to rewrite today? Almost every system I go into has this. Even if you go into good code bases, they're usually too large. Again, the identifying trait of good code is small, isolated programs that can be deleted on the fly. Almost every single aspect of your system will improve if you think about it in this way. Now, with that, I will invite Martin up, who's going to do the closing, but does anyone have any questions just before we bring him up? I don't want to hold you guys back from beer. I know about getting between a Swede and their beer. So at any given point in time I should only have a small amount of state for any given program, and yes, you will need to have migrations, but your migrations will be much, much smaller. Also, on a per-program basis I can start looking at varying methods of storage. One that I like to talk with people about a lot is event sourcing, because if I'm actually event sourcing and I say my state is a first-level derivative off my log, then I have a lot fewer migration problems when I'm dealing with it. There's no way that you will be
able to store state, let's say in a read model that you're coming off of, we can just say it's even a file. I can't just say I'm going to change my file without writing some form of migration for it. But ideally I'm going to keep things small and isolated, so I'm not saying I migrate my database; this one thing might be talking to one table, if that makes sense. Conceptually yes, but they may come back to the same physical storage. I may have 13 little programs that are all talking back to the same SQL database, because I really don't want to have to manage 13 SQL databases in production, but conceptually they're 13 different stores and they can vary independently of each other. If you go down that road, be very, very careful about integrating the processes through the data. Ideally each process should have its own data, and if I want access to the data, I talk to the process; I don't go directly to the data. There are times where you're going to have to actually integrate through the data for performance reasons, but be very wary about doing that, because you're building in a fairly nasty coupling between things. How many of you have integrated through a database before? People will tell you that it's awful and you should never ever do it, but you know what, sometimes it actually works really well, and it's a form of integration. Understand that it's got its own trade-offs associated with it: you're putting in fairly nasty coupling. Has anyone ever worked in a company that had a rule that you were not allowed to rename a database column because no one had any idea what it would affect? But it is one form of doing the integration. Ideally you will do the integration through messaging, but there are times where you actually need to do it through the data itself for performance reasons. Schemaless can help. Anything that's weak schema is a lot easier to version than things that are strong schema. Another that I really like to use is what's known as a hybrid schema. So
let's imagine that I were to use an XSD that defined three or four things that were must-understand, but everything else that was there was dynamic. That's a very common way of dealing with things. Yes, protobufs, absolutely, it's something that can be shared. Normally I look at things like protobufs and I would consider that to be a library, and yes, libraries get used from many places. Many programs link to glibc, if we talk about it from a Unix perspective. And going back to your question, with weak schema another way of doing a hybrid schema would be, for instance, if I were to use protobufs and I were to say required on three things and optional for everything else. And that will again help you a lot in terms of your versioning of data over time. Any other questions? Okay, well then I will give it to Martin, who will close up the conference, and thank you guys for having me out.
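
The hybrid schema from the last answer (a few must-understand fields, everything else dynamic) could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses plain JSON in Python rather than XSD or protobufs, and the field names (`order_id`, `customer_id`, `amount`) and the helpers `parse_message` and `to_json` are hypothetical, not something named in the talk.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict

# Hypothetical must-understand fields; the talk does not name any.
REQUIRED_FIELDS = ("order_id", "customer_id", "amount")


@dataclass
class HybridMessage:
    # The strongly-typed part: every reader must understand these.
    order_id: str
    customer_id: str
    amount: float
    # The dynamic part: unknown fields are kept, not rejected.
    extras: Dict[str, Any] = field(default_factory=dict)


def parse_message(raw: str) -> HybridMessage:
    """Validate the required fields and carry everything else along."""
    data = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing must-understand fields: {missing}")
    extras = {k: v for k, v in data.items() if k not in REQUIRED_FIELDS}
    return HybridMessage(
        order_id=str(data["order_id"]),
        customer_id=str(data["customer_id"]),
        amount=float(data["amount"]),
        extras=extras,
    )


def to_json(msg: HybridMessage) -> str:
    """Write the message back out without dropping fields we did not know."""
    data = dict(msg.extras)
    data.update(
        order_id=msg.order_id,
        customer_id=msg.customer_id,
        amount=msg.amount,
    )
    return json.dumps(data, sort_keys=True)


if __name__ == "__main__":
    # A newer writer added "coupon"; this older reader still accepts it.
    newer = parse_message(
        '{"order_id": "o1", "customer_id": "c9", "amount": 12.5, "coupon": "X"}'
    )
    print(newer.extras)
    print(to_json(newer))
```

The design choice being illustrated is that a small program only breaks when one of its few required fields changes; additions elsewhere pass through untouched, which keeps versioning and migrations small.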