Transcript for:
Insights on Software Engineering Evolution

The entire history of software engineering is one of rising levels of abstraction. So what we're seeing here is the rise of another level of abstraction, which gives us all these extraordinarily powerful frameworks from which I can build systems, and in which, as I alluded to, the architectural decisions that were front and center for us back then are now embodied. So now the decision becomes: what cloud service do I use? What messaging system do I use? What platform do I use?

That's a decision which has a lot of economic considerations, not just software kinds of decisions, associated with it. So I think the role of the architect, in effect, has changed because now I'm dealing with systemic problems, not just software problems themselves. Grady Booch is a trailblazer in software engineering. He built his first computer 57 years ago, at just the age of 12, and is known for his decades-long work in advancing the field of software engineering and software architecture.

He is a co-author of UML and originated the term and practice of object-oriented analysis and design. He's an IBM Fellow, an ACM Fellow, and has been awarded several other prestigious awards for his work in software architecture. He's the author of six books and more than 100 technical papers on software engineering.

In this conversation, we cover the first two golden ages of software engineering, how UML was created, and why Grady disagrees with how it has evolved since version 1.0, how the practice of software architecture has changed over time, Grady's views on large language models, interesting stories like how Grady was offered to be Microsoft's chief architect but said no to Bill Gates, and a lot more. If you enjoy the show, please subscribe to the podcast on any podcast platform and on YouTube. It's safe to say I'm talking with a living legend in the field of software engineering, so welcome to the podcast.

Emphasis on living, I'm not done yet. Yes. Absolutely. So to kick off, you're a chief scientist at IBM. That's a pretty fancy title.

What does it mean and what do you do? I'm curious to know. Well, there was a time that my business card said I was a free radical, but upper management didn't like that.

So I had to find something a little bit more tame. Actually, the more important title slash position is that of fellow. There are, I think, 68 of us still active. No, 89 of us still active.

And this has been out of 350 or thereabouts fellows throughout the history of IBM. So we're a fairly rare breed. I was made a fellow upon the acquisition of our company, Rational Software, back in 2003. And the great thing about being a fellow is it's rather like having tenure, meaning: we trust you, you've done good things, we want you to continue doing things.

Let's give you the degrees of freedom to do that. And so, as a fellow, and with my focus upon first software engineering and then later upon AI, I am given a lot of degrees of freedom to pursue what I think makes a lot of sense to try to, as Alan Kay would say, invent the future.

So in my journey starting in 2003, I first stayed with the Rational division, but then quickly moved over to research because IBM's bureaucracy realized I was a person who worried about the next five to 10 years, not the next quarter. And indeed, my very early work in research was looking at finding ways to automate the discovery of patterns within legacy software systems. This is something we were doing pre-neural network days.

And it was interesting trying to see if we could discern the design patterns from the Gang of Four and elsewhere. It never went anywhere because it was a hard problem. And so...

I then began to, in the architecture sense of things, work with a lot of customers, and actually had been doing that for decades, where I'd be parachuted in. The customer would say, come help me, Mr. Wizard, with this particular architectural problem. And the exciting thing about it for me is that for many decades, I was engaged with projects across every conceivable domain.

And I'll pause there to say that roughly around the turn of 2010-ish, is when I began to be drawn back into the space of AI. So when you say you worked on legacy systems, what does a legacy system mean to you? Well, the moment you write a line of code, it becomes a legacy system until you throw it away. So all code, to some degree, is a legacy system.

Facebook is a legacy domain. Google is legacy. Heavens, even OpenAI has a legacy problem.

Because reality is that... As I often say, old code never dies. You have to kill it.

Once you have built something that's useful, then it's going to live on. And so unless you have fully disposable code, there is some body of code that you have there that represents something immutable, something that has a cost, something that has some degree of technical debt to it. It may be very, very small.

If you look at many of the classic organizations in the big financial space, these are groups who have been working with codebases since literally the '60s. I had one engagement with the Internal Revenue Service in the United States, because they've been trying to modernize their systems since the 1960s.

Now let's go back in time. What was happening in the '60s? Well, you had the rise of the population, the increase of Social Security and the like, and more and more organizations who were creating lots of paperwork. And so the banks and the government realized by the mid-to-late '60s there was simply so much going on that you couldn't do it by hand. Why did banks use to close at 3 p.m.?

Because they needed the time for the humans to reconcile accounts. The IRS was very much like that. And so there was, during the 60s, this period of automation of human processes. And so most of that was written in IBM 360 assembly language. Zoom back to today.

There is still code in the IRS system written in IBM 360 assembly language, running on emulators upon emulators upon emulators. But that creates really difficult problems, because some of that code embodies business rules that are within the assembly language itself.

So how do you change that? And the answer is you're faced with a real human problem of how do I transmogrify old code in COBOL and assembly language so that it can work on modern technology. There's only so much you can do via emulation. And then when you consider that the government makes new business rules every year, how do you keep up with that? That's a real and present legacy problem.

Facebook has the same thing, although their code doesn't date back that far. Google has the same problem as well. OpenAI will soon have that problem. At some point, your customers will start asking for enterprise features like SAML authentication, SCIM provisioning, and fine-grained authorization. That's where WorkOS comes in, making it fast and painless to add enterprise features to your app.

Their APIs are easy to understand and you can ship quickly and get back to building other features. WorkOS also provides a free user management solution called AuthKit for up to 1 million monthly active users. It's a drop-in replacement for Auth0 and comes standard with useful features like domain verification, role-based access control, bot protection, and MFA.

It's powered by Radix components, which means zero compromises in design. You get limitless customizations as well as modular templates designed for quick integrations. Today, hundreds of fast-growing startups are powered by WorkOS, including ones you probably know, like Cursor, Vercel, and Perplexity.

Check it out at workos.com to learn more. That is workos.com. This episode is brought to you by Sevalla.

It's a true Heroku alternative where you can deploy applications, manage databases, and host static sites for free. Sevalla is a platform designed for teams. With its preview and pipeline features, developers can collaborate on any stack while being assured of the security of their workloads from staging to production. Sevalla holds all the major security certifications companies are typically looking for. Their application hosting offers automatic Git integration, Docker image deployments, hibernation for optimal cost savings, vertical and horizontal auto scaling, TCP proxy support, and optional private network connections for your databases.

Their free static site hosting is perfect for landing pages, documentation sites, and more. It also includes preview deployments for easier iterations and seamless teamwork. Sevalla features an easy-to-use interface, unlimited seats, no hidden tricks, and transparent user-based pricing with enterprise-level Cloudflare DDoS protection for workloads of any size. Sign up and deploy today. Go to sevalla.com.

That is Sevalla, with a double L, dot com. When you say you were parachuted in to help a bunch of different types of companies, can you give a sense of what types of companies you worked with over the years, the decades, to help with their architecture, their legacy code, their tech debt? Truly, every conceivable domain. I've had the opportunity to work with, obviously, the financials.

I've done a lot of work in the defense sector. In fact, to go way back in time, complex systems were really not started in the commercial realm. But they really began in the world of defense.

The phrase I also use here is that all of modern computing was woven on a loom of sorrow. What we see in modern computing was born from World War II and the Cold War, particularly a system called SAGE, the Semi-Automatic Ground Environment, that came about during the '50s and indeed was operational until the 1980s. This was a system built in response to the Soviet threat of bombers coming over the Arctic and into the United States, before we had satellites and pervasive radar. And that was a system that precipitated the creation of what we call the software crisis. It was what triggered the creation of the NATO conference later in that decade, in which a group of folks came together from around the world saying, you know, how do we attend to this problem?

And it was really at the peak of what I'd call the first golden age of software engineering. So we have defense systems. I worked with a lot of real-time systems, everything from pacemakers to subway systems to, gosh, what else, CT scanners and the like. Truly, you name a domain and I've probably spent some time in it.

The James Webb Space Telescope currently uses the UML in its design. That's pretty freaking cool if you think about it. Jumping way forward, you've now been...

Working with a lot of companies, been involved in a lot of projects, and also influenced a lot of software architecture, the broader field. How would you describe the field evolving over the decades? You were clearly part of some key techniques invented and becoming commonplace.

What was this like? So I alluded to a phase I called the first golden age of software engineering. This is the realm of the time of functional or not functional, but algorithmic languages such as Fortran. COBOL, APL, LISP, and the like, although LISP was sort of a multimodal kind of language.

But the dominant way that we decomposed systems was through algorithms. And so you saw the rise of structured analysis and design techniques, which made a whole lot of sense at that time. This is where you had the Yourdons and DeMarcos and Constantines and the like, because the presenting problem for software systems was they were generally not distributed. They were largely monoliths. And how could we build larger and larger systems that were sustainable and economically interesting over time?

Well, the golden age of that first golden age of software engineering began to change as we started to see the rise of distributed systems. And that rise, again, happened not in the commercial world, but it happened in the defense world. The ARPANET was, you know, funded by the government, funded by DARPA.

As I mentioned to you earlier, I got my first email address in 1979, when there weren't that many email addresses around. In fact, one small story there: when we had the ARPANET in the air, I was teaching at the Air Force Academy at the time.

We had a little mimeographed document that listed the email address of everybody in the world. I think at that time there were a few thousand people. So you knew who everyone in the world was at the time. Pretty cool to look back.

So the first distributed systems were happening in that domain. Indeed, back to Vandenberg, I worked on a system called the Telemetry Integrated Processing System, which was a closed network of some 32 minicomputers. Minicomputers were beginning to be a thing. And so the problem of how do I take a larger system and break it up into multiple distributed parts was beginning to emerge; it hadn't reached the commercial sector yet.

So we saw, we, the industry, began to see that there were limitations to what one could do with algorithmic decomposition. And so there were these pressures all around to try to attend to the next kinds of software: software that was distributed, software that was real-time, software that was multilingual, software that worked on a variety of computers. And I had to deal with all the normal aspects of distributed systems, which is that they're going to fail at various times, and I've got communication issues and the like. So it led to a realization that we needed to think about software in very different ways.

The other thing that was happening is in research, you saw the rise of languages such as Simula and Smalltalk, which were looking at the world through fundamentally different lenses. So here we are again in the late '70s. And again, I'm what, a 20-something. I was asked by one of my former teachers at the Air Force Academy: say, Grady, would you go help the Department of Defense figure out how to use this new programming language called Ada, so we can apply it to modern software engineering techniques? Now, why was the government worried about this?

By that time, software was a real problem for the Department of Defense, actually for all of the federal government, because there were several thousand languages in use. That exploded because Fortran and COBOL were useful for some things, but not for all things. And so there was a decision made to build one language to rule them all. And that was the Ada programming language. Ada was far ahead of its time. It was a language that was influenced by Simula, Smalltalk, and others, but used the ideas of abstract data types from Liskov and Guttag and others.

It used the ideas of information hiding from David Parnas, all ideas that were very new at the time, but frankly, are part of the atmosphere in which we breathe right now. And so as an industry, we really didn't understand the methodologies to make that work. Thus was born the Booch method.

So here I was, from '79 till about '81, going back and forth across the United States, helping the federal government and helping contractors try to apply this new language in new ways. And this was the beginning of the second golden age of software engineering, in which it was not so much the complexity of the algorithms, but it became a systems engineering problem.

Systems that were dealing with distribution, which was very new at the time. And that was the essence of the Booch method. It was the things I learned about helping organizations architect systems in these new kinds of domains and with new kinds of languages.

Could you explain what the Booch method is? I understand it has to do with object-oriented programming, but coming from you as the person who invented it, it would be nice to explain what it is and why it was important. Well, let's go back to Plato.

Talking about going way back. I wasn't around then, but I've read about him a little bit. There's this wonderful treatise he wrote, The Dialogue, in which there's a debate about how one should best look at the world.

Should I look at it as atoms, or should I look at it as processes? Well, the first golden age of software engineering was more focused upon the processes, the algorithms. But there's a parallel way of looking at the world.

And that's looking at it through the atoms, if you will, the classes and objects within them. So, yes, I was influenced by abstract data type theory, by Plato, by a lot of other interesting philosophical things that were coming together at the time, of looking at the world in fundamentally different ways. So the Booch method was really trying to codify that. How could we decompose systems not based upon algorithms, but based upon classes and objects?

And again, that's where I was influenced by Liskov and Parnas and Dijkstra and Hoare and the like, names that are probably unfamiliar to students these days, but they were representing the theoretical underpinnings of the first and second generation. The Booch method was basically saying, hey, here's a new way of thinking about the world. And so it said, look not at algorithms, but look at combining data and processes, algorithms, together in one thing, thereby classes.

Now, we did some things right. We did some things wrong. I think the thing we did right was that classes make a lot of sense in terms of abstraction.

What we did wrong is we overemphasized the notion of inheritance. Inheritance was all about, let's save code because we can build generalizations and the like. That proved to not make a lot of sense, because we ended up with lots of disparate kinds of abstractions.
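
For illustration, here is a minimal sketch in Python, not from the conversation and with all names hypothetical, of the shift being described: bundling data and the algorithms that operate on it into classes, and favoring composition over the deep inheritance Grady says was overemphasized.

```python
# A minimal sketch (hypothetical names) of algorithmic vs. class-based decomposition.

# Algorithmic style: free functions operating on shared data.
def apply_interest_to_all(balances: dict[str, float], rate: float) -> dict[str, float]:
    return {owner: balance * (1 + rate) for owner, balance in balances.items()}

# Class-based style: data and the algorithms that operate on it live together.
class Account:
    def __init__(self, owner: str, balance: float) -> None:
        self.owner = owner
        self.balance = balance

    def apply_interest(self, rate: float) -> None:
        self.balance *= 1 + rate

# Overusing inheritance just to "save code" couples subclasses to parent internals,
# e.g. class SavingsAccount(Account): ... where every parent change ripples downward.
# Composition keeps the abstractions separate and swappable instead.
class InterestBearing:
    def __init__(self, account: Account, rate: float) -> None:
        self.account = account
        self.rate = rate

    def accrue(self) -> None:
        self.account.apply_interest(self.rate)

if __name__ == "__main__":
    acct = Account("alice", 100.0)
    InterestBearing(acct, 0.05).accrue()
    print(acct.balance)  # 105.0
```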

That's okay. Fast forward to today and people say, well, what difference does it make? And the answer is, it's part of the very atmosphere in which you breathe.

And so you don't even think about it. You look at, you know, Redis and you may build things upon it, but you know, if you look at it, you're really dealing with a set of abstractions that Redis offers you. And those abstractions are class-based. So they're baked into the way of thinking of those kinds of systems.

So in short, the Booch method was: let's look at the world not through algorithms, but instead through objects and classes. The last thing I'll mention is one of the things the Booch method hinted at, that really did not catch full form in the UML, was looking at systems through multiple points of view. Now, we'll come back to that a bit when I talk about Philippe Kruchten.

So just so I understand, because myself and most of us listening will have started our careers long after the Booch method was invented. For us, classes, variables, inheritance, that's pretty common, pretty everyday stuff. But as I understand, it wasn't like that back then, right? So could you talk us through what the environment, the technology, was like?

What made the Booch method so new, interesting, or innovative? Well, let me go even further back, to the 1950s, and show you a parallel story. There was a time in the growth of algorithmic programming languages where the idea of a subroutine was considered controversial.

Why? Because doing a function call added at least two or three more instructions, which was computationally expensive. So even function calls and decomposing something into subroutines was viewed as an architectural aberration.

And people opposed it because it was inefficient. Well, obviously, we think, well, that was stupid because we need it for our management of complexity. The same thing, I think, was true back in the days of early object orientation. I mean, people were doing object orientation in algorithmic languages because you'd have these things in COBOL called, you know, common data areas. People would devise, here's all this common data.

And as a matter of practice, but not language, you would say this data is used in this way by these algorithms, and vice versa. In fact, going back to that project I mentioned to you at Vandenberg Air Force Base with those 32 computers, on the side of every one of those computers, every day we'd see a printout of: here is the common data pool.

And so the abstractions were right in your face, because they changed. And so people were trying to do object-oriented decompositions, but the languages didn't support it. There was a need to do that kind of thing. And until the languages came into play, there was no way to bring those ideas together efficiently.

And so, yes, the Booch method was very much a reaction to the forces upon building software intensive systems to look at classes, trying to apply it with modern languages and building a methodology around it. Today, we take it for granted because our languages make it easy for us to do this. It's just fascinating to think back how revolutionary it was and compared to just how commonplace it is.

And also to think about how things that we invent today and that are revolutionary in 20 years, people will be like, oh yeah, that's commonplace. Right, exactly. One thing that you're known for and you also mentioned you're associated with it is UML. But can you share how this was created? What was the goal of it back then?

Who were involved? And what was the need that it was solving at the time? Right.

So in 1982, two of my classmates from the Air Force Academy come into the picture: Paul Levy and Mike Devlin. Paul had been a roommate of mine at the Air Force Academy. He was an economics major. Mike Devlin was a computer science major. Mike and I had a few classes together.

The first time I ever met Mike was in an unarmed combat course, by the way. So not your typical thing you get at colleges. But, hey, you know, I was trained to be a warrior. That's where I first met Mike.

And I think he beat the stuffing out of me, if I'm not mistaken, in the pugil stick competition. But anyway, here I was at Vandenberg Air Force Base, and Mike and Paul were up in the Bay Area.

They were working at the satellite control facility. And I engaged with them because they had one of the first and largest Ada projects that was going on. So I went up and helped consult with that project. The two of them also went to Stanford.

And I swear there's something in the water at Stanford, because they then connected with Art Rock and Hambrecht & Quist, the two premier venture capitalists at the time. Art Rock and Hambrecht & Quist were key funders of Apple, and they also contributed to the funding of what became Rational Software. So in '82, Mike and Paul got together with me and said, let's start a company. And we did.

It was a company called Rational Machines Incorporated, whose intent was to build a software development environment for this new, coming Ada programming language. We saw that to be an opportunity in which we could make piles of money. And we built hardware at first, because this was the time when, you know, minicomputers were becoming affordable.

You had Sun coming into play here, but none of them were powerful enough to do the kinds of things we were doing. So Mike designed a system. I helped build the methodology around using the system. It was a system called the R1000.

And that was the dominant Ada system used around the world at the time. Well, jump ahead a decade now; here we are in the mid-90s. And I was getting a little tired of doing that kind of stuff and was branching out into other places.

And I found that there was traction that I was getting from the Booch method in the commercial sector. I was giving a bunch of lectures at the time. And at one time, there was a gentleman in the audience who asked a really insightful question.

And afterwards, he and I met up, a guy by the name of Bjarne Stroustrup. And it turns out Bjarne was working on a thing called C with Classes, which was the predecessor to C++. The two of us got together, we hit it off, we found that we were doing very similar things together. And in fact, it led to the two of us doing a lecture series around the United States, where I got to know him quite well.

And this was around the time he wrote his first book on C++. If you look at the first edition, you'll see he references a lot of my ideas. And that's when my book on object-oriented design came out.

And I referenced his work a lot, too. So, oh, the Booch method and C++ kind of grew up together. Well, I thought this was interesting.

And I remember a particularly important meeting I had with Mike and Paul around the time. It was at the Red Carpet Club of United Airlines in Denver. And they met with me and said, hey, Grady, we're thinking of moving the company in the direction of embedded systems. And I said to them, well, good for you.

I think that's a stupid idea because you're missing the commercial sector. Go off and have fun. I'm going to do different things. That gave them pause.

And I think they realized, wait a minute, maybe there is something here in the commercial space. We were finding that we were having challenges continuing to grow the business in just the defense sector, which is what led them to that. And so they then made the decision, hey, let's take the Booch method.

and make it real. And thus were the beginnings of a system called ROSE, Rational Object-Oriented Software Engineering, our first tool. The first prototype, by the way, I wrote in Smalltalk.

It was a wonderful system. I wish I'd kept the source code for that around, but I remember making changes to it just literal minutes before we did the first demo. And so that's where we sort of broke out from the defense sector into the commercial sector. And it was a big hit.

because that was the time when I think lots of others were recognizing object orientation was a good way of looking at the world, and C++ actually supported that. Well, that led to two things, not just commercial success on our part. We began to make lots of money, and we began acquiring other companies, and we started filling out the software engineering lifecycle.

Can you just tell us what it was that Rational Software did, this commercial thing that was a hit? Yep. So Rational ROSE, Rational Object-Oriented Software Engineering, was a personal productivity tool, if you will, that ran on an IBM PC. It also ran on, I think we eventually moved it to, a number of other devices.

It ran under Windows in the first one. It basically allowed you to draw UML diagrams, not UML, but Booch diagrams, so that you could then... reason about and think about your design.

We did a little bit of code generation, but really it was just a design tool to help organizations think about their designs. And people used it quite well to document and specify and build their systems. Now, we started making lots of money off of it. And so we started acquiring companies. We bought a requirements company.

Ed Yourdon came to me and said, go look at these folks. We did. We bought a small company out of Cambridge called Pure Atria, which was led by a gentleman by the name of Reed Hastings. Reed came to us.

We bought his company. And we're talking about the founder of Netflix, right? Yes. So we bought his company.

Yeah. Reed realized he was a lousy CEO. And so he took his money and hung around for a few years, figuring out what he was going to do. And actually, that was a lot of the seed money that helped form Netflix. It is a small world in that regard.

So at that time, at the peak of what we were doing by the late 1990s, Rational was sort of dominating the space of software engineering, because we had tools across every part of the development lifecycle. And this is where the ideas of incremental and iterative software development came into play. Long before what today we call continuous integration and continuous deployment, we were already doing that with the Rational machine and our tools.

We had pioneered those ideas because we had built incremental compilation tools and the like. So here we were in the '90s, and we had a whole set of tool sets around this. But because this was clearly gaining traction in the marketplace, we weren't the only ones, and we were seeing the rise of hundreds, if not a few thousand, companies that were beginning to try to do object-oriented kinds of things.

And this was the beginning of the second golden age of software engineering, where you had Peter Coad, and Constantine back again, and Yourdon again, and Martin. We had Ivar Jacobson and Jim Rumbaugh. And so it was a very vibrant time, where organizations were trying to say, gosh, we've got these great tools.

We've got the ARPANET. We've got personal computers. How do we build software for it?

So the presenting problem was, in software, what is the best way to design systems using this very, very robust, very powerful set of tools we have at our hands? And so Rational, being in a very interesting space, we said, you know, we're kind of dominating the market; let's keep going here. So we hired Jim Rumbaugh, and the task Jim and I had was to combine his methodology, OMT, the Object Modeling Technique, with mine.

We were sort of the two leading ones at the time. And then we bought Ivar Jacobson's company, because both Jim and I were using the idea of use cases. Again, talk about something that's part of the atmosphere. Use cases are just something you think about, but they were new at the time, in the '90s.

They were an idea invented by Ivar in his work, primarily at Ericsson. And so we were working together with Ivar on building software for base stations for the burgeoning cellular telephone networks, which were a thing back at the time as well, too. So here we have the three of us, who were brought together by Rational. And our task was: let's unify our methods.

Now, you could never have found three more different people. I'm pretty amazed that we didn't end up with one of us in the hospital and one of us in jail.

We were so, so very different. And I won't go into further detail on that, except to say that I'm very proud of what we created. And from that was born the UML.

So we decided, you know, this is not just ours. We need to make it something that the whole world could use. We made the decision to release it into the Object Management Group. And thus was born UML 1.0. I sort of drove most of that,

working with Jim and Ivar. I wrote the primary document for it. Obviously, it was the three of us working together.

I don't want to... you know, I want to give them complete credit, believe me. But after UML 1.0, I was emotionally exhausted and wanted to go off and do new things. So I kind of walked away from it at the time.

I want to pause and mention one other person who was important here, actually two other people. The first was Philippe Kruchten. So we realized that our work was so big, we couldn't just do the methodology and the notation.

So Jim, Ivar, and I worked on the notation. That was the UML. And Philippe worked primarily with some of Ivar's people on the methodology. And thus was born the Rational Unified Process. Notice the emphasis upon unified.

Philippe brought to the table the very important idea that we'd begun to see hints at with the Booch method. And that is looking at the world through multiple points of view. Philippe has this idea of the 4+1 view model, which he grew from his work building the Canadian air traffic control system. Again, a very complex distributed system.

Those ideas were eventually made manifest in the ISO/IEC/IEEE 42010 standard on architecture description, which basically says if you're looking at an architecture, you have to look at it from multiple points of view: use cases, logical view, process view, implementation view, and deployment view. And that's a very important and profound piece. The other thing that came into play, another person that came into play, was Walker Royce. Now, Walker's an interesting guy.

You talk about a small world. His father was Win Royce, who I had the pleasure of working with when he was at Lockheed at the time. Win Royce was the gentleman who wrote the paper on waterfall life cycles. And his son was basically working on spiral models and the Booch method.

Win was misunderstood, because he was not endorsing waterfall methods. He said, that's a stupid idea. In fact, look at Parnas's paper, A Rational Design Process: How and Why to Fake It. That's where the Rational Software name came from, by the way. The paper says at one level it looks like waterfall, but inside,

no, it's what today we would call agile. So here we are.

What, the late '90s, early 2000s? UML 1.0 was in the bag. And that was the life of Grady at the time. Hey developers, we've all been there.

It's 3 a.m. and your phone blares, jolting you awake. Another alert. You scramble to troubleshoot, but the complexity of your microservices environment makes it nearly impossible to pinpoint the problem quickly. That's why Chronosphere is on a mission to help you take back control with Differential Diagnosis, a new distributed tracing feature that takes the guesswork out of troubleshooting. With just one click, DDX automatically analyzes all

the dimensions related to a service, pinpointing the most likely cause of the issue. Don't let troubleshooting drag you into the early hours of the morning. Just DDX it and resolve issues faster. See why Chronosphere was named a leader in the 2024 Gartner Magic Quadrant for observability platforms at chronosphere.io slash pragmatic.

That is chronosphere.io slash pragmatic. And do I understand correctly that the goal of UML was to describe a system? So when I think back to college with UML, we have the different...

boxes for different classes. We have the arrows between them. They describe relationships, depending on what the type of the arrow is. And when you look at this whole diagram, you get a sense of the structure of the software you're building.

Yes. So if you look at the very first line of the UML 1.0 standard, I believe it says something to the effect that the UML is a visual language intended to reason about, visualize, specify, and document the artifacts of a software-intensive system.

It says nothing about it being a programming language. In fact, I vociferously pushed back against that. It was a language meant to think about and reason about a system, to think about the world in object-oriented ways, particularly in ways where you looked at it through multiple points of view. DevOps today, by the way, is simply an amalgamation of the deployment and implementation views.

But we didn't call it that back then. We looked at it from those kinds of views. And that's really where UML 1.0 was. It was meant to be: how do I think about these things?

How do I reason about them? And I always intended that, you know, you'd write a UML diagram and you'd throw most of them away. Now, unfortunately, many people didn't do that. And in the move from UML 1.0 to 2.0, there was a faction of individuals and companies who said, no, we want to make the UML very precise.

We want to turn it into a programming language. And that was, I think, a profound mistake. I never intended the UML to be a programming language. But the net result of that was to make the UML much more complex, much larger.

And the emphasis then was upon not using it to reason, but to generate code and to reverse engineer. Now, the reverse engineering I can get; that makes a lot of sense. But turning it into a programming language was a mistake. And I think that began the decline of the UML, because people were using it in the wrong ways.

At its peak, the UML probably had a 20 to 30 percent penetration in the marketplace, which is pretty cool if you think about it. And so I'm proud of looking at the systems in which it was used. But most of all, I'm proud of the fact that it helped people think of building software in different ways.

So when you say at its peak it had 20 to 30 percent usage, does this mean that about 20 to 30 percent of commercial developers were using it at the time? And what time was this, around when? Yeah, here we're talking around 2000, plus or minus a few years. Remember that Microsoft was big in the midst of this as well, too; they actually worked with us to take our Rose product and make it a part of Visual Studio. So we had a team that was working up in Seattle to make that happen. And it was a major selling point for Microsoft at the time, because it actually helped their customers build more complex software. So yeah, we're talking around 2000 or so. But what else happened in that time frame?

The answer is the internet. So the ARPANET had moved over to the internet. We were beginning to see companies in the late 1990s move onto the internet, which was great. But there was a challenge there.

There was, first, how do I even build systems for distributed work on the web? And how do I make money off of it? What do those systems look like?

And so that's why Microsoft was interested, because we were helping their customers move from the PC to distributed systems. But on the other hand, there was also a lot of hype that, you know, the Internet's going to improve your sex life. It's going to do all these kinds of things.

There was just total overinvestment in that space. So a little after the millennium, we saw a backlash, and there was this great downturn in the marketplace where people had built things and realized they weren't necessarily economically sustainable. So now we are here in 2003. IBM and Microsoft were still using our tools heavily because they were important for their customers. IBM and Microsoft both bid for us.

And I think IBM won some bid of $2.7 billion to buy Rational. And so we went over to IBM and became part of it. And that made sense because at that time, there were 3,500 of us in 14 countries.

We had reached almost a billion in revenues, which was pretty extraordinary for a company around that time. But it was time to be absorbed. Now, one other story I'll tell before I pause again.

Here I was: IBM acquired us. They made me a fellow immediately, which had never happened before. It usually takes years of being part of IBM. And a couple of months after, I got a phone call. He said, hey, Grady, it's Bill.

Come visit me. So I went up and flew up to Bill Gates, right? Bill Gates. Yeah, I'd done some things with Bill before.

And so Bill was at the time still CEO of Microsoft. I said, hey, Bill. He took me into his office. We had like a 30-minute meeting scheduled.

We ran for two hours, much to the annoyance of his staff. And he sat down and said, Grady, it's not public yet, but I'm going to be moving out of my role at Microsoft, because I want to do other things. And Grady, you know, I've got two roles.

I'm CEO and I'm chief architect of Microsoft. I'd like to give you that job of chief architect for enterprise. And so I said, Bill, that's very interesting. And so I said, give me a little time.

And I went around and met all of his main reports. Most importantly, Ballmer never met me. And that was a red flag.

And it was around the time, too, I realized that Microsoft was a particularly nasty company. You had the Office group and you had the Windows group that just couldn't stand one another. So I eventually came back to Bill through his hiring folks and said, Bill, I'm flattered. But, you know, you have a profoundly dysfunctional company and I'm not the one to fix it. So, Bill, thank you.

But no, thank you. I think I used something to the effect of it would only end in tears for both of us if I accepted. So let's move on. So I stayed at IBM and it was a good decision to make. It would have been a bad decision.

And there's this cartoon about Microsoft created by a software engineer cartoonist, Manu Cornet, about the organizations of Microsoft, the two organizations, and they're holding guns against one another. Yeah. Yeah. That's where it was. They needed somebody to knock heads.

I'm a lover, not a fighter. And I was not the guy to break things up. Wow.

What a story. So I learned UML in college. And to this date, it's part of several college curriculums. But interestingly enough, in the industry, I've just not really seen it used for anything. At least at the companies that I worked at and the startups and scale-ups and large companies I worked with.

I kind of see a resistance to using it when it's brought up, people claiming that it's too formal, because we do do architecture, right? We use boxes and arrows and we diagram. I'm curious to know, you know, we talked about how UML was used by 20, 30 percent of the industry at some point. But what happened, in your view?

So this leads us to, you know, contemporary architecture. I've got a shelf full of books on architecture, both, you know, older ones and new ones. And if you look at what a lot of people speak of as software architecture today, I think it's reasonable and sound and there's good stuff there.

But in many of the kinds of systems these architects talk about, the architectural decisions have largely been made for you. I'm going to build a system that requires message passing. Well, let's go find, you know, RabbitMQ or whatever, or Redis or whatever I need.

The architectural decisions have been made for you. So a lot of the activity of contemporary architects is simply taking very large frameworks and components and weaving them together, which is a very noble and wonderful thing to do. They also represent systems like Meta and Google in particular, which have grown their architecture and systems over a few decades now, and the stuff they're building on top of it is largely evolving and building upon those APIs, which does not require the deeper kinds of architectural thinking.

So I use this, let me give you an image here, of a three-axis system. Along one axis, you have levels of ceremony. If I'm a startup, then it's just other people's money and no one else's.

And heck, I can write disposable software. And if I fail, I'll just go find another venture capitalist. Of course, I'm not going to worry about any degree of ceremony; just build it, go hire some brilliant people, and make it happen.

That's wonderful. On the other hand, if, let's say, I'm doing something like, I don't know, building the next-generation intercontinental ballistic missile system, which uses about, I don't know, half a billion, half a trillion dollars, you bet you're going to use more ceremony, because you have to have degrees of accountability. So that's one axis. The next axis is that

of risk. So if I build a system where, if I fail, you know, so-and-so's not going to find their Grindr match, big deal. On the other hand, if I fail and somebody dies, that's a problem.

And you're going to use a more disciplined architecture. The third axis is that of complexity. If I build a system that people have done again and again, then heck, I don't need anything.

Heavens. This is where prompt engineering comes into play. I go build an app just by building prompts because we built these things. I don't need no stinking UML for that thing.

On the other hand, if I'm building something that I've never built before, let's say I'm building not an LLM but a constellation of LLMs working together, and I want to weave them together with non-neural systems, then you begin to think about architecture, and that's where the UML comes into play. There's a sweet spot for a tremendous amount of software development going on that doesn't need the UML and does not need any kind of thing like that. But on the other hand, you go a little further out along those three dimensions.

And yes, there are people all over the place using UML. I mentioned the James Webb Space Telescope. I still work with financial companies who are doing that, where the risk, the complexity, the ceremony is sufficiently high that it demands a bit more formalism. So, well, do I understand it correctly that you're saying that software architecture has changed from the '90s and 2000s, when systems were new, architectures were new, software architecture was still a lot more novel?

And if we look at venture-funded startups and big tech today, what we see is, one, they're just not as risky: if they fail, big deal. And then two is we can use a lot more software that's out there and has been architected; you know, we can, for example, use Redis as a cache, and it's there, it works, we don't need to think too much about it. And then the third is that there are a lot of startups who just don't need ceremony. Basically, they don't need audits, they don't need formality; they can just go, all right, let's just do it however we want to, we don't need that kind of auditability. Do I understand that these are the changes that are in play with software architecture?

And is this why some of these formal methods are just not as popular with startups and scale-ups? Yes. In fact, I think it goes to the root of the economics of software development. Let's go back to the first age of software, first golden age.

The machines were far more expensive than the humans. And so it required one to do some thinking before I even got to the machine because machine time was very, very expensive for me. And so, yes, algorithmic decomposition, structured analysis, design techniques made a lot of sense because we needed that kind of optimization.

If you move to contemporary times, computational resources are like water to a fish. They're available to anybody. I've got on my desk behind me, I've got my own personal cloud of four NVIDIA single board computers.

which has more processing power than existed in the world in the 1970s. And that's pretty amazing. That's a few thousand dollars there.

And so the economics have changed vastly, such that you don't need to think about it so much because, heck, it becomes disposable. AI, I think, is also changing that, because it allows me to build things where I don't even have to think about design. Heck, I don't even have to think about software. I just

prompt for those things to build something as one-offs. And once I'm done, I throw it away. So in that sense, it comes back to the economics, but there will continue to remain a class of software that's new, unique, breaking new ground, that still requires that kind of architectural thinking. And it's interesting because just recently I read how Amazon is using formal methods for AWS S3.

They published a blog post detailing how they're doing it. And they're doing this to catch those really, really edge cases that only happen one in a billion, one in a trillion times. But at their scale, this is a regularly recurring event. And I found it fascinating how some of these methods are making their way back. Yeah.

Well, let me set aside formal methods for a moment, because it turns out that's a different topic for which I have some experience and opinions. But go back to Amazon.

You go to their websites and they have a whole language around architecture. Microsoft does as well, too, around Azure. It's the way I describe an architecture in Amazon. It's those, you know, their blocky diagrams.

It says, hey, I'm going to build this kind of thing. And here's my particular notation for it. And furthermore, here are some examples.

So even Amazon and Microsoft have recognized that architecture plays a role, but there are enough times people have done these kinds of things. It says, oh, you want to build this kind of system, then you want to use these services. In fact, here are some examples for it you can find on our website.

So without them really acknowledging it, Amazon and Microsoft view architecture as still important, but there are enough patterns that one doesn't have to go through the process of rediscovering them, because I can build upon those things. And this is a representation of, I think, the maturation of our business.

We moved from algorithms, which everyone can use, to design patterns, to now architectural patterns, which Amazon and Microsoft have codified themselves. So let me switch over now to formal methods. Formal methods have always been a thing.

However, formal methods, in my experience, have been a niche part of software-intensive systems, because formal methods only go so far in what domains they can cover. And so you'll see, similar to the example you described, that Microsoft began using formal methods in their drivers to validate their correctness. Their hardware drivers?

With their hardware, exactly, yeah. And that was an important move forward. I've been with projects that use things like, you know, I've got this system in which people might die.

Let's run a formal analysis upon it. The thing is, though, that those formal methods don't deal with real world things because they don't deal with space and time. They deal with functionality. So I have always found formal methods to be of use, but only for parts of a system and never as drivers of the architecture itself. Speaking of software architecture.

These days, the role of software architect is not really popular anymore, at least at the likes of startups and big tech. Like I said, we do have architects, but they're increasingly called staff engineer, principal engineer, distinguished engineer. And they do still do architecture, but there's a different focus to it.

Now, you were there when software architecture was created, when the first software architects were created as a role. Can you tell us how you've seen this role be created and then evolve throughout the decades? So two things influenced my understanding of the space. First, I didn't really call myself an architect, but I helped people design the systems they were building. And I was heavily influenced by a dear friend of mine, Mary Shaw.

She's a professor at Carnegie Mellon. I think she won the National Medal of Technology under Obama, if I'm not mistaken. And she wrote this

really profound book called Software Architecture, in which she began the first exposition of architectural patterns. Mary's just a delightful human being. And that's when I began to understand the formalizations of what architecture could be. And the other thing that influenced me was architecture from other domains, particularly civil engineering and the like.

And there's, in the defense world, there's also, you know, shipbuilding architecture and airplane architecture and the like. So the term architecture is one that's very well respected outside the software engineering world. Ignore the title for a moment and let's go back to first principles.

What is it at all? And you've probably heard me say this. All architecture is design, but not all design is architecture.

Architecture represents the set of significant design decisions that shape the form and function of a system, where significant is measured by cost of change. So software architecture, and the title architect, carry this horrible emotional baggage around them.

So think of it as: it's all about making decisions. What are the decisions that shape my system? As an architect, that's what I'm doing. As the project manager or whatever, I'm also making decisions. But one subtle difference is that it's no longer just the decisions about the shape of the software; it's the shape of the system itself,

where it embodies itself in the physical world, with the other systems and humans themselves as well. That's what it is. It's all about a decision process. And have you seen this role change? Did you see a golden age of companies where they were employing software architects and empowering them to churn out designs?

I'm seeing a bit less of this. I'm curious, are you seeing something similar? Or is this just a bubble that I'm seeing? Because it seems you're embedded in a lot of different companies.

There's another soundbite I'll give you, which is that the entire history of software engineering is one of rising levels of abstraction. So what we're seeing here is the rise of another level of abstraction, which gives us all these extraordinarily powerful frameworks from which I can build systems, and in which, as I alluded to, the architectural decisions that were front and center for us back then are now embodied. So now the decision becomes: what cloud service do I use? What messaging system do I use? What platform do I use?

That's a decision which has a lot of economic considerations, not just software kinds of decisions, associated with it. So I think the role of the architect, in effect, has changed, because now I'm dealing with systemic problems, not just software problems themselves. And have you heard about the role called solutions architect? I think it was created maybe 10 years ago.

Yeah. And it's about people who are doing cloud architecture. It's fascinating. It's a role specific to the cloud. Yeah.

And as you said, they make economic decisions. What services do I use? Do I use AWS, GCP? If I use AWS, do I use EC2 or do I use another service? Yeah.

And that's why it's a systemic issue. Generally, if you're a startup, you're going to hire somebody who's done that before, who knows where the skeletons are buried, who knows what the costs of these things are. And so you'll hire those kinds of folks because they'll accelerate you, because they've made those decisions, and they sort of know, in the shape of what you're building now, what decisions next make sense.

And they are systemic decisions because they have economic and long-term associated consequences to them. One thing which comes up with software architecture is migrations. I do notice that most large companies are being hurt by long-running migrations.

And these migrations usually happen thanks to software architecture changing, for example, going from a monolith to microservices, or changing technologies, for example, going from Node to Go, or upgrading major framework versions, for example, from one version of Angular to the next. How do you think software architecture and migrations are connected? And why do you think software migrations are just so darn hard, and they don't seem to be going away? They are.

Migrations will plague us until the heat death of the cosmos, I believe, because you're always building economically viable software, but then the technology is changing out from under you, which compels you to consider migrating. You know, consider the migration from the monoliths of the '60s, '70s, '80s to, all of a sudden, economically, we had minicomputers and now distributed systems.

You're still not going to, I don't know, use an iPhone and put everything up in my mainframe. But those changes compel you to consider an architecture that better balances where the processing takes place. If, all of a sudden, I can begin to do edge inferencing on my devices, that's going to be something that's going to change architectures as well.

So there are always these changes, both in the hardware as well as societal changes, if you will, that impact the structure of my systems; those are the forces that impel me, compel me, to do this migration. But why is it hard? Well, another soundbite I'll give you is that the code is the truth, but the code is not the whole truth. There is so much that is outside of the code that represents design decisions and their rationale: why I chose this versus why I didn't choose that,

subtle things as to why I named things this way, the impact of which is long misunderstood. And so while the code may be the truth, the problem with migrations is that there's a loss of information, and it's difficult to try to recreate those design decisions just from the code. The people who wrote that initial code you're trying to move, they've probably cashed out or they've died, or some combination, and so you don't really know why those decisions were made, and so you're working a little bit in the dark. Yeah, you're mentioning people have died, and I don't think we usually think about that, but I guess when you're thinking about systems that are 40, 50 years old, that is the reality of some of them. What will the Linux kernel look like when Linus eventually retires? So how will it drift?

Because he has provided a firm and much-needed hand upon the conceptual integrity of that system. And that's what the chief decider does. He or she

is the one that provides that conceptual integrity. And when that person is gone, then you naturally see drift. And that's inevitable.

It's entropy. All software exhibits some degrees of entropy without adding that kind of force to it. One thing that is very relevant today in terms of technologies and architecture is AI. This technology is here.

It's revolutionary. It is disruptive. Now, you've been in the industry for closer to 50 years.

If you look back, how do LLMs and AI compare to past innovations and events in the industry? Because for many of us, for those of us who've been in the industry for 20, 30 years or so, LLMs do seem like the biggest change in software. But when you look back, have you seen something that was comparable in terms of impact,

the pace of change? That's a great question. Yeah. The first, I think, was just the realization that I could build distributed systems as opposed to putting all my processing on one machine. That was a change that rattled everything about the way we built systems.

So at the point in time that, all of a sudden, I could have a network of many computers, and then eventually that carried on to microcomputers and devices at the edge, like phones themselves. But that transition, the growth of minicomputers, was seismic, and I don't think people understood the full implications until much later, because it required us to rethink the way that we put systems together.

So we saw the rise of a great degree of uncertainty, because we didn't know how to build these systems. I have messaging across... Then do I use RPC?

Do I use something else? What do I use to communicate? Do I use shared memory? And so there was this period of exploration, until finally we realized, oh, these are the common approaches that work.
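To make those options concrete, here is a minimal sketch, not from the conversation itself; the function, queue, and variable names are invented for illustration. It compresses the three communication styles he lists into a single Python process: an RPC-style blocking call, message passing over queues, and shared memory guarded by a lock.

```python
import queue
import threading

# 1) RPC-style: the caller invokes a function and blocks for the reply.
def get_price(item: str) -> float:
    return {"widget": 9.99}.get(item, 0.0)

print("rpc-style:", get_price("widget"))

# 2) Message passing: the caller puts a request on a queue; a worker
#    replies on another queue. Caller and worker share no state, only messages.
requests: queue.Queue = queue.Queue()
replies: queue.Queue = queue.Queue()

def worker():
    item = requests.get()
    replies.put(get_price(item))

threading.Thread(target=worker, daemon=True).start()
requests.put("widget")
print("message passing:", replies.get())

# 3) Shared memory: both sides read and write the same structure, so they
#    must coordinate with a lock -- the coordination cost he alludes to.
shared = {"price": None}
lock = threading.Lock()

def writer():
    with lock:
        shared["price"] = get_price("widget")

t = threading.Thread(target=writer)
t.start()
t.join()
with lock:
    print("shared memory:", shared["price"])
```

The point of the sketch is only that each style trades convenience for coupling differently, which is exactly the design space those early distributed-systems builders had to explore.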

And can you remind me around what time this was? Was this around the 80s? Was it around the 90s? It was the late 70s because here you had, well, let's go back way in history.

There was the thing called... Gosh, what's the name of it? I'm drawing a bit of a blank here.

But here we are in the late 60s, early 70s, in which mainframes were, you know, roaming the earth. But even then, there was a realization that these machines were sometimes being underutilized. And so were born the first time-sharing systems. The next thing that happened around the same time were systems such as Whirlwind,

which came out of the Lincoln Laboratory, which was beginning to take machines and touch them to the real world. So all of a sudden now you had real-time computing. So you had this confluence of two very interesting things. You had large machines on which we were beginning to develop time-sharing kinds of operating systems, and then machines off to the side that were touching the real world. These came together in the rise of minicomputers, Digital Equipment Corporation and the like, where...

And miniaturization, which frankly came about because of the investment the Department of Defense was making in semiconductors, made it economically interesting to have a computer that could sit on your desktop, one that one or two people could use full time. So it took the ideas embodied in things like Whirlwind, and then the distributed systems people were building, and now, all of a sudden, I could take these things and I could start putting them together.

And that's what was happening in the mid-to-late 70s: we saw the rise of those distributed systems. There was one last thing: the rise of client-server systems, where you saw dumb terminals. That's what IBM was doing at the time, the green screens talking to mainframes.

But the rise of those new machines meant I could move some of the processing off to the edge as opposed to leaving it to the monolith. So those are the forces that led us to rethinking about systems. So do I understand correctly that...

programmers at the time were used to working on these large computers, and distributed computing came in? Absolutely. And it just completely changed the dynamics, so youngsters came in and started to embrace this distributed computing, but existing programmers kind of stuck to what they knew, what was efficient on the mainframe? Was that what it was like? It was. And then, when you had things like TCP/IP and HTML and all those things around them, all of a sudden we had a vocabulary, we had mechanisms with which we could bind these things together. So I would not say that LLMs are as pervasively important as the rise of distributed systems, but there's a parallel to it. The second thing I'd say that is sort of similar is the rise of GPUs, which came from the gaming industry.

Because there they were solving a very different problem. How do I deal with more photorealistic things in my games? And... we realized it's all matrix multiplications. And so NVIDIA began to move into that marketplace, and we had the rise of GPUs, which dominated that space.

And it wasn't until Andrew Ng came along and said, wait a minute, those GPUs used for gaming use the same kind of mathematics as our deep learning kinds of things. And poof, all of a sudden, we had this perfect storm of lots of data, powerful hardware, and the rise of interesting algorithms, backpropagation and the like. So it was a perfect storm.
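As a small aside to illustrate that point, here is a minimal sketch (illustrative only; the matrices, shapes, and values are invented) of how the same matrix multiplication sits at the heart of both a 3D graphics transform and a neural-network layer's forward pass.

```python
import numpy as np

# Graphics: rotate a batch of 3D points by multiplying them with a rotation matrix.
theta = np.pi / 4
rotation = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                     [np.sin(theta),  np.cos(theta), 0.0],
                     [0.0,            0.0,           1.0]])
points = np.random.rand(1000, 3)          # 1000 vertices
rotated = points @ rotation.T             # one matrix multiply

# Deep learning: a dense layer's forward pass is also just a matmul plus a bias.
weights = np.random.rand(3, 8)            # 3 inputs -> 8 hidden units
bias = np.zeros(8)
activations = np.maximum(0, points @ weights + bias)   # matmul, then ReLU

print(rotated.shape, activations.shape)   # (1000, 3) (1000, 8)
```

Same primitive, two worlds: hardware built to do the first operation fast turned out to be exactly what the second needed.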

So yes, this is an exciting time. There's no doubt that large language models are interesting, but we must be careful. And we'll go into that in a moment. So let's go into that. I'd love to hear your candid thoughts about LLMs, their applicability, the innovation, and the trade-offs that they come with.

Well, so to set the stage, before people worry that I'm just pontificating about things I know nothing about: I alluded at the very beginning that, here I was, 12, 14, whatever, I was interested in what was burgeoning, becoming AI at the time. That has always stuck with me, and I've always pursued an interest in that space. And it wasn't until IBM drew me back to it

around the time of Watson that I began to make it a full-time thing. So David Ferrucci, who led the development of Watson, had called me in and said, Hey, Grady, I'm going to give this lecture. I can't do it.

Would you do it for me? And I said, David, happy to do so, but only on the condition that I can choose the topic. And the topic I chose is what is the architecture of Watson Jeopardy?

And as it turns out, nobody had documented it. So, being an expert in the architecture space, I sat down with David's team for several months and documented the as-built architecture of Watson Jeopardy. It's not a neural network system. It turns out to be a pipeline architecture that brought together a number of statistical systems.

AI at the time was all about predictive statistical methods and methods from knowledge engineering, not neural networks. And so I documented the architecture. That caught the eye of IBM management. And it was the first time I had really documented an AI architecture.

And for those of us who don't know, this was in Jeopardy. This was the IBM computer that played the game Jeopardy and won, if I remember. It won.

Yeah, we beat all the humans in that space. And at the time, this was in mainstream media, in the press, everywhere. And this was shown as an example of an AI system that can outperform a human on a very... human task, which is this popular show, Jeopardy. Yeah, right. It was part of the trajectory IBM had been on in that space, because before that we had Deep Blue, which beat the leading chess player at the time, Garry Kasparov. Garry Kasparov, yeah. Yeah, so we did that, we did that through brute force methods. Brute force methods. Then Watson Jeopardy came along, beating humans in natural language processing. At the time, IBM had already been doing things in that space with, again, statistical systems.

We eventually sold it to Nuance and the like. But we were pretty dominant in the space of natural language processing. And IBM, Watson Jeopardy was the peak of that.

So IBM asked me, hey, Grady, this is cool stuff. We're going to commercialize this. Would you please help us do a study as to what IBM should do with this? So I led a study for about a year in cognitive systems.

What should IBM do? Now, this was interesting because I studied what Watson was doing. I looked at what was happening in the marketplace.

Here we were, what, 2010-ish or thereabouts, a little bit later. It was that decade. And I made it very clear to IBM management: this is pretty cool, but be careful, because there are things we know it can't do.

And I was very clear to our management to be careful about hyping it. Well, I remember in my meeting with Ginni, when I was doing the briefing about the project, Ginni was the CEO at the time, she asked me a question. One of the VPs stepped in and sort of started talking over me and said, well, yada, yada. And I politely said to him, well, thank you, but I think you're wrong.

And I went on to explain to him. Now, if I were not a fellow, I'm sure I would have been fired on the spot. But as a fellow, I get to say things like that.

Now, they did not invite me to join the Watson group, and I was happy with that. So I worked off to the side.

But, you know, it's sort of an I-told-you-so. But yeah, there were things that we couldn't do, and by God, we couldn't do them. So I was working in the underground for several years. And I'll get to the answer to your question in a moment.

But it explains why I can say something meaningful. So I was kind of an outsider. I was kind of doing my own thing inside IBM. And then we had a team down in Austin who had been approached by Hilton Hotels who said, we'd like to build a robotic concierge for our hotels.

And so we chose some robots from Aldebaran, a company out of the south of France. They built these three-foot-tall robots called Pepper and smaller ones called NAO. And they built a robotic system, which was kind of cool. It was a question-and-answering system based on Watson technology. But there was something missing from it.

And the team came to me and said, Grady, would you please help us out? So I did. I helped improve the architecture. We actually installed it in a few Hilton hotels.

But as it turns out, just down the road from that team was the Johnson Space Center. And so we began to be involved with them. And I thought, this is all great because I can get back to my space roots.

So they had a system called Robonaut 2, which was a humanoid robot that at the time was on the International Space Station. A beautiful piece of engineering. And NASA was exploring the idea of human-robot interaction for the mission to Mars. So the group of us sat down and said, what would be an interesting design problem for us to explore that would propel us to look at hard problems?

Because we want to do hard things. And we realized it was the mission to Mars, because they had two interesting use cases. The first is, because of speed-of-light issues, you couldn't rely upon mission control on the ground. You had to take it with you.

This was the HAL problem. We needed to build a HAL, minus the kill-all-the-humans use case, which for some reason NASA didn't want. I don't know, it's their choice.

The second is, at the time NASA wanted to put robots on the surface of Mars to help astronauts, you know, build their scientific experiments and build their habitats and the like. So I remember one afternoon I sat down and said, I know how to architect this. This would have been around 2014. And I built a neurosymbolic architecture we called Self, and it used the ideas of Marvin Minsky's Society of Mind, together with Rodney Brooks's notion of subsumption architectures, together with Hofstadter's ideas of strange loops.

And those three came together, forming this Self architecture, which we then built. And so we experimented with NASA for a while to build some software behind Robonaut 2. Do things like, hey robot, here is a scientific process, go do this, or go clean these filters, which is a common station-keeping activity. So we were actually building the software that allowed humans to interact with it and also serve as a question-answering thing. That led us to Soul Machines, a company out of New Zealand.

They were an offshoot of the work that James Cameron had done in the movie... shoot, what's the name of the one he's done?

He's done a trilogy of them now. He had this great technology where you put cameras on the artist and it would measure their muscle movements. They took that, they built a neural model of human musculature, and they had the great hardware for it, but they had no software.

So we helped them in that regard. We then also worked with a company called Woodside, an oil and gas company out of Australia, because they had a problem similar to NASA's. They were building systems for oil rigs, which are a very dangerous environment.

And so they wanted to build cooperative robots, just like NASA was doing. So here we were with this system called SELF that was, frankly, a neurosymbolic system. It had neural pieces, but also symbolic pieces based upon those three architectural decisions. It was pretty cool. At its peak, I had 35 people working on this.

We were spread across six laboratories around the world, I should mention, which made it possible for me to live and work out of Maui, Hawaii, which is where I still am. But then IBM recognized that Watson was bleeding cash.

So they fired everybody. And again, being a fellow, they didn't fire me. So I continued in that space. But that was my exposure to the first set of new kinds of architectures.

For about six years now, I have been working with a set of neuroscientists to try to understand the architecture of the organic brain. And so rather than, you know, software kinds of things, I've been studying things like cortical columns and the loops that exist within the hypothalamus. So I've been looking at architecture through the lens of a software architect at these kinds of organic systems.

And so that now leads us finally back to your question. What do I think about large language models? The answer is they're pretty freaking cool. However, they are, by the very nature of their architecture, unreliable narrators.

That's what I say politely. If I'm going to be impolite, I will say they allow us to build global-scale bullshit generators, because they are clearly stochastic parrots. They do not reason.

They do not understand. But they do produce some very coherent results, because they allow us to navigate a latent space that has been made very complex by training it on the corpus of the internet. So that, I think, is back to the perfect storm. We have large language models based upon the data and the algorithms and the hardware that allow us to build these coherent kinds of things.

But again, we have to be careful. Gary Marcus and I have been vociferous and consistent critics of Sam and others who are saying we're just on the cusp of AGI.

Well, I think that diminishes the elegance of what general intelligence actually is within us humans. It diminishes the beauty of what our human intelligence is. And frankly, we're not going to get there by scaling.

We've already seen that we're hitting limitations upon it. Elon, whom I've gotten into it with on the internet a number of times as well, over his particular views of the world and of AGI: he's been promising full self-driving for years now. We're not going to get there, because those kinds of models are simply the wrong architecture.

If I've got to build a nuclear power plant to build my systems, I'm probably doing the wrong architecture. And so, yes, I think there are valid use cases for large language models, but we have to be careful. Because they are indeed dangerous.

Yann blocked me on Twitter because, at the very beginning of their work on Galactica, I called him out on it and said, Yann, this is great stuff, but do you realize the implications? And he simply dismissed them. And I kept calling him out. He got tired of me doing so and eventually blocked me.

But this is another I told you so. We have seen the joys and the dangers of large language models. So just to double click on this, it sounds you're saying that large language models by themselves will not get us to AGI.

But do I understand correctly that if we combine them with other tools, like neural nets or neurosymbolic systems, we can actually get closer to these intelligent systems? And if we take this example that you just gave, a robot on Mars that is able to clean filters and follow instructions, what do you think we would need to get to that type of intelligence?

Right. And I'd even add to this: if you look at cultures such as Japan, where you have an aging population, a shrinking population, and the U.S. is very much in this place, you have a tremendous need for elder care. And so the robotic use case, and not just the full humanoid robotic use case, comes into play.

There are clear and present needs for these kinds of things as well. So, yes, there are lots of great use cases. And I've always believed this to be a systems engineering problem, which is why, if you go back to the architecture of Self, I've been focused upon neurosymbolic systems. Now, let me get a little philosophical for the moment. I'm pretty sure you're sentient. I can't guarantee it, but I'm just guessing that you are, and you're not some amazing large language model.

And why do I think that... It's a fair guess. Why do I believe that? It's because I have a theory of mind about you: through multimodal ways and my interactions with you, I have a theory that says you're sentient.

You're someone who has your own agency, your own feelings and behaviors and needs and the like. And that's cool. You're not a robot. And so the human mind has evolved over millennia to develop that

through an architecture that, yes, is neural in nature. But on the one hand, artificial neurons are a shadow, no, not even a shadow, they're an echo of a whisper of what organic neurons are. They are a small, small piece of it. Furthermore, if you look at the architecture of large language models, which as you observed is relatively simple, there's a simple layering to them which ignores the exquisite complexity of the human mind. We have cortical columns in our cerebral cortex. We have tens of millions of them.

In humans, they're about seven layers deep. Reptiles have them as well; they're a bit smaller. Those appear to be the place, where our homunculus works, where we build our predictive models of the world. But we also have places where emotions and other decision-making take place, among these entangled architectures

associated with the thalamus and the like. And furthermore, we have other things going on with, it appears to be, our hormonal system, which appear to be other ways of message passing within our neural networks. So in that same regard, large language models are interesting, but they also are a whisper of a shadow of, I think, what reality is. And so, as for where the next level is: Gary and I have observed we're reaching diminishing returns on what scale can do. The OpenAI folks and Elon believe scaling will continue for a long, long time.

Gary and I are saying, no, we're reaching a limit there. You need to think about it in other ways. That brings us back to those architectures. The last thing I'll mention is at the very beginning, you talked about me being involved with embodied cognition. That brings me back to my roots again.

What is embodiment? It's building systems that are in and of the world, that respond to the world and act in the world. Large language models are largely unimodal. They work through text, maybe static images or video and the like, but they're very sensory-sparse.

Our minds and our intelligence have grown because they've been embodied in the world. In that regard, I think there are many kinds of intelligence that exist, but our human intelligence grew because of our embodiment. And it's going to require some fairly complex architecture to get that.

That's why I've been studying human architectures for six years, because I don't know enough about them. Let's see if that influences the way I build software architectures. And in closing, you previously tweeted something really interesting around this, and I'll quote it. You tweeted: we need a standard way of visualizing the architecture as well as the activity of LLMs and, to generalize, of any artificial neural network, sort of a UML for AI. Now, this was a year back.

Have you seen anything happen here? And why do you think LLM architecture is so important? In fact, I have seen both.

I have both thoughts and I have seen movement. So internally, I've been helping out by providing a little bit of architectural adult supervision in our large language model work. And I've been trying to figure out how to visualize the kinds of things we're doing. Turns out it's still very UML-like, in that these are now

boxes that are systems unto themselves, with message-passing systems along the way. So I'm trying to figure out the best ways to do that. As you may know, just this week, AlphaFold released its source code and weights publicly. So I'm jumping on that, and I'm going to use it as a basis for: can I describe the architecture of AlphaFold 3 using some UML derivative?

So stay tuned. I think there's work to be done in that regard. And what advice would you have for software engineers who are just starting out in the software industry? There are plenty of recent graduates who find themselves in a chilly job market.

There's a threat of what feels like AI tools changing how we do software engineering. You've gone through a lot of cycles of innovation in software engineering. What advice do you have for those starting out to set them up for a successful software engineering career?

Indeed, when Copilot and ChatGPT came out, I received a flood of messages from folks saying, my gosh, have I chosen the wrong career? Because you're not going to need developers at all. You will always need people who make informed decisions, no matter what the language is. Software engineering, again, is a story of rising levels of abstraction. It's just that our tools have changed.

I learned to program in assembly language. People today are going to be learning to program with languages that are at a much higher level of abstraction. So the advice I'd give such folks is twofold.

Don't worry. Don't be afraid. There's always going to be some really cool work for you to do there. And in fact, I would say it's quite the opposite.

This is an exciting time because there is so much opportunity, so many cool tools, so many computational resources, that in many ways you are limited only by your imagination. And so what I encourage people to do is, first, learn as much as you can. Second, don't get stuck in just one domain. You need to become an expert in some space.

But the world of computing is vast. And so find some space that nobody's in right now and go make a name for yourself there. Because there are lots of those kinds of places there for you. And the third thing I'd advise is...

Go have some fun. I mean, my gosh, the toys we have at our disposal, they're amazing and wonderful and cheap. There's so much a single person can do at so low expense to go change the world. I'd like to also close by saying I'm not done yet either. I'm having a tremendous amount of fun.

And in addition to the things I told you in studying the human brain and working with large language models, there are... two projects that I'm engaged in that I've been working with for a long time now. The first is I've been trying to write a book on software architecture.

I'm glad I did not write it, because I know so much more now than I did before. And this is a different kind of architecture book: rather than saying, here's how you do it, I've been working with a number of companies to document their as-built architectures. So I'd like to put AlphaFold in it. Photoshop is in it.

What's the architecture of Photoshop? What's the architecture of a climate monitoring system? What's the architecture of Wikipedia?

So I'm trying to document the as-built architecture of systems that people use today that they may never have thought about. The idea being that there are many different architectural styles. Let me expose you to them, because you may have studied only one particular one. The world is a vast one.

The second is... And this is the software architecture guidebook, right? Yeah, the software architecture handbook, which my long-suffering editor has been patient with me about for literally a decade now. And it's something I hope to finish before I die. The second book: I was on the board of trustees for the Computer History Museum for about 10 years.

And we'd hired a new CEO; he had been at PBS. We had a conversation, and I said, hey, John, why don't you do a documentary like Carl Sagan's Cosmos?

He paused and said, well, Grady, why don't you be our Carl? And I said, I'm no Sagan, but that's an intriguing idea. So I've been on a journey to do a documentary and I'm writing a book about computing and the human experience, which looks at the history of computing and what it means to be human.

So I'm looking at: what does computational thinking look like? What is it? How has computing changed the individual,

society, nations? How has it changed science and the arts and religion and such? And ultimately, it asks the question: in the presence of computing, what does it mean to be human? So that's the larger project I'm working on right now, which I hope to finish up, also before I die.

And let's wrap up with some rapid questions. So I'm just going to ask and then just shoot. You don't need to think too much about it.

What was the first programming language that you used? Fortran. Fortran.

What project did you most recently commit code to, and what language did you use? Most recently, my own project, Self, in Python. Python.

What do you do to recharge from doing software engineering related work? I live in Maui. I wake up.

That's enough. Yep. Enviable place. And also answer.

What are two books you would recommend for those who would like to understand more about software architecture? Mary Shaw's book, Software Architecture. I'd read that one. Other books kind of pale in comparison, but I'd start there.

Well, thank you very much, Grady, for being on the show. This was really interesting and fascinating. It was a pleasure.

Thank you for having me. Thanks to Grady for this great conversation. You can find ways to contact Grady in the show notes below.

If you enjoyed this podcast, please subscribe on your favorite podcasting platform and on YouTube. For some additional takeaways from our conversation, please see the show notes. Thank you and see you in the next one.