Transcript
Kevin Montalbo
Welcome to Episode 68 of the Coding Over Cocktails podcast. My name is Kevin Montalbo, and joining me is Toro Cloud CEO and Founder, David Brown. Good day, David!
David Brown
Good day to you, Kevin!
Kevin Montalbo
All right. Our guest for today is a staff technologist at the office of the CDO at Confluent. He has held positions in embedded software development and quality assurance. His expertise includes DevOps, technical leadership, software development, and data engineering. He's the author of the book Building Event-Driven Microservices: Leveraging Organizational Data at Scale, which we'll talk about today. Ladies and gentlemen, Adam Bellemare. Hi Adam. Welcome to Coding Over Cocktails.
Adam Bellemare
Hi, Kevin! Thank you for having me, and hello to you too, David!
David Brown
Hi Adam, thanks for joining us! We're going to jump into your book and talks, obviously. At Confluent, being an organisation orientated around Kafka, you've written a book leveraging some of your expertise there about event-driven architectures and, specifically, how they relate to microservices. So we want to talk about that in some more detail, but maybe we can start off very, very simple by defining what an “event” is in this context. So, can you tell us what an “event” is and what it looks like?
Adam Bellemare
Right, so an event is, I mean, quite simply, it's really anything of consequence to the business, which is, of course, a very broad definition. But that's because events do encompass a lot of areas. Now, the important part there is that it has to have some sort of underlying business meaning, and this is in large part because the way to do event-driven microservices, or event-driven architecture as well, is to express things that are going on in your business and provide the opportunity for other parts of your business, other systems, other domains, other teams, to react to it accordingly.
Now, if we want to drill a little bit deeper into the structure of an event, I mean, to get some of the basic housekeeping out of the way, typically, you have a key and a value and some metadata about the event. Now you don't have to have a key, but a key's very useful for a variety of reasons, such as if you're using a partitioned event stream to co-locate data. And usually your key would be something like the primary key of a database row. And a lot of events can kind of fall into two major categories. The first is what I call state events. So again, if we use the relational database as an example, if you log into a database and you say, you know, select “star” where ID equals Adam Bellemare's ID, you'll get my row and you'll see all the current information. So, a state event would be the same. It would reflect all of that information, and that would be published to a stream.
Now, I think this is an important differentiation to make, because a lot of materials, books, blogs, speakers, examples I see will also use what's known as an action event or a verb. And this is usually a descriptor that says something has changed. So, if we use Adam, myself again, let's say I move house. So right now I live in Canada. Let's say I moved house to the United Kingdom. The event would show me moving from Canada to the UK, but it wouldn't have anything there about me. And so there's different kinds of events in how you structure it that way. And some of them are better for certain purposes and others are best suited for other purposes.
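The two event shapes described here can be sketched roughly as follows. This is only an illustration: the `Event` class, the field names, the `user-42` key and the `partition_for` helper are all hypothetical, and the byte-sum hashing is a stand-in for whatever partitioner a real broker uses.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical event shape: an optional key, a value, and some metadata."""
    key: Optional[str]
    value: Dict[str, Any]
    metadata: Dict[str, Any] = field(default_factory=dict)


# State event: the full current picture of the entity, like a database row.
state_event = Event(
    key="user-42",
    value={"user_id": "user-42", "name": "Adam", "country": "Canada"},
    metadata={"type": "UserState"},
)

# Action event: describes only what changed, nothing else about the user.
action_event = Event(
    key="user-42",
    value={"from_country": "Canada", "to_country": "United Kingdom"},
    metadata={"type": "UserMoved"},
)


def partition_for(event: Event, num_partitions: int) -> int:
    """Send events with the same key to the same partition (toy hashing)."""
    if event.key is None:
        return 0  # keyless events: placement is arbitrary in this sketch
    return sum(event.key.encode("utf-8")) % num_partitions
```

Because both events share a key, `partition_for` places them in the same partition, which is the co-location benefit of keys mentioned above.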
David Brown
Okay. All right. Now we're talking about event-driven architectures in a microservices setting. So how do the two relate?
Adam Bellemare
So, I'm trying to figure out how to frame this without getting too far down the rabbit hole. So, how does this relate? How do these two relate? So, I would say one of the big things about event-driven architectures today is that we have an opportunity to do things a bit differently than we used to do. And this has been influenced largely because of cloud computing; very cheap compute, very, very cheap disc, very cheap networking IO. And we have a new opportunity to actually communicate very large amounts of data very quickly between services. And this is something that we weren't really necessarily able to do before. So, for example, in an event-driven service that you may have built 20 years ago, let's say, you would probably use some sort of action event.
So, let's say Adam moved, I moved house. You would send me “Welcome to the United Kingdom. Here's some information you need,” sort of package, but that would trigger, very specifically, on that sort of edge. And so you would publish an event that's very purpose-built and almost somewhat intentional. And you would have a listener, a consumer on the other end that would receive it and do something with it. And then generally that stream would only be retained for maybe a few hours or a day, gets deleted and cleaned up. Now with modern event-driven brokers, event brokers, I should say. Event stream brokers. You can instead do it a bit differently. You could publish my facts and say, “This is Adam. This is all the information we have about him.” And then when I move, you can publish a whole updated one.
And you say anyone who cares about anything about users can listen to the stream and come up with their own logic. Disc is cheap, so you can maintain your own state. If you care about where Adam's moving, you could keep “country,” store that in your service. And then, whenever you see countries change, you can do whatever you want with it. But the big difference there is that the system that writes the data that produces it is producing what I would call a general purpose event, and then leaves it up to the consumers to do whatever they will with it. And that's a big change in terms of boundaries and in terms of responsibilities that weren't really available to us based on the technologies we had, say, 20 years ago.
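A minimal sketch of that consumer-side logic, assuming state events arrive as `(key, value)` pairs with a `country` field. The class name, the callback and the sample stream are hypothetical, and the in-memory dictionary stands in for whatever state store a real service would use.

```python
from typing import Any, Callable, Dict, List, Tuple


class CountryChangeConsumer:
    """Keeps only the field it cares about and reacts when that field changes."""

    def __init__(self, on_change: Callable[[str, str, str], None]) -> None:
        self._country_by_user: Dict[str, str] = {}
        self._on_change = on_change

    def handle(self, key: str, value: Dict[str, Any]) -> None:
        new_country = value["country"]
        old_country = self._country_by_user.get(key)
        # The producer never said "Adam moved"; the consumer derives it.
        if old_country is not None and old_country != new_country:
            self._on_change(key, old_country, new_country)
        self._country_by_user[key] = new_country


moves: List[Tuple[str, str, str]] = []
consumer = CountryChangeConsumer(lambda user, old, new: moves.append((user, old, new)))

# General-purpose state events; only the last one changes the country.
stream = [
    ("user-42", {"name": "Adam", "country": "Canada"}),
    ("user-42", {"name": "Adam", "country": "Canada", "email": "new@example.com"}),
    ("user-42", {"name": "Adam", "country": "United Kingdom"}),
]
for key, value in stream:
    consumer.handle(key, value)
```

After the loop, `moves` holds a single entry for the move from Canada to the United Kingdom. The producer published the same kind of event all three times; deciding that a country change matters was left entirely to the consumer.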
David Brown
Okay, and there's something you talked about in your book with reference to Conway's Law, which we've talked about a few times in our podcast before, where Conway's Law says the architecture of an application ends up reflecting organisational structure and process. Does this also tend to apply, do you find, when people are building event-driven microservices?
Adam Bellemare
Absolutely. Yes. So one of the things that I really like about the way Conway's Law is phrased, and I'm also sort of paraphrasing from memory here, I don't have it up in front of me, but it discusses how the communication structure of the company, of the organisation, of the team influences the design. So, it's not just the team structure and it's not just how the teams are divided up. And I like to sort of deconstruct this. If you think about a business – let’s just cut to the chase – you think about a business that has like a monolith, one big monolith. What they've done is they've taken what the business does well, the sort of means of communication they have between different teams and they've written it in code.
And that code also has a bunch of data associated with it. And so, that whole framework there is its own communication structure. And that framework itself has a certain amount of data gravity. So, it sort of makes it like, if you want to do something new, well, you're going to need to go where all your data is and all your data is pulled into here because all of your logic is related to it. And so you sort of have this technical structure. You could call it your legacy code. Sorry, I don't mean to say legacy. But basically you have like this preexisting, hardened structure that really influences all of the other things you can do.
And so one of the things with event-driven microservices is that we follow Conway's Law at the team level, and say, one service will be owned by one team. And, you know, the Two Pizza Rule is obviously a great one. You don't have ridiculous boundaries. You don't have way too many people on there. But we're also looking at changing the relationship between the users building services and where they get their data from. And so we're renegotiating that communication structure in Conway's Law and figuring out how we make that work for us instead of sort of working against it.
David Brown
Are you saying that's a prerequisite to implementing event-driven microservices? It’s thinking about your organisational structure and breaking it down into these Two Pizza teams?
Adam Bellemare
I think it's about being aware of it. And it's mostly being aware of where you are and how that affects where you can go next, not where you can go in a long time from now. It's always hard enough to plan that far ahead. But like, for example, if you have a desire to break out of a more monolithic approach, I mean, I'm sure you've had speakers. I know you've had speakers on here before talk about this. So I'll spare you like the heavy detail or the details a bit too much, but basically you're trying to find what are some low hanging fruit? What are the things you can modularize? What are some of the things you can work on and learn from and win on?
But a lot of these learnings come with, what do you do with the data? Because if you have a monolith – and there's nothing wrong with monoliths, monoliths are great – but if you have one that’s getting too big and you need to split it up, you don't usually have a really neat seam, because you'll often have a thing where this module clearly writes this data, but then there's four or five other modules in there that need to access it. So, if you pull the writer out, well, what about the readers? Where do they get their data from? And so you sort of end up with this almost like a loose thread on a shirt and you start pulling it. And then before you know it, the whole shirt's gone. You sort of have this problem of what you do with your data and how you make data easy to get access to. And that's really, I think, the crux of doing event-driven microservices as well.
David Brown
Well, it's interesting, because I wanted to ask you about that a little bit later: legacy system migration and how it relates to microservices. So, let's just dive into that whilst we are talking about it now. So, what approach are you saying that people should take? You mentioned this term “data liberation” in your book. So, what does it mean to liberate data in a legacy system migration and what approach should people be taking?
Adam Bellemare
Right. So, just for all of our listeners here, a lot of my experience is with Apache Kafka and Kafka Connect quite explicitly. So, I know there are other options out there, but I'm going to be speaking primarily about these just because this is what I've used for years and years. But I think one of the important things is that there's sort of like the high-minded ideals and it's like, you should have these well-formed streams and they should be clean and available and everything will be great and rosy. But then there's also reality where, you know, you have maybe like four weeks or six weeks to do something useful and you know, we're prototyping a microservice, do we even want to invest in this?
And so, data liberation is basically getting data out of an existing system, an existing operational system, quite usually, into event streams so that you can try to build a microservice. So, it's sort of like the event-driven equivalent, I guess [of], I don't want to say it's entirely like the Strangler Fig Pattern, but it's similar. What you're doing is you're tapping into the data, making that available, and then you can build some services off of it. Now, will it be exactly what you want? Maybe, maybe not. But what it'll do is it'll get you into it and it'll get you into it pretty quickly. And so I mentioned Kafka Connect because out of the box, you can set it up if you're using… Most people are using the cloud nowadays. So, I mean, requisition some machines. I know that they have marketplaces sometimes and someone will have one available and you can download it and run it, et cetera.
But you can get started pretty quick on it. You can set up some connectors. So, let's say you're with a MySQL database, you can set up some connectors. You can get Debezium. I'm really quite fond of Debezium. I'm fond of it because you can tail the binary log such that when a change is made to your database, you get that event in your stream seconds later, or you can even tighten it up a bit more if you really want. So you start getting a taste of this near real time availability of data, and you can say, “You know what, let's see if these microservices – these event driven ones – actually help us.” And you haven't had to spend two years building a platform to do it first.
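To give a feel for what a new microservice does with liberated data, here is a rough sketch that replays change events into a local copy of a table. The events are a simplified imitation of a Debezium-style change envelope (`op`, `before`, `after`); real messages carry more fields and may be wrapped differently depending on the converter in use. The `id` primary key column, the sample rows and the `apply_change` helper are assumptions made for illustration.

```python
from typing import Any, Dict

Row = Dict[str, Any]


def apply_change(table: Dict[int, Row], change: Dict[str, Any]) -> None:
    """Apply one simplified change event to an in-memory copy of the table."""
    op = change["op"]
    if op in ("c", "r", "u"):      # create, snapshot read, update
        row = change["after"]
        table[row["id"]] = row
    elif op == "d":                # delete
        row = change["before"]
        table.pop(row["id"], None)
    else:
        raise ValueError(f"unknown op: {op!r}")


# Hypothetical changes captured from an inventory table.
changes = [
    {"op": "c", "before": None, "after": {"id": 1, "sku": "A-100", "qty": 5}},
    {"op": "u", "before": {"id": 1, "sku": "A-100", "qty": 5},
     "after": {"id": 1, "sku": "A-100", "qty": 3}},
    {"op": "c", "before": None, "after": {"id": 2, "sku": "B-200", "qty": 9}},
    {"op": "d", "before": {"id": 2, "sku": "B-200", "qty": 9}, "after": None},
]

replica: Dict[int, Row] = {}
for change in changes:
    apply_change(replica, change)
```

Once the loop finishes, `replica` contains only row 1 with its updated quantity, so the new service has its own copy of the data without querying the legacy database directly.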
David Brown
Yeah. The holy grail, we gotta work towards it. So, let's talk about the architecture of implementing microservices, in particular, the distinction between asynchronous and synchronous microservices. Can you give us a rundown of the difference between the two and why you're advocating asynchronous event-driven microservices?
Adam Bellemare
So, I'm going to work backwards in that. So, why am I advocating asynchronous over synchronous? Couple of reasons. So, first off, I need to express that both of these have their pros and cons and there are tradeoffs.
David Brown
And I think you should also mention what is an asynchronous versus synchronous, just for those that aren't familiar with the two.
Adam Bellemare
Okay. Yeah, let’s disambiguate that right now. So, it's actually kind of hard, when you put pen to paper, to describe it properly. So, synchronous is what I also call a request-response. So a service makes a request and then awaits a response to move on with its work. Now, there's a bit of a caveat there because there's also ways where you can do non-blocking requests and your server isn't technically waiting, but your client’s waiting for the other server to make a response to them and then pass it back. So, you know, there's also some asynchronous communication elements in there, and I'm sort of trying to gloss those over because the idea is really, I ask a server to do something on my behalf, and then eventually it comes back to me and says, “Here's your information.” That's a synchronous request-response.
In an asynchronous event-driven microservice, we're communicating through an event stream. So, again, you could still do a pattern where you send an event to a service, and then it sends you a reply back over streams. You can still do that. But you can also do unidirectional, where you're publishing important business facts and that service that's publishing it… I'm not going to say it doesn't care what its downstream people are doing, but it's trusting them that they'll use that data for some intelligent, fair purpose. And that's it. Like, the service doesn't actually care about what's going on there. So with those two sorts of services, you have different sorts of patterns. So synchronous ones, I think it's worth noting that there's a lot of companies that have used them very well. And service mesh is an architectural paradigm that works well; lots of companies have done it. Some of the downsides there can be the operational concerns, such as what if you have a very sudden load, you know? Can you scale up in response or do people start timing out? You can also have sort of fan-out, or calls that are maybe chained too deeply. So, server A calls B that calls C that calls D that calls E and then E fails. And so, you can sort of get these complex distributed couplings synchronously through your services.
David Brown
And just to be clear, service mesh is an alternative implementation for microservices versus this event-driven architecture. That's why you mentioned the service mesh approach takes the synchronous approach, but you are advocating the asynchronous approach. So tell us why.
Adam Bellemare
Yes. So, the reason I'm advocating the asynchronous approach is that one of the things that the synchronous one does is the synchronous one effectively says, “We’ll give you business-centric functions for you to call and we’ll do work on your behalf.” The one I'm advocating is asynchronous in the sense that, “We’ll give you the read-only data you need to make your decisions for your business purposes.” So, instead of providing you with functions to call, it's providing you with streams of readily usable data to make your own decisions on. So they’re very different in that regard. But I think that would probably be like the most distinct difference.
David Brown
Is there a scenario where you would use both?
Adam Bellemare
Oh, absolutely. Yeah. I would actually be honestly surprised if someone's going to use only one, I'd be like, “I'm sorry.” I mean, one of the examples, like if I'm gonna do centralised authentication of a user, I'm probably just gonna use a synchronous service. Like it's bread and butter for that. There's some use cases that are just really phenomenally straightforward. So, I would say, yeah, you're going to use both.
David Brown
Okay. So, they both have their advantages and disadvantages and can work together. Let's talk about the contract and if a contract exists in this space. So in the world of APIs, we often talk about a contract that governs how the data is exchanged between the publisher and consumer. Now, you've talked about this decoupling of publishing messages to an event stream, and it doesn't really care who those consumers are and what they're going to do with the data. So, in an event-driven architecture, is there still the concept of a contract between the parties?
Adam Bellemare
So there is. There is still the concept. So basically the way it works is that, in a synchronous service world, you would have an API spec: these are functions you can call, parameters to pass in, expected values, and some documentation. Now, in an event-driven world where your consumers are coupling on event streams, the event streams themselves are the API. And so what that means is it's an essential necessity. They have to have a strong, well-defined schema, and I'm fairly partial to Apache Avro. But I've been using Protobuf a bunch lately. And honestly, they're both fantastic. The reason why they're important is that they enforce types. They enforce whether a field is mandatory or not. They enforce whether something could be nullable or not. And they also provide the ability to add documentation, add comments, and again, depending on which particular format you're using, there's some other nuances on what you can and can't do.
But what that gives is the consumer gets this ability to say, “Okay, I want to read from this event stream, I need to know, what does the schema look like? Show me the schema,” and they can see the schema. And not only can you see it, but in a well-supported microservice world, ideally, you can push a button that'll generate code off that schema. So you can get, like, if you're using Java, you get your class file definitions. You can generate test code. You can basically automate sort of that coupling between what the event streams have and how that would integrate in with your microservice.
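As a rough illustration of what a schema enforces, here is a hand-rolled check written as a plain dictionary and a small function. It is not the API of any schema library; a real project would lean on its schema format's own tooling and generated code. The `InventoryItem` schema, its field names and the `validate` helper are hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical schema: each field has a type, a nullability rule and documentation.
INVENTORY_SCHEMA: Dict[str, Any] = {
    "name": "InventoryItem",
    "fields": [
        {"name": "sku", "type": str, "nullable": False, "doc": "Stock keeping unit."},
        {"name": "quantity", "type": int, "nullable": False, "doc": "Units in stock."},
        {"name": "warehouse", "type": str, "nullable": True, "doc": "Location, if known."},
    ],
}


def validate(event: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of contract violations; an empty list means the event conforms."""
    errors: List[str] = []
    for spec in schema["fields"]:
        name = spec["name"]
        if name not in event:
            errors.append(f"missing field: {name}")
            continue
        value = event[name]
        if value is None:
            if not spec["nullable"]:
                errors.append(f"field is not nullable: {name}")
            continue
        if not isinstance(value, spec["type"]):
            errors.append(
                f"wrong type for {name}: expected {spec['type'].__name__}, "
                f"got {type(value).__name__}"
            )
    return errors


good = {"sku": "A-100", "quantity": 3, "warehouse": None}
bad = {"sku": None, "quantity": "3"}
good_errors = validate(good, INVENTORY_SCHEMA)
bad_errors = validate(bad, INVENTORY_SCHEMA)
```

The `good` event passes with no errors, while `bad` is reported for a null mandatory field, a wrong type and a missing field. A producer that runs this kind of check before publishing is holding up its side of the contract.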
David Brown
So, in that regard, it's pretty much the same as a synchronous API, where you have a contract between the parties because the consumer has built services relying on that schema. And so if you break the API, or if you break the schema of an event-driven architecture, you are going to break those consumers. So, whilst you don't care about those consumers, really, you very much do in terms of you have a contract with them, in terms of the way they're going to consume data. Is there a mechanism to change that contract, to break the contract, to version your schema?
Adam Bellemare
Yeah, so that's true. So, the other part of your component is the social part of the contract, which is of course, the social contract, right? I'm going to add a bit more to this. One of the more recent technical paradigms is data mesh. And one of the things that's great about data mesh – and I won't go too deep into it – but like, it talks about how data needs to be well-supported, how there needs to be a means of communication between the users of the data and those who own it. There needs to be a well-defined schema, clear metadata, clear expectations, and also processes for if you're going to break something. So, that's about as much as I'll say about it, but what I really like about it is that it talks about and codifies and looks at this precise problem here.
And so, how would you do this in reality, if you need to break your schema? Well, you gotta find your consumers. And sort of a quick tip for that would be, if you're doing a microservice world, from day one you need to have well-defined service identities. And if any service wants to couple on an event stream, it needs to have access granted through access control and kept in a list. So then you can go, “Oh, who's consuming from this stream? It's these eight services. Let's go find the owners.” We're gonna email them, or, you know, whatever you do, get 'em in a room and say, “Listen, we have to break the schema and you're gonna be impacted. So let's figure this out.”
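The lookup described here can be sketched as a small registry, assuming every service has an identity with a known owner and that read access to a stream is granted explicitly. The class, the service names and the email addresses are all made up for illustration; in practice this information would live in the broker's access control system.

```python
from collections import defaultdict
from typing import Dict, Set


class StreamAccessRegistry:
    """Toy access control list: which service identities may read which stream."""

    def __init__(self) -> None:
        self._owner_by_service: Dict[str, str] = {}
        self._readers_by_stream: Dict[str, Set[str]] = defaultdict(set)

    def register_service(self, service: str, owner_email: str) -> None:
        self._owner_by_service[service] = owner_email

    def grant_read(self, service: str, stream: str) -> None:
        if service not in self._owner_by_service:
            raise KeyError(f"unknown service identity: {service}")
        self._readers_by_stream[stream].add(service)

    def owners_to_notify(self, stream: str) -> Dict[str, str]:
        """Map each consuming service of a stream to the owner to contact."""
        readers = self._readers_by_stream.get(stream, set())
        return {service: self._owner_by_service[service] for service in sorted(readers)}


registry = StreamAccessRegistry()
registry.register_service("search-indexer", "search-team@example.com")
registry.register_service("stock-forecaster", "forecast-team@example.com")
registry.register_service("billing", "billing-team@example.com")
registry.grant_read("search-indexer", "inventory")
registry.grant_read("stock-forecaster", "inventory")
registry.grant_read("billing", "orders")

impacted = registry.owners_to_notify("inventory")
```

Here `impacted` lists the two services reading the `inventory` stream along with who to email before a breaking schema change, and `billing` is left alone because it never had access to that stream.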
David Brown
So I'm glad you raised that, because I wanted to talk about some of the patterns and workflows you've talked about in your book for building these microservices. You talk about the choreography pattern, orchestration and distributed transactions. You know, these are all big subjects in their own right. But can you briefly run us through what these patterns are?
Adam Bellemare
Right. So choreography is what I would call the simplest; it's the least restrictive. We already sort of talked about that when I said, you know, there's a service that might publish important business facts. It holds up its end of the contract, but it generally leaves it up to the consumers to figure out what they want to use it for. A good example of this, just because everyone always uses e-commerce, I'll just keep using that: if you're publishing inventory, ever-updating inventory, this could feed search engine functionality, you know, show me items that are in stock. It could feed the service that actually monitors our inventory. You know, what do we have available? Maybe it makes predictions based on how quickly inventory's depleting, right? And you could have a third service that actually maps it to where it may be in the warehouse, but none of them need to tell the producer, you know, this is what I'm doing.
The producer is like, you know, good on you, and that's choreography. So each service is sort of acting independently, but they do have dependencies, coupled through the reliance on the data. Orchestration is, well, if you're familiar with any form of distributed computing, an orchestrator usually indicates some sort of centralised component that needs to make sure other various components sort of stay in a consistent state. And that's exactly the same as it is here. So, for example, let's say we want to do payments. Someone wants to order something, we reserve the inventory and then we take their payment, but in that order. So if we take their order, but we don't have the inventory, then we gotta roll back and reject the order. And similarly if you go down to payments. So the orchestrator is very purpose-built for a specific business process, and its role, its responsibility, is to issue commands to specific services that it's fairly tightly coupled to, and then await responses.
And so the orchestrator needs to keep state about where it is in that process. It needs to be durable. It needs to be able to handle at-least-once processing. If the whole thing crashes to the ground, you need to be able to bring it back and resume, you know, both rolling back certain ones and moving other ones forward. So that's orchestration. And sorry, what was the last one? Distributed transactions is the other. Okay, yes. So yeah, distributed transactions. I think anytime anyone ever talks about them, the first thing they say is, you know, try to avoid them if possible. But yeah, the orchestrator pattern is a fairly good way to do them if you're going to do them. Now, I think the caveat here is that part of it depends, I guess, on how fast you need to go, because if you're going to be doing event-driven for your commands and your responses, and perhaps one service is lagging or slow or whatnot, it can have sort of a knock-on effect on how these distributed transactions are working.
So it's one of those things where if you're going to do it, you would probably need to figure out if it's in a critical path or if you can tolerate some delay. If it is in a critical path and it's customer-facing and it needs to be distributed, you know, I would sort of want to know, is there a way we can simplify it? Is there a way maybe that we can merge some services together, that we can bring some locality back towards it to sort of simplify and reduce it? Because honestly the more moving parts you have, the more things can go wrong. So simplify, simplify, simplify would be my advice.
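The order example can be sketched as a tiny orchestrator that issues commands in a fixed sequence, tracks where each order is, and rolls back the inventory reservation if payment fails. This is a simplification under stated assumptions: the commands are plain in-process function calls rather than events sent to other services, and the state lives in an in-memory dictionary, whereas a real orchestrator would need durable state and at-least-once handling. All names are hypothetical.

```python
from typing import Callable, Dict, List

Step = Callable[[str], bool]


class OrderOrchestrator:
    """Drives one business process: reserve inventory, then take payment."""

    def __init__(
        self,
        reserve_inventory: Step,
        take_payment: Step,
        release_inventory: Callable[[str], None],
    ) -> None:
        self._reserve_inventory = reserve_inventory
        self._take_payment = take_payment
        self._release_inventory = release_inventory
        # Where each order is in the process (in memory only in this sketch).
        self.state: Dict[str, str] = {}

    def place_order(self, order_id: str) -> str:
        self.state[order_id] = "STARTED"
        if not self._reserve_inventory(order_id):
            self.state[order_id] = "REJECTED"
            return self.state[order_id]
        self.state[order_id] = "INVENTORY_RESERVED"
        if not self._take_payment(order_id):
            # Compensate: undo the earlier step before rejecting the order.
            self._release_inventory(order_id)
            self.state[order_id] = "REJECTED"
            return self.state[order_id]
        self.state[order_id] = "COMPLETED"
        return self.state[order_id]


released: List[str] = []
orchestrator = OrderOrchestrator(
    reserve_inventory=lambda order_id: True,
    take_payment=lambda order_id: order_id != "order-2",  # payment fails for order-2
    release_inventory=released.append,
)
result_ok = orchestrator.place_order("order-1")
result_failed = orchestrator.place_order("order-2")
```

The first order completes, while the second is rejected and its reserved inventory is released. Even in this toy form, the orchestrator is tightly coupled to the two steps it commands, which is the trade-off against choreography.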
David Brown
Okay. Good advice. We briefly touched on this earlier, when you mentioned that you do have a Kafka background and orientation at Confluent, but is all of this around Kafka, or are there other solutions for building event-driven microservices? For example, we recently had a podcast which was a <inaudible> versus JMS SmackDown, in which, in the end, there was a lot of agreement that actually it wasn't necessarily either/or; they both had their own use cases, and both were in agreement. Is it the same case here? Is this only about Kafka, or are there other message brokers which also apply in this scenario? And if so, how do you know when to choose between the two?
Adam Bellemare
Right. Yeah. So, I mean, there are other broker options and this isn't something that's explicitly a Kafka thing. I know there's a number of alternatives available. I think the big thing for selecting comes down to a couple of things. One, do you need durability? So what that means is, if I write to this event stream, how long can I keep that message or that event in there? Can I keep it indefinitely? 'Cause if I can keep it indefinitely, then I can use that event stream as the source of my data. And I can use, say, a Kappa architecture to process historical data whenever I want. So that will limit some of your choices. For example, AWS Kinesis at the moment, I think, has a one-year maximum retention.
It used to be, I think, two weeks; now it's one year, but they don't care how much data you store. So the retention limit there, I don't think it's a technical reason. I don't, you know, I don't want to guess too much in there. But that would prevent you from, say, if I registered now, in two years from now, would me registering still be in that event stream? The answer is no, it would've aged out. If you need queues, if you need queue-based functionality or individual acknowledgement, I mean, with Kafka, we do have some options that can do that, but other brokers might be better suited, especially if you just want work queues, for example; you may want to go with a very queue-specific one. And if you're looking to do more event-driven communication where you don't really wanna communicate sort of state, but you'd rather just communicate action events and say, you know, this changed.
And then I want like a simple Lambda function to react to this, because that's the scale of where your business is right now. There's lots of simpler options where you are just gonna be running, you know, sort of like a minimalist message router that's entirely ephemeral or volatile. And so you do have lots of these different kinds of options, but it kind of depends on what it is that you need, because what I'm advocating is much more about providing data as building blocks. So you can build any kinds of services you need through time.
David Brown
Well, the book is called Building Event-Driven Microservices: Leveraging Organizational Data at Scale. Adam, how can our listeners learn more about you and your book? Where can they follow you on social media?
Adam Bellemare
Oh, yes. So my book is through O'Reilly and I honestly really like their learning platform.
David Brown
It's very good. Yeah.
Adam Bellemare
I'm quite biased I guess. But, I go on it all the time. Honestly, that's a great place to start. I believe they have, you know, they're not paying me to say this, but I believe they actually have a one month free subscription for two weeks or something. So, if you haven't already done it, that'd be your easiest way to get there and see what else they have available because there's lots of really great material. Of course, you can buy the book in paper if you're more like me and you like the physical, tactile turning of the pages. I know it's on Amazon at the very least. Honestly, I'm not entirely really sure. I haven't purchased my own book, so I'm not…
David Brown
It’s all for sale after the show. Google yourself. Are you active on social media?
Adam Bellemare
I'm on Twitter, yeah. Just “@AdamBellemare” on Twitter. I think I'm the only one. I try to tweet. I'm a fairly recent account, I think I signed up in 2020, so I'm pretty late to the game. But you know, I found some good technical people that I like to follow, so yeah, it's kind of nice. You can curate your feed just to see the things you want and, you know, keep the blood pressure low.
David Brown
Great stuff, Adam. Thanks for joining us today!
Adam Bellemare
Thank you for having me! I really enjoyed it.