Why should I care about Event Sourcing?
The Event Sourcing pattern is an extremely important design pattern for designing applications based on a microservice architecture.
This post does not go into detail about the advantages of event sourcing over the old-school approach of persisting state by mapping class instances to database table rows via object-relational mapping (ORM) frameworks. Either you believe me or you have to read Microservices Patterns, chapter 6, Developing business logic with event sourcing. To give a short preview:
- “Object-Relational impedance mismatch. (…) There’s a fundamental conceptual mismatch between the tabular relational schema and the graph structure of a rich domain model with its complex relationships.”
- “Lack of aggregate history. Another limitation of traditional persistence is that it only stores the current state of an aggregate. Once an aggregate has been updated, its previous state is lost. If an application must preserve the history of an aggregate, perhaps for regulatory purposes, then developers must implement this mechanism themselves.”
- “Implementing audit logging is tedious and error prone. (…) Many applications must maintain an audit log that tracks which users have changed an aggregate. (…) The challenge of implementing auditing is that besides being a time consuming chore, the auditing logging code and the business logic can diverge, resulting in bugs.”
- “Event publishing is bolted on to the business logic. Another limitation of traditional persistence is that it usually doesn’t support publishing domain events. Domain events, (…) are events that are published by an aggregate when its state changes. They’re a useful mechanism for synchronizing data and sending notifications in microservice architecture. Some ORM frameworks, (…) can invoke application-provided callbacks when data objects change. But there’s no support for automatically publishing messages as part of the transaction that updates the data. Consequently, (…) developers must bolt on event-generation logic, which risks not being synchronized with the business logic.”
In addition, building self-healing and scalable applications on top of one or several databases that persist object-relationally mapped instances is very hard, if possible at all. As a consequence, the majority of modern microservice architectures use an event-based and event-driven approach to persist state and communicate state changes.
Event sourcing in the microservice pattern context
Modern microservice architectures either use a separate database per service to persist the service’s aggregates (Database per Service pattern), or several services share a database (Shared Database pattern). Usually, however, at least one service will have to use the Database per Service pattern. As a consequence, the old-school approach of relating object instances that correspond to separate ORM database table rows via many-to-many, many-to-one and one-to-one relationships does not work anymore. Instead of being references to other database table rows, IDs in event sourcing are simple string values.
Database transactions that manipulate state in a single database using the old-school ORM approach are atomic, consistent, isolated and durable (ACID). In short, atomicity means that either all or none of the database operations occur. Consistency means that database operations may change the affected data only in allowed ways. Isolation determines how and when the changes made by one transaction become visible to other concurrent transactions. Durability means that committed changes survive permanently. In a microservice architecture, state changes that span several services are by definition not isolated: such distributed transactions are not ACID but only ACD. The Saga pattern provides countermeasures for the lack of isolation (Microservices Patterns, chapter 4.3, Handling the lack of isolation); the remaining three properties are ensured as follows:
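To illustrate the last point, here is a minimal sketch in Python. The class and field names (Customer, Order, customer_id, find_customer) are made up for this example and do not come from any framework. The order keeps the customer's ID as a plain string, and resolving it is an explicit lookup that stands in for a call to whichever service owns the customer data.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Customer:
    customer_id: str
    name: str


@dataclass(frozen=True)
class Order:
    order_id: str
    customer_id: str  # plain string value, no foreign key and no ORM relationship
    total_cents: int


def find_customer(order: Order, customers: Dict[str, Customer]) -> Optional[Customer]:
    # Stand-in for asking the customer service; the order itself only knows the ID.
    return customers.get(order.customer_id)
```

Nothing guarantees that the referenced customer exists, so the caller has to handle the None case itself. That is the price of dropping database-level referential integrity across services.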
- “Atomicity — The saga implementation ensures that all transactions are executed or all changes are undone.”
- “Consistency — Referential integrity within a service is handled by local databases. Referential integrity across services is handled by the services.”
- “Durability — Handled by local databases.”
This lack of isolation potentially causes what the database literature calls anomalies. An anomaly is when a transaction reads or writes data in a way that it wouldn’t if transactions were executed one at a time. When an anomaly occurs, the outcome of executing sagas concurrently is different than if they were executed serially.
The lack of isolation can cause the following three anomalies:
- “Lost updates — One saga overwrites without reading changes made by another saga.”
- “Dirty reads — A transaction or a saga reads the updates made by a saga that has not yet completed those updates.”
- “Fuzzy/nonrepeatable reads — Two different steps of a saga read the same data and get different results because another saga has made updates.”
A Saga uses the following countermeasures for handling the lack of isolation:
- “Semantic lock — An application-level lock.”
- “Commutative updates — Design update operations to be executable in any order.”
- “Pessimistic view — Reorder the steps of a saga to minimize business risk.”
- “Reread value — Prevent dirty writes by rereading data to verify that it’s unchanged before overwriting it.”
- “Version file — Record the updates to a record so that they can be reordered.”
- “By value — Use each request’s business risk to dynamically select the concurrency mechanism.”
However, this post is not about the Saga pattern. In summary, the Saga pattern has to be used in a microservice architecture to compensate for the missing isolation of distributed transactions, and the update of the state and the publishing of event messages must be atomic. That’s where event sourcing comes into play.
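To make one of these countermeasures concrete, here is a small sketch of the "Reread value" idea in Python. It assumes a hypothetical OrderRecord that carries a version counter; the names OrderRecord, update_status and StaleDataError are invented for illustration. It is only an in-memory approximation: with a real database the check and the write would have to happen in one atomic statement.

```python
class StaleDataError(Exception):
    """Raised when a record changed between the saga's read and its write."""


class OrderRecord:
    def __init__(self, status: str) -> None:
        self.status = status
        self.version = 0  # incremented on every successful write


def update_status(record: OrderRecord, expected_version: int, new_status: str) -> None:
    # Reread the version right before writing and refuse to overwrite
    # changes another saga made in the meantime.
    if record.version != expected_version:
        raise StaleDataError(
            f"expected version {expected_version}, found {record.version}"
        )
    record.status = new_status
    record.version += 1
```

A saga step would remember the version it read, do its work and then call update_status with that version. If another saga got there first, the step fails and can be aborted or retried.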
What is event sourcing?
Event sourcing persists the state of a business entity such as an Order or a Customer as a sequence of state-changing events.
Instead of persisting the current state of its aggregates, an application stores the events that mutate each aggregate. The state of an aggregate at a given point in time is restored by replaying its history of events, either from the beginning of the event store or from a snapshot of the aggregate onwards. This has a major impact on the overall architecture of an application.
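The following Python sketch shows the idea in its simplest form. Everything in it is an assumption made for illustration: the in-memory EventStore, the event names (OrderCreated, OrderRevised, OrderCancelled) and the rebuild function are not taken from any of the frameworks discussed later.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Event:
    kind: str   # hypothetical event name, e.g. "OrderCreated"
    data: dict


class EventStore:
    """In-memory stand-in: one append-only list of events per aggregate."""

    def __init__(self) -> None:
        self._streams: Dict[str, List[Event]] = {}

    def append(self, aggregate_id: str, event: Event) -> None:
        self._streams.setdefault(aggregate_id, []).append(event)

    def load(self, aggregate_id: str, from_version: int = 0) -> List[Event]:
        return self._streams.get(aggregate_id, [])[from_version:]


@dataclass
class Order:
    status: str = "NONE"
    total_cents: int = 0
    version: int = 0  # number of events applied so far

    def apply(self, event: Event) -> None:
        if event.kind == "OrderCreated":
            self.status = "CREATED"
            self.total_cents = event.data["total_cents"]
        elif event.kind == "OrderRevised":
            self.total_cents = event.data["total_cents"]
        elif event.kind == "OrderCancelled":
            self.status = "CANCELLED"
        self.version += 1


def rebuild(store: EventStore, aggregate_id: str,
            snapshot: Optional[Order] = None,
            up_to: Optional[int] = None) -> Order:
    """Replay events, optionally starting from a snapshot or stopping early."""
    order = replace(snapshot) if snapshot is not None else Order()  # copy, don't mutate
    for event in store.load(aggregate_id, from_version=order.version):
        if up_to is not None and order.version >= up_to:
            break
        order.apply(event)
    return order
```

Calling rebuild(store, "order-1") replays the full history, rebuild(store, "order-1", up_to=2) yields the state after the second event, and passing an earlier result as snapshot replays only the events recorded after it.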
The architecture of event sourcing frameworks
Building an event sourcing framework from scratch is like reinventing the wheel, and of course we don’t want to do that. In addition, designing and implementing a reliable framework is no easy task: a lot has to be considered to ensure everything works as it should. For in-depth insight I highly recommend reading chapter 6, Developing business logic with event sourcing, of the book Microservices Patterns: With examples in Java.
Something important to point out is that event sourcing frameworks are based on a distributed commit log architecture, which differs significantly from traditional messaging technologies.
The previously referenced blog post compares some common distributed commit log technologies (Apache Kafka, Amazon Kinesis, Microsoft Azure Event Hubs, Google Pub/Sub). Of these technologies, only Apache Kafka can be self-hosted, which makes it the only one suitable for Kubernetes-native applications; the others are managed cloud services. In addition, the post points out important characteristics of distributed commit log technologies:
- Messaging guarantee
- Ordering guarantee
- Configurable persistence period
- Persisted storage
- Consumer Groups
- Disaster recovery
- Max. size of each data blob
- Push model support
- Programming languages supported
I won’t go into detail about what exactly to consider. What to consider, and which minimal or maximal bounds are reasonable, depends on the use case at hand. However, it’s important to know about these characteristics, the potential differences between technologies and their implications. Distributed commit log technologies are not event sourcing frameworks, but the characteristics of event sourcing frameworks are similar. Consequently, one can derive the characteristics relevant for event sourcing frameworks from distributed commit log technologies.
Applications using the Event Sourcing pattern build on the higher-level concept of Domain-driven Design (DDD), which introduces Entities, Value Objects, Aggregates, Domain Events, Services, Repositories and Factories, and depend on the Command Query Responsibility Segregation (CQRS) pattern and the Domain events pattern as well. In summary, the CQRS pattern is required because with event sourcing it’s not possible to query the state of an aggregate easily, since its state consists of several state-changing events. A “view database, which is a read-only replica that is designed to support that query” has to be implemented. “The application keeps the replica up to date by subscribing to Domain events (pattern) published by the service that owns the data.” The important thing to know is that DDD, Event Sourcing, Domain events and CQRS are used in the context of applications which store data in a traceable manner as events instead of as rows in database tables.
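To get a feeling for how a commit log differs from a traditional message queue, consider the following toy model in Python. It is a single-partition, in-memory sketch; the names CommitLog, poll and seek are invented here and are not the API of Kafka or any other product. Records are retained after being read, every consumer group tracks its own offset, and a group can rewind to replay history.

```python
from typing import Dict, List


class CommitLog:
    """Append-only log; reading a record does not remove it."""

    def __init__(self) -> None:
        self._records: List[str] = []
        self._offsets: Dict[str, int] = {}  # consumer group -> next offset to read

    def append(self, record: str) -> int:
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def poll(self, group: str, max_records: int = 10) -> List[str]:
        start = self._offsets.get(group, 0)
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)  # advance only this group
        return batch

    def seek(self, group: str, offset: int) -> None:
        if not 0 <= offset <= len(self._records):
            raise ValueError("offset out of range")
        self._offsets[group] = offset  # rewind to replay or skip ahead
```

Two groups, say "billing" and "shipping", each receive every record in append order, independently of each other. With a classic queue, a message is typically gone once one consumer has acknowledged it.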
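A rough Python sketch of such a view may help. The EventBus, the event shape (a dict with type, customer_id and total_cents) and the CustomerOrderView are assumptions for this example only. The view subscribes to domain events and maintains a read model shaped for exactly one query, the open order total per customer.

```python
from collections import defaultdict
from typing import Callable, Dict, List

DomainEvent = dict  # hypothetical shape: {"type": ..., "customer_id": ..., "total_cents": ...}


class EventBus:
    """Minimal in-process publish/subscribe, standing in for a message broker."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[DomainEvent], None]] = []

    def subscribe(self, handler: Callable[[DomainEvent], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: DomainEvent) -> None:
        for handler in self._subscribers:
            handler(event)


class CustomerOrderView:
    """Read-only replica answering one query: open order total per customer."""

    def __init__(self) -> None:
        self._totals: Dict[str, int] = defaultdict(int)

    def handle(self, event: DomainEvent) -> None:
        if event["type"] == "OrderCreated":
            self._totals[event["customer_id"]] += event["total_cents"]
        elif event["type"] == "OrderCancelled":
            self._totals[event["customer_id"]] -= event["total_cents"]

    def total_for(self, customer_id: str) -> int:
        return self._totals.get(customer_id, 0)
```

The command side only publishes events, and the query side never touches the event store. In a real system the view would live in its own database and lag slightly behind the events.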
Consequently, every event sourcing framework consists of the following components:
- a database to persist events/aggregate state changes (e.g. MongoDB),
- a connector to at least one and potentially several messaging (publish/subscribe) technologies (e.g. Kafka),
- a messaging technology (e.g. Kafka) chosen from the supported ones,
- potentially several client libraries implemented in different programming languages (e.g. C#, Java).
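How the first three components could fit together is sketched below in Python. All names (EventDatabase, Broker, Connector, ConcurrencyError) are hypothetical, and everything is in memory. Events are first appended to the database with a version check, and a connector then relays whatever has been stored to the messaging technology, so published messages cannot diverge from persisted state.

```python
from typing import Dict, List, Tuple


class ConcurrencyError(Exception):
    """Raised when an aggregate was changed by someone else in the meantime."""


class EventDatabase:
    """Stand-in for the database: a global, ordered list of stored events."""

    def __init__(self) -> None:
        self._events: List[Tuple[str, int, dict]] = []  # (aggregate_id, version, payload)

    def append(self, aggregate_id: str, expected_version: int, payload: dict) -> None:
        current = sum(1 for agg, _, _ in self._events if agg == aggregate_id)
        if current != expected_version:
            raise ConcurrencyError(f"{aggregate_id} is at version {current}")
        self._events.append((aggregate_id, current + 1, payload))

    def read_from(self, position: int) -> List[Tuple[str, int, dict]]:
        return self._events[position:]


class Broker:
    """Stand-in for the messaging technology: messages grouped by topic."""

    def __init__(self) -> None:
        self.topics: Dict[str, List[dict]] = {}

    def send(self, topic: str, message: dict) -> None:
        self.topics.setdefault(topic, []).append(message)


class Connector:
    """Relays stored events to the broker and remembers how far it got."""

    def __init__(self, database: EventDatabase, broker: Broker, topic: str) -> None:
        self._database = database
        self._broker = broker
        self._topic = topic
        self._position = 0

    def relay(self) -> int:
        pending = self._database.read_from(self._position)
        for aggregate_id, version, payload in pending:
            self._broker.send(self._topic, {
                "aggregate_id": aggregate_id, "version": version, "payload": payload,
            })
        self._position += len(pending)
        return len(pending)
```

The fourth component, the client library, would be the code that calls append on behalf of a service. A real connector would also have to persist its position and cope with failures between sending and recording progress, which this sketch ignores.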
The components of an event sourcing framework impact its overall characteristics. For example, a NoSQL database like MongoDB behaves differently from a SQL database. Keep this in mind! Of course, event sourcing frameworks may implement the patterns listed above in different ways. However, this post is not an evaluation of event sourcing frameworks; it is about learning their fundamental concepts when they are treated as a black box. Let’s have a look at our alternatives next.
Event sourcing frameworks
It’s important to understand that your selection of an event sourcing framework is in many cases not limited by the programming language you use for your microservices. If you use, for example, a C# microservice framework (Microdot), you can use an event sourcing framework implemented in Java (Axon) as long as the event sourcing framework provides a client implemented in C#. Because most event sourcing frameworks were historically used in enterprise environments first, the dominant implementation languages are Java and C#.
- Akkatecture (implemented in C#)
- Axon Framework (implemented in Java)
- Event Flow (implemented in C#)
- Event Store (implemented in C#)
- Eventhus (implemented in Go)
- EventSourcing (implemented in C#)
- eventsourcing (implemented in Python)
- Eventuate Local (implemented in Java)
- EquinoxProject (implemented in C#)
- Kant (implemented in Python) → unmaintained
- Revo Framework (implemented in C#)
- Spring Cloud Stream (implemented in Java)
- Watermill (implemented in Go)
To be updated
I’m planning to update this post on a regular basis. Feel free to visit it again soon.
Happy designing :)