It's very important to group microservices well. Otherwise you will end up with a mess. Several microservices together should form a larger unit of functionality.
Example: Image Functionality in a Microservice System
Let's approach the problem by example. Assume we want to build image processing functionality for our system. Each microservice should do exactly one thing, so we end up with the following microservices:
- image scaling
- image watermarking
- image storing
- image retrieving
All these image services are very tightly coupled: the storing microservice and the retrieving microservice use the same database and the same files. Of course it should be possible to use each of them in another context or on its own. In your own system, however, you will treat them as one image sub-system. This sub-system should therefore have its own abstraction, its own API, and this API is a microservice of its own.
None of the non-image microservices know anything about the individual image microservices. They only know about the image API microservice. This API can group together the typical use cases, for example resizing an image and watermarking it.
You don't want to have this code anywhere other than in your image microservices. Otherwise your whole microservice system becomes very tightly interconnected, which makes every internal change expensive. Imagine that all images are scaled and watermarked, and that you have performance issues because of high network traffic. One simple optimization is to merge the scaling microservice and the watermarking microservice. Then the image has to be sent over the network one time less. If you use an image API microservice, the code has to be changed only once and only one microservice needs to be redeployed. If, on the other hand, all other microservices used the internal image microservices directly, you would have to change the code in several places and redeploy several microservices.
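
The argument can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the names `ImageApi`, `TwoStepBackend` and `MergedBackend` are hypothetical, images are plain dicts, and the in-process method calls stand in for network calls between microservices.

```python
from typing import Protocol


class ImageBackend(Protocol):
    """What the image API needs from the internal image microservices."""

    def scale_and_watermark(self, image: dict, width: int, text: str) -> dict: ...


class TwoStepBackend:
    """Separate scaling and watermarking services: two network hops."""

    def __init__(self) -> None:
        self.network_calls = 0

    def _scale(self, image: dict, width: int) -> dict:
        self.network_calls += 1  # stands in for a call to the scaling service
        return {**image, "width": width}

    def _watermark(self, image: dict, text: str) -> dict:
        self.network_calls += 1  # stands in for a call to the watermarking service
        return {**image, "watermark": text}

    def scale_and_watermark(self, image: dict, width: int, text: str) -> dict:
        return self._watermark(self._scale(image, width), text)


class MergedBackend:
    """Scaling and watermarking merged into one service: one network hop."""

    def __init__(self) -> None:
        self.network_calls = 0

    def scale_and_watermark(self, image: dict, width: int, text: str) -> dict:
        self.network_calls += 1  # stands in for a single call to the merged service
        return {**image, "width": width, "watermark": text}


class ImageApi:
    """The only image entry point that other microservices know about."""

    def __init__(self, backend: ImageBackend) -> None:
        self._backend = backend

    def thumbnail_with_watermark(self, image: dict) -> dict:
        return self._backend.scale_and_watermark(image, width=200, text="(c) shop")
```

Callers only ever use `ImageApi.thumbnail_with_watermark`, so swapping `TwoStepBackend` for `MergedBackend` changes nothing for them.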
Abstraction like in a monolith
This abstraction and grouping of functionality is nothing other than modules in a monolith. In a monolith all the image functionality would be in a package "image". This package would have the following sub-packages: "scaling", "watermarking", "storing" and "retrieving". Just as in the microservice architecture, you don't want other parts of the monolith to use these sub-packages directly. Otherwise you get interconnected modules and it becomes hard to change internal details of the image implementation. Everything should sit behind a high-level interface, and everything else should be hidden and not used anywhere else. Then it's possible to rewrite the complete image package without adapting code in other packages. The only thing you need to take care of is to keep supporting the interface.
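
In a monolith such a rule can even be checked automatically. The following is a rough sketch, assuming a Python monolith with the package layout described above. The function name `forbidden_imports` is hypothetical, and the check only looks at absolute imports.

```python
import ast

# The sub-packages that only the image package itself may use.
INTERNAL = {"image.scaling", "image.watermarking", "image.storing", "image.retrieving"}


def _is_internal(name: str) -> bool:
    return any(name == pkg or name.startswith(pkg + ".") for pkg in INTERNAL)


def forbidden_imports(module_name: str, source: str) -> list[str]:
    """Return imports of image sub-packages made from outside the image package."""
    if module_name == "image" or module_name.startswith("image."):
        return []  # code inside the image package may use its own internals
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            candidates = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Covers "from image.scaling import x" and "from image import scaling".
            candidates = [node.module]
            candidates += [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        hit = next((c for c in candidates if _is_internal(c)), None)
        if hit is not None:
            found.append(hit)
    return found
```

Run over every source file in a unit test, a check like this fails the build as soon as another package reaches past the high-level interface.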
I find it very appealing that this concept of abstraction applies to a monolith as well as to a microservice architecture. If the concepts were completely different, something would most likely be wrong.
The module abstraction is often violated in monoliths, and the complete codebase ends up very interconnected. That's bad, but it's manageable, because you have the complete code in your IDE with all its useful refactoring tools. With separate codebases, however, it's very hard to refactor anything. So do it right in the first place and don't postpone it.
Microservices Grouped by Domains
One way to group the system is by domain. A domain is a (mostly) independent part of the system. It's very important that you get your domains right. If you don't, nearly every new feature will require changes in several domains, and that should be the exception. For example, the user part of a system consists of login, registration, password reset mail, user profile and so on. These basic user functions are more or less the same in every system. Therefore the user-sub-system should also be usable in any other system. This only works if the domain is completely independent. The domain is not allowed to have a dependency on any other domain. Otherwise you have to remove those dependencies before the user-sub-system can be used in another system.
Another domain could be the product-domain. This domain stores all the data of a product: price, stock count, description and so on. The user-domain and the product-domain must be independent of each other.
How to deal with features that require two domains
Let's approach the problem with an example again: a shopping cart. A shopping cart is individual to each user. Therefore it must be in the user-domain. The problem is that you also need data from the product-domain, because the shopping cart is quite useless if you can't see the details of the stored products. But since the user-domain isn't allowed to have a dependency on the product-domain, how do you solve this?
Actually it's quite simple: with an aggregation microservice. In the user-domain all stored shopping cart data is related to the user, like added-date, user-id and so on. There is only one exception: the product-id is stored as well.
The user-domain provides an API to get all items of a specific user. This API is consumed by an aggregation microservice. The aggregation microservice collects the product-ids and then fetches the product details from the product-domain. The data is merged, and then the shopping cart can be displayed or the data can be passed on to the next microservice. That way you can implement the shopping cart feature while both domains stay independent.
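
As a minimal sketch, the aggregation step could look like this. The function names and the sample data are made up for illustration, and the two lookup functions stand in for what would be HTTP calls to the domain APIs in a real system.

```python
# Stand-ins for the two domain APIs (hypothetical data and names).
CART_ITEMS = {  # user-domain: only user-related data plus the product-id
    42: [
        {"product_id": "p1", "added": "2024-01-03"},
        {"product_id": "p2", "added": "2024-01-05"},
    ]
}
PRODUCTS = {  # product-domain: everything about a product
    "p1": {"name": "Mug", "price": 7.5},
    "p2": {"name": "Lamp", "price": 30.0},
}


def get_cart_items(user_id: int) -> list[dict]:
    """User-domain API: all cart items of one user."""
    return CART_ITEMS.get(user_id, [])


def get_products(product_ids: list[str]) -> dict[str, dict]:
    """Product-domain API: details for a batch of product-ids."""
    return {pid: PRODUCTS[pid] for pid in product_ids if pid in PRODUCTS}


def shopping_cart(user_id: int) -> list[dict]:
    """Aggregation: neither domain calls the other; only this function knows both."""
    items = get_cart_items(user_id)
    products = get_products([item["product_id"] for item in items])
    # Merge cart data with product details; skip items whose product no longer exists.
    return [
        {**item, **products[item["product_id"]]}
        for item in items
        if item["product_id"] in products
    ]
```

Note that the product details are fetched in one batch call instead of one call per item, which keeps the number of network round trips constant.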
The aggregation microservice/layer is nothing more than a Lego™ block that connects two other Lego™ blocks. If you use third-party APIs you do exactly the same: you create some kind of Lego™ block that aggregates the data from several third-party APIs. It helps to treat your own microservice APIs as if they were third-party APIs, because then you will get the independence of the microservices right.
Performance Issue with the Aggregation layer
Let's assume that it's common for a shopping cart to have thousands of items. Additionally you need to support sorting the items in a shopping cart by price. Then the aggregation microservice has to load thousands of items from the user-domain, load thousands of product details from the product-domain, merge the data, sort it by price and then throw away everything except the 100 items that will be shown to the user (pagination). That's very inefficient.
The only way to solve this problem is data duplication. The price of the item has to be stored in the shopping cart database as well. Then you can do the pagination in the database and fetch only the right 100 items.
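
Here is a small sketch of what that pagination could look like, using SQLite from the Python standard library. The table name `cart_item`, its columns and the choice to store the price as integer cents are my assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The price is a duplicated copy of the value owned by the product-domain.
conn.execute(
    "CREATE TABLE cart_item (user_id INTEGER, product_id TEXT, price_cents INTEGER)"
)
# The index lets the database sort and page per user without scanning everything.
conn.execute("CREATE INDEX idx_cart_user_price ON cart_item (user_id, price_cents)")
conn.executemany(
    "INSERT INTO cart_item VALUES (?, ?, ?)",
    [(42, f"p{i}", (i * 37) % 1000) for i in range(5000)],  # 5000 dummy items
)


def cart_page(user_id: int, page: int, page_size: int = 100) -> list[str]:
    """Return the product-ids of one page, sorted by price inside the database."""
    rows = conn.execute(
        "SELECT product_id FROM cart_item WHERE user_id = ? "
        "ORDER BY price_cents, product_id LIMIT ? OFFSET ?",
        (user_id, page_size, page * page_size),
    ).fetchall()
    return [row[0] for row in rows]
```

The aggregation microservice then only has to fetch the product details for the 100 product-ids returned by `cart_page`.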
That means you have to supply the additional data whenever an item is put into the shopping cart. In addition, the data in the shopping cart database must be updated when the price changes in the product database. For that you need events. When a product is updated, an event must be published. An event listener picks up this event and updates the price in the shopping cart database.
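
A sketch of such an event listener, again with SQLite. The event class `ProductPriceChanged`, the handler name and the `cart_item` table are hypothetical, and how the event reaches the handler (message queue, broker and so on) is left out on purpose.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductPriceChanged:
    """Event published by the product-domain when a price changes."""

    product_id: str
    new_price_cents: int


def handle_price_changed(conn: sqlite3.Connection, event: ProductPriceChanged) -> int:
    """Update the duplicated price in the cart database; return the affected rows."""
    cursor = conn.execute(
        "UPDATE cart_item SET price_cents = ? WHERE product_id = ?",
        (event.new_price_cents, event.product_id),
    )
    conn.commit()
    return cursor.rowcount


# Small usage example with an in-memory cart database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cart_item (user_id INTEGER, product_id TEXT, price_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO cart_item VALUES (?, ?, ?)",
    [(1, "p1", 750), (2, "p1", 750), (2, "p2", 3000)],
)
updated = handle_price_changed(conn, ProductPriceChanged("p1", 800))
```

The handler is idempotent: processing the same event twice leaves the same price in the table, which matters if the queue delivers an event more than once.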
Eventual Consistency
This means that you only have eventual consistency, because the price could already be changed while the corresponding event is still in the queue. In that case the item will be sorted incorrectly. The displayed price itself is correct, because it is loaded from the product-domain together with the other product details.
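
The effect is easy to reproduce in a toy model. Everything here is an assumption made for the illustration: two dicts stand in for the two databases and a `deque` stands in for the event queue.

```python
from collections import deque

product_prices = {"p1": 500, "p2": 900}    # product-domain, the source of truth
cart_sort_prices = {"p1": 500, "p2": 900}  # duplicated copy in the cart database
pending_events = deque()                   # events that were published but not handled yet


def change_price(product_id: str, new_price: int) -> None:
    """Product-domain updates its own data and publishes an event."""
    product_prices[product_id] = new_price
    pending_events.append((product_id, new_price))


def process_events() -> None:
    """Event listener of the cart: applies all queued price changes."""
    while pending_events:
        product_id, new_price = pending_events.popleft()
        cart_sort_prices[product_id] = new_price


def cart_view() -> list[tuple[str, int]]:
    """Sort by the duplicated price, but display the price from the product-domain."""
    ordered = sorted(cart_sort_prices, key=cart_sort_prices.get)
    return [(product_id, product_prices[product_id]) for product_id in ordered]
```

After `change_price("p1", 1200)` and before `process_events()` runs, `cart_view()` still lists `p1` first although it is now the more expensive item. The displayed price of 1200 is already correct. Once the event is processed, the order is correct as well.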
There is no way around this problem unless you use just one database, use distributed transactions, or do the inefficient loading of thousands of records. But those options conflict with scalability and/or the microservice mindset. Therefore it must be acceptable for all data you duplicate to be out of sync for a short time. In some cases that isn't acceptable, for example with payment data. In those cases, don't duplicate the data.
Conclusion
You need guidelines for which microservices are allowed to talk to which other microservices. Structure the communication workflow and use grouping and abstraction. Otherwise you will not know which microservices rely on a specific microservice, directly or indirectly. That's a very bad spot to be in. Another thing you probably need is good monitoring, so that you can see which microservice calls were made for a user request and how long they took. Without this data you will have a very hard time finding performance issues. Consider using a distributed tracing system such as Zipkin.
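
Such guidelines can be written down as data, which also answers the question of who relies on whom. This is a rough sketch: the service names in `ALLOWED_CALLS` are invented for the example and loosely follow the image and shopping cart examples from above.

```python
# Who may call whom: caller -> set of allowed callees (hypothetical service names).
ALLOWED_CALLS = {
    "web-frontend": {"cart-aggregation", "image-api"},
    "cart-aggregation": {"user-cart", "product"},
    "image-api": {"image-scaling", "image-watermarking", "image-storing", "image-retrieving"},
}


def is_allowed(caller: str, callee: str, allowed: dict = ALLOWED_CALLS) -> bool:
    """Check a single call against the guidelines."""
    return callee in allowed.get(caller, set())


def dependents_of(target: str, allowed: dict = ALLOWED_CALLS) -> set[str]:
    """All services that rely on `target`, directly or indirectly."""
    # Invert the graph: callee -> set of direct callers.
    callers: dict[str, set[str]] = {}
    for caller, callees in allowed.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)
    # Walk upwards from the target until no new callers are found.
    seen: set[str] = set()
    stack = [target]
    while stack:
        for caller in callers.get(stack.pop(), ()):
            if caller not in seen:
                seen.add(caller)
                stack.append(caller)
    return seen
```

With this data, `dependents_of("image-scaling")` tells you which services to look at before you change or merge the scaling microservice.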
Design the system very carefully and keep good high-level documentation for each microservice.
If you get all of these things right, you should be fine and have a lot of fun with your microservice system :-)