This blog is about Java (advanced Java topics like Reflection, Byte Code transformation, Code Generation), Maven, Web technologies, Raspberry Pi and IT in general.

Sunday, 9 August 2015

Microservices with Spring Boot, Netflix OSS and Maven - Log Configuration

This is a follow-up to the microservice overview post. In this post I will explain the log configuration.

The configuration is based on the default Spring Boot log configuration, which I adapted as needed. The two important changes are:
  • Added [%mdc] to the CONSOLE_LOG_PATTERN and the FILE_LOG_PATTERN
  • The log file is only written if the LOG_FILE variable (system property or environment variable) is set. Because a system property is set per process, several microservices can run on the same machine and each one writes its output to the log file specified for it.
The Mapped Diagnostic Context (MDC) is filled with the uuid and the caller in the RequestContextInterceptor#preHandle method. It's possible to put even more data into the MDC. The MDC data is printed as key1=value1, key2=value2, ... in the %mdc part. So it's a good idea to fill the MDC at the beginning of the request and to clear it at the end of the request. That way the metadata is always part of your log output.
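
The fill-at-start, clear-at-end pattern can be sketched in plain Java. This is only an illustration under assumptions: RequestContext is a hypothetical stand-in for the logging framework's MDC, and handleRequest stands in for the interceptor; neither is taken from the project.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.UUID;

/** Hypothetical stand-in for a logging MDC: one key/value map per thread. */
class RequestContext {
    private static final ThreadLocal<Map<String, String>> DATA =
            ThreadLocal.withInitial(LinkedHashMap::new);

    static void put(String key, String value) {
        DATA.get().put(key, value);
    }

    static void clear() {
        DATA.remove();
    }

    /** Renders the context as key1=value1, key2=value2, ... */
    static String render() {
        StringJoiner joiner = new StringJoiner(", ");
        DATA.get().forEach((key, value) -> joiner.add(key + "=" + value));
        return joiner.toString();
    }
}

class RequestLoggingDemo {
    /** Formats one log line with the context in front, similar to the [%mdc] part. */
    static String log(String message) {
        return "[" + RequestContext.render() + "] " + message;
    }

    /** Fills the context before the handler runs and always clears it afterwards. */
    static String handleRequest(String uuid, String caller, String message) {
        RequestContext.put("caller", caller);
        RequestContext.put("uuid", uuid);
        try {
            return log(message);
        } finally {
            RequestContext.clear(); // pooled threads must not keep data of an old request
        }
    }

    public static void main(String[] args) {
        String uuid = UUID.randomUUID().toString();
        System.out.println(handleRequest(uuid, "/shopping-cart/3 <-", "getShoppingCart(3)"));
    }
}
```

Clearing in a finally block matters because request threads are reused; without it the next request on the same thread could log stale values.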

For example, if you call http://localhost:9090/shopping-cart/3, the following output is generated, distributed over three processes:

# frontend-service
2015-08-09 21:46:03.337  INFO 10868 --- [nio-9090-exec-2] a.r.c.m.f.c.ShoppingCartController       : [caller=/shopping-cart/3 <-, uuid=23badb29-6cb3-4f02-a786-2c15cbcaf302] getShoppingCart(3)

# shopping-cart-service
2015-08-09 21:46:03.340  INFO 17732 --- [o-auto-1-exec-1]    : [caller=frontend-service <- /shopping-cart/3 <-, uuid=23badb29-6cb3-4f02-a786-2c15cbcaf302] getShoppingCart(3)

# product-service
2015-08-09 21:46:03.360  INFO 20148 --- [o-auto-1-exec-4]      : [caller=frontend-service <- /shopping-cart/3 <-, uuid=23badb29-6cb3-4f02-a786-2c15cbcaf302] getproduct(3)

This output makes two things very easy:
  • Find all log output which belongs together. Just find all log entries with the same uuid.
  • See the flow of the initial request through the microservices.
To make this possible, the data (uuid, caller) has to be passed from one microservice to the next. This is quite easy to accomplish with HTTP headers, and the implementation is quite simple, too. It's handled in the RequestContextInterceptor and the ServiceNameInterceptor. Given that a ThreadLocal is used in the implementation and that Hystrix uses its own threads, it's necessary to configure Hystrix correctly. See these three files.
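
Why separate threads are a problem for a ThreadLocal can be shown with a small sketch. It uses a plain ExecutorService instead of Hystrix, and UUID_CONTEXT and withContext are hypothetical names for illustration; the project's actual Hystrix configuration may look different.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ContextPropagationDemo {
    /** Hypothetical per-thread request context holding the uuid. */
    static final ThreadLocal<String> UUID_CONTEXT = new ThreadLocal<>();

    /** Captures the submitting thread's context and restores it inside the worker thread. */
    static <T> Callable<T> withContext(Callable<T> task) {
        String captured = UUID_CONTEXT.get(); // read on the submitting thread
        return () -> {
            String previous = UUID_CONTEXT.get();
            UUID_CONTEXT.set(captured);
            try {
                return task.call();
            } finally {
                // put back whatever the pooled thread held before
                if (previous == null) {
                    UUID_CONTEXT.remove();
                } else {
                    UUID_CONTEXT.set(previous);
                }
            }
        };
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            UUID_CONTEXT.set("23badb29");
            Callable<String> readUuid = () -> String.valueOf(UUID_CONTEXT.get());
            System.out.println("plain:   " + pool.submit(readUuid).get());
            System.out.println("wrapped: " + pool.submit(withContext(readUuid)).get());
        } finally {
            pool.shutdown();
            UUID_CONTEXT.remove();
        }
    }
}
```

The plain task does not see the uuid because it runs on another thread, while the wrapped task does. Any thread pool that executes request work needs some form of this hand-over.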

Last but not least it's nice that you don't have to copy the logback.xml file into each service. With Maven it's possible to pack the logback.xml file into an artifact and then unpack it into the microservices. If you change the log configuration you just need to change the file once. Depending on your setup it might be required to change the version of the artifact so that the new version is unpacked.

Sunday, 28 June 2015

Microservices with Spring Boot, Netflix OSS and Maven - Monolithic Build

In the overview post I mentioned that it's possible to pack all microservices into one application. This application is fully functional without any changes, without Eureka and without remote calls. In the demo, the all-in-one module showcases this monolithic application. In this short blog entry I will describe how it works.

Actually it's very simple. It's done with Spring's dependency injection. Nevertheless, it's a quite cool setup.

Microservice Setup and Configuration
Let's look at the UserController of the frontend-service module and at what normally happens when you start the FrontendApplication.
  • the UserController requires the UserHystrixClient 
  • the UserHystrixClient requires the UserClient 
  • the UserClient has the @FeignClient annotation 
    • thereby a Spring service is created, given that the FrontendApplication has this configuration: @EnableFeignClients({"at.rseiler.concept.microservice.client"}) 
    • this Spring service is then injected into the UserHystrixClient 
  • => that's it. It's this simple.
Monolithic Setup and Configuration
Now for what happens when the MonolithApplication is started.
  • the UserController requires the UserHystrixClient (no difference)
  • the UserHystrixClient requires the UserClient (no difference)
  • the UserClient has the @FeignClient annotation, but no service is created, because the MonolithApplication doesn't have the @EnableFeignClients annotation.
  • the UserService of the user-service implements the UserClient and is annotated with @RestController.
    • thereby a Spring service is created 
    • this Spring service is then injected into the UserHystrixClient 
  • => that's it. It's this simple.
In the end the "magic" is done with the Spring configuration. Because we don't need the Feign clients in the monolithic application, we don't create them. However, the REST endpoints are still created because of the @RestController annotation. So if two microservices define the same endpoint, you get an exception at startup.
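
Stripped of Spring, the idea is plain constructor injection against an interface. The following sketch is a simplification under assumptions: the class names mirror the ones above, but getUser, RemoteUserClient and the manual wiring are hypothetical and stand in for what Feign and the Spring container do in the real project.

```java
/** Shared contract: implemented by the endpoint and, in the distributed setup, by a remote client. */
interface UserClient {
    String getUser(int id);
}

/** The service itself (in the real project a @RestController in user-service). */
class UserService implements UserClient {
    @Override
    public String getUser(int id) {
        return "user-" + id;
    }
}

/** Hypothetical stand-in for the generated Feign proxy, which would perform an HTTP call. */
class RemoteUserClient implements UserClient {
    @Override
    public String getUser(int id) {
        return "user-" + id + " (via HTTP)";
    }
}

/** The wrapper only depends on the interface, so it does not care which implementation it gets. */
class UserHystrixClient {
    private final UserClient userClient;

    UserHystrixClient(UserClient userClient) {
        this.userClient = userClient;
    }

    String getUser(int id) {
        return userClient.getUser(id);
    }
}

class WiringDemo {
    public static void main(String[] args) {
        // Microservice setup: the remote client is injected.
        UserHystrixClient distributed = new UserHystrixClient(new RemoteUserClient());
        // Monolithic setup: the service implementation itself is injected.
        UserHystrixClient monolith = new UserHystrixClient(new UserService());
        System.out.println(distributed.getUser(1));
        System.out.println(monolith.getUser(1));
    }
}
```

In the monolith the call is an ordinary Java method call; nothing in UserHystrixClient or its callers has to change.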

Improved Setup
It's also possible to improve the setup so that the REST endpoints aren't created in the monolithic application. To do that you need to extend the configuration. Each class annotated with @RestController additionally needs a @Service annotation. In the monolithic application only @Service is defined in the includeFilters for the at.rseiler.concept.microservice.*.rest package, and the @RestController annotation is ignored. In a microservice the configuration is reversed: only @RestController is defined in the includeFilters.

The expressive Spring dependency injection configuration makes it possible to do very cool things. You only need to think outside the box and use the power of the configuration options. The only downside could be that it's probably confusing for others if a class has both a @RestController annotation and a @Service annotation. So you either need good documentation to explain this, or you keep it simple and don't do both. I chose the second approach for the microservice example.

Sunday, 21 June 2015

Microservices with Spring Boot, Netflix OSS and Maven - Overview

In my last blog posts (Microservices: a few thoughts and The Importance of structuring Microservices) I wrote about microservices. Now it's time to go from theory to practice and write some code. Based on Spring Boot, Netflix OSS (Eureka, Feign, Hystrix and Ribbon) and Maven I created a setup for a microservice architecture.
I have put my example microservice project on GitHub.

What the microservice example does
  • It's a microservice system consisting of three microservices and one frontend microservice.
    • product service: just returns the data for a product (CPU, keyboard or mouse).
    • user service: just returns the data for a user.
    • shopping cart service: stores the products in the users' shopping carts.
    • frontend service: communicates with all microservices, aggregates the data and generates a simple HTML page from it.
  • The microservices are grouped by domains.
  • Eureka is used to discover the microservices. Multiple instances can be started; new instances are automatically registered in the system and are used by the existing microservices.
  • Ribbon is used transparently to do the load balancing on the client. So there is no need to have static IPs and a manually maintained load balancer.
  • The REST clients for the microservices consist only of interfaces. No coding needed.
  • Hystrix REST clients are used to ensure resilience. They take very little effort to write.
  • The REST endpoints are built with Spring's @RestController and implement the interface of the REST client to ensure that the REST client and the REST endpoint match each other.
  • Enhanced log output to make your life easier. It's done transparently with a Spring interceptor and a Feign interceptor.
    • The frontend microservice generates a UUID for each request. This UUID is put into the MDC (Mapped Diagnostic Context) and is printed with each log entry. The UUID is passed as a header to every called microservice - even if the called microservice calls another microservice, the UUID is passed along too. Therefore it's possible to know which user request caused which log entry in any microservice.
    • Additionally, a calling stack is generated and logged. The calling stack looks like this: ServiceB <- ServiceA <- /product/1 <-
      • /product/1 <- is generated by the frontend microservice and shows the IP address of the request and the called endpoint
      • ServiceA: the frontend microservice called the ServiceA
      • ServiceB: the ServiceA called the ServiceB
      • => Therefore you always know who called the microservice, which isn't always an easy question to answer in a big microservice system. You could improve it further by including Zipkin.
  • It's possible to pack all the microservices together into one monolithic application, and this application works without the need to change anything. Therefore you have great flexibility. For example, in development you probably don't want to start the complete microservice system on your machine. It's handy if you can just start up everything you need, which could be only a part of the system, packed into one application.
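
How the UUID and the calling stack can travel in HTTP headers is sketched below in plain Java. The header names X-Request-Uuid and X-Caller and both helper methods are assumptions for illustration; the real interceptors may use other names. In this sketch each service prepends its own name when it makes an outgoing call, so the receiving service sees who called it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

class CallerChainDemo {
    // Hypothetical header names; the real project may use different ones.
    static final String UUID_HEADER = "X-Request-Uuid";
    static final String CALLER_HEADER = "X-Caller";

    /** Entry point (frontend): there are no incoming headers, so a new uuid and chain are started. */
    static Map<String, String> startContext(String endpoint) {
        Map<String, String> context = new HashMap<>();
        context.put(UUID_HEADER, UUID.randomUUID().toString());
        context.put(CALLER_HEADER, endpoint + " <-");
        return context;
    }

    /** Headers for an outgoing call: the same uuid, and the own service name prepended to the chain. */
    static Map<String, String> outgoingHeaders(String serviceName, Map<String, String> context) {
        Map<String, String> headers = new HashMap<>();
        headers.put(UUID_HEADER, context.get(UUID_HEADER));
        headers.put(CALLER_HEADER, serviceName + " <- " + context.get(CALLER_HEADER));
        return headers;
    }

    public static void main(String[] args) {
        Map<String, String> frontend = startContext("/product/1");
        // The frontend calls ServiceA, which then calls ServiceB.
        Map<String, String> seenByServiceA = outgoingHeaders("frontend-service", frontend);
        Map<String, String> seenByServiceB = outgoingHeaders("ServiceA", seenByServiceA);
        System.out.println(seenByServiceA.get(CALLER_HEADER));
        System.out.println(seenByServiceB.get(CALLER_HEADER));
    }
}
```

Each receiving service would copy both header values into its logging context, so the uuid stays the same along the whole chain while the caller value grows by one entry per hop.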

Overview: Main Frameworks
Here is an overview of the main frameworks/libraries and what they do.

Spring Boot: It saves a lot of time for the setup. The auto configuration and the spring-cloud-starter-* packages are awesome. In addition the Dependency Injection and the Spring Feign abstraction are great, too.

Eureka: It's a service registry. Each service instance registers itself with Eureka. Afterwards all instances of a service can be requested just by the name of the service, so there is no need for IPs or ports. If another service instance is started, it is added to the corresponding service group, and if a service instance is shut down, it is removed from Eureka.

Ribbon: It's a client-side load balancer that works together with Eureka. Spring Boot integrates Ribbon transparently, so you don't have to do anything to use it. Ribbon uses a round-robin scheduling algorithm.
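
To illustrate what round-robin selection on the client means, here is a minimal sketch. It is not Ribbon's implementation or API; RoundRobinChooser and the instance addresses are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Minimal round-robin chooser over a fixed list of service instances. */
class RoundRobinChooser {
    private final List<String> instances;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinChooser(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("no instances");
        }
        this.instances = new ArrayList<>(instances);
    }

    /** Returns the instances one after the other and starts again at the first one. */
    String choose() {
        // floorMod keeps the index valid even after the counter overflows
        int index = Math.floorMod(next.getAndIncrement(), instances.size());
        return instances.get(index);
    }
}

class RoundRobinDemo {
    public static void main(String[] args) {
        RoundRobinChooser chooser =
                new RoundRobinChooser(Arrays.asList("host-a:8081", "host-b:8082"));
        for (int i = 0; i < 4; i++) {
            System.out.println(chooser.choose());
        }
    }
}
```

In the real setup the instance list is not fixed: it comes from Eureka and changes as instances register and disappear.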

Hystrix: It's a latency and fault tolerance library designed to isolate points of access to remote systems. I am using annotations to configure Hystrix. Hystrix is very easy to use and requires very little effort. The programming side is very easy; the hard part is to react correctly to errors.

Feign: It's a declarative REST client with a great level of abstraction; it can be backed by different HTTP clients such as the Apache HttpClient. With Spring the only thing you need to do is to write an interface for your REST endpoint and annotate it with @FeignClient. You can use the Spring MVC annotations instead of the Feign annotations. This has the advantage that you reduce the complexity and that the @RestController can implement the interface to ensure that everything is compatible.

The top level modules of the project
  • all-in-one: this module includes all microservices and is fully functional without any change. Eureka isn't needed anymore and all remote calls are done with standard Java method calls.
  • common: this module contains some common code and the following microservices: Eureka, Hystrix Dashboard and Turbine Dashboard.
  • frontend-service: this module generates the HTML and calls the other services to get the data.
  • product-domain: this module represents the product domain and contains all product microservices.
  • user-domain: this module represents the user domain and contains all user microservices.
One repository to rule them all
I deliberately put all microservices into a big project. The reason is that it makes many things easier:
  • Only one repository needs to be cloned. Especially for this example, no one would like to clone several repositories.
  • Refactoring, searching, code navigation and so on work very well if you have everything in one project. Especially if you run the all-in-one monolithic application, it's very handy for debugging and coding.
  • Changes can be made globally or locally. If you want to upgrade a library because of a security issue, it's very nice if only one pom.xml file needs to be changed. For example, you could change the Spring version for all microservices. Otherwise you would need to clone and open several projects and fix them all. Still, Maven allows you to change the version of a dependency for a specific module only. You have the flexibility: each module/microservice can be configured individually, or almost everything can be configured globally.
  • Only one build is needed. Depending on the size you can always build everything, or individual domain groups, or individual microservices. Just execute mvn install wherever you need it.
I would extract the common module into its own repository. This module shouldn't change that often and isn't interesting to debug. Therefore it shouldn't be painful if it's separated.
With the current structure there is one problem: you can't build the project directly after you have cloned it. It's necessary to build the common/common-configuration module first, because it creates an artifact which is needed, and sadly Maven can't resolve this dependency correctly.
If the project begins to grow, it may make sense to split the project further. For example, you could create a separate repository for each domain-module. But it's up to you.

Microservice Structure
Each microservice is located in a domain-module, with the exception of the frontend-service, because the frontend-service aggregates all domains in order to have a consistent design and usability. The frontend-service should contain only very little logic. It should only request and aggregate data from the endpoints and prepare the data to be displayed. It shouldn't have any business logic.

A microservice consists of two parts:
  • microservice-client: contains the Feign interface for the REST endpoint and the model/POJO classes. Additionally there is a Hystrix REST client which just wraps the Feign client. Everyone should actually use the Hystrix implementation instead of the Feign client.
  • microservice-service: is the microservice itself with the business logic and the REST endpoints. It has a dependency on the microservice-client, since the model/POJO classes are needed and it implements the REST interface. That way fewer mistakes can happen.
I am using my parent-pom for this project. This POM file enables several useful features like static code analysis.
In common-configuration the logback.xml is defined. The file is packed into a Maven artifact and then extracted into each service-module. This avoids having several copies of that file. You could probably also move the application.yml and the bootstrap.yml into this module. Otherwise there is nothing special about the Maven setup.

Scripts and Execution
To make it easier to build and run the project I provided some scripts. They are located in the scripts folder and are very simple, so I think I don't need to explain them.
The only important thing to know is that it can take a little while until the microservices are registered correctly and all microservices have received the instances of the other microservices. In the case of an error, just try to reload the page a little bit later.

I think it's a quite sophisticated setup and that it is very well suited to be the basis of production microservices. One amazing thing is that it's possible to pack all the microservices together - as long as the dependencies are compatible. This should make development easier: instead of starting many microservices you just need to start one application. In addition it can be a quick fix for performance issues. If there are two very talkative microservices, just glue them together until you have fixed the performance issue.
I hope you like my setup and my blog entry. I probably will write another blog post to explain some details of the setup. This article is already so long that I didn't want to include more details.

I am always happy about comments and suggestions for improvements! :-)

Thursday, 18 June 2015

The Importance of structuring Microservices

It's very important to group microservices very well. Otherwise you will get a mess. Several microservices should form a larger part of the functionality.

Example: Image Functionality in a Microservice System
Let's approach the problem by example. Assume we want to build image processing functionality for our system. Each microservice should do exactly one thing. So we will end up with the following microservices:

  • image scaling
  • image watermarking  
  • image storing
  • image retrieving 

All these image services are very tightly coupled - the storing microservice and the retrieving microservice use the same database and the same files. Of course it should be possible to use each of them in another context or by itself. However, in your own system you will consider them an image-sub-system. Therefore this sub-system should have its own abstraction, its own API - and this API is a microservice of its own.

All non image related microservices don't know anything about the individual image microservices. They know only about the image API microservice. This API could group together the typical use cases. For example to resize an image and watermark it.
You don't want to have this code in any other place than in your image microservices. Otherwise your complete microservice system is very tightly interconnected. This is very bad. Imagine that all images will be scaled & watermarked and that you have performance issues because of high network traffic. One simple optimization is to merge the scaling microservice and the watermarking microservice together. Then you have to send the image over the network one time less. If you utilise an image API microservice then the code must be changed only once and only one microservice needs to be redeployed. On the other hand if all other microservices used the internal image microservices then you have to change the code several times and several microservices need to be redeployed.
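
The role of the image API can be sketched as a small facade. Everything here is hypothetical: the class names are made up, strings stand in for image data, and in-process objects stand in for what would be remote microservices.

```java
/** Internal service; other domains should never call it directly. */
class ScalingService {
    String scale(String image, int width) {
        return image + "@" + width + "px";
    }
}

/** Internal service; other domains should never call it directly. */
class WatermarkService {
    String watermark(String image) {
        return image + "+wm";
    }
}

/** The image API: the only entry point that the rest of the system knows about. */
class ImageApi {
    private final ScalingService scaling = new ScalingService();
    private final WatermarkService watermarking = new WatermarkService();

    /** A typical use case bundled behind one call. */
    String resizeAndWatermark(String image, int width) {
        return watermarking.watermark(scaling.scale(image, width));
    }
}

class ImageApiDemo {
    public static void main(String[] args) {
        System.out.println(new ImageApi().resizeAndWatermark("cat.png", 100));
    }
}
```

If scaling and watermarking are later merged into one service, only the body of resizeAndWatermark changes; the callers of the image API are not touched.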

Abstraction like in a monolith
This abstraction and grouping of functionality is nothing other than modules in monoliths. In a monolith all the image functionality would be in a package "image". This package would have the following sub packages: "scaling", "watermarking", "storing" and "retrieving". As in the microservice architecture, you don't want other parts of the monolith to use these sub packages directly. Otherwise you have interconnected modules and it will be hard to change internal details of the image implementation. Everything should be behind a high level interface and everything else should be hidden and not be used anywhere else. Then it's possible to rewrite the complete image package without the need to adapt code in other packages. The only thing you need to take care of is to support the interface.

I think it's very appealing that this concept of abstraction applies to a monolith as well as to a microservice architecture. If the concepts were completely different then something would most likely be wrong.

The module abstraction is often violated in monoliths and the complete codebase is very interconnected. It's bad, but it's manageable, because you have the complete code in your IDE with all the useful refactoring tools. But if you have separated code bases it's very hard to refactor anything. So do it right in the first place and don't postpone it.

Microservice grouped by Domains
One way to group the system is to group it by domains. A domain is a (mostly) independent part of the system. It's very important that you get your domains right, because if you don't, several domains must be changed for nearly every new feature. This should be the exception.
For example the user part of a system consists of: login, registration, password reset mail, user profile and so on. All these basic user functions are more or less the same for each system. Therefore the user-sub-system should also be usable for any other system. This only works if this domain is completely independent. The domain is not allowed to have any dependency on any other domain. Otherwise you have to remove those dependencies to make the user-sub-system usable for another system.
Another domain could be the product-domain. In this domain all the data of a product is saved: price, stock count, description and so on. The user-domain and the product-domain must be independent of each other.

How to deal with features that require two domains
Let's approach the problem again with an example: a shopping cart. A shopping cart is individual to each user. Therefore it must be in the user-domain. The problem is now that you also need the data from the product-domain. The shopping cart is quite useless if you can't see the details of the stored products. But since the user-domain isn't allowed to have a dependency on the product-domain, how do you solve this problem?

Actually it's quite simple: with an aggregation microservice. In the user-domain all stored data is related to the user like added-date, user-id and so on. There is only one exception: the product-id is also stored.
The user-domain provides an API to get all items of a specific user. This API is consumed by an aggregation microservice. The aggregation microservice collects the product-ids and then fetches the details of the products from the product-domain. The data is merged, and then the shopping cart can be displayed or the data can be passed to the next microservice. That way you can implement the shopping cart feature so that both domains are still independent.
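
The merge step of such an aggregation microservice could look roughly like this. It is a sketch under assumptions: the classes CartItem, Product and CartLine and their fields are invented for the example, and a map lookup stands in for the remote call to the product-domain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CartAggregationDemo {
    /** Owned by the user-domain: the product id is the only reference to the other domain. */
    static class CartItem {
        final int userId;
        final int productId;
        CartItem(int userId, int productId) { this.userId = userId; this.productId = productId; }
    }

    /** Owned by the product-domain. */
    static class Product {
        final int id;
        final String name;
        final long priceCents;
        Product(int id, String name, long priceCents) { this.id = id; this.name = name; this.priceCents = priceCents; }
    }

    /** Result of the aggregation: one displayable line per cart item. */
    static class CartLine {
        final int productId;
        final String name;
        final long priceCents;
        CartLine(int productId, String name, long priceCents) { this.productId = productId; this.name = name; this.priceCents = priceCents; }
    }

    /** Merges the cart items with the product details that were fetched by id. */
    static List<CartLine> aggregate(List<CartItem> items, Map<Integer, Product> productsById) {
        List<CartLine> lines = new ArrayList<>();
        for (CartItem item : items) {
            Product product = productsById.get(item.productId);
            if (product != null) { // skip ids the product-domain does not know (anymore)
                lines.add(new CartLine(product.id, product.name, product.priceCents));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        List<CartItem> items = Arrays.asList(new CartItem(3, 1), new CartItem(3, 2));
        Map<Integer, Product> products = new HashMap<>();
        products.put(1, new Product(1, "CPU", 19900));
        products.put(2, new Product(2, "keyboard", 4900));
        for (CartLine line : aggregate(items, products)) {
            System.out.println(line.name + " " + line.priceCents);
        }
    }
}
```

Neither domain class references the other one; only the aggregation code knows both, which is what keeps the domains independent.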

The aggregation microservice/layer is nothing other than a Lego™ block which connects two other Lego™ blocks. If you use third party APIs then you do exactly the same: you create some kind of Lego™ block which aggregates the data from several third party APIs. It helps to treat your own microservice APIs as if they were third party APIs, because then you will get the independence of the microservices right. You won't violate the independence.

Performance Issue with the Aggregation layer
Let's assume that it's common that a shopping cart has thousands of items. Additionally you need to support sorting the items in a shopping cart by price. Then the aggregation microservice has to load thousands of items from the user-domain, load thousands of product details from the product-domain, merge the data, sort them by price and then throw away thousands of items except for the 100 which will be shown to the user (pagination). That's very inefficient.

The only way to solve the problem is with data duplication. It's necessary to store the price of the item in the shopping cart database as well. Then you can do the pagination in the database and only get out the right 100 items.
That means that you have to supply the additional data when an item is stored in the shopping cart. Additionally the data in the shopping cart database must be updated if the price is changed in the product database. To do that you need events. If a product is updated, an event must be fired. An event listener receives this event and updates the price in the shopping cart database.
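
A minimal sketch of this idea, assuming an in-memory list in place of the shopping cart database and a direct method call in place of an event queue. CartStoreDemo, onPriceChanged and pageSortedByPrice are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class CartStoreDemo {
    /** Row in the shopping cart store; priceCents is a duplicated copy from the product-domain. */
    static class Row {
        final int productId;
        long priceCents;
        Row(int productId, long priceCents) { this.productId = productId; this.priceCents = priceCents; }
    }

    private final List<Row> rows = new ArrayList<>();

    /** The price has to be supplied when the item is stored. */
    void add(int productId, long priceCents) {
        rows.add(new Row(productId, priceCents));
    }

    /** Event listener: called when the product-domain announces a price change. */
    void onPriceChanged(int productId, long newPriceCents) {
        for (Row row : rows) {
            if (row.productId == productId) {
                row.priceCents = newPriceCents;
            }
        }
    }

    /** Stand-in for sorting and paging inside the database: returns the product ids of one page. */
    List<Integer> pageSortedByPrice(int page, int pageSize) {
        return rows.stream()
                .sorted(Comparator.comparingLong((Row row) -> row.priceCents))
                .skip((long) page * pageSize)
                .limit(pageSize)
                .map(row -> row.productId)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        CartStoreDemo store = new CartStoreDemo();
        store.add(1, 300);
        store.add(2, 100);
        store.add(3, 200);
        System.out.println(store.pageSortedByPrice(0, 2));
        store.onPriceChanged(1, 50); // the price change event has arrived
        System.out.println(store.pageSortedByPrice(0, 2));
    }
}
```

Until onPriceChanged has been called for a changed product, the page is sorted by the old price. With a real queue between the two domains this delay is where the data is temporarily out of sync.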

Eventual Consistency
This means that you only have eventual consistency, because the price could have changed while the corresponding event is still in the queue. Then the item will be sorted in the wrong position. The displayed price itself is correct, because it is loaded from the product-domain together with the other product details.
There is no way around this problem, unless you use just one database, do distributed transactions, or do the inefficient loading of thousands of records. But that conflicts with scalability and/or the microservice architecture mindset. Therefore it must be acceptable for all data you duplicate that it is out of sync for a short time. In some cases this is not acceptable, for example for payment data. In that case don't duplicate the data.

You need guidelines on which microservices are allowed to talk to which microservices. Structure the communication workflow and use grouping and abstraction. Otherwise you will not know which microservices rely on a specific microservice - directly or indirectly. That's a very bad spot to be in. Another thing you probably need is good monitoring, so that you can see which microservice calls were made for a user request and how long they took. If you don't have this data, you will have a very hard time finding performance issues. Consider using Zipkin.
Design the system very carefully and have good high level documentation for each microservice.
If you get all of these things right then you should be fine and have much fun with your microservice system :-)

Monday, 1 June 2015

Microservices: a few thoughts

Microservices are a hot topic right now. At work we are discussing if we should move to microservices, too. In my opinion it would be the right choice. First of all our current software is very old and has many flaws. Therefore we need to rewrite the software anyway. Second the concept of microservices fits us very well and it's in general a great concept. 

Specialists versus Generalists
A problem we currently have is that everyone must know the complete project, which is quite big. We don't have modules on which the developers could specialize. This causes a lack of code ownership and deep knowledge and understanding of the code. Therefore the code gets worse and worse. With microservices you have strong "modules". So each team/person can focus on a group of microservices and know only about the interfaces of the other microservices.
The disadvantage is that project management becomes more difficult, because you have to align your features with the teams and their knowledge.
If you can handle that then the overall output and quality should be considerably better with specialized teams. More features mean mostly more money and that's a good thing. This is not really a microservice thing. It's just an argument for a good architecture and for specialized teams.

Impact of changes
Another nice benefit is that you can limit the impact of a change better. A change in a monolith can always have strange side effects. This shouldn't happen with microservices.
It's very important that the microservices are fault tolerant. If one microservice misbehaves, it must not poison the other microservices. The outage of one microservice must be contained, meaning that a part of the functionality is missing or broken while everything else still works fine. With this concept you can take better care of your core microservices. The less important microservices can have lower quality. Thereby it's possible to optimize the development output - invest only as much as needed.
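
Containing an outage on the calling side can be as simple as a fallback value. The sketch below is plain Java with made-up names; it has no timeout and no circuit breaker, so it only shows the basic idea and is not how a library like Hystrix works internally.

```java
import java.util.function.Supplier;

class FallbackDemo {
    /** Runs the remote call; if it fails, the fallback is returned instead of propagating the error. */
    static <T> T callWithFallback(Supplier<T> remoteCall, T fallback) {
        try {
            return remoteCall.get();
        } catch (RuntimeException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // A hypothetical recommendation service that is currently down.
        Supplier<String> brokenService = () -> {
            throw new IllegalStateException("recommendation service unavailable");
        };
        String recommendations = callWithFallback(brokenService, "no recommendations");
        // The page can still be rendered; only this part of it is missing.
        System.out.println("shopping cart page with: " + recommendations);
    }
}
```

The hard part is choosing a sensible fallback for each call, because the right reaction depends on what the missing data is used for.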

Homogeneous Stack versus Inhomogeneous Stack
Many people say that another benefit is that you can use different frameworks or even different programming languages. In my opinion this doesn't make sense, unless the team is very big or you have very special requirements. A homogeneous stack makes everything much easier. Even the build tools, build configuration and the deploy pipelines should be the same. Otherwise you violate the DRY (don't repeat yourself) principle, because you have some kind of "code/configuration/script duplication" and have to solve each problem for each programming/framework stack.

You still have the benefit of upgrading the microservices independently. If a microservice runs stably and doesn't change then there is no need to upgrade it. With a monolith you don't have this option. If a library is upgraded then everything is affected, even the code you didn't touch for years and which worked perfectly until the upgrade. Because of that a library upgrade in a monolith is very dangerous and is rarely done. Microservices can be upgraded one after the other. So there is no big bang upgrade but many little upgrades. Furthermore you shouldn't end up in a jar hell like in a monolith. This makes upgrading easier, too.

In general you have much more flexibility with microservices, because one huge monolith is very hard to change. What would it mean to change a core library in a monolith? For example, to change from a self-written URL dispatcher to Spring MVC. Of course there is way too much logic in this layer. For the current project this change would mean a rewrite of a big part of the system, which would be very error-prone. As with upgrades, the problem is that you have to change everything. It's not reasonable to change only a part of the monolith to Spring MVC. In some cases a partial change is just impossible. The general problem with partial changes is that you still have to support both the old self-written URL dispatcher and Spring MVC. The whole system gets more complex. And in a few years the next library will come along that you will want to use.
But this isn't only true for core libraries. It's also true for smaller libraries. If there is a new major version of a library which breaks the old API, then you have to migrate a huge code base to the new version. It's not possible to write only one new feature with the new library version. For this single feature the migration doesn't pay off. So you don't do it and use the old library, which forces you to write more code. Now you have technical debt, because all future features are limited to the old library version, too.

How does reusability evolve? First you have a class which uses the same method for the same task. The next step is to have a package for a more complicated task. If it gets even more complicated you write standalone and independent libraries and add them as a dependency. Microservices are the next step of reusability.
If you have several systems which need to send emails, then you probably have an email library dependency in each system. Furthermore each system has code to use this library, and this code is most likely duplicated, because it doesn't pay off to create a library which will be used in each system only for a few lines of code. In a microservice world you create a microservice which handles the emails. Then in all other places you just have to do one remote call and you are done. If you create a new system, for example a new batch job, then you just use the email-microservice and you are done. No need to integrate an email library into the batch job. If a new requirement comes in that all mails should be resent after an hour if the initial delivery failed, then you only have to take care of this requirement once. Not several times - for each system which sends emails.

Scalability is a controversial topic, because not every software system needs to be extremely scalable. Many software systems work fine with just one big database (which is hopefully redundant). Therefore it's not always an argument for a microservice architecture. If you need scalability then microservices are great. There is one condition: the microservices need to be stateless. Then it's very easy to start up more instances of the microservices and scale the system this way.

There is another kind of scalability besides technical scalability: developer scalability. It most likely won't work to have hundreds of developers working on the same monolithic code base. But again, not everyone has to face this challenge.

Microservices don't solve everything. On the contrary, this concept introduces new challenges and problems. Nevertheless I think that these challenges are manageable. The benefits you get from a microservice architecture are huge. If the system is complicated enough, and you know that your team can handle the microservice challenges and can create a good microservice architecture, then you should definitely consider a microservice architecture. If one of these criteria isn't met then stick to your monolith.

A word of warning: a good microservice architecture and microservice infrastructure is very hard to create and you need many things: an excellent concept, a very good understanding of your domain - so that you can split up your monolith in the right way, excellent monitoring, continuous delivery, automated and fast deployments, graceful failure handling - so that failures don't create a ripple effect, the ability to handle your data if it is distributed over several databases, and so on.