In the modern web world, user expectations and the applications we build to meet them constantly push us to ask what the next step will be. Let's admit it: this is familiar to all of us.
User needs change every day, and we keep turning to new technologies and libraries.
The frontend applications we develop are fed by a data source and interact with the end user in many ways. We usually call this data source an API. However, since these data sources are generally not written to suit different frontends such as web and mobile, we can think of them as general-purpose APIs. When you write an API tailored to the web, you have difficulty meeting the different data needs of your mobile applications, or vice versa. When you try to support all of them, the complexity grows.
It is almost impossible to find a one-size-fits-all solution to this problem. A service written for one client is either rewritten for other clients, or a separate endpoint is created for each new client. If you are trying to serve data to multiple clients for a complex product, I think you know what I mean.
The data needs of a web application and a mobile application differ: years of experience with mobile bandwidth have shown that we need to work with smaller request/response models. This means you cannot send the data model you use for a web application to your mobile client as is. Likewise, the number of network operations a device has to perform is just one of the factors that directly affects its battery life.
Offering a dynamic data model that fits your needs!
Different user interfaces and different experiences mean the data model should also differ based on need. Here, general-purpose APIs become partially insufficient for us. When we develop a new feature, we have to integrate it separately for web, mobile, and other frontends against the general-purpose API, then perform a series of steps to make the data model fit each client. This usually costs developer teams extra effort and time.
So far, this article has aimed to outline problems we frequently encounter in the real world. From here on, I want to talk about solutions and methods.
Backend for Frontend
The approach we know as BFF is not particularly new. By giving a name to a number of existing practices, it gained popularity and promised an ideal solution to real-world problems.
We can say that the BFF approach emerged from the solutions to a series of problems experienced at SoundCloud. It suggests building APIs tailored to each product experience rather than relying on a single general-purpose API.
In practice, this means creating another layer that presents a data model suited to the needs of the frontend. Moreover, this BFF layer should be written, and owned, by the people who know the frontend best: the frontend teams. After all, you are the one who knows best what you need, right?
What is BFF?
The Backend for Frontend approach can be defined as services customized for the needs of a frontend.
Its aim is to present the most suitable data model to the frontend and reduce complexity. It makes it easier to handle the different user experiences that emerge across a growing variety of clients. Adapting APIs to an interface becomes easier with this approach, which commits tightly to a specific user experience.
Following the basic principle of the approach, a separate BFF should be created for each interface.
In the microservice world, you may have numerous services managing the many domains of your product. In that case, collecting the information for a single product inevitably requires many requests. With services called aggregators, it is possible to collect a product's information from the foundation layer and return it in a single response.
Aggregate is a pattern in Domain-Driven Design. A DDD aggregate is a cluster of domain objects that can be treated as a single unit. An example may be an order and its line-items, these will be separate objects, but it’s useful to treat the order (together with its line items) as a single aggregate. — Martin Fowler
However, the aggregation layer can also take on other tasks. For example, it can make the decisions required to complete an order and delegate the work to the microservice whose only job is completing orders. It is a suitable layer for housing such logic, i.e., business cases.
Foundation refers to the layer of microservices that return the data aggregators need. We can regard these as the real microservices: bottom-level services that communicate directly with the database and do exactly what they are told.
What makes the aggregation layer advantageous is that you can rewrite your bottom-layer microservices, or change the database model behind them, without touching the business logic.
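As a toy illustration of the aggregator idea, independent of any framework: the service functions below are stand-ins for real HTTP calls to foundation services, and all names are invented for the example.

```javascript
// Illustrative aggregator: combines the responses of two hypothetical
// foundation services into a single product payload.
async function getProductDetails(id, services) {
  // Call the foundation services in parallel rather than serially.
  const [product, stock] = await Promise.all([
    services.productService(id),
    services.stockService(id),
  ]);
  // Return one combined response so the caller makes a single request.
  return { ...product, inStock: stock.quantity > 0 };
}
```

The caller sees one response, while the aggregator remains free to change which foundation services it consults.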
However, this pattern is not necessary in every case; these layers should be designed according to need.
Here you might ask: where exactly should the BFF layer be positioned?
The BFF layer should be positioned just in front of the aggregation layer.
The BFF layer should take the data that aggregators collect from the foundation layer and return in a single response, and shape it for the frontend. However, BFFs do not always have to talk to the aggregation layer; sometimes they can communicate directly with the bottom-layer microservices. So while the BFF generally sits in front of aggregation, it can be positioned wherever the need dictates.
You may ask: why create a third layer at all? Can't aggregation perform this task?
At first glance, the aggregation layer may seem to replace the BFF, but the real world is different. The BFF earns its place as a third layer for reasons such as maintenance cost.
Of course, every layer the data passes through on its way to the client adds some latency. Some solutions definitely come at a price.
We can strengthen the BFF layer with GraphQL!
With a BFF layer, we can offer a common data model that fits the component structures we use on the frontend.
For example, suppose you want to display products that may attract users' attention, fetched from a third-party recommendation service, and that you have a product component developed with React.
The product component is shaped by the properties passed to it. However, different services may return inconsistent product information. For example, our product component should show the price as $1,156.09, but since the third-party service is general-purpose, it sends this value as the bare number 1156.09.
In another case, our own product service sends the price for a different region as the string "1.156,09". Or one service returns the price as a String while another returns a number. It is easy to multiply such examples.
We want to use a single product component on the frontend and a standard data format. The data should therefore reach the frontend already converted, with minimum effort spent on the client side.
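As a sketch of the kind of normalization a BFF might own: the function name and the accepted formats below are illustrative assumptions based on the inconsistencies described above, not a specific library's API.

```javascript
// Hypothetical normalizer a BFF could apply so every client component
// receives prices in one canonical display format.
function normalizePrice(raw) {
  let amount;
  if (typeof raw === 'number') {
    amount = raw;
  } else if (typeof raw === 'string') {
    // Interpret European-style strings ("1.156,09") as well as plain ones.
    const cleaned = raw.includes(',')
      ? raw.replace(/\./g, '').replace(',', '.')
      : raw;
    amount = Number(cleaned);
  }
  if (typeof amount !== 'number' || Number.isNaN(amount)) {
    throw new TypeError(`Unrecognized price value: ${raw}`);
  }
  // Emit one canonical display string, e.g. "$1,156.09".
  return `$${amount.toLocaleString('en-US', {
    minimumFractionDigits: 2,
    maximumFractionDigits: 2,
  })}`;
}
```

With this in the BFF, the React component can stay dumb and simply render the string it receives.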
Trying to support all these variations on the frontend is impractical. There are many advantages to using dumb components on the client. If you try to process every response on the UI side, you end up with a UI that is hard to maintain and continuously generates bugs, and the components become harder to test.
Program to an ‘interface’, not an ‘implementation’.
— Design Patterns: Elements of Reusable Object-Oriented Software (1994)
We have said that the BFF layer is an area frontend developers should own. At this point, the choice of technology and language also matters. Proceeding with Node.js will probably be more attractive, given that the people actively writing the code are frontend developers and that what a BFF does is essentially input/output. However, it would be wrong to establish a strict rule that BFFs must be written in one particular language or technology.
When I look at the problems that the BFF approach aims at solving, I think that using GraphQL for this layer will bring us closer to the ideal solution.
Thanks to GraphQL, we gain much more flexibility in the BFF layer. When we do not need a piece of data, all we have to do is not request it in the GraphQL query we send. In this way, the idea of separate BFF layers for mobile and web can give way to a single, experience-driven BFF that serves all clients through GraphQL.
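For instance, a mobile client might send a query like the following to a product BFF. The schema and field names here are hypothetical; the point is simply that the response contains only the fields the client asked for.

```graphql
# The mobile client requests only what its card component needs;
# a web client could send a richer selection to the same BFF.
query MobileProductCard {
  product(id: "42") {
    name
    price
  }
}
```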
Phil Calçado summarizes the relationship between BFF and GraphQL as follows:
Were GraphQL available back then and had we decided to use it, things would be quite different. When we use GraphQL, we don’t necessarily need a Presentation Model at all, and if we do use one, it can be implemented on the client application, as GraphQL makes it possible to get all data needed in a single request.
Sending requests to multiple BFF services to meet the data needs of a single module or page on the frontend is costly, and implementing each of those services separately on every client is another cost. Perhaps you want to design a micro-frontend architecture, or a web architecture composed of independent parts instead of a monolithic web application. If all clients only ever talk to a single endpoint, things get easier.
At this point, I think what we need has become clear. Yes, we need a gateway solution.
What is Gateway?
A gateway is the method by which clients access services through a single API, where controls such as security and authorization are performed, instead of accessing them directly. As the result of a request, a gateway collects, combines, and returns responses from distributed, discrete services.
Communicating with multiple services to build a single UI screen increases the number of requests and causes delays on the UI side. Ideally, the multiple requests the client would otherwise make should be performed on the server through the gateway and collected efficiently.
Accordingly, we should communicate with all the GraphQL BFFs we have developed through a gateway. In fact, it does not much matter to the frontend which BFF is visited behind the scenes; all we need is for the product information to arrive in the format our component model expects, in the simplest and fastest way.
Apollo emerged as a client solution for GraphQL, but with its distinctive approach it quickly became a major player in the GraphQL ecosystem. Apollo's server and client solutions work together in complete harmony, and the compatibility of the parts we use matters greatly as the modern frontend world moves forward so rapidly.
We can use Apollo Federation (@apollo/federation) in our BFF services to make GraphQL schemas combinable.
For now, Apollo Server runs only on Node.js and can integrate with frameworks such as Express, Hapi, and Koa. Therefore, for features like federation, you need to use this infrastructure in all your BFFs. If you want to use Go in one BFF service and Node.js in another, you will have to spend extra effort on issues such as schema stitching.
The gateway (@apollo/gateway) connects to the BFFs in the service list we define, fetches their schemas, and composes them into a single graph. We then send our queries to the gateway; by looking at the composed schema, it understands which service each query targets, forwards the query to the relevant BFF, and relays the BFF's response back to us.
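A rough configuration sketch of such a gateway with @apollo/gateway and apollo-server might look like this. The service names, ports, and URLs are invented for illustration, and `serviceList` reflects the gateway API as it stood at the time of writing.

```javascript
const { ApolloGateway } = require('@apollo/gateway');
const { ApolloServer } = require('apollo-server');

// Each entry points at one BFF; the gateway fetches and composes
// their schemas into a single graph on startup.
const gateway = new ApolloGateway({
  serviceList: [
    { name: 'products-bff', url: 'http://localhost:4001/graphql' },
    { name: 'basket-bff', url: 'http://localhost:4002/graphql' },
  ],
});

// One endpoint for all clients; each incoming query is routed to the
// BFF that owns the matching part of the composed schema.
const server = new ApolloServer({ gateway, subscriptions: false });
server.listen(4000).then(({ url }) => console.log(`Gateway ready at ${url}`));
```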
We can even send more than one query to the gateway at the same time and have each routed to the relevant BFF service. A single endpoint, a single query, a single response. It sounds really good.
By using federated schemas and the gateway solution on Apollo Server together, you can create an ideal BFF layer.
The Points to Consider on Intermediate Layers!
My first thought about intermediate layers is that they need to be watched quite closely. If there is a problem with the data reaching the client, we must be able to identify which layer it originated from.
If you are routing with federated schemas over Apollo Gateway, the first thing to consider is the deployment phase.
On startup, Apollo Gateway fetches the schemas of the BFFs in its service list.
If you have changed or added to the schema in, say, your product BFF, you can create an endpoint such as /schema-reload to have the change reflected on the gateway: when a GET request arrives there, call the gateway's startPollingServices method. The current schemas of the services will then be fetched, composed, and updated again; you can stop polling once it has done its job.
You can automate this step in your Continuous Delivery process. For example, on a platform like Spinnaker, a deployment hook can make a GET request to the gateway's /schema-reload endpoint to refresh the schemas.
In the BFF model we are applying, aiming for 100% test coverage of our services is not utopian. It is critical to write unit and integration tests to be sure this layer returns correct results and runs properly. In general, demanding "100%" coverage is not realistic, but if this is an intermediate layer where you manipulate data, you more or less have to, at least to sleep peacefully at night.
Collecting metrics from your BFF layer and building dashboards to follow them lets you monitor your service's behavior in real time. Being informed immediately when something unexpected happens is critical. When it comes to GraphQL, however, resources on integrating metrics with on-premise solutions are limited.
Among Apollo's solutions, apollo-graph-manager can collect these metrics very effectively and send them to the cloud. On the other hand, a paid service running in the cloud can be a risk given some companies' security policies.
You may want to rely on on-premise solutions to gain more control over your metrics, even though Graph Manager, being purpose-built for GraphQL, contains very useful and powerful features.
Apollo offers a tracing mechanism for GraphQL metrics. If you are not going to send metrics to Graph Manager, you will probably want a metrics database and a solution like Prometheus. In that case, you need to convert the data obtained through Apollo's lifecycle methods into a format Prometheus understands. By writing a small Apollo plugin for this, you can expose the exported data to Prometheus over an endpoint like /metrics.
Because of GraphQL's structure, we need to collect the right metrics. Metrics containing only request/response times will not be enough to determine exactly where a latency under heavy traffic comes from.
For example, when you post a query, it is useful to know how long parsing it takes. Likewise, we need continuous visibility into operations such as the query's validation and how long resolvers take to finish. This lets you figure out which part of a query is problematic and whether it should be optimized (you can create metrics such as parsing, validation, resolve, execution, and total request time).
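As a hedged sketch, an Apollo Server plugin using its documented lifecycle hooks (requestDidStart, parsingDidStart, validationDidStart, executionDidStart, willSendResponse) could record per-phase timings roughly like this. The `console.log` stands in for a real Prometheus histogram, e.g. one created with prom-client; the metric name is invented.

```javascript
// Sketch of an Apollo Server plugin recording per-phase timings.
const timingPlugin = {
  requestDidStart() {
    const requestStart = process.hrtime.bigint();
    return {
      parsingDidStart() {
        const t0 = process.hrtime.bigint();
        return () => observe('parse', t0); // end hook fires when parsing finishes
      },
      validationDidStart() {
        const t0 = process.hrtime.bigint();
        return () => observe('validate', t0);
      },
      executionDidStart() {
        const t0 = process.hrtime.bigint();
        return () => observe('execute', t0);
      },
      willSendResponse() {
        observe('request', requestStart); // total request time
      },
    };
  },
};

function observe(phase, t0) {
  const seconds = Number(process.hrtime.bigint() - t0) / 1e9;
  // In a real setup: histogram.labels(phase).observe(seconds) via prom-client.
  console.log(`graphql_phase_seconds{phase="${phase}"} ${seconds}`);
}
```

The plugin is a plain object, so it can be passed to Apollo Server's `plugins` array and unit-tested in isolation.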
We can visualize the metrics we export in a Prometheus-compatible format with Grafana. If you are using a container orchestrator like Kubernetes, configuring service monitors lets you also collect container-level information for your BFFs, such as CPU, RAM, and heap usage, send it to Prometheus, and visualize it in Grafana.
We need to track errors in intermediate layers like the BFF the moment they occur. Apollo offers lifecycle hooks where we can catch errors. By writing a small Apollo-Sentry plugin, we can capture emerging exceptions, log them with a powerful tracking tool like Sentry, and trigger the necessary alert mechanisms (Slack, mail, etc.). If you want to query the captured exceptions more easily, you can also use Splunk.
Producing meaningful logs from our BFF services is also important for judging whether they are running properly. We can send logs to stdout in a suitable format; on an orchestrator like Kubernetes, you can collect them from stdout, ship them to a tool like Splunk, and set up alerting. Apollo's lifecycle methods can help with log collection as well.
The aggregations we use as a data source may sometimes respond very late or not at all. BFFs should be aware of this and cut the service request short.
Suppose our React application sends a GraphQL query to our BFF to fetch the favorite-products module on a page, and the BFF forwards the call to aggregation. If aggregation starts responding too late, something is probably wrong, yet the BFF just keeps waiting.
On the frontend, you remain in the loading state coming from the BFF, most likely showing a placeholder to your users. The placeholder stays up until the service responds, producing a bad user experience. Placeholders are generally useful, but when shown for too long they create the perception that your site is slow.
You will need to apply the Circuit Breaker pattern to manage such situations. When the timeout values you set are exceeded, the BFF service should return a fallback state to clients very quickly. Based on that information, the client should hide the relevant area until the data arrives, or take a different action according to the fallback scenario.
We have three basic states in the Circuit Breaker pattern.
• In the Open state, requests are rejected immediately until the reset timeout you set has elapsed.
• In the Closed state, requests pass through uninterrupted because failures remain below the threshold.
• In the Half-Open state, a limited number of requests are let through to see whether the problem persists; if it does not, the circuit switches to Closed, otherwise it stays Open.
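A minimal, framework-free sketch of these three states follows; the class and option names are invented for illustration, and real projects might reach for a library such as opossum instead. The clock is injectable so the state machine can be tested deterministically.

```javascript
// Minimal circuit breaker state machine (illustrative, not a library API).
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetTimeoutMs = 10000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;        // injectable clock, useful for tests
    this.failures = 0;
    this.openedAt = null;  // timestamp when the circuit tripped open
  }

  state() {
    if (this.openedAt === null) return 'CLOSED';
    // After the reset timeout, allow a trial request through (Half-Open).
    return this.now() - this.openedAt >= this.resetTimeoutMs ? 'HALF_OPEN' : 'OPEN';
  }

  allowRequest() {
    return this.state() !== 'OPEN';
  }

  recordSuccess() {
    this.failures = 0;
    this.openedAt = null; // back to Closed
  }

  recordFailure() {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) {
      this.openedAt = this.now(); // trip (or re-trip) to Open
    }
  }
}
```

In a BFF, the idea is to check `allowRequest()` before calling aggregation and return the fallback immediately when the circuit is open.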
If a request fails, you may need to retry. A retry mechanism can be summarized as repeating a request within a set of predetermined rules.
For example, you clicked the "show basket" button on a product page and nothing happened. There is probably nothing wrong with the basket service itself; rather, a network error occurred while communicating with it, and the basket contents would most likely appear if you tried again. We call errors in such scenarios "transient failures."
In order to overcome such situations, a suitable retry mechanism should be built.
For retries, the operation must be idempotent: if calling the service more than once yields the same result, we call it idempotent. Retrying non-idempotent operations is risky, and care must be taken.
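A small illustrative retry helper with exponential backoff might look like the following; the names, defaults, and the `isTransient` filter are assumptions for the sketch, not a specific library's API.

```javascript
// Retry an async operation with exponential backoff, retrying only
// errors the caller classifies as transient.
async function retryWithBackoff(operation, {
  retries = 3,
  baseDelayMs = 100,
  isTransient = () => true,
} = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      // Give up on non-transient errors or when attempts are exhausted.
      if (!isTransient(err) || attempt === retries) throw lastError;
      // Exponential backoff between attempts: base, 2x, 4x, ...
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError;
}
```

Remember the caveat above: only wrap idempotent operations in a helper like this.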
Don’t Repeat Yourself
If you have a large organization, the number of your BFFs will grow over time, and you will eventually find yourself with many BFFs full of duplicated code serving the same function. When building your BFF architecture, a good start is to make writing new services easy and to collect modules that perform similar tasks into a core library. That way you can ship new features and improvements from a single point while keeping all your BFFs consistent.
Without a shared core library, inconsistencies creep in between BFF services, and code quality and performance suffer.
For example, you should not re-implement metrics collection or error capturing and logging in every single BFF. Operations such as caching, health checks, metrics, and service calls can be delegated to the core library.
Gathering such common operations in a core library is a good way to reduce complexity and code duplication, but you should set limits on how much your BFF services grow.
Otherwise, these intermediate layers become hard to manage, and you probably don't want to end up with big monolithic services. Creating a core library also means committing to a single language and making your BFF services depend on the library. If that is your intention, set boundaries for the core library and make sure it does not grow too much. Since this layer contains no business logic, we can apply the DRY principle, at least partially.
Finally, this approach also helps clarify teams' areas of responsibility.
Decide how many BFFs you will have. You can create them per feature or per product/client. Over time, duplication across services may increase, so consider how you will manage it.
• A BFF should be a layer that only answers clients; leave other operations to the bottom-level microservices responsible for them.
• BFFs should never communicate with each other.
• How long does it take to create a new BFF? Make sure you have an environment for fast, practical development, and keep operational costs to a minimum.
• Automate all your processes and avoid duplications.
• BFFs should be owned by frontend teams, or by people who know the frontend's behavior well. Before any development, be sure to determine your needs on the frontend side.
• If you have a single client, you will probably never need such a design. Analyze your needs correctly.
The subjects discussed in this article may suit some domains and not others. Shape your architecture around your own business model and produce solutions that fit your needs.
Today, approaches such as independent frontend applications and micro frontends, along with the differing user experiences of each platform, make clear the need for API design that is in harmony with the frontend.