The Reverse Proxy Pattern

In the cloud-native world, where a cloud application is a collection of microservices, a few patterns are followed as best practices to achieve cloud qualities. One of them is the Reverse Proxy pattern. Today we will discuss what the Reverse Proxy pattern is, why it is needed, and where it is applicable.

Contents

In which context is this pattern applicable?
What is the Reverse Proxy Pattern? Why do we call it a pattern?
What is Reverse Proxy Server?
How is it Different from Forward Proxy Server?
What exactly is the solution provided by this pattern?
Which Problem does the Reverse Proxy Pattern Solve?
Which consequences does this solution bring?

In which context is this pattern applicable?

The Reverse Proxy pattern is applicable in the following scenarios in cloud development.

Client-Server communication.
It is mostly discussed in the context of web communication over HTTP(S), but it can also be applied in other areas, e.g. communication via FTP.

What is the Reverse Proxy Pattern? Why do we call it a pattern?

Reverse proxies act as intermediaries in the communication between clients and servers. The Reverse Proxy provides a single point of entry to all web servers. We say web servers here because, when a product is deployed in the cloud, its backend consists of many web applications running as web servers. Each of them handles a specific workflow and accepts a specific subset of the HTTP requests. The pattern can be applied in many scenarios involving web clients and web servers, and each scenario solves one or more problems. In many cases a Reverse Proxy also works as a load balancer, or vice versa, but the two are distinct features of a cloud product. You can read NGINX's comparison of Reverse Proxy and Load Balancer for a clear definition of both.

What is Reverse Proxy Server?

A Reverse Proxy server is a web server that is typically owned or managed by the server side of the Client-Server communication and is accessed by clients from the public internet. The diagram below will help you understand the Reverse Proxy. Several clients sit on corporate or client networks. These clients are web browsers accessing the web portal of the cloud software, and users interact with the cloud software through the user interface in the frontend of the cloud application. The frontend sends web requests to the backend of the cloud software. The backend resides on the corporate network of the cloud software provider, or on the network of the cloud infrastructure provider that the company hosting or selling the cloud software uses as Infrastructure as a Service or Platform as a Service. The backend has one or more Resource Servers, each serving a particular resource, meaning that each accepts some or all of the web requests generated by the frontend. The Reverse Proxy sits in front of the Resource Servers, and every request destined for a Resource Server reaches the Reverse Proxy first. The Reverse Proxy then forwards the request to the appropriate Resource Server.
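To make the last step concrete, here is a minimal sketch of how a Reverse Proxy might choose the Resource Server for a request. It assumes routing by URL path prefix, which is only one possible strategy. The routing table, the internal host names and the function name are hypothetical and chosen for illustration only.

```python
# Hypothetical routing table: path prefix -> internal Resource Server address.
# These host names are illustrative assumptions, not part of any real setup.
ROUTES = {
    "/orders": "http://orders.internal:8080",
    "/users": "http://users.internal:8080",
}


def pick_resource_server(path, routes=ROUTES):
    """Return the internal URL a request path maps to, or None if no route matches."""
    # Check longer prefixes first so a more specific route wins over a general one.
    for prefix in sorted(routes, key=len, reverse=True):
        if path == prefix or path.startswith(prefix + "/"):
            return routes[prefix] + path
    return None
```

With this table, a request for /orders/42 would be mapped to the orders server, while a path with no matching prefix returns None, which a real proxy would typically answer with an error response.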

How is it Different from Forward Proxy Server?

Typically, a Forward Proxy is managed by the client side of the Client-Server communication. The clients sit in a private/internal network and can reach only the forward proxy. The forward proxy retrieves resources from the public internet on behalf of the clients.

What exactly is the solution provided by this pattern?

This pattern provides a server that acts as a proxy: it forwards all incoming requests from the client(s) to the actual server(s) or microservices where the resources reside, and then forwards the replies from the actual server(s) back to the client(s). It resides between the client/web UI and the backend server(s). The client can run anywhere on the internet. The server(s) or microservices are inside a protected corporate network and are not publicly accessible from the outside. Only the Reverse Proxy is reachable by all clients. It receives requests from clients and dispatches them to the microservices. The clients are unaware of the actual server from which the response comes.
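The forward-and-relay behaviour described above can be modelled in a few lines. This is an in-process sketch only: the backend is a plain Python function instead of a real network service, and the names Response, inventory_service, ReverseProxy and the X-Served-By header are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    status: int
    body: str
    headers: dict = field(default_factory=dict)


def inventory_service(path):
    # Stand-in for an internal microservice that clients cannot reach directly.
    # It tags its reply with a (hypothetical) header naming the internal host.
    return Response(200, f"inventory data for {path}",
                    {"X-Served-By": "inventory-7.internal"})


class ReverseProxy:
    def __init__(self, backend):
        self._backend = backend

    def handle(self, path):
        # Forward the client's request to the backend.
        reply = self._backend(path)
        # Relay the reply, dropping the header that would reveal the backend.
        public_headers = {k: v for k, v in reply.headers.items()
                          if k != "X-Served-By"}
        return Response(reply.status, reply.body, public_headers)
```

A client calling ReverseProxy(inventory_service).handle("/items/1") receives the backend's status and body, but nothing in the reply identifies which internal server produced it.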

Which Problem does the Reverse Proxy Pattern Solve?

Protection & Flexibility

Without a Reverse Proxy, once a Resource Server is made public, its address (host name with domain) cannot be changed. If we put a Reverse Proxy in front of the Resource Server, only the address of the Reverse Proxy is made public. The Reverse Proxy forwards the requests to the Resource or Web Servers, so any change in the address of a Resource Server only needs to be applied in the Reverse Proxy and stays hidden from the users.
Without a Reverse Proxy, the identity and location of the server(s) can't be hidden. Sometimes we would like to hide the real address of the Resource Servers to avoid attacks on them. We can achieve that using the Reverse Proxy.
Web servers need to take care of firewall and web security features in addition to business logic. When each Resource Server is exposed to the public and clients contact it directly, every Resource Server needs to implement these firewall and web security features.

Integration (multiple servers)

Especially in a microservices architecture, there are multiple servers, each providing a specific resource. In this case we can't satisfy the browsers' Same-Origin Policy (SOP). With a Reverse Proxy, the browser sees only one web server for all requests, even though they access different resources.
A large attack surface is difficult to protect. Without a Reverse Proxy, all Resource Servers are exposed, so an attacker gets a large attack surface and each Resource Server can be attacked. When only the Reverse Proxy is publicly accessible, the attack surface is reduced, and a smaller attack surface is easier to protect.
Without a Reverse Proxy, all web servers must have application firewall features to protect against common web-based attacks, and all web servers must have web security features such as TLS encryption.

Load Balancing

Without a Reverse Proxy, how can we load balance if a web server cannot cope with the load? We cannot scale horizontally by increasing the number of instances of one web server, because each instance will have a different host name. We also cannot dedicate specific web servers to specific requests. We would need a router or dispatcher for each web server to do the load balancing. With a Reverse Proxy, the load balancing can be done in one place.
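As an illustration of doing the load balancing in one place, here is a minimal sketch of a round-robin balancer that a Reverse Proxy could consult for each incoming request. Round-robin is just one common strategy and is an assumption here, as are the class name and the instance addresses in the usage note.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backend instances in a fixed rotating order."""

    def __init__(self, instances):
        instances = list(instances)
        if not instances:
            raise ValueError("at least one instance is required")
        # itertools.cycle repeats the sequence endlessly.
        self._cycle = itertools.cycle(instances)

    def next_instance(self):
        # Each call returns the next instance in the rotation.
        return next(self._cycle)
```

For example, a balancer created with the hypothetical instances "web-1.internal" and "web-2.internal" would alternate between the two on successive calls, while clients continue to see only the host name of the Reverse Proxy.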

Front Door / API Gateway

When multiple microservices each serve a part of the whole application, every server needs to provide user authentication and verification. Each needs to provide a login mechanism and user session management. Providing these features in every microservice wastes a lot of development and maintenance effort. A Reverse Proxy can handle these features in one place, so the microservices do not have to implement them. They only need a way to validate that an incoming request has already been authenticated by the Reverse Proxy and that a valid session exists.
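The article does not prescribe how a microservice checks that a request was authenticated by the Reverse Proxy. One possible approach, sketched below under stated assumptions, is for the proxy to attach the user ID together with an HMAC signature computed with a secret shared between the proxy and the microservices. The header names, the secret and the function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret known only to the proxy and the microservices.
SHARED_SECRET = b"demo-secret-change-me"


def sign_user(user_id, secret=SHARED_SECRET):
    """Proxy side: build the headers to attach after a successful login check."""
    mac = hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"X-Auth-User": user_id, "X-Auth-Signature": mac}


def is_authenticated(headers, secret=SHARED_SECRET):
    """Microservice side: accept the request only if the signature matches."""
    user_id = headers.get("X-Auth-User", "")
    signature = headers.get("X-Auth-Signature", "")
    expected = hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return bool(user_id) and hmac.compare_digest(
        signature.encode("utf-8"), expected.encode("utf-8"))
```

In this sketch a microservice accepts the headers produced by sign_user("alice") and rejects a request where the user header was altered or the signature is missing. A production setup would also need expiry and replay protection, which are left out here.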

When a client must authenticate against each microservice separately, it is annoying and frustrating for the client. Clients do not care about the internal architecture of a cloud application. They want the same experience as with a monolith, where the backend is a single server serving everything. Clients expect Single Sign-On (SSO) across several web applications or microservices: they authenticate only once, and subsequent calls to one or many microservices are automatically authenticated. Because the Reverse Proxy can work as a front door and a gateway for all APIs the microservices provide, we can achieve SSO easily. Clients see only the Reverse Proxy, as an application server to which all backend requests go, so they just need to authenticate against the Reverse Proxy.

When we have a central entry point, we can apply optimizations such as request compression, request and response enrichment, caching static content, etc.
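Of the optimizations listed, caching static content is the easiest to sketch. The example below assumes a very simple in-memory cache keyed by request path, with no expiry or size limit. The class name and the fetch callback are hypothetical stand-ins for the proxy's call to a backend.

```python
class StaticCache:
    """Remembers backend replies for static paths so each is fetched only once."""

    def __init__(self, fetch):
        # fetch is a callable that retrieves the content for a path
        # from the backend (an assumption made for this sketch).
        self._fetch = fetch
        self._store = {}

    def get(self, path):
        if path not in self._store:
            # Cache miss: ask the backend and remember the answer.
            self._store[path] = self._fetch(path)
        # Cache hit (or freshly stored value): no further backend call is made.
        return self._store[path]
```

Repeated requests for the same path, for example a stylesheet, are then answered from the proxy's memory and reach the backend only the first time. A real cache would also need an invalidation or expiry policy.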

Which consequences does this solution bring?

All of the problems we have seen in the previous section can be solved by the Reverse Proxy.

PROs

Masks server address changes and provides a single host name
Protects web servers
Helps minimize the attack surface
Load balancing
Authentication verification
Request and response enrichment
Caching static content
Optimizations like request compression
Fulfils the Same-Origin Policy
Single Sign-On

The main con is that the Reverse Proxy becomes a single point of failure.

Examples are open source web servers like Apache and NGINX, but you can also develop your own reverse proxy.

Here is what we have learned today. In your deployment you can build a reverse proxy as a single entry point for the application, with all microservices sitting behind it. It can work as an application router and load balancer, support OAuth authentication flows, and use JWT to propagate the user context to all microservices. It can hide backend servers/microservices from clients, and the session cookie is established with the main root host. When we want to communicate with different microservices in the backend, we don't have to deal with CORS. It can act as a firewall between microservices. Using a reverse proxy allows request/response manipulation. Customers get seamless SSO. A reverse proxy can isolate the network between the backend servers and the frontend. As an API gateway, it is a gatekeeper between client and server. The reverse proxy can take care of authentication, batching requests to the backend, logging, and SSL offloading.