How does Load balancing work?

Did you know that you can run several servers and divide the work between them? A device or piece of software called a load balancer does exactly that: it forwards clients’ queries to different servers according to the method you configure. Want to know more?

What is Load balancing?

Load balancing is a method of distributing workload across multiple servers or devices. The purpose is to optimize the use of resources, maximize throughput, minimize response time, and increase reliability. It is mainly used to improve the performance and availability of applications, websites, and other services.

It all sounds great, but how does Load balancing work?

How does Load balancing work?

A load balancer follows a decision method that you choose when you set up the system. There are several options, including:

Round-robin load balancing. This is the simplest method. Clients’ queries are handed out in turn to each of your servers. If you have 3 servers, client 1 connects to server 1, client 2 to server 2, client 3 to server 3, and client 4 starts the cycle again with server 1.
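The rotation described above can be sketched in a few lines of Python. This is only an illustration, not how any particular load balancer is implemented. The server names and the round_robin helper are made up for the example.

```python
from itertools import cycle


def round_robin(servers):
    """Return an endless iterator that walks the servers in a fixed order."""
    return cycle(servers)


# Hypothetical server names, matching the three-server example in the text.
servers = ["server1", "server2", "server3"]
picker = round_robin(servers)

# Four clients arrive one after another; each takes the next server in line.
assignments = [next(picker) for _ in range(4)]
print(assignments)
```

The fourth client lands on server1 again, because the rotation simply wraps around once every server has had a turn.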

Weighted round-robin load balancing. This works almost the same way as traditional round-robin, but you can assign a weight to each server. Imagine that server 1 has a 25% weight, server 2 has 25%, and server 3 has 50%. In that case, client 1 will connect to server 1 and client 2 to server 2, but then clients 3 and 4 will both connect to server 3 before the cycle starts over with server 1. Weighted round-robin is used when the servers are not equally powerful, so that the stronger machines receive a larger share of the queries.
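One simple way to picture the weights is to repeat each server in the rotation as many times as its weight. Here is a minimal sketch under that assumption. The integer weights 1, 1, and 2 stand in for the 25%, 25%, and 50% from the example, and the names are hypothetical.

```python
from itertools import cycle


def weighted_round_robin(weights):
    """Build a rotation in which each server appears `weight` times in a row."""
    rotation = [
        server
        for server, weight in weights.items()
        for _ in range(weight)
    ]
    return cycle(rotation)


# Assumed weights: server3 gets half of the traffic, the others a quarter each.
weights = {"server1": 1, "server2": 1, "server3": 2}
picker = weighted_round_robin(weights)

# Five clients: the fifth one starts the cycle again at server1.
assignments = [next(picker) for _ in range(5)]
print(assignments)
```

This naive version sends consecutive clients to the heaviest server. Real balancers often interleave the picks more evenly, but the share each server receives over a full cycle is the same.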

Least connections load balancing. In this case, clients’ queries are sent to the server or device with the fewest active connections. For example, if server 1 has 100 clients connected, server 2 has 120, and server 3 has 90, server 3 will receive the next 10 clients’ queries, assuming no existing connections close in the meantime. This method only checks the number of active connections and pays no attention to overall server performance or latency.
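The example above can be replayed with a small Python sketch. It assumes that connections stay open while the new clients arrive, and the least_connections helper and the connection counts are illustrative only.

```python
def least_connections(active):
    """Return the server that currently has the fewest active connections."""
    return min(active, key=active.get)


# Active connection counts from the example in the text.
active = {"server1": 100, "server2": 120, "server3": 90}

assignments = []
for _ in range(11):
    server = least_connections(active)
    active[server] += 1  # the new client now counts as an active connection
    assignments.append(server)

print(assignments)
print(active)
```

The first 10 clients all go to server3, which brings it level with server1 at 100 connections. In this sketch a tie goes to whichever server is listed first, so the 11th client goes to server1. A real balancer may break ties differently.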

Least response time load balancing. Here the focus is on response time: the faster a server can respond, the better. A load balancer using this method sends clients’ queries to the server that is currently responding the quickest. If measurements show that server 1 responds in 30 ms, server 2 in 40 ms, and server 3 in 35 ms, the client will connect to server 1. This method does not take the load of the servers into account. If many queries come from clients located close to one of the servers, that server will receive more of them.
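As a rough sketch, picking by response time is just a minimum over the most recent measurements. How those measurements are collected is left out here. The latencies_ms dictionary and the function name are assumptions made for the example.

```python
def least_response_time(latencies_ms):
    """Return the server with the lowest measured response time."""
    return min(latencies_ms, key=latencies_ms.get)


# Response times in milliseconds, taken from the example in the text.
latencies_ms = {"server1": 30, "server2": 40, "server3": 35}
first_choice = least_response_time(latencies_ms)

# If server1 later slows down to 50 ms, the choice moves to the next fastest.
latencies_ms["server1"] = 50
second_choice = least_response_time(latencies_ms)

print(first_choice, second_choice)
```

Because the decision depends only on the measured times, the choice shifts as soon as a server starts answering more slowly, whatever the reason for the slowdown.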

Conclusion

So load balancing works by routing clients’ queries to servers according to a specific decision method, such as round-robin, weighted round-robin, least connections, or least response time, in order to provide better performance and availability. It may sound complicated, but a load balancer is essentially a traffic manager that spreads the work so that no single server is overwhelmed, which leads to better overall performance.

Published by Adrian
