Popular websites often receive a considerable volume of traffic. While this is good for business, surges in visitor requests can overwhelm your applications and servers.
Fortunately, there's a simple solution to this challenge: load balancing.
As the name suggests, load balancing is a technique that distributes incoming traffic across several backend servers. An immediate outcome is that more server resources are available to handle requests. As a result, your business apps can handle far more requests at a much faster rate.
The good thing is that you can easily set up load balancing using several software components such as HAProxy. In practical terms, load balancing provides high availability at the network (TCP – layer 4) and application (HTTP – layer 7) layers. The end result is a noticeable improvement in the application speed and performance by distributing the workload across multiple servers. Because of this boost, you’ll find load balancing in almost all high-traffic websites, such as social media networks (Twitter and Instagram) and public email platforms (Gmail is a great example).
How Load Balancing Works
Load balancing is pretty flexible in implementation, and you can mix and match components to fit your particular requirements. The following diagram shows a common implementation of the idea.
Now that you know the theory behind load-balancing solutions, it is time to discuss using High Availability Proxy (HAProxy). This popular load-balancing solution works great with all popular Linux distros, including CentOS 7.
Prerequisites
HAProxy is often used to set up load-balancing solutions because of its flexibility. You can set it up to cater to your project’s requirements and create a solution that ensures the high availability of your business applications.
Let’s start with the prerequisites. Note that the following are not absolute requirements. We have picked these requirements to demonstrate the flexibility of HAProxy and the idea of load balancing.
For this test scenario, we have three servers running CentOS 7. One server is set up as the HAProxy load balancer, and the other two are the backend web servers.
These servers have the following hostnames and IP addresses for easier identification.
192.168.1.1 : web1.test.com
192.168.1.2 : web2.test.com
192.168.1.3 : loadbalancer.test.com
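If you want these hostnames to resolve without setting up DNS, you can add them to /etc/hosts on each of the three servers. The addresses and names below are this tutorial's example values; substitute your own.

```shell
# Append the test hostnames to /etc/hosts on each server
# (example addresses and names from this tutorial)
cat >> /etc/hosts << 'EOF'
192.168.1.1   web1.test.com
192.168.1.2   web2.test.com
192.168.1.3   loadbalancer.test.com
EOF
```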
Installation and Configuration of HAProxy on CentOS 7
Let’s start with installing server software and HAProxy on the CentOS 7 servers.
Step 1: Install HAProxy on the Load Balancing Server
The first step is the installation of HAProxy on the server acting as the load balancer in our test configuration. Use the following commands to install HAProxy on the load balancer server.
On RedHat/CentOS 7 based systems:
# yum install haproxy
On Ubuntu/Debian based systems:
# apt-get install haproxy
Once the installation finishes, start the HAProxy service with the following command:
# systemctl start haproxy
This will start HAProxy on the load balancer.
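You'll likely also want HAProxy to come back up after a reboot. On a systemd-based distro such as CentOS 7, you can enable the service at boot and verify that it is running:

```shell
# Enable HAProxy at boot (systemd systems)
systemctl enable haproxy
# Confirm the service is running; prints "active" on success
systemctl is-active haproxy
```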
Step 2: Install Web Server on the Backend Servers
The other two servers act as the backend servers. We'll set up Apache as the web server to service the requests that the HAProxy server forwards.
We chose Apache because it’s perhaps the most widely used web server today. It is open source and has been a popular choice for setting up servers for web apps.
On web servers 1 and 2, use the following commands to install Apache:
On RedHat/CentOS 7 based systems:
# yum install httpd
On Ubuntu/Debian based systems:
# apt-get install apache2
Once the installation finishes, you’ll notice that the Apache service doesn’t start automatically. You need to enter the following command to start the service.
# systemctl start httpd
You won’t get any indication that the service has been started successfully. To confirm this, you can get the status of the service with the following command:
# systemctl status httpd
You will see an active status if the service is running:
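Beyond the service status, you can also confirm that Apache actually answers HTTP requests locally. This assumes curl is installed on the backend servers:

```shell
# On each backend, request the local web server on port 80;
# this prints the HTTP status code of the response
curl -s -o /dev/null -w "%{http_code}\n" http://localhost/
```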
Step 3: Configure Firewall Settings on Apache Servers
Once the installation is complete, we need to configure the firewall to allow HTTP traffic.
Essentially, this means allowing the HTTP (port 80) and HTTPS (port 443) services on both web servers. For this, use the following commands:
# firewall-cmd --permanent --add-service=http
# firewall-cmd --permanent --add-service=https
Now to make sure that the firewall rules are in force, reload the firewall rules with the following command:
# firewall-cmd --reload
Finally, check the firewall status to see if everything is working as intended.
# systemctl status firewalld
After the firewall reloads, the web servers are ready to accept HTTP traffic.
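To double-check the firewall rules themselves, you can list the services allowed in the default zone:

```shell
# List allowed services in the default zone;
# http and https should appear in the output
firewall-cmd --list-services
```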
Step 4: Add Backend Servers to HAProxy Configuration File
At this point, we have installed HAProxy and Apache server on the test servers.
Now, to connect the components, we need to edit haproxy.cfg (the HAProxy configuration file). We'll comment out the existing content of the file and add our own rules.
Start by entering the following command in the terminal of the HAProxy server.
# vi /etc/haproxy/haproxy.cfg
Comment out the current rules and add the following to the file.
# HAProxy Load Balancer for Apache Web Server

global
    log 127.0.0.1 local2

defaults
    mode http
    timeout connect 5s
    timeout client 30s
    timeout server 30s

frontend http-balancer
    bind 192.168.1.3:80
    stats enable
    stats auth admin:123
    stats uri /stats
    stats refresh 10s
    stats admin if LOCALHOST
    default_backend web-servers

backend web-servers
    balance roundrobin
    server webserver-01 192.168.1.1:80 check
    server webserver-02 192.168.1.2:80 check
Save the file and restart the HAProxy service to ensure the new rules are loaded and in use.
Note that you can bind to custom ports instead of the defaults (80 and 443). Also keep in mind that roundrobin is HAProxy's default balancing algorithm, so the balance directive above simply makes that choice explicit.
Also, note that the check option (at the end of the server directives) enables periodic health checks, so HAProxy only forwards traffic to backends that respond.
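Before restarting, it's worth validating the configuration file; haproxy can check the syntax without actually starting:

```shell
# Check the configuration for syntax errors (-c = check mode)
haproxy -c -f /etc/haproxy/haproxy.cfg
# If the check passes, restart the service to load the new rules
systemctl restart haproxy
```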
HAProxy Status Dashboard
HAProxy also offers a dashboard called HAProxy Status page. Here you can see the essential metrics that present server health, current request rates, response times, and more.
The stats directives we added to the frontend configuration enable this page. For instance, in the following screenshot, you can see that you must provide credentials to access the Status page. This is because of the stats auth directive in the HAProxy config file, which requires users to log in before viewing the page.
Once you provide the proper credentials, the HAProxy Status page opens, showing various important statistics about the load balancer.
Set Up the Logs
We'll now configure rsyslog to control where the HAProxy logs are written. For this, open the rsyslog configuration file:
# vi /etc/rsyslog.conf
While editing, uncomment the following lines:
# Provides UDP syslog reception
$ModLoad imudp
$UDPServerRun 514
Next, create /etc/rsyslog.d/haproxy.conf and add the following lines:
# HAProxy Logging
local2.* /var/log/haproxy.log
Now, restart both the rsyslog and HAProxy services to make sure the changes take effect.
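On a systemd system, restarting both services amounts to:

```shell
# Reload rsyslog so it listens for UDP syslog messages,
# then restart HAProxy so it starts logging to local2
systemctl restart rsyslog
systemctl restart haproxy
```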
Test the Load Balancer
Our test HAProxy load balancer is in place, and we can now check whether it is working correctly. Start by adding an index page on each backend server with text that identifies it (for example, 'This is web1 page'). When you load these pages in the browser, you'll see something like this:
Now access the load balancer's URL to test request forwarding to web1 and web2. Each time you refresh the page, you should see a different message, such as 'This is web1 page' or 'This is web2 page'.
You can also check the logs to see the responses of the servers.
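You can also exercise the balancer from the command line. Assuming the identifying test pages created earlier, a few consecutive requests should alternate between the backends:

```shell
# Send four requests to the load balancer; with roundrobin the
# responses should alternate between web1 and web2
for i in 1 2 3 4; do
    curl -s http://192.168.1.3/
done

# Review the most recent HAProxy log entries for these requests
tail -n 20 /var/log/haproxy.log
```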
Conclusion
This tutorial covered the process of using HAProxy as a load balancer. We showed how you can set up a simple configuration of an HAProxy-based load balancer and two servers that act as the backend for handling the requests.
This setup can be scaled or modified to cater to your requirements. Let us know how you use HAProxy for load balancing for your business apps.