Azure Internal Load Balancer (ILB) hairpin
1. Introduction
As per the Azure documentation (https://docs.microsoft.com/en-us/azure/load-balancer/load-balancer-overview#limitations), the default behaviour of the Azure Internal Load Balancer is as follows:
"…if an outbound flow from a VM in the backend pool attempts a flow to frontend of the internal Load Balancer in which pool it resides and is mapped back to itself, both legs of the flow don't match and the flow will fail."
So, what happens if your application design requires backend pool members to make calls to the private frontend of the same load balancer they are associated with?
[Image: ILB hairpin - single backend]
In the example above, if VM-WE-02-Web01 initiates a connection to 10.2.1.100:80 (the ILB VIP), the connection will always fail. If the backend pool contained other VMs (e.g. a pool with two instances), there would be a chance (50/50 in that case) that the frontend request is mapped, successfully, to the other backend member, as shown below:
[Image: ILB hairpin - multiple backend]
1.1 Why is this not working?
From the same documentation link:
"When the flow maps back to itself the outbound flow appears to originate from the VM to the frontend and the corresponding inbound flow appears to originate from the VM to itself."
Let's take a look at the default NAT behaviour of the ILB to understand the problem in more detail.
- The Azure ILB does not perform inbound Source NAT (SNAT), so the original source IP is preserved.
- With the default LB rule setting of DSR (aka floating IP) disabled, the ILB does perform Destination NAT (DNAT), rewriting the frontend VIP to the backend instance IP.
[Image: ILB hairpin - NAT translation]
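To make the diagram concrete with the addresses used later in this lab: the outbound SYN from the backend VM is (source 10.2.1.4, destination 10.2.1.100). Because the ILB applies DNAT but no SNAT, the packet delivered back into the same VM is (source 10.2.1.4, destination 10.2.1.4). The guest TCP stack is waiting for a reply from 10.2.1.100, so the two halves of the flow never match.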
All of the above results in the following, again from the original documentation link:
"From the guest OS's point of view, the inbound and outbound parts of the same flow don't match inside the virtual machine. The TCP stack will not recognize these halves of the same flow as being part of the same flow as the source and destination don't match."
We can confirm this behaviour using Wireshark. Firstly, for a flow that does work, showing a successful 3-way TCP handshake. (FYI this flow is sourced from the on-premises location; see the topology diagram in the next section.)
[Image: Wireshark - working flow]
Now for a flow that does not work, showing a failure of the TCP handshake; we do not get past the SYN stage. As the Azure ILB performs DNAT on the traffic returned to the VM (see frame number 7647 in the screenshot below for confirmation of this), the OS is unable to reconcile the flow and we therefore never observe a TCP SYN-ACK.
[Image: Wireshark - non-working flow]
2. Lab Topology
Now that we have detailed the behaviour, let's look at possible workarounds. To do this I will use the following lab environment (a CLI sketch of the load balancer piece follows the list):
- Azure spoke Virtual Network (VNet) containing a Windows Server 2016 VM running IIS, hosting a simple web page
- Azure spoke VNet containing the Azure ILB
- Azure hub VNet containing an ExpressRoute Gateway
- VNet peering between the hub and spoke VNets
- InterCloud ExpressRoute circuit providing connectivity to on-premises
- On-premises DC with a test client virtual machine
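For reference, a minimal Azure CLI sketch of the ILB part of this lab. The resource group, VNet, subnet, and resource names here are illustrative assumptions, not taken from the lab itself; only the frontend IP (10.2.1.100) and port mirror the diagrams:

# internal Standard LB with a static private frontend IP
az network lb create \
  --resource-group rg-lab --name ilb-we-02 --sku Standard \
  --vnet-name vnet-spoke --subnet subnet-web \
  --frontend-ip-name ilb-frontend --private-ip-address 10.2.1.100 \
  --backend-pool-name web-pool

# TCP/80 probe and rule (DSR/floating IP is disabled by default)
az network lb probe create \
  --resource-group rg-lab --lb-name ilb-we-02 \
  --name http-probe --protocol Tcp --port 80

az network lb rule create \
  --resource-group rg-lab --lb-name ilb-we-02 \
  --name http-rule --protocol Tcp --frontend-port 80 --backend-port 80 \
  --frontend-ip-name ilb-frontend --backend-pool-name web-pool \
  --probe-name http-probe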
2.1 Baseline
From the client VM (192.168.2.1) we are able to successfully load the web page via the ILB frontend.
However, from the backend VM (10.2.1.4) we are only able to load the web page using the local VM IP address. Access via the frontend ILB VIP fails, due to the condition described in section 1.
// show single NIC
c:\pstools>ipconfig | findstr /i "ipv4"
IPv4 Address. . . . . . . . . . . : 10.2.1.4

// show working connectivity using the local VM address
c:\pstools>psping -n 3 -i 0 -q 10.2.1.4:80

PsPing v2.10 - PsPing - ping, latency, bandwidth measurement utility
Copyright (C) 2012-2016 Mark Russinovich
Sysinternals - www.sysinternals.com

TCP connect to 10.2.1.4:80:
4 iterations (warmup 1) ping test: 100%
TCP connect statistics for 10.2.1.4:80:
  Sent = 3, Received = 3, Lost = 0 (0% loss),
  Minimum = 0.09ms, Maximum = 0.13ms, Average = 0.11ms

// show baseline failure condition to the front end of the LB
c:\pstools>psping -n 3 -i 0 -q 10.2.1.100:80

PsPing v2.10 - PsPing - ping, latency, bandwidth measurement utility
Copyright (C) 2012-2016 Mark Russinovich
Sysinternals - www.sysinternals.com

TCP connect to 10.2.1.100:80:
4 iterations (warmup 1) ping test: 100%
TCP connect statistics for 10.2.1.100:80:
  Sent = 3, Received = 0, Lost = 3 (100% loss),
  Minimum = 0.00ms, Maximum = 0.00ms, Average = 0.00ms
3. Workarounds
3.1 Workaround Option [1] - Second NIC
- Add a second NIC to the virtual machine (from within the Azure VM config) with a different IP address (we use .5 in the diagram above); a CLI sketch of this step follows below
- Configure a local (OS-level) static route forcing traffic destined to the LB VIP out of the secondary NIC
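A hedged Azure CLI sketch of the NIC addition, reusing the illustrative resource names from section 2 (note the VM must be deallocated before a NIC can be attached):

# create the secondary NIC with a static private IP of 10.2.1.5
az network nic create \
  --resource-group rg-lab --name web01-nic2 \
  --vnet-name vnet-spoke --subnet subnet-web \
  --private-ip-address 10.2.1.5

# attach it to the deallocated VM, then start the VM again
az vm deallocate --resource-group rg-lab --name VM-WE-02-Web01
az vm nic add --resource-group rg-lab --vm-name VM-WE-02-Web01 --nics web01-nic2
az vm start --resource-group rg-lab --name VM-WE-02-Web01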
This works because the packet from backend to frontend now has a different source (10.2.1.5) and destination (10.2.1.100 > DNAT > 10.2.1.4) IP address, so the flow no longer maps back to itself. Verification as per below:
// command line from web server
// show multiple NICs
c:\pstools>ipconfig | findstr /i "ipv4"
IPv4 Address. . . . . . . . . . . : 10.2.1.4
IPv4 Address. . . . . . . . . . . : 10.2.1.5

// show baseline failure condition
c:\pstools>psping -n 3 -i 0 -q 10.2.1.100:80

PsPing v2.10 - PsPing - ping, latency, bandwidth measurement utility
Copyright (C) 2012-2016 Mark Russinovich
Sysinternals - www.sysinternals.com

TCP connect to 10.2.1.100:80:
4 iterations (warmup 1) ping test: 100%
TCP connect statistics for 10.2.1.100:80:
  Sent = 3, Received = 0, Lost = 3 (100% loss),
  Minimum = 0.00ms, Maximum = 0.00ms, Average = 0.00ms

// static route: send traffic destined to the LB front end out of the second NIC
c:\pstools>route add 10.2.1.100 mask 255.255.255.255 10.2.1.1 if 18
OK!

// show working connectivity to the LB front end
c:\pstools>psping -n 3 -i 0 -q 10.2.1.100:80

PsPing v2.10 - PsPing - ping, latency, bandwidth measurement utility
Copyright (C) 2012-2016 Mark Russinovich
Sysinternals - www.sysinternals.com

TCP connect to 10.2.1.100:80:
4 iterations (warmup 1) ping test: 100%
TCP connect statistics for 10.2.1.100:80:
  Sent = 3, Received = 3, Lost = 0 (0% loss),
  Minimum = 0.68ms, Maximum = 1.45ms, Average = 0.99ms
3.2 Workaround Option [2] - Loopback VIP (+ DSR)
- Re-create the Load Balancer rule with DSR enabled. Enabling DSR causes the packet to be delivered to the destination VM with the original destination IP address intact; in our case this is the frontend IP of the ILB (10.2.1.100). With DSR disabled (the default), the packet delivered to the backend VM would instead have a destination IP address of the backend IP itself (10.2.1.4 in our case).
- Configure a loopback interface on the backend VM with the same IP address as the ILB VIP (10.2.1.100)
- Configure the backend VM application (IIS in our case) to listen on the additional IP address. NB: if using Windows Server 2016, enable weakhostsend on both the NIC and the loopback interface (see RFC 1122 for the weak host model). A configuration sketch follows below.
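A hedged sketch of the rule change, again using the illustrative Azure CLI names from section 2 (--floating-ip is the CLI's name for DSR):

az network lb rule create \
  --resource-group rg-lab --lb-name ilb-we-02 \
  --name http-rule-dsr --protocol Tcp --frontend-port 80 --backend-port 80 \
  --frontend-ip-name ilb-frontend --backend-pool-name web-pool \
  --probe-name http-probe --floating-ip true

And a sketch of the guest-side configuration, assuming a Microsoft loopback adapter has been installed and renamed "loopback", and that the primary NIC is named "Ethernet" (both interface names, and the IIS site name, are assumptions):

// assign the ILB VIP to the loopback adapter with a /32 mask
c:\>netsh interface ipv4 add address "loopback" 10.2.1.100 255.255.255.255

// Windows Server 2016: enable weakhostsend on both interfaces (RFC 1122 weak host model)
c:\>netsh interface ipv4 set interface "loopback" weakhostsend=enabled
c:\>netsh interface ipv4 set interface "Ethernet" weakhostsend=enabled

// add an IIS binding so the site also listens on the VIP
c:\>%windir%\system32\inetsrv\appcmd set site /site.name:"Default Web Site" /+bindings.[protocol='http',bindingInformation='10.2.1.100:80:']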
Connectivity is now working externally from the on-premises VM using the ILB VIP 10.2.1.100, with DSR enabled.
c:\pstools>ipconfig | findstr /i "ipv4"
IPv4 Address. . . . . . . . . . . : 192.168.2.1

c:\pstools>psping -n 3 -i 0 -q 10.2.1.100:80

PsPing v2.10 - PsPing - ping, latency, bandwidth measurement utility
Copyright (C) 2012-2016 Mark Russinovich
Sysinternals - www.sysinternals.com

TCP connect to 10.2.1.100:80:
4 iterations (warmup 1) ping test: 100%
TCP connect statistics for 10.2.1.100:80:
  Sent = 3, Received = 3, Lost = 0 (0% loss),
  Minimum = 19.39ms, Maximum = 20.17ms, Average = 19.78ms
Connectivity from the web server itself is also working when accessing the service on 10.2.1.100, as this address now exists locally on the server, aka on-link.
c:\pstools>ipconfig | findstr /i "ipv4"
IPv4 Address. . . . . . . . . . . : 10.2.1.4
IPv4 Address. . . . . . . . . . . : 10.2.1.100

c:\pstools>route print 10.2.1.100
Active Routes:
Network Destination        Netmask          Gateway       Interface  Metric
       10.2.1.100  255.255.255.255         On-link       10.2.1.100     510

c:\pstools>psping -n 3 -i 0 -q 10.2.1.100:80

PsPing v2.10 - PsPing - ping, latency, bandwidth measurement utility
Copyright (C) 2012-2016 Mark Russinovich
Sysinternals - www.sysinternals.com

TCP connect to 10.2.1.100:80:
4 iterations (warmup 1) ping test: 100%
TCP connect statistics for 10.2.1.100:80:
  Sent = 3, Received = 3, Lost = 0 (0% loss),
  Minimum = 0.11ms, Maximum = 0.20ms, Average = 0.14ms
- The backend call to the frontend VIP never leaves the backend VM. This may or may not suit your application requirements, as the request can only be served locally.
- DSR is optional, but allows the backend VM to listen on a common IP (the ILB VIP) for all connections, locally originated and remote.
- You must continue to listen on the physical primary NIC IP address for application connections, otherwise the LB health probes will fail.
3.3 Workaround Option [3] - Application Gateway / NVA
A simple option for HTTP/S traffic is to utilise Azure Application Gateway instead. Note that use of either the Application Gateway (APGW) or an NVA has cost, performance and scale implications, as these are fundamentally different products: both are based on additional compute resources that sit inline with the datapath, whereas the Azure Load Balancer can be thought of more as a function of the Azure SDN.
[Image: Application Gateway]
Application Gateway only supports HTTP/S frontend listeners; therefore, if a load balancing solution for other TCP/UDP ports is needed, an NVA (Network Virtual Appliance) is required. NGINX is one such third-party NVA option.
[Image: NGINX]
See https://github.com/jgmitter/nginx-vmss for a fast-start NGINX config on Azure, including an ARM template. Also see https://github.com/microsoft/PL-DNS-Proxy for a similar NGINX template with the ability to deploy to a custom VNet.
Two NGINX instances are used for high availability. Each instance contains the same proxy/LB configuration. These instances are themselves fronted by an Azure internal load balancer to provide a single frontend IP address for client access. This frontend IP also works for the backend members specified in the NGINX config file, as shown in the diagram above.
The simple NGINX proxy configuration is shown below.
upstream backend {
    server 10.2.1.4;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
    }
}
The backend in my example is a single VM; in production there would be multiple nodes, and therefore additional server lines within the upstream block. E.g.:
upstream backend {
    server 10.2.1.4;
    server 10.2.1.5;
    server 10.2.1.6;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
    }
}
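After editing the configuration, it can be validated and applied without dropping existing connections; this assumes a standard NGINX installation with the binary on the path:

# test the configuration file, then signal a graceful reload
sudo nginx -t
sudo nginx -s reload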