Active-Active Data Centers – Network Architect’s Perspective
Disclaimer: This note was written by me (Mayank Nauni) in my personal capacity. The opinions expressed in this article are solely my own and do not reflect the views of my employer or any preference of mine towards any of the OEMs.
The first thing I hear from a lot of network architects when we talk about active/active data centers is “Global Traffic Load Balancers” … and that is it! I believe that is only one component of an active/active DC. It is an important one, but by stopping there we fail to notice the elephant in the room and set the wrong expectations for the design at the same time.
Let us understand a bit about active/active DCs. There are two kinds of active/active DCs:
1. Active/Active DC Infrastructure
2. Active/Active Application Infrastructure
In order to have an active/active application setup, we must first have an active/active data center infrastructure (the network is just one key component of it). From a purely network perspective, an active/active DC can be divided into two halves:
· Ingress Traffic
· Egress Traffic
Ingress Traffic: Global Traffic Managers (intelligent DNS) come to the rescue here. I won't go into the details, but the most common configuration is priority based. I have yet to see a round-robin deployment: calling a DC active/active is often more jargon than reality, and one DC still tends to be more “active” than the other owing to infrastructure sizing, so it attracts more of the ingress traffic.
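To make the difference between priority-based and round-robin answers concrete, here is a minimal Python sketch. It is not how any particular GTM product works; the `Site` class, the function names, the priority values and the addresses (taken from the documentation ranges) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Site:
    name: str             # hypothetical DC label
    vip: str              # address returned in the DNS answer (documentation range)
    priority: int         # lower value = preferred site
    healthy: bool = True  # result of the GTM health check


def resolve_priority(sites):
    """Return the VIP of the healthy site with the lowest priority value."""
    candidates = [s for s in sites if s.healthy]
    if not candidates:
        return None  # no healthy site left to answer with
    return min(candidates, key=lambda s: s.priority).vip


def resolve_round_robin(sites, queries):
    """Rotate answers across the healthy sites, for comparison."""
    healthy = [s for s in sites if s.healthy]
    if not healthy:
        return []
    rotation = cycle(healthy)
    return [next(rotation).vip for _ in range(queries)]


sites = [
    Site("DC1", "192.0.2.10", priority=10),
    Site("DC2", "198.51.100.10", priority=20),
]

print(resolve_priority(sites))        # DC1 answers while it is healthy
print(resolve_round_robin(sites, 4))  # answers alternate between the two DCs

sites[0].healthy = False              # simulate DC1 failing its health check
print(resolve_priority(sites))        # DC2 takes over
```

With the priority policy, DC2 receives ingress traffic only when DC1 is marked unhealthy, which is why such a setup behaves more like active/standby on the ingress side even though both DCs are up.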
Egress Traffic: Here is the tricky part. IGPs in the internal network and BGP on the core help to some extent, but on their own they are still not considered a viable solution. Quite often we experience the “traffic tromboning effect”, although modern network infrastructure has been addressing this for quite some time. The OEMs use different names for it, but if I had to simplify it, it is an anycast IP gateway with LISP, not to mention that overlays like VXLAN have been making this simpler for us. I did consider “firewall clustering” between DCs once, but it is a complex deployment and prone to running into a race condition if the DCI goes down. The firewall cluster was meant to address the issue caused by VMs flying around the DCs due to vMotion: outbound traffic gets dropped by the local firewall because the session is maintained by the firewall in the other DC (when the firewalls are not operating as a cluster).
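The firewall problem described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the `Firewall` class and `migrate_and_send` helper are hypothetical, a Python set stands in for the session table, and passing the same set to both firewalls stands in for the state synchronisation a real cluster would do over the DCI.

```python
class Firewall:
    """Toy stateful firewall: a mid-session packet passes only if the flow is known."""

    def __init__(self, name, session_table=None):
        self.name = name
        # A standalone firewall owns its table; cluster members share one (assumption).
        self.sessions = set() if session_table is None else session_table

    def open_session(self, flow):
        """Record a flow, as would happen when its first packet is allowed."""
        self.sessions.add(flow)

    def forward(self, flow):
        """Allow a mid-session packet only if this firewall holds state for it."""
        return flow in self.sessions


def migrate_and_send(fw_dc1, fw_dc2, flow):
    """The session starts behind DC1's firewall; after the VM moves to DC2,
    the next outbound packet of the same flow hits DC2's firewall."""
    fw_dc1.open_session(flow)
    return fw_dc2.forward(flow)


# Hypothetical 5-tuple: source IP, source port, destination IP, destination port, protocol
flow = ("10.0.0.5", 49152, "203.0.113.7", 443, "tcp")

# Independent firewalls: DC2 has never seen the flow, so the packet is dropped.
standalone = migrate_and_send(Firewall("fw-dc1"), Firewall("fw-dc2"), flow)

# Clustered firewalls: both members see the same session state.
shared = set()
clustered = migrate_and_send(Firewall("fw-dc1", shared), Firewall("fw-dc2", shared), flow)

print(standalone)  # False
print(clustered)   # True
```

The shared table is exactly the part that depends on the DCI in a real deployment, which is why losing the DCI is the risky case for an inter-DC firewall cluster.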
So does an active/active network (GTM + stretched L2/L3 with anycast gateway + BGP outbound load sharing) achieve high availability and a true active/active DC? The answer is “No”. Then, from the business perspective, why would I even invest a hefty sum in it? The answer is: “it is one of the indispensable core components of an active/active DC and serves as the first building block, or rather as the foundation itself”.
What are the core components of an active/active DC?
- Stretched network across the DCs; SDN comes in handy (VXLAN as the overlay)
- Global Load Balancer (intelligent DNS)
- Local Traffic Manager (tightly coupled with the Global Load Balancer)
- Compute virtualization cluster across the DCs
- Shared / virtualized storage across the DCs
- Active/active application architecture (the most important part)
The trickiest part here is achieving active/active for the application’s database tier. More details to follow in forthcoming articles.