Regional Edge Resiliency Zones and Virtual Sites

Introduction:

This article is a follow-up to my earlier article, F5 Distributed Cloud: Virtual Sites – Regional Edge (RE).  In that article, I talked about how to build custom topologies using Virtual Sites on our SaaS data plane, aka Regional Edges.  In this article, we’re going to review an update to our Regional Edge architecture, along with some Virtual Site best practices that come with it.

As utilization of F5’s Distributed Cloud platform has continued to grow, we’ve needed to expand our capacity.  We have added capacity through many different methods over the years.  One strategic approach is building new POPs.  However, even with new POPs, certain regions of the world have a high density of connectivity and will always see higher utilization than other regions.  A perfect example is Ashburn, Virginia in the United States.  Within a POP like Ashburn, we could simply “throw compute at it” within common software stacks.  That is not what we’ve decided to do; instead, F5 is pairing capacity expansion with additional benefits by introducing what we’re calling “Resiliency Zones”.

 

Introduction to Resiliency Zones:

What is a Resiliency Zone?  A Resiliency Zone is simply another Regional Edge cluster within the same metropolitan (metro) area.  These Resiliency Zones may be within the same POP, or within a common campus of POPs.  The Resiliency Zones are made up of dedicated compute structures and have network hardware for different networks that make up our Regional Edge infrastructure. 

So why not follow in AWS’s footsteps and call these Availability Zones?  While in some cases we may split Resiliency Zones across a campus of data centers in separate physical buildings, that may not always be the design.  It is possible that the Resiliency Zones are within the same facility and split between racks.  We didn’t feel this level of separation provided the full Availability Zone-like infrastructure that AWS has built out.  Remember, F5’s services are globally significant, while most cloud providers’ services are locally significant to a region and its set of Availability Zones (in AWS’s case).  While we strive to ensure our services are protected from catastrophic failures, the global availability of F5 Distributed Cloud’s services allows us to be more condensed in our data center footprint within a single region or metro.

I spoke of “additional benefits” above; let’s look at those.  With Resiliency Zones, we’ve created the ability to scale our infrastructure both horizontally and vertically within our POPs.  We’ve also created isolated fault and operational domains.  I personally believe the operational domain is most critical.  Today, when we do maintenance on a Regional Edge, all traffic to that Regional Edge is rerouted to another POP for service.  With Resiliency Zones, while one Regional Edge “Zone” is under maintenance, the other Regional Edge Zone(s) can handle the traffic, keeping the traffic local to the same POP.  In some regions of the world, this is critical to maintaining traffic within the same region and country. 

 

What to Expect with Resiliency Zones

Resiliency Zone Visibility:

Now that we have a little background on what Resiliency Zones are, what should you expect and look out for?  You will begin to see Regional Edges within Console that have a letter associated with them.  For example, alongside “dc12-ash”, which is the original Regional Edge, you’ll see another Regional Edge, “b-dc12-ash”.  We will not be appending an “a” to the original Regional Edge.  As I write this article, the Resiliency Zones have not been released for routing traffic.  They will be soon (June 2025).  You can, however, see the first Resiliency Zone today if you use all Regional Edges by default.  If you navigate to a Performance Dashboard for a Load Balancer, look at the Origin Servers tab, and then sort/filter for dc12-ash, you’ll see both dc12-ash and b-dc12-ash.
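If you script against site names, the naming convention above can be sketched as follows.  This is only an illustration: the pattern (a single lowercase letter and a hyphen in front of the original name) is my assumption based on the one example in this article, and split_re_name is a hypothetical helper, not part of any F5 tooling.

```python
import re
from typing import Optional, Tuple

# Assumption: a Resiliency Zone name is the original RE name prefixed with
# one lowercase letter and a hyphen, as in "b-dc12-ash".
_ZONE_PATTERN = re.compile(r"^(?P<zone>[a-z])-(?P<base>.+)$")


def split_re_name(site_name: str) -> Tuple[Optional[str], str]:
    """Return (zone_letter, base_site_name).

    zone_letter is None for the original Regional Edge, which carries
    no letter prefix.
    """
    match = _ZONE_PATTERN.match(site_name)
    if match:
        return match.group("zone"), match.group("base")
    return None, site_name
```

With this sketch, both “dc12-ash” and “b-dc12-ash” resolve to the same base name, which is handy when grouping dashboard data by POP.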

 

Customer Edge Tunnels:

For now, Customer Edge (CE) sites will not terminate their tunnels on a Resiliency Zone.  We’re working to make sure we have the right rules for tunnel terminations in different POPs.  We can also give customers the option to choose whether they want tunnels to be in the same POP across Resiliency Zones.  Once the logic and capabilities are in place, we’ll allow CE tunnels to terminate on Resiliency Zone Regional Edges.

 

Site Selection and Virtual Sites:

Resiliency Zones should not be chosen as the only site or virtual site available for an origin.  We’ve built some safeguards into the UI that will give you an error if you try to assign a Resiliency Zone RE site without the original RE site in the same association.  For example, you cannot apply b-dc12-ash to an origin configuration without also including dc12-ash.
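If you manage origin site lists through automation, a pre-check along these lines can catch the problem before the UI does.  This is a minimal sketch of the rule described above, not F5’s actual validation logic; the function name and the letter-prefix naming pattern are my assumptions.

```python
import re

# Assumption: Resiliency Zone names look like "<letter>-<original name>".
_ZONE_PATTERN = re.compile(r"^[a-z]-(?P<base>.+)$")


def missing_original_sites(selected_sites):
    """Return the original RE sites that must be added to the selection.

    For every selected Resiliency Zone site (e.g. "b-dc12-ash"), the
    original RE (e.g. "dc12-ash") has to be selected as well.
    """
    selected = set(selected_sites)
    missing = set()
    for name in selected:
        match = _ZONE_PATTERN.match(name)
        if match and match.group("base") not in selected:
            missing.add(match.group("base"))
    return sorted(missing)
```

An empty result means the selection satisfies the rule; anything else lists the original RE sites you still need to include.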

If you’re unfamiliar with Virtual Sites on F5’s Regional Edge data planes, please refer to the link at the top of this article.  When setting up a Virtual Site, we use a site selector label.  In my earlier article, I highlight the labels that are associated with each site.  What we see used most often are: Country, Region, and SiteName.  If you choose to use SiteName, your Virtual Site will not automatically add the new Resiliency Zone.  For example, suppose your site selector uses “SiteName in dc12-ash”.  When b-dc12-ash comes online, it will not be matched and will not automatically be used for additional capacity.  Whereas if you used “country in USA” or “region in Ashburn”, then dc12-ash and b-dc12-ash would both be available to your services right away.
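To make the difference concrete, here is a small sketch of how an “in” style selector behaves against two sites.  The label keys (SiteName, Region, Country) and their values are taken from the wording in this article and are assumptions for illustration; the exact label keys and values in Console may differ, and select_sites is a hypothetical helper.

```python
# Hypothetical site labels for the original RE and its Resiliency Zone.
SITES = [
    {"SiteName": "dc12-ash", "Region": "Ashburn", "Country": "USA"},
    {"SiteName": "b-dc12-ash", "Region": "Ashburn", "Country": "USA"},
]


def select_sites(sites, key, allowed_values):
    """Mimic a '<key> in (<values>)' selector and return matching site names."""
    return [site["SiteName"] for site in sites if site.get(key) in allowed_values]


# Pinned to a site name: the new Resiliency Zone is not picked up.
by_name = select_sites(SITES, "SiteName", {"dc12-ash"})

# Selected by region: both the original RE and the Resiliency Zone match.
by_region = select_sites(SITES, "Region", {"Ashburn"})
```

The SiteName selector only ever returns the site it names, while the Region selector grows on its own as new sites with that label come online.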

 

Best Practices for Virtual Sites:

What is the best practice when it comes to Virtual Sites?  I wouldn’t be in tech if I didn’t say “it depends”.  It is ultimately up to you how much control you want versus how much operational overhead you’re willing to take on.  Some people may say they don’t want to have to manage their Virtual Sites every time F5 changes capacity, whether that means adding new Regional Edges in new POPs or adding Resiliency Zones to existing POPs.

Others may say they want to control when traffic starts routing through new capacity and infrastructure to their origins.  Oftentimes this control is to ensure that changes to customer-controlled security (firewall rules, network security groups, geo-ip db, etc.) are approved and in place first.  As shown in the graph, the more control you want, the more operations you will maintain.

 

What would I recommend?  I would go less granular in how I set up Regional Edge Virtual Sites.  I would want as much compute capacity as close as possible to the clients of my applications, so F5 services can serve them locally.  I’d also want attackers, bots, bad guys, or any traffic that isn’t an actual client to have security applied as close as possible to the source.  Lastly, as we see L7 DDoS continue to rise, the more points of presence for L7 security I can provide and scale, the better my chance of mitigating an attack.

 

To achieve a less granular approach to virtual sites, it is critical to:

  1. Pay attention to our maintenance notices. If we’re adding IP prefixes to our allowed firewall/proxy list of IPs, we will send notice well in advance of these new prefixes becoming active. 
  2. Update your firewall rules and security groups, and verify the new prefixes with your geo-ip database provider.
  3. Understand your client-side/downstream/VIP strategy vs. your server-side/upstream/origin strategy, and how the different virtual site models might impact each.
  4. When in doubt, ask. Ask for help from your F5 account team.  Open a support ticket.  We’re here to help.
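For the first two items, a quick check like the one below can tell you whether the prefixes from a maintenance notice are already covered by your allowlist.  This is a sketch under stated assumptions: the prefixes in the usage lines are documentation placeholders (not F5 ranges), and uncovered_prefixes is a hypothetical helper built on Python’s standard ipaddress module.

```python
import ipaddress


def uncovered_prefixes(announced, allowlist):
    """Return announced prefixes that no allowlist entry fully contains."""
    allowed = [ipaddress.ip_network(prefix) for prefix in allowlist]
    missing = []
    for prefix in announced:
        network = ipaddress.ip_network(prefix)
        # subnet_of() requires both networks to be the same IP version.
        covered = any(
            network.version == entry.version and network.subnet_of(entry)
            for entry in allowed
        )
        if not covered:
            missing.append(prefix)
    return missing


# Placeholder prefixes from the documentation ranges, not real F5 prefixes.
current_allowlist = ["192.0.2.0/24"]
announced_in_notice = ["192.0.2.0/25", "198.51.100.0/24"]
to_add = uncovered_prefixes(announced_in_notice, current_allowlist)
```

Anything returned in to_add is a prefix you would still need to allow before the new capacity starts sending traffic to your origins.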

 

Summary:

F5’s Distributed Cloud platform needed an additional scaling mechanism for the infrastructure that delivers services to its customers.  To meet that need, we decided to add capacity through more Regional Edges within a common POP.  This strategy offers both F5 and customer operations teams enhanced flexibility.  Remember, Resiliency Zones are just another Regional Edge.  I hope this article is helpful, and please let me know what you think in the comments below.

Published Jun 13, 2025
Version 1.0