Low Latency in the Cloud

This article explores what the main cloud service providers offer for consistently low latency and fast service response times.

For low-latency workloads, AWS, Azure, and GCP offer a mix of networking capabilities, specialized infrastructure, and services designed to reduce round-trip times and ensure fast data processing. Here’s a comparison of their key offerings:


Direct Connectivity

All three main cloud vendors offer dedicated, private network connectivity; a short sketch for checking link health via the provider SDK follows the list.

  • AWS Direct Connect – Private, dedicated network links between on-premises environments and AWS, providing low and predictable latency (sub-millisecond in many cases where the Direct Connect location sits close to the workload).
  • Azure ExpressRoute – Private, high-bandwidth, low-latency connectivity between Azure and on-prem.
  • GCP Cloud Interconnect – Dedicated private connectivity between on-prem and GCP, similar to Direct Connect and ExpressRoute.
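
Provisioning these links is largely a console or partner workflow, but their health and state can be checked programmatically. The sketch below is a minimal example using boto3 (the AWS Python SDK) to list Direct Connect connections and virtual interfaces; the region name and pre-configured credentials are assumptions, and ExpressRoute and Cloud Interconnect expose analogous calls in their own SDKs.

```python
# Minimal sketch: inspect existing AWS Direct Connect links with boto3.
# Assumes AWS credentials are configured in the environment and that the
# region below is where the connections were provisioned.
import boto3

dx = boto3.client("directconnect", region_name="eu-west-1")

# Dedicated connections: provisioning state, bandwidth and colocation site.
for conn in dx.describe_connections()["connections"]:
    print(conn["connectionId"], conn["connectionState"],
          conn["bandwidth"], conn["location"])

# Virtual interfaces carry the BGP sessions that run over those connections.
for vif in dx.describe_virtual_interfaces()["virtualInterfaces"]:
    print(vif["virtualInterfaceId"], vif["virtualInterfaceState"], vif["vlan"])
```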

Global Routing

All three main cloud vendors offer a facility to optimize routing globally; a short Global Accelerator sketch follows the list:

  • AWS Global Accelerator – Uses AWS's global network for routing traffic to the closest endpoint, reducing jitter and improving consistency.
  • Azure Front Door – Global load balancer with routing across Microsoft's backbone for lower latency.
  • GCP Cloud CDN + Cloud Load Balancing – Routes traffic over Google's backbone to optimize speed.
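
As a concrete example, the sketch below uses boto3 to front an existing regional load balancer with AWS Global Accelerator, so client traffic enters AWS's backbone at the nearest edge location. The load balancer ARN is a placeholder, and the Global Accelerator API is served only from the us-west-2 endpoint; Front Door and Cloud Load Balancing have comparable provisioning flows in their own SDKs.

```python
# Minimal sketch: put AWS Global Accelerator in front of an existing ALB.
# The load balancer ARN is a placeholder; credentials are assumed configured.
import boto3

ga = boto3.client("globalaccelerator", region_name="us-west-2")

accelerator = ga.create_accelerator(
    Name="low-latency-demo", IpAddressType="IPV4", Enabled=True
)["Accelerator"]

listener = ga.create_listener(
    AcceleratorArn=accelerator["AcceleratorArn"],
    Protocol="TCP",
    PortRanges=[{"FromPort": 443, "ToPort": 443}],
)["Listener"]

# The endpoint group anchors the accelerator to a regional load balancer.
ga.create_endpoint_group(
    ListenerArn=listener["ListenerArn"],
    EndpointGroupRegion="eu-west-1",
    EndpointConfigurations=[{
        "EndpointId": "arn:aws:elasticloadbalancing:...",  # placeholder ALB ARN
        "Weight": 128,
    }],
)
```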

Optimizing the Network Stack

For inter-node communication, there are premium offerings that address latency and packet loss. However, this is where the vendors' offerings diverge.
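
Whichever stack is in play, it helps to baseline what the network actually delivers. The sketch below is a vendor-neutral Python example that measures round-trip time and jitter between two VMs using a plain UDP echo; the port number and sample count are arbitrary choices.

```python
# Minimal sketch: measure RTT and jitter between two nodes with a UDP echo.
# Run echo_server() on one VM and probe("<server-ip>") on another.
import socket
import statistics
import time

PORT = 9999  # arbitrary choice

def echo_server() -> None:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", PORT))
    while True:
        data, addr = sock.recvfrom(64)
        sock.sendto(data, addr)  # echo the probe straight back

def probe(server_ip: str, samples: int = 1000) -> None:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(1.0)
    rtts = []
    for i in range(samples):
        start = time.perf_counter()
        sock.sendto(i.to_bytes(8, "big"), (server_ip, PORT))
        sock.recvfrom(64)
        rtts.append((time.perf_counter() - start) * 1e6)  # microseconds
    print(f"median RTT:     {statistics.median(rtts):.1f} us")
    print(f"p99 RTT:        {sorted(rtts)[int(len(rtts) * 0.99)]:.1f} us")
    print(f"jitter (stdev): {statistics.stdev(rtts):.1f} us")
```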

GCP Andromeda Network Stack

Andromeda is Google's custom SDN (Software-Defined Network) stack. It provides low-latency, high-throughput, and scalable networking across the Google Cloud environment. Key features include the following (a sketch showing how the accelerated data path is requested on a VM appears after the list):

  1. SR-IOV for Direct Packet Access

    • SR-IOV (Single Root I/O Virtualization) allows VM instances to bypass the hypervisor's network stack and communicate directly with the underlying NIC (Network Interface Card).
    • This significantly reduces CPU overhead and lowers latency for workloads that require real-time or high-throughput networking (e.g., trading systems, AI/ML, gaming).
  2. Zero-Packet Loss and Low-Jitter Transport

    • Andromeda employs advanced congestion control and packet pacing techniques to ensure consistent low-latency networking.
    • Google's internal backbone prioritizes real-time traffic, reducing the risk of jitter and dropped packets.
  3. Smart Load Balancing with Maglev

    • Google’s Maglev load balancer (which integrates with Andromeda) spreads traffic evenly across multiple backend instances without adding extra latency.
    • This makes Andromeda-powered networking highly efficient for distributed systems.
  4. Host-Based Firewall & Security Policies

    • Unlike traditional networking stacks, Andromeda enforces firewall rules at the virtual NIC (vNIC) level, reducing reliance on external security appliances.
    • This eliminates the latency overhead of network appliances while still providing micro-segmentation and zero-trust security.
  5. Dynamic Path Selection

    • Andromeda can dynamically reroute traffic over Google's private fiber backbone to avoid congested paths.
    • This results in consistent, low-latency networking, even under high traffic loads.
  6. Hypervisor Bypass for Network Acceleration

    • Traditional virtualization stacks rely on a software switch (vSwitch), which adds latency to VM-to-VM traffic.
    • Andromeda minimizes this by offloading networking functions to hardware and enabling direct memory access (DMA) between VMs and the NIC.
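
Although Andromeda itself is transparent to applications, the accelerated data path is requested per VM by choosing the gVNIC interface type. The sketch below uses the google-cloud-compute client library to create such an instance; the project, zone, machine type, and image are placeholder assumptions.

```python
# Minimal sketch: request a GCE VM with the gVNIC interface type, which is
# how the accelerated (non-virtio) virtual NIC path is exposed to guests.
# Project, zone, machine type and image are placeholder assumptions.
from google.cloud import compute_v1

PROJECT, ZONE = "my-project", "europe-west1-b"

nic = compute_v1.NetworkInterface(
    network="global/networks/default",
    nic_type="GVNIC",  # accelerated virtual NIC instead of virtio-net
)

boot_disk = compute_v1.AttachedDisk(
    boot=True,
    auto_delete=True,
    initialize_params=compute_v1.AttachedDiskInitializeParams(
        source_image="projects/debian-cloud/global/images/family/debian-12"
    ),
)

instance = compute_v1.Instance(
    name="low-latency-node",
    machine_type=f"zones/{ZONE}/machineTypes/c2-standard-8",
    network_interfaces=[nic],
    disks=[boot_disk],
)

# insert() returns a long-running operation; result() blocks until done.
operation = compute_v1.InstancesClient().insert(
    project=PROJECT, zone=ZONE, instance_resource=instance
)
operation.result()
```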

The following are the best use cases for Andromeda:

  • High-Frequency Trading (HFT) – Microsecond-level networking for real-time financial applications.
  • Online Gaming & Interactive Apps – Low latency for real-time multiplayer experiences.
  • AI/ML Training & Inference – Fast interconnect between GPUs/TPUs for distributed ML workloads.
  • HPC Workloads – Optimized for high-performance computing clusters requiring minimal network overhead.
  • Live Video Streaming – Supports real-time media processing without buffering delays.

How Andromeda Reduces Latency in Google Cloud

| Feature | How It Lowers Latency |
| --- | --- |
| SR-IOV (Single Root I/O Virtualization) | Direct access to the NIC eliminates hypervisor processing delays. |
| Private Google Backbone Routing | Avoids the public internet for faster data transfer. |
| Dynamic Path Selection | Reroutes traffic in real time to avoid congestion. |
| Zero Packet Loss Transport | Ensures reliable, jitter-free data transmission. |
| Maglev Load Balancer | Eliminates the need for additional hops in networking. |

Note: Andromeda does not require special API calls or changes to application code; its optimizations are transparent to workloads running on GCP.

AWS Elastic Fabric Adapter (EFA)

EFA provides ultra-low-latency networking for HPC and ML workloads using the SRD (Scalable Reliable Datagram) protocol.
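
On AWS, SRD is consumed by attaching an Elastic Fabric Adapter to instances launched in the same cluster placement group. The sketch below uses boto3 to do that; the AMI, subnet, and security group IDs are placeholders, and inside the instances an MPI or NCCL stack would pick up the EFA device via libfabric. The table that follows contrasts Andromeda's SDN approach with SRD.

```python
# Minimal sketch: launch EFA-enabled instances in a cluster placement group,
# the usual setup for SRD-based low-latency traffic. AMI, subnet and security
# group IDs below are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Cluster placement keeps instances close together for lowest latency.
ec2.create_placement_group(GroupName="hpc-cluster", Strategy="cluster")

ec2.run_instances(
    ImageId="ami-xxxxxxxx",            # placeholder AMI
    InstanceType="c5n.18xlarge",       # an EFA-capable instance type
    MinCount=2,
    MaxCount=2,
    Placement={"GroupName": "hpc-cluster"},
    NetworkInterfaces=[{
        "DeviceIndex": 0,
        "SubnetId": "subnet-xxxxxxxx",  # placeholder subnet
        "Groups": ["sg-xxxxxxxx"],      # placeholder security group
        "InterfaceType": "efa",         # attach an Elastic Fabric Adapter
    }],
)
```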

| Feature | Andromeda (GCP) | SRD (AWS) |
| --- | --- | --- |
| Networking Model | SDN-based | Custom transport protocol |
| Hypervisor Bypass | Yes (SR-IOV) | Yes (via EFA) |
| Protocol | Standard TCP/UDP with GCP optimizations | Custom SRD protocol |
| Multipath Routing | Yes (Google backbone reroutes dynamically) | Yes (multi-path, congestion-aware) |
| Reliability Mechanism | Google’s internal loss recovery & congestion control | SRD's built-in packet loss recovery |
| Latency Optimization | Hypervisor bypass + congestion-aware routing | Ultra-low latency, <15 μs for intra-cluster comms |
| Best For | General-purpose cloud networking, VMs, low-latency applications | HPC, ML, tightly coupled compute clusters |

Comparing GCP, AWS, and Azure

| Feature | GCP Andromeda | AWS (ENA & SRD) | Azure Accelerated Networking |
| --- | --- | --- | --- |
| SR-IOV Support | Yes | Yes (ENA) | Yes |
| Private Backbone Routing | Yes | Yes (AWS Global Accelerator) | Yes (Azure Front Door) |
| Hypervisor Bypass | Yes | Yes (SRD) | Yes |
| Jitter-Free Packet Processing | Yes | Yes | Yes |
| Best For | HPC, AI/ML, low-latency apps | HPC, AI/ML, gaming | HPC, financial trading |
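
For completeness, the sketch below shows Azure's equivalent knob: creating a NIC with Accelerated Networking (SR-IOV) enabled via the azure-mgmt-network SDK. The subscription ID, resource group, and subnet ID are placeholders, and the attached VM must use a size that supports accelerated networking.

```python
# Minimal sketch: create an Azure NIC with Accelerated Networking enabled.
# Subscription, resource group and subnet IDs are placeholder assumptions.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import (
    NetworkInterface,
    NetworkInterfaceIPConfiguration,
    Subnet,
)

SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"  # placeholder
RESOURCE_GROUP = "low-latency-rg"                         # placeholder
SUBNET_ID = "/subscriptions/.../subnets/default"          # placeholder

network = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

nic = NetworkInterface(
    location="westeurope",
    enable_accelerated_networking=True,  # SR-IOV data path into the VM
    ip_configurations=[
        NetworkInterfaceIPConfiguration(
            name="ipconfig1", subnet=Subnet(id=SUBNET_ID)
        )
    ],
)

poller = network.network_interfaces.begin_create_or_update(
    RESOURCE_GROUP, "fast-nic", nic
)
print(poller.result().provisioning_state)
```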