
Cloud Networking Explained: How AWS, Azure, and Google Cloud Handle Your Traffic

Sam Rivera, May 30, 2024

There's a famous saying in tech: "There is no cloud. It's just someone else's computer." It's a humorous oversimplification — but there's enough truth in it to be a useful starting point. Cloud computing is, at its core, about using computing resources (servers, storage, networks, databases) that you access over the internet rather than owning and operating yourself.

But the networking that makes cloud computing possible is genuinely sophisticated — and increasingly, understanding cloud networking is a fundamental skill for developers, sysadmins, and architects. This is an honest, accessible walkthrough of how networking works in the major cloud platforms.

The Basic Cloud Networking Model

When you sign up with a cloud provider like AWS (Amazon Web Services), Azure (Microsoft), or Google Cloud (GCP), you're getting access to massive data centers — enormous buildings full of physical servers, storage, and networking equipment. The cloud provider owns and operates this hardware. You rent access to virtual slices of it.

The key abstraction is virtualization. Physical servers are divided into virtual machines (VMs). Physical networks are divided into virtual networks. The whole cloud is a massive, software-defined layer running on top of physical infrastructure.

Virtual Private Cloud (VPC): Your Network in the Cloud

The foundational networking concept in every major cloud is the VPC (Virtual Private Cloud), which Azure calls a Virtual Network (VNet): a logically isolated section of the cloud provider's network that you control.

When you create a VPC, you're defining:

An IP address range (CIDR block): You choose a range of private IP addresses for your VPC. Typically something like `10.0.0.0/16`, giving you 65,536 possible addresses. Every resource you create in the VPC gets an IP address from this range.

Subnets: You divide the VPC CIDR into smaller subnets. In AWS, each subnet is tied to a specific availability zone (a physically separate data center location); in Azure and GCP, a subnet spans a whole region. You typically create both public subnets (reachable from the internet) and private subnets (no direct internet access, for databases and internal services).

Route Tables: Define where traffic goes. The public subnet's route table has a route to an internet gateway (the VPC's connection to the public internet). The private subnet's route table doesn't.

Security Groups: Virtual firewalls that control what traffic is allowed to and from each resource. Unlike traditional network firewalls that operate at the subnet level, security groups operate at the individual resource level. Each EC2 instance (AWS virtual machine) has its own security groups.

Network ACLs (Access Control Lists): Optional additional layer of subnet-level traffic control. Less commonly configured than security groups.

This structure — VPC with public and private subnets, route tables, and security groups — is the standard pattern for cloud networking. It mirrors traditional network architecture (public DMZ, private internal network) but implemented entirely in software.
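The address arithmetic behind this layout can be checked with Python's standard `ipaddress` module. This is a minimal sketch, not a provisioning script: the zone names and the choice of which /24 blocks become public or private are assumptions made for illustration.

```python
import ipaddress

# The /16 range used as the example in the text.
vpc = ipaddress.ip_network("10.0.0.0/16")
print(vpc.num_addresses)  # 65536

# Carve the VPC range into /24 subnets (256 addresses each).
subnets = list(vpc.subnets(new_prefix=24))

# Hypothetical layout: one public and one private subnet per zone.
zones = ["az-a", "az-b"]
plan = {}
for i, zone in enumerate(zones):
    plan[f"public-{zone}"] = subnets[i]
    plan[f"private-{zone}"] = subnets[100 + i]

for name, net in plan.items():
    print(f"{name}: {net} ({net.num_addresses} addresses)")

# Every subnet must sit inside the VPC range.
assert all(net.subnet_of(vpc) for net in plan.values())
```

Providers reserve a handful of addresses in each subnet for their own use (AWS reserves five), so the usable count per subnet is slightly lower than the raw figure printed here.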

How Traffic Flows In and Out

Internet Gateway (IGW): The VPC's door to the public internet. Resources in public subnets (that have public IP addresses assigned) can communicate with the internet through the IGW. For those resources, the IGW performs one-to-one NAT between the private IP and the public IP.

NAT Gateway: Resources in private subnets (like databases) often need to download software updates or communicate with external APIs, but they shouldn't be directly accessible from the internet. A NAT Gateway placed in a public subnet allows private subnet resources to initiate outbound internet connections while remaining unreachable from the internet directly.

Load Balancers: Cloud providers offer managed load balancers that distribute traffic across multiple backend instances. AWS has the Application Load Balancer (ALB) for HTTP/HTTPS and the Network Load Balancer (NLB) for TCP/UDP. These are all fully managed — you don't manage any servers; you just configure the load balancer's rules and target groups.

Elastic IPs / Static IPs: Normally, public IP addresses assigned to cloud resources can change if the resource is stopped and restarted. An Elastic IP (AWS) or Static IP (GCP) is a persistent public IP address that you allocate and can attach to resources — it remains stable even if the underlying resource is restarted.

Content Delivery Networks (CDN): AWS CloudFront, Azure CDN, and Google Cloud CDN are managed CDN services. They cache your content at edge locations around the world, serving users from the closest location for minimum latency. Integration with cloud storage (like S3) makes hosting and distributing static content globally very straightforward.
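Whether a subnet is "public" or "private" comes down to where its route table sends traffic bound for the internet. The sketch below models that lookup with longest-prefix matching, which is how route tables generally pick among overlapping routes. The target names (`local`, `internet-gateway`, `nat-gateway`) are illustrative labels, not real resource identifiers.

```python
import ipaddress

def next_hop(route_table, destination):
    """Return the target of the most specific route that matches destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if dest in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # no matching route: the packet has nowhere to go
    # The route with the longest prefix (most specific match) wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]

# Hypothetical route tables for a VPC using 10.0.0.0/16.
public_routes = {"10.0.0.0/16": "local", "0.0.0.0/0": "internet-gateway"}
private_routes = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-gateway"}
isolated_routes = {"10.0.0.0/16": "local"}

for name, table in [("public", public_routes),
                    ("private", private_routes),
                    ("isolated", isolated_routes)]:
    print(name, "->", next_hop(table, "93.184.216.34"))
```

Traffic to another address inside the VPC always matches the more specific `local` route, which is why instances in different subnets can reach each other regardless of how the default route is set.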

Connecting Cloud to On-Premises: Hybrid Networking

Most large organizations don't move everything to the cloud — they have hybrid architectures with resources both on-premises (in their own data centers) and in the cloud. Connecting these securely requires dedicated connectivity.

VPN (Site-to-Site VPN): The simplest option. An encrypted VPN tunnel between the cloud VPC and the on-premises network over the public internet. Easy to set up and modest in cost, but performance depends on internet conditions. Throughput is capped per tunnel or gateway, typically in the range of one to a few Gbps (AWS, for example, documents up to 1.25 Gbps per tunnel).

Direct Connect (AWS) / ExpressRoute (Azure) / Cloud Interconnect (GCP): Dedicated private network connections from the cloud provider's facilities to your data center — bypassing the public internet entirely. Arranged through telecommunications providers who have physical connections in the cloud provider's data centers. Provides consistent, high-bandwidth, low-latency connectivity. More expensive, but often necessary for compliance, performance, or large data transfer requirements.

VPC Peering: Connecting two VPCs (possibly in different accounts or even different regions) directly, allowing resources in each to communicate as if they were on the same network. Traffic stays on the cloud provider's private network backbone — doesn't traverse the public internet.

Transit Gateway (AWS) / Virtual WAN (Azure): When you have many VPCs that need to communicate, managing individual peering connections between each pair becomes complex. Transit Gateway is a central hub that all VPCs (and VPN connections and Direct Connect connections) connect to, simplifying the routing topology dramatically.
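A little arithmetic shows why pairwise peering stops scaling. Assuming every VPC must reach every other one, and that peering is not transitive (as is the case for AWS VPC peering), a full mesh needs one connection per pair, while a hub needs one attachment per VPC. The function names below are made up for this sketch.

```python
def full_mesh_peerings(vpc_count):
    """Peering connections needed so every VPC pair is directly connected."""
    return vpc_count * (vpc_count - 1) // 2

def hub_attachments(vpc_count):
    """Attachments needed when every VPC connects to one central hub."""
    return vpc_count

for n in (3, 10, 50):
    print(f"{n} VPCs: {full_mesh_peerings(n)} peerings vs "
          f"{hub_attachments(n)} hub attachments")
```

At 3 VPCs the two approaches cost the same. At 50, the mesh needs 1,225 connections against 50 attachments, each with its own route table entries to maintain.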

DNS in the Cloud

Cloud providers have their own managed DNS services:

**AWS Route 53:** Both a public DNS service (for registering domains and managing public DNS records) and a private DNS service (for resolving names within VPCs).

**Azure DNS / Google Cloud DNS:** Similar managed DNS services.

Cloud DNS services offer health-check-based routing: Route 53, for example, can route traffic to a backup region if the primary becomes unhealthy. They also offer geolocation routing (answer based on where the user is located), latency-based routing (answer with the region that responds fastest for that user), and weighted routing (split traffic between targets by percentage). All of these are powerful features for building globally distributed, resilient services.

Within VPCs, private DNS allows resources to communicate by name rather than IP address. Your backend service can reach your database at `database.internal` instead of a specific IP address — making the architecture much more flexible.
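The failover and weighted policies described above can be sketched in a few lines. This is a toy model of the decision logic only, not how Route 53 or any other DNS service is implemented. The record structure and the `healthy` flag are assumptions standing in for real health checks.

```python
import random

def failover(primary, secondary):
    """Answer with the primary unless its health check is failing."""
    return primary if primary["healthy"] else secondary

def weighted_pick(records, rng=random):
    """Pick one healthy record, with probability proportional to its weight."""
    candidates = [r for r in records if r["healthy"]]
    if not candidates:
        return None
    weights = [r["weight"] for r in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Hypothetical records: send roughly 90% of traffic to one region.
records = [
    {"name": "us-east-1", "weight": 90, "healthy": True},
    {"name": "eu-west-1", "weight": 10, "healthy": True},
]

rng = random.Random(0)  # seeded so the demo is repeatable
tally = {"us-east-1": 0, "eu-west-1": 0}
for _ in range(10_000):
    tally[weighted_pick(records, rng)["name"]] += 1
print(tally)

# Mark the main region unhealthy: all answers shift to the other one.
records[0]["healthy"] = False
print(weighted_pick(records, rng)["name"])
print(failover(records[0], records[1])["name"])
```

Because unhealthy records are filtered out before the weighted draw, a failed region stops receiving traffic without anyone editing the weights.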

Security Groups vs. Network ACLs: Understanding the Difference

This is a common source of confusion. Both are traffic filtering mechanisms in cloud networking, but they work differently:

Security Groups:

Operate at the resource (instance/service) level

Are **stateful**: If you allow an inbound connection, the response is automatically allowed outbound. You don't need to explicitly create a matching outbound rule.

Only allow rules (no explicit deny — anything not allowed is implicitly denied)

Are attached to a resource's network interface, so the rules follow the resource rather than the subnet it sits in

Network ACLs (NACLs):

Operate at the subnet level — apply to all resources in the subnet

Are **stateless**: You must explicitly configure both inbound and outbound rules

Support both allow and deny rules

Rules are evaluated in number order; first matching rule wins

In practice, security groups handle most traffic filtering. NACLs are used for additional subnet-level controls, typically for blocking specific IP addresses or as a backup defense-in-depth layer.
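The two behaviors can be contrasted in a small model. This is a deliberately simplified sketch: real connection tracking works on full flows (protocol, both addresses, both ports), and the rule tuples and class below are invented for illustration rather than taken from any provider's API.

```python
import ipaddress

def nacl_decision(rules, src_ip, port):
    """Stateless NACL: check rules in ascending number order, first match wins."""
    src = ipaddress.ip_address(src_ip)
    for _number, cidr, rule_port, action in sorted(rules, key=lambda r: r[0]):
        port_matches = rule_port is None or rule_port == port  # None = any port
        if port_matches and src in ipaddress.ip_network(cidr):
            return action
    return "deny"  # nothing matched: the implicit final rule denies

class SecurityGroup:
    """Stateful filter: allow rules only; replies to allowed flows pass."""

    def __init__(self, inbound_allow):
        self.inbound_allow = inbound_allow  # list of (cidr, port)
        self.tracked = set()                # simplified connection tracking

    def inbound(self, src_ip, port):
        src = ipaddress.ip_address(src_ip)
        allowed = any(
            port == rule_port and src in ipaddress.ip_network(cidr)
            for cidr, rule_port in self.inbound_allow
        )
        if allowed:
            self.tracked.add((src_ip, port))
        return allowed  # anything not explicitly allowed is denied

    def outbound_reply(self, dst_ip, port):
        # No outbound rule needed: the reply belongs to a tracked flow.
        return (dst_ip, port) in self.tracked

# Rule 100 blocks one address before rule 200 allows HTTPS from anywhere.
nacl = [(200, "0.0.0.0/0", 443, "allow"), (100, "203.0.113.7/32", None, "deny")]
print(nacl_decision(nacl, "203.0.113.7", 443))    # deny
print(nacl_decision(nacl, "198.51.100.20", 443))  # allow

sg = SecurityGroup([("0.0.0.0/0", 443)])
print(sg.inbound("198.51.100.20", 443))           # True
print(sg.outbound_reply("198.51.100.20", 443))    # True
```

Note that the NACL side says nothing about return traffic. In a real stateless filter, the response packets would need their own outbound allow rule, which is exactly the extra configuration that security groups spare you.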

The Software-Defined Networking (SDN) Behind the Cloud

How do cloud providers implement all of this at massive scale? The answer is Software-Defined Networking (SDN).

Traditional networking has the control plane (routing decisions) and data plane (actual packet forwarding) implemented in hardware together in physical routers and switches. SDN separates these: the control plane is centralized software, and the data plane is simplified forwarding hardware that follows instructions from the central controller.

Cloud providers implement their VPC networking through SDN at massive scale. When you configure a VPC, a route table, or a security group, you're updating configuration in the SDN control plane. The control plane then automatically programs the physical networking hardware and software throughout the data center to implement your configuration.

This allows cloud providers to create and destroy thousands of virtual networks instantly, to move workloads between physical servers without changing network addresses, and to enforce security group rules at wire speed in hardware — all with no manual hardware reconfiguration.
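To make the control plane and data plane split concrete, here is a toy model in which a central controller holds the desired state and pushes forwarding tables to simple switches. Everything in it (class names, the `tunnel-to:` convention, host names) is invented for illustration; it does not describe any provider's actual implementation.

```python
class VirtualSwitch:
    """Data plane: forwards using whatever table the controller installed."""

    def __init__(self, name):
        self.name = name
        self.flow_table = {}

    def forward(self, dst_ip):
        return self.flow_table.get(dst_ip, "drop")

class Controller:
    """Control plane: tracks where each VM lives and programs every switch."""

    def __init__(self, switches):
        self.switches = switches
        self.placements = {}  # VM IP -> name of the host it runs on

    def place_vm(self, ip, host):
        self.placements[ip] = host
        self._program_switches()

    def _program_switches(self):
        for switch in self.switches:
            switch.flow_table = {
                ip: "deliver-locally" if host == switch.name else f"tunnel-to:{host}"
                for ip, host in self.placements.items()
            }

host_a, host_b = VirtualSwitch("host-a"), VirtualSwitch("host-b")
controller = Controller([host_a, host_b])

controller.place_vm("10.0.1.5", "host-a")
print(host_b.forward("10.0.1.5"))  # tunnel-to:host-a

# Move the workload: same IP, new physical host, no manual reconfiguration.
controller.place_vm("10.0.1.5", "host-b")
print(host_a.forward("10.0.1.5"))  # tunnel-to:host-b
print(host_b.forward("10.0.1.5"))  # deliver-locally
```

The switches never decide anything themselves. Moving the workload is a single change in the controller's state, which then fans out to every forwarding table.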

Multi-Region Architecture: Building for Global Scale

Cloud providers divide the world into regions, geographic areas with multiple data centers such as `us-east-1` (Northern Virginia) and `eu-west-1` (Ireland), and into availability zones (AZs) within each region: one or more physically separate data centers in the same metropolitan area, with independent power and cooling.

Distributing resources across AZs provides high availability: if one data center has a power failure or other issue, the other AZs keep running. Distributing across regions provides geographic redundancy and lower latency for users in different parts of the world.

Global load balancers (AWS Global Accelerator, Azure Front Door, Google Cloud's Global Load Balancing) can route users to the nearest or healthiest region automatically. Traffic is directed into the cloud provider's global backbone network as early as possible, reducing latency compared to traveling over the public internet.

Observability: Seeing What's Happening in Your Cloud Network

You can't manage what you can't see. Cloud providers offer extensive logging and monitoring for network traffic:

VPC Flow Logs (AWS) / NSG Flow Logs (Azure) / VPC Flow Logs (GCP): Capture metadata about all traffic flowing through your VPC — source IP, destination IP, port, protocol, bytes transferred, and whether traffic was allowed or rejected. Invaluable for security investigations, troubleshooting, and understanding your actual traffic patterns.

Network performance monitoring: CloudWatch (AWS), Azure Monitor, and Google Cloud Monitoring provide metrics on network throughput, packet loss, latency, and connection counts for load balancers and other network services.
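As an example of what flow log analysis looks like, the sketch below parses records in the AWS default (version 2) flow log format, which is a space-separated line with the fields listed in `FIELDS`. The sample records are made up for illustration, and a real pipeline would read them from wherever the logs are delivered rather than from a string.

```python
from collections import Counter

# Field order of the AWS default (version 2) flow log format.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]

def parse_record(line):
    """Turn one space-separated flow log line into a dict of strings."""
    return dict(zip(FIELDS, line.split()))

# Illustrative sample data, not real traffic.
SAMPLE = """\
2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK
2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 49761 3389 6 20 4249 1418530010 1418530070 REJECT OK
2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 49762 3389 6 1 40 1418530070 1418530130 REJECT OK
"""

records = [parse_record(line) for line in SAMPLE.splitlines() if line.strip()]

# Which sources are being rejected most often, and on which ports?
rejected_by_source = Counter(
    r["srcaddr"] for r in records if r["action"] == "REJECT"
)
rejected_ports = Counter(
    r["dstport"] for r in records if r["action"] == "REJECT"
)
print(rejected_by_source.most_common())
print(rejected_ports.most_common())
```

A source that racks up many rejects against a port like 3389 (remote desktop) is a typical starting point for a security investigation.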

The Cloud Is Physical

It's easy to think of cloud networking as entirely abstract, but it's built on very physical infrastructure. When you create a VPC in `us-east-1`, you're using network equipment in Amazon's physical data centers in Northern Virginia. When you enable Direct Connect, there's a physical fiber cable running from your data center to an Amazon facility.

The cloud abstracts away the complexity of that physical infrastructure — you don't have to cable servers or configure physical switches. But the physics of networking still apply: speed of light, bandwidth, latency, physical distance. Choosing a region close to your users isn't just about legal compliance — it's about the speed of light across the fiber optic cable between you and them.
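The speed-of-light point can be put in numbers. Light in optical fiber travels at roughly two thirds of its speed in a vacuum, about 200,000 km per second, or 200 km per millisecond. The sketch below uses that approximation to compute a lower bound on round-trip time. The distances are rough, assumed figures, and real latency is higher because fiber paths are not straight lines and equipment adds delay.

```python
# Assumed approximation: light in fiber covers about 200 km per millisecond.
FIBER_KM_PER_MS = 200.0

def min_round_trip_ms(distance_km):
    """Lower bound on round-trip time over fiber, ignoring all other delays."""
    return 2 * distance_km / FIBER_KM_PER_MS

# Rough, illustrative distances.
for label, km in [("same metro area", 50),
                  ("across a continent", 4000),
                  ("between continents", 10000)]:
    print(f"{label}: at least {min_round_trip_ms(km):.1f} ms round trip")
```

No amount of engineering gets a request below this floor, which is why a user 10,000 km from your region pays at least 100 ms per round trip before any server work begins.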

Understanding cloud networking means understanding both the elegant software abstractions (VPCs, security groups, managed services) and the physical realities they sit on. Together, they make it possible to build globally distributed, highly available, secure applications without owning a single piece of networking hardware.