👋 Hi! I’m Bibin Wilson. In each edition, I share practical tips, guides, and the latest trends in DevOps and MLOps to make your day-to-day DevOps tasks more efficient.

In this blog, we will look at:

  1. Understanding VPC Requirements

  2. Understanding application requirements

  3. Choosing a CIDR for VPC

  4. Avoiding IP Address Conflicts (Best Practice)

  5. Subnet Design

  6. VPC & Subnet Documentation (Best Practice)

  7. AWS VPC Topology

  8. VPC Endpoints

This guide focuses only on the AWS cloud environment.

I am not taking a hybrid environment into consideration. We will touch on a few concepts related to hybrid cloud environments, but the key focus is on AWS VPC.

Understand VPC Requirements

As a DevOps engineer, you need to understand the VPC requirements by asking the relevant teams the right questions.

When working on real projects, the following questions will help you understand the VPC requirements better.

  1. Identifying Your Hosting Needs: What do you want to host?

  2. Meeting Compliance Standards: What are its compliance requirements?

  3. Handling Sensitive Information: Does it have applications dealing with PCI/PII data?

  4. Public vs. Private Accessibility: Are the applications internet-facing?

  5. Connecting to On-Premise Systems: Does the VPC require hybrid connectivity to an on-premise environment? If yes, is it DNS-based or IP-based connectivity?

  6. User Accessibility to VPC Services: How are users going to connect to the services hosted in the VPC?

  7. VPC to VPC Connectivity: Does it need access to services hosted on other VPCs that are part of the organization's network?

It is always best to document these requirements.

Deployment Architecture

Before designing a VPC, it is essential to understand the infrastructure requirements of the application.

This guide will walk you through designing a VPC network using an example application and its specific requirements.

The architecture consists of four categories of applications:

  • Web Application (Java-based)

  • Automation Tools (App/Infra CI/CD)

  • Platform Tools (e.g., Prometheus, Grafana, Consul)

  • Managed Services (RDS, CloudWatch, etc.)

Below is the high-level deployment architecture of the application.


VPC Network Design

In most organizations, the VPC is created and managed by a dedicated network team. However, DevOps engineers working with the application team need to define the VPC requirements so that it can host all the required applications.

How to Choose a CIDR for VPC?

The CIDR block for a VPC depends on the number of servers planned for deployment. This includes both self-hosted and AWS-managed services.

We consider not only the immediate requirements but also future expansion. While we may start with 15 servers, the infrastructure should be scalable to accommodate 1,000+ servers in the future.

  • A /16 CIDR block provides 65,536 IPs, which is often far more than a single project needs.

  • A /20 CIDR block (4,096 IPs) may be too small if we scale beyond 1,000 servers.

  • A /18 CIDR block (16,384 IPs) is a balanced choice, ensuring scalability.
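The address counts above can be checked with Python's standard `ipaddress` module. This is only a small sketch; the `10.0.0.0` base address is used for illustration, and the `address_count` helper is a name introduced here.

```python
import ipaddress

# Candidate VPC CIDR sizes from the comparison above.
# The 10.0.0.0 base address is illustrative only.
CANDIDATES = ["10.0.0.0/16", "10.0.0.0/20", "10.0.0.0/18"]


def address_count(cidr: str) -> int:
    """Return the total number of IP addresses in a CIDR block."""
    return ipaddress.ip_network(cidr).num_addresses


for cidr in CANDIDATES:
    print(f"{cidr}: {address_count(cidr):,} addresses")
```

Keep in mind that these are raw block sizes. The usable count is lower once the VPC is split into subnets, because AWS reserves five addresses in every subnet.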

Subnet Design

Based on our application architecture and components, we need the following public and private subnets.

  • 3 Public Subnets – To deploy internet-facing load balancers for the Java app autoscaling group

  • 3 Application Subnets (Private) – To deploy the Java app autoscaling group

  • 3 Database Subnets (Private) – To deploy the RDS MySQL instance

  • 3 Management Subnets (Private) – Dedicated to CI/CD tools and automation services

  • 3 Platform Subnets (Private) – For platform tools such as Prometheus, Grafana, and Consul used for monitoring and service discovery

These subnets should be carved out from the 10.0.0.0/18 CIDR.

  • Starting IP Address: 10.0.0.0

  • Last IP Address: 10.0.63.255

Each subnet must have enough IPs to support scaling needs while maintaining AWS best practices.

For example,
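One possible allocation can be sketched in Python. This is an assumption for illustration, not the allocation from the original design: it gives every one of the 15 subnets an equal /22 block, uses made-up names such as `public-1`, and subtracts the five addresses AWS reserves in each subnet. Real designs often size tiers differently (for example, smaller public subnets).

```python
import ipaddress

VPC_CIDR = ipaddress.ip_network("10.0.0.0/18")
TIERS = ["public", "app", "db", "management", "platform"]  # subnet groups listed above
AZ_COUNT = 3                 # one subnet per tier in each of three availability zones
SUBNET_PREFIX = 22           # assumption: every subnet gets an equal /22 block
AWS_RESERVED_PER_SUBNET = 5  # AWS reserves the first four and the last address


def plan_subnets():
    """Assign consecutive /22 blocks of the VPC to each tier and zone."""
    blocks = VPC_CIDR.subnets(new_prefix=SUBNET_PREFIX)  # yields 16 blocks for a /18
    plan = []
    for tier in TIERS:
        for az in range(1, AZ_COUNT + 1):
            net = next(blocks)
            usable = net.num_addresses - AWS_RESERVED_PER_SUBNET
            plan.append((f"{tier}-{az}", net, usable))
    return plan


for name, net, usable in plan_subnets():
    print(f"{name:<14} {str(net):<16} usable={usable}")
```

With this scheme, 15 of the 16 available /22 blocks are used and one block stays free for future growth.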

Avoiding IP Address Conflicts

Let’s consider a scenario where the 10.0.0.0/16 range is already allocated to a project in an on-prem environment.

Even if there is no hybrid cloud connectivity to on-prem, we should not reuse any part of 10.0.0.0/16 for the VPC, because if hybrid connectivity is set up in the future, it could lead to IP conflicts.

Network teams in organizations prevent IP range conflicts by keeping track of the private IP addresses reserved for each project.

Typically, they use IP Address Management (IPAM) tools to track IP address allocation. These tools provide a centralized view of the IP address space used within the organization.
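The kind of check an IPAM tool performs can be sketched with Python's `ipaddress` module. The `RESERVED` registry and the `find_conflicts` helper below are hypothetical and only mirror the on-prem scenario described above.

```python
import ipaddress

# Hypothetical registry of ranges already reserved in the organization.
RESERVED = {
    "on-prem-project": "10.0.0.0/16",  # the on-prem range from the scenario above
}


def find_conflicts(candidate: str, reserved: dict[str, str]) -> list[str]:
    """Return the names of reserved ranges that overlap the candidate CIDR."""
    cand = ipaddress.ip_network(candidate)
    return [
        name
        for name, cidr in reserved.items()
        if cand.overlaps(ipaddress.ip_network(cidr))
    ]


print(find_conflicts("10.0.0.0/18", RESERVED))  # sits inside the reserved /16
print(find_conflicts("10.1.0.0/18", RESERVED))  # outside the reserved /16
```

Note that a /18 starting at 10.0.0.0 falls inside the reserved /16, so in this scenario the VPC would need a block from a different range.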

The following image shows an example dashboard of an open-source IPAM tool called NetBox.

Private Subnet Access

Since we have private subnets, DevOps engineers and developers need access to the servers on private subnets.

Most organizations set up a VPN connection to the AWS cloud to access the servers deployed in the VPC.

The following are the native options for connecting to instances in the AWS VPC private subnets.

  1. EC2 Instance Connect Endpoint: Lets you connect securely to AWS instances in a private subnet without needing a public IP. It is an identity-aware proxy that uses IAM permissions to authorize the connection. One instance can also be used as a jump server to connect to other instances in the VPC. (Cheapest solution.)

  2. AWS Client VPN (client-to-site VPN): Allows remote workers to access AWS resources securely. Ideal for a distributed team that needs to use AWS services. (Gets expensive with more users.)

  3. Site-to-Site VPN: Connects the on-premises network to the AWS Virtual Private Cloud (VPC). This is the ideal solution for organizations that want a secure, private connection between their on-prem network and AWS. Requires an on-premises VPN device. Setup can be expensive.

  4. AWS Direct Connect: Creates a direct, private link between the on-prem and AWS networks. It is ideal for businesses that need a fast, reliable connection to AWS without using the public internet. It comes with higher upfront costs.

Internet Access For Subnets

Servers in both private and public subnets require internet access.

  • Public Subnet: A subnet is public when its route table has a route to an Internet Gateway (IGW), allowing instances with public IPs to receive inbound traffic directly from the internet.

  • Private Subnet: Subnets without a route to an internet gateway remain private. However, instances in private subnets still need outbound internet access for tasks like reaching third-party services or package repositories.

To enable outbound internet access for private subnets, a NAT Gateway is deployed in a public subnet and the private subnets' route tables send internet-bound traffic to it. This ensures that private subnet instances can access external resources while remaining inaccessible from the internet.

Egress Traffic Filtering

Most organizations use a forward proxy to manage all outbound internet requests from both private and public subnets. This means that even with a NAT Gateway, outbound traffic passes through a firewall service for filtering and control.

AWS offers a service called AWS Network Firewall, which can be integrated with a NAT gateway for egress traffic filtering. You can restrict or filter HTTP and HTTPS traffic using domain names.

Some organizations deploy Squid proxies for domain-based filtering and traffic control.

Large enterprises often use advanced security solutions like Check Point for both ingress (incoming) and egress (outgoing) traffic filtering.

Here is how outbound traffic flows:

  1. All outgoing requests first go through the proxy server.

  2. The proxy applies security policies and filters traffic.

  3. Approved requests are then forwarded through the NAT Gateway to reach the internet.
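The filtering decision in step 2 can be sketched as a simple domain allowlist check. This is a minimal illustration in Python, not the rule syntax of AWS Network Firewall or Squid; the allowlisted domains and the `is_allowed` and `route_request` helpers are hypothetical.

```python
# Hypothetical allowlist of domains that workloads may reach.
ALLOWED_DOMAINS = {"pypi.org", "github.com"}


def is_allowed(host: str, allowed=ALLOWED_DOMAINS) -> bool:
    """Allow a host if it equals an allowlisted domain or is a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == domain or host.endswith("." + domain) for domain in allowed)


def route_request(host: str) -> str:
    """Mimic the flow above: approved requests continue to the NAT Gateway."""
    return "forward-to-nat-gateway" if is_allowed(host) else "blocked-by-proxy"


print(route_request("files.github.com"))
print(route_request("evilgithub.com"))
```

Matching on the full domain or a dot-prefixed suffix avoids accidentally allowing look-alike hosts such as `evilgithub.com`.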

VPC & Subnet Documentation

One of the key things in VPC design is documentation. All VPC configurations should be documented to ensure the VPC stays compliant over time.

You can choose any documentation method. It could be an Excel sheet, Confluence documentation, or GitHub Markdown documentation.

Here is an example documentation of a Public Subnet.

Document every subnet in the same way.
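If you keep the documentation in Markdown, one option is to generate it from a small record per subnet. The sketch below is only an assumption about what such a record could contain: the `SubnetDoc` fields and every value in the example are placeholders, not the fields from the original example.

```python
from dataclasses import dataclass, fields


@dataclass
class SubnetDoc:
    """Hypothetical set of fields to record for each subnet."""
    name: str
    cidr: str
    availability_zone: str
    route_table: str
    purpose: str


def to_markdown(doc: SubnetDoc) -> str:
    """Render one subnet record as a two-column Markdown table."""
    rows = ["| Field | Value |", "| --- | --- |"]
    rows += [f"| {f.name} | {getattr(doc, f.name)} |" for f in fields(doc)]
    return "\n".join(rows)


# Placeholder values for illustration only.
example = SubnetDoc(
    name="public-1",
    cidr="10.0.0.0/22",
    availability_zone="us-east-1a",
    route_table="public-subnet",
    purpose="Internet-facing load balancers",
)
print(to_markdown(example))
```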

Route Table Design

Route tables are important for directing traffic (e.g., public subnets to IGW, private subnets to NAT Gateway).

For each subnet group, we will create a custom route table and add the routes required for those specific subnets.

For example, all three public subnets will share the same public-subnet route table.
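To make the routing behavior concrete, here is a small Python sketch of how a route table picks a target: the most specific (longest prefix) matching route wins. The route tables and the target labels (`local`, `igw`, `nat-gateway`) are simplified placeholders, not real AWS resource IDs.

```python
import ipaddress

# Simplified route tables: destination CIDR -> target (placeholder labels).
PUBLIC_ROUTES = {"10.0.0.0/18": "local", "0.0.0.0/0": "igw"}
PRIVATE_ROUTES = {"10.0.0.0/18": "local", "0.0.0.0/0": "nat-gateway"}


def resolve_target(destination_ip: str, routes: dict[str, str]):
    """Return the target of the most specific route matching the destination."""
    ip = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes.items()
        if ip in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # no route: the traffic is dropped
    _, target = max(matches, key=lambda pair: pair[0].prefixlen)
    return target


print(resolve_target("10.0.5.10", PUBLIC_ROUTES))  # traffic inside the VPC
print(resolve_target("8.8.8.8", PUBLIC_ROUTES))    # internet-bound, public subnet
print(resolve_target("8.8.8.8", PRIVATE_ROUTES))   # internet-bound, private subnet
```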

AWS VPC Topology

The following diagram shows the high-level VPC topology for our design.


Network ACLs

A network access control list (NACL) is the native VPC feature for controlling inbound and outbound traffic at the subnet level.

In our architecture, the connection to the DB subnet should be allowed only from the App subnet and management subnet. The public subnet should not have direct access to the DB subnet.
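The DB subnet policy above can be sketched as NACL-style rule evaluation in Python: rules are checked in ascending rule-number order, the first match decides, and anything unmatched is denied. The subnet CIDRs, rule numbers, and the MySQL port 3306 are assumptions made for this example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    number: int       # lower numbers are evaluated first
    source_cidr: str
    port: int
    action: str       # "allow" or "deny"


# Hypothetical inbound rules for a DB subnet; the CIDRs are assumed values.
DB_INBOUND = [
    NaclRule(100, "10.0.12.0/22", 3306, "allow"),  # an app subnet (assumed CIDR)
    NaclRule(110, "10.0.36.0/22", 3306, "allow"),  # a management subnet (assumed CIDR)
]


def evaluate(rules, source_ip: str, port: int) -> str:
    """Return the action of the first matching rule, or "deny" if none match."""
    ip = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port == port and ip in ipaddress.ip_network(rule.source_cidr):
            return rule.action
    return "deny"  # mirrors the catch-all deny at the end of every NACL


print(evaluate(DB_INBOUND, "10.0.12.25", 3306))  # from the app subnet
print(evaluate(DB_INBOUND, "10.0.0.25", 3306))   # from a public subnet (assumed CIDR)
```

Because NACLs are stateless, a real configuration also needs matching outbound rules for the return traffic.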

VPC Endpoints

VPC interface and gateway endpoints let you connect to AWS managed services such as S3, Secrets Manager, and CloudWatch privately, without sending traffic over the internet. Interface endpoints are powered by AWS PrivateLink.

As per our application architecture, we use the S3, Secrets Manager, and CloudWatch services.
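As a rough planning aid, the endpoints for these services can be listed with a few lines of Python. The endpoint types and service-name suffixes reflect my understanding of AWS naming (`com.amazonaws.<region>.<service>`), and the region is an assumption; verify both against the AWS documentation before provisioning.

```python
# Endpoint type and service-name suffix for each service used by the application.
# S3 is shown with a gateway endpoint; the others use interface endpoints.
ENDPOINTS = {
    "s3": ("Gateway", "s3"),
    "secrets-manager": ("Interface", "secretsmanager"),
    "cloudwatch-metrics": ("Interface", "monitoring"),
    "cloudwatch-logs": ("Interface", "logs"),
}


def endpoint_for(service: str, region: str):
    """Return (endpoint type, full service name) for a service in a region."""
    kind, suffix = ENDPOINTS[service]
    return kind, f"com.amazonaws.{region}.{suffix}"


REGION = "us-east-1"  # assumption: replace with the region of your VPC
for service in ENDPOINTS:
    kind, name = endpoint_for(service, REGION)
    print(f"{service:<20} {kind:<10} {name}")
```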

Here is an AWS official image for reference.

Automating VPC Management

Now that we have all the requirements for the VPC documented, we can use an IaC tool to provision and manage the VPC resources and configurations.

You can use Terraform or CloudFormation to automate and manage a VPC.

Follow the Terraform AWS VPC blog to automate AWS VPC creation.
