refactor: one default common perimeter #1429
base: main
Conversation
…ice control versions.tf module
/gemini review
Code Review
This pull request refactors the VPC Service Controls to use a single, common perimeter, which is a positive architectural simplification. The changes are extensive, touching documentation, Terraform configurations, and helper scripts to align with this new model.

However, the implementation introduces some significant issues. There is a critical conflict in the Cloud Build private pool networking, where both a managed Cloud NAT and a self-managed NAT VM are configured; this needs to be resolved. More importantly, the new perimeter configuration relies on placeholders in the Terraform code that must be manually updated by scripts in a later step. This approach is brittle, error-prone, and undermines the declarative nature of Infrastructure as Code, and I've raised a critical issue to address this architectural flaw. Additionally, there are medium-severity concerns regarding hardcoded IP ranges and the use of a git branch reference instead of a specific tag or commit for a module source.
resource "google_compute_address" "cloud_build_nat" { | ||
project = var.project_id | ||
address_type = "EXTERNAL" | ||
name = "cloud-build-nat" | ||
network_tier = "PREMIUM" | ||
region = "us-central1" | ||
|
||
depends_on = [module.peered_network] | ||
} | ||
|
||
allow = [{ | ||
protocol = "all" | ||
ports = null | ||
}] | ||
resource "google_compute_instance" "vm-proxy" { | ||
project = var.project_id | ||
name = "cloud-build-nat-vm" | ||
machine_type = "e2-medium" | ||
zone = "us-central1-a" | ||
|
||
deny = [] | ||
tags = ["direct-gateway-access", "nat-gateway"] | ||
|
||
log_config = { | ||
metadata = "INCLUDE_ALL_METADATA" | ||
boot_disk { | ||
initialize_params { | ||
image = "debian-cloud/debian-12" | ||
} | ||
}] | ||
} | ||
|
||
network_interface { | ||
network = local.peered_network_name | ||
subnetwork = "sb-b-cbpools-${var.private_worker_pool.region}" | ||
subnetwork_project = var.project_id | ||
|
||
} | ||
|
||
can_ip_forward = true | ||
|
||
// This script configures the VM to do IP Forwarding | ||
metadata_startup_script = "sysctl -w net.ipv4.ip_forward=1 && iptables -t nat -A POSTROUTING -o $(ip addr show scope global | head -1 | awk -F: '{print $2}') -j MASQUERADE" | ||
|
||
service_account { | ||
scopes = ["cloud-platform"] | ||
} | ||
|
||
depends_on = [ | ||
resource.google_compute_router_nat.cb-nat, | ||
module.peered_network | ||
] | ||
} | ||
|
||
# This route will route packets to the NAT VM | ||
resource "google_compute_route" "through-nat" { | ||
name = "through-nat-range1" | ||
project = var.project_id | ||
dest_range = "0.0.0.0/1" | ||
network = local.peered_network_name | ||
next_hop_instance = google_compute_instance.vm-proxy.id | ||
priority = 10 | ||
} | ||
|
||
resource "google_compute_route" "through-nat2" { | ||
name = "through-nat-range2" | ||
project = var.project_id | ||
dest_range = "128.0.0.0/1" | ||
network = local.peered_network_name | ||
next_hop_instance = google_compute_instance.vm-proxy.id | ||
priority = 10 | ||
} | ||
|
||
# This route allow the NAT VM to reach the internet with it's external IP address | ||
|
||
resource "google_compute_route" "direct-to-gateway" { | ||
name = "direct-to-gateway-range1" | ||
project = var.project_id | ||
dest_range = "0.0.0.0/1" | ||
network = local.peered_network_name | ||
next_hop_gateway = "default-internet-gateway" | ||
tags = ["direct-gateway-access"] | ||
priority = 5 | ||
} | ||
|
||
resource "google_compute_route" "direct-to-gateway2" { | ||
name = "direct-to-gateway-range2" | ||
project = var.project_id | ||
dest_range = "128.0.0.0/1" | ||
network = local.peered_network_name | ||
next_hop_gateway = "default-internet-gateway" | ||
tags = ["direct-gateway-access"] | ||
priority = 5 | ||
} | ||
|
||
# Cloud Router | ||
resource "google_compute_router" "cb-router" { | ||
name = "cb-cloud-router" | ||
network = local.peered_network_name | ||
region = "us-central1" | ||
project = var.project_id | ||
} | ||
|
||
# Cloud NAT | ||
resource "google_compute_router_nat" "cb-nat" { | ||
project = var.project_id | ||
name = "cb-cloud-nat" | ||
router = google_compute_router.cb-router.name | ||
region = google_compute_router.cb-router.region | ||
nat_ip_allocate_option = "AUTO_ONLY" | ||
source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES" | ||
|
||
log_config { | ||
enable = true | ||
filter = "ALL" | ||
} | ||
} |
This configuration sets up both a managed Cloud NAT (`google_compute_router_nat`) and a self-managed NAT instance (`google_compute_instance.vm-proxy`) for the same VPC. This is redundant, increases complexity and cost, and could lead to unpredictable egress routing. The managed Cloud NAT is the recommended solution for providing NAT services in GCP. I strongly suggest removing the self-managed NAT VM resources (`google_compute_address.cloud_build_nat`, `google_compute_instance.vm-proxy`, and the associated `google_compute_route` resources) and relying solely on the managed Cloud NAT.
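For concreteness, a minimal sketch of what the egress configuration could look like if only the managed Cloud NAT path is kept (resource names mirror the diff above; this is only an illustration of the suggestion, not the PR's code):

```hcl
# Managed Cloud NAT as the single egress path for the peered worker pool network.
resource "google_compute_router" "cb-router" {
  name    = "cb-cloud-router"
  network = local.peered_network_name
  region  = "us-central1"
  project = var.project_id
}

resource "google_compute_router_nat" "cb-nat" {
  project                            = var.project_id
  name                               = "cb-cloud-nat"
  router                             = google_compute_router.cb-router.name
  region                             = google_compute_router.cb-router.region
  nat_ip_allocate_option             = "AUTO_ONLY"
  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"

  log_config {
    enable = true
    filter = "ALL"
  }
}
```

With the NAT VM, its external address, and the custom `0.0.0.0/1` and `128.0.0.0/1` routes removed, egress would rely on the VPC's default route to the internet gateway plus Cloud NAT, assuming that default route is still present in the peered network.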
1-org/envs/shared/service_control.tf (outdated)
title = "ER app infra -> cicd" | ||
from = { | ||
identities = [ | ||
"serviceAccount:PRJ_APP_INFRA_PIPELINE_NUMBER@cloudbuild.gserviceaccount.com", |
This file contains several hardcoded placeholder values (e.g., `PRJ_APP_INFRA_PIPELINE_NUMBER`, `PRJ_BU1_APP_INFRA_NUMBER`) that are intended to be replaced by `sed` commands in a later step, as documented in `4-projects/README.md`. This approach is highly brittle, error-prone, and circumvents the benefits of Infrastructure as Code by introducing a manual, imperative process. A change in project naming or structure could easily break these scripts. This circular dependency should be resolved architecturally, for instance by having a separate step that updates the perimeter using `terraform_remote_state` to dynamically fetch project numbers after they are created.
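For illustration, a rough sketch of that approach, reading the project number from the 4-projects shared state (the bucket variable, state prefix, and output name follow the paths used later in the README, but the exact layout is an assumption):

```hcl
# Hypothetical: fetch the app infra pipeline project number from the
# 4-projects shared state instead of sed-replacing a placeholder.
data "terraform_remote_state" "bu1_shared" {
  backend = "gcs"
  config = {
    bucket = var.projects_gcs_bucket_tfstate             # assumed input variable
    prefix = "terraform/projects/business_unit_1/shared" # assumed state prefix
  }
}

locals {
  # Identity string built from the fetched project number, matching the
  # format of the placeholder in the diff above.
  app_infra_pipeline_identity = "serviceAccount:${data.terraform_remote_state.bu1_shared.outputs.cloudbuild_project_number}@cloudbuild.gserviceaccount.com"
}
```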
```hcl
nat_proxy_vm_ip_range = "10.1.1.0/24"

single_project_network = {
  subnet_name           = "eab-develop-us-central1"
  subnet_ip             = "10.1.20.0/24"
  subnet_region         = var.subnet_region
  subnet_private_access = true
}
single_project_secondary = {
  "eab-develop-us-central1" = [
    {
      range_name    = "eab-develop-us-central1-secondary-01"
      ip_cidr_range = "192.168.0.0/18"
    },
    {
      range_name    = "eab-develop-us-central1-secondary-02"
      ip_cidr_range = "192.168.64.0/18"
    },
] }
```
The IP ranges for `nat_proxy_vm_ip_range`, `single_project_network`, and `single_project_secondary` are hardcoded within the `locals` block. This reduces the module's flexibility and reusability. It's recommended to define these as input variables, with the current values set as defaults, to allow for easier customization without modifying the module's core logic.
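As a sketch of that suggestion, one of the hardcoded values could be promoted to an input variable with the current value as its default (the variable name simply mirrors the local and is only illustrative):

```hcl
variable "nat_proxy_vm_ip_range" {
  description = "CIDR range used by the NAT proxy VM subnet."
  type        = string
  default     = "10.1.1.0/24" # current hardcoded value kept as the default
}

locals {
  # The local now forwards the variable, so callers can override the range
  # without editing the module's core logic.
  nat_proxy_vm_ip_range = var.nat_proxy_vm_ip_range
}
```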
module "cloud_function" { | ||
source = "GoogleCloudPlatform/cloud-functions/google" | ||
version = "~> 0.6" | ||
source = "git::https://github.com/GoogleCloudPlatform/terraform-google-cloud-functions.git?ref=release-please--branches--main" |
The module source is pinned to a branch (`release-please--branches--main`). While the comment explains this is a temporary workaround, using a branch reference can lead to non-reproducible builds if the branch is updated with breaking changes. For better stability and long-term maintenance, it is recommended to pin to a specific release tag or a commit hash:

```hcl
source = "git::https://github.com/GoogleCloudPlatform/terraform-google-cloud-functions.git?ref=v0.6.0"
```
1. Before moving to the next step, configure the directional policies for your perimeter.

1. Navigate to `gcp-org`.
1. Use `terraform output` to get the App Infra Pipeline Cloud Build project ID and project number.

   ```bash
   cd ../gcp-org/

   export cloudbuild_project_number=$(terraform -chdir="../gcp-projects/business_unit_1/shared/" output -raw cloudbuild_project_number)
   echo "cloud build project number = $cloudbuild_project_number"
   export cloudbuild_project_id=$(terraform -chdir="../gcp-projects/business_unit_1/shared/" output -raw cloudbuild_project_id)
   echo "cloud build project id = $cloudbuild_project_id"
   sed -i'' -e "s/PRJ_APP_INFRA_PIPELINE_NUMBER/${cloudbuild_project_number}/" envs/shared/service_control.tf
   sed -i'' -e "s/PRJ_APP_INFRA_ID/${cloudbuild_project_id}/" envs/shared/service_control.tf
   ```

1. Use `terraform output` to get the bucket used for storing Terraform state for the stage 4-projects foundation pipelines in the seed project.
1. Use `gsutil cat` to get the project numbers of the Shared VPC and peering projects in each environment (production, nonproduction, and development), which are needed to configure the directional app infra policies.

   ```bash
   export projects_gcs_bucket_tfstate=$(terraform -chdir="../terraform-example-foundation/0-bootstrap/" output -raw projects_gcs_bucket_tfstate)
   echo "projects_gcs_bucket_tfstate = ${projects_gcs_bucket_tfstate}"

   export peering_project_number_dev=$(gsutil cat gs://${projects_gcs_bucket_tfstate}/terraform/projects/business_unit_1/development/default.tfstate \
     | jq -r '.outputs.peering_project_number.value')
   echo "peering_project_number_dev = ${peering_project_number_dev}"
   sed -i'' -e "s/PRJS_DEV_SAMPLE_PEERING_NUMBER/${peering_project_number_dev}/" envs/shared/service_control.tf

   export peering_project_number_prod=$(gsutil cat gs://${projects_gcs_bucket_tfstate}/terraform/projects/business_unit_1/production/default.tfstate \
     | jq -r '.outputs.peering_project_number.value')
   echo "peering_project_number_prod = ${peering_project_number_prod}"
   sed -i'' -e "s/PRJS_PROD_SAMPLE_PEERING_NUMBER/${peering_project_number_prod}/" envs/shared/service_control.tf

   export peering_project_number_nonprod=$(gsutil cat gs://${projects_gcs_bucket_tfstate}/terraform/projects/business_unit_1/nonproduction/default.tfstate \
     | jq -r '.outputs.peering_project_number.value')
   echo "peering_project_number_nonprod = ${peering_project_number_nonprod}"
   sed -i'' -e "s/PRJS_NONPROD_SAMPLE_PEERING_NUMBER/${peering_project_number_nonprod}/" envs/shared/service_control.tf

   export shared_vpc_project_number_dev=$(gsutil cat gs://${projects_gcs_bucket_tfstate}/terraform/projects/business_unit_1/development/default.tfstate \
     | jq -r '.outputs.shared_vpc_project_number.value')
   echo "shared_vpc_project_number_dev = ${shared_vpc_project_number_dev}"
   sed -i'' -e "s/PRJS_DEV_SAMPLE_SVPC_NUMBER/${shared_vpc_project_number_dev}/" envs/shared/service_control.tf

   export shared_vpc_project_number_prod=$(gsutil cat gs://${projects_gcs_bucket_tfstate}/terraform/projects/business_unit_1/production/default.tfstate \
     | jq -r '.outputs.shared_vpc_project_number.value')
   echo "shared_vpc_project_number_prod = ${shared_vpc_project_number_prod}"
   sed -i'' -e "s/PRJS_PROD_SAMPLE_SVPC_NUMBER/${shared_vpc_project_number_prod}/" envs/shared/service_control.tf

   export shared_vpc_project_number_nonprod=$(gsutil cat gs://${projects_gcs_bucket_tfstate}/terraform/projects/business_unit_1/nonproduction/default.tfstate \
     | jq -r '.outputs.shared_vpc_project_number.value')
   echo "shared_vpc_project_number_nonprod = ${shared_vpc_project_number_nonprod}"
   sed -i'' -e "s/PRJS_NONPROD_SAMPLE_SVPC_NUMBER/${shared_vpc_project_number_nonprod}/" envs/shared/service_control.tf
   ```

1. Enable the directional app infra rules that match your VPC Service Controls mode in the [service_control.tf](gcp-org/envs/shared/service_control.tf) file: if you are deploying in dry-run mode, set the `required_egress_rules_app_infra_dry_run` and `required_ingress_rules_app_infra_dry_run` variables to `true`; if you are deploying in enforced mode, set the `required_egress_rules_app_infra` and `required_ingress_rules_app_infra` variables to `true`.

   ```bash
   export enforce_vpcsc=$(terraform -chdir="envs/shared/" output -raw enforce_vpcsc); \
   echo "enforce_vpcsc = $enforce_vpcsc"
   if [[ "$enforce_vpcsc" == "false" ]]; then \
     sed -i -E '/^[[:space:]]*\/\/required_ingress_rules_app_infra_dry_run[[:space:]]*=/ s|^[[:space:]]*//||' envs/shared/terraform.tfvars; \
   else \
     sed -i -E '/^[[:space:]]*\/\/required_ingress_rules_app_infra[[:space:]]*=/ s|^[[:space:]]*//||' envs/shared/terraform.tfvars; \
   fi
   if [[ "$enforce_vpcsc" == "false" ]]; then \
     sed -i -E '/^[[:space:]]*\/\/required_egress_rules_app_infra_dry_run[[:space:]]*=/ s|^[[:space:]]*//||' envs/shared/terraform.tfvars; \
   else \
     sed -i -E '/^[[:space:]]*\/\/required_egress_rules_app_infra[[:space:]]*=/ s|^[[:space:]]*//||' envs/shared/terraform.tfvars; \
   fi
   ```

1. Commit and push the changes.

   ```bash
   git add .
   git commit -m "Add infra pipeline directional policies"
   git push

   cd ..
   ```
This new set of instructions requires users to run a long and complex series of manual `gsutil` and `sed` commands to substitute placeholder values in the `1-org` stage. This process is fragile and difficult to debug. It introduces a strong coupling between stages that is managed outside of Terraform, which undermines the reliability and declarative nature of the infrastructure code. This entire manual process should be re-evaluated and replaced with a solution that uses Terraform's native capabilities, such as `terraform_remote_state`, to pass outputs between configurations.
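As a rough sketch of that direction, the per-environment project numbers the README currently extracts with `gsutil cat` could instead be read directly from the 4-projects state (the bucket variable, state prefixes, and output names here mirror the paths and `jq` queries in the README, but the exact layout is an assumption):

```hcl
# One remote-state data source per environment, replacing the gsutil/sed steps.
data "terraform_remote_state" "bu1_env" {
  for_each = toset(["development", "nonproduction", "production"])

  backend = "gcs"
  config = {
    bucket = var.projects_gcs_bucket_tfstate # assumed input variable
    prefix = "terraform/projects/business_unit_1/${each.key}"
  }
}

locals {
  # Output names follow the ones queried with jq in the README.
  peering_project_numbers    = { for env, s in data.terraform_remote_state.bu1_env : env => s.outputs.peering_project_number }
  shared_vpc_project_numbers = { for env, s in data.terraform_remote_state.bu1_env : env => s.outputs.shared_vpc_project_number }
}
```

These maps could then feed the ingress and egress rule definitions in `service_control.tf` directly, removing the placeholder substitution step entirely.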
This PR refactors VPC Service Controls to create a single common perimeter that contains all projects except the Cloud Build project.