Conversation

mariammartins
Contributor

This PR refactors VPC Service Controls to create a single common perimeter that contains all projects, with the exception of the Cloud Build project.

@mariammartins
Contributor Author

/gemini review

gemini-code-assist bot left a comment


Code Review

This pull request refactors the VPC Service Controls to use a single, common perimeter, which is a positive architectural simplification. The changes are extensive, touching documentation, Terraform configurations, and helper scripts to align with this new model. However, the implementation introduces some significant issues. There's a critical conflict in the Cloud Build private pool networking where both a managed Cloud NAT and a self-managed NAT VM are configured, which needs to be resolved. More importantly, the new perimeter configuration relies on placeholders in the Terraform code that must be manually updated using scripts in a later step. This approach is brittle, error-prone, and undermines the declarative nature of Infrastructure as Code. I've raised a critical issue to address this architectural flaw. Additionally, there are medium-severity concerns regarding hardcoded IP ranges and the use of a git branch reference instead of a specific tag or commit for a module source.

Comment on lines +215 to +323
resource "google_compute_address" "cloud_build_nat" {
project = var.project_id
address_type = "EXTERNAL"
name = "cloud-build-nat"
network_tier = "PREMIUM"
region = "us-central1"

depends_on = [module.peered_network]
}

allow = [{
protocol = "all"
ports = null
}]
resource "google_compute_instance" "vm-proxy" {
project = var.project_id
name = "cloud-build-nat-vm"
machine_type = "e2-medium"
zone = "us-central1-a"

deny = []
tags = ["direct-gateway-access", "nat-gateway"]

log_config = {
metadata = "INCLUDE_ALL_METADATA"
boot_disk {
initialize_params {
image = "debian-cloud/debian-12"
}
}]
}

network_interface {
network = local.peered_network_name
subnetwork = "sb-b-cbpools-${var.private_worker_pool.region}"
subnetwork_project = var.project_id

}

can_ip_forward = true

// This script configures the VM to do IP Forwarding
metadata_startup_script = "sysctl -w net.ipv4.ip_forward=1 && iptables -t nat -A POSTROUTING -o $(ip addr show scope global | head -1 | awk -F: '{print $2}') -j MASQUERADE"

service_account {
scopes = ["cloud-platform"]
}

depends_on = [
resource.google_compute_router_nat.cb-nat,
module.peered_network
]
}

# These routes send packets to the NAT VM
resource "google_compute_route" "through-nat" {
name = "through-nat-range1"
project = var.project_id
dest_range = "0.0.0.0/1"
network = local.peered_network_name
next_hop_instance = google_compute_instance.vm-proxy.id
priority = 10
}

resource "google_compute_route" "through-nat2" {
name = "through-nat-range2"
project = var.project_id
dest_range = "128.0.0.0/1"
network = local.peered_network_name
next_hop_instance = google_compute_instance.vm-proxy.id
priority = 10
}

# These routes allow the NAT VM to reach the internet with its external IP address
resource "google_compute_route" "direct-to-gateway" {
name = "direct-to-gateway-range1"
project = var.project_id
dest_range = "0.0.0.0/1"
network = local.peered_network_name
next_hop_gateway = "default-internet-gateway"
tags = ["direct-gateway-access"]
priority = 5
}

resource "google_compute_route" "direct-to-gateway2" {
name = "direct-to-gateway-range2"
project = var.project_id
dest_range = "128.0.0.0/1"
network = local.peered_network_name
next_hop_gateway = "default-internet-gateway"
tags = ["direct-gateway-access"]
priority = 5
}

# Cloud Router
resource "google_compute_router" "cb-router" {
name = "cb-cloud-router"
network = local.peered_network_name
region = "us-central1"
project = var.project_id
}

# Cloud NAT
resource "google_compute_router_nat" "cb-nat" {
project = var.project_id
name = "cb-cloud-nat"
router = google_compute_router.cb-router.name
region = google_compute_router.cb-router.region
nat_ip_allocate_option = "AUTO_ONLY"
source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"

log_config {
enable = true
filter = "ALL"
}
}


critical

This configuration sets up both a managed Cloud NAT (google_compute_router_nat) and a self-managed NAT instance (google_compute_instance.vm-proxy) for the same VPC. This is redundant, increases complexity and cost, and could lead to unpredictable egress routing. The managed Cloud NAT is the recommended solution for providing NAT services in GCP. I strongly suggest removing the self-managed NAT VM resources (google_compute_address.cloud_build_nat, google_compute_instance.vm-proxy, and the associated google_compute_route resources) and relying solely on the managed Cloud NAT.

title = "ER app infra -> cicd"
from = {
  identities = [
    "serviceAccount:PRJ_APP_INFRA_PIPELINE_NUMBER@cloudbuild.gserviceaccount.com",


critical

This file contains several hardcoded placeholder values (e.g., PRJ_APP_INFRA_PIPELINE_NUMBER, PRJ_BU1_APP_INFRA_NUMBER) that are intended to be replaced by sed commands in a later step, as documented in 4-projects/README.md. This approach is highly brittle, error-prone, and circumvents the benefits of Infrastructure as Code by introducing a manual, imperative process. A change in project naming or structure could easily break these scripts. This circular dependency should be resolved architecturally, for instance, by having a separate step that updates the perimeter using terraform_remote_state to dynamically fetch project numbers after they are created.
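
For illustration, a minimal sketch of that remote-state approach (the bucket variable and the `cloudbuild_project_number` output mirror the names used by the sed commands in 4-projects/README.md; the data-source wiring itself is an assumption, not part of this PR):

```hcl
# Hypothetical sketch: read the 4-projects shared state instead of sed-patching placeholders.
data "terraform_remote_state" "bu1_shared" {
  backend = "gcs"

  config = {
    bucket = var.projects_gcs_bucket_tfstate           # assumed input variable
    prefix = "terraform/projects/business_unit_1/shared" # assumed state prefix
  }
}

locals {
  # Replaces the PRJ_APP_INFRA_PIPELINE_NUMBER placeholder dynamically.
  app_infra_pipeline_identity = "serviceAccount:${data.terraform_remote_state.bu1_shared.outputs.cloudbuild_project_number}@cloudbuild.gserviceaccount.com"
}
```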

Comment on lines +20 to +38
nat_proxy_vm_ip_range = "10.1.1.0/24"

single_project_network = {
  subnet_name           = "eab-develop-us-central1"
  subnet_ip             = "10.1.20.0/24"
  subnet_region         = var.subnet_region
  subnet_private_access = true
}

single_project_secondary = {
  "eab-develop-us-central1" = [
    {
      range_name    = "eab-develop-us-central1-secondary-01"
      ip_cidr_range = "192.168.0.0/18"
    },
    {
      range_name    = "eab-develop-us-central1-secondary-02"
      ip_cidr_range = "192.168.64.0/18"
    },
  ]
}


medium

The IP ranges for nat_proxy_vm_ip_range, single_project_network, and single_project_secondary are hardcoded within the locals block. This reduces the module's flexibility and reusability. It's recommended to define these as input variables, with the current values set as defaults, to allow for easier customization without modifying the module's core logic.
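
For example, a sketch of promoting these to input variables (variable names are illustrative; `subnet_region` has to stay in `locals`, since variable defaults cannot reference `var.subnet_region`):

```hcl
variable "nat_proxy_vm_ip_range" {
  description = "IP CIDR range used by the NAT proxy VM subnet."
  type        = string
  default     = "10.1.1.0/24"
}

variable "single_project_network" {
  description = "Subnet definition for the single-project network."
  type = object({
    subnet_name           = string
    subnet_ip             = string
    subnet_private_access = bool
  })
  default = {
    subnet_name           = "eab-develop-us-central1"
    subnet_ip             = "10.1.20.0/24"
    subnet_private_access = true
  }
}
```

The `locals` block would then merge `var.single_project_network` with `subnet_region = var.subnet_region`, keeping the current values as defaults while letting callers override them.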

module "cloud_function" {
source = "GoogleCloudPlatform/cloud-functions/google"
version = "~> 0.6"
source = "git::https://github.com/GoogleCloudPlatform/terraform-google-cloud-functions.git?ref=release-please--branches--main"


medium

The module source is pinned to a branch (release-please--branches--main). While the comment explains this is a temporary workaround, using a branch reference can lead to non-reproducible builds if the branch is updated with breaking changes. For better stability and long-term maintenance, it is recommended to pin to a specific release tag or a commit hash.

  source = "git::https://github.com/GoogleCloudPlatform/terraform-google-cloud-functions.git?ref=v0.6.0"

Comment on lines +229 to +308
1. Before moving to the next step, configure directional policies for your perimeter.

1. Navigate to `gcp-org`.
1. Use `terraform output` to get the App Infra Pipeline Cloud Build project ID and project number.

```bash
cd ../gcp-org/

export cloudbuild_project_number=$(terraform -chdir="../gcp-projects/business_unit_1/shared/" output -raw cloudbuild_project_number)
echo "cloud build project number = $cloudbuild_project_number"
export cloudbuild_project_id=$(terraform -chdir="../gcp-projects/business_unit_1/shared/" output -raw cloudbuild_project_id)
echo "cloud build project id = $cloudbuild_project_id"
sed -i'' -e "s/PRJ_APP_INFRA_PIPELINE_NUMBER/${cloudbuild_project_number}/" /envs/shared/service_control.tf
sed -i'' -e "s/PRJ_APP_INFRA_ID/${cloudbuild_project_id}/" /envs/shared/service_control.tf
```

1. Use `terraform output` to get the bucket used for storing Terraform state for the stage 4-projects foundation pipelines in the seed project.
1. Use `gsutil cat` to get the project numbers of the SVPC and peering projects in each environment (production, nonproduction, and development) for configuring the directional app infra policies.

```bash
export projects_gcs_bucket_tfstate=$(terraform -chdir="../terraform-example-foundation/0-bootstrap/" output -raw projects_gcs_bucket_tfstate)
echo "projects_gcs_bucket_tfstate = ${projects_gcs_bucket_tfstate}"

export peering_project_number_dev=$(gsutil cat gs://${projects_gcs_bucket_tfstate}/terraform/projects/business_unit_1/development/default.tfstate \
| jq -r '.outputs.peering_project_number.value')
echo "peering_project_number_dev = ${peering_project_number_dev}"
sed -i'' -e "s/PRJS_DEV_SAMPLE_PEERING_NUMBER/${peering_project_number_dev}/" envs/shared/service_control.tf

export peering_project_number_prod=$(gsutil cat gs://${projects_gcs_bucket_tfstate}/terraform/projects/business_unit_1/production/default.tfstate \
| jq -r '.outputs.peering_project_number.value')
echo "peering_project_number_prod = ${peering_project_number_prod}"
sed -i'' -e "s/PRJS_PROD_SAMPLE_PEERING_NUMBER/${peering_project_number_prod}/" envs/shared/service_control.tf

export peering_project_number_nonprod=$(gsutil cat gs://${projects_gcs_bucket_tfstate}/terraform/projects/business_unit_1/nonproduction/default.tfstate \
| jq -r '.outputs.peering_project_number.value')
echo "peering_project_number_nonprod = ${peering_project_number_nonprod}"
sed -i'' -e "s/PRJS_NONPROD_SAMPLE_PEERING_NUMBER/${peering_project_number_nonprod}/" envs/shared/service_control.tf

export shared_vpc_project_number_dev=$(gsutil cat gs://${projects_gcs_bucket_tfstate}/terraform/projects/business_unit_1/development/default.tfstate \
| jq -r '.outputs.shared_vpc_project_number.value')
echo "shared_vpc_project_number_dev = ${shared_vpc_project_number_dev}"
sed -i'' -e "s/PRJS_DEV_SAMPLE_SVPC_NUMBER/${shared_vpc_project_number_dev}/" /envs/shared/service_control.tf

export shared_vpc_project_number_prod=$(gsutil cat gs://${projects_gcs_bucket_tfstate}/terraform/projects/business_unit_1/production/default.tfstate \
| jq -r '.outputs.shared_vpc_project_number.value')
echo "shared_vpc_project_number_prod = ${shared_vpc_project_number_prod}"
sed -i'' -e "s/PRJS_PROD_SAMPLE_SVPC_NUMBER/${shared_vpc_project_number_prod}/" /envs/shared/service_control.tf

export shared_vpc_project_number_nonprod=$(gsutil cat gs://${projects_gcs_bucket_tfstate}/terraform/projects/business_unit_1/nonproduction/default.tfstate \
| jq -r '.outputs.shared_vpc_project_number.value')
echo "shared_vpc_project_number_nonprod = ${shared_vpc_project_number_nonprod}"
sed -i'' -e "s/PRJS_NONPROD_SAMPLE_SVPC_NUMBER/${shared_vpc_project_number_nonprod}/" /envs/shared/service_control.tf
```

1. If you are deploying with VPC Service Controls in dry-run mode, enable the `required_ingress_rules_app_infra_dry_run` and `required_egress_rules_app_infra_dry_run` variables; if you are deploying in enforced mode, enable `required_ingress_rules_app_infra` and `required_egress_rules_app_infra` instead. The commands below uncomment the matching lines in `envs/shared/terraform.tfvars`.

```bash
export enforce_vpcsc=$(terraform -chdir="envs/shared/" output -raw enforce_vpcsc)
echo "enforce_vpcsc = ${enforce_vpcsc}"
if [[ "$enforce_vpcsc" == "false" ]]; then \
sed -i -E '/^[[:space:]]*\/\/required_ingress_rules_app_infra_dry_run[[:space:]]*=/ s|^[[:space:]]*//||' envs/shared/terraform.tfvars; \
else \
sed -i -E '/^[[:space:]]*\/\/required_ingress_rules_app_infra[[:space:]]*=/ s|^[[:space:]]*//||' envs/shared/terraform.tfvars; \
fi
if [[ "$enforce_vpcsc" == "false" ]]; then \
sed -i -E '/^[[:space:]]*\/\/required_egress_rules_app_infra_dry_run[[:space:]]*=/ s|^[[:space:]]*//||' envs/shared/terraform.tfvars; \
else \
sed -i -E '/^[[:space:]]*\/\/required_egress_rules_app_infra[[:space:]]*=/ s|^[[:space:]]*//||' envs/shared/terraform.tfvars; \
fi
```

1. Commit and push the changes.

```bash
git add .
git commit -m "Add infra pipeline directional policies"
git push

cd ..
```


medium

This new set of instructions requires users to run a long and complex series of manual gsutil and sed commands to substitute placeholder values in the 1-org stage. This process is fragile and difficult to debug. It introduces a strong coupling between stages that is managed outside of Terraform, which undermines the reliability and declarative nature of the infrastructure code. This entire manual process should be re-evaluated and replaced with a solution that uses Terraform's native capabilities, such as terraform_remote_state, to pass outputs between configurations.
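
As a sketch of what that could look like (assumes the same state bucket and output names that the gsutil/jq commands above rely on; the data-source wiring is an assumption, not part of this PR):

```hcl
# Hypothetical sketch: one remote-state read per environment replaces the gsutil | jq | sed chain.
data "terraform_remote_state" "bu1_env" {
  for_each = toset(["development", "nonproduction", "production"])

  backend = "gcs"
  config = {
    bucket = var.projects_gcs_bucket_tfstate # assumed input variable
    prefix = "terraform/projects/business_unit_1/${each.key}"
  }
}

locals {
  # Used in place of the PRJS_*_SAMPLE_PEERING_NUMBER and PRJS_*_SAMPLE_SVPC_NUMBER placeholders.
  peering_project_numbers    = { for env, state in data.terraform_remote_state.bu1_env : env => state.outputs.peering_project_number }
  shared_vpc_project_numbers = { for env, state in data.terraform_remote_state.bu1_env : env => state.outputs.shared_vpc_project_number }
}
```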
