
feat(gcp): GCP cloud connector terraform module + gcloud installer#40

Merged
tim-thacker-nullify merged 2 commits into main from feat/gcp-integration on Apr 11, 2026

tim-thacker-nullify commented on Apr 8, 2026


Summary

Adds a read-only GCP integration option matching the existing AWS pattern. Trust model is Workload Identity Federation only — no service account JSON keys, no long-lived secrets. Customer can revoke at any time via `terraform destroy` or `uninstall.sh`.

This is the customer-facing IaC for the larger "complete the GCP integration" effort happening in the nullify-7 monorepo (separate PR).

What's included

```
gcp-integration-setup/
terraform/
main.tf, variables.tf, outputs.tf, providers.tf, versions.tf
terraform.tfvars.example
README.md
modules/nullify-gcp-integration/
main.tf — WIF pool + AWS provider, SA, custom role, bindings
variables.tf, outputs.tf, versions.tf
examples/
organization/main.tf — org-wide install
single-project/main.tf — per-project install
scripts/
install.sh — idempotent gcloud one-shot installer
uninstall.sh — revocation counterpart
docs/
permissions.md — every role + custom permission documented with rationale
```

Permission model

Read-only, service-config and network-topology metadata only. Explicit non-grants:

| Capability | Granted? |
| --- | --- |
| `storage.objectViewer` (object data) | No |
| `secretmanager.secretAccessor` (secret payloads) | No |
| `bigquery.dataViewer` (table rows) | No |
| Any write/admin role | No |

Predefined viewer roles granted: `cloudasset.viewer`, `iam.securityReviewer`, `viewer`, `compute.viewer`, `container.clusterViewer`, `cloudsql.viewer`, `spanner.viewer`, `cloudkms.viewer`, `logging.viewer`, `run.viewer`, `cloudfunctions.viewer`, `appengine.appViewer`, `dataproc.viewer`, `dataflow.viewer`, `pubsub.viewer`.

Custom role `nullifyCloudConnector` covers the long tail (Cloud Armor, VPC Service Controls, AlloyDB, Filestore, Memorystore, Cloud DNS, API Gateway, Artifact Registry metadata) with strict `.get`/`.list` allowlist.

Full justification per role/permission lives in `gcp-integration-setup/docs/permissions.md`.

WIF trust pinning

The Workload Identity provider's `attribute_condition` restricts trust to a single Nullify AWS IAM role. Even if the WIF pool is enumerated, only the exact Nullify principal can mint a token. This matches the `external_account` credential JSON the Nullify backend synthesises in `hyperdrive/pkg/cloudintegrations/gcp/auth.go`.
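
As a rough illustration of the pinning, the condition can be read as a CEL expression that compares the mapped AWS role attribute against one assumed-role ARN. The sketch below builds such a string in shell. The function name `build_attribute_condition` and the example account ID are hypothetical, and it assumes the provider keeps the default AWS attribute mapping, where `attribute.aws_role` has the form `arn:aws:sts::<account>:assumed-role/<role-name>`. The module's actual expression may differ.

```shell
# Hypothetical helper: derive a WIF attribute condition from an IAM role ARN.
# Assumes attribute.aws_role is mapped to
# arn:aws:sts::<account>:assumed-role/<role-name> (the default AWS mapping).
build_attribute_condition() {
  local role_arn="$1"
  local account_id role_name
  # Field 5 of a colon-separated ARN is the account ID.
  account_id="$(printf '%s' "$role_arn" | cut -d: -f5)"
  # Keep only the text after the last "/", dropping any IAM path.
  role_name="${role_arn##*/}"
  printf "attribute.aws_role == 'arn:aws:sts::%s:assumed-role/%s'" \
    "$account_id" "$role_name"
}
```

For example, `build_attribute_condition 'arn:aws:iam::123456789012:role/ExampleRole'` prints a condition that only the `ExampleRole` assumed-role identity in account `123456789012` would satisfy.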

Two scoping modes

  • `scope = "organization"` (recommended) — bind roles at the org level
  • `scope = "projects"` — bind roles only on the `project_ids` list (POC mode)

Customer flow

  1. `cp terraform.tfvars.example terraform.tfvars && $EDITOR terraform.tfvars`
  2. `terraform init && terraform apply`
  3. Paste `service_account_email` and `workload_identity_provider` outputs into the Nullify console under Settings → Cloud Integrations → GCP
  4. Click Verify → green check per project
  5. Click Save
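
Step 3 can be scripted with `terraform output -raw`, which prints a single output value without quoting. A minimal sketch, assuming it runs from the directory where `terraform apply` was executed. The function name is hypothetical; the two output names come from the flow above.

```shell
# Hypothetical helper: print the two values the Nullify console asks for.
# Must be run from the Terraform working directory after a successful apply.
print_console_values() {
  local sa provider
  sa="$(terraform output -raw service_account_email)" || return 1
  provider="$(terraform output -raw workload_identity_provider)" || return 1
  printf 'service_account_email=%s\n' "$sa"
  printf 'workload_identity_provider=%s\n' "$provider"
}
```

Calling `print_console_values` after the apply gives both values in one place for pasting into the console.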

Test plan

  • `cd gcp-integration-setup/terraform && terraform init && terraform validate`
  • `cd gcp-integration-setup/terraform/examples/organization && terraform init && terraform validate`
  • `cd gcp-integration-setup/terraform/examples/single-project && terraform init && terraform validate`
  • `shellcheck gcp-integration-setup/scripts/install.sh`
  • `shellcheck gcp-integration-setup/scripts/uninstall.sh`
  • Apply against a sandbox GCP project, paste outputs into Nullify staging, confirm "Verify" returns green
  • `terraform destroy` cleanly removes all resources

🤖 Generated with Claude Code

Adds a read-only GCP integration option matching the AWS pattern. Trust
model is Workload Identity Federation only — no service account JSON keys,
no long-lived secrets. Customer can revoke at any time via terraform
destroy or the uninstall.sh script.

Layout (mirrors aws-integration-setup):
  gcp-integration-setup/
    terraform/
      main.tf, variables.tf, outputs.tf, providers.tf, versions.tf
      terraform.tfvars.example
      README.md
      modules/nullify-gcp-integration/
        main.tf       - WIF pool + AWS provider, SA, custom role, bindings
        variables.tf, outputs.tf, versions.tf
      examples/
        organization/main.tf  - org-wide install
        single-project/main.tf - per-project install
    scripts/
      install.sh    - idempotent gcloud one-shot installer
      uninstall.sh  - revocation counterpart
    docs/
      permissions.md - every role + custom permission documented with rationale

Permissions are intentionally read-only and limited to service-config and
network-topology metadata. Explicit non-grants:
  - storage.objectViewer (object data)
  - secretmanager.secretAccessor (secret payloads)
  - bigquery.dataViewer (table rows)
  - any write/admin role

Predefined viewer roles: cloudasset.viewer, iam.securityReviewer, viewer,
compute.viewer, container.clusterViewer, cloudsql.viewer, spanner.viewer,
cloudkms.viewer, logging.viewer, run.viewer, cloudfunctions.viewer,
appengine.appViewer, dataproc.viewer, dataflow.viewer, pubsub.viewer.

Custom role nullifyCloudConnector covers the long tail (Cloud Armor,
VPC Service Controls, AlloyDB, Filestore, Memorystore, Cloud DNS,
API Gateway, Artifact Registry metadata) with strict *.get/*.list allowlist.

The Workload Identity provider is configured with an attribute_condition
restricting trust to a single Nullify AWS IAM role. Even if the WIF pool is
exposed, only the exact Nullify principal can mint a token. This matches the
external_account credential JSON the Nullify backend synthesises in
hyperdrive/pkg/cloudintegrations/gcp/auth.go.

Module supports two scoping modes via the `scope` variable:
  - "organization" (recommended): bind roles at the org level
  - "projects": bind roles only on the project_ids list (POC mode)

After terraform apply the customer pastes service_account_email and
workload_identity_provider into the Nullify console -> Settings -> Cloud
Integrations -> GCP and clicks Verify.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
tim-thacker-nullify added the minor (Minor version updates (features)) label on Apr 8, 2026

Did a read-through of the module and the gcloud installer. A few things I think are worth addressing before this comes out of draft — happy to open follow-up PRs for any of them.

Scope coverage

The PR offers organization and projects scopes, but there's no folder scope. For customers whose GCP hierarchy is carved up by folder (e.g. a dedicated security/ or prod/ folder that the rollout should be pinned to), the current options force them to either grant org-wide or enumerate every project manually. Adding a folder mode backed by google_folder_iam_member (and a matching examples/folder/ example) would round this out. The variable validation at gcp-integration-setup/terraform/variables.tf:15-19 and modules/nullify-gcp-integration/variables.tf:19-22 would need to accept "folder" as a third value.

Custom role scope mismatch

modules/nullify-gcp-integration/main.tf:87 defines the nullifyCloudConnector role with google_project_iam_custom_role, which creates a project-scoped role under host_project_id. Two downstream bindings look problematic against GCP's constraint that a project-scoped custom role can only be granted on resources within (or below) the project where it was defined:

  1. main.tf:211-216: google_organization_iam_member.custom grants this project-scoped role at the organization level. I believe this is rejected at apply time for org installs.
  2. main.tf:232-237: google_project_iam_member.custom iterates over var.project_ids and binds the host project's custom role on each. For any project in the list that isn't host_project_id, the binding fails for the same reason. The examples/single-project example trips this today (host_project_id = "acme-security", project_ids = ["acme-prod"]).

One fix is to define the role with google_organization_iam_custom_role when scope = "organization" (or for any scope broader than a single project) so it's assignable across the hierarchy. Alternatively, collapse the custom permissions into predefined roles where possible and drop the custom role entirely.

install.sh ↔ Terraform parity

The shell installer and the Terraform module produce meaningfully different security postures today:

  1. No attribute_condition. scripts/install.sh:48-52 calls providers create-aws without --attribute-condition, so the WIF provider trusts any principal in Nullify's AWS account rather than the single role Terraform pins at main.tf:183. Worth mirroring the Terraform pinning logic here so both paths have the same trust boundary.
  2. Custom role missing. The script never creates or grants nullifyCloudConnector, so users who pick the gcloud path are silently missing Cloud Armor / VPC Service Controls / orgpolicy / AlloyDB / Filestore / Memorystore / Artifact Registry / Cloud DNS / API Gateway permissions. Either the script should mirror the Terraform role, or docs/permissions.md should flag this as a deliberate subset.
  3. Org-only. The installer hardcodes org scope; users who want project (or, if added, folder) scope have no shell path.
  4. NULLIFY_TENANT_EXTERNAL_ID is required but unused. install.sh:24 errors out if it isn't set, then never references the value (see next point).

tenant_external_id / labels are dead code

modules/nullify-gcp-integration/main.tf:21-29 builds local.common_labels by merging var.labels and var.tenant_external_id, but common_labels is never referenced anywhere else in the module (grep confirms no labels = attribute on any resource). The variable description at variables.tf:55-58 and the root README both say tenant_external_id is "Embedded as a label on the WIF pool," but google_iam_workload_identity_pool doesn't expose a labels argument and no pool/provider/SA labels block exists. Net effect: customers paste an external ID from the console, and nothing on the GCP side carries it.

Either wire it into a resource that supports labels (or into a description field on the pool/provider/custom role for audit correlation), or drop the variable + the merge + the README claim.

Smaller items

  • AWS role-path regex. main.tf:183 / :197 strip ^arn:aws:iam::[0-9]+:role/ from the principal ARN. AWS IAM role ARNs can include a path (e.g. .../role/some/path/RoleName) but assumed-role ARNs never do — they contain only the friendly name. If Nullify ever issues a role under a path, the condition will be built with some/path/RoleName and never match the assertion. Safer to extract the friendly name explicitly.
  • google-beta declared but unused. versions.tf and modules/.../versions.tf require hashicorp/google-beta, and providers.tf instantiates it, but no resource uses it. Can be dropped.
  • data "google_project" "host" in outputs.tf:26-28. Functionally fine, but it's surprising to find a data source at the bottom of an outputs file — moving it to main.tf or a dedicated data.tf helps reviewers.
  • next_steps output. The "green check next to every project" copy in outputs.tf:21-24 reads oddly at org scope. Minor wording nit.
  • uninstall.sh. If install.sh ever grows a custom role, the counterpart teardown will need to delete it too — worth pre-wiring even if the script is no-op today.
  • Top-level README.md. GCP only appears as a one-line row in the Deployment Options table and the repo structure diagram. A short quick-start section with a link to gcp-integration-setup/terraform/README.md would help customers landing from the console.
  • No CI validation for the new module. .github/workflows/ has auto-tag.yml and helm-release.yml but nothing running terraform fmt -check / terraform validate / tflint against gcp-integration-setup/. The test plan in the PR description covers this manually — worth automating so regressions (including the org-custom-role case above) get caught in CI.

None of the above changes the overall shape of the design (WIF-only, no long-lived secrets, read-only + explicit data-plane non-grants) which I think is the right call. Mostly it's the custom role binding, the missing folder scope, and the install.sh ↔ Terraform drift that I'd want tightened before merge.


Generated by Claude Code

Larger pass through vik-nullify's review of the GCP integration module.

## Custom-role binding bug (apply-time failure)

`google_project_iam_custom_role` only assigns within its defining project,
so binding it on the organisation (`google_organization_iam_member.custom`)
or on a sibling project (`google_project_iam_member.custom` when
project_ids != [host_project_id]) failed at apply time. The
`examples/single-project` example tripped this today.

Fix: introduce `google_organization_iam_custom_role.nullify_cloud_connector`
and select between the two variants via `local.use_org_custom_role` based
on whether organization_id is set. Bindings now reference
`local.custom_role_id`. The single-project example is rewired so
host_project_id and project_ids match (the only configuration that works
without an organization_id), and a new module-level
`terraform_data.input_validation` precondition rejects multi-project
installs that omit organization_id at plan time rather than blowing up
mid-apply.
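
The rule behind the precondition can be stated as a small predicate. The sketch below restates it in shell purely for illustration; the module itself expresses it as a Terraform precondition, and the function name and argument order here are hypothetical.

```shell
# Returns 0 when the inputs are usable, 1 otherwise.
# Usage: validate_project_inputs ORGANIZATION_ID HOST_PROJECT_ID PROJECT_ID...
validate_project_inputs() {
  local organization_id="$1" host_project_id="$2"
  shift 2
  # With an organization ID the custom role is defined at the organization,
  # so it can be bound on any project beneath it.
  [ -n "$organization_id" ] && return 0
  # Without one the role is project-scoped: every target must be the host.
  local project_id
  for project_id in "$@"; do
    if [ "$project_id" != "$host_project_id" ]; then
      echo "project '$project_id' requires organization_id to be set" >&2
      return 1
    fi
  done
  return 0
}
```

Under this reading, the original single-project example (`acme-security` hosting, `acme-prod` targeted, no organization ID) is rejected, while the rewired one with matching IDs passes.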

## Folder scope

Added `scope = "folder"` with a `folder_id` variable, matching
`google_folder_iam_member` bindings (predefined + custom), and a new
`examples/folder/` example. `organization_id` is required for folder scope so the
custom role can be defined at the organisation and assigned on the folder.

## install.sh ↔ Terraform parity

The shell installer now mirrors the Terraform module's security posture:

1. **`--attribute-condition`** is now passed to
   `gcloud iam workload-identity-pools providers create-aws`. Previously
   the WIF provider trusted any principal in the Nullify AWS account
   rather than the single role Terraform pins.
2. **Custom role created and bound** at the organisation, mirroring
   `local.custom_role_permissions`. Previously the gcloud path silently
   missed Cloud Armor / VPC SC / orgpolicy / AlloyDB / Filestore /
   Memorystore / Artifact Registry / Cloud DNS / API Gateway permissions.
3. **`uninstall.sh`** now removes the custom-role binding and deletes
   the org-level custom role.
4. **`NULLIFY_TENANT_EXTERNAL_ID`** is no longer required by install.sh
   because the variable was dead code in the Terraform module too (see
   below).
5. The script header documents that install.sh is org-scope-only and
   points folder/project-scope users at Terraform.
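
Point 2 can be sketched roughly as follows. This is not the script's actual code: the function name, role title, and the four permissions are illustrative placeholders (the real allowlist mirrors `local.custom_role_permissions`), and only the `gcloud iam roles` and `gcloud organizations add-iam-policy-binding` commands themselves are taken as given.

```shell
# Hypothetical sketch: create the org-level custom role if missing, then bind it.
# Usage: ensure_custom_role ORG_ID SERVICE_ACCOUNT_EMAIL
ensure_custom_role() {
  local org_id="$1" sa_email="$2"
  local role_id="nullifyCloudConnector"
  # Illustrative subset only; the real allowlist lives in the Terraform module.
  local permissions="compute.securityPolicies.get,compute.securityPolicies.list,dns.managedZones.get,dns.managedZones.list"

  # Create the role only when it does not exist yet, so reruns stay idempotent.
  if ! gcloud iam roles describe "$role_id" --organization="$org_id" >/dev/null 2>&1; then
    gcloud iam roles create "$role_id" --organization="$org_id" \
      --title="Nullify Cloud Connector" --permissions="$permissions" --stage=GA
  fi

  # Grant the org-level role to the connector service account.
  gcloud organizations add-iam-policy-binding "$org_id" \
    --member="serviceAccount:${sa_email}" \
    --role="organizations/${org_id}/roles/${role_id}" >/dev/null
}
```

Defining the role at the organization is what lets the binding succeed at org scope, which is the same constraint that drove the Terraform fix.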

## `tenant_external_id` / `labels` were dead code

The module declared `tenant_external_id` and `labels`, merged them into
`local.common_labels`, and never referenced `common_labels` anywhere else.
GCP IAM resources don't expose a `labels` argument and no resource block
in the module set one, so the variables had no effect on any GCP resource. README
and module docstrings claimed otherwise. Removed the merge, the locals
block, and both variables. The examples and tfvars.example are updated.
The `customer_name` variable is kept (still useful for support correlation)
but its description now reflects that it isn't actually a resource label.

## AWS role-path regex

`replace(var.nullify_aws_principal_arn, "/^arn:aws:iam::[0-9]+:role\\//", "")`
left a `path/RoleName` if the principal were ever issued under a path,
which would never match the assumed-role assertion (assumed-role ARNs
contain only the friendly name). Replaced with a path-tolerant
`element(reverse(split("/", ...)), 0)` extraction in the new
`local.nullify_aws_role_name`, used in both the WIF provider's
`attribute_condition` and the service account's
`workloadIdentityUser` binding. install.sh already used `${var##*/}`
which is equivalent.
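
The difference between the two extractions is easy to see in shell. A minimal sketch with hypothetical function names and a made-up ARN: the first function reproduces the prefix-stripping behaviour of the old regex, the second uses the `${var##*/}` form mentioned above.

```shell
# Old behaviour: strip only the fixed ARN prefix, which keeps any IAM path.
strip_role_prefix() {
  printf '%s\n' "$1" | sed -E 's#^arn:aws:iam::[0-9]+:role/##'
}

# Path-tolerant behaviour: drop everything up to and including the last "/".
friendly_role_name() {
  printf '%s\n' "${1##*/}"
}
```

For `arn:aws:iam::123456789012:role/some/path/RoleName`, `strip_role_prefix` yields `some/path/RoleName`, whereas `friendly_role_name` yields `RoleName`, which is the form that appears in an assumed-role ARN.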

## Cleanup

- `google-beta` provider declaration and `providers.tf` block dropped
  from the root and the module — no resource was using it.
- `data "google_project" "host"` moved out of `outputs.tf` into a new
  `data.tf` so reviewers don't have to find a data source at the bottom
  of an outputs file.
- `next_steps` output wording updated to read sensibly at org / folder
  / project scope (was hardcoded "green check next to every project").
- Top-level README gains a GCP quick-start section pointing at the
  module's README.

## CI

New `.github/workflows/terraform-validate.yml` runs
`terraform fmt -check`, `terraform init -backend=false`, and
`terraform validate` against the root module and every example
(`gcp-integration-setup/terraform`, the three examples, and the
existing AWS modules), plus `shellcheck` on install.sh / uninstall.sh.
Targets the same set of paths the customer would actually apply, so a
regression of the org-binding bug above (or any future structural
mistake) gets caught at PR time rather than at customer apply time.
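
The same checks can be approximated locally before pushing. The function below is a hypothetical stand-in for the workflow steps, not a copy of the workflow file; it assumes `terraform` is on `PATH` and takes the directories to check as arguments.

```shell
# Hypothetical local equivalent of the CI job: validate each Terraform root.
# Usage: validate_terraform_roots DIR...
validate_terraform_roots() {
  local dir failed=0
  for dir in "$@"; do
    echo "==> $dir"
    (
      # Run in a subshell so the cd does not leak into the caller.
      cd "$dir" &&
        terraform fmt -check &&
        terraform init -backend=false -input=false >/dev/null &&
        terraform validate
    ) || failed=1
  done
  # Non-zero if any directory failed, after checking all of them.
  return "$failed"
}
```

A typical call would be `validate_terraform_roots gcp-integration-setup/terraform gcp-integration-setup/terraform/examples/*/`. Continuing past the first failure means one run reports every broken directory.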

Verified locally:
- `terraform fmt -check -recursive gcp-integration-setup/` clean
- `terraform validate` passes for the root module + all 3 examples
- `shellcheck` clean for install.sh and uninstall.sh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
tim-thacker-nullify marked this pull request as ready for review on April 9, 2026 13:01
tim-thacker-nullify merged commit aad2fa0 into main on Apr 11, 2026
8 checks passed