IAM hardening field manual - NexcurAI Guides

The premise

IAM drifts. Every IAM posture at any non-trivial company is a partial record of the people who needed things in a hurry. The work of IAM hardening is not exotic; it is the work of saying no to the convenient past and writing down what should be true going forward.

This manual takes you through seven passes. Each pass is a concrete set of actions. Each ends with a verification step so you know the pass is done.

Pass 1: Inventory

You cannot fix what you have not listed.

AWS

Export the full IAM inventory per account: aws iam list-users, list-roles, list-groups, list-policies. Script it across all accounts in the org.
Enumerate service-linked roles separately; they follow AWS lifecycle, not yours.
For every role, enumerate its trust policy (who can assume) and its permission policy (what it can do).
For every access key, note the age (CreateDate from aws iam list-access-keys) and the last-used date (aws iam get-access-key-last-used).
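
The key inventory step can be sketched as a small flattening function. This is a minimal sketch, not a definitive implementation: it assumes you have already collected records shaped like the AccessKeyMetadata entries from list-access-keys and the LastUsedDate values from get-access-key-last-used. The function name key_inventory_rows and the output column names are hypothetical.

```python
from datetime import datetime, timezone


def key_inventory_rows(key_metadata, last_used_by_key, now=None):
    """Flatten access-key records into inventory rows with age and last-used date.

    key_metadata: list of dicts with UserName, AccessKeyId, Status and
        CreateDate (timezone-aware datetime), as in AccessKeyMetadata.
    last_used_by_key: dict mapping AccessKeyId -> LastUsedDate (aware datetime);
        keys that were never used are simply absent.
    """
    now = now or datetime.now(timezone.utc)
    rows = []
    for key in key_metadata:
        last_used = last_used_by_key.get(key["AccessKeyId"])
        rows.append({
            "user": key["UserName"],
            "access_key_id": key["AccessKeyId"],
            "status": key["Status"],
            "age_days": (now - key["CreateDate"]).days,
            "last_used": last_used.date().isoformat() if last_used else "never",
        })
    # Oldest keys first: they are the first candidates for retirement.
    return sorted(rows, key=lambda row: row["age_days"], reverse=True)
```

Each row can be written straight to the flat file that the verification step asks for.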

GCP

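Export the IAM policy at every level of the hierarchy: gcloud projects get-iam-policy per project, gcloud resource-manager folders get-iam-policy per folder, and gcloud organizations get-iam-policy for the org.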
Enumerate service accounts with gcloud iam service-accounts list, their keys with gcloud iam service-accounts keys list.
Note which service accounts can impersonate which other service accounts.

Verification: You have a flat file or spreadsheet with every identity, every role, every policy, every key, every last-used timestamp. If you cannot produce this, pass 1 is not done.

Pass 2: Kill standing admin

Standing admin access is the single largest source of compounding IAM risk.

Human admin: remove it. Grant access through just-in-time tooling instead: AWS IAM Identity Center with permission sets that require manual elevation, a GCP workflow built on short-lived credentials (gcloud auth application-default login plus short-lived access tokens), or a purpose-built tool such as StrongDM, Teleport, or Aembit.
Service admin: replace any AWS administrator role assumed by a CI job or automation with a tightly scoped role that permits only the actions the job actually performs.
Root / organization owners: restrict to two named people with hardware keys and a documented recovery process. Never use these identities for operational tasks.

Verification: aws iam list-attached-user-policies shows no human with AdministratorAccess attached, and aws iam list-attached-group-policies shows none inherited through a group. GCP: gcloud projects get-iam-policy shows no user: members on roles/owner or roles/editor.
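
The GCP half of this check can be sketched against the JSON that gcloud projects get-iam-policy --format=json returns. This is a minimal sketch under stated assumptions: the function name human_members_on_broad_roles is hypothetical, and it only looks at user: members, so humans who inherit a role through a group: member are not flagged.

```python
BROAD_ROLES = {"roles/owner", "roles/editor"}


def human_members_on_broad_roles(policy, broad_roles=BROAD_ROLES):
    """Return (role, member) pairs where a user: member holds a broad role.

    policy: dict parsed from get-iam-policy JSON, with a "bindings" list
        of {"role": ..., "members": [...]} entries.
    """
    findings = []
    for binding in policy.get("bindings", []):
        if binding.get("role") not in broad_roles:
            continue
        for member in binding.get("members", []):
            # Only direct human members are flagged in this sketch.
            if member.startswith("user:"):
                findings.append((binding["role"], member))
    return sorted(findings)
```

An empty result for every project is the state the verification step describes.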

Pass 3: Least-privilege service accounts

Every non-human identity gets the minimum. The test is: if I remove this permission, does anything break? If nothing breaks, the permission was unnecessary.

AWS: IAM Access Analyzer

Turn it on per account. Use its unused access findings to identify permissions, roles, and keys that nobody uses, and scope policies down to actual usage. Confirm in CloudTrail that a permission has not been used in 90 days before removing it.

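# Enable IAM Access Analyzer org-wide (apply from the management or delegated admin account).
# Type "ORGANIZATION" produces external access findings; unused access findings
# need a separate analyzer of type "ORGANIZATION_UNUSED_ACCESS".
resource "aws_accessanalyzer_analyzer" "org" {
  analyzer_name = "nexcur-org-analyzer"
  type          = "ORGANIZATION"
  tags = {
    owner = "platform"
  }
}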

GCP: Policy Analyzer

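Use gcloud asset analyze-iam-policy (Policy Analyzer) and gcloud asset search-all-iam-policies to find overbroad grants. Use gcloud policy-troubleshoot iam to check why a specific principal does or does not hold a given permission.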

The tightening pattern

For each service account that has a broad primitive role like roles/editor:

Identify the predefined role that is closest to what the service account actually does (roles/storage.objectAdmin instead of roles/editor).
If no predefined role fits, write a custom role listing the exact permissions used.
Deploy, monitor for 24 hours, remove the broad role.
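
The "exact permissions used" step comes down to a set difference between what is granted and what the audit logs show in use. A minimal sketch follows, assuming you have already extracted (permission, timestamp) pairs from your audit logs. The function name split_permissions is hypothetical, and the 90-day default window is borrowed from the AWS guidance above rather than being a GCP requirement.

```python
from datetime import datetime, timedelta, timezone


def split_permissions(granted, usage_events, window_days=90, now=None):
    """Split granted permissions into (keep, removable) based on recent usage.

    granted: iterable of permission strings currently granted.
    usage_events: iterable of (permission, timestamp) pairs, with
        timezone-aware timestamps, taken from audit logs.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=window_days)
    # A permission counts as used only if it was seen inside the window.
    used = {permission for permission, seen_at in usage_events if seen_at >= cutoff}
    granted = set(granted)
    keep = sorted(granted & used)
    removable = sorted(granted - used)
    return keep, removable
```

The keep list is the candidate permission set for a custom role; the removable list is what to watch during the monitoring period before the broad role is dropped.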

Verification: No service account has roles/owner or roles/editor, and no AWS role used by a service has an equivalent broad policy such as AdministratorAccess or PowerUserAccess, outside a documented exception list.

Pass 4: Secrets off env files

If any secret lives in a .env file, a repo config, or a shared document, move it.

AWS: Secrets Manager for dynamic secrets, SSM Parameter Store for config-shaped values.
GCP: Secret Manager.
Developer / laptop secrets: 1Password Teams, with a documented rotation policy.
CI/CD secrets: provider-specific vaults (GitHub Actions secrets, GitLab CI variables), ideally backed by the cloud secrets manager via OIDC.

Verification: grep -rE "(AWS|GCP|STRIPE|OPENAI|ANTHROPIC)_[A-Z_]*KEY=[A-Za-z0-9]" . in every repo returns only placeholder examples. git log scans via trufflehog or gitleaks are clean or have a triaged ignore list.
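
The grep check can be sketched in script form when you want to filter out placeholder values automatically. This is a minimal sketch: the pattern is the one from the grep command, extended only to capture the value, and the PLACEHOLDER_MARKERS list and the function name find_suspect_lines are hypothetical and should be adapted to the placeholders your repos actually use. It does not replace a history scan with trufflehog or gitleaks.

```python
import re

# Same prefixes as the grep command; the named group captures the assigned value.
SECRET_PATTERN = re.compile(
    r"(?:AWS|GCP|STRIPE|OPENAI|ANTHROPIC)_[A-Z_]*KEY=(?P<value>[A-Za-z0-9]\S*)"
)

# Hypothetical markers for values that are obviously placeholders.
PLACEHOLDER_MARKERS = ("changeme", "example", "your_", "xxxx", "placeholder")


def find_suspect_lines(lines):
    """Return (line_number, line) pairs that look like real secrets."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = SECRET_PATTERN.search(line)
        if match is None:
            continue
        value = match.group("value").lower()
        if any(marker in value for marker in PLACEHOLDER_MARKERS):
            continue  # documented placeholder, not a finding
        hits.append((number, line.rstrip("\n")))
    return hits
```

Feed it the lines of each tracked file; any non-empty result is a secret to move and rotate.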

Pass 5: Access key retirement

Access keys are a liability. Replace them with federated access where possible.

AWS: OIDC federation

For CI/CD, use GitHub Actions OIDC (or equivalent for your provider) so every job assumes a short-lived role instead of holding a long-lived key.

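# Trust policy for GitHub Actions OIDC role.
# Assumes aws_iam_openid_connect_provider.github is defined elsewhere in the configuration.
data "aws_iam_policy_document" "gha_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]
    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.github.arn]
    }
    # Only workflows running on main in repos under the nexcurai org may assume the role.
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:nexcurai/*:ref:refs/heads/main"]
    }
    # Pin the audience as well. sts.amazonaws.com is the value used by the
    # aws-actions/configure-aws-credentials action; adjust if your setup differs.
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }
  }
}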

For any remaining keys, enforce automatic rotation on a 90-day cadence. For human operator keys, prefer short-lived tokens via SSO rather than long-lived keys at all.

GCP: Workload Identity Federation

Use workload identity federation for any external CI/CD or third-party service reaching into GCP. Delete long-lived service account keys as soon as alternatives exist.

Verification: Count of IAM access keys with age over 90 days equals zero. Count of service account keys over 90 days equals zero.

Pass 6: Trust-path analysis

This is the pass that catches the clever, non-obvious privilege escalations.

A trust path is a chain: identity A can assume role B, role B has permission to update role C, role C has AdministratorAccess. Identity A effectively has admin. Nobody would have granted it that intentionally.

AWS: the PrivEsc review

For every role, enumerate which identities and roles can sts:AssumeRole into it.
For every role, enumerate which actions in its permission policy could lead to privilege escalation: iam:PassRole, iam:UpdateAssumeRolePolicy, iam:AttachRolePolicy, lambda:UpdateFunctionConfiguration, ec2:RunInstances paired with iam:PassRole, cloudformation:CreateStack paired with iam:PassRole, and the full list Rhino Security Labs documented.
Cross the two. Any identity that can traverse a path to admin without explicit grant is a finding.
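
The "cross the two" step is a graph search. The sketch below is one way to do it, assuming you have already reduced the inventory to two edge lists: who can assume what, and which role can escalate into which other role. The names build_graph and find_path_to_admin and the node labels are hypothetical; tools such as PMapper do this with far more fidelity.

```python
from collections import deque


def build_graph(assume_edges, escalation_edges):
    """Merge (source, target) edges from both enumerations into one adjacency map."""
    graph = {}
    for source, target in list(assume_edges) + list(escalation_edges):
        graph.setdefault(source, set()).add(target)
    return graph


def find_path_to_admin(graph, admin_nodes, start):
    """Breadth-first search: shortest path from start to any admin node, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in admin_nodes:
            return path
        # Sorted so the result is deterministic when several paths exist.
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

A returned path longer than one node is a candidate finding: the starting identity reaches admin without holding it directly. Run it once per identity in the inventory.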

Automation helpers: PMapper, Cloudsplaining, AWS IAM Access Analyzer external access findings.

GCP: the impersonation graph

Service accounts that can be impersonated by other identities form a directed graph. Walk it. If a low-trust identity can reach a high-trust identity through a chain of impersonations, document it or break the chain.

Verification: Trust-path map exists. Any paths to admin-equivalent permissions are explicitly documented with rationale, or broken.

Pass 7: Drift detection

IAM will drift again. The question is how fast you notice.

Configuration drift: AWS Config or GCP Asset Inventory with rules alerting on changes to IAM roles, policy attachments, trust policies, and service account creation.
Usage drift: CloudTrail (AWS) or Cloud Audit Logs (GCP) flowing to a searchable sink. Alert on unusual patterns: new identity created outside Terraform, role attached to a sensitive resource by a human, service account key created.
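Terraform drift: daily terraform plan in CI against production; any out-of-band change to a resource Terraform manages shows up as a drift finding. Resources created entirely outside Terraform do not appear in the plan, so rely on the configuration and usage alerts above to catch those.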
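
The daily plan check can be narrowed to IAM resources by reading the machine-readable plan (terraform show -json on a saved plan file). This is a minimal sketch: the function name iam_drift and the IAM_TYPE_PREFIXES list are hypothetical and cover only a few common resource types, so extend the prefixes to match what you manage.

```python
import json

# Hypothetical, incomplete list of resource type prefixes treated as IAM.
IAM_TYPE_PREFIXES = ("aws_iam_", "google_project_iam_", "google_service_account")


def iam_drift(plan_json_text, prefixes=IAM_TYPE_PREFIXES):
    """Return (address, actions) for IAM resources the plan would change."""
    plan = json.loads(plan_json_text)
    findings = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        # "no-op" and "read" mean the real resource already matches the code.
        if actions in (["no-op"], ["read"]):
            continue
        if change.get("type", "").startswith(prefixes):
            findings.append((change["address"], tuple(actions)))
    return findings
```

A non-empty result from the scheduled run is a drift finding to route to whoever owns the alert.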

Verification: Drift detection alerts reach a human inside 24 hours. Drift incidents over the last 90 days have been triaged and either reverted or formalized in code.

What this looks like in production

Zero standing human admin.
Zero long-lived access keys over 90 days old.
Every non-human identity scoped to a custom role or an appropriate predefined role; no broad primitives outside a documented exception list.
Secrets in secrets manager, never in code or env files.
Trust-path map maintained; privilege escalations either broken or documented.
Drift detection live, alerting within 24 hours.

Common mistakes

Tightening without measuring usage first. You will break CI. Use CloudTrail / Audit Logs to baseline before removing permissions.
Adding iam:PassRole with Resource: *. This is a common PrivEsc vector. Always scope iam:PassRole to the specific roles the caller needs to pass.
Trust policies with no condition. A trust policy that allows any account to assume is a critical finding. Always scope with aws:SourceAccount, aws:PrincipalOrgID, external ID, or the OIDC sub claim.
Using primitive GCP roles (owner, editor, viewer) on service accounts. Always replace with predefined or custom roles.
Not automating rotation. Manual rotation does not happen. Automate or accept the compound risk.
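
The trust-policy mistake above lends itself to an automated check. The sketch below is a rough filter, not a full policy evaluator: it flags Allow statements with a wildcard principal or a federated principal and no Condition block. The function name unconditioned_trust_statements is hypothetical, and service principals and named-account principals are deliberately left for manual review.

```python
def _as_list(value):
    """IAM policy fields may be a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def unconditioned_trust_statements(trust_policy):
    """Return (sid, reason) for risky Allow statements that have no Condition."""
    findings = []
    for statement in _as_list(trust_policy.get("Statement", [])):
        if statement.get("Effect") != "Allow" or statement.get("Condition"):
            continue
        sid = statement.get("Sid", "<no Sid>")
        principal = statement.get("Principal", {})
        if principal == "*":
            findings.append((sid, "wildcard principal"))
        elif isinstance(principal, dict):
            if "*" in _as_list(principal.get("AWS", [])):
                findings.append((sid, "wildcard principal"))
            elif "Federated" in principal:
                findings.append((sid, "federated principal without condition"))
    return findings
```

Run it over every trust policy collected in pass 1 and treat each result as a finding to scope or document.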
