Product Security Knowledge Base

Repository Secret Scanning

Secret Scanning Feedback Loop

Intro: Secret scanning is one of the fastest-return controls in a Product Security program because leaked credentials regularly become the easiest path to production compromise, cloud abuse, or supply-chain pivoting.

What this page includes

  • what repository secret scanning is and why it matters
  • a practical top-five tool shortlist
  • local, Docker, and CI/CD execution patterns
  • sample findings and how to interpret them
  • how to connect secret scanning to a release quality gate

Working assumptions

  • any committed credential should be treated as exposed until proven otherwise
  • revocation and rotation are more important than simply deleting the string from git history
  • secret scanning works best as a layered control: local hooks, merge request checks, and platform-side scanning

What repository secret scanning actually does

Repository secret scanning looks for tokens, passwords, private keys, certificates, cloud credentials, webhook signing keys, and other sensitive values in:

  • tracked files in the working tree;
  • staged changes before commit;
  • commit ranges in a branch or merge request;
  • full git history when a historic scan is needed;
  • artifacts such as Docker images, archives, and generated config.

The goal is not just "find suspicious strings." The real goal is to reduce the time between secret introduction, secret detection, secret revocation, and pipeline recovery.

Why this matters in practice

A leaked secret is dangerous because it removes the need for a complex exploit. Attackers do not need a memory corruption bug when they can reuse a valid API key, deploy code with a CI token, or move laterally with a cloud access key.

Typical impact paths include:

Secret type | Example impact
Cloud credential | Unauthorized API calls, data exfiltration, crypto-mining, privilege escalation
CI token | Pipeline abuse, artifact tampering, malicious releases
Registry credential | Pull or push of unauthorized images, poisoning of release lineage
SaaS API token | Abuse of GitHub, GitLab, Slack, Jira, Stripe, or internal platforms
Private key | Long-lived access, service impersonation, TLS interception risk

Top 5 tools to evaluate

These five cover most practical Product Security programs well:

Tool | Best fit | Strengths | Trade-offs
TruffleHog | high-signal repo and filesystem scans | verifies many secret types against providers, strong git + filesystem modes | verification can require outbound access; tuning may be needed
Gitleaks | fast local and CI scanning | simple binary, TOML rules, JSON/CSV/SARIF reports, strong pre-commit use | mostly pattern + entropy based; no built-in provider verification model like TruffleHog
GitHub Secret Scanning | teams standardized on GitHub | native platform alerts, partner patterns, SARIF ecosystem nearby | strongest inside GitHub-hosted workflow
GitLab Secret Detection | teams standardized on GitLab CI/CD | native secret_detection job, MR and vulnerability views, security policy hooks | best experience requires alignment with GitLab scanning model
GitGuardian ggshield | developer-first prevention and CI checks | broad detector coverage, strong UX, pre-commit/CI/path/docker modes | CLI depends on GitGuardian service/API model

Practical note: these are not mutually exclusive. Many teams use Gitleaks or TruffleHog in developer CI, then rely on GitHub/GitLab platform scanning as an independent safety net.

TruffleHog vs Gitleaks at a glance

Question | TruffleHog | Gitleaks
Detection style | regex + detector logic + provider verification where supported | regex + entropy + configurable rules
Best "block the merge" mode | verified findings or high-confidence policy | any finding or rule-tag based thresholds
Git history support | yes | yes
Filesystem scanning | yes | yes
Docker usage | straightforward | straightforward
Report formats | JSON-oriented workflows | JSON / CSV / SARIF
Best use | reduce false positives with verified results | fast, deterministic, portable secret gates

See also: TruffleHog and Gitleaks Deep Dive.
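
The "entropy" signal listed in the comparison above can be illustrated with a small Shannon entropy calculation. This is a minimal sketch of the general idea, not Gitleaks' actual implementation; the function name `shannon_entropy` and both sample strings are made up for illustration.

```python
import math
from collections import Counter


def shannon_entropy(value: str) -> float:
    """Return the Shannon entropy of a string, in bits per character."""
    if not value:
        return 0.0
    counts = Counter(value)
    total = len(value)
    # Sum p * log2(1/p) over the distinct characters in the string.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# A repetitive placeholder scores low; a random-looking token scores higher.
placeholder = "aaaaaaaaaaaaaaaa"
random_looking = "q7Zp2LkV9xTf4RbN"  # made-up string, not a real credential

print(round(shannon_entropy(placeholder), 2))
print(round(shannon_entropy(random_looking), 2))
```

Entropy alone cannot tell a real credential from a random test fixture, which is why pattern rules or provider verification are usually layered on top of it.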

Local execution from an admin workstation

TruffleHog

# scan commits between main and HEAD in the current git repository
trufflehog git file://. --since-commit main --branch HEAD

# scan the working tree as a filesystem
trufflehog filesystem . --json > trufflehog-report.json

# report only verified findings and exit non-zero when any are found
trufflehog git file://. --since-commit main --branch HEAD --results=verified --fail

Gitleaks

# scan a repo or working tree
gitleaks detect --source . --report-format json --report-path gitleaks-report.json

# scan a specific commit range (omit --all so that only this range is scanned)
gitleaks detect --source . --log-opts="origin/main..HEAD"

# use a custom rules file
gitleaks detect --source . --config .gitleaks.toml

GitGuardian ggshield

# scan an entire repository
ggshield secret scan repo .

# scan only the content that triggered CI
ggshield secret scan ci --format json -o ggshield-report.json

Detect-secrets baseline workflow

detect-secrets scan > .secrets.baseline
detect-secrets audit .secrets.baseline
git diff --staged --name-only -z | xargs -0 detect-secrets-hook --baseline .secrets.baseline

Third-party service secret exposure patterns

A lot of real leaks are not cloud root keys. They are service credentials for tools that quietly sit in the delivery path.

Common examples:

  • GitHub and GitLab personal or project access tokens;
  • Slack webhooks and bot tokens;
  • Jira and Confluence API tokens;
  • npm, PyPI, Maven, NuGet, and package-registry credentials;
  • Artifactory / Harbor / registry robot credentials;
  • Stripe, Twilio, SendGrid, Mailgun, and similar SaaS API keys.

Why these matter

These credentials often unlock one of three things:

  1. delivery-plane access: push, publish, release, or fetch
  2. metadata access: issues, wiki, pull requests, internal URLs
  3. message or billing abuse: spam, social engineering, quota burn, or cost amplification

Practical response order

When a secret is found:

  1. verify whether it is active;
  2. revoke or rotate it;
  3. check where it was used and by whom;
  4. decide whether git history cleanup is required;
  5. add or refine a detector if the pattern was missed previously.

Practical rule-tuning example

See ../snippets/secrets/gitleaks-third-party-rules.toml.
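
The referenced rules file is not reproduced here. As a rough illustration of what a custom third-party rule does, the sketch below applies one pattern in Python instead of Gitleaks TOML. The Slack webhook regex is an assumption based on the commonly seen URL shape, and the `secret-scan:allow` marker is a hypothetical convention, not a feature of any tool named on this page.

```python
import re

# Assumed pattern for Slack incoming-webhook URLs; tune before relying on it.
SLACK_WEBHOOK = re.compile(
    r"https://hooks\.slack\.com/services/T[A-Za-z0-9]+/B[A-Za-z0-9]+/[A-Za-z0-9]+"
)

# Hypothetical inline marker for documented fake values in fixtures.
ALLOW_MARKER = "secret-scan:allow"


def find_webhooks(text: str) -> list[tuple[int, str]]:
    """Return (line_number, redacted_match) pairs for lines that match."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if ALLOW_MARKER in line:
            continue  # explicitly allowlisted line
        for match in SLACK_WEBHOOK.finditer(line):
            # Keep only a prefix so the report does not re-leak the value.
            hits.append((lineno, match.group(0)[:40] + "...REDACTED"))
    return hits


sample = "\n".join([
    "# notifier config",
    'WEBHOOK = "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX"',
    'DOCS = "https://hooks.slack.com/services/TFAKE/BFAKE/fake"  # secret-scan:allow',
])
print(find_webhooks(sample))
```

Redacting the match in the report matters: scan output often ends up in CI logs, which are themselves a place where secrets leak.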

Run as Docker containers inside jobs

TruffleHog in Docker

docker run --rm -v "$PWD:/workdir" ghcr.io/trufflesecurity/trufflehog:latest \
  git file:///workdir --since-commit main --branch HEAD --results=verified --json

Gitleaks in Docker

docker run --rm -v "$PWD:/path" ghcr.io/zricethezav/gitleaks:latest \
  detect --source="/path" --report-format json --report-path /path/gitleaks-report.json

Sample finding report and how to read it

Example: TruffleHog-style finding

{
  "DetectorName": "AWS",
  "Verified": true,
  "Raw": "AKIA...REDACTED",
  "SourceMetadata": {
    "Data": {
      "Filesystem": {
        "file": "terraform/prod.tfvars"
      }
    }
  },
  "ExtraData": {
    "rotation_guide": "Rotate the access key immediately",
    "account": "prod-shared-services"
  }
}
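
A report in this shape can be filtered with a few lines of Python. The sketch below assumes the report file holds one JSON object per line and uses only the field names shown in the example above; `verified_findings` and `finding_path` are hypothetical helper names, and the two sample records are made up.

```python
import json


def verified_findings(jsonl_text: str) -> list[dict]:
    """Parse one-JSON-object-per-line output and keep verified findings."""
    findings = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip any non-JSON line mixed into the file
        if isinstance(record, dict) and record.get("Verified") is True:
            findings.append(record)
    return findings


def finding_path(record: dict) -> str:
    """Pull the file path out of the nested source metadata, if present."""
    data = record.get("SourceMetadata", {}).get("Data", {})
    for source in data.values():  # e.g. the "Filesystem" key in the example
        if isinstance(source, dict) and "file" in source:
            return source["file"]
    return "<unknown>"


report = "\n".join([
    json.dumps({"DetectorName": "AWS", "Verified": True,
                "SourceMetadata": {"Data": {"Filesystem": {"file": "terraform/prod.tfvars"}}}}),
    json.dumps({"DetectorName": "Slack", "Verified": False,
                "SourceMetadata": {"Data": {"Filesystem": {"file": "tests/fixture.txt"}}}}),
])

for item in verified_findings(report):
    print(item["DetectorName"], finding_path(item))
```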

Example: Gitleaks-style finding

[
  {
    "RuleID": "aws-access-token",
    "Description": "AWS Access Key",
    "File": "scripts/bootstrap.sh",
    "StartLine": 24,
    "EndLine": 24,
    "Secret": "REDACTED",
    "Entropy": 3.92,
    "Tags": ["key", "AWS", "credentials"]
  }
]
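
For triage, it often helps to group a report like this by rule and location before reading individual findings. The following is a minimal sketch that assumes the report is a JSON array with the field names shown above; `summarize` is a hypothetical helper and the sample data is made up.

```python
import json
from collections import Counter


def summarize(report_text: str) -> dict:
    """Group a JSON-array findings report by rule ID and file location."""
    findings = json.loads(report_text)
    by_rule = Counter(f.get("RuleID", "unknown") for f in findings)
    locations = sorted({f"{f.get('File')}:{f.get('StartLine')}" for f in findings})
    return {"total": len(findings), "by_rule": dict(by_rule), "locations": locations}


sample_report = json.dumps([
    {"RuleID": "aws-access-token", "File": "scripts/bootstrap.sh", "StartLine": 24},
    {"RuleID": "aws-access-token", "File": "scripts/deploy.sh", "StartLine": 7},
    {"RuleID": "generic-api-key", "File": "config/app.yaml", "StartLine": 3},
])

print(summarize(sample_report))
```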

How to interpret issues

1. Real secret vs suspicious token

  • A verified TruffleHog hit should be treated as high priority.
  • A regex-only hit still matters, but it needs triage: test fixture, fake token, or real credential?

2. Current exposure vs historical exposure

  • If the secret is only in history but still active, the incident is still real.
  • Deleting the line from the current branch does not remove the need to revoke or rotate.

3. Privilege and blast radius

Ask:

  • what can this credential access?
  • is it read-only or write-capable?
  • does it reach production or control-plane APIs?
  • can it mint or retrieve additional credentials?
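
The questions above can be turned into a rough priority score so that triage is consistent across reviewers. The weights and thresholds below are arbitrary illustrative values, not a recommended standard, and `CredentialContext` and `triage_priority` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CredentialContext:
    verified: bool              # confirmed active against the provider
    write_capable: bool         # can modify, not just read
    reaches_production: bool    # touches production or control-plane APIs
    can_mint_credentials: bool  # can create or retrieve further credentials


def triage_priority(ctx: CredentialContext) -> str:
    """Map blast-radius answers to a coarse priority label."""
    score = 0
    score += 3 if ctx.verified else 0
    score += 2 if ctx.reaches_production else 0
    score += 2 if ctx.can_mint_credentials else 0
    score += 1 if ctx.write_capable else 0
    if score >= 6:
        return "critical"
    if score >= 3:
        return "high"
    if score >= 1:
        return "medium"
    return "low"


print(triage_priority(CredentialContext(True, True, True, True)))
print(triage_priority(CredentialContext(False, True, False, False)))
```

An unverified finding is not the same as a safe one: a low score here only orders the queue, it does not close the finding.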

Common issue classes and short remediation guidance

Finding type | Why dangerous | Short remediation
AWS access key in tfvars | direct cloud API access, often broad permissions | revoke key, rotate workload to role-based auth, move secret to vault/CI variable
GitLab/GitHub PAT in script | repo write access, pipeline tampering, token reuse | revoke token, replace with scoped bot/service token or OIDC flow
Slack webhook URL | alert spoofing and social engineering | rotate webhook, move to secret manager
Database password in config | data exposure and pivot to app backend | rotate password, move to secret backend or runtime injection
Private key in repo | impersonation and long-lived access | revoke/replace certificate or key pair, investigate downstream trust chains

Adding secret scanning to CI/CD

A practical sequence:

  1. pre-commit or pre-push guard for developers;
  2. merge request / pull request secret scan;
  3. default branch historic scan on first enablement or periodically;
  4. platform-native scanning in GitHub or GitLab;
  5. quality gate aggregation so releases are blocked on verified or critical findings.

Add secret scanning to the quality gate

A strong starting model:

  • block on any verified secret;
  • block on any net-new production credential pattern;
  • allow non-blocking warning mode for low-confidence matches during rollout;
  • measure:
    • count of verified secrets;
    • count of untriaged secret findings;
    • median time to revocation;
    • percentage of repos with secret scanning enabled;
    • recurrence rate by team or service.
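
Median time to revocation is simple to compute once detection and revocation timestamps are recorded per finding. The sketch below assumes ISO 8601 timestamps and a hypothetical record shape with `detected` and `revoked` keys; the sample records are invented for illustration.

```python
from datetime import datetime
from statistics import median


def median_hours_to_revocation(records: list[dict]):
    """Median hours from detection to revocation, ignoring open findings."""
    durations = []
    for record in records:
        if record.get("revoked") is None:
            continue  # still open; counted separately below
        delta = (datetime.fromisoformat(record["revoked"])
                 - datetime.fromisoformat(record["detected"]))
        durations.append(delta.total_seconds() / 3600)
    return median(durations) if durations else None


def open_findings(records: list[dict]) -> int:
    """Count findings that have not been revoked yet."""
    return sum(1 for record in records if record.get("revoked") is None)


# Made-up sample records, not real incident data.
incidents = [
    {"detected": "2024-01-10T09:00:00", "revoked": "2024-01-10T11:00:00"},
    {"detected": "2024-01-12T08:00:00", "revoked": "2024-01-12T14:00:00"},
    {"detected": "2024-01-15T10:00:00", "revoked": "2024-01-16T12:00:00"},
    {"detected": "2024-01-20T10:00:00", "revoked": None},
]

print(median_hours_to_revocation(incidents), open_findings(incidents))
```

Reporting open findings alongside the median keeps unrevoked secrets from silently dropping out of the metric.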

Example gate policy:

Condition | Gate behavior
Verified secret found | Fail pipeline immediately
New secret in MR branch | Fail MR pipeline
Historic secrets found during onboarding | Warning + remediation plan
False positive not triaged | Warning until baseline or rule update exists
Repeat finding type in same repo | Fail after grace window
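
The first four rows of this policy can be expressed as a small decision function. This is a sketch under stated assumptions: the `Finding` fields are hypothetical and would need to be populated from whichever scanner reports are in use, and the "repeat finding after grace window" row is left out because it needs state that persists across pipeline runs.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    verified: bool         # confirmed active by the scanner
    new_in_mr: bool        # introduced by the merge request under review
    historic: bool = False  # found during an onboarding or history scan
    triaged: bool = False   # reviewed and covered by a baseline or rule update


def gate_decision(findings: list[Finding]) -> str:
    """Return "fail", "warn", or "pass" for a set of findings."""
    decision = "pass"
    for finding in findings:
        if finding.verified or finding.new_in_mr:
            return "fail"  # blocking conditions win immediately
        if finding.historic or not finding.triaged:
            decision = "warn"
    return decision


print(gate_decision([Finding(verified=False, new_in_mr=False, historic=True)]))
print(gate_decision([Finding(verified=True, new_in_mr=False)]))
```

In a pipeline, the returned value would typically be mapped to an exit code so that "fail" stops the job while "warn" only annotates it.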

  1. Start with one or two tools, not five.
  2. Prefer high-confidence blocking before broad low-confidence blocking.
  3. Build a clear remediation runbook: revoke, rotate, replace, confirm, close.
  4. Keep a small allowlist and require expiration or owner approval.
  5. Link secret scanning with:

Mobile builds and third-party service keys

A lot of "secret scanning" programs are tuned for cloud root credentials and CI tokens, but they underweight developer-facing service keys that show up in:

  • mobile apps;
  • frontend builds;
  • test fixtures;
  • sample config files;
  • integration examples.

Common examples:

  • Firebase and Google service keys;
  • Slack webhooks;
  • SendGrid, Twilio, Stripe, and Mailgun API keys;
  • registry robot accounts and package-repository credentials.

Practical tuning move

Keep one custom rules file for organization-specific patterns and one for broader third-party platforms.
See ../snippets/secrets/gitleaks-mobile-and-third-party-rules.toml.