Jan 07, 2026
5 min read

Secrets in AI workflows: Preventing leaks in code review and LLM training

TLDR

When source code containing hardcoded credentials, API keys, or database connection strings is fed into an AI workflow, those secrets effectively leave your security perimeter. This article explores how secrets can leak during AI interactions and why the only true solution is architectural.

The adoption of Generative AI in software development has shifted from a novelty to a necessity. Whether it is an automated code review bot analyzing Pull Requests (PRs) or a developer pasting a traceback into a chatbot, the flow of proprietary code into "black box" systems has increased exponentially.

However, this efficiency introduces a critical vulnerability: context leakage.

The three vectors of exposure

Traditional secret leakage usually occurs when a developer accidentally pushes a .env file. AI workflows introduce three new, distinct vectors:

  1. Inference data logging: When using public LLM APIs, the prompt data (which may contain pasted code with secrets) is often retained by the provider for abuse monitoring or model retraining.
  2. Model memorization (Training): If internal repositories containing secrets are used to fine-tune a custom model (e.g., fine-tuning Llama 3 on internal docs), the model can memorize and regurgitate these secrets.
  3. Automated review echoing: AI agents integrated into CI/CD pipelines may analyze a diff, identify a "bug" near a secret, and reproduce that secret in a PR comment or a build log (see the sketch after this list).
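To see how vectors 1 and 3 play out in practice, here is a minimal sketch of a review bot forwarding a diff to an LLM API. The endpoint, model name, credential, and reviewDiff() helper are all placeholders invented for illustration, not a real provider API:

// Hypothetical CI review bot: every secret in the diff rides along in the prompt.
const diff = `
+  const db = connect("postgres://admin:placeholder-password@prod-db:5432/app")
`;

async function reviewDiff(diff) {
  const response = await fetch("https://llm.example.com/v1/chat", { // placeholder endpoint
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "example-reviewer",
      messages: [{ role: "user", content: `Review this diff:\n${diff}` }],
    }),
  });
  // The provider's request logs now contain the connection string, and the
  // model's reply may echo it straight back into a PR comment or build log.
  return (await response.json()).choices?.[0]?.message?.content;
}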

The mathematics of LLM memorization

Research suggests that Large Language Models are capable of verbatim memorization, particularly of data that appears infrequently in the corpus, such as high-entropy strings like private keys.

If we define the training dataset as D and a specific secret string as s ∈ D, the probability of the model M generating s given a specific prompt context c increases significantly if the model overfits on the data segment containing s. We can conceptually model the extraction risk R as a function of the secret's frequency f(s) and the model's capacity, where N_total is the total size of the training corpus. High-capacity models have enough parameters to "store" the exact representation of s, essentially compressing the secret into the model weights.
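One way to make that relationship explicit is the conceptual sketch below; the functional form and the capacity term C_M are notation introduced here for illustration, not a result from the memorization literature:

R(s) \;=\; g\!\left(\frac{f(s)}{N_{\text{total}}},\; C_M\right), \qquad g \text{ increasing in both arguments}

In words: the more often s appears relative to the size of the corpus, and the larger the model's capacity C_M, the more likely a well-chosen prompt context c is to extract s verbatim.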

Analyzing risk environments

Understanding where your AI processes data is the first step in mitigation. In Public Consumer AI environments, providers often retain data for future model training by default. This creates a high-risk scenario where the primary leak vector is developers inadvertently pasting configs or keys directly into a chat interface.

Conversely, Enterprise APIs generally offer contractual zero-retention policies. While this significantly lowers the risk profile, leaks can still occur through logging or monitoring side-channels. Finally, Self-Hosted (Local) models offer complete control with zero external exposure, effectively eliminating third-party risk. However, they remain vulnerable to internal access control failures if the model weights themselves are not secured.

The flaw of sanitization

The immediate reaction to this problem is usually "sanitization": building regular expression (regex) scripts or middleware to redact secrets before they are sent to an LLM. While helpful, this approach is reactive and fragile.

  • False negatives: Regex patterns often miss non-standard keys or custom tokens (see the sketch after this list).
  • Context loss: Aggressive redaction can confuse the AI, causing the quality of its code review to drop significantly.
  • The "Whack-a-Mole" problem: Every new API service you add requires a new redaction rule.
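To make the fragility concrete, here is a minimal redaction sketch; the two patterns and the internal token format are assumptions for illustration, not an exhaustive scanner:

// Known prefixes are caught; anything homegrown slips through untouched.
const REDACTION_PATTERNS = [
  /sk_live_[A-Za-z0-9]+/g, // Stripe-style secret keys
  /AKIA[0-9A-Z]{16}/g,     // AWS access key IDs
];

function redact(prompt) {
  return REDACTION_PATTERNS.reduce(
    (text, pattern) => text.replace(pattern, "[REDACTED]"),
    prompt
  );
}

// A custom internal token matches neither pattern and reaches the LLM intact.
console.log(redact('apiKey = "internal-billing-7f3a99c2"'));
// -> apiKey = "internal-billing-7f3a99c2"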

The solution: Secrets management

The only way to guarantee an LLM does not leak a secret is to ensure the secret never exists in the source code in the first place.

Instead of pasting hardcoded strings or relying on local .env files (which are themselves frequently pasted into chat windows by accident), organizations should leverage a dedicated secrets management platform like Doppler.

How Doppler neutralizes AI risks

Doppler acts as a central source of truth for secrets and application configuration. Instead of scattering secrets across git repositories or local files, Doppler injects them into the application at runtime.

This architectural shift completely changes the AI interaction:

Code sanitization by default:

When a developer asks an AI to "fix this database connector," the code snippet they paste looks like this:

const db = connect(process.env.DB_CONNECTION_STRING)

Because the actual credential is stored in Doppler and only injected when the app runs, the code pasted into the AI contains zero sensitive information. The AI sees the variable name, not the value.
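For contrast, here is a minimal sketch of the runtime side, assuming a standard Postgres client (pg) for illustration, with the environment populated at launch by the Doppler CLI:

// app.js — no credential appears in the repository or in anything pasted to an AI.
// The environment is populated at launch, e.g.: doppler run -- node app.js
const { Client } = require("pg"); // node-postgres, used here for illustration

const db = new Client({ connectionString: process.env.DB_CONNECTION_STRING });
db.connect();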

Safe fine-tuning:

If you train a custom model on your repositories, and those repositories use Doppler, your training dataset contains only references (ENV_VAR_NAME), not actual secrets. The model learns code structure, not your Stripe API keys.
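For example, a hypothetical file from such a repository (the Stripe client usage is illustrative, not taken from the article):

// What the fine-tuning corpus sees when the repo relies on runtime injection:
const Stripe = require("stripe");
const stripe = new Stripe(process.env.STRIPE_API_KEY);

// What it would have seen with a hardcoded credential (placeholder shape only):
// const stripe = new Stripe("sk_live_...");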

Rotation as a defense:

If a developer does accidentally leak a secret in a chat log, Doppler's instant secret rotation allows the exposed credential to be invalidated immediately, without a code commit or a new deployment.

Securing your AI workloads

AI accelerates development, but it also accelerates how quickly sensitive data travels. Trying to "filter" secrets out of AI prompts is a losing battle.

The robust solution is to remove the secrets from the developer's clipboard entirely. By adopting a platform like Doppler, you decouple credentials from code. This ensures that when your team interacts with the next generation of AI tools, they are sharing logic, not the keys to the kingdom.

If you are leveraging custom models and would like to ensure secrets are protected, start a Doppler demo by signing up here.
