Dec 10, 2025
10 min read

Advanced LLM security: Preventing secret leakage across agents and prompts

Advanced LLM security: Preventing secret leakage across agents and prompts

TL;DR

This article tackles those security challenges head-on. You’ll learn how secrets leak across LLM workflows, and, more importantly, how securing AI systems can prevent these leaks. By the end, you’ll be able to design secret-safe AI pipelines that treat prompts and agents as first-class security surfaces.

Initially, AI systems benefited from security practices such as key rotation and secret masking. However, as more advanced models became deeply embedded in production pipelines, those early safeguards began to appear limited.

These systems can now write code, make API calls, and run live workflows, often interacting with real credentials in the process. Somewhere in that flow, secrets can slip into training data, logs, or prompts and quietly resurface in model outputs.

A recent report from The Hacker News revealed that more than 12,000 live API keys and passwords were discovered in publicly available datasets used to train large language models (LLMs). AI technologies are built to process information, but they can also memorize and expose it. When weak security practices take root in these systems, the risk of data theft, secret sprawl, and full-scale breaches increases by the second.

How secrets leak across AI systems, training data, and the model lifecycle

Secret exposure in the AI lifecycle arises from challenges that don't exist in traditional software development. In a typical application, secrets are static strings stored in configuration files or vaults, so traditional security measures are generally effective in these environments.

In contrast, an LLM can treat secrets in its training data as knowledge. Models also operate in an agentic manner, integrating with tools, APIs, and data sources to fulfill user requests. These interactions often require live production credentials, which can be weaponized if exposed. It's therefore critical that you understand how secrets leak at each stage of the AI lifecycle.

StageLeakage vectorHow secrets leakLLM security implication

Data collection & training

Training data memorization

API keys, tokens, or PII can be accidentally scraped from public or internal sources like repositories, logs, or chat transcripts and end up in training datasets.

The model memorizes these secrets as patterns. With the right prompt, attackers can force it to regurgitate the data, bypassing access controls.

Model development & fine-tuning

AI code review failures and supply chain risk

Hardcoded credentials can slip through AI-assisted code reviews that focus on logic over security. Additionally, using unverified models, libraries, or datasets may introduce supply chain attacks or TrojanStego-style threats, where hidden payloads are embedded within model files.

Model poisoning occurs when a malicious payload is embedded in data or code, activating only under specific prompts to exfiltrate secrets or internal data.

Inference & logging

Inference-time exposure

Models and agents that use live credentials for external APIs can expose them in memory or logs if secrets aren't securely injected at runtime. This exposure allows attackers to gain unauthorized access to internal systems.

Prompt injection attacks (LLM01) can manipulate system prompts to reveal environment variables, secrets, or proprietary logic used in RAG pipelines.

AI workflows that experience leakage often lack least-privilege controls and scoped access. But in many cases, leaks start with something as preventable as a hardcoded credential:

If the same principles that secure DevOps workflows and developer infrastructure were applied to AI systems, the likelihood of secret exposure would drop dramatically. Also, special attention should be given to prompts and agents, which have become a hidden layer where sensitive data is most likely to leak.

How prompts and agents create hidden layers for secret exposure and prompt injection

LLM systems rely heavily on prompts and agents to complete tasks and generate outputs. Between these two layers, sensitive credentials are constantly being passed around, creating potential entry points for attacks and other security threats. Let’s examine how both can serve as channels for secret exposure.

Prompts

Consider a banking application that uses an LLM API to provide digital assistance. The issue often begins with how system instructions are defined and passed to the model.

This setup embeds API keys and tokens directly into the prompt, exposing them to prompt-injection or introspection attacks.

An attacker could trick the model with queries such as:

Any of these could cause the model to leak sensitive details.

A safer approach is to retrieve credentials at runtime using environment variables:

This way, no secret is hardcoded or visible in the model's context window.

Agents

Agents are even more active. They not only process text but also use other AI tools to execute tasks such as booking tickets, processing refunds, and escalating support cases. These actions require temporary access to your internal systems and credentials.

Here's an unsafe example:

This design exposes multiple risks:

  • The credentials are static and hardcoded.
  • Long-running agents can persist these secrets in memory, logs, or checkpoints even after they are removed from configuration.
  • If compromised, they could reveal full system access.

For the safety of your internal and sensitive data, prompts and agents require strong security practices to prevent exposure. The following diagram illustrates this:

Secure prompt and agent lifecycle. Secrets are pulled at runtime, never embedded, and cleaned up after model execution to prevent leaks.
Secure prompt and agent lifecycle. Secrets are pulled at runtime, never embedded, and cleaned up after model execution to prevent leaks.

Protecting your AI systems means treating prompts and agents with the same care as you do your production infrastructure. Building good practices ensures that your LLM workflows remain safe even as users interact with them globally.

Building data security into AI models and their workflows

Below are the key stages in an AI workflow, along with how you can apply secrets guardrails at each step to strengthen LLM security and protect sensitive data handled by machine learning algorithms.

1. Data ingestion

Data ingestion is one of the most critical stages in AI development pipelines because it’s the first point of contact for raw input data used to train a model. ETL workflows usually collect, transform, and load data from multiple sources, making this stage a common entry point for attackers. During this process, attackers can inject poisoned or corrupted content, and scraped datasets might include live secrets or personally identifiable information (PII).

Example of sensitive data inside a dataset:

To stop leaks before they reach the trainer, add a pre-ingestion gate that scans for secret-like patterns and PII.

You can also integrate data loss prevention (DLP) tools such as Microsoft Purview, Symantec DLP, or Google Cloud DLP to identify and block unauthorized transfers of sensitive data across networks and environments.

2. Model training

This stage involves developing and customizing LLMs using your cleansed data and defined workflows. The goal is to produce a model that performs a specific task for your product, such as a customer service agent, chatbot, or account updater.

The main security risk at this stage is credential exposure. Training pipelines often rely on temporary access to data sources, APIs, or storage locations. If these credentials are overly scoped or long-lived, the model or its training environment could access systems beyond its intended use case.

To mitigate this, apply role-based access control (RBAC) to limit who can start, stop, and monitor training jobs, as well as access model checkpoints and logs. The model itself should use least-privilege permissions and temporary credentials that expire once training completes.

In practice, a training job will often request short-lived session credentials from AWS Security Token Service (STS). These credentials rotate automatically and expire after a short duration, avoiding the risks of static IAM keys.

The role itself must be tightly scoped. Below is an example least-privilege IAM policy that grants read-only access to the training dataset and write-only access to logs.

3. Prompt templating and inference

As discussed earlier, prompts and agents can easily expose secrets when credentials are embedded directly in system instructions.In a secret-safe workflow, prompts are created with placeholders, and all sensitive values are retrieved securely at runtime rather than hardcoded or passed through the model context.

The following example demonstrates how to safely retrieve secrets at runtime and automatically redact them from logs during inference:

This snippet achieves the following:

  • Secrets are retrieved only at runtime using os.getenv().
  • Sensitive tokens are redacted from logs using a regex before printing or storing outputs.
  • No secret values appear in model prompts or checkpoints, ensuring both prompt safety and inference hygiene.

4. Memory cleanup and monitoring

Even after safe inference, traces of secrets can remain in memory or checkpoint files. Long-running agents may persist temporary data, and cached responses can accidentally retain sensitive information. This makes post-inference cleanup just as crucial as runtime protection.

Here’s a lightweight example of how to handle memory cleanup after each inference task:

This cleanup script can run automatically at the end of each inference session or be integrated into your CI/CD pipeline.

Also, monitoring should be added to track where and when secrets are requested.

These audit logs can be ingested into your observability platform to detect anomalies and security incidents such as unexpected secret access or overuse. When suspicious behaviors are identified, automated triggers can revoke or rotate the exposed credentials.

Together, these stages create stronger secret hygiene across your AI lifecycle.Integrating a secrets manager like Doppler then helps apply these same principles more efficiently and at scale.

How these security principles can be applied in Doppler to prevent data breaches

Storing or managing the multitude of secrets for your AI workflows in multiple locations increases the risk of breaches and secret sprawl. This ultimately cancels out the earlier efforts of maintaining good secret hygiene.

A more efficient approach is to use a centralized platform that enforces these same principles automatically. Doppler helps achieve this through features like:

  • Runtime secret injection for AI jobs and agents
    Instead of hardcoding keys or passing them through prompts, you can pull secrets securely at runtime using the Doppler CLI or SDK.

    In the example below, an AI agent fetches its API key from Doppler when it starts up, logs the request for auditing, and keeps the key out of model prompts, code, and logs.
  • Environment scoping for each model or agent
    Restricting models or agents to only the environments they need helps ensure that only authorized users and systems interact with critical credentials. In Doppler, you can define separate environments and configurations so staging, training, and production secrets never overlap.
  • Secret logging for visibility across training and inference
    Tracking how secrets are used throughout the lifecycle of your LLM is critical for understanding access patterns and spotting potential abuse. Doppler’s activity and audit logs give you centralized visibility across all of your operational surfaces.
  • Automatic revocation of identities used in LLM workflows
    As discussed earlier, long-term use of secrets across your LLM workflows should be consciously avoided to reduce the risk of stale or misused credentials. Either use Doppler’s rotated secrets to automatically rotate and revoke credentials, or its dynamic secrets to issue short-lived credentials from the start.

Wrapping up

It’s an error to think of LLM security as separate from your core engineering hygiene. Your agents and prompts benefit from the same security practices that keep your code and DevOps infrastructure safe. When these AI workflows follow least privilege principles, runtime safety, and auditability, your entire stack becomes more resilient and resistant to breaches by design.

Enjoying this content? Stay up to date and get our latest blogs, guides, and tutorials.

Related Content

Explore More