Security Guardrails

Detect and act on sensitive data and adversarial inputs.

Overview

Security guardrails inspect requests and responses passing through the gateway. Each detection category can be independently enabled and assigned an action.
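As a rough illustration of per-category configuration, the sketch below models each detection category with its own enabled flag and action. The names `Action`, `CategoryPolicy`, `POLICY`, and `action_for`, as well as the category keys, are hypothetical and do not reflect the gateway's actual configuration schema.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    BLOCK = "block"
    REDACT = "redact"
    ANONYMIZE = "anonymize"
    MONITOR = "monitor"


@dataclass(frozen=True)
class CategoryPolicy:
    enabled: bool
    action: Action


# Hypothetical policy: category keys and layout are illustrative only.
POLICY = {
    "pii": CategoryPolicy(enabled=True, action=Action.REDACT),
    "pci": CategoryPolicy(enabled=True, action=Action.BLOCK),
    "phi": CategoryPolicy(enabled=False, action=Action.MONITOR),
    "prompt_injection": CategoryPolicy(enabled=True, action=Action.BLOCK),
}


def action_for(category, policy=POLICY):
    """Return the configured action, or None if the category is disabled or unset."""
    entry = policy.get(category)
    if entry is None or not entry.enabled:
        return None
    return entry.action
```

In this sketch, disabling a category leaves its action in place but stops it from being applied, so each category can be toggled without touching the others.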

Data Risk Detection

Contextual detection identifies sensitive data categories and applies the configured action.

Supported Categories

  • PII - Personally Identifiable Information
  • PHI - Protected Health Information
  • PCI - Payment Card Industry data
  • Financial data - Financial records and account information

Exact Data Matching (EDM)

Exact Data Matching applies custom EDM rules to match specific data formats.
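The sketch below shows one common way exact data matching is approached in data loss prevention tooling: a format pattern narrows the candidates, and each candidate is then compared against fingerprints of known values. This is an assumption for illustration only. The source does not describe how the gateway implements EDM, and `EdmRule`, `fingerprint`, `find_exact_matches`, and the employee ID format are all hypothetical.

```python
import hashlib
import re
from dataclasses import dataclass


def fingerprint(value):
    """Hash a value so the rule never stores the sensitive data in clear text."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class EdmRule:
    name: str
    pattern: re.Pattern        # format the data must have
    fingerprints: frozenset    # hashes of the exact values to match


def find_exact_matches(text, rule):
    """Return (start, end) spans that match the format AND a known fingerprint."""
    return [
        m.span()
        for m in rule.pattern.finditer(text)
        if fingerprint(m.group()) in rule.fingerprints
    ]


# Hypothetical rule: the ID format and values are made up for this example.
EMPLOYEE_IDS = EdmRule(
    name="employee_id",
    pattern=re.compile(r"\bEMP-\d{6}\b"),
    fingerprints=frozenset(fingerprint(v) for v in ["EMP-004217", "EMP-118830"]),
)
```

Here `find_exact_matches("Ask EMP-004217 or EMP-999999", EMPLOYEE_IDS)` flags only the first ID, since the second has the right format but is not in the known set.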

Adversarial Risk Detection

Adversarial risk detection identifies attack patterns in requests:

  • Prompt injection - Attempts to override system instructions
  • Jailbreak - Attempts to bypass safety controls
  • Social engineering - Manipulation attempts targeting the AI model

Configurable Actions

Each detection category can be assigned one of the following actions:

Action       Behavior
Block        Reject the request entirely
Redact       Remove the sensitive data and allow the request
Anonymize    Replace sensitive data with anonymized placeholders
Monitor      Allow the request and log the detection for review
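To make the difference between the four actions concrete, here is a minimal sketch that applies each one to a piece of text with already-detected spans. The `Result` type, the `apply_action` function, the span format, and the `[CATEGORY]` placeholder style are assumptions for illustration, not the gateway's actual behavior or API. Whether Block, Redact, and Anonymize also log detections is not stated above, so this sketch logs only for Monitor.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Action(Enum):
    BLOCK = "block"
    REDACT = "redact"
    ANONYMIZE = "anonymize"
    MONITOR = "monitor"


@dataclass
class Result:
    allowed: bool
    text: Optional[str]                 # None when the request is rejected
    logged: List[str] = field(default_factory=list)


def apply_action(text, spans, action):
    """Apply one action to non-overlapping (start, end, category) detections."""
    if not spans:
        return Result(True, text)
    if action is Action.BLOCK:
        return Result(False, None)
    if action is Action.MONITOR:
        # Text passes through unchanged; detections are recorded for review.
        return Result(True, text, [f"{cat}@{s}:{e}" for s, e, cat in spans])
    out = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for s, e, cat in sorted(spans, reverse=True):
        replacement = "" if action is Action.REDACT else f"[{cat.upper()}]"
        out = out[:s] + replacement + out[e:]
    return Result(True, out)
```

For the input `"Card 4111111111111111 ok"` with a single `pci` span covering the digits, Redact yields `"Card  ok"` while Anonymize yields `"Card [PCI] ok"`.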