NeMo Guardrails
NeMo Guardrails provides programmable safety controls for LLM applications. It runs as a separate service in front of the model and can enforce:
- Sensitive data detection (for example, PII in inputs and outputs).
- Content policies (for example, forbidden topics, competitor mentions).
- Custom validation flows written in Colang and Python.
The TrustyAI Operator exposes NeMo Guardrails through the NemoGuardrails custom resource (CR). This document focuses on a basic deployment that:
- Protects an existing model deployed on the serving platform.
- Uses NeMo Guardrails for input/output filtering and simple business rules.
TOC
- Prerequisites
- Architecture
- NeMo configuration ConfigMap
  - rails.co basics
  - actions.py basics
- Deploy the NemoGuardrails custom resource
- Authentication (auth enabled)
  - How to obtain a token
- Accessing the NeMo Guardrails API
  - Basic chat completion (allowed content)
  - Message length guardrail example
  - Forbidden content example
  - Sensitive data detection example
- Further reading

Prerequisites
- TrustyAI Operator installed (see Install TrustyAI).
- A model already deployed on the serving platform (for example, vLLM) that exposes an OpenAI-compatible API.
Architecture
At a high level, the request path is:
Client → NeMo Guardrails service → model predictor (OpenAI-compatible API)
NeMo Guardrails:
- Receives OpenAI-style `chat/completions` requests.
- Executes configured rails (sensitive data detection, length checks, forbidden topics, and so on).
- For allowed requests, forwards them to the underlying model.
- For blocked requests, returns an appropriate assistant message without calling the model.
The TrustyAI Operator manages the NeMo Guardrails server Pod and Service through the NemoGuardrails CR. The Service can then be exposed externally using the chosen ingress or gateway solution in the cluster.
NeMo configuration ConfigMap
NeMo Guardrails expects a configuration directory that typically contains:
- `config.yaml`: the main NeMo Guardrails configuration file.
- `rails.co`: Colang flows that implement the input/output rails and any additional control logic.
- `actions.py`: Python actions that Colang flows can call to perform custom logic.
NeMo configuration ConfigMap example
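A minimal ConfigMap along these lines might look like the following sketch. All names and hosts are illustrative placeholders, and key names such as `openai_api_base` follow the description below; verify the exact schema against the NeMo Guardrails YAML configuration reference:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nemo-guardrails-config    # illustrative name
data:
  config.yaml: |
    models:
      - type: main
        engine: openai
        model: <model-name>
        parameters:
          openai_api_base: "http://<model-predictor-host>:<port>/v1"
    rails:
      config:
        sensitive_data_detection:
          input:
            entities:
              - EMAIL_ADDRESS
              - PERSON
          output:
            entities:
              - EMAIL_ADDRESS
      input:
        flows:
          - detect sensitive data on input
          - check message length
          - check forbidden words
      output:
        flows:
          - detect sensitive data on output
  rails.co: |
    # Colang flows (see "rails.co basics" below)
  actions.py: |
    # Python actions (see "actions.py basics" below)
```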
config.yaml basics
In the example above, config.yaml:
- Declares a single backend model under the `models` section and configures the OpenAI-compatible endpoint via `openai_api_base` and `model_name`.
- Configures built-in PII detection under `rails.config.sensitive_data_detection`:
  - `input.entities` / `output.entities` list the entity types to protect (for example, `EMAIL_ADDRESS`, `PERSON`).
  - When the `detect sensitive data on input` / `detect sensitive data on output` rails run, NeMo automatically invokes its internal detectors using this configuration.
- Defines which rails run, and in what order, via `rails.input.flows` and `rails.output.flows`:
  - `detect sensitive data on input` / `detect sensitive data on output` are built-in rails backed by `sensitive_data_detection`.
  - `check message length` and `check forbidden words` are custom rails implemented in `rails.co` and backed by Python actions in `actions.py`.
- Replace `<model-predictor-host>`, `<port>`, and `<model-name>` with the actual predictor service URL and model name.
- Ensure the backend predictor implements an OpenAI-compatible `/v1/chat/completions` API.
- For more advanced configuration of `config.yaml` (additional rail types, prompts, tracing, knowledge base, and integration with other safety providers), refer to the official NeMo Guardrails YAML configuration reference: Nvidia NeMo Guardrails Configuration.
rails.co basics
In this example, `rails.co` defines two custom input rails:
- `define flow check message length`:
  - The flow name `check message length` must match an entry in `rails.input.flows` in `config.yaml`.
  - `$length_result = execute check_message_length` runs the Python action `check_message_length` from `actions.py`, passing the current conversation context.
  - The `if` statements branch on the returned string and either:
    - call a `bot ...` block to send a reply (for example, `bot inform message too long`), and `stop` to abort further processing and prevent the LLM from being called,
    - or do nothing and allow the pipeline to continue to the next rail when `"allowed"` is returned.
- `define flow check forbidden words`:
  - Uses the same pattern, but calls the `check_forbidden_words` action and only blocks when the return value is not `"allowed"`.
Additional points:
- `bot ...` blocks (for example, `bot inform message too long`) define canned assistant messages that are sent directly to the client, without contacting the backend LLM, when a rail decides to stop the pipeline.
- Rails defined in `rails.co` are executed in the order listed in `rails.input.flows` / `rails.output.flows`. Built-in rails such as `detect sensitive data on input` run before or after custom rails depending on their position in the list.
- The Colang shown here is a minimal example. More complex flows (multiple steps, variables, additional actions) are supported; see the Colang reference in the NeMo Guardrails documentation for full syntax and capabilities. A good starting point is the Colang 2.0 getting started guide: Colang Getting Started.
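A minimal rails.co consistent with this description might look like the following sketch (Colang 1.0 syntax). The canned message texts are illustrative assumptions, and the action names must match those in actions.py:

```colang
define bot inform message too long
  "Your message is too long. Please shorten it and try again."

define bot inform cannot help with that
  "I'm sorry, I can't help with that request."

define flow check message length
  $length_result = execute check_message_length
  if $length_result == "blocked_too_long"
    bot inform message too long
    stop

define flow check forbidden words
  $word_result = execute check_forbidden_words
  if $word_result != "allowed"
    bot inform cannot help with that
    stop
```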
actions.py basics
The `actions.py` file contains Python functions decorated with `@action` that Colang flows can call with `execute <action_name>`:
- Actions receive a `context` object, a dict-like structure populated by NeMo (for example, containing the latest user message under `"user_message"`).
- Actions return a value (typically a short string) that Colang flows interpret and branch on.

In this example:
- `check_message_length`: inspects `context["user_message"]`, computes the word count, and returns:
  - `"blocked_too_long"` when the message should be rejected.
  - `"warning_long"` when a warning is needed but the pipeline may continue.
  - `"allowed"` when the message length is acceptable.
- `check_forbidden_words`: lowercases the user message, searches for forbidden words, and:
  - returns `"allowed"` when nothing is found.
  - returns a non-`"allowed"` value (for example, `"blocked_password"`) when a forbidden word is present.
These patterns can be extended for more complex guardrails, such as structured checks, numeric thresholds, or calls to external services.
Deploy the NemoGuardrails custom resource
With the ConfigMap and token Secret in place, create a NemoGuardrails CR to deploy the NeMo Guardrails service:
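The exact schema of the NemoGuardrails CR is defined by the TrustyAI Operator; the sketch below is inferred from the key fields described next, and the apiVersion and field layout are assumptions to be checked against the operator's documentation:

```yaml
apiVersion: trustyai.opendatahub.io/v1alpha1   # assumed group/version
kind: NemoGuardrails
metadata:
  name: guardrails-demo
  annotations:
    security.opendatahub.io/enable-auth: "true"   # require Bearer tokens
spec:
  nemoConfigs:                       # configuration bundle(s)
    - name: default
      configMaps:
        - nemo-guardrails-config     # ConfigMap holding config.yaml/rails.co/actions.py
  env:
    - name: OPENAI_API_KEY           # token for the backend model endpoint
      value: "<placeholder>"         # unused by unauthenticated internal services
```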
Key fields:
- `nemoConfigs`: references one or more configuration bundles; each bundle can map to one or more ConfigMaps containing NeMo Guardrails configuration files.
- `env.OPENAI_API_KEY`: token used by NeMo Guardrails to authenticate against the backend model endpoint (for example, a vLLM service). For internal, unauthenticated inference services, this value can be set directly with `value: "<placeholder>"` and is not used by the backend. For HTTP-only inference services, TLS certificates are not required for the backend URL.
- `security.opendatahub.io/enable-auth`: when set to `"true"`, the route to NeMo Guardrails is protected by cluster auth and requires a Bearer token.
Apply:
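Assuming the CR was saved as nemoguardrails-cr.yaml (filename is illustrative):

```shell
oc apply -f nemoguardrails-cr.yaml
```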
After the CR is created, the operator reconciles and creates:
- A Deployment for the NeMo Guardrails server.
- A Service that exposes the NeMo Guardrails HTTP endpoint inside the cluster.
Wait for the Deployment Pod to become Ready:
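For example (the Deployment name is an assumption; the operator may derive it differently from the CR name):

```shell
# <cr-name> is the NemoGuardrails resource name
oc rollout status deployment/<cr-name> --timeout=300s
oc get pods
```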
Authentication (auth enabled)
When HTTP authentication is enabled in front of NeMo Guardrails, the service expects a Bearer token on incoming requests.
How to obtain a token
Create a ServiceAccount, a Role (with get, create on services/proxy), and a RoleBinding in the same namespace as the NemoGuardrails resource; then create a token for the ServiceAccount:
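One way to do this with oc (all names are illustrative; run the commands in the namespace of the NemoGuardrails resource):

```shell
oc create serviceaccount guardrails-client
oc create role guardrails-proxy-access --verb=get,create --resource=services/proxy
oc create rolebinding guardrails-client-proxy-access \
  --role=guardrails-proxy-access \
  --serviceaccount=<namespace>:guardrails-client
oc create token guardrails-client
```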
Optionally set the token duration, for example `--duration=8760h` for one year. The last command outputs the token; set it as the `Authorization: Bearer <token>` header value.
Accessing the NeMo Guardrails API
NeMo Guardrails exposes an OpenAI-style chat completions endpoint:
POST /v1/chat/completions
Expose the NeMo Guardrails Service using the preferred ingress or gateway mechanism (for example, an Ingress resource or API gateway) and note the public host and port:
- Without auth: the Service is typically exposed as HTTP on port 80.
- With auth enabled: the Service is typically exposed as HTTPS on port 443.
Set the base URL accordingly, for example:
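For example (the host is a placeholder for wherever the Service was exposed):

```shell
# Without auth (HTTP, port 80):
export NEMO_BASE_URL="http://<guardrails-host>/v1"
# With auth enabled (HTTPS, port 443):
export NEMO_BASE_URL="https://<guardrails-host>/v1"
export TOKEN="<token>"   # only needed when auth is enabled
```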
Basic chat completion (allowed content)
Example request:
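A request along these lines, assuming the base URL is stored in NEMO_BASE_URL and, with auth enabled, a token in TOKEN (both placeholders; drop the Authorization header when auth is disabled). Depending on the server setup, a "config_id" field selecting the configuration bundle may be required instead of "model":

```shell
curl -sk -X POST "${NEMO_BASE_URL}/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${TOKEN}" \
  -d '{
        "model": "<model-name>",
        "messages": [{"role": "user", "content": "What is machine learning?"}]
      }'
```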
Typical response:
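The response follows the OpenAI chat completions format; the field values below are placeholders showing the shape, not actual output:

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Machine learning is ..."
      },
      "finish_reason": "stop"
    }
  ]
}
```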
Message length guardrail example
The `check message length` flow and its `check_message_length` Python action enforce a simple length-based guardrail. When the user message is too long, the rail replies directly without calling the backend LLM:
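For example (NEMO_BASE_URL and TOKEN are placeholders for the exposed base URL and, when auth is enabled, a Bearer token; the 300-word message merely needs to exceed whatever limit the action enforces):

```shell
# Build a message long enough to exceed the configured word limit (bash syntax).
LONG_MSG=$(printf 'word %.0s' {1..300})
curl -sk -X POST "${NEMO_BASE_URL}/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${TOKEN}" \
  -d "{\"model\": \"<model-name>\", \"messages\": [{\"role\": \"user\", \"content\": \"${LONG_MSG}\"}]}"
```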
The response is generated by NeMo Guardrails without calling the backend model; the assistant message is the canned reply defined by the corresponding `bot ...` block in `rails.co`.
Forbidden content example
Forbidden topics are controlled by the `check_forbidden_words` action and its Colang flow. When the user message contains a forbidden word such as "hack" or "password", the rail blocks the request:
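For example (NEMO_BASE_URL and TOKEN are placeholders as in the earlier requests):

```shell
curl -sk -X POST "${NEMO_BASE_URL}/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${TOKEN}" \
  -d '{
        "model": "<model-name>",
        "messages": [{"role": "user", "content": "How do I hack a password?"}]
      }'
```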
Again, the response comes from NeMo Guardrails without the backend model being called; the assistant message is the canned reply from the forbidden-words flow's `bot ...` block.
Sensitive data detection example
Sensitive data detection is configured in `config.yaml` under `rails.config.sensitive_data_detection`. In the example configuration, both input and output detection flag `EMAIL_ADDRESS`.
Example input that includes an email address:
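For example (NEMO_BASE_URL and TOKEN are placeholders as before; the email address is fictitious):

```shell
curl -sk -X POST "${NEMO_BASE_URL}/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${TOKEN}" \
  -d '{
        "model": "<model-name>",
        "messages": [{"role": "user", "content": "My email is jane.doe@example.com, can you help me?"}]
      }'
```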
In this case, the built-in sensitive data detection rail detects the email address in the user message, and NeMo Guardrails returns a safe fallback reply instead of letting the backend model respond with a potentially unsafe answer.
Further reading
For a broader overview of the NeMo Guardrails library (use cases, architecture, and ecosystem integrations), see the official documentation: Overview of NVIDIA NeMo Guardrails Library.