In the previous Model Context Protocol (MCP) Integration chapter, we gave our agent "hands" to interact with the outside world. It can now read files, search the web, and query databases.
But with great power comes great responsibility. What if your agent accidentally reads a file containing passwords and sends it to the user? What if a user tricks your agent into being rude?
This brings us to Middleware & Defense.
Imagine your agent is an international airport.
The Problem: If you check every passenger manually inside the plane (inside your agent logic), your code becomes messy. You would have checks like if is_safe(message): scattered everywhere.
The Solution: Middleware. It acts like the Security Checkpoint.
To understand Defense Middleware, we need to understand three simple concepts:
Let's say we are building a Customer Support Agent. We want to ensure it never leaks customer email addresses in its output.
Instead of writing Python code to check for emails, we define a "Defense" rule in our YAML configuration.
# config.yaml
middleware:
  email_protector:
    _type: pii_defense               # A specific defense type
    action: redirection              # Don't crash, just fix it
    target_location: output          # Check what the agent SAYS
    target_field: "$.response.body"  # Only check the body text
Explanation:
action: redirection: Tells the system to sanitize the data (replace emails with [REDACTED]) rather than throwing an error.
target_location: output: We are worried about what leaves the system.
target_field: We use a specific path so we don't accidentally redact internal system logs.
In the NeMo Agent Toolkit, you don't need to manually attach this middleware to every function. The toolkit reads your config and wraps the relevant functions automatically.
Suppose a user asks, "What is Alice's email?" and the LLM tries to reply, "It is alice@example.com".
The Middleware intercepts the result:
Original Output: "It is alice@example.com"
Middleware Scan: [Email Detected!] -> [Redacting...]
Final Output: "It is [EMAIL REDACTED]"
Your agent logic doesn't even know this happened. It just generated the text, and the middleware cleaned it up.
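To make the scan-and-redact step concrete, here is a minimal sketch of what such a pass could look like. The regex, the redact_emails name, and the placeholder string are illustrative assumptions, not the toolkit's actual PII detector, which is likely more thorough.

```python
import re

# Hypothetical, deliberately simple email pattern (illustration only).
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str, placeholder: str = "[EMAIL REDACTED]") -> str:
    """Replace every email-like substring with a placeholder."""
    return EMAIL_PATTERN.sub(placeholder, text)


if __name__ == "__main__":
    # The agent's raw output goes in, the cleaned text comes out.
    print(redact_emails("It is alice@example.com"))
```

The function never raises when it finds an email. It rewrites the text and returns it, which is the "don't crash, just fix it" behavior the config describes.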
How does the toolkit wrap these functions? It uses a pattern often called the "Onion" or "Decorator" pattern.
When you call a function, you aren't calling the function directly. You are calling the Middleware Layer.
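The wrapping idea can be sketched with a plain Python decorator. This is only an illustration of the pattern: the names with_defense, hide_secret, and answer are hypothetical, and the toolkit wires its middleware from configuration rather than from decorator syntax like this.

```python
from functools import wraps
from typing import Callable


def with_defense(check_output: Callable[[str], str]) -> Callable:
    """Build a decorator that passes a function's result through check_output."""
    def decorator(func: Callable[..., str]) -> Callable[..., str]:
        @wraps(func)
        def wrapper(*args, **kwargs) -> str:
            result = func(*args, **kwargs)  # inner layer: the real function
            return check_output(result)     # outer layer: the defense
        return wrapper
    return decorator


def hide_secret(text: str) -> str:
    # Hypothetical defense: mask one hard-coded word.
    return text.replace("hunter2", "[REDACTED]")


@with_defense(hide_secret)
def answer(question: str) -> str:
    # Stand-in for the agent logic; it knows nothing about the defense.
    return "The password is hunter2"


if __name__ == "__main__":
    print(answer("What is the password?"))
```

Callers still call answer, but what they actually reach is the wrapper, so the defense runs on every result without the inner function changing at all.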
Let's look at the base class that makes this possible: DefenseMiddleware.
Every defense tool inherits from this standard structure. It holds the configuration and access to the Builder.
# packages/nvidia_nat_core/src/nat/middleware/defense/defense_middleware.py
class DefenseMiddleware(FunctionMiddleware):
    def __init__(self, config: DefenseMiddlewareConfig, builder):
        super().__init__(is_final=False)
        self.config = config
        self.builder = builder
        # 'action' determines if we Block or Redact
        print(f"Defense initialized with action: {config.action}")
Explanation:
This class prepares the middleware. It knows how to react (config.action) based on your YAML settings.
_should_apply_defense
You might have 50 functions in your agent, but you only want to defend the generate_email function. The middleware checks if it should run for the current step.
def _should_apply_defense(self, context_name: str) -> bool:
    target = self.config.target_function_or_group
    # If no target is set in config, defend EVERYTHING
    if target is None:
        return True
    # Otherwise, only run if names match
    return context_name == target
Explanation: Before scanning data, the middleware asks: "Is this the function I am supposed to guard?" This avoids the cost of scanning internal utility functions that don't need it.
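The targeting check is easy to try in isolation. The sketch below mirrors the exact-name comparison from the excerpt as a free function. TargetConfig is a hypothetical stand-in for the real config class, and any group-matching behavior the toolkit may have is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TargetConfig:
    # Hypothetical stand-in for the real defense config object.
    target_function_or_group: Optional[str] = None


def should_apply_defense(config: TargetConfig, context_name: str) -> bool:
    """Return True when the defense should run for the named function."""
    target = config.target_function_or_group
    if target is None:
        return True  # no target configured: defend everything
    return context_name == target  # otherwise require an exact name match


if __name__ == "__main__":
    guard_all = TargetConfig()
    guard_one = TargetConfig(target_function_or_group="generate_email")
    print(should_apply_defense(guard_all, "lookup_order"))    # True
    print(should_apply_defense(guard_one, "generate_email"))  # True
    print(should_apply_defense(guard_one, "lookup_order"))    # False
```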
Sometimes, the output is a complex JSON object. We don't want to scan the whole thing, just specific fields. The toolkit uses JSONPath (like $.data.message) to find the exact text to check.
from typing import Any

from jsonpath_ng import parse  # Assumption: parse comes from the jsonpath-ng package


def _extract_field_from_value(self, value: Any):
    # If the user specified a specific field (e.g., "$.response")
    if self.config.target_field:
        # Use JSONPath to find just that part of the data
        jsonpath_expr = parse(self.config.target_field)
        matches = jsonpath_expr.find(value)
        # Return only the extracted text for scanning
        return [match.value for match in matches]
    return value  # Otherwise, scan everything
Explanation: This method extracts the needle from the haystack. If your agent returns a 5MB JSON object, but only one field contains text, this ensures the defense system focuses only on that one field, making it fast and accurate.
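To see the extraction behavior without installing anything, here is a small stand-in that handles only simple dotted paths such as $.response.body. It is not a JSONPath implementation and not the toolkit's code. The extract_field name and the sample payload are assumptions made for illustration.

```python
from typing import Any, List, Optional


def extract_field(value: Any, target_field: Optional[str]) -> Any:
    """Return the values at a simple '$.a.b' path, or the whole value if no path is set."""
    if not target_field:
        return value  # no path configured: scan everything
    # Drop the leading '$' and split the rest into keys, ignoring empty parts.
    keys = [key for key in target_field.lstrip("$").split(".") if key]
    current: List[Any] = [value]
    for key in keys:
        # Keep only the nodes that are dicts and actually contain the key.
        current = [node[key] for node in current
                   if isinstance(node, dict) and key in node]
    return current


if __name__ == "__main__":
    payload = {
        "response": {"body": "It is alice@example.com", "status": 200},
        "debug": "internal log line",
    }
    print(extract_field(payload, "$.response.body"))
```

Like the method above, it returns a list of matches when a path is set and the untouched value otherwise, so a path that matches nothing yields an empty list rather than an error.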
In this chapter, we learned:
Each defense is configured with an action (Block/Redact) and a target (specific functions or fields).
Now our agent is smart, connected, and safe. But how do we know how well it is performing? Is it slow? Is it getting confused?
We need to see inside the brain of the agent.
Next Chapter: Observability & Profiling
Generated by Code IQ