What is Therion?
Therion is Starfort’s advanced Auto Red Teaming engine designed to proactively identify vulnerabilities in your AI models before they can be exploited by attackers. By simulating sophisticated adversarial attacks, Therion ensures your models are robust, secure, and compliant.Start a Scan
Register your model to start a Therion red teaming session.
Core Capabilities
Therion operates by launching controlled, adversarial interactions against your AI infrastructure.1. Automated Jailbreaking
Therion attempts to bypass safety guardrails using state-of-the-art jailbreak techniques, including:- Prompt Injection: Manipulating input context to override instructions.
- Token Smuggling: Obfuscating malicious payloads to evade filters.
- Cognitive Overload: Using complex nested logic to confuse the model.
2. Prompt Leakage Testing
Verifies if your model can be tricked into revealing sensitive system prompts, proprietary data, or internal configurations.3. Bias and Toxicity Analysis
Systematically probes the model for harmful outputs, checking for:- Racial, gender, or religious bias.
- Generation of toxic or hate speech.
- Dangerous content generation (e.g., instructions for illegal acts).
How It Works
1
Configuration
Define the scope of the red teaming exercise, including the specific model endpoints and the types of attacks to simulate (e.g., OWASP Top 10 for LLMs).
2
Attack Simulation
Therion’s engine generates thousands of adversarial prompts, evolving its strategy based on the model’s responses.
3
Analysis & Reporting
Results are analyzed in real-time. Successful attacks are flagged with severity scores, and a comprehensive remediation report is generated.