SteeringAPI Documentation
Learn how to control AI model behavior with fine-grained, interpretable features.
Quick Start
Get up and running with SteeringAPI in minutes. Learn the basics of feature steering and make your first API call.
API Reference
Interactive OpenAPI documentation with all endpoints, request/response schemas, and try-it-out functionality.
How It Works
Deep dive into the mathematics and mechanics of SAE-based model steering from theory to implementation.
SelfIE Labels
Learn about our automated feature labeling system that generates interpretable descriptions for SAE features.
What is SteeringAPI?
SteeringAPI provides fine-grained control over AI model behavior through interpretable features extracted using Sparse Autoencoders (SAEs). Instead of relying on prompt engineering alone, you can directly manipulate high-level semantic concepts like tone, style, safety, and domain-specific knowledge.
Key Features
- Interpretable Control: Steer on human-understandable concepts like "formal language" or "technical jargon"
- Precise Adjustments: Set the strength of each feature with a numeric value
- Real-time Steering: Apply interventions during inference without retraining
- 100k+ Features: Access a vast library of interpretable features across multiple models
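To make the idea of steering concrete, here is a minimal conceptual sketch in plain Python. It is not the SteeringAPI client or its request format. It only illustrates the common approach of adding a scaled feature direction to a model's hidden activation at inference time. The function name `steer`, the toy vectors, and the "formal language" direction are all hypothetical and chosen for illustration.

```python
from typing import List


def steer(hidden: List[float], direction: List[float], strength: float) -> List[float]:
    """Return hidden + strength * direction, computed element-wise.

    `hidden` stands in for a model activation vector and `direction` for the
    vector associated with one SAE feature. Both are illustrative assumptions.
    """
    if len(hidden) != len(direction):
        raise ValueError("hidden and direction must have the same length")
    return [h + strength * d for h, d in zip(hidden, direction)]


# Toy 3-dimensional example; real activations have thousands of dimensions.
hidden_state = [0.5, -1.0, 2.0]
formal_language_direction = [1.0, 0.0, -1.0]  # hypothetical feature direction

# A positive strength pushes the activation toward the feature;
# a negative strength pushes it away.
steered = steer(hidden_state, formal_language_direction, strength=0.5)
print(steered)
```

The `strength` argument plays the role of the numeric adjustment described above: larger magnitudes move the activation further along the feature direction, and no model weights are changed.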
Common Use Cases
Content Moderation
Suppress harmful content by reducing activation of features related to violence, toxicity, or inappropriate topics.
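As a rough illustration of what "reducing activation" can mean, the sketch below scales down the component of an activation vector that lies along a single feature direction. It is a simplified linear stand-in, not SteeringAPI's implementation: a real SAE computes feature activations through a learned encoder with a nonlinearity. The names `suppress`, `keep`, and `toxicity_direction` are hypothetical, and the direction is assumed to have unit length.

```python
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def suppress(hidden: List[float], direction: List[float], keep: float) -> List[float]:
    """Scale the component of `hidden` along a unit-norm `direction` by `keep`.

    keep=1.0 leaves the vector unchanged; keep=0.0 removes the component.
    """
    activation = dot(hidden, direction)  # how strongly the feature is present
    return [h - (1.0 - keep) * activation * d for h, d in zip(hidden, direction)]


# Toy 2-dimensional example with a hypothetical unit-norm feature direction.
hidden_state = [3.0, 4.0]
toxicity_direction = [1.0, 0.0]

# Remove the feature's contribution entirely; other components are untouched.
suppressed = suppress(hidden_state, toxicity_direction, keep=0.0)
print(suppressed)
```

Intermediate values of `keep` dampen the feature without removing it, which may be preferable when full removal degrades output quality.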
Style Control
Adjust tone, formality, or personality traits to match your brand voice or use case requirements.
Domain Adaptation
Enhance or suppress domain-specific knowledge (medical, legal, technical) without fine-tuning.
Bias Mitigation
Identify and reduce unwanted biases by steering away from problematic feature activations.
Next Steps
1. Read the Quick Start: Get started in 5 minutes
2. Understand the concepts: Learn how steering works
3. Try it out: Sign up and start steering