Vellum uses OS-level sandboxing, a keychain-backed credential vault, and a scoped trust rule system to keep your data safe while giving the assistant the access it needs.
The sandbox uses native OS-level sandboxing: sandbox-exec with SBPL profiles on macOS, bwrap (bubblewrap) on Linux. No extra dependencies on macOS. Fail-closed — if the backend is unavailable, commands fail immediately rather than falling back to unsandboxed execution.
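The fail-closed behavior can be illustrated with a minimal sketch. All names here (`Backend`, `planSandboxedRun`, the `isAvailable` callback) are hypothetical and not Vellum's actual API; the only point is that no code path returns an unsandboxed plan.

```typescript
// Hypothetical names throughout; this is not Vellum's actual API.
type Backend = "sandbox-exec" | "bwrap";

interface SandboxPlan {
  backend: Backend;
  command: string[];
}

// Map a platform string (e.g. Node's process.platform) to the native backend.
function backendFor(platform: string): Backend {
  if (platform === "darwin") return "sandbox-exec";
  if (platform === "linux") return "bwrap";
  throw new Error(`No sandbox backend for platform: ${platform}`);
}

// Return a plan that names the backend, or throw. There is deliberately no
// branch that hands back the bare command when the backend is missing.
function planSandboxedRun(
  platform: string,
  isAvailable: (backend: Backend) => boolean,
  command: string[],
): SandboxPlan {
  const backend = backendFor(platform);
  if (!isAvailable(backend)) {
    throw new Error(`Sandbox backend "${backend}" is unavailable; refusing to run unsandboxed`);
  }
  return { backend, command };
}
```

A caller would pass a real availability probe (for example, checking that the binary exists on the PATH) and treat the thrown error as the final answer rather than retrying without the sandbox.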
- Sandboxed tools (file_read, file_write, file_edit, bash) operate within ~/.vellum/workspace.
- Host tools (host_bash, host_file_read, host_file_write, host_file_edit) execute directly on the host, subject to trust rules and permission prompts.

| Symptom | Cause | Fix |
|---|---|---|
| Docker CLI is not installed | Docker not installed | Install Docker Desktop |
| Docker daemon is not running | Docker Desktop not started | Start Docker Desktop or sudo systemctl start docker |
| Cannot bind-mount the sandbox root | File sharing not configured | Add ~/.vellum/workspace to Docker file sharing |
| bwrap cannot create namespaces | bubblewrap not installed (Linux) | apt install bubblewrap |

Run vellum doctor for a full diagnostic check.
Secrets are stored in the macOS Keychain (encrypted file fallback on Linux). The LLM never sees raw tokens or keys.
- A SecureField panel collects credentials; the LLM never sees the value.
- Credential use is restricted by allowedTools and allowedDomains, enforced by the CredentialBroker.

Use either format in proxied shell commands:

- credential_store list
- fal/api_key

Unknown references fail immediately with a clear error before the command executes.
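A minimal sketch of the "fail before executing" check, assuming references are opaque strings looked up in a table of known credentials. The `CredentialMeta` shape and `resolveReferences` function are hypothetical; note that the table holds metadata only, never secret values.

```typescript
// Hypothetical shape: metadata only. Secret values never appear here.
interface CredentialMeta {
  id: string;
}

// Resolve every reference up front; throw if any is unknown so that
// nothing runs with a partially resolved set of credentials.
function resolveReferences(
  refs: string[],
  known: Map<string, CredentialMeta>,
): CredentialMeta[] {
  const unknown = refs.filter((ref) => !known.has(ref));
  if (unknown.length > 0) {
    throw new Error(`Unknown credential reference(s): ${unknown.join(", ")}`);
  }
  return refs.map((ref) => known.get(ref)!);
}
```

Validating the whole list before returning anything keeps the error ahead of any side effect of the command itself.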
Patterns like *.fal.run match subdomains (api.fal.run, queue.fal.run) and the bare domain (fal.run). Exact patterns take precedence over wildcards.
Within a single credential, the most specific template wins (exact > wildcard). Across credentials there is no tie-breaking: when more than one credential matches the same host, the request is blocked, because the proxy refuses to guess.
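The matching and resolution rules above can be sketched as follows. The types and function names are hypothetical, and the sketch assumes each credential carries a list of templates with a `hostPattern`; it is an illustration of the stated rules, not the proxy's actual code.

```typescript
interface Template {
  hostPattern: string;
}

interface Credential {
  name: string;
  templates: Template[];
}

// "*.fal.run" matches subdomains and the bare domain; anything else is exact.
function matchesHost(pattern: string, host: string): boolean {
  if (pattern.startsWith("*.")) {
    const base = pattern.slice(2);
    return host === base || host.endsWith("." + base);
  }
  return host === pattern;
}

// Within one credential, an exact pattern beats a wildcard.
function bestTemplate(cred: Credential, host: string): Template | undefined {
  const hits = cred.templates.filter((t) => matchesHost(t.hostPattern, host));
  return hits.find((t) => !t.hostPattern.startsWith("*.")) ?? hits[0];
}

type Resolution =
  | { kind: "none" }
  | { kind: "match"; credential: string; template: Template }
  | { kind: "blocked"; candidates: string[] };

// Across credentials, more than one match blocks the request.
function resolveCredential(creds: Credential[], host: string): Resolution {
  const matches: { credential: string; template: Template }[] = [];
  for (const cred of creds) {
    const template = bestTemplate(cred, host);
    if (template) matches.push({ credential: cred.name, template });
  }
  const [first, ...rest] = matches;
  if (first === undefined) return { kind: "none" };
  if (rest.length > 0) {
    return { kind: "blocked", candidates: matches.map((m) => m.credential) };
  }
  return { kind: "match", credential: first.credential, template: first.template };
}
```

Returning a distinct `blocked` result, rather than picking the first match, mirrors the "refuses to guess" behavior and gives the caller the candidate names to report.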
- credential_store list
- hostPattern matches the target host
- headerName and valuePrefix
- LOG_LEVEL=debug for decision traces

The autoApproveUpTo threshold controls which risk levels auto-approve without prompting. Configured per execution context via the gateway.
| Threshold | UI Label | Behavior |
|---|---|---|
| none | Strict | Prompt for every action. No auto-approve. |
| low | Default | Auto-approve Low risk. Prompt for Medium and High. |
| medium | Relaxed | Auto-approve Low and Medium risk. Prompt for High only. |
| high | Full access | Auto-approve all risk levels. No prompts. |
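The table amounts to an ordered comparison between the threshold and the action's risk level. A minimal sketch, with hypothetical type and function names:

```typescript
type Risk = "low" | "medium" | "high";
type Threshold = "none" | Risk;

// Order the levels so that "auto-approve up to X" becomes a numeric comparison.
const RANK: Record<Threshold, number> = { none: 0, low: 1, medium: 2, high: 3 };

// True when an action at this risk level is approved without a prompt.
function autoApproves(threshold: Threshold, risk: Risk): boolean {
  return RANK[risk] <= RANK[threshold];
}
```

Because `none` ranks below every risk level, it never auto-approves, which matches the Strict row.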
autoApproveUpTo accepts either a scalar string applied to all contexts or an object with per-context overrides:

    // Scalar: same threshold everywhere
    autoApproveUpTo: "low"

    // Per-context: different thresholds per execution context
    autoApproveUpTo: {
      conversation: "low",   // interactive chat sessions
      background: "medium",  // scheduled tasks, heartbeats
      headless: "none"       // API/webhook-triggered (strictest)
    }

Trust rules are persisted in a SQLite database managed by the gateway process. Each rule stores a tool name, glob pattern, risk level, decision (allow/deny/ask), and optional directory scope.
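A trust rule with the fields listed above might be modeled and matched like this. The field names, the `*`-only glob handling, and the first-match lookup order are all assumptions made for illustration; the text does not specify the actual schema or precedence.

```typescript
type Decision = "allow" | "deny" | "ask";
type Risk = "low" | "medium" | "high";

interface TrustRule {
  tool: string;
  pattern: string;     // glob; only "*" is supported in this sketch
  risk: Risk;
  decision: Decision;
  directory?: string;  // optional directory scope
}

// Translate a "*"-only glob into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// A rule without a directory applies everywhere; otherwise the working
// directory must be the scope itself or somewhere beneath it.
function inScope(rule: TrustRule, cwd: string): boolean {
  if (rule.directory === undefined) return true;
  const dir = rule.directory.replace(/\/+$/, "");
  return cwd === dir || cwd.startsWith(dir + "/");
}

// Assumption: the first matching rule in list order wins.
function findRule(
  rules: TrustRule[],
  tool: string,
  input: string,
  cwd: string,
): TrustRule | undefined {
  return rules.find(
    (r) => r.tool === tool && inScope(r, cwd) && globToRegExp(r.pattern).test(input),
  );
}
```

When no rule matches, `findRule` returns `undefined`, and a caller would fall back to whatever the threshold and prompting logic dictate.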
When you approve a shell command, the prompt offers parser-derived allowlist options based on the command's structure. For example, cd /repo && gh pr view 5525 --json title generates:
- cd /repo && gh pr view 5525 --json title (exact command)
- gh pr view * (any gh pr view command)
- gh pr * (any gh pr command)
- gh * (any gh command)

Compound commands with multiple non-prefix actions only offer an exact-command option to prevent over-generalization.
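The option list can be reproduced with a simplified sketch. It assumes that `cd` is the only prefix action, that actions are joined by `&&`, and that leading bare words (no flags, no digits) form the program and its subcommands. A real parser would also need to handle quoting, pipes, and other separators; the function name is hypothetical.

```typescript
// Assumption: commands that only set up context and are skipped when deriving patterns.
const PREFIX_COMMANDS = new Set(["cd"]);

function allowlistOptions(command: string): string[] {
  const exact = command.trim();
  const actions = exact
    .split("&&")
    .map((part) => part.trim().split(/\s+/))
    .filter((tokens) => !PREFIX_COMMANDS.has(tokens[0] ?? ""));

  // Several real actions in one command: offer the exact command only.
  const [only, ...others] = actions;
  if (only === undefined || others.length > 0) return [exact];

  // Collect leading bare words; stop at the first flag, number, or path.
  const words: string[] = [];
  for (const token of only) {
    if (!/^[A-Za-z][A-Za-z_-]*$/.test(token)) break;
    words.push(token);
  }

  // Widen step by step: "gh pr view *", then "gh pr *", then "gh *".
  const options = [exact];
  for (let n = words.length; n >= 1; n--) {
    options.push(words.slice(0, n).join(" ") + " *");
  }
  return options;
}
```

Calling `allowlistOptions("cd /repo && gh pr view 5525 --json title")` yields the exact command followed by `gh pr view *`, `gh pr *`, and `gh *`, while `npm test && git push` yields only the exact command.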
Trust rules record the skill's version hash. If source files change, the hash changes and you're re-prompted — modified skills can't silently inherit previous approvals.
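One way to picture the version check is a content hash over the skill's source files. The sketch below assumes SHA-256 via Node's `crypto` module and an in-memory map of path to contents; the actual hashing scheme is not described here, and both function names are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Hash a skill's sources. Paths are sorted so the result does not depend
// on enumeration order, and a NUL separator keeps path/content boundaries distinct.
function skillVersionHash(files: Record<string, string>): string {
  const hash = createHash("sha256");
  for (const path of Object.keys(files).sort()) {
    hash.update(path).update("\0").update(files[path] ?? "").update("\0");
  }
  return hash.digest("hex");
}

// A stored approval carries over only while the recorded hash still matches.
function approvalStillValid(recordedHash: string, files: Record<string, string>): boolean {
  return recordedHash === skillVersionHash(files);
}
```

Any edit to any file changes the digest, so `approvalStillValid` returns false and the caller would prompt again.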
Writes to skill directories are escalated to high risk, preventing the agent from modifying its own capabilities without explicit consent.
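The escalation rule can be sketched as a path check applied before the usual risk handling. The function names and the idea of passing skill directories as a list are assumptions; the text does not say where skill directories live, and a real check would first normalize paths (resolving `..` and symlinks).

```typescript
type Risk = "low" | "medium" | "high";

// True when target is the directory itself or lies beneath it.
function isInside(dir: string, target: string): boolean {
  const base = dir.replace(/\/+$/, "");
  return target === base || target.startsWith(base + "/");
}

// Writes under any skill directory are treated as high risk,
// regardless of the risk the write would otherwise carry.
function writeRisk(targetPath: string, baseRisk: Risk, skillDirs: string[]): Risk {
  return skillDirs.some((dir) => isInside(dir, targetPath)) ? "high" : baseRisk;
}
```

Comparing against `base + "/"` rather than the bare prefix avoids treating a sibling such as `/skills-other` as part of `/skills`.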