Voice provider sandbox iframes

How ElevenLabs and Vapi WebRTC SDKs run inside the extension without weakening its CSP.

The Teleperson extension keeps a strict Content Security Policy: script-src 'self', no inline scripts, no unsafe-eval. The WebRTC SDKs from ElevenLabs and Vapi, however, require unsafe-eval to deserialize their codecs. Both requirements cannot hold in the same context, so the SDKs run in sandboxed iframes.

Figure: CSP boundary diagram. The strict-CSP zone (side panel, background SW, auth) is on the left; the sandbox iframe with unsafe-eval running ElevenLabs WebRTC is on the right; postMessage is the only crossing.

What the manifest declares

Manifest V3's content_security_policy key takes a separate policy for each of two contexts:

{
  "content_security_policy": {
    "extension_pages":
      "script-src 'self'; object-src 'self'; connect-src 'self' https://api.anthropic.com <supabase-url>",
    "sandbox":
      "sandbox allow-scripts allow-forms; script-src 'self' 'unsafe-eval'; object-src 'self'; connect-src 'self' https://api.elevenlabs.io https://api.vapi.ai"
  }
}

extension_pages: applies to the side panel and the background service worker. Strict, no unsafe-eval.
sandbox: applies only to pages listed under sandbox.pages in the manifest. unsafe-eval is allowed inside those pages and nowhere else.

The sandbox pages are listed by path:

"sandbox": {
  "pages": [
    "src/sandbox/elevenlabs-sandbox.html",
    "src/sandbox/vapi-sandbox.html"
  ]
}
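
To keep the two policies from drifting, a build step could assert the split. The following is a minimal sketch, not part of the extension as described here: the names ManifestCsp, directiveSources, and checkCspSplit are hypothetical, and the parser only splits directives on semicolons and whitespace.

```typescript
// Minimal shape of the manifest fields this check reads (hypothetical type name).
interface ManifestCsp {
  content_security_policy: { extension_pages: string; sandbox: string };
  sandbox: { pages: string[] };
}

// Return the source list of one directive, e.g. "script-src" -> ["'self'", "'unsafe-eval'"].
function directiveSources(csp: string, directive: string): string[] {
  for (const part of csp.split(";")) {
    const tokens = part.trim().split(/\s+/);
    if (tokens[0] === directive) return tokens.slice(1);
  }
  return [];
}

// Hypothetical build-time check: collects problems instead of throwing.
function checkCspSplit(manifest: ManifestCsp): string[] {
  const problems: string[] = [];
  const policy = manifest.content_security_policy;
  const strict = directiveSources(policy.extension_pages, "script-src");
  const relaxed = directiveSources(policy.sandbox, "script-src");
  if (strict.indexOf("'unsafe-eval'") !== -1) {
    problems.push("extension_pages must not allow 'unsafe-eval'");
  }
  if (relaxed.indexOf("'unsafe-eval'") === -1) {
    problems.push("sandbox policy is expected to allow 'unsafe-eval'");
  }
  if (manifest.sandbox.pages.length === 0) {
    problems.push("no sandbox pages are listed");
  }
  return problems;
}
```

An empty result means the manifest still matches the split described above; any entry names the rule that was broken.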

The split

The Voice Concierge UI (start button, transcript, mute, hang-up) lives in the regular side panel under the strict CSP. When the user clicks Call:

The panel mounts a hidden iframe pointing at chrome-extension://<id>/src/sandbox/elevenlabs-sandbox.html.
The sandbox iframe loads the ElevenLabs SDK, which evals its WebRTC codecs to bootstrap the session.
The panel and the sandbox communicate via postMessage — the panel tells the sandbox to start/stop the call and pushes dynamic variables; the sandbox emits transcript chunks back as postMessage events.

┌─────────────────┐     postMessage      ┌──────────────────────┐
│ Side panel      │ ───────────────────► │ ElevenLabs sandbox   │
│ (strict CSP)    │   start, stop, vars  │ iframe (unsafe-eval) │
│                 │ ◄─────────────────── │                      │
│                 │   transcript chunks  │   WebRTC ⇄ ElevenLabs│
└─────────────────┘                      └──────────────────────┘

The strict-CSP context never executes any third-party eval, and the sandbox never sees Teleperson's X-TLE-Token.
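
The message flow above might be typed roughly as follows. This is a sketch under assumptions: the page does not give the real message or field names, so PanelToSandbox, SandboxToPanel, sendToSandbox, and parseSandboxMessage are all hypothetical. MessageTarget stands in for the iframe's contentWindow so the code does not depend on DOM types.

```typescript
// Hypothetical message shapes; the real protocol's field names are not given in this page.
type PanelToSandbox =
  | { type: "start"; variables: Record<string, string> }
  | { type: "stop" };

type SandboxToPanel = { type: "transcript"; speaker: "user" | "agent"; text: string };

// The only capability the panel needs from the iframe's contentWindow.
interface MessageTarget {
  postMessage(message: unknown, targetOrigin: string): void;
}

// A sandboxed page without allow-same-origin runs in an opaque origin, so the
// panel cannot name a precise targetOrigin and passes "*". Keep secrets such as
// auth tokens out of the payload for that reason.
function sendToSandbox(target: MessageTarget, message: PanelToSandbox): void {
  target.postMessage(message, "*");
}

// Validate untrusted data coming back from the sandbox before the panel uses it.
function parseSandboxMessage(data: unknown): SandboxToPanel | null {
  if (typeof data !== "object" || data === null) return null;
  const { type, speaker, text } = data as { type?: unknown; speaker?: unknown; text?: unknown };
  if (type !== "transcript") return null;
  if (speaker !== "user" && speaker !== "agent") return null;
  if (typeof text !== "string") return null;
  return { type, speaker, text };
}
```

In the panel, a message listener would typically ignore any event whose event.source is not the sandbox iframe's contentWindow, then pass event.data through parseSandboxMessage and drop anything that comes back as null.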

Why this matters

About 99% of the code, everything in the side panel and background SW, stays under the strictest CSP browsers offer for extensions. An XSS bug in our own code still has no path to eval, because the strict context never permits it.
Voice provider SDK bugs can't reach our auth tokens, our Plaid data, or our Anthropic key.
Switching voice providers (ElevenLabs ↔ Vapi ↔ a future provider) is a same-shape change: drop in a new sandbox iframe, wire up the postMessage protocol.
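
One way to keep a provider swap down to a single lookup is a small registry from provider to sandbox page. This is a sketch only: VoiceProvider, SANDBOX_PAGES, and sandboxUrlFor are hypothetical names, and resolveUrl stands in for chrome.runtime.getURL so the function can run outside the extension.

```typescript
// Hypothetical provider identifiers.
type VoiceProvider = "elevenlabs" | "vapi";

// Paths are the ones listed under sandbox.pages in the manifest.
const SANDBOX_PAGES: Record<VoiceProvider, string> = {
  elevenlabs: "src/sandbox/elevenlabs-sandbox.html",
  vapi: "src/sandbox/vapi-sandbox.html",
};

// resolveUrl is injected; inside the extension it would be chrome.runtime.getURL.
function sandboxUrlFor(provider: VoiceProvider, resolveUrl: (path: string) => string): string {
  return resolveUrl(SANDBOX_PAGES[provider]);
}
```

Adding a provider under this scheme means one new entry in the registry, one new page in sandbox.pages, and a sandbox script that speaks the same postMessage protocol.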

What's in `connect-src` for the sandbox

Only 'self' and the voice providers' API origins: https://api.elevenlabs.io and https://api.vapi.ai. A sandbox page can't fetch from api.anthropic.com, the Teleperson Supabase project, or anywhere else. If it tries, the browser blocks the request at the CSP layer.
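
The effect of that allowlist can be modeled in a few lines. This is a simplified sketch, not the browser's actual CSP matching algorithm: it compares exact http(s) origins only and ignores 'self', wildcards, paths, and default ports. The function names are hypothetical.

```typescript
// Extract scheme://host[:port] from an absolute http(s) URL; null for anything else.
function originOf(url: string): string | null {
  const match = /^https?:\/\/[^\/?#]+/i.exec(url);
  return match ? match[0].toLowerCase() : null;
}

// Collect the http(s) origins listed in the policy's connect-src directive.
function connectSrcOrigins(csp: string): string[] {
  const origins: string[] = [];
  for (const part of csp.split(";")) {
    const tokens = part.trim().split(/\s+/);
    if (tokens[0] !== "connect-src") continue;
    for (const source of tokens.slice(1)) {
      const origin = originOf(source);
      if (origin !== null) origins.push(origin);
    }
  }
  return origins;
}

// True when the URL's origin appears in connect-src (simplified model).
function isConnectAllowed(csp: string, url: string): boolean {
  const target = originOf(url);
  return target !== null && connectSrcOrigins(csp).indexOf(target) !== -1;
}
```

Run against the sandbox policy from the manifest, a request to https://api.elevenlabs.io passes this check and a request to https://api.anthropic.com does not, which mirrors what the browser enforces for sandbox pages.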

Related

Voice Concierge: the user-facing feature.
ElevenLabs Voice AI config: the admin setup.
