Tools

Let your agent call your backend, run code on the caller’s device, or hang up / transfer the call

A tool is a reusable function the LLM can invoke mid-conversation. You create a tool once with POST /v1/tools, then attach it to any number of agents with POST /v1/agents/{id}/tools/{toolId}.

Three kinds

KindRuns whereUse it for
systemInside the workerBuilt-ins: end_call, transfer_to_number, transfer_to_agent, play_keypad_touch_tone, skip_turn.
webhookYour backendAny business logic: look up an order, book an appointment, charge a card. Signed with HMAC-SHA256.
clientThe caller’s browser / SDKUI actions: navigate the page, fill a form, populate a cart. Dispatched over the session’s tools data channel.

Shared schema: parameters

Every tool declares a parameter list the LLM is allowed to pass:

1"params": [
2 { "name": "order_id", "type": "string", "description": "Customer's order ID", "required": true },
3 { "name": "notify", "type": "boolean", "description": "Email confirmation", "required": false }
4]

type is one of string, number, integer, boolean. String params can include an enum of allowed values.

System tools

Worker-resident built-ins. No HTTP, no round-trip — they execute inside the agent process and complete in milliseconds.

1{
2 "name": "hang_up",
3 "description": "End the call when the caller says goodbye.",
4 "kind": "system",
5 "config": { "builtin": "end_call" }
6}

The LLM calls hang_up() when it decides the conversation is over; the room disconnects immediately.

transfer_to_number and play_keypad_touch_tone are SIP-dependent and return a clear error until phone-number support lands.

Webhook tools

The worker signs a JSON envelope, POSTs (or GETs) it to your URL, and passes the response back to the LLM.

1{
2 "name": "lookup_order",
3 "description": "Fetch order details by order ID.",
4 "kind": "webhook",
5 "config": {
6 "url": "https://api.your-app.com/webhooks/lookup-order",
7 "method": "POST",
8 "timeout_ms": 5000,
9 "headers": { "X-Org-ID": "acme" },
10 "params": [
11 { "name": "order_id", "type": "string", "description": "Order ID", "required": true }
12 ]
13 }
14}

On create, the response contains the HMAC signing secret exactly once:

1{ "id": "…", "kind": "webhook", "webhook_secret": "wh_sec_abc123…", "…": "…" }

Store it. Every subsequent read returns a masked placeholder — there is no retrieval endpoint.

Verifying the signature

Every call to your webhook carries an X-Speechify-Signature header:

X-Speechify-Signature: t=1713360000,v1=<hex HMAC-SHA256(secret, body)>

The body is a JSON envelope:

1{
2 "tool_call_id": "call_abc123",
3 "tool_name": "lookup_order",
4 "arguments": { "order_id": "ORD-42" },
5 "timestamp": "2026-04-17T17:30:00Z"
6}

Verification example in Python:

1import hmac, hashlib
2
3def verify(raw_body: bytes, header: str, secret: str) -> bool:
4 parts = dict(p.split("=", 1) for p in header.split(","))
5 expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
6 return hmac.compare_digest(parts["v1"], expected)

Respond with 200 OK and a JSON body; the agent receives your JSON as the tool’s return value:

1{ "status": "shipped", "tracking": "1ZW…" }

Webhook tools + GET

For method: "GET", the arguments are sent as query parameters. The HMAC header is still set, but it covers the JSON envelope (which isn’t on the wire), so verification by the recipient is only meaningful for POST. We recommend POST for any endpoint you plan to signature-verify.

Client tools

Execution runs in the caller’s browser / SDK. The agent forwards a request over the session’s tools data channel and awaits your response.

1{
2 "name": "navigate_to",
3 "description": "Scroll the page to a named section.",
4 "kind": "client",
5 "config": {
6 "timeout_ms": 4000,
7 "params": [
8 { "name": "section", "type": "string", "description": "Section name", "required": true,
9 "enum": ["pricing", "docs", "contact"] }
10 ]
11 }
12}

Receiving on the client

The upcoming @speechify/agents-js SDK wraps this with a single-call registerTool(name, handler) API — we’ll link the reference here as soon as it publishes. The wire protocol is a JSON tool_call message on the session’s tools data channel; your handler posts a tool_response with the same tool_call_id back on the same channel.

1// Inbound (the agent → your client)
2{ "type": "tool_call", "tool_call_id": "call_abc123", "tool_name": "navigate_to", "arguments": { "section": "pricing" } }
3
4// Outbound (your client → the agent)
5{ "type": "tool_response", "tool_call_id": "call_abc123", "result": { "ok": true } }

Attaching to an agent

A tool must be attached to an agent before the LLM can call it.

1from speechify import Speechify
2
3client = Speechify()
4
5client.tts.agents.attach_tool(id=agent_id, tool_id=tool_id)
6client.tts.agents.detach_tool(id=agent_id, tool_id=tool_id)
7attached = client.tts.agents.list_tools(id=agent_id)

Tool invocations are persisted on the transcript with role=tool, tool_name, tool_args, and tool_result — available via client.tts.conversations.list_messages(conv_id) or GET /v1/conversations/{id}/messages.