Tool Runner
JSON tool call parser and async dispatcher for the coding agent — launches apps, controls Spotify, manages the workspace, and calls web tools.
Overview
The coding agent (and the general chat agent) use the tool runner to parse structured JSON tool calls embedded in the LLM's output and execute them asynchronously. It is separate from the bracket command parser: tool calls use structured JSON, while bracket commands use inline text syntax.
| Module | Contents |
|---|---|
| tool_runner/parser.py | parse_tool_call_json(), strip_tool_call_json(), _scan_json_object(). |
| tool_runner/executor.py | execute_tool_call(): dispatches 30+ tools. |
| tool_runner/verifier.py | verify_tool_result(): validates tool output. |
parse_tool_call_json()
Scans a raw LLM response string for the first valid JSON object containing a "tool" key and returns it as a dict. Handles partial JSON, markdown code fences, and embedded noise.
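One way such a scan could work is sketched below. This is a minimal illustration, not the actual parser.py code: find_first_tool_call and strip_first_tool_call are hypothetical names, and the sketch relies on the standard library's json.JSONDecoder.raw_decode to try each opening brace in turn. It copes with surrounding prose and code fences but, unlike the function described above, makes no attempt to repair partial JSON.

```python
import json
from typing import Optional, Tuple

_DECODER = json.JSONDecoder()


def find_first_tool_call(text: str) -> Optional[Tuple[dict, int, int]]:
    """Return (call, start, end) for the first JSON object with a "tool" key."""
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at pos and ignores
            # whatever trails it, which suits JSON embedded in prose.
            obj, end = _DECODER.raw_decode(text, pos)
        except json.JSONDecodeError:
            obj, end = None, pos
        if isinstance(obj, dict) and "tool" in obj:
            return obj, pos, end
        # Not a tool call: retry from the next opening brace.
        pos = text.find("{", pos + 1)
    return None


def strip_first_tool_call(text: str) -> str:
    """Return text with the first tool-call object removed (hypothetical helper)."""
    found = find_first_tool_call(text)
    if found is None:
        return text.strip()
    _, start, end = found
    return (text[:start] + text[end:]).strip()
```

Trying every opening brace is quadratic in the worst case, which is usually acceptable for a single LLM response; a hand-written brace scanner would be needed to recover truncated objects.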
def parse_tool_call_json(text: str) -> dict | None

from backend.services.tool_runner import parse_tool_call_json, strip_tool_call_json
raw = 'Let me check that. {"tool": "web_search", "args": {"query": "Python 3.13 features"}}'
tc = parse_tool_call_json(raw)
# {"tool": "web_search", "args": {"query": "Python 3.13 features"}}
# Remove the JSON blob from the display text
display_text = strip_tool_call_json(raw)
# "Let me check that."

execute_tool_call()
Async dispatcher — takes a parsed tool call dict and returns a result string. The result is fed back into the LLM as a tool-result message for the next reasoning step.
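The dispatch step can be pictured as a lookup from tool name to coroutine. The sketch below is an assumption about the general shape, not the contents of executor.py: HANDLERS, register, dispatch, and the echo tool are all hypothetical, and the 4000-character cap is taken from the "max ~4000 chars" note on the return value.

```python
import asyncio
from typing import Awaitable, Callable, Dict

MAX_RESULT_CHARS = 4000  # assumption, based on the "max ~4000 chars" note

Handler = Callable[[dict], Awaitable[str]]
HANDLERS: Dict[str, Handler] = {}  # hypothetical registry: tool name -> coroutine


def register(name: str) -> Callable[[Handler], Handler]:
    """Decorator that adds a coroutine to the registry under a tool name."""
    def wrap(fn: Handler) -> Handler:
        HANDLERS[name] = fn
        return fn
    return wrap


@register("echo")  # stand-in tool used only for this illustration
async def _echo(args: dict) -> str:
    return str(args.get("text", ""))


async def dispatch(tc: dict) -> str:
    """Look up the handler for tc["tool"], await it, and cap the result size."""
    name = tc.get("tool")
    handler = HANDLERS.get(name)
    if handler is None:
        return f"Unknown tool: {name}"
    try:
        result = await handler(tc.get("args") or {})
    except Exception as exc:
        # Return failures as text so the LLM can react on its next turn.
        return f"Tool {name} failed: {exc}"
    return result[:MAX_RESULT_CHARS]
```

Returning errors as strings rather than raising keeps the reasoning loop going, since the result is fed back to the model either way.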
async def execute_tool_call(
    tc: dict,  # {"tool": "...", "args": {...}}
    db: Session,
    conversation_id: int,
) -> str  # short result string, max ~4000 chars

import asyncio
from backend.services.tool_runner import execute_tool_call
from backend.models.database import SessionLocal
db = SessionLocal()
try:
    result = asyncio.run(execute_tool_call(
        {"tool": "web_search", "args": {"query": "FastAPI streaming"}},
        db=db,
        conversation_id=1,
    ))
    print(result)  # search results string
finally:
    db.close()  # release the session even if the tool call raises

verify_tool_result()
Validates a tool result string — checks it's non-empty, not an obvious error message, and within size limits. Returns True if the result is usable for the next LLM turn.
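The three checks named above might look roughly like this. The sketch is illustrative only: looks_usable, ERROR_PREFIXES, and MAX_CHARS are hypothetical, and the real verifier may use different error markers and limits.

```python
ERROR_PREFIXES = ("error", "traceback", "exception")  # assumed error markers
MAX_CHARS = 4000  # assumed size limit, mirroring the executor's result cap


def looks_usable(result: str) -> bool:
    """Return True if the result is non-empty, not error-like, and not too long."""
    text = result.strip()
    if not text:
        return False
    # str.startswith accepts a tuple, so one call covers every prefix.
    if text.lower().startswith(ERROR_PREFIXES):
        return False
    return len(text) <= MAX_CHARS
```

A prefix test is deliberately crude; it will miss errors that appear mid-string and may reject legitimate results that happen to begin with "Error".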
def verify_tool_result(result: str) -> bool

Available tools
All tools available to the LLM via the tool runner:
| Tool name | Description |
|---|---|
| launch_app | Launch a desktop application by name. |
| list_apps | Return a list of known launchable apps. |
| spotify_play | Play a track or playlist by search query. |
| spotify_pause | Pause playback. |
| spotify_next | Skip to next track. |
| spotify_prev | Go to previous track. |
| spotify_queue | Queue a track by search query. |
| switch_audio | Switch default audio output device by name. |
| browse_url | Open a URL in the default browser. |
| create_task | Create a task with optional due date and priority. |
| create_event | Create a calendar event with datetime and duration. |
| web_search | Web search; returns text results. |
| web_research | Deep research query; returns a structured result with sources. |
| dataset_search | Find public datasets matching a query. |
| web_fetch | Fetch and return the text content of a URL. |
| web_download_file | Download a file from URL to a local path. |
| browser_open | Open a URL in a controlled browser session. |
| browser_read | Fetch the rendered DOM text of a URL. |
| workspace_read | Read a file from the workspace directory. |
| workspace_read_base64 | Read a binary file as base64. |
| workspace_write | Write text content to a workspace file. |
| workspace_write_base64 | Write binary content (base64) to a workspace file. |
| list_skills | List all available skill definitions. |
| create_agent_task | Create a delegated agent task with a description. |
| take_screenshot | Capture a screenshot. |
| get_active_window | Get the name of the currently focused window. |
| find_text_on_screen | OCR: find text at screen coordinates. |
| click_at | Simulate a mouse click at x, y. |
| type_text | Type text via keyboard simulation. |
System controls
The following tools map to backend/services/system_controls.py:
| Tool | Args |
|---|---|
| get_volume | none |
| set_volume | level: int (0–100) |
| mute_audio | none |
| unmute_audio | none |
| get_brightness | none |
| set_brightness | level: int (0–100) |
| lock_screen | none |
| turn_off_display | none |
| sleep_system | none |
| get_clipboard | none |
| set_clipboard | text: str |
| get_system_info | none |
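For the two tools that take a level argument, the documented 0–100 range suggests a check along these lines. This is a sketch under assumptions: parse_level and is_valid_level are hypothetical names, and whether the real code rejects or clamps out-of-range values is not stated here.

```python
def parse_level(args: dict) -> int:
    """Return args["level"] if it is an int in 0..100, else raise ValueError."""
    level = args.get("level")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(level, bool) or not isinstance(level, int):
        raise ValueError("level must be an integer")
    if not 0 <= level <= 100:
        raise ValueError("level must be between 0 and 100")
    return level


def is_valid_level(args: dict) -> bool:
    """Boolean wrapper around parse_level for callers that prefer not to catch."""
    try:
        parse_level(args)
    except ValueError:
        return False
    return True
```

A call such as {"tool": "set_volume", "args": {"level": 40}} would pass this check, while a level of 140 or "loud" would not.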
Workspace tools
All workspace operations are scoped to the configured workspace root (WORKSPACE_ROOT in .env). Paths outside the root are rejected.
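Rejecting paths outside the root is commonly done by resolving the requested path and checking that it still sits under the resolved root. The sketch below shows that general technique with pathlib; is_within_root and resolve_in_workspace are hypothetical names, and the actual workspace code may differ.

```python
from pathlib import Path


def is_within_root(root: Path, user_path: str) -> bool:
    """Return True if user_path, resolved against root, stays inside root."""
    resolved_root = root.resolve()
    # resolve() collapses ".." segments and follows symlinks, so traversal
    # attempts are normalised before the containment check.
    candidate = (resolved_root / user_path).resolve()
    try:
        candidate.relative_to(resolved_root)
    except ValueError:
        return False
    return True


def resolve_in_workspace(root: Path, user_path: str) -> Path:
    """Return the absolute path for user_path, or raise if it escapes root."""
    if not is_within_root(root, user_path):
        raise PermissionError(f"path escapes workspace root: {user_path}")
    return (root.resolve() / user_path).resolve()
```

Absolute inputs are covered as well: joining a root with an absolute path yields that absolute path, which then fails the containment check.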
{"tool": "workspace_read", "args": {"path": "src/main.py"}}
{"tool": "workspace_write", "args": {"path": "output.txt", "content": "hello world"}}
{"tool": "workspace_read_base64", "args": {"path": "assets/logo.png"}}
{"tool": "workspace_write_base64", "args": {"path": "export.png", "content_base64": "iVBOR..."}}

Workspace tool calls are logged (AuditLog table) with the tool name, args, and result length.

Browser tools
browser_open opens a URL in a managed headless browser session. browser_read fetches and returns rendered page text (DOM extraction, not raw HTML). Both calls are audited.
{"tool": "browser_read", "args": {"url": "https://docs.python.org/3/library/asyncio.html"}}
// Returns up to 4000 chars of the rendered page text

Screen tools
Screen tools use the screen_perception service for screenshot, OCR, and UI automation. Results are truncated to 200 characters and JSON-encoded.
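One plausible reading of "truncated to 200 characters and JSON-encoded" is that the raw text is cut first and then serialised, so the output is always valid JSON. The sketch below follows that reading; encode_screen_result is a hypothetical name and the order of operations is an assumption.

```python
import json

MAX_SCREEN_RESULT_CHARS = 200  # limit stated for screen tool results


def encode_screen_result(text: str) -> str:
    """Truncate text to the limit, then encode it as a JSON string literal."""
    # Truncating before encoding avoids cutting through an escape sequence
    # or the closing quote, which would leave invalid JSON.
    return json.dumps(text[:MAX_SCREEN_RESULT_CHARS])
```

Decoding the output with json.loads returns the truncated text unchanged.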
{"tool": "take_screenshot", "args": {}}
{"tool": "get_active_window", "args": {}}
{"tool": "find_text_on_screen", "args": {"text": "Submit"}}
{"tool": "click_at", "args": {"x": 540, "y": 320}}
{"tool": "type_text", "args": {"text": "Hello world"}}

Calendar and tasks
{"tool": "create_task", "args": {
  "title": "Review pull request",
  "due": "2025-06-15T09:00:00",
  "priority": "high"
}}
{"tool": "create_event", "args": {
  "title": "Team standup",
  "datetime": "2025-06-10T09:30:00",
  "duration": 15
}}

Both tools write to the SQLite tasks and calendar_events tables and are automatically surfaced by the scheduler's check_upcoming_events() and check_overdue_tasks().
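To make the write-then-surface flow concrete, here is a small sketch using an in-memory SQLite database. The column layout (title, due, priority) and the function find_overdue_tasks are assumptions for illustration; the real schema and the scheduler's check_overdue_tasks() may look quite different.

```python
import sqlite3
from datetime import datetime


def find_overdue_tasks(conn: sqlite3.Connection, now: datetime) -> list:
    """Return titles of tasks whose due time is earlier than now."""
    # ISO 8601 timestamps sort lexicographically, so a plain string
    # comparison in SQL matches chronological order.
    rows = conn.execute(
        "SELECT title FROM tasks WHERE due IS NOT NULL AND due < ? ORDER BY due",
        (now.isoformat(timespec="seconds"),),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical schema mirroring the create_task args shown above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (title TEXT, due TEXT, priority TEXT)")
conn.execute(
    "INSERT INTO tasks (title, due, priority) VALUES (?, ?, ?)",
    ("Review pull request", "2025-06-15T09:00:00", "high"),
)
conn.commit()
```

Calling find_overdue_tasks(conn, datetime(2025, 6, 16)) would report the task above, while a date before its due time would return an empty list.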