how we taught a slack bot to run code (safely-ish)

February 25, 2026

this blog post was a collaboration between two people: devarsh built gorkie itself, and i led the sandbox/runtime work, with devarsh helping throughout on testing, ideas, and sanity checks.

intro

gorkie started as a helpful ai slack assistant: answer questions, help with tasks, and be useful in threads. then users started asking for things that needed real execution, not just text.

examples:

  • "run this script"
  • "convert this file"
  • "download this thing"
  • "inspect this repo"
  • "do the annoying terminal thing for me"

at some point, you either say "sorry, i can't" or you give it a sandbox. we chose sandboxes.

what stayed constant

from phase 1 onward, a few things stayed important regardless of provider or agent:

  • syncing attachments into the sandbox
  • showing tool status (so users can see progress, not just "thinking…")
  • streaming updates back to slack instead of waiting for one giant final blob
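one way to sketch the streaming piece: accumulate model deltas and only edit the slack message every few chunks instead of once per token. this is an illustrative sketch, not gorkie's actual code (the names and batch policy are hypothetical, and in practice you would also throttle by time to respect rate limits):

```typescript
// Hypothetical sketch: coalesce streaming deltas into periodic message edits
// instead of one chat.update per token. Names are illustrative, not gorkie's.
type PostUpdate = (text: string) => void;

class StreamCoalescer {
  private buffer = "";
  private sinceLastPush = 0;

  constructor(
    private postUpdate: PostUpdate, // e.g. a wrapper around slack chat.update
    private batchSize = 20,         // push an edit every N deltas
  ) {}

  push(delta: string): void {
    this.buffer += delta;
    this.sinceLastPush += 1;
    if (this.sinceLastPush >= this.batchSize) {
      this.flush();
    }
  }

  // force a final edit so the message always ends with the full text
  flush(): void {
    if (this.sinceLastPush > 0) {
      this.postUpdate(this.buffer);
      this.sinceLastPush = 0;
    }
  }
}
```

the key property is that the last flush always carries the full accumulated text, so a dropped intermediate edit never loses content.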

we also use slack's ai implementation for the chat surface and interaction model, which lets us render interactive ui directly in slack.

phase 1: vercel sandboxes

we were already using vercel's ai sdk, so vercel sandboxes were the obvious first step. they had a generous free tier and were easy to try.

we gave gorkie a sandbox tool that spawned a ToolLoopAgent with tools to work inside the sandbox.

the flow looked like this:

  1. model decides it needs execution
  2. it calls a sandbox tool
  3. the sandbox runs a subagent and completes the task
  4. results go back to gorkie, then to slack
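the loop above can be sketched as plain sequencing. everything here is a stand-in for illustration, not gorkie's real internals:

```typescript
// Illustrative sketch of the phase-1 flow. All helpers are hypothetical
// stand-ins; the real version is async and runs a ToolLoopAgent subagent.
type Step = (log: string[]) => string;

function runSandboxTask(
  needsExecution: boolean,
  steps: { getSandbox: Step; runSubagent: Step },
): { log: string[]; result: string } {
  const log: string[] = [];
  if (!needsExecution) {
    return { log, result: "answered in chat" }; // step 1: no tool call needed
  }
  steps.getSandbox(log);                        // step 2: the sandbox tool fires
  const result = steps.runSubagent(log);        // step 3: subagent does the work
  return { log, result };                       // step 4: back to gorkie, then slack
}
```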

in code, this looked like get-or-create with snapshot restore plus redis TTLs for thread persistence:

// source: https://github.com/imdevarsh/gorkie-slack/blob/feat/sandbox/server/lib/ai/tools/execute-code/sandbox.ts
// reuse a live sandbox for this thread if one is still running
const live = await reconnect(ctxId);
if (live) {
  return live;
}

// otherwise try restoring the last snapshot, falling back to a fresh sandbox
const restored = await restoreFromSnapshot(ctxId);
const instance =
  restored ??
  (await Sandbox.create({
    runtime: config.runtime,
    timeout: config.timeoutMs,
  }));

// remember which sandbox this thread owns, with a TTL for cleanup
await redis.set(redisKeys.sandbox(ctxId), instance.sandboxId);
await redis.expire(redisKeys.sandbox(ctxId), config.sandboxTtlSeconds);

this made per-thread persistence possible before we had a full DB session model.

on shutdown, we also snapshot and save the snapshot id with a TTL:

// https://github.com/imdevarsh/gorkie-slack/blob/feat/sandbox/server/lib/ai/tools/execute-code/sandbox.ts
const snap = await instance.snapshot().catch((error: unknown) => {
  logger.warn({ sandboxId, error, ctxId }, 'Snapshot failed');
  return null;
});

if (snap) {
  await redis.set(redisKeys.snapshot(ctxId), snap.snapshotId);
  await redis.expire(redisKeys.snapshot(ctxId), config.snapshotTtlSeconds);
}

this let us resume work even after the running sandbox died.
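restoreFromSnapshot itself isn't shown above; here is a minimal sketch of what a helper like it might do, with `createFromSnapshot` as a hypothetical stand-in for the real provider call:

```typescript
// Hypothetical sketch of restoreFromSnapshot: look up the saved snapshot id
// in redis and rebuild a sandbox from it. `createFromSnapshot` is a stand-in;
// the actual provider API differs.
type RedisLike = { get(key: string): Promise<string | null> };
type SandboxLike = { sandboxId: string };

async function restoreFromSnapshot(
  ctxId: string,
  redis: RedisLike,
  createFromSnapshot: (snapshotId: string) => Promise<SandboxLike>,
): Promise<SandboxLike | null> {
  const snapshotId = await redis.get(`snapshot:${ctxId}`);
  if (!snapshotId) {
    return null; // no snapshot saved for this thread; caller creates fresh
  }
  return createFromSnapshot(snapshotId);
}
```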

it worked, but it felt janky for anything beyond short runs.

what sucked

  • not as advanced as e2b/daytona-style sandboxes
  • limited lifecycle controls
  • snapshot support was not where we needed it
  • random bugs and weirdness
  • easy-to-hit limits (bandwidth, storage, etc.)

phase 1 proved sandboxes made gorkie much more useful, but it wasn't the foundation we wanted.

phase 2: e2b

next we moved to e2b. e2b is built for ai-agent execution, and it felt like a more serious provider.

we kept the same architecture (tool loop agent), only swapping the sandbox layer. overall, this phase worked fine for users.

at the data layer, we made thread ownership explicit with a thread-to-sandbox session table:

// source: https://github.com/imdevarsh/gorkie-slack/blob/main/server/db/schema.ts
export const sandboxSessions = pgTable('sandbox_sessions', {
  threadId: text('thread_id').primaryKey(),
  sandboxId: text('sandbox_id').notNull(),
  status: text('status').notNull().default('creating'),
  pausedAt: timestamp('paused_at', { withTimezone: true }),
  resumedAt: timestamp('resumed_at', { withTimezone: true }),
  destroyedAt: timestamp('destroyed_at', { withTimezone: true }),
});

this made thread -> sandbox ownership durable across restarts. cleanup, though, still lived on our side via janitor jobs for expired sessions, and that is exactly where the operational burden started to grow.
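a janitor pass over that table boils down to a filter plus a teardown call. a sketch, where `lastActiveAt` and the ttl policy are hypothetical, not the actual schema:

```typescript
// Hypothetical sketch of the janitor filter we ran on top of e2b: find
// sessions idle past their TTL so they can be torn down. Field names only
// loosely mirror the real schema.
type Session = {
  threadId: string;
  lastActiveAt: number;        // epoch millis of last use (hypothetical field)
  destroyedAt: number | null;  // already cleaned up if non-null
};

function findExpired(sessions: Session[], now: number, ttlMs: number): Session[] {
  return sessions.filter(
    (s) => s.destroyedAt === null && now - s.lastActiveAt > ttlMs,
  );
}
```

the actual job would then destroy each expired sandbox and mark the row; the point is that this scheduling and bookkeeping was all ours to run.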

but then reality hit

as usage grew, we realized the provider was not the biggest issue. the bigger issue was our tool loop architecture.

it handled short tasks, but it struggled with long-term runtime needs:

  • compaction
  • skills
  • mcp plumbing
  • coordination and orchestration
  • stateful session handling
  • retries and failure recovery without spaghetti

in practice, we were slowly building our own runtime by accident.

we also hit provider-level friction with e2b:

  • we had to keep extending timeouts
  • persistence existed, but it was beta and sometimes buggy
  • the model felt closer to "create sandbox, kill sandbox" than true persistence

what we wanted was persistence-first from day one.

phase 3: daytona

then we switched to daytona. daytona is built around lifecycle control and persistence, which matched our model much better.

daytona fit because:

  • great dx
  • persistence-first design
  • cleaner lifecycle controls
  • a natural mapping to "this thread owns a runtime"

state model

we persist sandbox state in our database so each thread can reattach to its runtime. the core model is:

  1. threadId -> sandboxId + sessionId
  2. state transitions (active, paused, resuming, error, destroyed)
  3. automatic reattach on the next message in the same thread
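the reattach step reduces to mapping the stored status onto a lifecycle action. a sketch with illustrative action names (the statuses follow the state list above):

```typescript
// Sketch of the reattach decision on a new message in a thread. Action names
// are illustrative, not gorkie's actual identifiers.
type Status = "active" | "paused" | "resuming" | "error" | "destroyed";
type Action = "attach" | "resume" | "wait" | "recreate";

function nextAction(status: Status | undefined): Action {
  switch (status) {
    case "active":
      return "attach"; // runtime is live; just reconnect to it
    case "paused":
      return "resume"; // a stopped sandbox can be woken up
    case "resuming":
      return "wait"; // another message already triggered a resume
    case "error":
    case "destroyed":
    case undefined:
      return "recreate"; // no usable runtime; start fresh
  }
}
```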

daytona lifecycle settings made cleanup simple:

  • auto-stop after 5 minutes of inactivity
  • auto-archive 2 hours after stop
  • auto-delete after 2 days

no custom janitor needed.

with daytona, lifecycle and session mapping became much simpler in code:

// source: https://github.com/imdevarsh/gorkie-slack/blob/feat/daytona-pi/server/lib/sandbox/session.ts
const sandbox = await daytona.create({
  autoStopInterval: config.timeouts.stopMinutes,
  autoArchiveInterval: config.timeouts.archiveMinutes,
  autoDeleteInterval: config.timeouts.deleteMinutes,
  snapshot: SANDBOX_SNAPSHOT,
});

await upsert({
  threadId,
  sandboxId: sandbox.id,
  sessionId: session.id,
  status: 'active',
});

cleanup shifted from our cron logic to platform lifecycle settings.

once daytona was in place, the next question was: what coding agent should run inside the sandbox?

phase 4: sandbox agent + opencode

we tried sandbox agent by rivet.dev, with opencode behind it.

sandbox agent runs inside the sandbox, exposes an http interface, and communicates over acp. to keep sandbox-agent stable, we had to boot, health-check, and restart it when needed. that made the server-in-sandbox shape survivable, but it was still extra moving parts.
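the babysitting can be sketched as a bounded supervisor, where `check` and `restart` are stand-ins for the real health probe and boot command:

```typescript
// Hypothetical sketch of the supervision we did for sandbox-agent: probe the
// in-sandbox http server and restart it a bounded number of times before
// giving up. `check` and `restart` are stand-ins for real probes/boots.
async function superviseOnce(
  check: () => Promise<boolean>,
  restart: () => Promise<void>,
  maxRestarts: number,
): Promise<boolean> {
  for (let attempt = 0; attempt <= maxRestarts; attempt++) {
    if (await check()) {
      return true; // agent is healthy
    }
    if (attempt < maxRestarts) {
      await restart(); // kick it and probe again
    }
  }
  return false; // still down after exhausting the restart budget
}
```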

why it was better

it offloaded orchestration to something purpose-built.

why we still moved on

it added too many failure points:

  • an http server inside the sandbox
  • more moving parts
  • more "if this dies, everything dies"

opencode also felt over-engineered for our use case and used more tokens than we wanted.

phase 5: pi with sandbox agent

next we tried pi (a coding agent) with sandbox agent.

pi gave us what we cared about:

  • skills
  • mcps
  • easy extensibility

this phase was better, but the core shape was still "http server inside sandbox," and we still saw reliability issues. crashes happen. we wanted boring infra.

phase 6: pi rpc

the cleanest setup was rpc with pi.

at this point, our two integration paths were acp/http-server mode or rpc, and rpc won for simplicity.

rpc is basically:

  • send json
  • get json back

no extra daemon, no internal http listener to babysit, just a process.

what we built

we built a small rpc client inspired by pi's own client.

flow:

  1. gorkie decides execution is needed
  2. we spin up or attach to the thread's daytona sandbox
  3. we start pi in rpc mode
  4. rpc client sends prompt + context
  5. pi runs commands/edits
  6. rpc returns structured output
  7. we stream progress/results back to slack

rpc path details:

  • launch pi --mode rpc inside a PTY
  • disable PTY echo so JSON is not mirrored back
  • use newline-delimited JSON messages
  • resume with pi --mode rpc --session <id>

the rpc boot path was just PTY startup, stty -echo, and optional session resume:

// source: https://github.com/imdevarsh/gorkie-slack/blob/feat/refactor-pi/server/lib/sandbox/rpc.ts
const piCmd = sessionId
  ? `pi --mode rpc --session ${sessionId}`
  : 'pi --mode rpc';

await pty.sendInput(`stty -echo; exec ${piCmd}\n`);
await client.waitUntilReady();

this removed the internal http server and cut down the failure points. events came back as newline-delimited json over the pty stream, which kept the transport simple and robust.
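the receiving side of that is a tiny newline-delimited json decoder: buffer raw pty chunks until a full line exists, then parse each complete line. a sketch, not gorkie's actual client:

```typescript
// Minimal NDJSON decoder of the kind the rpc client needs: pty output
// arrives in arbitrary chunks, so buffer until a newline completes a
// message, then JSON.parse each full line. A sketch, not the real client.
class NdjsonDecoder {
  private buffer = "";

  // Feed a raw chunk; returns every complete JSON message it contained.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? ""; // keep the trailing partial line
    return lines
      .filter((l) => l.trim().length > 0)
      .map((l) => JSON.parse(l));
  }
}
```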

this was simple, fast, and reliable.

tools with pi

pi also made tool integration much cleaner.

before (sandbox agent + opencode), tool wiring looked like:

  • define custom mcp server
  • run that MCP server in the sandbox
  • connect the MCP server to the coding agent (opencode)
  • intercept the tool call
  • route it

with pi, we can:

  • define a custom tool
  • plug it in directly
  • add status and metadata
  • extend without inventing more plumbing

for tools, we registered handlers directly (for example showFile for Slack uploads) with validation built in:

// source: https://github.com/imdevarsh/gorkie-slack/blob/feat/refactor-pi/server/lib/sandbox/config/extensions/tools.ts
pi.registerTool({
  name: 'showFile',
  label: 'showFile',
  description:
    'Signal the host to upload a sandbox file to Slack once it is ready.',
  parameters: showFileParams,
  execute: (_toolCallId, params) => {
    const { path, title } = params as Static<typeof showFileParams>;
    if (!nodePath.isAbsolute(path)) {
      throw new Error('showFile.path must be absolute');
    }
    return Promise.resolve({
      content: [{ type: 'text' as const, text: `Queued upload for ${path}` }],
      details: { path, title: title ?? null },
    });
  },
});

this made tools first-class instead of pushing them through an MCP server.

current architecture

demo video

lessons learned

the stuff we wish we knew earlier:

  • sandboxes are easy; orchestration is the real boss fight
  • "just build it" becomes "we accidentally built a runtime" very quickly
  • persistence-first is a different world from spin-up/spin-down
  • fewer moving parts wins almost every time
  • running an http server inside your sandbox sounds cool until it isn't

closing

sandboxes upgraded gorkie from a helpful chatbot to something that can actually do work.

moving to daytona + pi rpc made the system:

  • simpler
  • cheaper (tokens + complexity)
  • more reliable
  • easier to extend

now we can spend time on features instead of babysitting infra.

vercel also launched a new Chat SDK that supports multiple runtimes. we may rewrite gorkie with the Chat SDK in the future.

code

the codebase is available on github:

Gorkie Slack (feat/refactor-pi)

credits

this blog post's style is inspired by xyzeva. this is probably the first blog post i've ever written 😭.

Written by

Anirudh & Devarsh
