I run Claude and Codex on a loop on my Mac mini — they clear my to-do list while I sleep

I have a Mac mini in the corner of my office that I never log into. Its only job is to watch my to-do list and quietly do the things I’ve assigned to an AI. By the time I sit down in the morning, a handful of tasks are already sitting at “ready for review” — a refactor opened as a pull request, a draft written, a script run. I check them off, send a couple back, and get on with my day.

This post is how that setup works, and why a to-do list with an API and an MCP server turned out to be the missing piece.

The idea: a list that agents can pull from

Coding agents like Claude Code and OpenAI Codex are great at doing a well-scoped task. The hard part is feeding them work and getting the results back somewhere you’ll actually look. People hack this together with Supabase queue tables, GitHub issues, or text files. I wanted the queue to just be my real to-do list — the same one I use for “buy milk”.

That’s the whole reason I built Lume the way I did. Every task is reachable over a REST API and an MCP server, and every task has an assignee. Assign one to claude or codex and it’s now a unit of work an agent can find.

The loop

The core is embarrassingly simple — a shell loop that asks Lume for the next task assigned to an agent, runs the agent on it, and marks the result for review:

#!/usr/bin/env bash
set -euo pipefail

while true; do
  # Pull the next open task assigned to Codex
  task=$(curl -s "$LUME_API/v1/tasks?assignee=codex&limit=1" \
    -H "Authorization: Bearer $LUME_KEY" | jq -r '.tasks[0] // empty')

  if [ -n "$task" ]; then
    id=$(echo "$task"   | jq -r '.id')
    title=$(echo "$task"| jq -r '.title')
    notes=$(echo "$task"| jq -r '.notes')

    # Hand it to the agent in a fresh worktree
    codex exec "$title\n\n$notes" || true

    # Mark it ready for review (never "done" — that's my call)
    curl -s -X PATCH "$LUME_API/v1/tasks/$id" \
      -H "Authorization: Bearer $LUME_KEY" \
      -d '{ "status": "needs_review" }'
  fi

  sleep 30
done

That’s the toy version. The real one does a few more things: it works in a throwaway git worktree so parallel tasks don’t collide, it captures the agent’s output and posts it back into the task’s notes, and it refuses to touch anything that isn’t explicitly assigned to an agent.

Keeping it alive with launchd

A while true loop in a terminal dies the moment the SSH session drops. On macOS the right tool is launchd. I wrap the script in a LaunchAgent with KeepAlive set, so it restarts if it ever exits, and survives reboots:

<key>KeepAlive</key><true/>
<key>RunAtLoad</key><true/>
<key>StandardErrorPath</key>
<string>/tmp/lume-codex.err.log</string>

launchctl kickstart -k restarts it on demand, and the logs go somewhere I can tail when an agent gets stuck. That last part matters more than you’d think — agents hang, and you want to see it.

Why MCP, not just the API

The shell loop uses the REST API because it’s easy to script. But for interactive work I connect Claude Code to Lume over MCP directly. Then I can sit in my editor and say “what did you finish overnight?” and Claude reads the list, summarizes the needs_review tasks, and walks me through them. Same data, two front doors.

The one rule: agents never check the box

The thing that makes this safe to leave running unattended is a hard rule: agents can only move a task to “ready for review”, never to done. Completion is a human action. So the worst case isn’t a wrong thing shipped — it’s a wrong thing sitting in my review queue that I reject in two seconds. And because every change in Lume is undoable, even a bad bulk edit is one keystroke to reverse.

What it’s actually good for

Not everything. I don’t assign judgment calls or anything irreversible. But a surprising amount of my list is well-scoped grunt work:

“Refactor the billing webhook to use the new event types”
“Write first-draft release notes from the last 20 commits”
“Add tests for the date-parsing helper”
“Research three options for X and put the findings in the notes”

Each of those is a task I’d normally carry for days. Now I assign it, forget it, and review the result. The Mac mini doesn’t get tired, and it doesn’t context-switch.

Try it without the Mac mini

You don’t need a server to start — you can assign a task to Claude Code or Codex from the app and watch it work. The loop is just the “while I sleep” version of the same idea. If you want the full setup, the assign-to-agents page covers how dispatch works, and Lume is free to start.