A Couple Clever Ideas in AI Agent Programming
1. Inbox Sweeping
Here’s a clever idea I found courtesy of Pete Stei, the founder of openclaw, via his project opensweeper. Imagine you have a large inbox of open issues, tasks, or messages: for instance, the issues or open PRs on a large open source repo, a project management board at your startup, or some other TODO list.
How do you get your AI to automatically help you tackle those items? The first obvious thought is to set up some kind of process that watches the inbox for changes, then spins up an agent when new tasks come in. Another thought is to have a cron job that checks for new tasks and then spins up an agent to tackle them.
The problem with these naive solutions is that they treat AI agents as too deterministic, like a regular cron job with a queue. AI agents are non-deterministic software. The models change, the context changes, the environment changes, the CLAUDE.md file changes, new skills get built and introduced, and we have multiple underlying models that might want to attempt the problem (Codex, Claude Code, Gemini, etc.).
To take advantage of this non-deterministic nature, we want something more powerful than a simple watcher daemon or a cron job. We want a sweeper. A sweeper is going to run many agents, in parallel, at all times, constantly trying to resolve all open issues and either bring them to a close or explicitly decide to leave them open, and for good reason (just like a human!).
The architecture of a sweeper is simple:
- README.md
- AGENTS.md
- decision-schema.json
- prompts/review-prompt.md
- /items
- /closed
- /src
  - Audit
  - Schedule
  - Review
  - Apply
I won’t go through every piece of this, but I do recommend reviewing opensweeper for inspiration. The gist is this:
- The sweeper works via 4 functions: Audit, Schedule, Review, and Apply
- Audit shows the difference between the sweeper’s file system (all the tickets in /items and /closed and their status) and reality (i.e. the source of truth, say, GitHub). This is useful for knowing whether any tickets have been re-opened, how many open tickets we have yet to review for the first time, and whether a ticket has new comments or has been updated since the last time we reviewed it.
- Schedule decides how often to scan various items. Hot or new items get scanned more frequently; older or more stale items may only get scanned once a week, or even once a month.
- Review is the first step of scanning an item. Items are sent in batches to scanner agents, which use the highest-powered agent possible along with prompts/review-prompt.md to produce a decision-schema.json, along with evidence, a suggested comment, and some metadata. This review gets saved to items/<number>.md so future scans can reference it, read it, and amend it. The two main possible decisions are keep_open and proposed_close. It also reports a confidence. The review agent is READ-ONLY with respect to the outside world (GitHub, email, etc.).
- Apply reads the reviews in /items and takes actions that modify the outside world on the advice of the report, but only when the report is still valid. Then it moves any closed items to /closed/<number>.md. Apply can wake frequently, like every 15 minutes, and simply does nothing if there is no work available. Apply only acts on high-confidence tickets that also have the decision proposed_close; otherwise the system just keeps the ticket open.
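To make the split between the functions concrete, here is a minimal Python sketch of Audit, Schedule, and the Apply gate. It is an illustration under stated assumptions, not opensweeper's actual code: the `Review` dataclass, its field names, the 0.9 confidence threshold, and the scan intervals are all made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical threshold; the post only says "high-confidence".
HIGH_CONFIDENCE = 0.9


@dataclass
class Review:
    number: int
    decision: str          # "keep_open" or "proposed_close"
    confidence: float      # 0.0 to 1.0
    reviewed_at: datetime  # when the read-only review agent last looked


def audit(local: dict[int, Review],
          remote_updated: dict[int, datetime]) -> dict[str, list[int]]:
    """Diff the sweeper's files against the source of truth.

    remote_updated maps each open remote item to its last-updated time.
    """
    never_reviewed = sorted(n for n in remote_updated if n not in local)
    stale = sorted(
        n for n, updated in remote_updated.items()
        if n in local and updated > local[n].reviewed_at
    )
    return {"never_reviewed": never_reviewed, "stale": stale}


def scan_interval(age: timedelta) -> timedelta:
    """Schedule: hot items are scanned often, old ones rarely (made-up cutoffs)."""
    if age < timedelta(days=2):
        return timedelta(hours=1)
    if age < timedelta(days=30):
        return timedelta(weeks=1)
    return timedelta(days=30)


def should_apply(review: Review, remote_updated_at: datetime) -> bool:
    """Apply: act only on still-valid, high-confidence proposed_close reviews."""
    still_valid = remote_updated_at <= review.reviewed_at
    return (
        review.decision == "proposed_close"
        and review.confidence >= HIGH_CONFIDENCE
        and still_valid
    )
```

Review is left out on purpose: it is the only step that needs a model call, and everything else is ordinary bookkeeping around it.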
None of that is particularly important. What I think matters most is the idea of constantly sweeping open problems, forever, and trying them over and over again (like a human would!) until you reach a high-confidence decision, either because you have new information, or because you are a new model with new and better skills.
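A rough sketch of that retry-until-confident loop, in Python. Everything here is an assumption for illustration: a "reviewer" is just a function standing in for an agent run, the list of reviewers stands in for "a new model or new skills showing up over time", and the 0.9 threshold is arbitrary. A real sweeper would loop forever rather than over a fixed list.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# A reviewer takes an item number and returns (decision, confidence).
Reviewer = Callable[[int], tuple[str, float]]


def sweep_once(open_items: list[int], reviewer: Reviewer,
               threshold: float = 0.9) -> dict[int, str]:
    """Review every open item in parallel; return only the confident decisions."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(reviewer, open_items))
    return {
        item: decision
        for item, (decision, confidence) in zip(open_items, results)
        if confidence >= threshold
    }


def sweep_until_settled(open_items: list[int],
                        reviewers: list[Reviewer]) -> dict[int, str]:
    """Each pass may use a different model or prompt; unsettled items carry over."""
    settled: dict[int, str] = {}
    remaining = list(open_items)
    for reviewer in reviewers:
        decided = sweep_once(remaining, reviewer)
        settled.update(decided)
        remaining = [i for i in remaining if i not in decided]
    return settled
```

Items that never reach the threshold simply stay in the open set and get picked up again on the next pass.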
The other important point is just having really light schemas for review — like, just build a giant directory with one markdown file for every open issue. That actually just… works! Elegant :)
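As a rough sketch of how light that can be, the snippet below writes one review to items/<number>.md and reads it back. The "key: value" header layout and the field names are assumptions for illustration, not opensweeper's actual file format.

```python
from pathlib import Path


def write_review(items_dir: Path, number: int, decision: str,
                 confidence: float, evidence: str) -> Path:
    """Save one review as items/<number>.md (hypothetical layout)."""
    items_dir.mkdir(parents=True, exist_ok=True)
    path = items_dir / f"{number}.md"
    path.write_text(
        f"decision: {decision}\n"
        f"confidence: {confidence}\n"
        "\n"
        f"{evidence}\n",
        encoding="utf-8",
    )
    return path


def read_review(path: Path) -> dict:
    """Parse the 'key: value' header lines; the rest is free-form evidence."""
    header, _, body = path.read_text(encoding="utf-8").partition("\n\n")
    fields = dict(line.split(": ", 1) for line in header.splitlines())
    return {
        "decision": fields["decision"],
        "confidence": float(fields["confidence"]),
        "evidence": body.strip(),
    }
```

Because each file is plain markdown, a later review agent (or a human) can open it, read the earlier reasoning, and amend it without any special tooling.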
2. Directories as Agents
Okay here’s another neat idea. A lot of people think agents work best when you take care of your file system, great, sure, we all know that, snooze.
But here’s a thought: what if every folder that has an AGENT.md file is literally an agent?
Well what does that even mean?
Imagine you have a daemon process (some kind of process which watches for file changes on your desktop). Okay, now a new file is added to one of your folders — let’s call that folder /photo_gallery. No surprises yet, but your daemon is alerted that there has been a file change. It looks for a /photo_gallery/AGENT.md and if there IS one, it is going to spin up your handy-dandy Photo Gallery agent, which will have all the instructions it needs to help you process your photo. Maybe your Photo Gallery agent helps you automatically send funny or relevant photos to your friends. Or maybe it automatically helps you tag photos with metadata, like location, people, etc… Or maybe it checks to see if the photo is a receipt, and then automatically moves it to /receipts, where YES, there is a /receipts/AGENT.md which gets spun up, which then reads the receipt photo and files it nicely with your bookkeeping software.
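Here is a minimal Python sketch of the dispatch step the daemon would run for each file change. It assumes the file watching itself is handled elsewhere; `on_file_change` and the `spawn` callback are hypothetical names, and skipping edits to AGENT.md itself is my own design choice, not something from the original idea.

```python
from pathlib import Path
from typing import Callable, Optional


def find_agent_file(changed: Path) -> Optional[Path]:
    """Return the AGENT.md next to the changed file, if the folder has one."""
    candidate = changed.parent / "AGENT.md"
    return candidate if candidate.is_file() else None


def on_file_change(changed: Path, spawn: Callable[[Path, Path], None]) -> bool:
    """Called by the daemon for each change; returns True if an agent was spawned."""
    if changed.name == "AGENT.md":
        return False  # assumption: editing the instructions should not trigger the agent
    agent_file = find_agent_file(changed)
    if agent_file is None:
        return False  # a plain folder, not an agent
    spawn(agent_file, changed)  # hypothetical hook: start an agent with these instructions
    return True
```

If the Photo Gallery agent moves a receipt into /receipts, that move is itself a file change, so the same function fires again and picks up /receipts/AGENT.md. That is all the "forwarding" between folders needs.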
Okay, so what?
Well, do you see how folders become composable applications, which automatically process data and forward it around? Folders become agentic apps! So then how would we distribute this fancy software? Just zip it up and send it to me. I will unzip it and put it on my desktop, where my daemon process will automatically spin up the app any time a file is changed! It’s so elegant! :)
I got this idea from Kieran Sobel at our most recent Claude Party.