Not subscribed? Sign up to get it in your inbox every week.

Hi {{first_name_tally|Operator}},

We’ve got some exciting stuff coming up in the next few weeks, but I’ve been heads-down testing one of the newest tools on the block from Airtable. This week’s issue is a little different from normal, so hit reply and let us know what you think!

How’s your week been going?

What's the #1 issue you have been working on in the last 7 days?


-Michael

PRESENTED BY 3RD BRAIN
3rdbrain.co

3rdBrain embeds vetted experts in tools like Clay, ClickUp, Notion, Airtable, Claude Code, N8N, Make, Zapier, and more directly into operator-led teams on hourly or monthly contracts. Operations, automation, and AI experts who build until the wins compound. All ravenous self-learners.

Disclosure: 3rdBrain is a company I founded. This week's link skips the sales team and goes straight to my calendar.

Field Test: 3 AI Agents. One 150-Page Playbook. Here's What Happened.

I have a 150-page media outreach playbook from a course I bought 5 years ago. It worked well enough at the time, but I never went back to it after the first round of pitches. I have handed it to a few VAs and EAs over the years, with mixed results that were mid at best.

You probably have a folder full of SOPs (and how-to courses) that worked once and never saw the light of day after their first year, too. Running a playbook manually is a full-time job, which is why operators like us write them for our teams in the first place.

On Friday of last week I got access to Hyperagent from Airtable. I had read the Substack and signed up for the waitlist because I wanted to see how it compared to Claude Cowork and OpenClaw, my daily driver agents.

So I decided to run an experiment: hand the same playbook to three different agent harnesses, work with each one for ~4 hrs, and see how far we could go. I didn't count time spent installing or setting up the harnesses themselves; the clock started when each one got the playbook. From there we needed to integrate them with the tools for the job, update the workflow and email templates, and then actually do the research and draft pitches for my final approval.

Opus 4.6 was the model behind the mask on all 3. It's my (and my team's) daily driver for most agentic knowledge work now, no matter the harness, though it is worth noting that both OpenClaw and Hyperagent can run any model.

The Competitors

Hyperagent launched February 19th. It's Airtable's standalone agent platform, currently in beta. Every session gets its own cloud computer with a browser and code execution, plus hundreds of integrations out of the box via Composio. The key differentiator: it builds skills and saves memories ("learnings," as they call them) as it works, essentially teaching itself so it gets better at your specific workflows over time.

Claude Cowork dropped in January and it's been everywhere since. It's essentially Claude Code for anything that's not code: you point it at a folder or web tab on your machine, give it instructions, and it executes. I've been using it daily on Windows since release and loving the power and flexibility it provides.

OpenClaw is the open-source darling, with its founder now acquihired and funded by OpenAI. You host it yourself (often on a Mac Mini), point it at your own APIs/tools/skills, and let it run autonomously. We've all seen the hype, and maybe you've felt the reality as well.

The Test

The task was something we've all done with an intern or new hire, just with a 4-hour time limit: if I can't get a tool set up and producing in half a day, the learning curve is too steep for most new hires.

  1. Give them the playbook

  2. Set up the tools needed to execute it (media databases, email verification, Gmail, CRM)

  3. Update the five-year-old templates to sound like a human wrote them in 2026

  4. Research relevant media targets

  5. Draft personalized pitches ready to send

Hyperagent stumbled at the start due to context length, but once we got it to extract the text from the PDF, it walked me through the integrations (even building its own skill to access the API for the media research tool). It took about 45 min in total to get it through setup; then we updated the pitch template, added email verification, and created a rubric for it to maintain quality. In about 4 hours of work total we had 30 pitches drafted in my Gmail, ready to send.
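For a sense of what a pitch rubric could look like if you wrote one by hand, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function name `score_pitch`, the four checks, the banned phrases, and the 150-word limit are made up, and this is not the rubric Hyperagent actually built.

```python
# Hypothetical rubric sketch: none of these checks come from the real run.
def score_pitch(pitch, recipient, outlet,
                banned_phrases=("game changer", "rocket ship"),
                max_words=150):
    """Return (checks, score), where score is the number of checks passed."""
    text = pitch.lower()
    checks = {
        # Personalization: the draft names the person and their outlet.
        "mentions_recipient": recipient.lower() in text,
        "mentions_outlet": outlet.lower() in text,
        # Brevity: the draft stays under the (assumed) word limit.
        "within_length": len(pitch.split()) <= max_words,
        # Style: the draft avoids the (assumed) list of tired phrases.
        "no_banned_phrases": not any(p in text for p in banned_phrases),
    }
    return checks, sum(checks.values())


good_draft = (
    "Hi Dana, I enjoyed your Example Daily piece on hiring. "
    "Would a short note on agent workflows be useful to your readers?"
)
checks, score = score_pitch(good_draft, "Dana", "Example Daily")
print(checks, score)
```

A draft that fails a check can be sent back to the agent with the failing check names, which is roughly the loop a rubric is meant to automate.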

Claude Cowork needed me to leave the computer running and hogged the RAM. It took about 2 hours to get everything set up (barring Gmail), and it was clumsier with the media search since it didn't wrap the API in a skill (nor did I ask it to, though I would in my day-to-day work). I had it write the drafts directly in Gmail, which again hogged the browser and RAM. The quality of the writing was the same.

I had hopes for OpenClaw, but it struggled with the length of the PDF from the start, just like Hyperagent. It crashed a few times while we were trying to smooth out the workflow, and the context management eventually frustrated me enough that I gave up before we got through the workflow updates.

The Scoreboard

| | Hyperagent | Cowork | OpenClaw |
|---|---|---|---|
| Setup to first output | ~45 min | ~2 hrs | DNF |
| Pitches drafted in 4 hrs | 30 | 6 | 0 |
| Cost | $140.71 (usage) | $125/mo (flat) | ~$50 in API tokens |
| Skill/memory building | Yes (automatic) | No | No (crashed first) |
| Runs on | Cloud (Airtable) | Your machine | VM on Fly.io |
| RAM/resource impact | Zero (cloud) | Heavy | N/A (remote VM) |
| Handholding required | Low (built its own rubric) | Moderate | High (before crash) |
| Writing quality | Strong | Same | N/A |
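To put the cost and output rows side by side, here is a rough cost-per-pitch calculation in Python. The dollar figures and pitch counts are copied from the scoreboard; the framing is my assumption. In particular, it charges Cowork's full flat monthly fee to this one 4-hour run, which overstates its cost if you use the plan for anything else, and the OpenClaw token spend is approximate.

```python
# Figures from the scoreboard. Treating each cost as the price of this one
# 4-hour run is an assumption, not how the plans are actually billed.
runs = {
    "Hyperagent": {"cost_usd": 140.71, "pitches": 30},  # usage-based
    "Cowork": {"cost_usd": 125.00, "pitches": 6},       # flat monthly fee, counted in full
    "OpenClaw": {"cost_usd": 50.00, "pitches": 0},      # approximate API spend, did not finish
}
HOURS = 4


def cost_per_pitch(cost_usd, pitches):
    """Dollars per drafted pitch, or None when nothing was drafted."""
    if pitches == 0:
        return None
    return round(cost_usd / pitches, 2)


def summarize(runs, hours=HOURS):
    """Build a per-harness summary of cost per pitch and pitches per hour."""
    return {
        name: {
            "cost_per_pitch": cost_per_pitch(r["cost_usd"], r["pitches"]),
            "pitches_per_hour": r["pitches"] / hours,
        }
        for name, r in runs.items()
    }


summary = summarize(runs)
for name, row in summary.items():
    print(name, row)
```

On these assumptions Hyperagent comes out at about $4.69 per drafted pitch and Cowork at about $20.83, while OpenClaw has no per-pitch figure because it drafted none.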

Where Hyperagent wins: It learned while it worked, and that was legitimately impressive. It wrapped the API of the media research tool in a skill without me asking, offered to create a rubric to maintain quality across drafts (and then did, in 1 click), and remembered from one pitch to the next what I liked and that I hate metaphors.

Each draft needed less from me than the last.

[Image: Hyperagent agent summary]

It's not AGI-level like they're claiming, but the skill and memory system made it feel less like prompting a tool and more like onboarding an employee who takes good notes. The cloud execution means your machine stays free. You could run 10 instances at once without slowing down your computer.

It was legitimately fun to use, which is rare in tools for business cases. Not quite "first time with ChatGPT" but definitely the first time in a long while where I reached out to the PM and asked for 4 more invites to add the team.

Where Cowork wins: If you're already on a Claude plan, Cowork costs you nothing extra. It can touch your local files, work in your browser tabs, and for many tasks it's still the most flexible agent, which is why I use it daily.

But the infrastructure gap is real. Cowork doesn't auto-build skills or carry memory between steps, and it often made the same mistake twice. If I'd asked it to wrap the media DB API in a skill the way Hyperagent did automatically, the gap would have narrowed. But that's something I'd have to train every non-technical team member to do, along with how to find the API docs to give it.
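If "wrap the API in a skill" sounds abstract, the idea is roughly this: put the raw API call behind one small, named function the agent can reuse instead of rediscovering the endpoint every time. The Python sketch below is hypothetical throughout. The URL, the `contacts` endpoint, the response shape, and names like `make_media_search_skill` are invented for illustration and do not describe the real media research tool or how Hyperagent stores its skills. The HTTP call is injected as a plain function so the example runs without a network connection.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class MediaContact:
    name: str
    outlet: str
    email: str


# Any function that takes (url, params) and returns parsed JSON as a dict.
FetchJson = Callable[[str, Dict[str, str]], dict]


def make_media_search_skill(fetch_json: FetchJson,
                            base_url: str = "https://media-db.example/api"):
    """Return a reusable search(topic, limit) function bound to one API.

    base_url and the 'contacts' endpoint are placeholders, not a real service.
    """
    def search(topic: str, limit: int = 10) -> List[MediaContact]:
        payload = fetch_json(f"{base_url}/contacts",
                             {"topic": topic, "limit": str(limit)})
        results = payload.get("results", [])[:limit]
        return [MediaContact(c["name"], c["outlet"], c["email"]) for c in results]
    return search


def fake_fetch(url: str, params: Dict[str, str]) -> dict:
    """Stand-in for a real HTTP client, returning canned data."""
    return {"results": [{"name": "Sample Reporter",
                         "outlet": "Example Daily",
                         "email": "reporter@example.com"}]}


search_media = make_media_search_skill(fake_fetch)
contacts = search_media("operations automation", limit=5)
print(contacts)
```

The point of the wrapper is that a non-technical teammate only ever has to say "search the media database for X"; finding the API docs and writing the call happens once.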

OpenClaw failed this test entirely. It might be user error, or it may just be a fragile system. I still like it and I'll keep using it for its proactive scheduling, but I see a time on the horizon when it drops from our stack in favor of cloud-native, friendlier infrastructure like Hyperagent. It's still super cool, but I haven't seen the hype pay off yet.

The Verdict

I sent 15 of the Hyperagent pitches last night. The rest go out this morning. Follow-ups are scheduled for Sunday, and all the pitches are tracked in an Airtable base that it configured itself (not part of the test, just fun and easy).

The verdict is clear: this is an impressive tool, and I have high hopes that it will keep improving and stay in my stack.

Opus 4.6 powered all three harnesses well. It's a great model, but it's just a model. We still need to choose an infrastructure package that lets us give it a workflow, "teach it," and walk away.

Hyperagent got closest to that mark. Cowork has the power but makes you build more of the system yourself. OpenClaw just had too many speedbumps.

I cannot wait to see what happens next.

The Bottleneck Talent Network

Searching for your next role? Fill this form out, and we’ll intro you to the best companies in the world.

Hiring? Just respond to this email! We’ve got dozens of vetted operators standing by.
