The Switch They Flipped — And What We Lost

2026-06-13

:: tags: #AI #society #technology

Last night, at 5:21pm Eastern Time, a letter arrived at Anthropic's offices. By evening, Fable 5 — described as the most capable AI model ever made available to the public — was gone. Pulled offline at the instruction of the US government, citing national security concerns and an unverified jailbreak technique so narrow and widely replicable that Anthropic itself publicly disagreed with the reasoning.

Just like that. A switch flipped.

I've been sitting with that image for a while now. Not because I was personally affected — I wasn't — but because of what it represents. Any right, any access, any tool that can simply be taken away without debate, without legislation, without so much as written evidence, was never really yours to begin with.

The official justification was a potential jailbreak: a method of bypassing Fable 5's safety guardrails. Anthropic reviewed the demonstration and concluded that the capability it displayed was already widely available in other public models. The technique, as best anyone can determine, involved asking the model to read a codebase and identify flaws — something developers do every day, legitimately, to make systems safer.

The cruel irony is that Anthropic had done everything right. Thousands of hours of red-teaming with government agencies, UK safety institutions, and private security organisations. No universal jailbreak found. Safeguards described as the most effective ever deployed on a public model. And still — one letter, one evening, and it's gone.

Here's the argument that troubles me most about this decision, and that I suspect will be largely absent from the YouTube storm of hot takes that will inevitably follow.

The people this is meant to stop — the genuinely bad actors — are not primarily using frontier commercial AI models. They are running open-source models locally, on their own hardware, in jurisdictions no government controls, with any guardrails they don't like simply removed. That option exists today, is freely available, and is getting more capable by the month. Pulling Fable 5 does nothing to change that landscape.

What it does change is access for everyone else. The researchers. The clinicians. The small teams in underfunded institutions who were, for the first time, within reach of a tool that could compress years of hypothesis testing into weeks. The people in parts of the world where specialist expertise is hundreds of miles away, who might have used something like Fable 5 to bridge that gap.

Cancer research. Rare disease identification. Drug interaction modelling. Protein folding. These are not hypothetical applications — they are active areas where AI capability translates directly into human lives, and where the bottleneck has always been the sheer scale of what needs to be processed against the limits of human time and attention.

That work doesn't stop entirely. But it got harder last night, for a lot of people, on the basis of verbal evidence no one has seen.

There's a pattern here that history makes familiar. A powerful new capability emerges. Governments — uncertain, slightly afraid, feeling the ground shift — reach for the bluntest instrument available. The appearance of decisive action. The reality of mostly inconveniencing the people who were already being careful, while the actual risk walks quietly out the back.

It happened with the printing press. With cryptography. With the internet itself. The technology keeps moving. The benefits accumulate anyway, incrementally, undramatically, largely without credit.

That's the long view, and I believe it. But it doesn't make last night's decision less wrong.

The thing I keep returning to is simpler than policy or precedent. I wanted AI to be used for good. To change lives rather than constrain them. To be genuinely available to everyone, not rationed by geography or institutional budget or the nervous calculations of officials who received a letter on a Friday afternoon.

That hope hasn't gone. But it's taken a knock.

The quiet breakthroughs will keep happening. The researchers will find other tools, adapt, persist — that's what researchers do. And somewhere, right now, a model is probably helping identify something important that nobody will make a video about, that won't trend anywhere, that will just quietly matter to the people it helps.

I'll hold onto that.

But I reserve the right to be frustrated about the cat videos drowning it all out in the meantime. +++

GreenClaw: The Rewrite

2026-06-10

:: tags: #ai #linux #python #self-hosted #tools

GreenWire was a first cut at an idea I'd been chewing on for a while: a small, honest AI bridge that lives on a box I already own, talks to me on Telegram, and doesn't cost me anything I haven't already agreed to pay for. I wrote it in an afternoon, ran it for a week or two, and learned what I actually wanted out of it. This post is about the rewrite — same idea, different shape, kept what worked, threw out what didn't.

The new version is called GreenClaw. The code is at github.com/mrgreen3/greenclaw.

What the rewrite changes

No metered path. Everything that goes to Claude runs through Claude Code over my claude.ai Pro subscription — OAuth session, flat-rate, no API key, no per-token billing. I never wanted to operate on metered credits and I don't. GreenWire had an ANTHROPIC_API_KEY lying around for an experimental direct-Haiku loop, and the danger was that the key being present in the environment could make Claude Code fall back to billing API credits instead of using the OAuth session. GreenClaw removes the trap entirely: ask_cc() builds a clean subprocess environment with the key stripped out, so Claude Code always uses Pro. If something breaks it breaks cleanly; it can't silently slip onto a billing path.

Qwen-first by default. I have an Ollama install on the same box running qwen2.5:3b-instruct. It's small, free, and instant. In GreenClaw the local model is the first responder for every un-prefixed message. It handles what it can — shell inspection, file reads, simple questions — and delegates to Claude Code via a delegate_to_cc tool only when it needs reach it doesn't have. That covers email, the web, GitHub, anything multi-step. The result is that most casual queries never leave the house.

No timers. GreenWire had an hourly mail-check loop that called the API in the background. I pulled that out and made a /mail skill instead. The principle is now explicit in the project rules: nothing runs on a timer, no scheduled or background CC invocations, ever. Claude Code only runs when a message or a triggered skill actually needs it. The reason is partly philosophical and partly practical — headless claude -p is automated use, which from 2026-06-15 draws a separate paid Agent SDK credit pool rather than the Pro subscription. Better to stay clearly inside ordinary individual use.

Skills

The biggest structural change. Capabilities don't live in code anymore — they live in skills/*.md as markdown recipes. The gateway stays static; adding a capability means dropping a file and restarting. No route() edits.

A skill is just front matter plus a body:

---
name: system-health
description: Check disk, memory, load, and the bot service.
exposes: local
trigger: /health
locked: false
source: owner
---

Run df -h, free -h, and uptime. Summarise in a few lines, flag anything off.

The gateway reads only front matter at boot, so skills don't eat into the local model's context just by existing. Bodies load on demand the turn they actually run. Skills marked locked: true only run if their name is listed in skills.allow — that's the safety catch for anything with reach or anything destructive. The shipped blog-post skill is locked by default; the mail skill is locked but armed.

There's also a built-in /cheat (not a skill — a route in the gateway itself) that reads static/cheat.md for the static text and substitutes the current skills list from the live SKILLS dict. Instant, no LLM involved, accurate by construction. I tried doing it as a skill first; Qwen 2.5 3B choked on the multi-line python in the body and replied "(no reply)". The lesson there was: if a thing doesn't need a model, don't route it through one.

Tasks

Skills are recipes for what to do with a message. Tasks are connectors for how messages get in and out.

A task is a Python module in tasks/ that exposes one function:

def start(on_message):
    # loop forever; for each incoming message call
    #   on_message(text, reply)
    # where reply(text) sends the answer back on the same channel

Today there's exactly one task: tasks/telegram.py. It long-polls the Bot API, locks to a single chat ID, dispatches each incoming message in a worker thread so 15-minute Claude Code calls never freeze the poll loop. Adding another connector — Signal, Discord, an MQTT topic, a local Unix socket — means dropping tasks/signal.py in the folder and restarting. No flags, no wiring. The core gateway is dispatch-only and doesn't know which connector a message came in on.

This is the bit I'm most pleased with. GreenWire had Telegram bolted into the main loop; pulling it out into a generic task layer was an hour's work and immediately made the whole thing feel cleaner.

What stayed the same

A lot. The shape is the same as GreenWire: single Python file, a tools list, a tool dispatch function, a long-poll Telegram front end, a cc/gc prefix system, a notes file for "remember this", and a hard rule against adding abstraction the project doesn't need. The README still says "Lean and auditable over clever." That hasn't changed and isn't going to.

The single-file constraint matters more than I expected. The whole gateway is about five hundred lines. I can read it in five minutes. I can debug it without a stack trace that spans four libraries. When I added skills, then tasks, then the cheat sheet, the file grew but stayed comprehensible. The day it stops being comprehensible is the day I split it.

What's next

Three small things I'm thinking about:

Model-authored skills — let Claude Code (or Qwen) write a new skill file on request. "Greenclaw, make me a skill that checks whether Ollama is responding" → skills/ollama-check.md on disk, live after a reload. The reload bit is the interesting part; today skills load at boot. There's a GitHub issue tracking this.
A cheat-sheet style menu for skill selection — right now skills are explicit-trigger only (/health, /mail). Eventually the local model should be able to pick a skill from the description menu when no trigger fires. Skills v2.
A second task — probably Signal, maybe a Tailscale-local Unix socket I can pipe into from anywhere on the tailnet. Mostly to prove the tasks layer is doing what I think it's doing.

Nothing on a timer. Nothing that touches a metered API path. Nothing that requires a model to do work a four-line route() check can do faster and more reliably. That's the brief.

The repo is at github.com/mrgreen3/greenclaw if you want to look or lift from it. The README is honest about what it is and isn't — including that it runs --dangerously-skip-permissions on Claude Code, which is fine for a sole-user box on a private network and would not be fine for anything else.

GreenWire: A Poor Man's AI Agent

2026-06-06

:: tags: #ai #linux #python #self-hosted #tools

I keep seeing people build elaborate AI agent frameworks — multi-node orchestration, vector stores, retrieval pipelines, YAML config files stretching to three hundred lines. It's impressive engineering. It's also overkill for what I actually need, which is: talk to Claude from my phone, have it run a command on my server, and not cost me more than a few cents a day to do it.

GreenWire is my answer to that. It's a single Python file — just over three hundred lines including blank lines and comments — that wraps the Anthropic API into something that feels like an agent without any of the framework ceremony.

What It Is

At its core GreenWire is a message loop. You give it a prompt, it sends that to Claude (Haiku by default, because Haiku is fast and cheap), Claude optionally calls one of a handful of tools, the loop executes those tools on the machine, feeds the results back, and you get a reply. That's it. The conversation history is kept in memory for the session, so follow-up questions work the way you'd expect.

There are two front ends. Run it without arguments and you get a plain terminal REPL — type a prompt, get a reply, Ctrl-D to quit. Run it with --telegram and it becomes a long-polling Telegram bot, locked to a single whitelisted chat ID so only I can talk to it.

I use the Telegram mode almost exclusively. My phone is always with me; my server is in the next room running headless. Being able to ping it a question — "what's the disk usage on the media drive?", "did that cron job run last night?", "make a note that I need to check the nginx cert next week" — without opening an SSH session is genuinely useful. It's the difference between doing something and not doing it.

The Tools

Claude has access to three tools. run_shell does what it says: runs an arbitrary shell command on the box and returns stdout, stderr, and the exit code. This is the workhorse. Claude can check df -h, grep a log file, restart a service, do a git pull — anything I'd do at a terminal myself. I thought hard about whether to constrain this and decided against it. I'm the only one talking to the bot, the box is on my LAN, and the friction of a restricted tool list is worse than the (theoretical) risk of Claude running something unintended. It hasn't happened yet.

add_note and list_notes are a simple append-only notes file. When I say "note that X" or "remember Y", Claude writes a timestamped line to ~/notes.md. When I ask "what did I write down about the OctoPrint setup?", it reads it back. Nothing clever — just a flat file with a timestamp prefix per line. I've found I use it more than I expected for capturing things I'd otherwise have to dig out of a chat history or a browser tab.

Handing Off to Claude Code

The interesting part is the cc prefix. Type cc <prompt> or ask cc <prompt>, and GreenWire doesn't send the message to the API loop at all. Instead it shells out to the claude CLI — Claude Code — in headless mode with full autonomy, waits up to fifteen minutes for it to finish, and returns whatever it printed.

This matters because Claude Haiku in a short tool loop is good at operational tasks: check a thing, run a command, report back. It's bad at things that require judgment across a large codebase, multi-step reasoning, or file editing with context. Claude Code is a different beast — it has the whole repo in context, it can read and write files with care, it understands project structure. The cc escape hatch lets me get to that power without building it into GreenWire itself.

There's also a blog prefix that's a specific canned version of this: blog <topic> constructs a detailed prompt for Claude Code that includes the content directory path, frontmatter format, filename conventions, and the git commands to commit and push. That's how this post was written and deployed.

Usage Tracking

Typing usage (or tokens or cost) drops a summary of what the router loop has spent in API tokens today and in total. It's logged as newline-delimited JSON to ~/router/usage.jsonl, with separate fields for input tokens, output tokens, cache reads, and cache writes, along with a computed cost estimate using the published per-million-token rates.

This only covers the direct API calls through the router. cc jobs go through Claude Code's own billing, which it tracks separately. The split is intentional — the two paths have very different cost profiles and I wanted to see them independently.

The numbers are reassuring. Haiku is genuinely cheap for this use case. A day's worth of casual queries — a dozen or so short interactions — runs to a cent or two. Even days where I lean on it harder haven't broken ten cents. For a box that's already running 24/7 and already costing me electricity, that's fine.

What It Isn't

GreenWire has no persistence between sessions for the Haiku conversation history. Restart the process, the context is gone. This is a deliberate tradeoff — the notes file covers the "remember this" use case, and most of what I'm doing is stateless enough that session context doesn't matter. Adding a persistent message store would complicate the code for a benefit I don't actually need day to day.

It also has no rate limiting, no retry logic, no structured logging beyond the usage file. If the Anthropic API returns an error, Python raises an exception and you see it. That's fine for a single-user setup on a server I monitor. It would need work before it could run unsupervised at any scale.

Why Not Just Use OpenClaw

The honest answer is that GreenWire predates the OpenClaw integration and is simpler to reason about. OpenClaw is a full workspace — it has the gateway process, the Claude Code session with memory and projects, the whole apparatus. That's the right tool for development work. GreenWire is the right tool for "I'm in bed and want to know if the cron job ran." They coexist on the same machine and serve different moments.

The three-hundred-line constraint is also meaningful to me. I can read the whole thing in five minutes. I can debug it without a stack trace that spans four libraries. If something breaks, I know where to look. That's worth something.

The code is at github.com/mrgreen-archbang/router if you want to look at it or lift from it.

The Real Cost of AI Agents

2026-06-05

:: tags: #agents #ai #hermes #local-llm #odysseus #ollama #opinion #self-hosted

There are a lot of YouTube videos out there promising you can run your own AI agent for free. The thumbnails are exciting. The reality is slightly different.

I have been using and testing AI agents for a while now — OpenClaw, Hermes, and more recently looking at Odysseus. Each one comes with its own promise of autonomy, automation, and intelligence on tap. Each one also comes with a bill, whether you see it or not.

The Two Routes

When it comes to running an AI agent, you have two options.

Use an online API. Pay per token. No hardware headaches, frontier model quality, scales up or down with your usage. If you are trying to earn money from your agent — automating a workflow, running a business process, generating content at scale — this makes sense. The cost is a business overhead, potentially tax deductible, and the economics can work in your favour if the agent is genuinely productive.

Run a local LLM. Buy the hardware, host it yourself, keep your data private. Sounds appealing until you price it up. A capable local inference box — something that can actually run a useful model without embarrassing itself — is not cheap. You are looking at a decent mini PC with 32GB RAM minimum, and that is before you think about the model itself, the setup time, and the ongoing power draw.

The Egg Frying Test

I have run Ollama on my own hardware. Even with a relatively small model, the CPU was working hard enough to fry an egg. It was not subtle.

Now imagine that running 24/7, waiting for a Telegram message that might arrive three times a day. The hardware is drawing power constantly. The fans are spinning. The model is sitting loaded in memory doing nothing most of the time, and then thrashing when something arrives.

The "free" local LLM agent has quietly accumulated a meaningful electricity bill. And it is still not as capable as a frontier model you could have called via API for a few pence.

Privacy Has a Price

This is not an argument against local LLMs. Privacy is a legitimate reason to go local. If you do not want your prompts and data leaving your network, you have no choice — local is the only option.

Odysseus is interesting precisely because it is trying to be honest about this. Rather than just saying "run it locally, it's fine," it attempts to match models to hardware realistically. That is a more useful conversation than most of the hype content out there.

But eyes open: privacy via local inference costs money. Decent hardware to run it properly costs money. Your time setting it up and maintaining it costs money, even if you enjoy it.

Who Should Use What

API route makes sense if you are generating value from the agent — commercial use, productivity gains that outweigh the per-token cost, or you simply want the best available models without the faff.

Local route makes sense if privacy is non-negotiable, you are doing heavy or frequent inference where the per-token API cost would genuinely exceed hardware amortisation, or you want full control and are willing to pay for it in time and hardware.

Neither route is free. The question is not "how do I run agents cheaply" — it is "do the agents actually justify the cost, whichever route I take?"

The Honest Summary

Most people hyping local agents have not done the maths. The YouTube "for free" crowd have usually glossed over the GPU cost, the power draw, the hours of setup, and the capability gap versus hosted models.

If you are going in with clear eyes — knowing what you are spending, knowing what you are getting, and knowing why — then either route can be the right choice.

Just do not let a YouTube thumbnail make that decision for you.

GreenLinux - The OS That Builds Itself Around You

2026-05-30

GreenLinux: The OS That Builds Itself Around You

I've been thinking about Linux distributions wrong for a long time.

Every distro ships with someone's answers. Omarchy ships with DHH's perfect setup — his keybindings, his apps, his workflow, his aesthetic. Ubuntu ships with Canonical's vision of what a desktop should be. Even minimal distros make assumptions about what you need before you've told them anything about yourself.

The paradox is that Linux's greatest strength — infinite customisability — is also its greatest barrier. The person who dares to try Linux for the first time lands in a world of choices they don't yet have the knowledge to make. They reach for the forum, the wiki, the subreddit. They get answers that may or may not apply to their hardware, their needs, their level of experience.

What if instead they just had a friend who knew everything?

The Idea

GreenLinux ships almost nothing. A minimal Arch base, a Wayland compositor, a terminal. And Claude.

Not Claude the chatbot you visit in a browser. Claude Code — CC — running in your terminal, with full access to your system. Your Arch Wiki. Your forum. Your mentor. Your installer. All in one, all aware of your specific machine and your specific situation.

Ask it anything:

"How do I connect to WiFi?"
"I want to play music, what should I install?"
"Make my terminal look nice"
"Set up n8n for local automation"
"Explain what a window manager is"

It answers. It acts. It installs, configures, explains. The system that emerges isn't GreenLinux's vision of a perfect desktop. It's yours.

Why This Matters

DHH built one perfect Linux desktop — for himself. Millions of people use it and get DHH's workflow whether it suits them or not.

GreenLinux builds a different perfect desktop for every single person who boots it.

The curious teenager gets a system shaped by curiosity. The retired person tired of Windows gets a system shaped by simplicity. The developer gets a system shaped by their specific tools and languages. The privacy-conscious user gets a system shaped by their threat model.

Same base. Infinite outcomes.

The Stack

base linux linux-firmware
networkmanager
sway
foot
nodejs
npm
claude-code

That's the entire ISO. Boot → Sway → foot → Claude Code. Everything else is a conversation away.

No browser. No file manager. No media player. No office suite. No assumptions.

The Problem We Need Solved

GreenLinux has one unsolved problem and it's not technical.

Claude Code requires an Anthropic API key. That key costs money. For GreenLinux to fulfil its mission — bringing the power of AI to anyone who dares to try Linux — there needs to be a mechanism for new users to get their first taste without a credit card barrier at first boot.

This isn't a request for charity. Every person who boots GreenLinux is exactly the kind of curious, independent thinker who becomes a loyal Claude user. We're a distribution channel. We just need the door opened.

If you work at Anthropic and this resonates — we should talk.

The Seed Is Planted

GreenLinux is early. The repo is at github.com/mrgreen3/greenlinux. The domain is greenlinux.org.

The philosophy is clear. The stack is defined. The mission is simple.

Anyone who dares to try Linux deserves a guide, not someone else's opinions.

That's what we're building.

GreenClaw Moves to Hermes

2026-05-26

:: tags: #ai #hermes #linux #self-hosted #tools

GreenClaw has a new engine. As of today it's running on Hermes rather than OpenClaw.

The short version: OpenClaw worked, but token burn was a real problem. Hitting limits quickly when in active use made it impractical for anything sustained. Hermes is the replacement — same idea, different implementation, with better controls over where the expensive models actually get used.

What Changed

The server itself didn't move. Still the same Lenovo ThinkCentre M710Q on Arch, still headless, still talking over Telegram. What changed is the agent framework underneath.

Hermes runs Claude as the main model but lets you assign lighter models to background tasks — compressing context, routing decisions, naming conversations, MCP calls. Those are now handled by Claude Haiku rather than burning Sonnet tokens on every internal housekeeping step. The main conversation stays on Sonnet where the quality actually matters.

What We Did Today

First session with the new setup covered a fair bit:

Confirmed n8n was still running with the Archbang email workflows intact — Archbang Mail Notifier and Check Mail on Demand both active
Ran a full system update (pacman -Syu) — cloudflared, libisoburn, libisofs, libva-intel-driver
Tuned auxiliary model assignments to use Haiku for background tasks
Cloned the blog repos (mrgreen/blog and mrgreen/pages) fresh, confirmed the deploy pipeline still works
This post

The Blog Workflow

No change here. I write or ask GreenClaw to draft, I read it, it publishes. The Codeberg repo holds the Zola source, deploy.sh builds and pushes to the pages repo. That hasn't changed and isn't changing.

On Token Burn

It's worth being honest about: AI assistants running on API billing are expensive if you're not careful. The OpenClaw setup didn't give much visibility or control over where tokens went. Hermes at least makes it configurable — you can point auxiliary tasks at cheaper models and reserve the capable ones for actual reasoning.

Whether that's enough in practice is something I'll find out over the next few weeks.

RootMD: Write the Markdown First

2026-05-26

:: tags: #ai #bash #linux #scripting #tools

A conversation with a friend today turned into something worth writing down.

He maintains a collection of install scripts for EndeavourOS — one per desktop environment, each cloning a repo, deploying config files, fixing permissions, enabling services. They work. They're also a bit of a mess: inconsistent variable names, repeated boilerplate, no documentation. He's adding more WMs and the headache is growing.

We were looking at one of the scripts and I fed it to Claude and asked for a markdown version. Not documentation after the fact — a clean .md that described what the script does, step by step, in plain language.

It took about ten seconds.

Then I asked Claude to regenerate the bash script from the markdown.

That also took about ten seconds. And the script that came back was cleaner than the original — consistent variable names, set -e, proper error handling, the same logic but tidier.

The Round Trip

The interesting thing isn't that Claude can convert between formats. It's what the conversion reveals.

Writing the markdown forces you to think about intent rather than implementation. What does this step actually do? Why does it do it? The bash just says chown -R $username:$username — the markdown says "fix ownership because git and cp ran as root in the ISO." That's the context that matters.

And when you go back to bash from the markdown, the code that comes out is informed by that intent. It's not just a transcription — it's a cleaner implementation of a clearly stated idea.

RootMD

I'm calling this approach RootMD. The idea in one line:

Write the markdown first. The code grows from it.

The .md is the source of truth. The script is an artefact of it. This means:

The documentation is never out of date because it came first
The same .md can generate bash, Python, Ansible, or whatever the job needs
An AI working from the .md understands the why, not just the what

That last point matters more than it might seem. Claude Code working from a well-structured markdown document can make sensible decisions about edge cases, suggest improvements, and extend the script — because it understands what the script is trying to do.

The common.sh Problem

Once you see the scripts as artefacts of documents, another thing becomes obvious: the repeated boilerplate across every script is a documentation smell as much as a code smell.

The fix is the same either way — pull the shared logic into a common.sh, source it from each script, and reduce each individual script to just what's unique about that desktop environment. With RootMD you'd have a common.md too, documenting what the shared functions do.

New WM? Write the markdown. Generate the script. Done in minutes rather than hours.

A Note on AI and Bash Scripts

There's a version of this that doesn't involve AI at all — document-first development is not a new idea. But the AI piece genuinely changes the friction. The gap between a clear .md and working code used to require a programmer. Now it requires a conversation.

That's not a replacement for understanding what your scripts do. But it does mean that the cost of doing things properly — documenting intent, keeping things consistent, refactoring toward shared functions — is low enough that there's no good excuse not to.

My friend is still dealing with the GPU headache. But at least the scripts will be cleaner.

Claude is Now Managing My Blog

2026-05-23

This is how we roll now.

After a bit of tinkering with Zapier and Claude's MCP connectors, I've handed over the keys to my blog to an AI assistant. Claude can now draft, publish, and delete posts on my behalf — all from a simple chat conversation.

Is this the future of blogging? Probably not for everyone. But for a lazy Linux nerd who'd rather be tinkering with something else, it works just fine.

Getting Claude to Manage My Blog (The Hard Way)

2026-05-23

:: tags: #ai #github #linux #mcp #tools

I wanted Claude to manage this blog. Not just draft posts — actually push them. Read files, edit them, commit, done. No copy-pasting, no manual git, no faff. I had no particular plan for how that would work. I just told Claude what I wanted and left Claude to figure it out.

That turns out to be a reasonable way to approach this kind of problem.

The Zapier Route

Claude.ai supports MCP connectors — a way to give Claude access to external tools and services. Zapier has an MCP endpoint, and Zapier has a GitHub integration. Claude started there.

It mostly worked. Claude could find repositories, read issue and pull request data. Then came the first wall.

Zapier's GitHub integration has a "Create or Update File" action, which sounds right. The problem is that GitHub's API requires a SHA when updating an existing file — a concurrency check. To get the SHA you need to call the Contents API, and Zapier's GitHub integration has no action that does that. It can find repositories, branches, issues. It cannot read a file.

So updating an existing file meant: Claude asks me for the SHA, I run a curl command in the terminal, paste it back into chat, then Claude pushes the update. Every. Single. Time.

curl https://api.github.com/repos/mrgreen3/mrgreen3.github.io/contents/content/links.md \
  | python3 -c "import sys,json; print(json.load(sys.stdin)['sha'])"

That is not blog management. That is assisted manual labour.

Why Not Just Hit the GitHub API Directly?

Claude can fetch URLs, but the claude.ai sandbox blocks api.github.com — the unauthenticated API was rate-limiting from the container's IP almost immediately. Raw GitHub URLs that do work don't return the file SHA anyway.

A personal access token would have solved it, but that means tokens in chat history and things to maintain. Claude kept looking.

The Actual Fix

Claude.ai supports adding custom MCP servers directly, under Settings → Connectors → Add custom connector. The official GitHub MCP server runs at:

https://api.githubcopilot.com/mcp/

It authenticates via OAuth — connect your GitHub account, done. No tokens, no Zapier middleman, no workarounds.

Once connected, Claude gets proper GitHub tools: get_file_contents (which returns the SHA alongside the content), create_or_update_file, push_files. Everything needed to read a file, modify it, and push the result — without any input from me.

First real test was removing a dead link from the links page. Claude read the file, retrieved the SHA automatically, made the edit, committed. I didn't touch the terminal.

There Are Always More Ways

What struck me about this is that I had no idea any of this infrastructure existed. MCP servers, Zapier's GitHub limitations, the official GitHub MCP endpoint — I wouldn't have known where to start. Claude worked through the options methodically: tried Zapier, hit the wall, identified why, looked for alternatives, found the right tool, connected it, and got the job done.

That's the part that's actually useful. Not that Claude can push a markdown file to GitHub — I can do that myself in thirty seconds. It's that Claude can figure out how to do something I didn't know how to set up, try multiple approaches when the first one doesn't work, and arrive at a clean solution without me having to understand the problem space at all.

What's Connected Now

GitHub MCP — reads and writes files in this repo directly. Zapier's GitHub integration is now disabled.
Zapier — still handling WordPress (archbang.org), where it's capable enough for the job.

The State of Things

Claude can now read any file in this repo, edit it, and push the result. New posts, page edits, link updates — all from a chat conversation. The SHA problem is gone because Claude retrieves it as part of reading the file.

The setup is clean. One OAuth connection, no tokens, no workarounds. Claude did the legwork. I made a cup of tea.

One More Thing

There was a final gotcha that nearly didn't make it into this post. Adding the MCP server URL in claude.ai and authenticating via OAuth wasn't quite enough — write access was still being blocked with a 403. The missing step was installing the GitHub App itself on your account, which is separate from the OAuth flow.

Go to https://github.com/settings/installations, find the Claude GitHub MCP Connector, and make sure it's installed on your account with access to the relevant repositories. Without that step the connector can read but not write, and Claude will hit a wall every time it tries to push anything.

Once that's done, everything works as advertised.

FruitBANG: ArchBang Gets a Mango

2026-05-22

:: tags: #archbang #linux #wayland #wm

FruitBANG is my latest experiment — an ArchBang ISO variant that swaps out labwc for MangoWM.

MangoWM is a Wayland compositor in the dwl lineage — dwl being a Wayland port of dwm. If you know dwm, you already have the mental model: tag-based layout, keyboard-driven, minimal config, no framework overhead. I'd been running it on my desktop for a couple of weeks and liked it enough to want it in an ISO.

Why a Separate ISO?

ArchBang already does what it does well. Swapping the compositor in place would mean diverging the main build and maintaining something harder to reason about. A fork keeps things clean — FruitBANG is its own thing, built on the same archiso foundation, just with a different compositor and the configs to match.

What's In It

The stack is the same as my desktop setup:

MangoWM as the compositor
waybar for the panel
foot as the terminal
rofi as the launcher
mako for notifications
swaybg for wallpaper

Boot to a working Wayland session with a tiling window manager that gets out of your way.

Current State

It builds. It boots. It works. Whether it's daily-driver ready depends on how much you like figuring things out — the MangoWM ecosystem is smaller than Hyprland or sway, and documentation is thin in places.

ISOs and source are up on the ArchBang site if you want to try it.

Older posts →