<p>This morning <a href="https://news.ycombinator.com/item?id=48630171">on Hacker News</a> I saw <a href="https://hustvl.github.io/Moebius/">Moebius: 0.2B Lightweight Image Inpainting Framework with 10B-Level Performance</a>, describing a small but effective inpainting model - a model where you can mark regions of an image to remove and the model imagines what should fill the space. The released model <a href="https://github.com/hustvl/Moebius/blob/9310b76e368f5f7a8ecdf06493231af279c9973b/requirements.txt#L1">required PyTorch and NVIDIA CUDA</a>, but since it described itself as 0.2B I decided to try and get it running using WebGPU in a browser. TL;DR: I got it working, and you can try the demo at <a href="https://simonw.github.io/moebius-web/">simonw.github.io/moebius-web/</a>. Read on for the details.</p>
<h4 id="the-finished-tool">The finished tool</h4>
<p>Here's a video demo of the finished tool:</p>
<video controls="controls" height="1070" poster="https://static.simonwillison.net/static/2026/inpainting_1280_poster.jpg" preload="none" style="height: auto;" width="1280">
<source src="https://static.simonwillison.net/static/2026/inpainting_1280.mp4" type="video/mp4" />
</video>
<p>You can open any image in it (non-square images get letterboxed), highlight areas to remove, click the "Run inpaint" button and wait for the model to do its magic.</p>
<h4 id="a-parallel-agent-side-project">A parallel agent side-project</h4>
<p>My main project for today was landing a major feature in Datasette: a UI for creating and altering tables, as a follow-up to the <a href="https://simonwillison.net/2026/Jun/16/datasette/">insert and edit rows feature</a> I released last week.</p>
<p>I was working on that in Codex Desktop (here's <a href="https://github.com/simonw/datasette/pull/2789">the PR</a>) and often found myself spending 5-10 minutes twiddling my thumbs waiting for it to complete a mid-sized refactor or add the finishing touches to a change to the UI.</p>
<p>(An amusing thing about coding agents is that the harder a problem is the <em>more</em> time you have to get distracted while you wait for them to finish crunching!)</p>
<p>So I decided to spin up Claude Code in a terminal window and see how far I could get at porting Moebius to the web.</p>
<h4 id="some-agentic-research-to-kick-off-the-project">Some agentic research to kick off the project</h4>
<p>My first step was to ask regular Claude about the feasibility of this project. In <a href="https://claude.ai/">Claude.ai</a>, which has the ability to clone repos from GitHub:</p>
<blockquote>
<p><code>Clone https://github.com/hustvl/Moebius/ and tell me if they published the code and weights to run this model anywhere</code></p>
</blockquote>
<p>(I hadn't spotted the link to the weights yet, that's tucked away in the "News" section.)</p>
<p>Then:</p>
<blockquote>
<p><code>For Moebius what are the options for running it right now - Python and NVIDIA CUDA only or other options too?</code></p>
</blockquote>
<p>And:</p>
<blockquote>
<p><code>Muse on the feasibility of porting it to Transformers.js or similar and running it in a browser</code></p>
</blockquote>
<p>I like telling models to "muse on X": it's the shortest way I've found of expressing that I want them to contemplate a problem for me without providing them with a concrete goal.</p>
<p>Here's <a href="https://claude.ai/share/551c3dc8-17ce-4a4b-a0c9-8cbded6c7bf1">that chat transcript</a>. I copied out the last answer and saved it as <a href="https://github.com/simonw/moebius-web/blob/main/research.md">research.md</a> for Claude Code to read later.</p>
<p>Claude suggested using <strong>ONNX Runtime Web on the WebGPU backend</strong> - the layer <em>below</em> the <a href="https://huggingface.co/docs/transformers.js/en/index">Transformers.js</a> library I had suggested.</p>
<p>That was enough to convince me it was worth setting Claude Code loose and seeing how far it could get.</p>
<p>I usually start projects like this by gathering as much of the information the coding agent might need as I can. Since I didn't expect this project to actually work I did everything in my <code>/tmp</code> folder:</p>
<div class="highlight highlight-source-shell"><pre><span class="pl-c1">cd</span> /tmp
mkdir Moebius
<span class="pl-c1">cd</span> Moebius
<span class="pl-c"><span class="pl-c">#</span> Grab the Moebius python code</span>
git clone https://github.com/hustvl/Moebius
<span class="pl-c"><span class="pl-c">#</span> And the model weights (Claude figured this out):</span>
GIT_LFS_SKIP_SMUDGE=0 git clone \
https://huggingface.co/hustvl/Moebius Moebius-weights
<span class="pl-c"><span class="pl-c">#</span> Finally a couple of libraries we might use:</span>
git clone https://github.com/huggingface/transformers.js
git clone https://github.com/microsoft/onnxruntime</pre></div>
<h4 id="setting-off-claude-code">Setting off Claude Code</h4>
<p>I created a directory for the rest of the project and ran <code>git init</code> in that so Claude could start committing code and notes:</p>
<div class="highlight highlight-source-shell"><pre>mkdir /tmp/Moebius/moebius-web
<span class="pl-c1">cd</span> /tmp/Moebius/moebius-web
git init
<span class="pl-c"><span class="pl-c">#</span> Copy in that research.md from earlier</span>
git add research.md
git commit -m <span class="pl-s"><span class="pl-pds">"</span>Initial research by Claude Opus 4.8<span class="pl-pds">"</span></span></pre></div>
<p>I fired up a <code>claude</code> instance in the <code>/tmp/Moebius</code> folder, the level above all of the research materials I had prepared for it. I prompted:</p>
<blockquote>
<p><code>Read ./moebius-web/research.md - your goal is to port this model to ONNX and WebGPU so we can run it directly in a browser, with a simple UI</code></p>
</blockquote>
<p>As it started to work I dropped in this follow-up (typos included):</p>
<blockquote>
<p><code>Bulid this in /tmp/Moebius/moebius-web and commit early and often, also maintain a notes.md file in there with notes about what you figure out along the way - also start by writing out a plan.md in there and update that plan as oy work too</code></p>
</blockquote>
<p>I often ask agents to keep notes like this - the end result is often interesting, both for myself and for the next agent session that touches the same project. Here's what that <a href="https://github.com/simonw/moebius-web/blob/main/notes.md">notes.md file</a> looked like at the end of the project.</p>
<p>I kicked it off and went back to my main project, checking in occasionally to see how Claude was doing. When it looked like it might have something that worked I prompted:</p>
<blockquote>
<p><code>Tell me what URL I can visit in my own browser to try this</code></p>
</blockquote>
<p>Then I tried it out in Chrome and pasted some errors (and screenshots of errors) back into Claude Code.</p>
<p>After a few rounds of this we had something that appeared to work! Time to put it on the internet so other people could use it.</p>
<blockquote>
<p><code>How would we publish this to Hugging Face such that the model weights were on there and the HTML demo would show up in Hugging Face spaces?</code></p>
</blockquote>
<p>Claude Code knows how to use the <code>hf</code> CLI tool, so I created a model repo on <a href="https://huggingface.co/">Hugging Face</a>, then <a href="https://huggingface.co/settings/tokens">created a token</a> that could write to that repo and dropped it into a <code>/tmp/Moebius/token.txt</code> file so Claude could use it.</p>
<p>It published the 1.24GB of converted ONNX weights to <a href="https://huggingface.co/simonw/Moebius-ONNX">huggingface.co/simonw/Moebius-ONNX</a> for me.</p>
<p>I'd seen other demos load weights into the browser from Hugging Face before, so I knew it was possible. I decided to host my own frontend code on GitHub Pages, so I said:</p>
<blockquote>
<p><code>I want to publish the moebius-web folder to GitHub, minus the large files (so maybe minus the models/ folder), such that when I turn on GitHub Pages for that repo navigating to https://simonw.github.io/moebius-web/ serves the UI</code></p>
</blockquote>
<p>Telling it the final URL was important in case it needed to fix the URLs in the demos that it was building so they would work when deployed to production.</p>
<p>After a few more rounds of iteration, in between working on my main project, we got to a working, deployed version!</p>
<p>Except... each time I reloaded the page it seemed to download ~1.3GB of model weights. Browser caching seemed pretty important for this!</p>
<blockquote>
<p><code>anything clever we can do with serviceworkers or similar to help cache this stuff? It seems to reload every time, I am concerned that there might be something weird about the way HF redirects work that mean we don't benefit from browser caching</code></p>
</blockquote>
<p>I knew that Transformers.js projects could handle this properly, so I grabbed a copy of the <a href="https://huggingface.co/spaces/Xenova/whisper-web">Whisper Web</a> demo, dropped it into <code>/tmp/Moebius/whisper-web</code> and said:</p>
<blockquote>
<p><code>look in /tmp/Moebius/whisper-web (with a subagent) and see how they do this</code></p>
</blockquote>
<p>That project consisted entirely of obfuscated, built JavaScript files, so I figured using a subagent would avoid spending the rest of my top-level token context deciphering them.</p>
<p>Claude figured out that it was using <code>caches.open("transformers-cache")</code> - the <a href="https://developer.mozilla.org/en-US/docs/Web/API/CacheStorage/open">CacheStorage API</a> - and <a href="https://github.com/simonw/moebius-web/commit/05c1cbc4894460a70a8bc1718ac6d152219e0f28#diff-fb89c342dfa36f544a2d16a885b0f3d1d49f436a7d0eaeb80505f80a1f922603">added that to our project</a>.</p>
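<p>The underlying idea is a cache-first lookup keyed by URL. Here's a minimal sketch of that pattern, written in Python to match the export script rather than the browser JavaScript the project actually uses. <code>ModelCache</code>, <code>load_weights</code>, <code>fake_fetch</code> and the example URL are all hypothetical names for illustration; the real code goes through the browser's CacheStorage API instead of a dictionary.</p>

```python
from typing import Callable, Dict


class ModelCache:
    """Hypothetical in-memory stand-in for a browser cache keyed by URL."""

    def __init__(self) -> None:
        self._entries: Dict[str, bytes] = {}

    def match(self, url: str):
        # Return the cached bytes, or None on a cache miss.
        return self._entries.get(url)

    def put(self, url: str, body: bytes) -> None:
        self._entries[url] = body


def load_weights(url: str, cache: ModelCache, fetch: Callable[[str], bytes]) -> bytes:
    """Cache-first load: only call fetch when the URL is not cached yet."""
    cached = cache.match(url)
    if cached is not None:
        return cached
    body = fetch(url)
    cache.put(url, body)
    return body


# Usage with a fake fetch that records how often the "network" is hit.
calls = []


def fake_fetch(url: str) -> bytes:
    calls.append(url)
    return b"fake-onnx-bytes"


cache = ModelCache()
url = "https://example.com/model.onnx"  # placeholder URL, not the real weights
first = load_weights(url, cache, fake_fetch)
second = load_weights(url, cache, fake_fetch)
print(len(calls))
```

<p>The second call is served from the cache, so the fake network function only runs once. In the browser the same shape applies, with the cache surviving page reloads.</p>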
<p>I've shared the <a href="https://gisthost.github.io/?58039ba5c1ca3ed177e8659168996ee4">full Claude Code transcript</a> for this project (published using my <a href="https://github.com/simonw/claude-code-transcripts">claude-code-transcripts</a> tool).</p>
<h4 id="what-did-i-learn-from-all-of-this-">What did I learn from all of this?</h4>
<p>This definitely counts as vibe coding: I didn't look at a single line of code from the project, restricting my input to testing, suggesting small feature improvements (like a progress bar for the large file downloads) and pointing the model in the direction of examples of how I wanted things to work.</p>
<p>Since I didn't write any code the amount I learned about the underlying technologies - WebGPU, ONNX, and the Moebius model itself - was very limited.</p>
<p>As is usually the case with this kind of project the most important things I learned concerned what was <em>possible</em>:</p>
<ul>
<li>Claude Opus 4.8 is capable of converting a PyTorch model to ONNX, publishing the result to Hugging Face and then building out a web application and interface that can load and execute that model.</li>
<li>Chrome, Firefox and Safari are all now capable of running this kind of model - I tried it in all three.</li>
<li>The CacheStorage API works with ~1.3GB model files.</li>
<li>... which means we can have inpainting as a feature of a client-only web application! (If our users can tolerate the 1.3GB download.)</li>
</ul>
<p>I felt like I should probably try and learn a little more about my project. I fired up <a href="https://claude.ai/">Claude.ai</a> and prompted:</p>
<blockquote>
<p><code>Clone https://github.com/simonw/moebius-web/ and use it to teach me all about the model and ONNX and the process of converting a model to ONNX and WebGPU and basically everything I'd need to know in order to fully understand this repo</code></p>
</blockquote>
<p>Here's <a href="https://claude.ai/share/d11b8f2b-a52d-4ca2-be75-a710eaf18572">the transcript</a> and the <a href="https://github.com/simonw/moebius-web/blob/main/understanding.md">understanding.md</a> Markdown file it created, which I've now added to the GitHub repo. I found the explanation of ONNX particularly enlightening:</p>
<blockquote>
<p><strong>ONNX</strong> (Open Neural Network Exchange) is a portable, framework-neutral file format for neural networks. An <code>.onnx</code> file is essentially two things bundled together:</p>
<ol>
<li>
<strong>A computation graph</strong> — a directed graph of <em>nodes</em>, where each node is an <strong>operator</strong> (<code>Conv</code>, <code>MatMul</code>, <code>Add</code>, <code>Einsum</code>, <code>Softmax</code>, <code>Gather</code>, <code>Resize</code>, …) wired together by named tensors flowing between them. This is the "recipe" for the forward pass.</li>
<li>
<strong>The weights</strong> — the learned parameter tensors (the convolution kernels, the embedding table, etc.), stored as initializers in that same graph.</li>
</ol>
<p>Crucially, ONNX describes <em>what to compute</em>, abstractly, without saying <em>how</em> or <em>on what hardware</em>. The operator set is versioned by an <strong>opset</strong> number (this repo uses <strong>opset 18</strong>), which pins down exactly which operators exist and what their semantics are.</p>
</blockquote>
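<p>To make that "graph plus weights" idea concrete, here's a toy sketch in plain Python. It is not the ONNX format or any real runtime: <code>Node</code>, <code>KERNELS</code> and <code>run</code> are hypothetical names, and tensors are just flat lists of floats. It only illustrates how a list of named operators and a dictionary of weights can describe a computation separately from the code that executes it.</p>

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Tensor = List[float]  # flat vectors keep the sketch small


@dataclass
class Node:
    op: str             # operator name, e.g. "Mul" or "Add"
    inputs: List[str]   # names of the tensors this node reads
    output: str         # name of the tensor this node writes


# One possible "runtime": a table mapping operator names to implementations.
# A different backend could supply different functions for the same graph.
KERNELS: Dict[str, Callable[..., Tensor]] = {
    "Mul": lambda a, b: [x * y for x, y in zip(a, b)],
    "Add": lambda a, b: [x + y for x, y in zip(a, b)],
    "Relu": lambda a: [max(0.0, x) for x in a],
}


def run(nodes, initializers, feeds, output_name):
    """Evaluate nodes in order, resolving tensors by name."""
    tensors = {**initializers, **feeds}
    for node in nodes:
        args = [tensors[name] for name in node.inputs]
        tensors[node.output] = KERNELS[node.op](*args)
    return tensors[output_name]


# The graph: relu(x * w + b). Nodes are listed in topological order.
graph = [
    Node("Mul", ["x", "w"], "scaled"),
    Node("Add", ["scaled", "b"], "shifted"),
    Node("Relu", ["shifted"], "y"),
]
# The weights, stored alongside the graph like ONNX initializers.
weights = {"w": [2.0, 2.0, 2.0], "b": [1.0, -10.0, 0.0]}

result = run(graph, weights, {"x": [1.0, 2.0, 3.0]}, "y")
print(result)  # [3.0, 0.0, 6.0]
```

<p>Swapping out <code>KERNELS</code> for a different set of implementations changes <em>how</em> the graph runs without touching the graph itself, which is roughly the separation that lets the same <code>.onnx</code> file run on CUDA or on WebGPU.</p>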
<p>It turns out PyTorch has built-in mechanisms for exporting to ONNX, as seen <a href="https://github.com/simonw/moebius-web/blob/080be6e737ec976130e260d34707d7d9b7f63d5b/python/export_onnx.py#L91">here in export_onnx.py</a>:</p>
<pre><span class="pl-s1">torch</span>.<span class="pl-c1">onnx</span>.<span class="pl-c1">export</span>(
<span class="pl-s1">dec</span>, (<span class="pl-s1">lat</span>,), <span class="pl-s1">dec_path</span>, <span class="pl-s1">opset_version</span><span class="pl-c1">=</span><span class="pl-s1">args</span>.<span class="pl-c1">opset</span>,
<span class="pl-s1">input_names</span><span class="pl-c1">=</span>[<span class="pl-s">"latent"</span>], <span class="pl-s1">output_names</span><span class="pl-c1">=</span>[<span class="pl-s">"image"</span>],
<span class="pl-s1">dynamic_axes</span><span class="pl-c1">=</span>{<span class="pl-s">"latent"</span>: {<span class="pl-c1">0</span>: <span class="pl-s">"B"</span>}, <span class="pl-s">"image"</span>: {<span class="pl-c1">0</span>: <span class="pl-s">"B"</span>}},
)</pre>
<p>It also included a <a href="https://github.com/simonw/moebius-web/blob/main/understanding.md#12-mini-glossary">handy glossary</a> and an only-slightly-broken <a href="https://github.com/simonw/moebius-web/blob/main/understanding.md#10-putting-the-whole-pipeline-in-one-picture">ASCII-art diagram</a> showing how the model pipeline fits together.</p>
<p>Tags: <a href="https://simonwillison.net/tags/browsers">browsers</a>, <a href="https://simonwillison.net/tags/transformers-js">transformers-js</a>, <a href="https://simonwillison.net/tags/webgl">webgl</a>, <a href="https://simonwillison.net/tags/vibe-coding">vibe-coding</a>, <a href="https://simonwillison.net/tags/coding-agents">coding-agents</a>, <a href="https://simonwillison.net/tags/claude-code">claude-code</a>, <a href="https://simonwillison.net/tags/onnx">onnx</a></p>