{"author":"gm678","children":[{"author":"brcmthrowaway","children":[{"author":"mock-possum","children":[{"author":"MomsAVoxell","children":[],"created_at":"2026-07-04T10:26:36.000Z","created_at_i":1783160796,"id":48784320,"options":[],"parent_id":48783717,"points":null,"story_id":48782671,"text":"The immensely desperate search for meaning in the mundane, perhaps?","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T08:28:06.000Z","created_at_i":1783153686,"id":48783717,"options":[],"parent_id":48782748,"points":null,"story_id":48782671,"text":"How do you figure?","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T04:58:23.000Z","created_at_i":1783141103,"id":48782748,"options":[],"parent_id":48782671,"points":null,"story_id":48782671,"text":"This seems like the beginnings of AI psychosis, tbh.","title":null,"type":"comment","url":null},{"author":"zarzavat","children":[{"author":"eru","children":[{"author":"zarzavat","children":[{"author":"eru","children":[{"author":"baq","children":[],"created_at":"2026-07-04T15:27:51.000Z","created_at_i":1783178871,"id":48786087,"options":[],"parent_id":48784281,"points":null,"story_id":48782671,"text":"The $200 sub is the new free tier, has been for a while now.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T10:17:42.000Z","created_at_i":1783160262,"id":48784281,"options":[],"parent_id":48782914,"points":null,"story_id":48782671,"text":"I suspect subscriptions will stay.  But you will see more and more roadblocks that the &#x27;barely goes to the gym&#x27; user barely notices, but that the power user will chafe under.<p>I make that prediction, because the people who pay for subscriptions but only use them moderately at best are truly profitable.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T05:42:39.000Z","created_at_i":1783143759,"id":48782914,"options":[],"parent_id":48782810,"points":null,"story_id":48782671,"text":"API prices are the new normal. 
I doubt that prices will drop to the level of the subsidized subscriptions any time soon. Usage is growing exponentially but capacity cannot. There is no reason for them to waste their capacity on subscription users if they can sell that same capacity to API users.<p>Like with Uber and Lyft, the low prices were a fight for market share, but now they have successfully captured that market share the focus changes to balancing their books.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T05:13:48.000Z","created_at_i":1783142028,"id":48782810,"options":[],"parent_id":48782759,"points":null,"story_id":48782671,"text":"For now, I can use Fable from the web just fine.<p>&gt; You&#x27;re not likely to want to run Fable in a loop any more than you want to take a bunch of dollar bills and light them on fire. Every invocation of Fable has to be intentional, its context carefully managed.<p>Eh, that&#x27;s just because it&#x27;s the current frontier model.  Give it a few weeks, and prices will drop.","title":null,"type":"comment","url":null},{"author":"weird-eye-issue","children":[{"author":"danielbln","children":[{"author":"stingraycharles","children":[{"author":"danielbln","children":[{"author":"skrebbel","children":[{"author":"danielbln","children":[],"created_at":"2026-07-04T08:45:16.000Z","created_at_i":1783154716,"id":48783806,"options":[],"parent_id":48783708,"points":null,"story_id":48782671,"text":"And now you&#x27;ve added to the pile, congratulations.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T08:25:27.000Z","created_at_i":1783153527,"id":48783708,"options":[],"parent_id":48783024,"points":null,"story_id":48782671,"text":"Posting meaningless comments isn\u2019t suddenly useful because you\u2019re aware of it and \u201criffing on 
that\u201d.","title":null,"type":"comment","url":null},{"author":"weird-eye-issue","children":[],"created_at":"2026-07-04T09:51:54.000Z","created_at_i":1783158714,"id":48784180,"options":[],"parent_id":48783024,"points":null,"story_id":48782671,"text":"So you added BS, on purpose? There is nothing meaningless about anecdotes. The plural is data","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T06:08:59.000Z","created_at_i":1783145339,"id":48783024,"options":[],"parent_id":48783014,"points":null,"story_id":48782671,"text":"They add nothing, meaningless anecdotes. I was kind of riffing on that.","title":null,"type":"comment","url":null},{"author":"MomsAVoxell","children":[],"created_at":"2026-07-04T10:28:41.000Z","created_at_i":1783160921,"id":48784332,"options":[],"parent_id":48783014,"points":null,"story_id":48782671,"text":"Yeah come on, if these endless dialectic statements get made, at least post more detailed information about how your position was attained.","title":null,"type":"comment","url":null},{"author":"zarzavat","children":[{"author":"stingraycharles","children":[],"created_at":"2026-07-04T14:00:17.000Z","created_at_i":1783173617,"id":48785462,"options":[],"parent_id":48785337,"points":null,"story_id":48782671,"text":"But they\u2019re not at all meaningful after a certain point, because they\u2019re not even attempting to explain why it works well for them. It\u2019s just noise like this.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T13:42:21.000Z","created_at_i":1783172541,"id":48785337,"options":[],"parent_id":48783014,"points":null,"story_id":48782671,"text":"Every metric becomes a target (Goodhart&#x27;s law). Also, the plural of anecdote is not data.<p>The subjective anecdotes from HN users matter because they are <i>not</i> data and are much harder to game. 
Not impossible to game, always be aware of users with low karma, but more difficult than gaming a benchmark.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T06:06:23.000Z","created_at_i":1783145183,"id":48783014,"options":[],"parent_id":48783006,"points":null,"story_id":48782671,"text":"I don\u2019t understand what these comments add to the discussion, you always see these and it\u2019s just noise at this point.","title":null,"type":"comment","url":null},{"author":"phplovesong","children":[],"created_at":"2026-07-04T06:10:34.000Z","created_at_i":1783145434,"id":48783029,"options":[],"parent_id":48783006,"points":null,"story_id":48782671,"text":"Damn that was a cringe comment","title":null,"type":"comment","url":null},{"author":"wwind123","children":[{"author":"HappySweeney","children":[{"author":"wwind123","children":[],"created_at":"2026-07-04T18:35:55.000Z","created_at_i":1783190155,"id":48787659,"options":[],"parent_id":48784466,"points":null,"story_id":48782671,"text":"It&#x27;s not a brand new project, but a project I&#x27;ve been working on and off for the last half year. I was trying to add a major feature which required some big refactoring of the current structure of the code. With the previous models I&#x27;d expect many rounds of reviews and debates between the AI agents. But with Fable 5, there&#x27;s basically no debate, Codex and Gemini basically approved immediately. :)","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T10:58:36.000Z","created_at_i":1783162716,"id":48784466,"options":[],"parent_id":48783647,"points":null,"story_id":48782671,"text":"In my experience, audit findings decrease in frequency as a codebase matures.  
Was Fable doing greenfield work?","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T08:14:30.000Z","created_at_i":1783152870,"id":48783647,"options":[],"parent_id":48783006,"points":null,"story_id":48782671,"text":"I make Claude, Codex and Gemini review each other&#x27;s design plan and implementation. Each always found a lot of things the others missed...until Fable 5 came out. Whatever plan or code Fable 5 comes up with, now it&#x27;s very hard for Codex and Gemini to find any serious hole in it.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T06:03:47.000Z","created_at_i":1783145027,"id":48783006,"options":[],"parent_id":48782856,"points":null,"story_id":48782671,"text":"And I&#x27;ve been quite impressed. Opus talks the talk, Fable walks the walk.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T05:28:01.000Z","created_at_i":1783142881,"id":48782856,"options":[],"parent_id":48782759,"points":null,"story_id":48782671,"text":"Compared to Opus 4.8 I really haven&#x27;t been impressed","title":null,"type":"comment","url":null},{"author":"NitpickLawyer","children":[],"created_at":"2026-07-04T05:28:40.000Z","created_at_i":1783142920,"id":48782860,"options":[],"parent_id":48782759,"points":null,"story_id":48782671,"text":"I agree with you that you don&#x27;t <i>need</i> fable for everything, and you have to be careful on what you run it on. CRUD stuff, sure even the small models can do it. But there certainly are tasks that are very much suited for the absolute SotA and you&#x27;d leave money on the table by not using it. And how much a task is <i>worth</i> is dependant on how much it improves your bottom line. So the cost&#x2F;token becomes largely irrelevant.<p>Let&#x27;s take this [1] benchmark. A bit more context here [2].<p>Here models are asked to create kernels for running inference on models. This is a benchmark perfectly suited and highly relevant right now. 
It&#x27;s easily verifiable, an active area of research, and the results are immediately useful.<p>Say you have 1 unit of compute, it costs 300k $ and serves 1x users. In comes Fable and after one session it gives you 30% speed-up on your 1 unit of compute. It can now serve 1.3x users. How much is that one session worth for you? How much is it worth for a company using 10 units? 100 units? How much is it worth for a hyper-scaler running 10.000 units? How much is it worth for a lab that trains the next frontier model and then serves it from 100.000 units? 30% is relative. And the cost for one session is really meaningless. It can cost 1m$ &#x2F; session and it would <i>still</i> be worth it for someone.<p>[1] - <a href=\"https:&#x2F;&#x2F;kernelbench.com&#x2F;mega\" rel=\"nofollow\">https:&#x2F;&#x2F;kernelbench.com&#x2F;mega</a><p>[2] - <a href=\"https:&#x2F;&#x2F;x.com&#x2F;elliotarledge&#x2F;status&#x2F;2072814573753975266\" rel=\"nofollow\">https:&#x2F;&#x2F;x.com&#x2F;elliotarledge&#x2F;status&#x2F;2072814573753975266</a>","title":null,"type":"comment","url":null},{"author":"jacobgold","children":[{"author":"techpression","children":[],"created_at":"2026-07-04T08:09:59.000Z","created_at_i":1783152599,"id":48783621,"options":[],"parent_id":48783256,"points":null,"story_id":48782671,"text":"They can\u2019t keep their current models working on subscriptions[1], so we\u2019ll see if this is marketing or not in the future.\nIt\u2019s smart to tease it no matter what, \u201dinsert classic first hit is free drug reference\u201d.<p>[1] <a href=\"https:&#x2F;&#x2F;status.claude.com\" rel=\"nofollow\">https:&#x2F;&#x2F;status.claude.com</a>","title":null,"type":"comment","url":null},{"author":"zarzavat","children":[],"created_at":"2026-07-04T10:56:10.000Z","created_at_i":1783162570,"id":48784451,"options":[],"parent_id":48783256,"points":null,"story_id":48782671,"text":"Apple claims their price increases are temporary too. I&#x27;ll believe it when I see it. 
The capacity limitations are not going away any time soon.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T07:06:07.000Z","created_at_i":1783148767,"id":48783256,"options":[],"parent_id":48782759,"points":null,"story_id":48782671,"text":"Fable is supposed to return to subscription plans, unless I&#x27;m missing something: <a href=\"https:&#x2F;&#x2F;jacob.gold&#x2F;posts&#x2F;fable-5-removal-is-temporary&#x2F;\" rel=\"nofollow\">https:&#x2F;&#x2F;jacob.gold&#x2F;posts&#x2F;fable-5-removal-is-temporary&#x2F;</a><p><pre><code>  Anthropic says the change is about capacity and is temporary. In its launch announcement on June 9, 2026, it says:\n\n  &quot;After this point\u2014when sufficient capacity allows us to do so\u2014we aim to restore Fable 5 as a standard part of subscription plans. We intend to do this as quickly as we can.&quot;</code></pre>","title":null,"type":"comment","url":null},{"author":"vidarh","children":[{"author":"steveklabnik","children":[{"author":"vidarh","children":[],"created_at":"2026-07-04T17:18:33.000Z","created_at_i":1783185513,"id":48787055,"options":[],"parent_id":48786544,"points":null,"story_id":48782671,"text":"That sucks - I&#x27;ve thankfully only had that happen once. 
Similar thing - I was testing my new X11 server, and it turns out that broken X11 packets can make Firefox crash, and I got a refusal on a sub-agent request <i>Fable prompted</i> to have an agent narrow down why.","title":null,"type":"comment","url":null},{"author":"ianbutler","children":[],"created_at":"2026-07-04T19:26:37.000Z","created_at_i":1783193197,"id":48788149,"options":[],"parent_id":48786544,"points":null,"story_id":48782671,"text":"There&#x27;s a setting that stops fallback to Opus 4.8 if you would like to avoid that behavior.<p>Config -&gt; Switch models when a message is flagged -&gt; false<p>That should stop it entirely when a message is flagged and then you can come back without Opus having potentially made a mess.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T16:20:05.000Z","created_at_i":1783182005,"id":48786544,"options":[],"parent_id":48784346,"points":null,"story_id":48782671,"text":"I had fable running in a loop overnight last night, finding bugs. It found a heap overflow. That triggered its safety guards, which converted the thing to opus, leaving opus to run the rest of the night, wasting my precious time with Fable.<p>Oh well, it was pretty funny, all things considered.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T10:32:15.000Z","created_at_i":1783161135,"id":48784346,"options":[],"parent_id":48782759,"points":null,"story_id":48782671,"text":"I just had Fable run overnight in a loop, and it fixed ~150 compiler crashing bugs that Opus had kept deferring.<p>I wouldn&#x27;t <i>start</i> with Fable - when I use burndown loops I tend to include instructions to document progress and set aside anything that turns out to be harder than expected, and solve the easy stuff first. 
When a model runs out of easy stuff and starts struggling to make progress on what is left, I can let it keep churning on that - they get there eventually - or I can bump it up to a smarter model if one is available.<p>Opus had churned <i>a week</i> driving down spec failures, and did a great job. The 150 Fable took overnight were the ones Opus had kept putting aside.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T05:00:31.000Z","created_at_i":1783141231,"id":48782759,"options":[],"parent_id":48782671,"points":null,"story_id":48782671,"text":"Fable changes the game yet again, because it&#x27;s API-only.<p>You&#x27;re not likely to want to run Fable in a loop any more than you want to take a bunch of dollar bills and light them on fire. Every invocation of Fable has to be intentional, its context carefully managed. I feel like a babysitter.","title":null,"type":"comment","url":null},{"author":"bob1029","children":[{"author":"stingraycharles","children":[{"author":"themgt","children":[],"created_at":"2026-07-04T08:26:18.000Z","created_at_i":1783153578,"id":48783709,"options":[],"parent_id":48783009,"points":null,"story_id":48782671,"text":"Forgetting? I think you mean to say your advice was auto-compacted to keep our context small <i>and</i> deliver better results.","title":null,"type":"comment","url":null},{"author":"bob1029","children":[],"created_at":"2026-07-04T11:20:17.000Z","created_at_i":1783164017,"id":48784568,"options":[],"parent_id":48783009,"points":null,"story_id":48782671,"text":"I learned some nuance. Small context is important if the context isn&#x27;t cachable. 
If most of it is the same prefix every time, the situation changes pretty significantly.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T06:05:33.000Z","created_at_i":1783145133,"id":48783009,"options":[],"parent_id":48782882,"points":null,"story_id":48782671,"text":"You\u2019re forgetting that keeping the context small is better economically <i>and</i> delivers better results.","title":null,"type":"comment","url":null},{"author":"embedding-shape","children":[{"author":"fooster","children":[],"created_at":"2026-07-04T11:54:44.000Z","created_at_i":1783166084,"id":48784748,"options":[],"parent_id":48783812,"points":null,"story_id":48782671,"text":"I don\u2019t find that to be true?","title":null,"type":"comment","url":null},{"author":"stalfie","children":[],"created_at":"2026-07-04T12:21:44.000Z","created_at_i":1783167704,"id":48784888,"options":[],"parent_id":48783812,"points":null,"story_id":48782671,"text":"There&#x27;s at least one benchmark that attempts to measure this, but it has been running for a year plus so it&#x27;s quite infrequently updated now.<p><a href=\"https:&#x2F;&#x2F;fiction.live&#x2F;stories&#x2F;Fiction-liveBench-Mar-25-2025&#x2F;oQdzQvKHw8JyXbN87\" rel=\"nofollow\">https:&#x2F;&#x2F;fiction.live&#x2F;stories&#x2F;Fiction-liveBench-Mar-25-2025&#x2F;o...</a>","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T08:46:15.000Z","created_at_i":1783154775,"id":48783812,"options":[],"parent_id":48782882,"points":null,"story_id":48782671,"text":"&gt; melted away in the face of massive context sizes<p>If only. 
There is a huge difference between &quot;Gives good responses&#x2F;can easily spot things within N context size&quot; and &quot;Technically works but sucks within N context size&quot;, almost all models basically become cave-people once you go beyond 50% of the &quot;supported&quot; context size, meaning while they may technically work with 1 million output tokens, those last 500K tokens are gonna be massively &quot;dumber&quot; than the first 500k tokens.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T05:35:25.000Z","created_at_i":1783143325,"id":48782882,"options":[],"parent_id":48782671,"points":null,"story_id":48782671,"text":"A lot of the crazy ideas seem to have melted away in the face of massive context sizes. Today, I can put roughly <i>a megabyte</i> of utf8 text into my system prompt before things start to get weird.<p>That is a massive amount of information even if we are being sloppy with it. You can read The Hobbit and the first Harry Potter book cover-to-cover and still have room to spare. I would deeply struggle to develop a world model this detailed for any business. Anything that needs to get more specific than these narratives can be a SQL query tool into the data warehouse, grep over the codebase, MS graph API lookup, etc.<p>Giving the business a balanced way to collaborate over this one shared model of the world is a new challenge I am beginning to engage with. I&#x27;ve also noticed that the world model will compound on itself in terms of self-detection of update opportunities. 
The more constraints there are, the more likely we appear to violate one.","title":null,"type":"comment","url":null},{"author":"foobarbecue","children":[{"author":"stingraycharles","children":[{"author":"foobarbecue","children":[{"author":"foobarbecue","children":[],"created_at":"2026-07-04T05:51:14.000Z","created_at_i":1783144274,"id":48782953,"options":[],"parent_id":48782951,"points":null,"story_id":48782671,"text":"Aaand now OP has fixed it in the HN post title. Still wrong in the linked article.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T05:50:28.000Z","created_at_i":1783144228,"id":48782951,"options":[],"parent_id":48782917,"points":null,"story_id":48782671,"text":"Argh, autocorrect got me. Thanks, fixed.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T05:42:58.000Z","created_at_i":1783143778,"id":48782917,"options":[],"parent_id":48782905,"points":null,"story_id":48782671,"text":"You miswrote OP\u2019s miswriting in the third version :)","title":null,"type":"comment","url":null},{"author":"MomsAVoxell","children":[],"created_at":"2026-07-04T10:27:22.000Z","created_at_i":1783160842,"id":48784327,"options":[],"parent_id":48782905,"points":null,"story_id":48782671,"text":"He&#x27;s not referring to the real islands, but rather the state of mind imposed upon existence on Vancouver Island.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T05:40:36.000Z","created_at_i":1783143636,"id":48782905,"options":[],"parent_id":48782671,"points":null,"story_id":48782671,"text":"It&#x27;s &quot;Galapagos&quot; or &quot;Gal\u00e1pagos,&quot; not &quot;Galapogos.&quot;","title":null,"type":"comment","url":null},{"author":"martey","children":[{"author":"pluc","children":[],"created_at":"2026-07-04T07:57:05.000Z","created_at_i":1783151825,"id":48783534,"options":[],"parent_id":48782974,"points":null,"story_id":48782671,"text":"Nobody calls it 
that.","title":null,"type":"comment","url":null},{"author":"skrebbel","children":[{"author":"progbits","children":[{"author":"MomsAVoxell","children":[],"created_at":"2026-07-04T10:25:42.000Z","created_at_i":1783160742,"id":48784317,"options":[],"parent_id":48783706,"points":null,"story_id":48782671,"text":"I&#x27;m saddened to have gotten this far and had my bubble burst, I legitimately thought someone was on the real GI&#x27;s and was able to compose such a treatise, nevertheless.","title":null,"type":"comment","url":null},{"author":"mattlondon","children":[{"author":"fragmede","children":[{"author":"fuzzfactor","children":[],"created_at":"2026-07-04T19:31:37.000Z","created_at_i":1783193497,"id":48788200,"options":[],"parent_id":48785245,"points":null,"story_id":48782671,"text":"Could be upscale compared to Dry Tortugas, which is obviously less of a &quot;remote&quot; location.<p><a href=\"https:&#x2F;&#x2F;science.nasa.gov&#x2F;earth&#x2F;earth-observatory&#x2F;dry-tortugas-florida-9011&#x2F;\" rel=\"nofollow\">https:&#x2F;&#x2F;science.nasa.gov&#x2F;earth&#x2F;earth-observatory&#x2F;dry-tortuga...</a>","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T13:24:36.000Z","created_at_i":1783171476,"id":48785245,"options":[],"parent_id":48784484,"points":null,"story_id":48782671,"text":"Looks like there is Starlink though! <a href=\"https:&#x2F;&#x2F;dplnews.com&#x2F;cnt-ecuador-y-starlink-conectan-las-4-islas-de-galapagos&#x2F;\" rel=\"nofollow\">https:&#x2F;&#x2F;dplnews.com&#x2F;cnt-ecuador-y-starlink-conectan-las-4-is...</a>","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T11:02:14.000Z","created_at_i":1783162934,"id":48784484,"options":[],"parent_id":48783706,"points":null,"story_id":48782671,"text":"I visited a decade or so ago.  Most of the islands you are forbidden from staying overnight on. No running water. No power. 
No phone signal.<p>You&#x27;re going to need a big big solar panel to run local inference during the day, but luckily there is plenty of sun!","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T08:25:11.000Z","created_at_i":1783153511,"id":48783706,"options":[],"parent_id":48783686,"points":null,"story_id":48782671,"text":"Realizing he&#x27;s just using it to mean remote place in terms of AI bubble (Vancouver! What does that make all the other places that are not major tech hubs?) was a bummer.<p>Who cares about AI, I wanted to read about living in Galapagos","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T08:21:49.000Z","created_at_i":1783153309,"id":48783686,"options":[],"parent_id":48782974,"points":null,"story_id":48782671,"text":"Yeah I was like \u201cwoa a PGConf on the galapagos, i gotta get my ass to one if those!\u201d","title":null,"type":"comment","url":null},{"author":"dan-robertson","children":[],"created_at":"2026-07-04T10:41:20.000Z","created_at_i":1783161680,"id":48784391,"options":[],"parent_id":48782974,"points":null,"story_id":48782671,"text":"I think it\u2019s just a metaphor for being isolated from the wider tech community.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T05:56:13.000Z","created_at_i":1783144573,"id":48782974,"options":[],"parent_id":48782671,"points":null,"story_id":48782671,"text":"OP&#x27;s alt text makes it clear that by &quot;Galapagos Island&quot; they mean Vancouver. I assumed that this was some sort of local nickname, but all of the references to &quot;Gal\u00e1pagos of Canada&quot; I could find are talking about Haida Gwaii instead.","title":null,"type":"comment","url":null},{"author":"gwern","children":[],"created_at":"2026-07-04T06:26:08.000Z","created_at_i":1783146368,"id":48783094,"options":[],"parent_id":48782671,"points":null,"story_id":48782671,"text":"URL typo: &quot;hange how he works](&#x2F;productivity-velocity&#x2F;)&quot;. 
(I make this kind of Markdown syntax error all the time and set up a lint for &#x27;](&#x2F;&#x27;.)<p>You should talk to <a href=\"https:&#x2F;&#x2F;www.mechanize.work&#x2F;\" rel=\"nofollow\">https:&#x2F;&#x2F;www.mechanize.work&#x2F;</a> for sponsorship&#x2F;credits and about environments.","title":null,"type":"comment","url":null},{"author":"duckmysick","children":[{"author":"bitwize","children":[{"author":"imrehg","children":[],"created_at":"2026-07-04T11:36:23.000Z","created_at_i":1783164983,"id":48784642,"options":[],"parent_id":48784289,"points":null,"story_id":48782671,"text":"Should be, judging by their LinkedIn from the bottom of the page (Centaur Technology - Member of Technical Staff: 2005 - 2013)<p>This is a blast from the past. Centaur was doing VIA Technology&#x27;s CPUs, and I was at VIA (in Taiwan) while this author was in Centaur (in the US). I was on the embedded side, but I remember some distinct collaborations with the US team, so there&#x27;s a non-zero chance to have crossed path.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T10:19:37.000Z","created_at_i":1783160377,"id":48784289,"options":[],"parent_id":48784102,"points":null,"story_id":48782671,"text":"The first thing I started wondering was &quot;is this the same Centaur that comes up as &#x27;CentaurHauls&#x27; on a CPUID (EAX=0)?&quot;","title":null,"type":"comment","url":null},{"author":"rtpg","children":[{"author":"kordlessagain","children":[{"author":"fuzzfactor","children":[],"created_at":"2026-07-04T19:02:29.000Z","created_at_i":1783191749,"id":48787926,"options":[],"parent_id":48784974,"points":null,"story_id":48782671,"text":"Arf.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T12:36:09.000Z","created_at_i":1783168569,"id":48784974,"options":[],"parent_id":48784950,"points":null,"story_id":48782671,"text":"Dogfood everything, all the 
time.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T12:31:33.000Z","created_at_i":1783168293,"id":48784950,"options":[],"parent_id":48784102,"points":null,"story_id":48782671,"text":"I really wonder what &quot;randomized testing&quot; looks like in practice. What is the measure of success&#x2F;failure?<p>I understand for fuzzing you have a very basic &quot;doesn&#x27;t crash&quot; metric. Property based tests.... you gotta write properties for the PBTs to work on. What is the randomized testing hitting?","title":null,"type":"comment","url":null},{"author":"kordlessagain","children":[{"author":"disgruntledphd2","children":[{"author":"fuzzfactor","children":[],"created_at":"2026-07-04T19:03:23.000Z","created_at_i":1783191803,"id":48787935,"options":[],"parent_id":48784991,"points":null,"story_id":48782671,"text":"I&#x27;m with him 100% on these points:<p>&gt;For no particular reason, I&#x27;ve always liked designing experiments and measuring things. This was true long before I ever thought about careers and it&#x27;s still true today. As discussed here, measurement is one of the primary themes of this blog, maybe the primary theme. Lucky for me, this lifelong hobby has been something I&#x27;ve been able to make a career out of. And even luckier, this skill seems to have been made relatively more valuable by coding agents3.<p>&gt;3.And, coincidentally, as with testing, it happens to be a skill that gets developed a lot more at CPU companies than in typical software companies, even ones that produce highly performance sensitive products, like databases. Above, we estimated that the effort spent on testing at the CPU design shop I worked for was maybe a ~2:1 ratio over what you&#x27;d see in a traditional software company. 
When it comes to benchmarking&#x2F;evals&#x2F;experimental design, the denominator is low enough at traditional software companies that it&#x27;s hard to estimate the ratio, but it&#x27;s surely at least 10:1 and 100:1 and 1000:1 are plausible numbers as well.<p>&gt;Of course, by focusing on and developing much more expertise than software companies in these areas, chip companies are often relatively in the stone ages in a number of other areas. Relatively speaking, I&#x27;m also relatively weak in most of those areas. I think this works out ok in the context of a company, where it&#x27;s valuable to have people with complementary skills, but it can definitely cause some problems in interviews.\n[return]<p>Notice how much numerical difference Luu attributes to highly performance sensitive software products versus a traditional software company.<p>As for Claude and Codex in mid-2024:<p>&gt;At the time, I didn&#x27;t find this useful enough to use for anything where I knew what I was doing, but it enabled me to embed a little web game into that post and do other tasks that would&#x27;ve required me to learn something about an area where having actual expertise will probably never be particularly interesting to me, such as building a web app.<p>Me neither when it comes to web apps.  I always thought the fundamental thing with the greatest room for improvement was getting the most out of stand-alone electronics, before it made as much sense to even network them to begin with.  Looks to me that performance progress was less than halfway there when it got to be reversed in personal computers, and now more resources than ever are being put to keep personal computers from allowing users to get as much advantage as they could have been.  Basically more effort than ever imagined since before the 1990s, toward a return to centralized data centers instead, not much differently than we had with mainframes.  
The difference is that &quot;everybody&quot; has a &quot;terminal&quot; now but those user PCs are going to need to become less-capable as stand-alone <i>personal</i> computers, and more like &quot;dumb&quot; terminals otherwise a growing number of high-rollers will not be able to fulfill their massive vision as overwhelmingly.<p>So what if I thought it was fairly intuitive learning to program on mainframes back in the 1960s.  I was pursuing natural science not computer science, in ways people like Wozniak and Gates were not.  For me the programs are supposed to be a tool, and one that&#x27;s still not always necessary to achieve the optimum outcome in the research lab.  It took a number of years before I was able to pay for the experimentation by doing chemical testing in the lab, and was basically the only computer pioneer in my field for a while after that, as alternative labs also had some emerging &quot;computerized&quot; or &quot;digital&quot; instruments but not for user-programming.<p>Anyway, by the time the 1990&#x27;s rolled around, with all the overtime at the bench, it gave me about the equivalent of 40 years of intense testing in only 20 years.  From where I have never stopped, but it was mostly chemical testing.  Not as applicable to things like CPU design as it could be, but the hardware I build has to handle chemicals not only electrons.  Usually quite toxic materials, hazardous in other ways, and highly regulated.<p>And I was even more out-of-date by then on computer languages.  Much more of a disadvantage compared to digital hardware engineering like where Luu is coming from, since I was not the coder those folks were, and they don&#x27;t have the time to put most of their effort into the kind of code that software engineers have to do.<p>I didn&#x27;t code every year, just as needed, and &quot;shipping&quot; code was not an objective at all.  
By then I had founded my first company and I made more money using my code than I was before, or developed routines that made the work easier.  The LOB was mostly scientific and automation.  I figured anybody would prefer NASA-type performance if they could, so why not, I had no deadlines, and these are some risky chemicals.<p>It should be easy to see that the only unfair advantage I had was a lifetime of testing up the wazoo, even if only a baby part of that was testing code.  And I already could tell that testing had to be at least a bit lacking in the commercial software that had appeared (and is still often raising its ugly head).  As a total laboratory system I had always been more reliable than that and I was not ready to stop at all.<p>Ended up at 100:1 in testing:development ratio to hit the sweet spot.  That&#x27;s not relative to any other companies, just what it took to give lab business clients their money&#x27;s worth.  Of course lots of &quot;unit tests&quot; and code reviews were included.  With chemicals you must leave no stone unturned even more so than plain electronics.  Never could have done it if I wasn&#x27;t already giving clients their money&#x27;s worth without the code.<p>Otherwise the likelihood of unforeseen regrets is far too high for mission-critical application, and can not be very accurately estimated either.<p>The whole time it was on my mind how I would incorporate AI into some advanced automation, if PCs got powerful enough, but that was a bit of a dream 30 years ago.  AI was just unheard of for any type of everyday use.  I had to accept that if I ever <i>did</i> want to ship some software I would have to be more of an architect and let professional engineers rewrite my stuff in a more modern language that they have experience in.<p>Now with things like Claude and Codex I could be doing it all on my own, but nah.  
I&#x27;ve already got a well established lack-of-momentum and a couple years ago it looked like LLM autocoding was impending so I figured it would be good to see how AI develops from the sidelines.  While I&#x27;m past &quot;retirement age&quot; anyway.  Why would I settle for my own single-handed code when now I could have a <i>team</i> of software engineers who are using AI to help them.  Just like I always figured I would need a team to do all the coding in order to make a &quot;product&quot; out of my efforts since before the &#x27;90s.<p>If I get such a wild hair I&#x27;ll still have to do all the testing myself regardless :\\","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T12:39:40.000Z","created_at_i":1783168780,"id":48784991,"options":[],"parent_id":48784968,"points":null,"story_id":48782671,"text":"I feel like Dan is one of the most consistently interesting writers in tech at the moment.<p>This is most likely because we take similar approaches towards things.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T12:35:05.000Z","created_at_i":1783168505,"id":48784968,"options":[],"parent_id":48784102,"points":null,"story_id":48782671,"text":"I bake MCP tools into everything now, including doing screenshots. Any LLM can run just about every function, including resizing the window. I just watched Fabel 5 do a full usability test on a new project, copying the release cycle for my agentic terminal, relaunching the app like 20 times as it went, ensuring the move to a built and signed release was working. It installed the program like 5 times (something I do daily multiple times).<p>I noticed map tiles were not working and started to tell it, but then all of a sudden they reappeared and checking the logs it had found the issue and autocorrected itself.<p>The key here is feedback loops and systems annealing.<p>As for Dan, my God I love this guy. 
Glad someone posted it!","title":null,"type":"comment","url":null},{"author":"gregwebs","children":[],"created_at":"2026-07-04T15:04:48.000Z","created_at_i":1783177488,"id":48785928,"options":[],"parent_id":48784102,"points":null,"story_id":48782671,"text":"No code review by default goes against actual established evidence (there is little of this for software development practice) that code review is the best way to find defects.<p>I always get the impression from using hardware and other anecdotes like this that it is rare for hardware companies to know how to do software development well because their core competency is hardware. In fairness, it is uncommon for software companies to know how to do software development well.<p>Every form of testing has its value when done well and they are using several forms that most software developers don&#x27;t use, which probably helps make up for the lack of code review and unit tests. But if they incorporated code review and unit tests their software would likely be even higher quality.<p>Property based testing is amazing, but it won&#x27;t provide full coverage.\nRegression suites are amazing, but generally the most expensive form of testing in terms of time to write and maintain tests and time to run them.<p>Today AI can crank out unit tests so it&#x27;s silly not to have them.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T09:35:41.000Z","created_at_i":1783157741,"id":48784102,"options":[],"parent_id":48782671,"points":null,"story_id":48782671,"text":"I&#x27;d like to highlight a different part of the article:<p>&gt; In general, when I talk to software folks about testing, I&#x27;m coming from such a different place that they immediately look at me like I&#x27;m an alien, so let&#x27;s talk about how we tested at this hardware company I worked for, Centaur, which informs my biases about how I like to work. 
Some of the things that we did that were or are unorthodox in the software world are:<p>&gt; Hired dedicated QA &#x2F; test engineers, with testing being a first-class career path on par with being a developer - No code review by default - Virtually no hand-written tests - Constant testing via what programmers sometimes called property based testing, randomized testing, fuzzing, etc., although we just called those tests (hand-written tests were called &quot;hand tests&quot;). - Large regeression test suite (3 months wall clock to execute on compute farm) - No unit tests<p>Anybody here tried that (or a similar) approach? Especially going all-in on property based testing and fuzzing with no unit tests.<p>I tried that approach somewhere before and the initial results were promising, but ran into political issues so the idea was canned.","title":null,"type":"comment","url":null},{"author":"zapnuk","children":[{"author":"Aldipower","children":[{"author":"zapnuk","children":[],"created_at":"2026-07-04T17:13:13.000Z","created_at_i":1783185193,"id":48787011,"options":[],"parent_id":48784147,"points":null,"story_id":48782671,"text":"&quot;You can lead a horse to water, but you can&#x27;t make him drink&quot;<p>Reader view makes the text too narrow: <a href=\"https:&#x2F;&#x2F;imgur.com&#x2F;a&#x2F;yQqzxco\" rel=\"nofollow\">https:&#x2F;&#x2F;imgur.com&#x2F;a&#x2F;yQqzxco</a><p>Sure i could take extra steps to make it more readable, but at that point I rather not read it as im not that interested nor invested.<p>Totally fine, no complains from my side. But If the author wants to increase the reach of their posts they just might use agentic coding to have the llm optimize the website for readability.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T09:45:21.000Z","created_at_i":1783158321,"id":48784147,"options":[],"parent_id":48784138,"points":null,"story_id":48782671,"text":"Toggle to &quot;reader view&quot; or resize your window. 
It is up to you and really isn&#x27;t that hard.","title":null,"type":"comment","url":null},{"author":"layer8","children":[{"author":"baq","children":[{"author":"layer8","children":[],"created_at":"2026-07-04T16:35:29.000Z","created_at_i":1783182929,"id":48786698,"options":[],"parent_id":48786065,"points":null,"story_id":48782671,"text":"Because not all content is suitable to be stretched out 16:9 (or whatever the screen ratio of your desktop monitor is). In fact, in my experience most web site content isn\u2019t suitable for it. The default browser window size I use is closer to 4:3 for that reason. There\u2019s also software to auto-resize the window to different widths and to auto-place it to different horizontal positions depending on the website, that you can configure for frequently used websites.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T15:25:18.000Z","created_at_i":1783178718,"id":48786065,"options":[],"parent_id":48784729,"points":null,"story_id":48782671,"text":"Why would you work with anything other than maximized windows unless you have a super ultra wide display?","title":null,"type":"comment","url":null},{"author":"zapnuk","children":[],"created_at":"2026-07-04T17:16:20.000Z","created_at_i":1783185380,"id":48787034,"options":[],"parent_id":48784729,"points":null,"story_id":48782671,"text":"Actually HN handles it almost perfectly. The automatic line breaks are really good.<p>My only complaint would be that it&#x27;s sometimes difficult to see the parent comment(s) when you scroll down.<p>HN could fix this quite easily by making the current parents sticky while you scroll.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T11:51:00.000Z","created_at_i":1783165860,"id":48784729,"options":[],"parent_id":48784138,"points":null,"story_id":48782671,"text":"There\u2019s a reason not to maximize your browser windows. 
How do you handle HN threads on that monitor?","title":null,"type":"comment","url":null},{"author":"ornornor","children":[],"created_at":"2026-07-04T13:53:59.000Z","created_at_i":1783173239,"id":48785413,"options":[],"parent_id":48784138,"points":null,"story_id":48782671,"text":"The point of a wide monitor for me is to have two 720 wide windows side by side, not a single gigantic 1440 window with impossibly long lines.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T09:43:36.000Z","created_at_i":1783158216,"id":48784138,"options":[],"parent_id":48782671,"points":null,"story_id":48782671,"text":"There is a reason we use left and right margin&#x2F;padding.<p>This blog is quite unreadable for 27&#x2F;32&quot; monitors.","title":null,"type":"comment","url":null},{"author":"cognitiveinline","children":[{"author":"layer8","children":[{"author":"cognitiveinline","children":[],"created_at":"2026-07-04T12:32:32.000Z","created_at_i":1783168352,"id":48784955,"options":[],"parent_id":48784714,"points":null,"story_id":48782671,"text":"There&#x27;s a lot people will do for equal quality and faster work at a 100th the cost. Like being OK with character changing and adapting to it. 
You are not seeing this in your workplace?","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T11:48:49.000Z","created_at_i":1783165729,"id":48784714,"options":[],"parent_id":48784528,"points":null,"story_id":48782671,"text":"On the other hand, you have many more humans to choose from than models, and they don\u2019t change their character every few months.","title":null,"type":"comment","url":null},{"author":"baq","children":[],"created_at":"2026-07-04T15:26:02.000Z","created_at_i":1783178762,"id":48786070,"options":[],"parent_id":48784528,"points":null,"story_id":48782671,"text":"It\u2019s going to even out on both eventually, the diffusion will be painful though.","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T11:13:06.000Z","created_at_i":1783163586,"id":48784528,"options":[],"parent_id":48782671,"points":null,"story_id":48782671,"text":"It&#x27;s really coming down to &quot;Do we want to subscribe to a human with a salary of many ~$10000(s)&quot;, or &quot;Do we spend $100(s) on an AI subscription&quot;<p>Even with their issues, the latest models are going to disrupt the labor economics.","title":null,"type":"comment","url":null},{"author":"nasretdinov","children":[],"created_at":"2026-07-04T12:15:29.000Z","created_at_i":1783167329,"id":48784858,"options":[],"parent_id":48782671,"points":null,"story_id":48782671,"text":"I can agree with Dan on two things: that LLMs do often produce incorrect results and that they&#x27;re still useful for productivity when used in moderation. For me the wrong results actually cause some kind of ragebait response so I become much more motivated to learn more about the subject to actually generate a correct response. After I&#x27;ve learnt the subject area enough I find I&#x27;m better off having an LLM review my code instead of writing it.<p>I haven&#x27;t even begun to try to comprehend how to use fuzz testing to improve the ability to find bugs, but it sounds really interesting. 
I&#x27;ve seen mutation testing to be very useful for finding gaps in tests, so I can only imagine that fuzzing + LLMs might produce insane results.","title":null,"type":"comment","url":null},{"author":"joshka","children":[{"author":"jujube3","children":[],"created_at":"2026-07-04T20:48:52.000Z","created_at_i":1783198132,"id":48788938,"options":[],"parent_id":48784971,"points":null,"story_id":48782671,"text":"TL;DR: a webshit used AI. It was kinda cool, I guess?","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T12:35:16.000Z","created_at_i":1783168516,"id":48784971,"options":[],"parent_id":48782671,"points":null,"story_id":48782671,"text":"TW;DR (too wide didn&#x27;t read) ;)","title":null,"type":"comment","url":null}],"created_at":"2026-07-04T04:37:15.000Z","created_at_i":1783139835,"id":48782671,"options":[],"parent_id":null,"points":158,"story_id":48782671,"text":null,"title":"Agentic coding notes from Galapagos Island","type":"story","url":"https://danluu.com/ai-coding/#appendix-agentic-loops-and-writing-this-post"}
