435: How to Actually Use Claude Code to Build Serious Software

Download MP3
Arvid:

Hey. It's Arvid, and this is the Bootstrap founder. I've been using Cloud Code for over half a year now, pretty much exclusively to build my platform. I haven't really deviated much to other tools, and I think I have gathered enough experience with the system that it's time to share what I have learned about using Cloud Code effectively to build this kind of nontrivial software service application that I am building as well. Here's what I found.

Arvid:

That's kind of the summary of this. A lot of the value of Cloud Code is in its configuration and in the correct prompting. When you're just starting, you might think that the value is really in the code it generates. It's kind of true. But honestly, most of the value of Cloud Code is in the code it doesn't generate, so you don't have to throw it away.

Arvid:

The better you are at setting a really solid system prompt at executing the AgenTic loop correctly with the right permissions and the right interceptions and restrictions, the better your experience with Cloud Code will be. And I will talk about just how those things work today. First off, any plan of Cloud Code will work for whatever you might need, and I mean subscription plans here. Right? The moment you start using it more heavily, if you start using it for feature building or for extensive testing for background processing for other things than just creating code, you will likely exceed the cheapest plan.

Arvid:

I think it's $20 a month. You wanna look at the max plan, and it tends to be what I would recommend at this point because then you can just keep it running and you will see why this matters in a second. Cloud Code released an integration with Chrome, the browser, a couple months ago. So you can start Cloud with the dash dash Chrome flag so that it connects to your Chrome instance, and you really just need to install a Chrome extension in your browser. And then you have this bridge to the browser that allows Claude to operate within that context of the browser for you.

Arvid:

So if you don't do anything else from all the things that I tell you today do this one thing allow Claude to see your application use the Chrome flag to literally be able to click around in it and take screenshots and move things and inject things and investigate the DOM and all the variables that run within it, that is incredibly powerful. It's very smart to connect your cloud instance to this running version of your software. With the prompt and within the operational loop, it can then be tasked with things like, look at this page, scroll down a little bit, and see how this one component overlaps with another. That's quite literally something that I used earlier today to fix the display bug. Or you can say something like, look at this page, inject some data, and see how it shifts the layout in weird and unexpected ways, which is a bug I had a couple days ago.

Arvid:

So I just had it add data to the running application inside the browser to record what happened, And Cloud will do this. It will try to recreate the situation, take screenshots, and then actually investigate what's going on by diving into both the visual, by using screenshots just to see what's going on, and the hierarchical structure of the page, by diving into the document object model, the DOM, and looking how things are arranged, looking at style sheets and all that. It's very powerful. I've done things over the last couple weeks like asking Claude to go into the navigation on the left and click on each item and see if all the first pages that come up there look similar or which one looks slightly different and might that need to be changed to create a consistent design language. And Cloud is incredibly good at this because it can navigate your website easily, can click, it can read, it can see, and then build this internal representation of that structure and work with that.

Arvid:

It also makes it incredibly easy to build features that have visual components, which a lot of software tends to be. Instead of hoping that it looks right or having to manually check it, you can now task Claude to build something, quite literally look at it, and then refine it. That's super enjoyable just even to watch it build. Like, try this out in an established code base and just see it manipulate paddings and just checking stuff. It doesn't have to imagine it because it can just check it out.

Arvid:

It's very powerful. It goes hand in hand with one of the biggest inventions in agentic usability that I have come across over the last months ever really since I started using this. That is called the Ralph Wiggum plugin or the Ralph Wiggum loop. And this might sound surprising because, obviously, Ralph Wiggum is a character from The Simpsons, so what does that have to do with agentic coding? Well, Ralph embodies this philosophy of persistent iteration despite setbacks.

Arvid:

Right? You try something, doesn't work. You try again, doesn't work. You try again, doesn't work, try again, until eventually it does. That's the whole idea of this loop approach.

Arvid:

It's not try it once and then stop if it doesn't work that is what a smart person in the world of The Simpsons would do instead you set this goal you set this completion promise and until that promise is fulfilled do you repeat working on it forever if necessary? If you need three iterations to get there that's probably one of the best cases if you need 30 okay maybe that's what it takes and if it needs 300 maybe it takes that it still will keep working on it until it reaches that goal state the desired promised state and it's a very mindless but also very thoughtful approach because it puts the idea of failing as information at the center of its core. Failure is a good thing here because it teaches us another way of how a thing is not to be accomplished. Success is when we ultimately reach the state of the promised goal but every failure state along the way removes one further potential state that we might need to check. The way it works is that you describe your task like you usually would in a prompt and then Cloud works on the task and then once it finds a state where it thinks it has something it tries to exit that task And that could be because it's done, it could be because it ran into a problem, or because the API didn't work or whatever.

Arvid:

Right? There's a stop. But the Wiggum loop, the Ralph Wiggum loop has a stop hook that blocks the exit. And instead of going back into the prompting state of Cloud Code the stop hook feeds the same prompt that you initially had back into the loop and until the task is completed, until this completion promise is uttered in the execution of the task that loop just keeps going right you tell it in the prompt that you want to say everything fully finished at the end and until it finds everything fully finished it'll just keep going and this is a different approach than what Cloud Code currently does by default right now it tries the thing or whatever it might be and the moment there's a problem it exits and asks you to do something the Ralph Wiggin plugin allows it to keep working and iterating on a solution by trying new things without your intervention. And I have a strong feeling that Clothecode will adopt this methodology eventually into its main loop maybe as a mode that it will have you just turn it on and it has this endless loop until you reach this final state that you want to accomplish but right now this plugin is how you get that behavior And the plugin, you can find in the official plugins repository, you can just type plugin in your cloud code, and then in the official plugins, you can just scroll down a little bit or just type ralph, and then you will find the plugin right there.

Arvid:

You install it, and then it should work immediately. When you run it for the first time, it asks you if it should be allowed to run, and then you say yes, and then it starts. And allowing things, and maybe more importantly, restricting things for Cloud Code, that's the next thing you need to get good at. It's pretty critical. You don't want to have to be there all the time saying, yes, you can do this.

Arvid:

Yes, you're allowed to run this command, or saying, no, don't do that. Do something else. For commands you know that are perfectly fine to run just automatically without confirmation, you will wanna persist them as permissions in the allow array in your settings dot local dot JSON, in your Cloud Code subdirectory, I think. It might even be in the the main directory, not quite sure. Doesn't really matter.

Arvid:

Cloud Code will automatically do this for you ever if you say yes and allow this in the future from the options that you might get if it asks for something. It will automatically add that line. And you probably wanna have a couple of commands in there, likely the Ralph Wiggum command, the skill, or skills that you wanna use in the future, other things like GitHub push or whatever, or particular build commands in your framework of choice. But just importantly, you wanna look into the deny array in your permission setting. There might be commands, particularly when it comes to testing and setting up environments, that can be quite dangerous.

Arvid:

Most of the time, it's fine if we wipe the test database, which we just created for the sake of testing anyway, but your local development system might have some state in your development database that you wanna keep because you're working on something else or you just wanna you have ingested a lot of data, you wanna keep it there. You wanna prevent your framework's tooling from wiping a database, from remigrating all the migrations or stepping back or reseeding it with data that overwrites what you're currently working with. So this will be different for any system that you use. But if you're using PHP and Laravel like I do, you probably want the PHP artisan db:wipe or the migrate:fresh or the db:anything commands in your deny list because the moment it's in the deny list cloud will try it at its worst it will fail and then it will likely stop to ask you what to do. So at that point, you can figure out what the agent is currently trying to do.

Arvid:

Is it really trying to delete my database, or is it just like stepping back one thing, like one migration to test something else? And if so, what is it trying to do? Is it working on a test database? Is it on my dev one? That's probably not what it should do.

Arvid:

Is it trying to connect to prod? Oh god. That's not. Right? This is what deny permissions really help with.

Arvid:

Here's something important though Claude is quite smart and it might see that you're blocking a command with a rejection so it might try to create a new bash script file that contains the command and then tries to run that bash file so if you've allowed running all bash files at once it can circumvent your restrictions that way so you have to be both very restrictive even in what you allow Cloud Code to run because it might find alternative ways around your restrictions. I've run into this a couple of times. I now always have a backup of my local database readily available in my Dropbox because I know it might just break out and destroy stuff. So we've got to be careful with that. And since we're talking about completely agentic systems now with the Rigum Loop, you want to have some vigilance.

Arvid:

It really matters. Now I have another tip here. Ask your Cloud Code to write tests for every single thing it builds. Because testing is something that I never did before. It has become extremely easy with Cloud Code.

Arvid:

Cloud knows how to write tests for the language that you work with, for the framework that you're running. It just knows how to write them, how to execute them. This is common knowledge in the community, and it knows how to safely execute them if you define it well. If you have a testing environment variable in a testing database that's isolated from your development and production staging systems, which I recommend, obviously, that's kind of a best practice, then you can ask Claude to test for you if you already have a collection of tests it will just test them see if they still work and if you're building a new feature you should always in the prompt that you're inputting or in the system prompt that you should have to begin with ask it to create tests and make sure that all tests pass after new features have been built. One thing that I've learned, this is a timing issue really, you tend to want to iterate a little bit before you add tests.

Arvid:

This is not test driven development here. This is agentic driven development. And usually it's smarter to build the feature first and build it out bright and then build the tests on top. If you go test first with an agentic system, iteration becomes problematic. The moment Cloud Code starts experimenting and tests start to fail and it tries to keep them right, it becomes this weird interference loop where there is more focus on getting the tests right than to get the iteration closer to your goal state.

Arvid:

So let it build the feature first and make sure that before you commit, but after you're done with building the feature for to build tests around it. That's what I found works best in my experience. Now I mentioned a system prompt just now and I think that's something that we tend to forget. We start Claude and we just run, and it works. But the system prompt can be really helpful.

Arvid:

And for it to be effective, it has to kind of stand on its own. If you look at the tools provided by there's a company called Auxter or product called Auxter that's somewhere out there. I think it's a it helps with agentic systems. They have a repository on GitHub that contains the Auxter prompt, which I've been using as my system prompt or as part of my system prompt for the longest time. It's a very specific system prompt that encourages Cloud Code to run-in a loop, kind of like what Ralph Wiggum does, but more specifically as a definition of what good code writing looks like.

Arvid:

It's more of a description of the output and the person building it, this virtual developer, than just a process. Ralf Wiggum kind of sits on top of this. And I still recommend building a really solid system prompt for your project. So let's maybe look at what a good system prompt should contain in addition to this structural system pre prompt kind of that the OXR is. I know it's kinda weird, but it's like a system prompt for your system prompt.

Arvid:

Really recommend checking it out. It's kind of an XML shaped description of what good code writing looks like, what good quality is, what the priorities are. It's it's really fun to look into and see how these agentic systems are to be communicated with. But besides that, once you have this basic prompt, you should add more. And what you should add is, first off, a description of the product that you're building, their project and its capabilities.

Arvid:

And maybe more importantly, a description of the people who are going to be using it, the ideal user. That could be you for your own project. It could be your ICP, the customer that you really want to serve. The better you describe this person, the more this will allow Cloud to build tools and functionality that are aligned with what those people might need. It's really helpful for Cloud to know who this is for because then the automated systems that build features have a better understanding both of user capabilities, like what do they know already, and with their expectations, what do they want.

Arvid:

And obviously, if you put all of this together, like the AgenTic Loop and the System Prompt and the testing and all that, this is just right now. This all changes almost on a weekly basis. The Ralph Wiggum thing is like a couple weeks old. But at this point, with a system prompt that encourages testing, a couple of things that you allow to be automated, a lot of things that you restrict to that nuclear local system, and then kind of supply it with this operational loop like Ralph Wiggum or even just the Oksutra prompt and browser access using the Chrome flag, I think you're probably going to be in the top one or 2% of Cloud Code users at this point if you use this. The real insight here isn't any single tip.

Arvid:

I think it's the value of agentic coding that comes from how you configure and constrain it, not from the raw code generation. That you should always check anyway. But set up your environment right and Cloud becomes this genuine self contained collaborator. And leave it unconfigured and you'll spend more time cleaning up messes than building features that you actually want to build. So I hope this gives you some insight into what you can do to level up your Cloud Code workflow.

Arvid:

If there's more to share as tooling evolves, I'll keep you updated. And that's it for today. Thank you so much for listening to the Bootstrap Founder. You can find me on Twitter avid kahl, a r v a d k a h l. If you're building a product and you wanna know what people are talking about or your competitors in the world of podcasting, that's exactly what PodScan does.

Arvid:

That's We monitor over 4,000,000 podcasts in real time and we alert you when influencers or customers or competitors mention your brand. And we turn all that unstructured podcast conversation into competitive intelligence and opportunities insights at podscan.fm. And if you're a founder hunting for the next idea, well, then check out ideas.podscan.fm, where we have a system that surfaces startup opportunities, like startup business ideas from hundreds of hours of expert conversations daily because they talk about validated problems straight from the market. Thanks so much for listening. Have a wonderful day, and bye bye.

Creators and Guests

Arvid Kahl
Host
Arvid Kahl
Empowering founders with kindness. Building in Public. Sold my SaaS FeedbackPanda for life-changing $ in 2019, now sharing my journey & what I learned.
435: How to Actually Use Claude Code to Build Serious Software
Broadcast by