356: James Phoenix — Mastering Code & AI for the Modern Developer

Arvid:

Hey. I'm Arvid. Welcome to the Bootstrapped Founder. Today, I'm talking to James Phoenix, an AI expert and software developer who'll teach you how to approach building software projects quickly and effectively with AI tooling. We will dive into the limitations and promises of Cursor and similar LLM-based coding environments.

Arvid:

We will talk about the future of software engineering as a whole, and James will give you the biggest, most high-impact things you can do to get up and running with Cursor. He helps me understand how to get things going with it, and it's really, really cool. This episode is sponsored by Paddle.com. If you're looking for a payment provider for your business that lets you build the things you wanna build instead of having to deal with things like taxes and invoices, go check out Paddle. It's worth it, I use it too.

Arvid:

Now, here's my conversation with James. James, you're the coauthor of the book Prompt Engineering for Generative AI, which is an amazing book. I've been reading it over the last couple weeks, and I've been really, really enjoying it. I've been learning loads. So you're an expert on getting AI systems to do what you want them to do, and you're an engineer.

Arvid:

So we have a lot of overlap here. Is coding still the same thing to you as it was back in the day before AI? I guess the question is, has your understanding of what software development fundamentally is or is about changed over the last couple

James:

years? Yeah. So a really great question, Arvid. I think the way that it's changed is we're all being lifted up, and we're doing less menial work. So, you know, back in the day when you had people using ChatGPT

James:

to write blog posts, there were a couple of people that were playing around with the idea of generating React components with GPT-3, and it didn't really have enough cognitive reasoning power to actually produce high-quality code. And what's happened is, in the last year and a half, we've now reached the point where LLMs are almost better programmers, or are able to generate high-quality code directly from just one example, or not even an example. And so I think that we've all reached the point where we should be AI first, and we should primarily act like an engineering manager, where you have almost a bunch of juniors that are going off with a few project or task requirements. They're basically executing that, and then they're coming back to you with some output, and you're either giving them guidance or telling them that they're wrong. And that feels like a lot of what I'm doing. Yeah.

James:

Doing it with Claude right now, I'm sort of treating it like a junior: when it comes back, I go, you're like 80% of the way there, but you're not 100% of the way there, and that can happen. Or it just gets it absolutely right and nails it, and I'm like, you're the best intern, considering the price point, you know?

James:

So, yeah, that's kind of where I think we are at the moment.

Arvid:

Yeah. It does feel, and I've been AI first for a while too, at least over the last half year while I've been building my own business, like an ongoing pull request. Just this constant pull request that's happening every single day, several times an hour. But you're just saying, yep, sure.

Arvid:

Oh, no. Here is a mistake here. I want this different. Okay. Now try another way.

Arvid:

Maybe you can do it better. This is what coding with AI feels like now, much more than before, where you would iteratively explore through reading documentation, through trying to take other people's examples and shape them into what you need. It feels much more like a judgment call now than an intellectual exercise in creating, like, a logical structure. Do we atrophy? Do you think this might be a problem for engineers who are afraid? Like, there's a lot of engineers out there who try not to touch it.

Arvid:

Right? Because they think the skill set might be going away. Do you think that's the case?

James:

So ProvoGen did make a good point about a week and a half ago on a video, which was that we're gonna run into this problem of beginner experts, where you basically have someone who uses Composer and creates a full-stack app, but they've got no idea how it works. And that's a hugely dangerous point. Right? Like, you've just built this entire app, and you've basically fumbled your way there by working till midnight and 2 AM and just saying fix it, fix it, fix it. So I think there's a danger there where we basically just end up prompting too much.

James:

So what I tell most people is you should basically always understand the lower-level primitives. So if you're working in React, you need to understand state management. You need to understand useEffect. You need to understand the rendering cycle. But when you're generating the code, you don't need to understand that code unless you need to fix that implementation.

James:

So it's almost like, because you've been lifted up to an engineering manager, you no longer need to understand every part of the implementation. However, when the implementation needs to change, then you obviously need to dive into the code. And, also, if you don't understand those lower-level primitives, then what happens is you're gonna spend a lot more time in what I call the learning and discovery phase of Claude. So I think there are sort of two main ways that we're using any kind of LLM. The primary way is you're using it in work mode.

James:

So: here's a task, here's context x, you know, generate me outcome y. And, basically, that's how most programmers work. But then there's another side that a lot of the juniors will use, which is that it'll write some code, and a good junior will say, I don't understand this. Can you explain this to me? And they'll be feeding off Claude and sort of upskilling themselves, almost like one-on-one mentoring.

James:

And so what's interesting is that a senior will do a lot more work mode, and a junior will do a lot more learning and discovery mode. So if you're learning a new language and you don't understand the syntax, that's the point where you should flip modes and say, you know, you've just generated this code. Great. You've solved it, but I didn't understand these three bits. And so that's where I think you can get a lot of power from using LLMs: you can ride off the back of the implementation as long as they describe those lower-level primitives to you in real time.

Arvid:

Yeah. I think there's also a discovery mode on the other side, on the expert side of things, where you are so siloed in your own knowledge situation, where the only thing you know is the stuff you've always been doing, like your best practices, what you think is right, that when you present your work to the AI, to any LLM that has these capabilities, and tell it, well, find me alternative ways to think about this, it can now look at it from an outside perspective.

Arvid:

I found this extremely useful in just trying to curtail the limited vision that I have into my own software. Do you suggest people lean into that more? Because you say a lot of experts are in work mode because they need to get stuff done. Should they also be in this kind of explorer mode?

James:

Yeah. So I think there's a danger of committing to one implementation too fast, because with these types of tools, you can just generate a load of code. And so one thing that I often do, and I actually do this in Cursor, just in the chat window, is before making a larger decision, I will ask it, like, here's all my code. Here's what I'm thinking about doing. Here's a couple of different approaches.

James:

Which approach do you recommend, and why should I go with that? And I'll give you an example. I'm building a data pipeline at the moment, and I had a choice between using a callback architecture versus a polling architecture. And I gave it all the code, and then I said, which do you think we should do given this kind of implementation that we've already got? And, you know, then it comes up with a series of trade-offs, and that can be a really great way for you as a senior, or someone that's an intermediate, to not just dive straight into, let's produce it, let's get it working.

James:

So I think there is some value there.

Arvid:

Does it ever suggest stuff to you that is way out of your field of understanding? I'm thinking, I've done similar things. I've had: I need this, and I have many different approaches. Could be a callback, could be polling, could be an RPC. And then it started suggesting, like, what is it, an Apache Kafka situation, a message queue implementation, something that I've maybe done before but probably wouldn't have thought of in this particular infrastructure thing.

Arvid:

So how much exploration should be put into all the different varieties? Because it feels like you have this kind of selection overload. There's this overwhelm of all the different things you could be doing. How do we judge this when they're presented so eloquently by the LLMs?

James:

Yeah. So I think it all comes back to, well, this is kind of an architectural question, but, yeah, you should always choose the simplest architecture, and you should always have the fewest lines of code. So, ideally, you should have no code, and you should have a simple architecture. You should basically start with those premises. Assume that all code is basically rotting in real time, and, you know, there's that phrase: software is a lot like milk.

James:

You know, you have to keep going and improving it over time. So it has a high maintenance cost. So I think when you're thinking about architectural patterns, it's basically always the simplest. So as long as you can get away with a simple architecture pattern and it's not gonna shoot you in the foot in, like, a couple of months, then you should just generally go for those as a software engineer.

James:

Yeah.

Arvid:

When you start with a project, a new project, greenfield, or maybe even a legacy project, do you first go through this kind of infrastructure and architecture phase, or do you dive right into prototyping?

James:

Right. So this is gonna sound crazy, but I actually start with a readme file. And I basically say, okay. Right. Let's map out all of what we want to achieve.

James:

And they won't be high-level goals, like we're gonna create a CMS. There'll be subgoals within that, because a CMS is too vague of a concept. You've got things like Contentful and WordPress. But if we start to dive into some lower-level subtasks, then what that's gonna do is you can basically take a boilerplate and this highly refined product specification that's almost written by a product manager. And you can basically say, here's my boilerplate code.

James:

Here's my product features. I need you to start banging out SQL migrations in Cursor to basically get it from this boilerplate, which does auth and payments but doesn't have any of that business logic built in, and rightly so. But then combined with that readme, you can actually go really far. So that's actually how I start every project: I pull in some boilerplate, and I've already got a readme that's very rich.

James:

And, also, in that readme, I will include things like what API services we're gonna use, rate limits for those API services, all the kind of limitations, and sometimes I'll even throw in the API documentation and the links to those. So not only does it have the kind of product features, it also has the recommended tech stack, it has the recommended documentation, and it's also already got the boilerplate's database migrations built in. So it already knows where we're up to, because what you're trying to do is avoid going forward, then going backwards, then going forward, then going backwards. So as long as you bring it up to context in the sense of, we've got these three tables, we've got Next.js, we've got Stripe, and, like, here's what we're trying to do right now with these API docs and these types of subgoals, then I think that's a really good place to start.
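To make that concrete, here is a minimal sketch of what such a starter readme might look like. The product, tables, and services below are hypothetical stand-ins; the OpenAI Batch API limits are the ones James mentions later in the conversation.

```markdown
# Project: Podcast Clip Finder (hypothetical example)

## Goal
Let a user upload a podcast episode, transcribe it, and search the transcript.

## Subgoals
1. Upload an audio file and store it in object storage
2. Queue a transcription job and write results to a `transcriptions` table
3. Full-text search over transcriptions

## Tech stack (opinionated)
- Next.js boilerplate with auth and Stripe payments already wired up
- Postgres via SQL migrations (existing tables: users, subscriptions, payments)

## API services and known limitations
- OpenAI Batch API: max 50,000 requests per file, files under 100 MB
  (link to the Batch API docs here)
- Transcription provider: (link to API docs here)
```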

Arvid:

It does remind me a lot of the waterfall model of software, in many ways. It's, like, fully specced out, but the process itself is agile. Right?

Arvid:

Then you kinda let it build and you do this kind of, okay. This is a feature. This is kinda how I want it, sure, for now, but this is kinda different. It feels like a mix of different software development archetypes.

James:

Yes. Yeah. I think the reason why it's good to specify everything up front is because writing text is really cheap from an implementation point of view. And, like, say, for example, you define that this app is gonna have three features or five features. Maybe Cursor would produce different code if you actually told it it's gonna have these seven features.

James:

Right? So by kind of just saying the end goal of this app, you know, roughly (and we might change some things as we go along), it's gonna have these kinds of features. It's very cheap to do. It takes you 20 minutes to write this all in a readme file.

James:

And because of that, it already has the context when it goes to start building out new migrations, you know, for SQL or whatever database you're using, or new Mongoose schemas if you're using MongoDB. So because of that, you're already getting a lot of the benefits of the foresight. You know? You're trying to avoid living in hindsight and live a bit in foresight. So that's kind of the theory behind that approach.

Arvid:

That's cool. I like that, because you're anticipating that the model also anticipates your future features. Yes. It's kind of a nice, like, double anticipation here. I wonder, when you talk about the readme, I think this is a great approach.

Arvid:

I think I subconsciously use this. Now, when I write my own features, I just write them out in a Notion document, like, the way I want them, and then I throw that plus some files from my code base that are kinda like it, or that they have to fit into, into Claude and just let it generate a whole thing. That's how I used to do this. I recently jumped into Cursor. We can talk about this in a minute, but I wanna get back to this approach for a second.

Arvid:

How detailed are those steps? Because what I hear is that you have the features that you want in this kind of grand scheme of things. But when it comes to the data models and the APIs, you're getting very specific. So it's a mix of highly specific and general. Is that correct?

Arvid:

Like, how do you approach it?

James:

Yeah. So I think the key is to essentially give Cursor, or some type of LLM, the rough idea of what the app is doing. Then I think it's up to you as a developer to basically go in and start doing the first feature. And once you've done that first feature, you should go and revisit the readme with anything that you found that's gonna potentially change. And you can almost think of some readme or some instructions .md file as a place where you're storing the rough end-goal vision of what this app's gonna do.

James:

And I think it's dangerous to believe that things aren't gonna change. Right? You're gonna get into one feature, and you're gonna realize, oh, I need a different type of data structure or a different type of API service because of how the problem is. Right? As you explore the problem, you will incrementally gain knowledge around how to appropriately solve that problem.

James:

So, you know, for example, if you're using the OpenAI Batch API, there are some limitations on that. Right? Like, you can only put in 50,000 requests per file, and it has to be under 100 megabytes in size. So that's also why I think that if you put some of these limitations inside of the features, as well as just mapping out, like, I'm gonna build these three features, you can avoid some of those pitfalls of having to go back and change things. So a mixture of the subgoals and limitations and opinionated choices is gonna mean you're gonna have to do a lot fewer iterations when you start diving into that lower-level implementation.

James:

But you will always have some back and forth, for sure. Yeah.

Arvid:

So does this readme file, this kind of initial spark of the project, grow over time? Do you just add more things to it?

James:

Yeah. I do sometimes. It depends on how much it's changing. I would say that, in general, the first version of it is the most useful because that will give you a start and will immediately start pushing you into a feature. After you've done that feature, you probably have enough code as a reference point to then piggyback off, plus the readme, to then start building subfeatures or different features.

James:

So it's almost like you're using the readme as a sort of rocket ship to get you to understanding. But then once you've got enough code, the code itself is also partially the business context. Right? So I think there's an element of that, because as you start to build out different types of schemas in whatever you're doing, the LLM is very good at figuring out, oh, okay. Right.

James:

Cool. So if, you know, you looked at PodScan's schema, right, we'd maybe have a podcasts table, we'd maybe have a transcriptions table. And over time, you could give that schema directly to an LLM, and you wouldn't need the readme file anymore, because it knows that it's a podcast app. So there's this weird intermediate where, if you've got no extra tables and you start from a boilerplate, the readme kind of acts as a pick-me-up, like a rocket ship, to get you to that structured schema.

James:

And then from there, you can just pull in the schema, the migrations, and it's like, okay. Right. We're back to where we are. So I think it's more of, like, an initial jump start that you're sort of using.

Arvid:

Yeah. That makes perfect sense. Because you don't need to update it; the code is actually the truth of the project. Right? Instead of the wishful thinking that the readme expresses, it's the actual thinking that happened, plus its implementation, and the code gives a lot more information in the schema.

Arvid:

Yeah. For sure. I do wonder, and this is a thought that I'm having for the very first time (sadly, I should have had it way earlier): how important is the naming of tables, the naming of properties on tables, or whatever kind of properties you might have in whatever database system you have?

Arvid:

Because if the schema and its connectivity is so relevant for the LLM to understand the data and the kind of relationships that it has, how specific do you need to be when you name variables or fields in there?

James:

Yeah. So I think that we don't have to worry about this problem. And the reason why is, whenever I've used Claude or GPT-4o, the field names they suggest are already pretty good, and, you know, they're quite good semantically. So I think, yeah, if all your columns say something like time, col x, y, z, then it's a very dangerous place to be. Right?

James:

Because there isn't any semantic linkage or context. Generally speaking, LLMs, when they're providing these generations in terms of your database over time, they generally tend to be quite good. So I think it's almost a byproduct of the fact that, naturally, developers have written good field names, and that's already in the training data. So Claude's already kind of just saying, yeah, this seems like a good name for this column, or this key value in MongoDB, for example. Yeah.

Arvid:

It makes me wonder. Whenever I think about these machines that are now ingesting data that has also potentially been created by machines, how many kind of self-sustaining, self-aggrandizing problems we might get in the future, or benefits we might get in the future, from how these machines express things being what these machines then expect future text to also be generated as. I don't know if that makes a lot of sense to you, but, you know, this AI-ingesting-AI-results problem.

James:

Uh-huh.

Arvid:

Do you see this being a thing with AI-generated code?

James:

I think what they're gonna do is use synthetic data generation to solve this problem. So they're basically gonna use LLMs to generate data, and then from that data, they're gonna use human reviewers, and use those as the way to get around the fact that we've, you know, already used quite a lot of the training data for text. So that's mainly the reason why I think they're gonna use synthetic data generation to get around that.

Arvid:

Yeah. But what if these humans have been trained by LLMs? What if they've only learned everything they know about coding by discovering these kinds of things from LLMs? I mean, it's a joke. Right?

Arvid:

But it's kind of a potential problem.

James:

It is. It kind of makes me think that, actually, the frameworks and everything that we have at the moment are gonna be the primary way. I think because of what's happened with so much training data on Stack Overflow and online, there's a point of maybe we'll just stick with some of the frameworks that we're used to, because a newer framework is competing with so much training data, so many answers. So React's kind of become the gold standard, for example.

Arvid:

Okay. That makes sense to me. Yeah. That is an interesting point. The LLMs are so capable of dealing with these languages, these frameworks that they know, yet they probably would really struggle with something very new, something that people could probably pick up and, you know, deal with within days of it being released or invented.

Arvid:

Do you think this is a drawback for new paradigms in software engineering, that those machines are so trained on decades of the other stuff?

James:

I think the danger of it is, if you release a framework and it's not indexed into something like Claude or GPT-4o, you're gonna struggle with developer adoption, because someone is just gonna be so much faster in React or Laravel or any kind of framework that's got a lot of training data. So I think what's gonna end up happening is it won't stop frameworks being developed, but the newer frameworks won't be as good until web search and, in time, context retrieval is really very powerful. And what I would say at the moment is, even the documentation feature in Cursor is generally not that great. I normally just copy and paste the actual documentation page in, and I find that works a lot better than using RAG and chunked-up embeddings to get portions of the documents in. So I do think that in the short term, it is a disadvantage for framework developers.

James:

In the long term, it's probably gonna be okay when the context window is good enough that you can just throw in 100 pages from a web search, and it just gets the framework. Right? So in the short term, it's definitely a problem, but in the long term, I think we'll be fine.

Arvid:

I'm looking forward to the days where the interval between new information being out there and new information being integrated into the system gets shorter and shorter, where it's almost at a real-time level. Right? That's kind of my Star Trek fantasy: there is a computer system that just knows when things happen because it has this data inflow all the time. I guess that's what PodScan also is trying to help people with. Right?

Arvid:

That's why I've built this thing, for real-time information. So I'm trying to sell to companies like OpenAI and Anthropic, because they have training needs and I have the data. Right? So that's a thing.

Arvid:

But let's maybe get to something that the people listening to this can actually use, which is Cursor. I think you've been talking about this quite a bit. I started using Cursor three days ago, so I'm super behind on this. So I just wanna give you a really, really short glimpse into my first couple days of experiencing this tool, which were: oh, this is VS Code. And then I did something, and then: oh, this is not VS Code anymore.

Arvid:

Like, all of a sudden, what to me was an editor for decades, just a thing where I type text and maybe get some intelligent suggestion for a variable name that is already in the same file, turned into a wholly different experience. And I gotta say, initially, I was quite overwhelmed by the idea of allowing this system to write the code for me. I think I kinda slowly iterated myself into AI-assisted coding by first using ChatGPT for questions that I had, then by using Claude for just putting my prompts in there, taking the code, and manually transferring it over, like, you know, kinda the boomer way of coding, but that's kinda what it is. And now I got into Cursor, and all of a sudden, everything changed. And I feel overwhelmed.

Arvid:

So maybe you can give me the first steps that I should be taking to make this usable to me and to make it comprehensible to me, this tool that has so many capabilities that I probably haven't even seen 5% of at this point. So what are the best first steps here?

James:

Yeah. So I think for anyone listening, in terms of what you should be thinking about for Cursor: well, VS Code has basically got your GitHub Copilot, which is your autocompletions on your single file, and they have started doing multi-file edits now as well. But for anyone who's listening, Cursor has basically got three main modes. You've got inline edits, which is great if you're just changing a block of code, with Command+K or Ctrl+K. And you can even ask follow-up questions.

James:

So if it doesn't give you the answer you're looking for the first time, you can basically ask a follow-up question. Command+L or Ctrl+L is your chat mode. Now, I would personally recommend using that on one, two, or three files. If you're doing huge sweeping changes, you don't wanna use chat mode. It's too much effort to manually apply all of the different blocks of code.

James:

And at that point, when you're looking for that, you'll want to use Composer, which is Command+I, or Ctrl+I on Windows. So there are three different modalities. And in each of those modalities, you can also use the @ symbol, and I'm sure you've probably started using that quite a lot. It's really great. It's basically replaced that entire workstream of: I'm gonna go into Claude.

James:

I'm gonna copy my code in, or into ChatGPT, and then I'm gonna get the answer, and I'm gonna paste it back in. And there are a lot of files that you'd need to manually trawl through to make sure that Claude or GPT-4o has the relevant file context. So it's saving a lot of developers' time by just adding a folder or adding a file, or you can even do something called @codebase, which will basically look at the local embeddings. And given your query, it will find relevant files, which it will then pull in.

James:

And so, you know, when you've built a feature, you can even use @codebase to figure out what files it should actually be using. My personal favorite is actually the @ folder. I actually think that when you have direct control over what files are being referenced into the prompt, you're gonna get the best kind of generations. @codebase is kinda good, but, again, it's all based on the semantic similarity between the query and the file names and the function names and the class names and all that kind of stuff in those files. I think that you should default to Composer mode if you're building indie hacking apps, and you should only use chat mode and inline edits when you really need to make sure that a certain part of a workflow works really well, or when you're struggling in Composer and you've got 99% of it done and you're just working on a single file or a couple of files. That's when I start to switch into that kind of mode.

James:

There was a time about a month ago, when we didn't have the newer Claude model, that sometimes even Claude would struggle with features. And at that point, it's sometimes worth actually going in and looking at the logic yourself. So depending upon the intelligence of these models, you can give them bigger and bigger goals. And as you give them bigger and bigger goals, you should default to using Composer, because they'll be able to handle that multi-file editing experience. So it's almost like, as the models get more performant and more intelligent, we should generally be doing multi-file editing as the primary way of working on a project.

James:

And the only danger with this is, when you change a lot of files, you lose a lot of context. So you do have to go and read through those files and understand what's changed. Because if you don't, it's almost like you've given away a little bit of your project. Someone's gone and done it in some other country, and they've given you the code. They've slotted it in, and you've got no idea how this entire piece works.

James:

So that's the danger of Composer. Right? The other danger of Composer is regressions. So, you know, once you've done, like, 90% of a feature or a file, and then you say, can you add this? Sometimes it will actually randomly delete stuff in the file.

James:

And so you have to be like, no. Actually, put this back. So always be on the lookout for regressions when you're using Composer.

Arvid:

Do you think this will be something that the team at Cursor will fix, this kind of regression stuff, or is that an innate problem?

James:

I think that's an innate problem with the LLMs. Yeah. And we'll see that go away when we get new models. So, like, in a year and a half, that probably won't even be a problem anymore. But right now, it's a problem, because it's only focusing on a certain portion of the generation.

James:

And there is a feature that they've implemented in Cursor to kind of patch this, which is called Composer checkpoints. It allows you to revert, almost like a git reset or a git checkout back to a specific commit SHA, and you can do that directly in a Composer chat window. So you can already revert state back. So you can almost YOLO your way there, and then you can YOLO a little bit more. And if it fails, you can always go back to the previous YOLO. So it's not too bad.

James:

It's pretty good.

Arvid:

It reminds me so much of machine learning and the way we do gradient descent and these kinds of things, where you just wiggle the hyperparameters around a little bit to see if it's better or worse. It's so funny how all of these things come together in the approach that we have with AI. Like, this very experimental, but also iterative, way of building stuff. Like, does this work? Maybe.

Arvid:

Let's see. Let's try this. Let's try an alternative. Are there other things that you see in your own use of Cursor, or LLMs in general, where you have to look very specifically to make sure that they don't mess up too much? Like, you just said this kind of random deletion.

Arvid:

I had that too, in many ways, where it just forgot that that was part of the thing that I wanted it to build. Are there other things you have in mind?

James:

Yeah. I think when you start building distributed services, that can get quite tricky, because if you've got, for example, this API calls this other API, and then a load of jobs happen, and then those jobs get, you know, put into a table somewhere or into Redis, and then all of those get collected, that's an example of where you as a software engineer should focus a lot more on the flow of data and the potential errors that can occur between these distributed systems. What I have found is, the more monolithic the app, the better Cursor is. It's just like, yeah, I love this stuff. Great.

James:

So, you know, like, if you're using Next.js and you're not using an API back end and you've got it all in one monorepo and you've got types, you can just smash out CRUD like it's nobody's business, right? When you start having, like, an API server and you've got specific things like Step Functions, or you've got a Google Cloud workflow, anywhere where you're adding on cloud services and you're breaking up the flow of the data, that is where you need to spend a bit more time sense-checking. Okay, I have this service A, and it's calling service B, and then that spins up a load of jobs. That to me is where we should be spending our time, and we should also be doing a lot of the architectural work upfront.

James:

And one thing I do, so I did this recently, is you can use a tool like Excalidraw.com and you can mock up your architecture for a certain portion of the cloud. And then you can just push that into Cursor, give it your database types, and say, generate me the API routes to do this part of it. So it's still quite useful for converting architectural diagrams into service-level code that executes them. And yeah, those are my thoughts on that.

James:

The more distributed the system, the harder it is to implement these code generation tools.

Arvid:

Yeah. I've been trying to have it generate AWS-compliant policy JSON objects, that kind of stuff. You know? These bucket policies for S3, and sometimes it works, sometimes it doesn't. There's a lot of hit or miss in this as well, just in my own experience.

James:

Right. So my recommendation for that is, if it fails, like, more than twice, go to the website, go and get the relevant web pages, and just Command+A them in or @ them in using Cursor. And if that fails, then you've reached the limits of the current model; that is generally how I think about things. So if it does it without any context, then it's already baked into the training data. If it doesn't, you need to add the relevant context.

James:

That could be your database types. It could be the web pages on AWS. Maybe, for example, the policy API has changed, and then you need to go and fetch that context manually. If that's failing, then the LLM probably can't do it at that point, and you've gotta go and sit down and do it yourself, basically. So that's currently how I think about things, if that makes sense.

Arvid:

That does make a lot of sense. And it sounds like we're still doing the stuff that we used to do in the pre-AI days. We still try to find the proper documentation. Now it's just that we don't wanna read it ourselves. We feed it into the system, which then kinda turns the knowledge in there into the code that our brain would have produced, which is why the atrophy idea is so strong.

Arvid:

Right? We're almost there. We know where the documentation is. We literally copy the URL to the docs, but we don't read it. We just give it to the AI.

Arvid:

Right? And I don't wanna sound like an old man yelling at a cloud, but it does feel like a risk in the career of a software engineer to not read the docs, to hope that the machine does it enough. Do you still read docs? Do you even still manually write code at this point?

James:

Yeah. So my advice on this is just what we were discussing earlier. As long as you understand the primitives of what those docs are talking about, I don't read the docs anymore, just because, to me, the docs are describing ways of interfacing with some type of service via language. So as long as you understand that language... so, for example, if we take Terraform.

James:

Right? Terraform has a series of resources, reference names. You've got data. You've got outputs. You've got variables in Terraform Cloud.

James:

You've got this idea of Terraform state. You have the ability to do version control on your Terraform. And as long as I know all those things, the usefulness of me going and understanding a specific doc is fairly low in terms of impact in the long term. So for me, I'm actually more interested in learning more different primitives. As long as I know the primitives, like 80 or 90%, that's when I'm actually thinking, well, there's no point in me reading these docs, because, you know, web search, you can already do that with Cursor.

James:

You can actually say, every time you do a query, do a web search. But I basically think, yeah, the primitives are the main thing that you should be focusing on. So, for example, I didn't know a bit about for loops inside of Terraform. I said, explain this, break it down for me, and then I kind of understood it. So that to me is the usefulness of this stuff.

Arvid:

I think, generally, I guess, in any skill-based endeavor, it's good to know the fundamentals. You know? And then, from there, extrapolate the complex relationships between them, which is the same thing that these machines do. Right? They just turn it into code, while we turn into judgmental product managers.

James:

Yeah. Having said that, you can do things like, you can say, I'm learning about this. Here's what I know: x, y, and z. Can you teach me some new stuff? So I did that with... what did I do that with recently?

James:

I think it was with SQL, and it started talking to me about window functions and partitioned indexing. So you get into some really interesting and advanced topics, which I think is a way to grow your skill set. So you basically take anything that you're learning, you know, whether it's React or Next.js or Python, and you basically say, here's what I know. I wanna become a 10x engineer.

James:

Teach me what I don't know. And you can get some really interesting stuff out. Like, it was teaching me about property-based testing and mutation-based testing, which is, like, really advanced stuff, or, like, reverse debugging, which is what a 10x engineer will do. So there's some really great stuff you can get. And that to me is the usefulness, because you're exploring these higher-level fundamentals that are built upon what you already know.

Arvid:

Yeah. Definitely. And it's an infinite treasure trove of knowledge too. Right? Because for each of these things, you could dive into probably what is a lifetime of other people's knowledge and insight, just compressed into that one item.

Arvid:

Like, particularly with testing. I think we should talk about testing because that's one thing I've never done. I'm sorry. I'm outing myself as a non-test-driven engineer here. But do you test?

Arvid:

Like, do you let the thing write tests, and how much of that impacts the actual building part of your software development?

James:

Yeah. So I think I'm very much in two minds with regards to testing. Testing, if we just talk about the drawbacks: it has a lot of maintenance involved, and it will often mean that you're spending time setting up testing infrastructure. The benefits are that you can essentially figure out that something's working. And when you change something else, you don't end up with regressions, and, you know, you always know whether something has caused something else to fail.

James:

Now, for indie hackers, you guys should go all in on integration tests. Right? So Playwright is your friend. Don't go for Jest tests. Don't go for unit-level testing.

James:

Just basically go for: I have this one test that runs, and it tests the entire feature end to end. Right? That's it. So, you know, maybe that could be, for example, logging in, clicking on a table, uploading a PDF, and then clicking the go button. And then, after a bunch of API endpoints, a result comes back, and that's, like, one feature.

James:

And you can just write a single Playwright test to do that, and then you get the benefit of an integration test, which will test these larger flows without having to test all the bits of the app. Personally, I do write tests, but in my cursor rules file, I have Jest tests and Playwright tests and pytest tests. But, basically, what I do is I tell it, as I write the code, can you make empty tests? So I'm feature-first with empty tests, and sometimes those empty tests get filled if it means that the product is gonna be more reliable. So I think that, actually, there is value in both, and people that don't test at all have got it wrong, and people that are militant with testing have also got it wrong.
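As a rough illustration, a single black-box Playwright test for the flow described above (log in, open a table, upload a PDF, click go, wait for the result) might look something like this; the routes, selectors, and fixture path are hypothetical.

```typescript
import { test, expect } from '@playwright/test';

// One end-to-end test for the whole feature, not per-function unit tests.
test('user can upload a PDF and get a processed result', async ({ page }) => {
  // Log in (hypothetical routes and selectors).
  await page.goto('/login');
  await page.fill('input[name="email"]', 'test@example.com');
  await page.fill('input[name="password"]', 'password123');
  await page.click('button[type="submit"]');

  // Open the documents table and upload a PDF fixture.
  await page.click('text=Documents');
  await page.setInputFiles('input[type="file"]', 'fixtures/sample.pdf');

  // Kick off processing and wait for the API round trips to finish.
  await page.click('button:has-text("Go")');

  // The only assertion that matters: the end result shows up.
  await expect(page.locator('[data-testid="result"]')).toBeVisible({ timeout: 30_000 });
});
```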

James:

We're strictly talking about the startup space. If you're building an enterprise banking app, then you can't not test, because the risk of something going wrong offsets the cost of the tests. But when we're talking about indie hacking, I think you should be feature-first, with empty tests, and then you should fill some of the empty tests with Playwright integration tests. They're the best bang for your buck. They're called black-box tests.

James:

You're not testing every single, you know, function signature call. You're basically just testing: does the thing work? And you'll see Levels has done that with loads of his products. He's just writing integration tests.

James:

He's not testing at the function level or at the class level, and that's what you should do, in my opinion. Yeah.

Arvid:

Yeah. And when things break, you can dive into the test, dive into where it broke, and then look at that.

James:

There's also this concept called defensive programming, which is basically where, as you write code, if you run into something that shouldn't happen, then you should log an error or raise an exception. And, actually, that ends up partially being a testing suite in some way. So I'll give you an example. Like, if you have a job that should get created and that should create 10 jobs: in the next service, if you check whether those 10 jobs exist, and if they don't, you throw, like, a Sentry error, then that's a good way for you to not necessarily be testing the actual architecture, but as the app experiences bugs in real time, you're just logging a Sentry error.

James:

This shouldn't have happened; we just raise an assertion. So to me, that actually gets you quite a lot of the way anyway.
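A minimal sketch of that kind of defensive check, assuming a Node service using the @sentry/node SDK; the jobs table, the count of 10, and the countJobs helper are hypothetical stand-ins.

```typescript
import * as Sentry from '@sentry/node';

Sentry.init({ dsn: process.env.SENTRY_DSN });

// Hypothetical invariant: the previous service should have fanned out exactly 10 jobs.
async function assertJobsWereCreated(
  batchId: string,
  db: { countJobs: (id: string) => Promise<number> }
) {
  const expected = 10;
  const actual = await db.countJobs(batchId);

  if (actual !== expected) {
    // This should never happen; report it so real-world bugs surface like failing tests.
    Sentry.captureException(
      new Error(`Expected ${expected} jobs for batch ${batchId}, found ${actual}`)
    );
  }
}
```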

Arvid:

Yeah. Canary testing. That makes perfect sense. It's a very interesting approach. Like, you're going more for the general grand scheme of things, and you try to see: do things work, mostly?

Arvid:

And then you go back into the details. You mentioned one thing, the cursor rules file. What is this? Like, I've never heard of this before. Okay.

Arvid:

I mean, this sounds like something that I should probably know a lot about.

James:

Yeah. Yeah. You should know a lot about this. So there's a file, which is a hidden file. It's called a .cursorrules file.

James:

You're meant to put it at the root level of your directory, but I've actually found that it works better if you put single cursor rules files in different ones. So if you've got a client or an API, you should default to putting your cursor rules file in for your Next.js app or, you know, whatever it is, Vue or some other framework. You should put that in, and it should talk about Vue.

James:

And when you load your Cursor project, load it just from the client app if you're gonna work in the client for a bit, because cursor rules doesn't seem to work that well when you have multiple different folders. But, anyway, a cursor rules file is basically a file that will change the outputs by constantly feeding in a prompt. And there's a website that everyone should check out called cursor.directory, and it gives you a bunch of different prompts that you can then copy and paste. And a load of these are, like, you know, you're a Next.js developer using Next.js 15. You're writing server-side React components, minimize use client.

James:

You know? So it's trying to encapsulate all the proper conventions that ensure that when it produces code, it produces it in a certain style and a certain flavor that you as the developer like.
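For illustration, a short hypothetical .cursorrules file along those lines might look something like this; the conventions and file paths are examples, not a canonical template (cursor.directory collects community-maintained ones).

```text
You are a senior Next.js 15 / TypeScript developer.

- Prefer React Server Components; minimize 'use client'.
- Use the App Router and colocate components with their routes.
- For database access, use the existing typed client in lib/db.ts.
- When you create a feature, also create empty Jest unit tests and a
  Playwright integration test stub next to it.
- Follow the existing folder structure; do not invent new top-level folders.
```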

Arvid:

That sounds like something that all the big frameworks, like, I'm thinking about Laravel and React, should supply with the framework. Like, this opinionated stuff that goes into the framework might as well be expressed very clearly in a cursor rules file. I wonder if that becomes, like, a new standard, where it's not gonna be cursor rules forever, but for any tool that has this kind of capacity. Just like we have, you know, robots.txt or things like this.

Arvid:

Like, some kind of neutral format that allows you to express these kinds of things. That is very smart. That's cool. What else do you put in there?

James:

So anytime you experience a bug, because the LLM currently has that bug inside of it, that will also go in the cursor rules file. So I'll give you a good example of this. Supabase migrated maybe three or four months ago from their Next.js auth helpers to their SSR package, and that's a Node package. And so whenever Cursor was generating files for me, it would actually generate them using the old auth helpers. And I'm going, come on, man.

James:

This is ridiculous. Like, we've changed this. So in my cursor rules file, it now says: if you need to look at the client libraries for Supabase, always default to using the @supabase/ssr package, and they are in these folders. And then it knows exactly that it's already got the clients created, which have got types attached, and it knows that it shouldn't hallucinate and start using the old auth helpers. It defaults to using the newer package.

James:

So as you experience bugs through the lack of that knowledge being inside of an LLM and its pretraining data, you basically just add those into the cursor rules, with the hope that the new LLM, when it's trained, will have got all those bugs out of it. But it's almost like you're putting a Band-Aid on the fact that this LLM isn't perfect.
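As a rough sketch of the rule-plus-code pairing James describes, assuming a Next.js project on the @supabase/ssr package: the environment variable names follow Supabase's usual convention, the types path is hypothetical, and the exact server-side cookie wiring varies by version, so treat this as illustrative only.

```typescript
// Rule in .cursorrules: "Always create Supabase clients with @supabase/ssr,
// never with the deprecated auth-helpers packages. The clients live in lib/supabase/."
import { createBrowserClient } from '@supabase/ssr';
import type { Database } from './database.types'; // generated types, hypothetical path

export function getSupabaseBrowserClient() {
  // Browser-side client; server-side code would use createServerClient
  // from the same package, with cookie handlers, instead.
  return createBrowserClient<Database>(
    process.env.NEXT_PUBLIC_SUPABASE_URL!,
    process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!
  );
}
```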

Arvid:

And that's just a progression of these systems. They will get smarter. They will get better. They will integrate this in the future. It just takes some time to train them, and it probably is just a matter of time before these workarounds get outdated.

Arvid:

But it also sounds like something that developers should maybe share amongst each other a little bit more. Right? It sounds like something that, in an enterprise or any kind of team development setting, you could share amongst each other. Like, it's almost like a linter, but on a different level. Right?

James:

It's like a probabilistic linter. That's how I would describe it. Yeah. So, yeah, it's good. But, yeah, definitely use it for any bugs that you experience, you know, and also for following certain rules.

James:

Like, so, you know, when you produce an API endpoint, it needs to be structured like this. Or when you produce server actions, they should look like this, and you can put an example of the server action in there. So you're really trying to make it so that when you tell it to generate a new server action or a new set of API routes, it's gonna do the same thing, which means that you won't then have to go and change all of those manually or reprompt it. You're just getting the actual output that you'd want, kind of zero-shot, first time around. That's really cool.
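For example, a hypothetical house convention for server actions that you might pin down in the rules file (so every generated action validates input and returns the same result envelope) could look like this:

```typescript
'use server';

import { z } from 'zod';

// Hypothetical convention: every server action validates with zod
// and returns a { data } or { error } envelope instead of throwing.
const CreateNoteInput = z.object({ title: z.string().min(1), body: z.string() });

export async function createNote(formData: FormData) {
  const parsed = CreateNoteInput.safeParse({
    title: formData.get('title'),
    body: formData.get('body'),
  });

  if (!parsed.success) {
    return { error: parsed.error.flatten() };
  }

  // ...persist the note to the database here...
  return { data: { title: parsed.data.title } };
}
```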

Arvid:

Yeah. I'm really gonna use this, man. I'm looking forward to writing all my little idiosyncratic things that I want right into this file for the thing to do. Yeah. I never knew.

Arvid:

This is really cool. And that immediately makes this more usable to me because now I can really put my personal touch on this code base without constantly having to explain myself. That's really, really awesome.

James:

I have a question for you. What do you do when you have problems inside of your Problems tab at the moment? Like, what's your workflow? So, you know, when you generate code and you get these problems forming, do you do anything with those at the moment? Or...

Arvid:

Well, usually, I take a nap. But, yeah, the way I solve this is I give myself, like, 15 minutes to deal with it, maybe 30, depending on the complexity of the thing. And if I can't solve it, I just go back to a blank slate, almost to a checkpoint like you mentioned before, and try to do it again, but this time think about it differently. Like, take an outside perspective, maybe describe it differently: describe outcomes, describe user outcomes, describe infrastructure outcomes, describe features. I just try to switch it up, and then I go down the same rabbit hole again.

Arvid:

And usually, after one or two tries, I get to that point. Like, I think checkpointing is kinda what I do. Yeah.

James:

So the reason why I was asking about this is because there's a tab inside of Cursor called the Problems tab, and it will highlight all of these syntax errors and type errors that you experience. And one thing you can do is, when Cursor generates code, you get it to generate the code and then restart your TypeScript server or your Pylance server or whatever language server you're running. And after it's done that, if it's generated these errors, you go into the Problems tab, hit Ctrl+A, Ctrl+C, or Command+A, Command+C, and then you take all those errors and dump them back into Claude or some type of LLM. So give that a go. The next time that you're experiencing problems because there's, like, a type error or some type of syntax error, basically just get the LLM to generate the output, assuming you've got all the right context to start with.

James:

And then just say, I've got these type errors now after restarting my language server. And you can do that by hitting Command+Shift+P or Ctrl+Shift+P, depending on whether you're on Mac or Windows, and there'll be a Restart TypeScript Server command, or whatever the server is. And you have to restart the TypeScript server, because when you use Composer, it can generate 10 files, and it says that these imports aren't actually there. And then once you've restarted the language server, the number of problems will naturally decrease, and you're left with basically type problems, like compilation bugs, syntax bugs, and you can take all those and just dump them straight back in, and you can go around in a loop a couple of times. If you do it about three or four times and it's still failing, you generally won't find a solution. But one or two times, yeah, you can use the Problems tab like that, and it works like magic.

Arvid:

That sounds way more efficient than my nap-and-reset strategy. But, yeah, I don't think I've even gotten this far into Cursor yet, where I've noticed this happening. I'm just, like, trying to take it all in. But, hey, it's still extremely useful in so many ways, even if it comes up with problems. And I'm using, like, PHP, which is pretty resilient to types.

Arvid:

Like, it's fine. Right? There's a lot of type coercion there. So I'm not running into these issues. But I've had my little issues there, and I think, now that I've talked to you, I will have way fewer issues.

Arvid:

It's really cool. Do you ever code, like, without AI-assisted help anymore?

James:

Oh my gosh. That is the perfect question. Right. So the time to do that is when you're learning new primitives. So if you're learning a new programming language, like, if I was gonna go and learn some more Rust or Go, I would turn off tab completion, and I would code in a vanilla style, basically.

James:

And the reason why is, if you don't, the risk is you're basically being an engineering manager that's got a Go intern, but you don't know how to write Go. And there's a real danger of you not understanding the lower-level fundamentals. So that is a really great example of where you shouldn't be using Cursor, because you will not gain that foundational knowledge to then leverage it at a higher rate when you're using, you know, chat or Composer. So definitely always turn it off when you're learning something new.

Arvid:

Yeah. That sounds about right. Yeah. Otherwise, it becomes a crutch.

James:

Yeah. Now, there is an argument for: I don't wanna learn how to write AppleScript, but I wanna use AppleScript to automate my Mac. So you could basically say, I'm gonna Cursor through it, go through the pain, not understand it, and utilize that tech. But if you want to use it with compounding growth, and you want to use that from project to project, then that's where you would basically go: I'm gonna take the guardrails off.

James:

I'm actually gonna go and learn it myself. So it depends on whether you don't mind not knowing or whether you're gonna need that knowledge for future projects.

Arvid:

Yeah. That'll set people apart. Right? That will set apart the people who care about deeply understanding and the people who just wanna get stuff done, and the different levels of growth between them.

James:

Yeah. One of the tips that I've got is emotional prompting. So if you tell it things like, you know, I'm gonna lose my job if I don't get this right, that will actually give you better outputs than if you just ask it to do the task.

Arvid:

Oh, boy. Do you know why? Like, do you know the technical background

James:

No.

Arvid:

For this?

James:

No. I mean, they've seen it in science, you know, in loads of different research papers, but I guess it's because when you're emotionally blackmailed, there's a sense of urgency. And so, yes, I will do that from time to time. You must write my Terraform code or my Next.js code; otherwise, I'm gonna lose my job.

James:

I just feel like I'm basically threatening this life form that I hope doesn't become sentient one day. Yeah. Right? But yeah.

Arvid:

Yeah. That's pretty funny. But I think that's the big difference between being an actual technical manager, who has to have people skills, and being able to just yell at an LLM all day long. That's a big difference. Yeah.

Arvid:

And that's really cool. Thank you so much for sharing all of this. I'm really, really excited to dive deeper into Cursor and obviously work with AI all day, every day. This is just something that is both part of my business, part of my software development strategy, and I'm just excited about the technology. And I bet there's a lot of people out there who have really enjoyed this as well and would like to know more about you.

Arvid:

So if people wanna find you, people wanna follow you, where should they go, and what should they look at?

James:

Yeah. So I'm on LinkedIn and on X. On LinkedIn, it's James Phoenix. And on X, it's @jamesaphoenix12. And then I've also got a product, which is basically a Next.js boilerplate, which comes with Cursor already set up, and it comes with integration testing, so Playwright and Jest tests.

James:

So that's called cursordevkit.com. And the idea behind that is to basically give you a really nice boilerplate that's also got all the benefits of everything that you should have with Cursor. So those are the types of places that I hang out mainly. So yeah.

Arvid:

Very cool. Yeah. I follow you on Twitter, and I often watch you on Twitch when you stream. That's really cool too. Like, seeing you build software is quite enjoyable.

Arvid:

It's bizarre watching people build software as an entertainment thing, but it's really fun. It's really cool. Thanks so much, man. I appreciate everything you do to teach people to deal with this stuff and to also supply them with software to get things started. That is really, really cool.

Arvid:

Thanks, man. I really, really appreciate everything you did.

James:

Yeah. Yeah. Thanks a lot for having me, Arvid. I really appreciate it. And, yeah.

James:

See you around.

Arvid:

And that's it for today. Thank you so much for listening to the Bootstrapped Founder. Thank you, Paddle, for sponsoring this episode. You can find me on Twitter at @arvidkahl, and you will find my books on my Twitter profile too. If you wanna support me and this show, please tell everyone you know about podscan.fm and leave a rating and a review by going to ratethispodcast.com/founder.

Arvid:

It makes a massive difference if you show up there because then the podcast will show up in other people's feeds. Any of this will help the show. Thanks so much for listening. Have a wonderful day and bye bye.
