From Prompt to Production: Building CraicGPT.ie with an AI Co-Pilot

We’ve all seen the slick AI demos, but what’s it really like to build a project from scratch with an AI co-pilot? In the second post of my 5-part series, I’m pulling back the curtain on the development of craicgpt.ie. I share the good (rapid boilerplate generation with the new Claude Opus 4.1), the bad…

Yesterday, we talked about the dizzying pace of AI innovation, perfectly illustrated by last week’s tale of two launches: Anthropic’s smooth Opus 4.1 update versus OpenAI’s GPT-5 faceplant. It’s easy to get lost in the hype. But what’s it actually like to roll up your sleeves and build something with these tools? That’s what today is all about.

I want to give you a transparent, behind-the-scenes look at “Phase 1” of my AI journey: the creation of a little project I call craicgpt.ie.

More Than a Museum Piece: The “Why” Behind CraicGPT.ie

I built craicgpt.ie for a simple reason: to create a tangible artifact of this specific moment in AI. Think of it as a living museum or a time capsule. The site is a lightweight, serverless webpage designed to do one thing: show and compare the same prompts across various LLMs and image generators.

The magic is in the transparency. If you hover over any piece of text or any image on the site, it reveals the exact prompt I used to generate it. You can see my ugly mug if you hover over the geekwiththepeak logo (yes, geekwiththepeak.com is where these blogs will eventually live, as soon as I get a spare weekend!).
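
The hover-to-reveal trick described above can be sketched in a few lines of vanilla JavaScript. To be clear, this is a hypothetical reconstruction, not the actual craicgpt.ie source: it assumes each generated element stores its prompt in a `data-prompt` attribute, and that a single delegated listener drives one shared tooltip element.

```javascript
// Pure helper: return the prompt text stored on an element, or null.
// Assumes the (hypothetical) markup convention <img data-prompt="...">.
function promptFor(el) {
  return (el && el.dataset && el.dataset.prompt) || null;
}

// Browser wiring: one delegated listener for the whole page.
// Guarded so the helper above stays testable outside a DOM environment.
if (typeof document !== "undefined") {
  const tip = document.createElement("div");
  tip.className = "prompt-tooltip"; // styled elsewhere (position: fixed, etc.)
  document.body.appendChild(tip);

  document.addEventListener("mouseover", (e) => {
    // Walk up from the hovered node to the nearest prompt-carrying element.
    const carrier = e.target.closest ? e.target.closest("[data-prompt]") : null;
    const prompt = promptFor(carrier);
    if (prompt) {
      tip.textContent = prompt;
      tip.style.display = "block";
    } else {
      tip.style.display = "none";
    }
  });
}
```

A single delegated listener like this keeps the page lightweight: no per-image event handlers, no framework, nothing that would break the serverless, static-hosting setup.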

This isn’t meant to be the next killer app. In the world of AI, being 8 weeks old makes it practically a museum piece already! But its value is in demonstrating the practical reality of AI-assisted development. It’s a way to cut through the polished demos and show the raw inputs and outputs, warts and all.

The AI Co-Pilot: My Experience with “Vibe Coding”

My journey coding with LLMs has been a rollercoaster. The very first models from OpenAI and Google were bizarrely impressive. Then, almost overnight, it felt like they got dumber. I have a sneaking suspicion this was a manufactured dip, as the providers realised the commercial implications of their new golden goose and started tightening the reins.

Lately, however, the capabilities have soared again. This new wave of models has made something I call “vibe coding” a reality—building software through natural language conversation rather than traditional, line-by-line coding. You describe what you want, and the AI generates the code.  

craicgpt.ie was my experiment in this new paradigm.

This isn’t new to me in principle. In past roles, I’ve often had to script my way out of a corner—knocking up a quick frontend API wrapper to show a skeptical Windows enterprise team how “easy” our API-driven tool was. But using an LLM as a co-pilot takes this to a whole new level.

Under the Hood: The Good, The Bad, and The Ugly Commits

So, what was the process really like? It certainly wasn’t a single prompt that spat out a perfect website. As I mentioned in my project notes, you just have to look at the number of commits it took to get things right. My Git history looks less like a project timeline and more like the diary of a madman. The real work wasn’t in the coding, but in the engineering.

The Good: AI was phenomenal at the grunt work. It generated boilerplate HTML, CSS, and JavaScript in seconds. It was fantastic for translating logic and acting like an infinitely patient tutor. When I was building the site, I used a model that was a predecessor to last week’s releases. To see how things have changed, I tried rebuilding a component with the new Claude Opus 4.1. The difference was night and day. It understood the context between files and produced clean, working code on the first try. It was a glimpse of the future.

The Bad: Then I tried the same task with the newly launched GPT-5. And let’s just say the results were… interesting. Thanks to the broken router we talked about yesterday, the model seemed to have the attention span of a goldfish. It would confidently “hallucinate” function calls, forget decisions we’d made two prompts earlier, and generally need more hand-holding than a toddler in a sweet shop, demanding constant reminders and careful management of the conversation history.

The Ugly: The “engineering” wasn’t in writing the code, but in:

  • Prompt Refinement: Learning how to ask the right question in the right way. With GPT-5 in its current state, this feels more like a dark art than a science.
  • Strategic Debugging: Quickly identifying when the AI had gone off the rails and knowing how to guide it back.
  • Architectural Oversight: Making the high-level decisions about how the different AI-generated components should fit together.

These are skills that AI currently augments, not replaces.

The Verdict: Is AI Ready to Take My Job? Not Yet, But It’s a Hell of an Intern.

So, after building a full project with an AI co-pilot, what’s the verdict? Are we at the point of a single-line prompt generating a production-ready application? Absolutely not. Not for us mere mortals, anyway.

My experience leads me to this conclusion: today’s AI co-pilot is like a brilliant but wildly inconsistent intern. It’s incredibly fast, has encyclopedic knowledge, and never gets tired. But it lacks real-world context, has no common sense, and you absolutely cannot leave it unsupervised on a critical task. The difference between a model like Opus 4.1 and the launch-day version of GPT-5 is the difference between an intern who listens carefully and one who’s already planning their weekend.

The most valuable skills in this new world are not just about writing code. They are about the ability to effectively guide, validate, and integrate the output of an AI system. It’s a shift from being a builder to being an architect and a conductor.

And honestly? It’s the most fun I’ve had in years. This collaborative process is energizing, and it opens up incredible new possibilities. It’s a powerful new tool in the engineer’s toolkit, and we’re only just scratching the surface.

Tomorrow, we’ll look at how to take this to the next level in an enterprise setting by giving our AI a library card to our company’s private data.
