ChatGPT Just Changed Everything — Or Did It?

Osmond van Hemert
AI Models & Releases - This article is part of a series.
Part : This Article

Yesterday, OpenAI quietly released ChatGPT, a conversational AI model, and within hours my entire timeline — across every platform — was nothing but screenshots of people testing it. By this morning, it feels like every developer I know has tried it, and the reactions range from “this changes everything” to genuine existential dread.

I spent a few hours with it last night, and I’ll be honest: it’s the most impressive AI demo I’ve ever interacted with. But I’ve been in this industry long enough to know that impressive demos and production-ready tools are very different things. Let’s dig into what’s actually happening here.

What ChatGPT Actually Is

ChatGPT is built on GPT-3.5, fine-tuned using Reinforcement Learning from Human Feedback (RLHF). The key innovation isn’t the base model — GPT-3 has been available via API for over two years — but the conversational fine-tuning. Previous GPT-3 interactions required careful prompt engineering to get useful outputs. ChatGPT understands conversational context, follows instructions more reliably, and produces structured responses without elaborate prompting.
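To make the contrast concrete, here is roughly the kind of few-shot scaffolding raw GPT-3 often needed before you got useful completions — the task and examples are invented for illustration, not taken from any real prompt:

```python
# Few-shot, completion-style prompting of the kind raw GPT-3 required.
# The example task here is invented purely for illustration.
def build_few_shot_prompt(examples, query):
    """Assemble (input, output) examples plus a query into one prompt string."""
    parts = [f"Q: {inp}\nA: {out}" for inp, out in examples]
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)

prompt = build_few_shot_prompt(
    [("Convert 'hello world' to title case", "Hello World"),
     ("Convert 'FOO bar' to title case", "Foo Bar")],
    "Convert 'chat gpt' to title case",
)
```

With ChatGPT, you skip this scaffolding entirely and just ask the question in plain language.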

The RLHF approach is significant. Human trainers ranked model outputs by quality, and these rankings were used to train a reward model, which then guided further fine-tuning via Proximal Policy Optimization (PPO). The result is a model that’s better at following the intent behind a question rather than just pattern-matching on the words.
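The reward-model step can be sketched with the standard pairwise ranking loss — this is the textbook Bradley–Terry-style formulation, not OpenAI’s actual training code:

```python
import math

def reward_ranking_loss(r_chosen, r_rejected):
    """Pairwise ranking loss for a reward model.

    r_chosen / r_rejected are scalar scores the reward model assigns to
    the human-preferred and the rejected response. The loss
    -log(sigmoid(r_chosen - r_rejected)) shrinks as the model learns to
    score the preferred response higher.
    """
    diff = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-diff)))
```

Minimizing this over many human-ranked pairs yields the reward signal that PPO then optimizes against.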

From a technical perspective, this is an elegant demonstration of alignment techniques. The model actively refuses harmful requests, asks clarifying questions, and admits uncertainty. It’s not perfect at any of these — I’ve seen plenty of confident-sounding incorrect answers — but the improvement over raw GPT-3 is substantial.

The Coding Implications

As a developer, the coding capabilities are what caught my attention most. I tested it across several scenarios:

Explaining code: I pasted in a complex regex and asked for an explanation. It broke it down component by component, accurately. I tried the same with a tricky SQL query involving window functions. Again, solid explanation.

Generating code: I asked for a Python function to parse a specific log format. The first output was functional and reasonably idiomatic. When I asked it to add error handling, it modified the code appropriately. When I pointed out an edge case it missed, it corrected it.
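For a sense of what that exchange produced, here is a parser in the same spirit — the log format is a hypothetical stand-in, not the one I actually tested with:

```python
import re
from datetime import datetime

# Hypothetical log format: "2022-11-30 21:14:05 [ERROR] disk full"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<level>[A-Z]+)\] (?P<msg>.*)$"
)

def parse_log_line(line):
    """Parse one log line into (timestamp, level, message).

    Returns None for lines that don't match — the kind of edge case
    a first draft tends to miss until you point it out.
    """
    m = LOG_LINE.match(line.rstrip("\n"))
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
    return ts, m.group("level"), m.group("msg")
```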

Debugging: I pasted in a function with a subtle off-by-one error and asked it to find bugs. It identified the issue and explained why it was wrong.
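A representative example of the bug class (not the actual function from my test):

```python
def moving_sum(values, window):
    """Sum of each sliding window of the given size.

    The buggy version iterated over range(len(values) - window), which
    silently drops the final window; the correct bound is
    len(values) - window + 1.
    """
    return [sum(values[i:i + window])
            for i in range(len(values) - window + 1)]
```

This is exactly the kind of error that compiles, runs, and passes a casual glance — which is why having a second set of eyes, even artificial ones, is useful.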

None of this is magic — the model has been trained on vast amounts of code and programming discussion. But the conversational interface makes it dramatically more accessible than previous code-generation tools. You don’t need to craft perfect prompts; you can iterate naturally.

Where It Falls Down

The failure modes are important to understand. ChatGPT generates plausible-sounding text, but it has no mechanism for verifying factual accuracy. I asked it several questions about specific library APIs and got confidently stated answers that were subtly wrong — the function signatures looked right but had incorrect parameter names or return types.
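One cheap defense against confidently wrong signatures: check the model’s claim against the live library instead of trusting it. Python’s `inspect.signature` makes this a one-liner; `json.dumps` here is just a stdlib example, not one of the APIs I tested:

```python
import inspect
import json

# Verify a claimed function signature against the actual library.
sig = inspect.signature(json.dumps)
param_names = list(sig.parameters)

# Confirm a parameter the model claimed actually exists...
assert "indent" in param_names
# ...and that a plausible-sounding invention does not.
assert "pretty" not in param_names
```

It doesn’t catch semantic errors, but it catches the specific failure mode of invented parameter names in seconds.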

This is the fundamental challenge with large language models for technical work: they optimize for coherence, not correctness. A senior developer will catch these errors. A junior developer might not. And that’s a real concern as these tools become more accessible.

I also noticed the model struggles with temporal knowledge boundaries. It sometimes references features or versions that don’t exist yet, or conflates information from different time periods. The training data cutoff creates a knowledge horizon that the model doesn’t always respect gracefully.

The “What Happens to Stack Overflow” Question

The immediate reaction in many communities has been to question whether ChatGPT will replace Stack Overflow, developer documentation, or even developers themselves. Let me push back on the more dramatic predictions.

Stack Overflow’s value isn’t just answers — it’s verified, community-vetted, version-specific answers with context about why alternatives don’t work. ChatGPT gives you an answer; Stack Overflow gives you an answer that a thousand developers have validated.

That said, for the initial “how do I approach this” phase of problem-solving, ChatGPT is remarkably effective. It’s like having a knowledgeable colleague available 24/7 who can get you 80% of the way there. The last 20% — verification, edge cases, production considerations — still requires human expertise.

The Infrastructure Question

One aspect that hasn’t gotten enough attention: the compute costs of running this at scale. Each conversation requires significant GPU resources for inference. OpenAI is currently offering this for free during a “research preview,” but the cost of serving millions of concurrent conversations with a model this size is non-trivial.

The economics of large language model inference are going to be a major infrastructure challenge going forward. Model serving at this scale requires specialized hardware, aggressive optimization (quantization, distillation, batching strategies), and potentially new architectural approaches.
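Some back-of-envelope arithmetic shows why. OpenAI hasn’t disclosed ChatGPT’s exact size, so this uses GPT-3’s published 175B parameter count as a stand-in, and only accounts for weight memory (activations and KV caches add more):

```python
def weight_memory_gb(n_params, bytes_per_param):
    """Rough memory needed just to hold model weights, in GB."""
    return n_params * bytes_per_param / 1e9

N = 175e9  # GPT-3's published parameter count (stand-in assumption)
fp16 = weight_memory_gb(N, 2)  # 16-bit weights: 350 GB
int8 = weight_memory_gb(N, 1)  # 8-bit quantization halves it: 175 GB
```

Even quantized, that’s several high-end accelerators per model replica before serving a single request — which is why batching and distillation matter so much at this scale.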

My Take

ChatGPT is the most compelling AI product I’ve used. Full stop. The conversational interface, the quality of responses, and the breadth of capability represent a genuine step function in what’s accessible to developers and non-developers alike.

But I want to be measured here. I’ve seen enough hype cycles to know that the gap between “amazing demo” and “reliable production tool” is where most technologies stall. The question isn’t whether ChatGPT is impressive — it clearly is. The question is how quickly the rough edges get smoothed, how the economics work at scale, and whether the accuracy problems can be addressed.

For now, I’m treating it as a powerful drafting tool — great for generating first attempts, exploring approaches, and explaining unfamiliar code. But I’m verifying everything it produces, just as I would with code from any source I don’t fully trust.

The developer workflow is about to change. How much, and how fast, remains to be seen.
