AI/LLM Models & Capabilities — From In-Context Learning to Extended Reasoning · Osmond van Hemert — Senior Software Engineer

The landscape of AI model capabilities has evolved dramatically in recent years, and the pace of change is accelerating. As developers building AI-powered systems, it’s essential to understand the different capabilities available and how they fundamentally change what we can build.

The Core Capability Evolution
#

Understanding how AI models work has become foundational knowledge for any developer. Claude’s in-context learning represents a paradigm shift — moving from fine-tuning as the primary customization mechanism to treating prompts and context as the primary tool for teaching models.

This shift builds on earlier work in transformer efficiency. The Reformer architecture demonstrated how we could process longer sequences efficiently, and years of progress eventually led to the massive context windows we have today.

Extended Thinking and Advanced Reasoning
#

The next frontier moves beyond in-context learning to how models reason over context. Extended thinking models like Claude 3.7 Sonnet introduce deliberate reasoning steps, where the model can think through complex problems before generating responses.

This capability pairs beautifully with reasoning-focused models like OpenAI’s O3 and O4 Mini, which prioritize deep reasoning over raw speed. These models demonstrate that capability boundaries are moving, and reasoning itself is becoming a first-class feature rather than an emergent property.

Practical Applications in Development
#

The model capability improvements translate directly to how developers build systems. AI-assisted testing frameworks leverage these advanced reasoning capabilities to validate code, catch bugs, and ensure correctness without manual intervention.

More broadly, AI-powered development tools and GitHub Copilot’s agent mode show how these models handle real development workflows. The models aren’t just autocompleting code — they’re reasoning about architectural decisions and generating multi-step solutions.

Infrastructure Implications
#

Supporting these advanced capabilities requires serious infrastructure. Anthropic’s compute strategy recognizes that efficiency at scale is itself a competitive advantage. As context windows grow and reasoning becomes more complex, cloud cost optimization becomes critical.

For teams deploying these models locally, Docker’s Model Runner makes experimentation more accessible. The ability to run and iterate on models locally, then scale to cloud infrastructure, is becoming standard practice.

Open Source Models and Alternatives
#

The proprietary frontier models get most attention, but open-source alternatives continue to mature. Meta’s Llama 3.1 release demonstrated that open models can deliver compelling capabilities, and the ecosystem around them continues to improve.

The broader AI infrastructure landscape shows how different providers are positioning themselves. Whether you choose open models, API-based services, or hybrid approaches depends on your specific constraints and requirements.

Governance and Responsible Development
#

As model capabilities advance, governance becomes increasingly important. The EU AI Act places specific requirements on general-purpose AI models, and understanding these compliance requirements is essential for teams building with frontier models.

Teams should also consider how model context protocol adoption and advanced tooling support responsible development practices. Better tooling and standards are making it easier to build systems that are both capable and responsible.

The Trajectory Forward
#

The evolution from ChatGPT’s explosive first month through today’s reasoning and extended thinking models has been remarkably fast. The pattern we’re seeing — rapid capability improvements followed by developer ecosystem maturation — suggests this pace will continue.

Teams building today have access to an unprecedented toolkit. The question isn’t whether to use AI models — it’s which models, which capabilities, and what architectural patterns make sense for your specific problem.

My Take
#

We’re in the middle of a capability inflection point. The models available today can handle tasks that required specialized training just two years ago. In-context learning eliminated fine-tuning for many use cases. Extended reasoning is eliminating the need for chain-of-thought prompting tricks.

The teams that will win are those that understand their models as tools with specific capabilities and limitations, rather than general-purpose solvers. Pair that understanding with thoughtful governance and compliance practices, and you have the foundation for building genuinely valuable AI systems.

The next capability frontier is probably already being worked on. Stay curious about what’s coming, but focus your energy on what these models can do for your users right now.

AI Models & Releases - This article is part of a series.

Part : AI Model Optimization & Efficiency — Making AI Accessible

Part : LLM Agents in Production — Moving Beyond Chat Interfaces

Part : Claude's In-Context Learning — The End of Fine-Tuning as We Know It

Part : This Article

Part : Google Gemini 2.0 — A New Chapter in Multimodal AI

Part : GPT-5 Is Here — A Developer's First Look at What Actually Changed

Part : OpenAI's o3 and o4-mini — Reasoning Models Get Real

Part : Claude 3.7 Sonnet — Extended Thinking Changes the Game for AI-Assisted Development

Part : Claude 3.5 Gets a Computer — Anthropic's 'Computer Use' and the Future of AI Agents

Part : Google Launches Gemini 2.0 Flash — The Multi-Modal AI Race Accelerates

Part : OpenAI Launches o1 Full Model and $200/Month ChatGPT Pro — The Reasoning Era Begins