<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>LLM on Osmond van Hemert</title><link>https://www.osmondvanhemert.nl/tags/llm/</link><description>Recent content in LLM on Osmond van Hemert</description><generator>Hugo -- gohugo.io</generator><language>en</language><copyright>© Osmond van Hemert. All rights reserved.</copyright><lastBuildDate>Thu, 29 Jan 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://www.osmondvanhemert.nl/tags/llm/index.xml" rel="self" type="application/rss+xml"/><item><title>AI Agent Frameworks — The Wild West of Autonomous Systems</title><link>https://www.osmondvanhemert.nl/posts/260129-ai-agent-frameworks-landscape/</link><pubDate>Thu, 29 Jan 2026 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/260129-ai-agent-frameworks-landscape/</guid><description>The AI agent framework landscape has exploded, with LangGraph, CrewAI, AutoGen, and dozens more competing for developer mindshare. 
Here&amp;rsquo;s what matters.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://www.osmondvanhemert.nl/posts/260129-ai-agent-frameworks-landscape/featured.jpg"/></item><item><title>Google Gemini 2.0 — A New Chapter in Multimodal AI</title><link>https://www.osmondvanhemert.nl/posts/251211-google-gemini-2-multimodal-ai/</link><pubDate>Thu, 11 Dec 2025 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/251211-google-gemini-2-multimodal-ai/</guid><description>Google launches Gemini 2.0 with native multimodal capabilities, and the implications for developers are significant.</description></item><item><title>SWE-bench Benchmark Contamination — When the Test Answers Are in the Training Data</title><link>https://www.osmondvanhemert.nl/posts/250911-swe-bench-git-history-leaks/</link><pubDate>Thu, 11 Sep 2025 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/250911-swe-bench-git-history-leaks/</guid><description>Research reveals that top AI coding model scores on SWE-bench may be inflated due to git history leaks, raising fundamental questions about how we evaluate AI coding capabilities.</description></item><item><title>Mistral's Le Chat Gets MCP Connectors — The Protocol That's Quietly Connecting Everything</title><link>https://www.osmondvanhemert.nl/posts/250904-mistral-le-chat-mcp-connectors/</link><pubDate>Thu, 04 Sep 2025 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/250904-mistral-le-chat-mcp-connectors/</guid><description>Mistral adds custom MCP connectors and persistent memory to Le Chat, signaling that the Model Context Protocol is becoming the standard glue for AI tool integration.</description></item>
<item><title>Google's Gemma 3 270M — Why Tiny Models Are the Real AI Story</title><link>https://www.osmondvanhemert.nl/posts/250814-gemma3-270m-small-models-big-impact/</link><pubDate>Thu, 14 Aug 2025 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/250814-gemma3-270m-small-models-big-impact/</guid><description>Google releases Gemma 3 at 270M parameters, proving that smaller, more efficient models might matter more than the next big model launch.</description></item><item><title>GPT-5 Is Here — A Developer's First Look at What Actually Changed</title><link>https://www.osmondvanhemert.nl/posts/250807-gpt5-launch-developer-implications/</link><pubDate>Thu, 07 Aug 2025 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/250807-gpt5-launch-developer-implications/</guid><description>OpenAI launches GPT-5 with significant improvements. Here&amp;rsquo;s what matters for developers beyond the marketing.</description></item><item><title>The EU AI Act Compliance Clock Is Ticking — What Developers Need to Know</title><link>https://www.osmondvanhemert.nl/posts/250703-eu-ai-act-developer-compliance/</link><pubDate>Thu, 03 Jul 2025 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/250703-eu-ai-act-developer-compliance/</guid><description>With key EU AI Act provisions now in effect, development teams building AI systems need to understand the practical implications for their architectures and workflows.</description></item><item><title>OpenAI's o3 and o4-mini — Reasoning Models Get Real</title><link>https://www.osmondvanhemert.nl/posts/250417-openai-o3-o4-mini-reasoning-models/</link><pubDate>Thu, 17 Apr 2025 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/250417-openai-o3-o4-mini-reasoning-models/</guid><description>OpenAI releases o3 and o4-mini reasoning models, bringing chain-of-thought inference to mainstream developer workflows.</description></item>
<item><title>Claude 3.7 Sonnet — Extended Thinking Changes the Game for AI-Assisted Development</title><link>https://www.osmondvanhemert.nl/posts/250306-claude-3-7-sonnet-extended-thinking/</link><pubDate>Thu, 06 Mar 2025 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/250306-claude-3-7-sonnet-extended-thinking/</guid><description>Anthropic&amp;rsquo;s Claude 3.7 Sonnet introduces extended thinking, letting the model reason step-by-step before responding — and the implications for developer workflows are significant.</description></item><item><title>Claude 3.5 Gets a Computer — Anthropic's 'Computer Use' and the Future of AI Agents</title><link>https://www.osmondvanhemert.nl/posts/250220-anthropic-computer-use-ai-agents/</link><pubDate>Thu, 20 Feb 2025 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/250220-anthropic-computer-use-ai-agents/</guid><description>Anthropic&amp;rsquo;s computer use capability lets Claude interact with desktop applications like a human. What does this mean for automation, testing, and the future of AI agents?</description></item><item><title>DeepSeek R1 — Open-Source Reasoning Models Change the Game</title><link>https://www.osmondvanhemert.nl/posts/250123-deepseek-r1-open-source-reasoning/</link><pubDate>Thu, 23 Jan 2025 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/250123-deepseek-r1-open-source-reasoning/</guid><description>DeepSeek&amp;rsquo;s R1 reasoning model, released as fully open-source with an MIT license, demonstrates that frontier AI capabilities aren&amp;rsquo;t exclusive to US labs anymore.</description></item><item><title>Google Launches Gemini 2.0 Flash — The Multi-Modal AI Race Accelerates</title><link>https://www.osmondvanhemert.nl/posts/241212-google-gemini-2-flash/</link><pubDate>Thu, 12 Dec 2024 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/241212-google-gemini-2-flash/</guid><description>Google&amp;rsquo;s Gemini 2.0 Flash brings native tool use, multimodal output, and agentic capabilities. 
A look at what this means for the competitive AI landscape.</description></item><item><title>OpenAI Launches o1 Full Model and $200/Month ChatGPT Pro — The Reasoning Era Begins</title><link>https://www.osmondvanhemert.nl/posts/241205-openai-o1-full-model-chatgpt-pro/</link><pubDate>Thu, 05 Dec 2024 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/241205-openai-o1-full-model-chatgpt-pro/</guid><description>OpenAI kicks off its &amp;lsquo;12 Days of OpenAI&amp;rsquo; event with the full o1 reasoning model and a new $200/month ChatGPT Pro tier. What this means for developers building with AI.</description></item><item><title>Claude Gets Hands — Anthropic's Computer Use Changes the AI Game</title><link>https://www.osmondvanhemert.nl/posts/241024-anthropic-claude-computer-use/</link><pubDate>Thu, 24 Oct 2024 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/241024-anthropic-claude-computer-use/</guid><description>Anthropic&amp;rsquo;s updated Claude 3.5 Sonnet introduces Computer Use, letting AI directly interact with desktop environments — a significant leap toward autonomous AI agents.</description></item><item><title>OpenAI o1 — The Dawn of Reasoning Models</title><link>https://www.osmondvanhemert.nl/posts/240912-openai-o1-reasoning-models/</link><pubDate>Thu, 12 Sep 2024 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/240912-openai-o1-reasoning-models/</guid><description>OpenAI releases o1, a model that &amp;rsquo;thinks before it answers&amp;rsquo; — what chain-of-thought reasoning means for developers and the future of AI-assisted coding.</description></item><item><title>GitHub Models — Bringing AI Model Experimentation to Where Developers Already Live</title><link>https://www.osmondvanhemert.nl/posts/240822-github-models-ai-marketplace/</link><pubDate>Thu, 22 Aug 2024 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/240822-github-models-ai-marketplace/</guid><description>GitHub launches 
Models, a new playground for experimenting with AI models directly from GitHub. Here&amp;rsquo;s why this integration matters.</description></item><item><title>Llama 3.1 405B — Meta Goes All-In on Open-Source AI</title><link>https://www.osmondvanhemert.nl/posts/240725-meta-llama-3-1-open-source/</link><pubDate>Thu, 25 Jul 2024 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/240725-meta-llama-3-1-open-source/</guid><description>Meta releases Llama 3.1 with a 405 billion parameter model under a permissive license, making frontier-class AI genuinely open for the first time.</description></item><item><title>Ollama and the Rise of Local LLMs — Why Running AI on Your Own Hardware Matters</title><link>https://www.osmondvanhemert.nl/posts/240711-ollama-local-llm-revolution/</link><pubDate>Thu, 11 Jul 2024 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/240711-ollama-local-llm-revolution/</guid><description>Local LLM tooling has matured rapidly, with Ollama leading the charge. 
Here&amp;rsquo;s why self-hosted AI is becoming a serious option for developers.</description></item><item><title>Claude 3.5 Sonnet — Anthropic Raises the Bar for Coding AI</title><link>https://www.osmondvanhemert.nl/posts/240620-claude-35-sonnet-raises-the-bar/</link><pubDate>Thu, 20 Jun 2024 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/240620-claude-35-sonnet-raises-the-bar/</guid><description>Anthropic releases Claude 3.5 Sonnet, which benchmarks above GPT-4o on coding tasks while running faster and cheaper — reshaping the competitive landscape for AI-assisted development.</description></item><item><title>GPT-4o — OpenAI's Multimodal Leap and What It Means for Developers</title><link>https://www.osmondvanhemert.nl/posts/240509-openai-gpt4o-multimodal/</link><pubDate>Thu, 09 May 2024 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/240509-openai-gpt4o-multimodal/</guid><description>OpenAI&amp;rsquo;s Spring Update reveals GPT-4o, a natively multimodal model that processes text, audio, and vision in a single architecture. 
The developer implications are significant.</description></item><item><title>Meta Releases Llama 3 — Open Source AI Just Got Serious</title><link>https://www.osmondvanhemert.nl/posts/240418-meta-llama-3-release/</link><pubDate>Thu, 18 Apr 2024 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/240418-meta-llama-3-release/</guid><description>Meta&amp;rsquo;s Llama 3 arrives with 8B and 70B parameter models that rival closed-source competitors, reshaping the open-weight AI landscape.</description></item><item><title>Claude 3 Arrives — Anthropic's New Family of Models Raises the Bar</title><link>https://www.osmondvanhemert.nl/posts/240307-anthropic-claude-3-benchmarks/</link><pubDate>Thu, 07 Mar 2024 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/240307-anthropic-claude-3-benchmarks/</guid><description>Anthropic launches Claude 3 in three tiers — Haiku, Sonnet, and Opus — with benchmark results that challenge GPT-4&amp;rsquo;s dominance.</description></item><item><title>Gemini 1.5 Pro — A Million Tokens Changes the Game</title><link>https://www.osmondvanhemert.nl/posts/240215-gemini-1-5-million-token-context/</link><pubDate>Thu, 15 Feb 2024 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/240215-gemini-1-5-million-token-context/</guid><description>Google&amp;rsquo;s Gemini 1.5 Pro launches with a 1 million token context window, fundamentally reshaping what&amp;rsquo;s possible with large language models.</description></item><item><title>Google Rebrands Bard to Gemini — The AI Naming Game Gets Real</title><link>https://www.osmondvanhemert.nl/posts/240208-google-gemini-rebrand-ai-platform/</link><pubDate>Thu, 08 Feb 2024 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/240208-google-gemini-rebrand-ai-platform/</guid><description>Google retires the Bard brand and goes all-in on Gemini. 
Behind the marketing refresh is a real technical shift with implications for developers building on Google&amp;rsquo;s AI stack.</description></item><item><title>The GPT Store Is Live — What It Means for AI Development</title><link>https://www.osmondvanhemert.nl/posts/240118-gpt-store-launch-ai-development/</link><pubDate>Thu, 18 Jan 2024 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/240118-gpt-store-launch-ai-development/</guid><description>OpenAI launches the GPT Store, creating a marketplace for custom GPTs. Here&amp;rsquo;s what it means for developers and why the platform play matters more than the individual bots.</description></item><item><title>Google Gemini Arrives — Multimodal AI Gets Real</title><link>https://www.osmondvanhemert.nl/posts/231207-google-gemini-multimodal-ai/</link><pubDate>Thu, 07 Dec 2023 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/231207-google-gemini-multimodal-ai/</guid><description>Google launches Gemini, its most capable AI model yet, bringing native multimodal reasoning to the forefront of the AI race.</description></item><item><title>OpenAI DevDay — GPT-4 Turbo and the Platform Play</title><link>https://www.osmondvanhemert.nl/posts/231109-openai-devday-gpt4-turbo/</link><pubDate>Thu, 09 Nov 2023 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/231109-openai-devday-gpt4-turbo/</guid><description>OpenAI&amp;rsquo;s DevDay unveils GPT-4 Turbo, custom GPTs, and the Assistants API — signaling a major shift from model provider to developer platform.</description></item><item><title>Bletchley Park AI Safety Summit — Governments Finally Enter the Chat</title><link>https://www.osmondvanhemert.nl/posts/231102-bletchley-park-ai-safety-summit/</link><pubDate>Thu, 02 Nov 2023 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/231102-bletchley-park-ai-safety-summit/</guid><description>The UK&amp;rsquo;s AI Safety Summit at Bletchley Park brings 28 nations together to 
discuss AI risks, marking a watershed moment for international AI governance.</description></item><item><title>Code Llama — Meta's Open Source Bet on AI-Assisted Coding</title><link>https://www.osmondvanhemert.nl/posts/230824-code-llama-open-source-code-generation/</link><pubDate>Thu, 24 Aug 2023 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/230824-code-llama-open-source-code-generation/</guid><description>Meta releases Code Llama, a family of open-source code generation models, and it might just change the dynamics of AI-assisted development.</description></item><item><title>Meta Releases Llama 2 — Open Source AI Gets a Massive Boost</title><link>https://www.osmondvanhemert.nl/posts/230720-meta-llama2-open-source-llm/</link><pubDate>Thu, 20 Jul 2023 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/230720-meta-llama2-open-source-llm/</guid><description>Meta&amp;rsquo;s release of Llama 2 as a commercially-licensed open model changes the game for developers building with large language models.</description></item><item><title>GPT-4 Lands — And It Raises the Bar Significantly</title><link>https://www.osmondvanhemert.nl/posts/230316-gpt4-lands-and-raises-the-bar/</link><pubDate>Thu, 16 Mar 2023 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/230316-gpt4-lands-and-raises-the-bar/</guid><description>OpenAI releases GPT-4 with multimodal capabilities and dramatically improved reasoning — here&amp;rsquo;s what it means for developers.</description></item><item><title>The AI Search Wars Begin — Bing Chat, Google Bard, and the Future of Finding Things</title><link>https://www.osmondvanhemert.nl/posts/230209-ai-search-wars-bing-bard/</link><pubDate>Thu, 09 Feb 2023 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/230209-ai-search-wars-bing-bard/</guid><description>Microsoft and Google are racing to integrate large language models into search, and the implications go far beyond just finding web 
pages.</description></item><item><title>ChatGPT's First Month — Why This AI Moment Feels Different</title><link>https://www.osmondvanhemert.nl/posts/221229-chatgpt-explosive-first-month/</link><pubDate>Thu, 29 Dec 2022 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/221229-chatgpt-explosive-first-month/</guid><description>One month after launch, ChatGPT has crossed a million users and sparked conversations about AI that reach far beyond the usual tech circles. Here&amp;rsquo;s why this one matters.</description></item><item><title>GPT-3 API Access — First Impressions from the Beta</title><link>https://www.osmondvanhemert.nl/posts/200723-gpt3-api-beta-first-impressions/</link><pubDate>Thu, 23 Jul 2020 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/200723-gpt3-api-beta-first-impressions/</guid><description>OpenAI is granting beta access to the GPT-3 API. After a week of experimentation, here&amp;rsquo;s what&amp;rsquo;s genuinely impressive and what&amp;rsquo;s overhyped.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://www.osmondvanhemert.nl/posts/200723-gpt3-api-beta-first-impressions/featured.jpg"/></item><item><title>Reformer — Can We Make Transformers Practical for the Rest of Us?</title><link>https://www.osmondvanhemert.nl/posts/200123-reformer-efficient-transformers/</link><pubDate>Thu, 23 Jan 2020 00:00:00 +0000</pubDate><guid>https://www.osmondvanhemert.nl/posts/200123-reformer-efficient-transformers/</guid><description>Google&amp;rsquo;s new Reformer model tackles the massive memory and compute costs of Transformers. For engineers building AI-powered features, this matters more than another benchmark score.</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://www.osmondvanhemert.nl/posts/200123-reformer-efficient-transformers/featured.jpg"/></item></channel></rss>