Anthropic expands Claude Sonnet 4’s context window to 1M tokens
With this larger context window, Claude can process codebases with 75,000+ lines of code in a single request. This allows it to better understand project architecture and cross-file dependencies, and to make suggestions that fit the complete system design.
Longer context windows are now in beta on the Anthropic API and Amazon Bedrock, and will soon be available in Google Cloud’s Vertex AI.
For prompts over 200K tokens, pricing will increase to $6 / million tokens (MTok) for input and $22.50 / MTok for output. The pricing for requests under 200K tokens will be $3 / MTok for input and $15 / MTok for output.
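To make the two pricing tiers concrete, here is a rough cost estimate in Python. It is only a sketch: the function name and constants are hypothetical, and it assumes that the tier is chosen by prompt (input) size alone and that the whole request is then billed at that tier's rates, which the announcement as summarized here does not spell out.

```python
# Rates quoted in the article, in USD per million tokens (MTok).
STANDARD_RATES = {"input": 3.00, "output": 15.00}
LONG_CONTEXT_RATES = {"input": 6.00, "output": 22.50}
LONG_CONTEXT_THRESHOLD = 200_000  # prompt size above which the higher rates apply


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD.

    Assumption (not confirmed by the article): the tier is selected by the
    prompt size alone, and every token in the request is billed at that tier.
    """
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        rates = LONG_CONTEXT_RATES
    else:
        rates = STANDARD_RATES
    cost = (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000
    return round(cost, 6)


if __name__ == "__main__":
    # A 150K-token prompt stays on the standard tier.
    print(estimate_cost(150_000, 4_000))
    # A 500K-token prompt falls into the long-context tier.
    print(estimate_cost(500_000, 4_000))
```

Under these assumptions, a 500K-token prompt costs twice as much per input token as a 150K-token prompt, so splitting work into requests that stay under the threshold may be cheaper when the full context is not needed.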
The company also extended its learning mode, designed for students, into Claude.ai and Claude Code. Learning mode asks users questions to guide them through concepts instead of providing immediate answers, to promote critical thinking about problems.
OpenAI adds GPT-4o as a legacy model in ChatGPT
With this update, paid users will now be able to select GPT-4o when using ChatGPT, along with other models like o3, GPT-4.1, and GPT-5 Thinking mini.
The model picker for GPT-5 also now includes Auto, Fast, and Thinking modes. Fast prioritizes giving the fastest answers, Thinking prioritizes giving deeper answers that take longer to think through, and Auto chooses between the two.
The company also increased the message limit for Plus and Team users to 3,000 per week on GPT-5 Thinking.
Google releases Gemma 3 270M
This new model is “designed from the ground up for task-specific fine-tuning with strong instruction-following and text structuring capabilities already trained in,” according to Google.
It’s ideal in situations where there is a high-volume, well-defined task; speed and cost matter; user privacy needs to be protected; or there is a desire for a fleet of specialized task models.
Both pretrained and instruction-tuned versions of the model are available for download from Hugging Face, Ollama, Kaggle, LM Studio, and Docker. Alternatively, the models can be tried out in Vertex AI.
NVIDIA releases latest models in Llama Nemotron family
Llama Nemotron is a family of reasoning models, and the latest updates include a new hybrid model architecture, compact quantized models, and a configurable thinking budget to give developers more control over token generation.
“This combination lets the models reason more deeply and respond faster, without needing more time or computing power. This means better results at a lower cost,” the company wrote in an announcement.
Google’s coding agent Jules gets critique functionality
Google is enhancing its AI coding agent, Jules, with new functionality that reviews and critiques code while Jules is still working on it.
“In a world of rapid iteration, the critic moves the review to earlier in the process and into the act of generation itself. This means the code you review has already been interrogated, refined, and stress-tested … Great developers don’t just write code, they question it. And now, so does Jules,” Google wrote in a blog post.
According to the company, the coding critic is like a peer reviewer who is familiar with code quality principles and is “unafraid to point out when you’ve reinvented a risky wheel.”
GitHub to be folded into Microsoft’s CoreAI org
GitHub’s CEO Thomas Dohmke has announced his plans to leave the company at the end of the year.
In a memo to employees, he said that Microsoft doesn’t plan to replace him; rather, GitHub and its leadership team will now operate under Microsoft’s CoreAI organization, a group within the company focused on developing AI-powered tools, including GitHub Copilot.
“Today, GitHub Copilot is the leader of the most successful and thriving market in the age of AI, with over 20 million users and counting,” he wrote. “We did this by innovating ahead of the curve and showing grit and determination when challenged by the disruptors in our space. In just the last year, GitHub Copilot became the first multi-model solution at Microsoft, in partnership with Anthropic, Google, and OpenAI. We enabled Copilot Free for millions and introduced the synchronous agent mode in VS Code as well as the asynchronous coding agent native to GitHub.”
Sentry launches MCP monitoring tool
Application monitoring company Sentry is making it easier to gain visibility into MCP servers with the launch of a new monitoring tool.
With MCP monitoring, developers can understand things like which clients are experiencing errors, which tools are most used, or which tools are running slowly. They can also correlate errors with events like traffic spikes or new release deployments, or determine whether errors are only happening on one type of transport.
According to Cody De Arkland, head of developer experience at Sentry, when Sentry launched its own MCP server, it was getting over 30 million requests per month. He said that at that scale, it’s inevitable that errors will occur, and existing monitoring tools were struggling with MCP servers.
bitHuman launches SDK for creating AI avatars
AI company bitHuman has announced a visual SDK for creating avatars to be used as chat agents, instructors, digital coaches, companions, and specialists in various fields.
According to the company, the SDK allows avatars to be created on Arm-based and x86 systems without a GPU. The avatars have a small footprint and can be run online or offline on devices like Chromebooks, Mac Minis, and Raspberry Pis.
Because of their small footprint, these characters can be brought to a wide range of environments, including classrooms, kiosks, mobile apps, or edge devices.
Read last week’s updates here: This week in AI dev tools: GPT-5, Claude Opus 4.1, and more (August 8, 2025)
