October 2025: AI updates from the previous month


OpenAI announces agentic security researcher that can find and fix vulnerabilities

OpenAI has launched a private beta for a new AI agent called Aardvark that acts as a security researcher, finding vulnerabilities and applying fixes at scale.

“Software security is one of the most critical, and challenging, frontiers in technology. Each year, tens of thousands of new vulnerabilities are discovered across enterprise and open-source codebases. Defenders face the daunting tasks of finding and patching vulnerabilities before their adversaries do. At OpenAI, we are working to tip that balance in favor of defenders,” OpenAI wrote in a blog post.

The agent continuously analyzes source code repositories to identify vulnerabilities, assess their exploitability, prioritize severity, and propose patches. Instead of using traditional analysis techniques like fuzzing or software composition analysis, Aardvark uses LLM-powered reasoning and tool use.

Cursor 2.0 enables eight agents to work in parallel without interfering with each other

The AI coding editor Cursor announced the launch of Cursor 2.0, the next iteration of the platform, featuring a new interface for working with multiple agents and its first ever coding model.

The new multi-agent interface centers around agents instead of files. With this new interface, up to eight agents can work in parallel, using git worktrees and remote trees to prevent them from interfering with each other. It also allows developers to have multiple models attempt the same problem and see which one produces the best output.
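To make the isolation idea concrete, here is a minimal sketch of how one checkout per agent could be prepared with `git worktree add`. It is not Cursor's implementation, which the announcement does not describe; the `agent-N` branch names, the sibling directory layout, and the `worktree_commands` helper are all assumptions made for illustration. The function only builds the commands, so nothing touches a repository until they are run.

```python
from pathlib import Path


def worktree_commands(repo: Path, agent_count: int, base: str = "HEAD") -> list[list[str]]:
    """Build one `git worktree add` command per agent (hypothetical naming scheme)."""
    if not 1 <= agent_count <= 8:
        # The article mentions a limit of eight parallel agents.
        raise ValueError("agent_count must be between 1 and 8")
    commands = []
    for i in range(1, agent_count + 1):
        branch = f"agent-{i}"  # assumed branch name, one per agent
        path = repo.parent / f"{repo.name}-agent-{i}"  # assumed sibling directory
        # `git worktree add -b <branch> <path> <commit-ish>` creates a separate
        # working directory on its own branch, so agents do not overwrite each
        # other's files.
        commands.append(
            ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]
        )
    return commands


if __name__ == "__main__":
    for cmd in worktree_commands(Path("/tmp/demo"), 3):
        print(" ".join(cmd))
```

Each command could then be passed to `subprocess.run(cmd, check=True)` inside a real repository; keeping command construction separate from execution makes the plan easy to inspect first.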

While this new interface is designed for agents, developers will still be able to open files or switch back to the classic IDE as needed.

The new coding model, Composer, is four times faster than comparable models, the company claims. It was designed for low-latency agentic coding tasks in Cursor, and it can complete most turns in less than 30 seconds.

Workato launches Enterprise MCP for SaaS platforms

Organizations are spending big dollars on AI agents, but are finding that integrating the agents into all the systems the business needs to function is a very high hurdle.

To help make SaaS platforms agent-ready, integration orchestration company Workato launched Workato Enterprise MCP, which the company said in its announcement can “turn existing workflows, integrations, and APIs into rich, multi-step agent skills that any large-language-model (LLM)-based agent can call, including ChatGPT, Claude, Gemini, and Cursor.”

Adam Seligman, chief technology officer at Workato, told SD Times that “the thing we keep coming back to over and over is agents show a lot of promise, but to really work for business, they need to get access to business data. And they have to be able to do things inside your business, but do it in a way that you trust. And it’s really hard to get those two things right.”

JetBrains launches open benchmarking platform for measuring AI productivity

JetBrains has launched a new tool designed to enable developers to measure their actual productivity gains from AI tools.

The company’s Developer Productivity AI Arena (DPAI Arena) is an open benchmarking platform for how effectively AI development tools complete real-world software engineering tasks. According to the company, the current benchmarks that LLMs are run against rely on outdated datasets, cover a narrow range of technologies, and focus primarily on issue-to-patch workflows.

“As AI coding tools advance rapidly, the industry still lacks a neutral, standards-based framework to measure their real impact on developer productivity,” the company wrote in a blog post.

DPAI Arena uses a flexible, track-based architecture to enable reproducible comparisons across workflows like patching, bug fixes, PR review, test generation, static analysis, and more.

GitHub unveils Agent HQ, the next evolution of its platform that focuses on agent-based development

During its annual conference, GitHub Universe, GitHub shared its plans for Agent HQ, its vision for the future of the platform where AI agents are natively integrated across all of GitHub.

As part of this Agent HQ initiative, over the next several months, paid GitHub Copilot users will gain direct access to popular coding agents from Anthropic, OpenAI, Google, Cognition, xAI, and more.

Agent HQ brings with it several new capabilities to support this next evolution, the first of which is mission control, a central command center for assigning, steering, and tracking the work of multiple agents across GitHub, Copilot CLI, and VS Code.

Mission control’s branch controls give developers granular oversight over running checks for code created by the agents. Identity features will also be introduced to allow developers to manage agents like they would other coworkers: controlling which agent is building a task, managing access, and enforcing policies.

OpenAI completes restructuring, strikes new deal with Microsoft

OpenAI today announced that it has completed the restructuring of its business. When the company was founded in 2015, it was launched as a non-profit organization, and that non-profit has controlled the for-profit arm of the business.

Today’s restructuring turns the for-profit arm into a public benefit corporation called OpenAI PBC. The OpenAI Foundation, the new name for the non-profit, will still control the for-profit and hold a 26% equity stake in OpenAI PBC, which is currently valued at around $130 billion.

A public benefit corporation differs from traditional corporate structures in that it is “required to advance its stated mission and consider the broader interests of all stakeholders, ensuring the company’s mission and commercial success advance together,” OpenAI’s website explains.

Microsoft announces public preview for planning capability that improves how Copilot in Visual Studio handles complex tasks

Microsoft has announced a public preview for a new feature that aims to enable Copilot in Visual Studio to handle more complex projects.

With its new planning capability in Agent Mode, Copilot will research the codebase to break down big tasks into smaller and more manageable tasks, while also iterating on its plan as it works through the steps.

“Planning makes Copilot more predictable and consistent by giving it a structured way to reason about your project. It builds on techniques from hierarchical and closed-loop planning research – enabling Copilot to plan at a high level, execute step-by-step, and adjust dynamically as it learns more about your codebase and issues encountered during implementation,” Rhea Patel, product manager at Microsoft, wrote in a blog post.

GitKraken releases Insights to help companies measure ROI of AI

GitKraken, a software engineering intelligence company that specializes in improving the developer experience, announced the launch of GitKraken Insights to provide companies with better insights into AI’s impact on developer productivity.

Matt Johnston, CEO of GitKraken, told SD Times that despite the incremental investments in and perceived velocity gains from AI, companies struggle to understand the impact. “I was talking to a VP of developer experience at a large Silicon Valley company, and he was basically saying, ‘We’ve made investments of thousands of seats in Cursor and Copilot and Claude, and we can’t really tell what’s being used… and how on earth do I measure this in a way that’s compelling to my business leaders.’”

GitKraken Insights brings together several different metrics (DORA metrics, code quality analysis, technical debt tracking, AI impact measurement, and developer experience indicators) to paint a picture of what’s happening across the development lifecycle.

Mabl announces updates to Agentic Testing Teammate

The Agentic Testing Teammate works alongside human testers to make the process more efficient. New updates include AI vectorizations and test semantic search, improvements to test coverage, and enhancements to the MCP Server that enable testers to do a number of tasks directly within their IDE, including Test Impact Analysis, intelligent test creation, and failure recommendations.

“This new work is built on the idea that an agent can become an integral part of your testing team,” said Dan Belcher, co-founder of mabl. “Unlike scripting frameworks and general-purpose large language models, mabl builds deep knowledge about your application over time and uses that knowledge to make it–and your team–more effective.”

Couchbase 8.0 adds three new vector indexing and retrieval capabilities

These new capabilities are designed to support diverse vector workloads that facilitate real-time AI applications.

Hyperscale Vector Index is based on the DiskANN nearest-neighbor search algorithm and enables operation across partitioned disks for distributed processing. Composite Vector Index supports pre-filtered queries that can scope the specific vector being sought. Search Vector Index supports hybrid searches containing vectors, lexical search, and structured query criteria in a single SQL++ request.
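As a rough illustration of what a pre-filtered vector query means, the sketch below narrows the candidate set with a structured predicate first and only then ranks the survivors by cosine similarity. It is a brute-force toy in plain Python, not Couchbase's API and not DiskANN; the document fields (`id`, `category`, `vector`) and the `prefiltered_search` helper are assumptions made for the example.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def prefiltered_search(docs, query, predicate, k=2):
    """Apply the structured filter first, then rank the remaining vectors."""
    candidates = [d for d in docs if predicate(d)]
    ranked = sorted(candidates, key=lambda d: cosine(d["vector"], query), reverse=True)
    return ranked[:k]


# Hypothetical documents: a scalar field plus an embedding.
DOCS = [
    {"id": "a", "category": "shoes", "vector": [1.0, 0.0]},
    {"id": "b", "category": "shoes", "vector": [0.6, 0.8]},
    {"id": "c", "category": "hats", "vector": [0.9, 0.1]},
]

if __name__ == "__main__":
    hits = prefiltered_search(DOCS, [1.0, 0.0], lambda d: d["category"] == "shoes")
    print([d["id"] for d in hits])
```

Document `c` is very close to the query vector but never gets scored, because the filter removes it before any similarity is computed. That ordering is the point of pre-filtering.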

Anthropic expands memory to all paid Claude users

Anthropic announced that the recent memory feature in Claude is being rolled out to Pro and Max plan users, making it available to all paid users now.

Memory was originally announced in early September, but was only available to Team and Enterprise users to begin with.

Memory allows Claude to remember your projects and preferences so that you don’t need to re-explain important context across sessions. “Great work builds over time. With memory, each conversation with Claude improves the next,” Anthropic wrote in its initial announcement.

Harness brings vibe coding to database migration with new AI-Powered Database Migration Authoring feature

Harness is on a mission to make it easier for developers to do database migrations with its new AI-Powered Database Migration Authoring feature. This new capability allows users to describe schema changes in natural language to receive a production-ready migration.

For example, a developer could ask “Create a table named animals with columns for genus_species and common_name. Then add a related table named birds that tracks unladen airspeed and proper name. Add rows for Captain Canary, African swallow, and European swallow.”

Harness’ platform would then analyze the current schema and policies, generate a backward-compatible migration, validate the change for safety and compliance, commit it to Git for testing, and create rollback migrations.
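For a sense of what a forward migration and its rollback could look like for the request quoted above, here is a hand-written sketch using Python's built-in `sqlite3` module. It is not output from Harness. The column types, the `animal_id` foreign key, and the choice to leave `genus_species`, `proper_name`, and `unladen_airspeed` empty are assumptions, since the prompt does not supply those values.

```python
import sqlite3

# Forward migration: column types and the animal_id link are assumptions.
FORWARD = """
CREATE TABLE animals (
    id INTEGER PRIMARY KEY,
    genus_species TEXT,
    common_name TEXT NOT NULL
);
CREATE TABLE birds (
    id INTEGER PRIMARY KEY,
    animal_id INTEGER NOT NULL REFERENCES animals(id),
    proper_name TEXT,
    unladen_airspeed REAL
);
"""

# Rollback migration: drop the dependent table first.
ROLLBACK = """
DROP TABLE birds;
DROP TABLE animals;
"""

NAMES = ["Captain Canary", "African swallow", "European swallow"]


def run_demo():
    """Apply the migration, seed the rows, then roll back; return what was observed."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(FORWARD)
    for name in NAMES:
        cur = conn.execute("INSERT INTO animals (common_name) VALUES (?)", (name,))
        # Airspeed and proper name are unknown here, so they stay NULL.
        conn.execute("INSERT INTO birds (animal_id) VALUES (?)", (cur.lastrowid,))
    conn.commit()
    bird_count = conn.execute("SELECT COUNT(*) FROM birds").fetchone()[0]
    conn.executescript(ROLLBACK)
    remaining = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    conn.close()
    return bird_count, remaining


if __name__ == "__main__":
    print(run_demo())
```

Keeping the rollback script next to the forward script mirrors the pairing the article describes, and running both against an in-memory database is a cheap way to check that the rollback really undoes the change.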

Red Hat Developer Lightspeed brings AI assistance to Red Hat’s Developer Hub and migration toolkit

Red Hat Developer Lightspeed has been integrated into both the Red Hat Developer Hub and the migration toolkit for applications (MTA).

In the Red Hat Developer Hub, it acts as an assistant to speed up non-coding tasks, like exploring application design approaches, writing documentation, generating test plans, and troubleshooting applications.

In the migration toolkit, Red Hat Developer Lightspeed automates source code refactoring within the IDE. It leverages MTA’s static code analysis to understand migration issues and how to fix them, and also improves over time by learning what made past changes successful.

MariaDB unifies transactional, analytical, and vector databases in MariaDB Enterprise Platform 2026 release

MariaDB’s Enterprise Platform 2026 release was announced this week, with the promise that it will act as “the definitive database platform for building next-generation intelligent applications.”

To support agentic AI, the company added native RAG for grounding LLMs with context from MariaDB without needing embeddings, vector stores, or retrieval pipelines. The company also added ready-to-use agents within the platform, including a developer copilot that connects to the database and can respond to natural language queries, and a DBA copilot that can handle tasks like performance tuning and debugging.

Additionally, the company added an integrated MCP server so that agents can interact with MariaDB databases. The MCP interface in MariaDB allows users to integrate vector search, LLMs, and standard SQL operations, and allows agents to launch serverless databases in the cloud.

Spotify Portal now generally available and packed with features for improving dev experience

Spotify Portal for Backstage provides developers with a ready-to-use version of Backstage, its open source solution for building internal developer portals (IDPs).

AiKA, which is an AI assistant for Portal, can now connect to third-party MCP servers and trigger actions in Portal. AiKA itself also functions as an MCP server, allowing developers to connect it up to tools like Cursor or Copilot and access Portal data.

“The general availability of Spotify Portal marks a pivotal moment in how organizations build, measure, and optimize developer experience. What began as an internal tool for Spotify engineers is now a fully-fledged platform for enterprises, combining the reliability of Backstage, the insight of Confidence, and the speed of AI-driven workflows,” Spotify wrote.

Sonar announces new solution to optimize training datasets for coding LLMs

Sonar, a company that specializes in code quality, announced a new solution that aims to improve how LLMs are trained for coding purposes.

According to the company, LLMs that are used to help with software development are often trained on publicly available, open source code containing security issues and bugs, which become amplified throughout the training process. “Even a small amount of flawed data can degrade models of any size, disproportionately degrading their output,” Sonar wrote in an announcement.

SonarSweep (now in early access) aims to mitigate these issues by ensuring that models are learning from high-quality, secure examples.

It works by identifying and fixing code quality and security issues in the training data itself. After analyzing the dataset, it applies a strict filtering process to remove low-quality code while also balancing the updated dataset to ensure it will still offer diverse and representative learning.

Amazon launches Quick Suite to provide agentic AI across applications and AWS services

Amazon Quick Suite allows users to ask questions, conduct deep research, analyze and visualize data, and create automations.

It can connect to internal repositories, like wikis or intranet, and AWS services. Amazon also offers 50+ built-in connectors to applications like Adobe Analytics, SharePoint, Snowflake, Google Drive, OneDrive, Outlook, ServiceNow, and Databricks, as well as support for over 1,000 apps via connecting to their MCP servers.

This deep connection across the enterprise enables Quick Sight to analyze data across all of a company’s systems and create complex business workflows across multiple applications and departments.

“Unlike traditional business intelligence tools that work only with databases and data warehouses, Quick Sight’s agentic experience analyzes all forms of data across all your systems and apps, including your documents,” Amazon wrote in a blog post.

Google unveils Gemini Enterprise to offer companies a more unified platform for AI innovation

Google is announcing a new offering built around Gemini, designed specifically with large enterprise use in mind.

Gemini Enterprise consolidates six core components:

  • Advanced Gemini models
  • A no-code workbench for analyzing information and orchestrating agents
  • Pre-built Google agents for tasks like deep research or data insights
  • The ability to connect to company data
  • A central governance framework for visualizing and securing all agents
  • Access to an ecosystem of over 100,000 industry partners

“By bringing all of these components together through a single interface, Gemini Enterprise transforms how teams work. It moves beyond simple tasks to automate entire workflows and drive smarter business outcomes, all on Google’s secure, enterprise-grade architecture,” Thomas Kurian, CEO of Google Cloud, wrote in a blog post.

Atlassian shares major updates to its genAI assistant Rovo at Team ’25 Europe

Atlassian is hosting its annual user conference Team ’25 Europe this week in Barcelona, and during the event, the company shared several new and upcoming updates to its generative AI assistant Rovo.

Atlassian announced the general availability of its AI coding agent Rovo Dev. Rovo Dev can help with code reviews, documentation, dependency cleanups, and more, and it leverages context from tickets, docs, incidents, and business goals to provide developers with information that will help them make more informed decisions.

Additionally, starting early next year, Rovo Search will become the default search in Jira, which will allow Jira’s search to suggest relevant issues and projects.

Rovo Chat will also be getting over 100 out-of-the-box modular capabilities from Atlassian and its partners that can be used in chat, agents, and workflows. Other new Chat capabilities include the ability to remember past conversations and preferences and a new collaborative workspace called Canvas.

Google launches ecosystem of extensions for Gemini CLI

Google is launching Gemini CLI extensions to allow different development tools to connect up to the Gemini CLI.

Each extension includes a playbook that teaches the CLI how to effectively use that tool, eliminating the need for developers to configure them. “If you want to look under the hood, Gemini CLI extensions package instructions, MCP servers and custom commands into a familiar and user-friendly format,” Google wrote in a blog post.

Twenty-two extensions are available at launch from Google partners Atlassian, Canva, Confluent, Dynatrace, Elastic, Figma, GitLab, Grafana Labs, Harness, HashiCorp, MongoDB, Neo4j, Pinecone, Postman, Qodo, Shopify, Snyk, Sonar, Stripe, ThoughtSpot, Weights & Biases by CoreWeave, and WIX.

IBM adds new capabilities to watsonx Orchestrate to facilitate agentic AI at scale

As IBM kicked off its annual developer event TechXchange 2025, it announced several new capabilities to enable organizations to unlock value from agentic AI.

“There’s certainly been a lot of buzz in the industry,” said Bruno Aziza, vice president of Data, AI, and Analytics Strategy at IBM Software. “I think if you look at the context of everything that’s going on, customers are struggling. They’re struggling to get value from their investment.”

IBM announced many updates to its AI agent orchestration platform, watsonx Orchestrate. The platform now includes AgentOps, an observability and governance layer for AI agents; Agentic Workflows, standardized and reusable flows that can be used to build and sequence multi-agent systems; and Langflow integration to reduce agent setup time.

OpenAI DevDay: ChatGPT Apps, AgentKit, and GA release of Codex

OpenAI held its annual Developer Day event this week where it announced several updates to its products.

The company unveiled apps in ChatGPT as well as an SDK for developers to build them. Companies that have created apps that are already available include Booking.com, Canva, Coursera, Figma, Expedia, Spotify, and Zillow.

When a user says the name of an available app in a prompt, ChatGPT will automatically surface that app in the chat. For example, saying “Spotify, make a playlist for my party this Friday” will bring in the Spotify app. ChatGPT will also be able to suggest apps when it thinks they’re relevant to the conversation, such as suggesting Zillow’s app in a conversation about buying a house.

Google’s coding agent Jules now works in the command line

Google’s coding agent Jules can now be used directly in developers’ command lines so that it can act as more of a coding companion.

According to Google, it created this new command line interface, called Jules Tools, out of a recognition that the terminal is where developers spend most of their time.

Jules Tools allows developers to spin up tasks, inspect what Jules is doing, and integrate Jules into automation. “Think of Jules Tools as both a dashboard and a command surface for your coding agent,” Google wrote in a blog post.

Amazon Bedrock AgentCore MCP server now available

The AgentCore MCP server offers built-in support for runtime, gateway integration, identity management, and agent memory. It was created to speed up the process of creating components that are compatible with Bedrock AgentCore.

“What typically takes significant time and effort, for example learning about Bedrock AgentCore services, integrating Runtime and Tools Gateway, managing security configurations, and deploying to production can now be completed in minutes through conversational commands with your coding assistant,” AWS wrote in a blog post.

DigitalOcean updates Gradient AI Platform

The Gradient AI Platform is a platform for building AI agents without needing to manage the underlying infrastructure. New features that have been added include support for image generation, auto-indexing of knowledge bases, and VPC integration.

Additionally, DigitalOcean revealed that it will be expanding the platform further in the next few weeks with new offerings like the Gradient AI AgentDevelopmentKit and Gradient AI Genie, which integrates into IDEs and can be used to manage multi-agent systems using natural language.

Microsoft announces preview of its new Agent Framework

Microsoft has announced a preview of the Microsoft Agent Framework, an open-source development kit for .NET and Python for creating AI agents and multi-agent workflows.

It supports creating individual agents as well as graph-based workflows to connect up multiple agents.

According to Microsoft, the Agent Framework is a direct successor to its other projects Semantic Kernel and AutoGen, using foundations from both. It brings together Semantic Kernel’s enterprise-grade features like thread-based state management, type safety, filters, telemetry, and model and embedding support, with AutoGen’s abstractions for single- and multi-agent patterns.

Mendix updates its low-code platform with agentic AI options

New agent and genAI features include an agent builder, the ability to create project plans using generative AI, the ability to create microflows and workflows with AI, and support for MCP.

Another focus area of the release is business process automation, and new features related to that include the ability for Mendix Workflows to call AI agents, dynamic case management, and Global Inbox, a single view for all tasks from multiple distributed workflows.

California passes law to ensure safe innovation of frontier AI models

Earlier this week, California’s governor Gavin Newsom signed a new law designed to ensure safe development and deployment of frontier AI models.

“California has proven that we can establish regulations to protect our communities while also ensuring that the growing AI industry continues to thrive,” Newsom said. “This legislation strikes that balance. AI is the new frontier in innovation, and California is not only here for it – but stands strong as a national leader by enacting the first-in-the-nation frontier AI safety legislation that builds public trust as this emerging technology rapidly evolves.”

The law, SB 53, establishes requirements for companies developing frontier AI models, spanning five categories: transparency, innovation, safety, accountability, and responsiveness.

Slack evolves to support agentic capabilities built on conversation data

Salesforce is announcing several major updates to Slack that will enable customers to leverage their conversation history for AI apps and agents.

The company is announcing a real-time search (RTS) API, which surfaces up-to-date discussions, files, and channels to provide agents with access to context-aware information. To ensure secure use of information, data stays in Slack, and the API adheres to existing user access permissions and only retrieves data that’s relevant to the query.

“It unlocks your organization’s collective intelligence, securely connecting agents to conversations and decisions that were once trapped in silos,” Salesforce wrote in a blog post.

Anthropic claims its newly released Claude Sonnet 4.5 is the “best coding model in the world”

Claude Sonnet 4.5 achieves 77.2% on SWE-bench for software engineering, compared to 74.5% for Claude Opus 4.1 and 72.7% for Claude Sonnet 4. For external comparison, GPT-5 Codex scored 74.5%, GPT-5 scored 72.8%, and Gemini 2.5 Pro scored 67.2%.

Additionally, it leads on the OSWorld benchmark, which tests AI models on real-world computer tasks. It scored 61.4% on that benchmark, beating out Claude Sonnet 4, which scored 42.2%.

“Sonnet 4.5 can produce near-instant responses or extended, step-by-step thinking that is made visible to the user,” Anthropic says.

According to Anthropic, Claude Sonnet 4.5 also shows better domain-specific knowledge and reasoning in the fields of finance, law, and medicine.

Workato announces MCP platform

Workato Enterprise MCP provides customers with access to over 100 fully managed MCP servers that can connect with different LLMs and agents, including ChatGPT, Claude.AI, Amazon Q, Cursor, and Google Gemini. Some of the MCP servers available in the platform include ones from Atlassian, Box, Reddit, Salesforce, Okta, and Shopify.

“At Workato, we hear every day that while MCP is exciting, enterprises still face challenges making MCP work securely, effectively, and reliably at scale,” said Adam Seligman, Chief Technology Officer at Workato. “Workato Enterprise MCP changes that by bringing the full spectrum of business processes, from the front office to the back office and everything in between, to AI agents through MCP. With pre-built, enterprise-grade servers and skills, we’re giving global enterprises a first-of-its-kind solution that unlocks AI agents to safely execute real business processes at scale, delivering measurable business value.”

VibeSec embeds security analysis into AI coding models to prevent generation of insecure code

OX Security is shifting security as far left as it can go with the launch of VibeSec, which it says can stop insecure AI-generated code before the code even gets generated.

It does this by embedding dynamic security context into the coding model so that it doesn’t suggest code that contains security issues.

“VibeSec doesn’t just accelerate security – it fundamentally changes how security operates. For the first time, security moves faster than vulnerabilities,” said Neatsun Ziv, co-founder and CEO at OX Security.

OutSystems launches Agent Workbench

Agent Workbench allows users to create and orchestrate AI agents that leverage their company’s data sets and workflows. For example, in early access, Axos Bank built a log analysis agent to interpret error logs, and Thermo Fisher Scientific used it to build a Customer Escalation Agent that interprets unstructured data from customer interactions.

“Agent Workbench was created to give our customers the tools they need to build the agentic future with OutSystems. Our Early Access Program participants have realized impressive results with Agent Workbench, positioning them as industry leaders in agentic AI,” said Woodson Martin, CEO of OutSystems.
