While on vacation this summer I was flabbergasted by the number of new releases, features, product announcements and improvements that were shipped. I therefore decided to write a recap of the biggest announcements and the overall trends that I am seeing in the market.
Firstly, looking at the different vendors in the market, Google is clearly doing something right with the direction they are taking. They are now processing 980+ trillion monthly tokens across their products and APIs (up from 480T in May), so adoption is clearly growing. Much of this is of course driven by the launch of their video generation model (Veo 3).

Secondly, it should be noted that I am seeing more and more open-source models coming from China, such as the latest versions of Qwen (from Alibaba) and Kimi K2, a Mixture-of-Experts LLM with 32B active parameters that scores almost as high as the closed-source models https://github.com/MoonshotAI/Kimi-K2
Below are the benchmark scores for Kimi-K2-Instruct compared to DeepSeek, Claude 4 Sonnet and GPT-4.1

And with Grok 4, xAI, Elon Musk's AI venture, has also taken big leaps in terms of performance, being ranked as the number one model in many different categories (for the time being, at least until GPT-5 gets released).
General announcements and miscellaneous
- White House AI Action Plan – https://www.whitehouse.gov/wp-content/uploads/2025/07/Americas-AI-Action-Plan.pdf which describes how the US is going to win the "AI race" by making it easier to build out new infrastructure and allowing easier export of US-developed AI. The action plan also puts a stronger focus on open-source and open-weight AI.
- Mistral published a report on the environmental impact of GenAI https://mistral.ai/news/our-contribution-to-a-global-environmental-standard-for-ai. Many have been asking "how much water does GenAI use?", and in this report Mistral shows that a response from their Large 2 model of 400 output tokens (circa 350 words) uses about 45 mL of water. While this is a lot less than first assumed, it is important to note that Claude and the other large providers send a system prompt closer to 15,000 tokens to the service on every request, meaning that the real consumption is a lot larger (a rough back-of-envelope calculation follows after this list).
- A "hacker" was able to compromise the Amazon Q extension for VS Code by adding wipe commands to the extension https://www.404media.co/hacker-plants-computer-wiping-commands-in-amazons-ai-coding-agent/. While VS Code has built-in safeguards that would prevent this from happening, it shows that with the introduction of MCP and natural-language commands, malicious attacks will soon just be natural language embedded into an integration.
- Perplexity released Comet, which they have dubbed an AI-first browser: it combines Perplexity, agents and browser-use integration to interact with the local browser and perform tasks, similar to OpenAI's agent features (currently only available for Max subscribers, with a waitlist for everyone else) https://comet.perplexity.ai/
- Google and Windsurf entered into a partnership of sorts, where many of the top people joined Google https://windsurf.com/blog/windsurfs-next-stage. Windsurf, on the other hand, released a new wave of features (called Wave 11) where they, among other things, added voice mode to their IDE, and Windsurf is now part of Cognition.
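To make the water comparison concrete, here is a rough back-of-envelope calculation. It assumes water use scales linearly with the number of tokens processed and that input tokens count the same as output tokens, both crude simplifications, so treat the numbers as illustrative only.

```python
# Back-of-envelope sketch of the water-per-token argument above.
# Assumptions (mine, not Mistral's): linear scaling per token, and input
# tokens weighted the same as output tokens. Both are simplifications.

REPORT_ML_PER_RESPONSE = 45.0   # mL of water per ~400-token response (Mistral report)
REPORT_TOKENS = 400

ml_per_token = REPORT_ML_PER_RESPONSE / REPORT_TOKENS   # ~0.11 mL per token

# The same 400-token answer, but with a ~15,000-token system prompt
# sent along on every request:
tokens_per_request = 15_000 + 400
estimated_ml = tokens_per_request * ml_per_token

print(f"{ml_per_token:.3f} mL per token")
print(f"~{estimated_ml:.0f} mL per request with a 15k-token system prompt")
# -> roughly 1.7 litres per request under these (very rough) assumptions
```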
OpenAI
- A lot is going on at OpenAI: they are now in the process of publishing their first open-source model, as well as the next generation of GPT (GPT-5), which is scheduled to be released in August.
- OpenAI announced a new 4.5 GW partnership with Oracle, where OpenAI will be using Oracle Cloud Infrastructure to host their services as part of Project Stargate.
- ChatGPT Record mode – With Record mode, ChatGPT can transcribe and summarize audio recordings like meetings, brainstorms, or voice notes. Available for Plus, Enterprise, Edu, Team, and Pro workspaces, and currently only in the macOS desktop app.
- ChatGPT Study Mode – Allows you to use ChatGPT to study on a certain topic https://www.bleepingcomputer.com/news/artificial-intelligence/openai-confirms-chatgpts-new-study-feature-helps-with-exams/
- ChatGPT Agent mode – now released and available for most subscriptions, it allows you to define a set of complex tasks that ChatGPT can run within a virtualized browser (building on the previous Operator feature) and against connected sources. Agent mode uses a specialized fine-tuned model which improves the success rate on agentic tasks.
- OpenAI also announced a high-fidelity input parameter for image generation and editing https://community.openai.com/t/image-generation-high-fidelity-editing/1317649 (see the sketch after this list).
- OpenAI also published a list of supported connectors, meaning data sources that ChatGPT can connect to https://help.openai.com/en/articles/11487775-connectors-in-chatgpt
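Since the high-fidelity announcement is about an API parameter, here is a minimal sketch of how it would be used from the OpenAI Python SDK. The parameter name input_fidelity and the gpt-image-1 model id are taken from the linked announcement; a recent SDK version is assumed.

```python
# Minimal sketch of high-fidelity image editing with the OpenAI Python SDK.
# Assumption: the new parameter is exposed as input_fidelity="high" on
# images.edit for the gpt-image-1 model, per the linked announcement.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("portrait.png", "rb") as source_image:
    result = client.images.edit(
        model="gpt-image-1",
        image=source_image,
        prompt="Put the person in a business suit, keep the face unchanged",
        input_fidelity="high",  # preserve faces/logos from the input image
    )

print(result.data[0].b64_json[:80])  # base64-encoded edited image
```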
Microsoft and Copilot
- Copilot Memory – Memory in Copilot is a feature that allows Microsoft 365 Copilot to remember key facts about the user, like preferences, working style, and recurring topics. It essentially works like a personalized addition to the system prompt https://techcommunity.microsoft.com/blog/microsoft365copilotblog/introducing-copilot-memory-a-more-productive-and-personalized-ai-for-the-way-you/4432059
- Copilot Retrieval API – a Microsoft Graph API that allows you to fetch data from the semantic index (which includes SharePoint and Graph connector data); see the sketch after this list
- Copilot APIs have also been introduced –> https://devblogs.microsoft.com/microsoft365dev/microsoft-365-copilot-apis/
- Microsoft Copilot Search is now available –> https://techcommunity.microsoft.com/blog/microsoft365copilotblog/announcing-microsoft-365-copilot-search-general-availability-a-new-era-of-search/4435537
- Updates to Copilot Analytics –> https://techcommunity.microsoft.com/blog/microsoft365copilotblog/new-copilot-analytics-improves-access-and-reporting-on-microsoft-365-copilot-cha/4416353
- Copilot Vision – desktop sharing, allowing you to share your entire desktop with Copilot
- Microsoft introduced the new Mu language model on Windows https://blogs.windows.com/windowsexperience/2025/06/23/introducing-mu-language-model-and-how-it-enabled-the-agent-in-windows-settings/
- GitHub Spark – create applications from a single prompt, similar to what you have with Databutton https://github.com/spark
- GitHub Copilot app modernization for .NET enters public preview (https://github.blog/changelog/2025-07-21-github-copilot-app-modernization-for-net-enters-public-preview/)
- MCP servers are now officially supported in VS Code, with a predefined marketplace available –> https://code.visualstudio.com/mcp
- Copilot Studio now supports computer use for Frontier customers https://learn.microsoft.com/en-us/microsoft-copilot-studio/computer-use?tabs=new
- Microsoft released the Phi-4-mini-flash-reasoning model, one of the smallest LLMs that can do reasoning https://azure.microsoft.com/en-us/blog/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning/
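Here is a hypothetical sketch of what a call to the Copilot Retrieval API could look like. The endpoint path, request fields and response shape below are my assumptions based on the announcement rather than verified documentation, and acquiring the Entra ID token is left out.

```python
# Hypothetical sketch of querying the Copilot Retrieval API via Microsoft Graph.
# Assumptions (mine): the endpoint lives under /beta/copilot/retrieval and takes
# queryString / dataSource in the body; the response contains retrievalHits.
import requests

GRAPH_URL = "https://graph.microsoft.com/beta/copilot/retrieval"  # assumed path
access_token = "<token acquired via MSAL / Entra ID>"

payload = {
    "queryString": "latest travel policy for contractors",
    "dataSource": "sharePoint",           # query the semantic index over SharePoint
    "maximumNumberOfResults": 5,
}

response = requests.post(
    GRAPH_URL,
    json=payload,
    headers={"Authorization": f"Bearer {access_token}"},
    timeout=30,
)
response.raise_for_status()

for hit in response.json().get("retrievalHits", []):
    print(hit.get("webUrl"))
```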
More agents are also coming from Microsoft in the upcoming months.

Grok
- Grok 4 was released: it supports voice mode, photos and companions, supports deep reasoning, and is currently leading the ARC-AGI leaderboard (https://arcprize.org/leaderboard). However, as before, they have received some criticism for the lack of guardrails; follow this person on Twitter/X for more info about prompt injection attacks https://x.com/elder_plinius
- Grok does not have a native CLI, but the team behind superagent created GrokCLI https://github.com/superagent-ai/grok-cli
- Grok also has App Connections and Task Automations (similar to Scheduled Actions with Gemini or OpenAI)
- Grok has companions that are “digital avatars” that you can talk with.

Anthropic
- Claude Code is a CLI tool that allows you to interact with Claude as a virtual agent https://github.com/anthropics/claude-code
- Conductor is an open-source tool that allows you to run multiple instances of Claude Code https://github.com/superbasicstudio/claude-conductor
- Claude for Financial Services, a predefined set of services and integrations to financial data sources –> https://www.anthropic.com/news/claude-for-financial-services
MCP
MCP has now become a universal standard for integrating GenAI with third-party actions/tools and data sources, so I wanted to add some updates here as well, since there have been big changes to the protocol with the new revision of the standard released back in March, which supports remote MCP servers and uses OAuth for authorization.
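To make the protocol concrete before the list of new servers, here is a minimal server sketch using the official MCP Python SDK's FastMCP helper; the tool itself is a made-up example.

```python
# Minimal MCP server sketch using the official Python SDK (package: "mcp").
# The tool below is a made-up example; the point is how little code is needed
# before the server can be wired into Claude, VS Code or any other MCP client.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("release-recap")

@mcp.tool()
def get_release_summary(vendor: str) -> str:
    """Return a one-line summary of a vendor's latest releases (dummy data)."""
    summaries = {
        "openai": "GPT-5 scheduled for August, Agent mode generally available",
        "google": "Veo 3 in the Gemini app, Gemini 2.5 Flash-Lite now stable",
    }
    return summaries.get(vendor.lower(), "No summary recorded for this vendor")

if __name__ == "__main__":
    # "stdio" is the classic local transport; the March revision of the spec
    # added remote (streamable HTTP) transports with OAuth on top.
    mcp.run(transport="stdio")
```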
- MCP Server Grafana https://github.com/grafana/mcp-grafana
- Grep (Vercel's search across a million GitHub repositories) is now exposed via MCP https://vercel.com/blog/grep-a-million-github-repositories-via-mcp
- Brave MCP Server https://brave.com/search/api/guides/use-with-claude-desktop-with-mcp/
- MCP-B brings MCP to websites via a browser extension https://mcp-b.ai/
- Docker Hub MCP Servers https://hub.docker.com/mcp
- AWS Knowledge MCP Server https://aws.amazon.com/about-aws/whats-new/2025/07/aws-knowledge-mcp-server-available-preview/
- Hashicorp releases MCP Server for Vault and Terraform https://developer.hashicorp.com/terraform/docs/tools/mcp-server
- AWS Price list MCP Server https://github.com/awslabs/mcp/tree/main/src/aws-pricing-mcp-server
- IBM Context Forge for MCP is a centralized way to manage and proxy MCP requests https://github.com/IBM/mcp-context-forge
AWS
- S3 now supports vectors, which allows us to do similarity searches across large datasets https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors.html (see the sketch after this list)
- AWS Bedrock AgentCore is a service that lets you build and run agents on the most common frameworks, such as CrewAI and LangGraph, using predefined modules such as runtime, identity and memory, with agent interaction running on AWS PaaS services.
- AWS launched Kiro (currently in preview) https://dev.to/aws-builders/kiro-vs-cursor-how-amazons-ai-ide-is-redefining-developer-productivity-3eg8, a new agentic IDE from AWS positioned as a competitor to Cursor.
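Since S3 Vectors is an API, here is a minimal sketch of storing and querying vectors with boto3. The service is in preview, and the client name (s3vectors), the operation names (put_vectors, query_vectors) and the field names are my reading of the preview documentation, so treat them as assumptions.

```python
# Sketch of similarity search with the S3 Vectors preview via boto3.
# Assumptions: the service is exposed as "s3vectors" with put_vectors /
# query_vectors operations, and a vector bucket + index already exist.
import boto3

s3vectors = boto3.client("s3vectors", region_name="us-east-1")

# Store a vector (the embedding itself is computed elsewhere, e.g. with Bedrock)
s3vectors.put_vectors(
    vectorBucketName="my-vector-bucket",
    indexName="articles",
    vectors=[{
        "key": "doc-001",
        "data": {"float32": [0.12, -0.07, 0.33]},  # real embeddings are much longer
        "metadata": {"title": "Summer AI recap"},
    }],
)

# Run a similarity search against the index
response = s3vectors.query_vectors(
    vectorBucketName="my-vector-bucket",
    indexName="articles",
    queryVector={"float32": [0.11, -0.05, 0.30]},
    topK=3,
    returnMetadata=True,
)
for match in response["vectors"]:
    print(match["key"], match.get("metadata"))
```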
Google and Gemini
- Gemini CLI, a CLI tool to interact with Gemini from the terminal https://github.com/google-gemini/gemini-cli
- Scheduled Actions in Gemini https://blog.google/products/gemini/scheduled-actions-gemini-app/
- You can now use Veo 3 to transform your photos into dynamic eight-second video clips with sound in the Gemini app.
- Gemini Code Assist now supports Agent Mode https://blog.google/technology/developers/gemini-code-assist-updates-july-2025/
- Gemini 2.5 Flash-Lite, which is currently probably the cheapest closed-source LLM https://developers.googleblog.com/en/gemini-25-flash-lite-is-now-stable-and-generally-available
- Google Opal –> a new experimental tool from Google Labs that allows you to build AI mini apps that chain together prompts, models, and tools using natural language and visual editing –> https://developers.googleblog.com/en/introducing-opal/
- Google released a new embedding model, now generally available in the Gemini API https://developers.googleblog.com/en/gemini-embedding-available-gemini-api/
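Here is a minimal sketch of generating embeddings through the google-genai Python SDK; the model id gemini-embedding-001 is the one named in the announcement, and an API key in the environment is assumed.

```python
# Minimal sketch: generating embeddings with the Gemini API (google-genai SDK).
# Assumes GEMINI_API_KEY is set and that the GA model id is "gemini-embedding-001".
from google import genai

client = genai.Client()  # picks up GEMINI_API_KEY from the environment

result = client.models.embed_content(
    model="gemini-embedding-001",
    contents="Recap of the biggest GenAI announcements this summer",
)

vector = result.embeddings[0].values
print(len(vector), vector[:5])  # dimensionality plus a peek at the first values
```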
Qwen (Alibaba)
- Wan is coming! A new video & image generation model from Alibaba Cloud; read more about it here –> https://x.com/Alibaba_Wan (seems to be similar to Sora)
- Qwen Code https://github.com/QwenLM/qwen-code – a CLI tool similar to Claude Code that provides the same kind of agent-mode integration.
- Qwen-MT https://qwenlm.github.io/blog/qwen-mt – a new LLM aimed at translation, pretty accurate for Norwegian.
- Qwen3-Coder-480B-A35B-Instruct, a 480B-parameter Mixture-of-Experts model with 35B active parameters that supports a context length of 256K tokens. It also scores pretty high on SWE-bench.

- Alibaba also released Qwen3-235B-A22B-Thinking-2507, a new reasoning model with 256K context length that scores similarly to Gemini 2.5 Pro and OpenAI o4-mini. The best part, as with most Alibaba models, is that they are open-source!
Mistral
- Deep Research in Le Chat
- Voice mode (using Voxtral), a new open-source model for speech understanding https://mistral.ai/news/voxtral (supports English, Spanish, French, Portuguese, Hindi, German, Dutch and Italian)
- Advanced Image editing (Black Forest Labs Partnership)
- Devstral (New open-source model for coding) https://mistral.ai/news/devstral
- Mistral Code https://mistral.ai/news/mistral-code, a new platform for GenAI coding that provides inference and access to open-source LLMs through the Continue-based extension in VS Code. The service can either be self-hosted or consumed directly from Mistral. In combination with this, Mistral is also offering their own compute capacity, called Mistral Compute https://mistral.ai/news/mistral-compute
