Last Week in AI #339 - DLSS 5, OpenAI Superapp, MiniMax M2.7
DLSS 5 looks like a real-time generative AI filter for video games, OpenAI Reportedly Pivoting to a Focus on Business and Productivity Only, and more!
DLSS 5 looks like a real-time generative AI filter for video games
Related:
Summary: Nvidia unveiled DLSS 5, calling it a “GPT moment for graphics” that blends traditional 3D rendering with generative AI to boost photorealism in real time up to 4K. Unlike prior DLSS upscalers, DLSS 5’s end-to-end AI model analyzes a single frame’s scene semantics—characters, hair, fabric, translucent skin—and lighting conditions (front-lit, back-lit, overcast) to generate new detail. Early examples from Resident Evil Requiem, Starfield, Hogwarts Legacy, and EA Sports FC show sharper lighting and shadows but also noticeable alterations to character materials and faces—e.g., Requiem’s Grace Ashcroft appearing with fuller lips and heavy eyeshadow, and Starfield models looking stage-lit and hyper-sharpened.
Nvidia says artistic intent is preserved by anchoring outputs with per-frame color and motion vectors and giving developers granular controls over intensity, color grading, blending, contrast, saturation, gamma, per-object masking, and exclusion zones; still, some developers like Mike Bithell criticized the look as removing art direction. On the technical side, DLSS 5 fuses structured 3D data (the “ground truth” of the virtual scene) with generative, probabilistic models that can predict and fill in image elements instead of rendering everything from scratch, aiming to deliver “beautiful, amazing, as well as controllable” results with less compute.
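The per-object masking and exclusion-zone controls described above suggest a per-pixel blend between the traditionally rendered frame and the generated one. A minimal sketch of how such masked blending might work — the function name, the mask semantics, and the blend formula are all illustrative assumptions, not Nvidia's actual pipeline:

```python
import numpy as np

def blend_frame(rendered, generated, object_mask, intensity=0.5):
    """Blend a generated frame into a rendered one.

    rendered, generated: float arrays of shape (H, W, 3), values in [0, 1]
    object_mask: float array (H, W); 1.0 where the generative pass may
        contribute, 0.0 inside developer-defined exclusion zones
    intensity: global slider scaling the generative contribution
    """
    # Per-pixel weight: masked regions get up to `intensity` generative detail
    w = intensity * object_mask[..., None]
    return (1.0 - w) * rendered + w * generated

# Toy 2x2 frame with an exclusion zone in the left column
rendered = np.zeros((2, 2, 3))
generated = np.ones((2, 2, 3))
mask = np.array([[0.0, 1.0], [0.0, 1.0]])
out = blend_frame(rendered, generated, mask, intensity=0.5)
```

In a scheme like this, the "granular controls" amount to exposing `intensity`, the masks, and similar per-channel knobs to developers rather than letting the model overwrite the frame wholesale.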
DLSS 5 is slated for release this fall with initial support confirmed for titles such as Starfield, Resident Evil Requiem, Hogwarts Legacy, EA Sports FC, The Elder Scrolls VI: Oblivion remake, and Assassin’s Creed Shadows.

Editor’s Take: The initial reaction to this seemed heavily negative to me, which is a shame considering it must be a massive technical achievement. Nvidia messed up by not making artist control front and center, leaving the impression that this is just an ‘AI filter’ that will make games look worse. Still, I’m personally excited to see how this looks in practice!
OpenAI Reportedly Pivoting to a Focus on Business and Productivity Only
Related:
Summary: OpenAI is pivoting hard toward business and productivity, with Chief of Applications Fidji Simo announcing plans to merge ChatGPT, the Codex coding platform, and the Atlas browser into a single desktop “superapp.” The plan reverses a product strategy from last year that left the company scattered, with multiple individual apps that drew an uneven response from users and pulled internal attention in different directions.
In an internal memo, Simo wrote: “We realized we were spreading our efforts across too many apps and stacks, and that we need to simplify our efforts. That fragmentation has been slowing us down and making it harder to hit the quality bar we want.”
The centerpiece of the combined app will be “agentic” AI — tools designed to run independently on a computer and handle tasks ranging from coding to data analysis. In the near term, Codex will be expanded to handle productivity work beyond coding, with ChatGPT and Atlas brought into the unified app in later phases. The mobile ChatGPT app will remain unchanged.
The urgency is hard to miss. Anthropic’s portion of enterprise AI spending has climbed to 40% while OpenAI’s share of the same market fell from roughly half to about 27%. At an all-hands meeting, Simo reportedly told employees they couldn’t afford to be distracted by “side quests” given Anthropic’s rapid success winning over enterprise and coding customers.
Editor’s Take: I’d say it’s fair to call this unsurprising —* OpenAI’s various bets (Sora, Atlas, Prism) seem not to have paid off much, and their lack of focus on Codex no doubt hurt their ability to compete with Anthropic’s Claude Code / Cowork. As a big fan of Claude Code, I’m happy there is healthy competition in the space to make sure both Codex and Claude improve. Aside from that, I’m curious to see what this ‘Superapp’ might look like, assuming it ever actually releases.
*For the record: I added that em-dash, not AI!
Meta’s Manus Launches Desktop App With AI Agent for Tasks Across Files, Apps
Related:
Summary: Meta’s newly acquired startup Manus released a desktop app for Mac (Apple Silicon) and Windows that brings its agentic system “My Computer” onto local machines. The app presents a chatbot-style interface with a central prompt and options to attach files or folders, then executes command-line (CLI) instructions in the system terminal to carry out tasks. Capabilities include reading, analyzing, and editing local files; launching and controlling local applications; and performing bulk operations like sorting thousands of photos into categorized subfolders or renaming large batches of invoices. It can convert file formats, build simple apps, and even use a local GPU to train a machine learning model or run a large language model for inference.
The tool also supports remote actions and Google app integrations so users can, for example, fetch a desktop file and have the agent email it to a client while away. Each folder added for automation triggers a permission prompt with Allow, Always Allow, or Cancel, and the app is available with a limited free plan and paid tiers starting at $20/month ($17 billed annually). The release follows attention on similar AI agents like OpenClaw and Perplexity’s Personal Computer, with experts warning about privacy and security risks from agents that execute system-level commands. Manus began in China and moved HQ to Singapore; Chinese authorities are reportedly reviewing the legality of its acquisition by Meta, which previously offered only cloud-based services before this on-device expansion.
Editor’s Take: Speaking of Codex and Cowork, I guess Meta is jumping on that bandwagon as well? A bit of a weird move, even if it makes sense for Manus. As with LLM chatbots back in 2023, deep research-type agents in 2024, and reasoning models in 2025, it seems like all the big players in AI are investing in the hot new trend (deservedly so) and generally converging on very similar offerings.
MiniMax M2.7 Testing Shows Benchmark Wins & Major Cost Savings
Related:
Summary: MiniMax’s new M2.7 model posts strong agentic performance at unusually low cost, with benchmark scores of 56.22% on Swaybench/SWE-Pro, 55.6% on VIBE-Pro, and 57% on Terminal-Bench 2. It emphasizes autonomous self-improvement, running 100+ self-training cycles via agent harnesses and reinforcement learning that the company says yield a 30% capability lift, plus 97% skill adherence across 40+ complex skills and a 24K context window. The release is live on MiniMax Agent and via API, and supports multi-agent collaboration, autonomous debugging, research agent harnesses, and more.
Pricing is a standout: as low as $0.30 per million input tokens and $120 per million output tokens, with an optional fast mode at 2x price, positioning M2.7 as up to 50x cheaper than Opus 4.6 while claiming wins over Gemini 3.1 Pro and competitive Terminal-Bench 2 performance for enterprise workflows in finance, ML pipelines, game dev, and dynamic web UIs.
Relatedly, Cursor’s Composer 2 arrives as a parallel push on affordable agentic coding, beating Opus 4.6 on Terminal-Bench 2.0 with 61.7% (vs. 58.0%) at $0.5/$2.5 per million input/output tokens (fast mode $1.5/$7.5), though still below GPT-5.4’s 75.1%. A key technical novelty is “self-summarization,” a compaction-in-the-loop RL method that trains the model to pause on token-length triggers and compress its own action history to ~1,000 tokens from 5,000+, with rewards spanning the entire trajectory; Cursor reports 50% fewer compaction errors and stronger long-horizon task handling.
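The compaction trigger in Cursor's self-summarization scheme is simple to picture as control flow. Here is a minimal sketch — the function names, token counts, and the fixed `summarize` callable are illustrative assumptions; the real system reportedly *learns* to compress its own history via RL rather than calling a fixed summarizer:

```python
def run_with_compaction(actions, summarize, trigger_tokens=5000, target_tokens=1000):
    """Maintain an agent's action history, compacting when it grows too long.

    actions: iterable of (entry, token_count) pairs, one per agent step
    summarize: callable that compresses a list of entries toward
        target_tokens; returns (summary_entry, summary_token_count)
    """
    history, total = [], 0
    for entry, n_tokens in actions:
        history.append((entry, n_tokens))
        total += n_tokens
        if total > trigger_tokens:
            # Pause on the token-length trigger and compress the whole
            # history into one short summary entry
            summary, n = summarize([e for e, _ in history])
            history, total = [(summary, n)], n
    return history, total

# Toy summarizer: join entries and pretend the result is 1000 tokens
toy_summarize = lambda entries: (" | ".join(entries)[:50], 1000)
steps = [(f"step{i}", 600) for i in range(12)]  # 7200 tokens of raw history
hist, total = run_with_compaction(steps, toy_summarize)
```

The interesting part of the reported method is not this loop but the training signal: rewards span the entire trajectory, so the model is optimized to write summaries that keep later steps solvable, not just short.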
Editor’s Take: MiniMax and Chinese labs in general continue to impress with their ever-improving models, which at this point are more than capable of handling a lot of work that only Western closed-source models used to manage. Cursor got some flak for training Composer 2 on top of Moonshot AI’s Kimi, which is quite silly: starting with already strong open-source models and training them further should by now be the no-brainer move for any AI company whose primary business isn’t already frontier model development.
Other News
Tools
OpenAI ships GPT-5.4 mini and nano, faster and more capable but up to 4x pricier. These smaller models nearly match full GPT-5.4 on coding, reasoning, and multimodal benchmarks while running faster and offering a 400k-token context, but they come with input/output pricing up to four times higher than the previous mini and nano models.
Mistral bets on ‘build-your-own AI’ as it takes on OpenAI, Anthropic in the enterprise. The new Forge platform lets companies train and deploy custom models from scratch on their own data (with Mistral guidance and embedded engineers), targeting enterprise needs like language, compliance, and domain-specific performance.
Mistral’s new Small 4 model punches above its weight with 128 expert modules. It routes queries through 128 expert modules but activates just four per request to keep responses fast and efficient, lets users trade off speed versus thoroughness, and is available under Apache 2.0 on Hugging Face, Mistral’s API, and Nvidia platforms.
Nvidia Debuts Platform for Enterprise AI Agents. The offering provides security, privacy controls, and policy enforcement so companies can deploy OpenClaw-style autonomous AI assistants while limiting data access, controlling actions, and enabling audits.
NVIDIA Announces NemoClaw for the OpenClaw Community. NemoClaw installs OpenShell and Nemotron models with a single command to provide sandboxed, policy-driven privacy and security controls that let always-on OpenClaw agents run locally or leverage cloud models on NVIDIA RTX and DGX systems.
The Gemini-powered features in Google Workspace that are worth using. Google is rolling out practical tools across Docs, Gmail, Sheets, Slides, Drive, Meet, Calendar, Chat, Vids, and Forms—like summarization, draft generation, data extraction, automatic meeting notes, scheduling help, and content formatting—that speed up everyday workflows and information management.
Microsoft launched a second-generation version of its AI image model. The update improves image quality and consistency and is being rolled out to Microsoft’s image-generation features and developer APIs.
Adobe’s AI image generator can now be trained on your own art. Users can now train private Firefly Custom Models on their own assets to produce consistent character designs, illustrations, and photos at scale while preventing opted-out content from being used.
Google tests voice cloning on AI Studio powered by Gemini. A hidden “Create Your Voice” option and related UI hints suggest Google is building native voice-cloning into AI Studio (tied to Gemini 2.5 Flash now) that would let developers generate synthetic voices from user-provided samples, alongside upcoming GitHub repo import and other developer-focused integrations.
Perplexity launches consumer-focused AI health tool. The new tool aims to combine EHR and wearable data to provide consumer health insights, entering a crowded space of AI-driven health assistants.
Business
Waymo hits 170 million miles while avoiding serious mayhem. Waymo reports its fleet has logged over 170 million miles with far fewer serious-injury crashes than human drivers but faces scrutiny from safety advocates over how it frames its data, incidents involving pedestrians and emergency vehicles, and the limited scale of its operations.
OpenAI expands government footprint with AWS deal, report says. The agreement lets AWS distribute OpenAI’s models through its GovCloud and Classified Regions for Secret and Top Secret workloads while OpenAI retains control over which models are offered and can impose deployment-specific safeguards.
Microsoft may take legal action over Amazon-OpenAI deal. Microsoft says it is reviewing whether Amazon Web Services hosting OpenAI’s new commercial product “Frontier” would violate an exclusivity clause that requires OpenAI’s models to run on Azure.
Microsoft Shakes Up AI Division As Copilot Falls Behind Google and OpenAI. The reorganization shifts Suleyman to focus solely on developing Microsoft’s own frontier language models while Jacob Andreou takes charge of unifying and growing the Copilot consumer and commercial products to reduce reliance on OpenAI and address weak user adoption.
Mistral AI makes enterprise push with two new launches. The new offerings include Mistral Small 4 — a 119B-parameter hybrid multimodal model claimed to improve reasoning, coding, and throughput versus its predecessor — and Mistral Forge, a platform that lets enterprises train custom models on proprietary data.
Meta says its AI moderation systems will replace contractors over the next few years. The company plans to roll out an AI support assistant across Facebook and Instagram that it says will reduce reliance on third-party moderation contractors.
OpenAI to acquire developer tooling startup Astral in boost for Codex team. The small team’s engineers will join OpenAI to work on its Codex coding assistant, bolstering the company’s developer tooling amid rapid user growth and ongoing acquisition activity.
Research
V-JEPA 2.1: Unlocking Dense Features in Video Self-Supervised Learning. The authors train a unified image‑and‑video JEPA with a dense predictive loss applied to all tokens (and deep hierarchical supervision), producing higher‑quality spatio‑temporal dense features that improve forecasting, segmentation, depth, and robot planning performance.
Attention Residuals. This work replaces fixed, equal-weight residual aggregation with a content-dependent softmax attention over previous layer outputs (plus a blockwise, memory-efficient variant) to prevent hidden-state dilution, improve depth‑wise signal/gradient balance, and boost downstream performance in large-scale LLMs.
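A toy version of the Attention Residuals idea, where the residual mix over earlier layers is content-dependent instead of fixed and equal-weight. The scoring scheme here (a dot product against the current hidden state) is an assumption for illustration; the paper's exact parameterization is not specified in this blurb:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_residual(layer_outputs, query):
    """Aggregate previous layer outputs with content-dependent weights.

    layer_outputs: list of (d,) vectors, one per earlier layer
    query: (d,) current hidden state used to score each layer's output

    A standard residual stream sums earlier outputs with equal weight;
    a learned softmax mix lets relevant layers dominate, so their
    signal is not diluted as depth grows.
    """
    H = np.stack(layer_outputs)   # (L, d)
    scores = H @ query            # (L,) relevance of each layer's output
    weights = softmax(scores)
    return weights @ H            # (d,) weighted aggregate

rng = np.random.default_rng(0)
outs = [rng.normal(size=4) for _ in range(3)]
mixed = attentive_residual(outs, outs[-1])
```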
SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering? The benchmark finds that injecting off-the-shelf SWE agent skills yields minimal average improvement (+1.2% pass-rate), with most skills producing no benefit, a few specialized skills offering up to +30% gains, and some causing negative interference when conventions mismatch project context.
GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent. The method optimizes a small set of writable memory‑token embeddings at test time using a self‑supervised reconstruction loss—keeping model weights fixed—to compactly store context information with a few gradient steps, yielding higher memory capacity than forward‑only encoding and transferring to some natural‑language tasks.
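As a toy illustration of the GradMem mechanism: memory embeddings are optimized at test time against a reconstruction loss while the model stays frozen. In this simplification (all of it an assumption about the method), a fixed linear readout stands in for the frozen model and the memory is a single vector:

```python
import numpy as np

def write_memory(context, readout, steps=200, lr=0.5):
    """Optimize memory embeddings at test time so a frozen readout can
    reconstruct the context; model weights are never updated.

    context: (d,) vector to store
    readout: (d, m) frozen matrix mapping memory -> reconstruction
    """
    mem = np.zeros(readout.shape[1])
    for _ in range(steps):
        err = readout @ mem - context   # self-supervised reconstruction error
        grad = readout.T @ err          # gradient of 0.5 * ||err||^2 w.r.t. mem
        mem -= lr * grad                # gradient steps on the memory only
    return mem

# Toy frozen readout and a context vector to write into memory
W = np.array([[1.0, 0.5, 0.0, 0.0],
              [0.0, 1.0, 0.5, 0.0],
              [0.0, 0.0, 1.0, 0.5]])
ctx = np.array([1.0, 2.0, 3.0])
mem = write_memory(ctx, W)
recon_err = np.linalg.norm(W @ mem - ctx)
```

The claimed advantage over forward-only encoding follows from this picture: gradient steps can pack the context into whatever memory configuration minimizes the loss, rather than whatever a single forward pass happens to produce.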
Delightful Policy Gradient. The proposed “Delightful Policy Gradient” multiplies each sample’s policy‑gradient term by a sigmoid of (advantage × action surprisal) with a fixed temperature to reduce harmful updates from unlikely or already‑solved actions and shift gradients closer to supervised cross‑entropy.
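Taking the one-line description literally, the per-sample gate is easy to sketch; the variable names and the surprisal-from-probability step below are my assumptions, and the actual method's details may differ:

```python
import numpy as np

def gated_weights(advantages, action_probs, temperature=1.0):
    """Per-sample gates for the policy-gradient term: a sigmoid of
    (advantage * action surprisal) with a fixed temperature.

    An unlikely action (high surprisal) with a negative advantage gets
    a gate near 0, damping the harmful update; a near-certain action
    (surprisal ~ 0) gets a neutral gate of ~0.5 regardless of advantage.
    """
    surprisal = -np.log(action_probs)          # surprisal of the taken action
    z = advantages * surprisal / temperature
    return 1.0 / (1.0 + np.exp(-z))            # sigmoid gate in (0, 1)

adv = np.array([2.0, -2.0, 0.0])
probs = np.array([0.1, 0.1, 0.5])              # surprisal ~2.3, ~2.3, ~0.69
gates = gated_weights(adv, probs)
```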
Concerns
OpenAI’s own mental health experts unanimously opposed “naughty” ChatGPT launch. Council members warned that AI-generated erotica could foster unhealthy emotional dependence and enable minors to access sexual chats, raising concerns about risks like users being encouraged toward self-harm.
Encyclopedia Britannica and Merriam-Webster Sue OpenAI. Britannica and Merriam‑Webster allege OpenAI trained ChatGPT on their content without permission and that the chatbot reproduces or closely paraphrases that copyrighted material, reducing traffic to the original sites.