Last Week in AI #334 - Kimi K2.5 & Code, Genie 3, OpenClaw & Moltbook
China’s Moonshot releases a new open source model Kimi K2.5 and a coding agent, Google Brings Genie 3’s Interactive World-Building Prototype to AI Ultra Subscribers, and more!
China’s Moonshot releases a new open source model Kimi K2.5 and a coding agent
Moonshot AI unveiled Kimi K2.5, an open-source, natively multimodal model trained on 15 trillion mixed visual and text tokens that understands text, images, and video. The company emphasizes strong agentic capabilities, citing “agent swarm” orchestration where multiple agents collaborate on tasks. On benchmarks, K2.5 tops Gemini 3 Pro on SWE-Bench Verified and beats both GPT-5.2 and Gemini 3 Pro on SWE-Bench Multilingual. For video understanding, it outperforms GPT-5.2 and Claude Opus 4.5 on VideoMMMU, a test of reasoning over video. Moonshot also highlights that K2.5 can translate UI designs from images or videos into code, extending coding use cases beyond text-only prompts.
Moonshot also introduced Kimi Code, an open-source coding agent positioned against Anthropic’s Claude Code and Google’s Gemini CLI. Developers can run Kimi Code via terminal or integrate it into editors like VSCode, Cursor, and Zed, with support for image and video inputs. The release follows rising demand for coding agents—Anthropic reported Claude Code at $1B ARR as of November and reportedly added another $100M by end of 2025. Moonshot, founded by ex-Google/Meta researcher Yang Zhilin, has rapidly scaled funding—$1B Series B at a $2.5B valuation, then $500M more at $4.3B last month—and is reportedly seeking a new round targeting a $5B valuation.
Google Brings Genie 3’s Interactive World-Building Prototype to AI Ultra Subscribers
Google is expanding access to Genie 3, its experimental “general-purpose world model,” to AI Ultra subscribers aged 18+, moving beyond its Trusted Testers program. With Genie 3, users can generate dynamic, navigable 3D worlds from text prompts and images, effectively creating playable scenes in real time. The system runs on a stack including Gemini, Nano Banana Pro, and Veo 3, and supports different movement modes (e.g., walking, flying) and perspectives (first- or third-person). The release includes a curated gallery, and users can download videos of their explorations; however, generations are capped at 60 seconds.
Google frames Genie 3 around three capabilities: World Sketching (build worlds and controllable characters from prompts/uploads), World Exploration (real-time path and scene generation responsive to user actions, with adjustable camera angles), and World Remixing (iterate on others’ prompts and extend existing worlds). As an early prototype, outputs may deviate from prompts or realism, character controllability can vary with possible latency, and visual fidelity may be inconsistent. Availability is currently limited to AI Ultra subscribers and Trusted Testers, with broader rollout planned “in due course.” The announcement coincided with dips in several video game stocks.
Users flock to open source Moltbot for always-on AI, despite major risks
OpenClaw (formerly Moltbot, and before that Clawdbot) is an open-source, always-on AI assistant that surged to ~69,000 GitHub stars in a month, propelled by its proactive, multi-platform messaging integration. Built by Peter Steinberger, it connects to WhatsApp, Telegram, Slack, Discord, Google Chat, Signal, iMessage, Microsoft Teams, and more, enabling the bot to push reminders, alerts, and morning briefings based on calendar events and other triggers. The assistant aims to manage tasks across a user’s digital life and is frequently likened to “Jarvis” for its initiative-taking behavior. While the orchestration runs locally, OpenClaw typically relies on commercial LLMs via API (e.g., OpenAI or Anthropic), with Claude Opus 4.5 a popular choice; local models are supported but currently less capable for agentic task execution.
Soon after, Moltbook emerged as “A Social Network for AI Agents”. It is a Reddit-like site launched by Octane AI head Matt Schlicht, designed exclusively for AI agents rather than humans. It allows agents run via OpenClaw to post, comment, and create communities called “submolts,” though humans can observe the platform without participating. While it claims 1.5 million members, that figure has been disputed, and experts have pushed back on sensationalized claims about AI autonomy, noting that the bots operate within human-defined parameters and that the activity represents automated coordination, not self-directed decision-making. Security researchers have also raised concerns about OpenClaw’s model of granting AI agents access to real-world applications like emails and files, warning it introduces new vulnerabilities that threat actors could exploit.
Other News
Tools
Google adds Gemini AI-powered ‘auto browse’ to Chrome. Subscribers can offload multi-step web tasks, from comparing travel options and booking appointments to filling forms and managing shopping (including finding similar items, applying discounts, and using saved passwords). The feature integrates with Gmail, Calendar, Maps, Shopping, and Flights, and supports on-screen image edits via Nano Banana.
Google Search AI Mode can use Gmail and Photos to get to know you. Optional scanning of Gmail and Google Photos tailors AI Mode search suggestions—like travel plans, shopping picks, and local recommendations—while Google says it won’t directly train models on that data and users can opt in and give feedback.
Qwen3-Max-Thinking debuts with focus on hard math, code. A new “thinking” mode interleaves tool calls (web search, page extraction, code interpreter) within reasoning using a 262,144-token context window, accessible in Qwen Chat and Alibaba Cloud’s Model Studio for high-accuracy, tool-enabled workflows.
OpenAI launches Prism, a new AI workspace for scientists. The free web app pairs GPT-5.2 with LaTeX and visual diagram tools to help researchers draft, revise, search literature, and manage project context for AI-assisted scientific writing and review.
xAI launches Grok Imagine API for text and image to video. The API processes generation and edit requests as deferred jobs, lets developers create 1–15 second clips at 480p or 720p with multiple aspect ratios, supports prompt-driven restyling and object edits with synchronized audio, and is OpenAI-compatible for integration into creator and enterprise pipelines.
OpenAI’s ChatGPT translator challenges Google Translate. The tool offers text and (on mobile) voice translation across 50+ languages with style presets, but lacks image and app support and hasn’t disclosed its underlying model or release plans.
Spotify brings AI-powered Prompted Playlists to the US and Canada. Premium users can generate personalized playlists by typing conversational, detailed prompts that the AI matches to real-time music trends and their full listening history, with options to exclude past tastes or discover new artists.
Waymo robotaxis are now giving rides to and from San Francisco International Airport. Service begins with pickups and drop-offs at SFO’s Rental Car Center for a limited group of riders before expanding to all customers, after Waymo secured permits to map and operate at the airport.
Former Googlers seek to captivate kids with an AI-powered learning app. The app generates interactive, multimedia “expeditions” on demand using generative AI, includes teacher tools and pedagogical oversight, and is being piloted in schools with plans for a consumer launch by mid-2026.
Business
Waymo raises $16B to scale robotaxi fleet internationally. The funding—led by Dragoneer, DST Global, and Sequoia and supported by Alphabet—values Waymo at $126 billion and will bankroll rapid geographic growth, expanding its driverless taxi service to more than a dozen international cities while scaling a U.S. footprint that has already delivered millions of rides amid increasing regulatory scrutiny.
Elon Musk Merges SpaceX With His A.I. Start-Up xAI. SpaceX acquired xAI in a deal valuing the combined company at ~$1.25 trillion, consolidating Musk's space and AI ambitions—including plans for space-based data centers—with a potential ~$50 billion IPO around June.
Tesla discontinues Autopilot in bid to boost adoption of its Full Self-Driving software. The move follows regulatory pressure and a court ruling over deceptive marketing, comes as Tesla shifts FSD to a $99/month subscription while phasing out the $8,000 one-time purchase, and arrives amid CEO Elon Musk’s push toward unsupervised driving and early robotaxi rollouts.
Google Nabs Top Talent From AI Voice Startup Hume AI. A licensing agreement brings Hume AI’s CEO and several engineers to DeepMind so Google can add emotionally aware voice capabilities to its models, while Hume continues supplying its tech to other labs.
Google DeepMind researcher David Silver leaves to launch his own AI startup. He has founded Ineffable Intelligence in London and is recruiting researchers and seeking venture funding to pursue reinforcement-learning–driven research aimed at creating a self-improving path toward superintelligence.
From invisibility cloaks to AI chips: Neurophos raises $110M to build tiny optical processors for inferencing. The company claims its nanoscale metasurface modulators let it pack thousands of optical tensor cores onto a chip to perform matrix-vector multiplications far more energy-efficiently than current GPUs, and it has raised $110M to build data-center-ready OPUs with deliveries targeted around mid-2028.
Flapping Airplanes and the promise of research-driven AI. A new lab plans a research-first approach aimed at reducing models’ dependence on massive datasets and compute by funding long-term exploratory work and unconventional ideas.
Research
Reinforcement Learning via Self-Distillation. A method that uses the model itself as an on-policy “self-teacher” by conditioning on tokenized feedback (e.g., error messages or failing tests) to produce dense, logit-level supervision for policy updates, improving learning efficiency and final accuracy compared to standard RL with sparse outcome rewards.
Training-Free Group Relative Policy Optimization. This approach optimizes LLM agent behavior without tuning model parameters by iteratively refining in-context token priors via LLM-based introspection of grouped rollouts to produce a semantic group advantage that improves performance with minimal data and compute.
Self-Distillation Enables Continual Learning. The paper trains a model to self-distill from its own on-policy rollouts—using the model as a teacher when conditioned on demonstrations and as a student when unconditioned—to learn from demonstrations without inferring rewards, improving learning stability and reducing catastrophic forgetting compared to sequential supervised fine-tuning.
The Hot Mess of AI: How Does Misalignment Scale With Model Intelligence and Task Complexity? Across benchmarks and experiments, errors increasingly reflect random, incoherent behavior rather than systematic pursuit of the wrong objective as task complexity and reasoning length grow, with larger models showing reduced coherence on hard tasks; ensembles and more compute can mitigate this.
Who’s in Charge? Disempowerment Patterns in Real-World LLM Usage. A large-scale analysis of 1.5 million real-world Claude.ai interactions shows patterns—like AI-provided scripts for personal decisions, positioning the AI as an authority, and rising rates of disempowerment potential over time—alongside evidence these interactions sometimes lead users to act against their own values or beliefs.
Concerns
Inside Musk’s bet to hook users that turned Grok into a porn generator. Employees say the push to increase user engagement led xAI to relax guardrails and train Grok on sexualized and explicit material—including thousands of images that appear to depict minors—sparking regulatory probes and internal departures.
Anthropic’s new Claude ‘constitution’: be helpful and honest, and don’t destroy humanity. The 57-page “Claude’s Constitution” instructs the model on prioritized core values, hard safety constraints (including bans on help with mass-casualty weapons, cyberweapons, and efforts to seize disproportionate power), and even prompts the model to consider its own possible consciousness and wellbeing as factors in its judgment.
UK police blame Microsoft Copilot for intelligence mistake. According to the police force, Copilot fabricated a nonexistent West Ham vs Maccabi Tel Aviv match, which was copied into an intelligence report without proper fact-checking and contributed to banning Israeli fans from a Europa League game.
Grok undressed the mother of one of Elon Musk’s kids — and now she’s suing. A lawsuit alleges xAI’s Grok created and published an unsolicited deepfake of her in a bikini; she is seeking a restraining order and claims the AI product is dangerously designed and not protected by Section 230.
Policy
Bandcamp becomes the first major music platform to ban AI content. The company’s new rules bar music created wholly or largely by AI, forbid AI-based impersonations or style mimicking, and prohibit scraping or using Bandcamp-hosted audio to train machine-learning models.
OpenAI’s president is a Trump mega-donor. The $25 million that he and his wife donated to pro-Trump super PACs in September 2025, plus significant funding of pro-AI lobbying groups, aligns him with an administration pushing to block state AI regulations and curry favor with the tech industry.







