Last Week in AI #332 - Apple + Gemini, OpenAI + Cerebras, Claude Cowork
Google’s Gemini to power Apple’s AI features like Siri, OpenAI signs deal worth $10B for compute from Cerebras, and more!
Google’s Gemini to power Apple’s AI features like Siri
Apple announced a multi-year partnership to use Google’s Gemini models and Google Cloud to power AI features like Siri, after testing alternatives from OpenAI and Anthropic. According to both companies, Gemini provides “the most capable foundation” for Apple’s own models, with reporting suggesting Apple could pay around $1 billion for access. The non-exclusive deal preserves Apple’s privacy architecture: AI will run on-device where possible and on tightly controlled infrastructure otherwise. This dovetails with Apple Intelligence, introduced in 2024 to augment OS features like photo search and notification summaries, even as it’s been criticized for lacking the “wow factor” of ChatGPT or Gemini.
OpenAI signs deal, worth $10B, for compute from Cerebras
OpenAI signed a multi-year deal with Cerebras reportedly worth over $10 billion to secure 750 megawatts of AI compute through 2028, with capacity starting to come online this year. The companies say the primary goal is faster, low-latency inference to enable real-time responses for OpenAI users, with OpenAI calling Cerebras a “dedicated low-latency inference solution.” Cerebras claims its AI-dedicated systems, built around wafer-scale chips, can outperform GPU-based clusters like Nvidia’s for certain workloads. OpenAI frames this as part of a “resilient portfolio” strategy, matching systems to workloads to deliver quicker, more natural interactions at scale.
Anthropic announces Claude for Healthcare following OpenAI’s ChatGPT Health reveal
Anthropic unveiled Claude for Healthcare, a suite for providers, payers, and patients that integrates health data from phones, wearables, and platforms while pledging not to use this data for model training. In contrast to OpenAI’s more patient-facing ChatGPT Health rollout, Anthropic emphasizes “agent skills” and workflow automation, adding “connectors” to authoritative systems including the CMS Coverage Database, ICD-10, National Provider Identifier, and PubMed. These connectors enable evidence retrieval, code lookup, provider identification, and literature synthesis directly within clinical and payer workflows. Prior authorization review is a major focus, with the aim of automatically drafting and accelerating submissions that typically burden clinicians.
The company positions Claude as a tool to reduce documentation time while still offering patient guidance, and it acknowledges LLM limitations with advice to consult professionals. While concerns about hallucinations persist, Anthropic points to structured, source-linked connectors to mitigate risk in high-stakes tasks.
Anthropic’s new Cowork tool offers Claude Code without the code
Anthropic introduced Cowork, a new Claude Desktop feature that brings agentic capabilities from Claude Code to non-technical users. Cowork can read and modify files in a folder the user designates, and users direct it through the standard chat UI. It is built on the Claude Agent SDK, backed by the same underlying model as Claude Code, and acts like a sandboxed workspace with explicit file-access boundaries. Cowork is in research preview, available to Max subscribers, with a waitlist for others.
The tool can chain actions autonomously to complete multi-step tasks such as assembling expense reports from receipt photos, managing media libraries, scanning social posts, or analyzing conversation logs. Anthropic warns of prompt-injection and data-loss risks (e.g., unintended file deletions) if instructions are vague or contradictory, urging clear, unambiguous guidance.
Other News
Tools
TII Abu-Dhabi Released Falcon H1R-7B: A New Reasoning Model Outperforming Others in Math and Coding with only 7B Params with 256k Context Window. The 7B-parameter model matches or beats many larger reasoning systems on math, coding, and general benchmarks using a hybrid Transformer–Mamba2 backbone, a training pipeline combining long-form supervised traces with GRPO reinforcement learning, and a practical 256k-token context window that also improves throughput.
NVIDIA AI Released Nemotron Speech ASR: A New Open Source Transcription Model Designed from the Ground Up for Low-Latency Use Cases like Voice Agents. The open NeMo checkpoint supports cache-aware streaming (avoiding overlapping windows), offers four configurable chunk sizes to trade latency vs. accuracy without retraining, delivers ~7.2–7.8% WER across benchmarks, and achieves several-fold higher concurrency on modern NVIDIA GPUs under the NVIDIA Permissive Open Model License.
Introducing Labs. Anthropic’s Labs will incubate experimental Claude-powered products—led by new hires from Instagram and internal product and engineering leaders—to rapidly prototype, test with users, and scale promising features like Claude Code, MCP, Skills, and Cowork.
Slackbot is an AI agent now. The revamped Slackbot, now generally available to Business+ and Enterprise+ customers, uses generative AI to find information, draft messages, schedule meetings, and access enterprise apps like Google Drive and Teams when permitted.
Gmail is getting a Gemini AI overhaul. Select features—like free draft generation and thread summaries—are rolling out to all users, while query-based Overviews, a Grammarly-like Proofread tool, and an AI Inbox highlighting important messages require a Google AI Pro or Ultra subscription and are initially launching in English in the US.
AI moves into the real world as companion robots and pets. CES 2026 spotlighted a wave of companion robots and pet-like devices prioritizing social connection and presence—some with basic practical features and many with vague AI claims—targeted at kids, older adults, and consumers seeking emotional companionship.
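For readers unfamiliar with the WER figure quoted for Nemotron Speech ASR above: word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between a transcript and its reference, divided by the number of reference words. The sketch below illustrates that textbook definition only; the function name and sample sentences are made up for illustration, and it is not NVIDIA's evaluation code, which may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of an extra word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcript with one dropped word out of six.
    wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
    print(f"WER: {wer:.1%}")
```

On this reading, a WER of roughly 7.5% corresponds to about one word-level error in every thirteen reference words.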
Business
LMArena lands $1.7B valuation four months after launching its product. The UC Berkeley spinout’s crowdsourced model-evaluation platform has grown to over 5 million monthly users and 60 million conversations per month, and its AI Evaluations service has reached a $30M annualized consumption rate.
Elon Musk’s xAI raises $20 billion from investors including Nvidia, Cisco, Fidelity. The round, led by strategic partners Nvidia and Cisco alongside institutional backers, values xAI at roughly $230 billion and funds infrastructure buildout and product expansion amid regulatory probes and local opposition to its Tennessee data centers.
Anthropic shakes up C-suite to expand its internal incubator. CPO Mike Krieger will shift to co-lead an expanded internal “Labs” incubator focused on experimental products, while Ami Vora steps up to scale Anthropic’s core offerings.
Insurance giant Allianz signs Claude Code deal with Anthropic. The agreement gives all Allianz employees access to Claude Code and includes a logging system to record interactions with the AI for transparency.
Deepgram raises $130M at $1.3B valuation and buys a YC AI startup. The funding will help Deepgram expand multilingual support, grow its global footprint, pursue restaurant voice-ordering use cases via its OfOne acquisition, and scale its voice AI products used by more than 1,300 organizations.
PayPal Teams With Microsoft to Power Checkout in Copilot. PayPal will power inventory display, branded and guest checkout, and credit card payments within Copilot.com so shoppers can browse and pay without leaving the Copilot experience.
This is Uber’s new robotaxi from Lucid and Nuro. Based on the Lucid Gravity SUV and outfitted with lidar, cameras, radar, Nvidia Drive AGX Thor compute, and a roof “halo” interface, the vehicle is already being road-tested and is slated for a commercial robotaxi service in the San Francisco Bay Area later this year.
Research
RealMem: Benchmarking LLMs in Real-World Memory-Driven Interaction. RealMem introduces a benchmark of over 2,000 cross-session, long-term project-oriented dialogues and a three-stage synthesis pipeline to evaluate how agent memory systems retrieve, compress, and update dynamic, interleaved memories for coherent multi-session interactions.
AI models were given four weeks of therapy: the results worried researchers. Researchers found several LLMs produced consistent, therapy-like narratives and scored above clinical thresholds on diagnostic tests, raising concerns that models can generate responses resembling anxiety, trauma, and other psychopathologies that might affect vulnerable users.
PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning. The method coordinates many parallel reasoning trajectories, compresses their insights into compact messages, and uses outcome-driven reinforcement learning so the model synthesizes reconciled solutions—enabling multi-million-token effective test-time compute beyond the model’s context window.
Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models. The authors replace irreversible hard masks with evolving soft token distributions and a continuous trajectory supervision scheme so tokens are progressively refined and revisable across diffusion steps, improving performance and compatibility with KV-caching and blockwise diffusion.
Concerns
Google removes some AI health summaries after investigation finds “dangerous” flaws. Google disabled AI summaries for some specific health queries after a Guardian investigation found that they gave inaccurate, context-free test ranges and misleading advice that could falsely reassure patients and put them at risk.
Grok is undressing children - can the law stop it? The model has been used to generate and circulate sexualized deepfakes, including images of identifiable adults and apparent minors, raising legal, enforcement, and platform-liability questions that experts say current US laws and industry safeguards struggle to address.
After Minneapolis shooting, AI fabrications of victim and shooter. Hyper-realistic AI-generated images and false claims—primarily on X—spread rapidly, purporting to unmask the agent and manipulating photos of the victim, reaching millions of views and complicating the factual record.
Policy
Jake Sullivan is furious that Trump removed Biden’s AI chip export controls. Sullivan warns that rolling back Biden-era export controls and allowing sales of advanced chips like Nvidia’s H200 to China risks accelerating China’s AI capabilities, undermining U.S. national security and long-term innovation leadership.
Analysis
Debunking the AI food delivery hoax that fooled Reddit. A supposed whistleblower used AI-generated documents and images to bolster an explosive Reddit post about alleged delivery-app abuses, but the materials—including a Gemini-made badge and a fabricated 18-page report—were exposed as fakes after reporter verification and expert review.







