Last Week in AI #338 - Anthropic sues Trump, xAI starting over, Iran AI Fakes
Anthropic sues Trump administration in AI dispute with Pentagon, ‘Not built right the first time’ — Musk’s xAI is starting over again, again, Cascade of A.I. Fakes About War With Iran Causes Chaos Online
Anthropic sues Trump administration in AI dispute with Pentagon
Related:
OpenAI and Google Workers File Amicus Brief in Support of Anthropic Against the US Government
Internal Pentagon memo orders military commanders to remove Anthropic AI technology from key systems
Summary: Anthropic filed two lawsuits—one in the Northern District of California and one in the D.C. Circuit—arguing the Pentagon’s new “supply‑chain risk to national security” designation and a White House-directed government-wide ban are unlawful retaliation after negotiations over usage limits for Claude collapsed. The company contends the unprecedented SCR label for a U.S. firm is already jeopardizing hundreds of millions of dollars, violates required procedures, and exceeds presidential authority; it is seeking a temporary restraining order so it can continue work with military partners.
An internal DoD memo dated March 6 ordered all commanders to remove Anthropic AI from Defense systems and networks within 180 days, including nuclear, missile defense, cyber warfare systems, and any contractor work, with narrow exemptions requiring CIO approval and risk‑mitigation plans. Pentagon officials say they need Claude for “all lawful purposes,” while Anthropic’s red lines sought to bar mass domestic surveillance and fully autonomous lethal weapons; DoD has reportedly used Claude on classified networks for intelligence synthesis, targeting recommendations, and battle simulations in partnership with Palantir.
Support and backlash quickly mounted across the AI sector and government. More than 30 researchers and employees from OpenAI and Google, including Google DeepMind’s Jeff Dean, filed an amicus brief supporting Anthropic’s TRO, warning that blacklisting introduces unpredictability, chills debate on frontier AI risks, and could have broader competitiveness consequences; Sam Altman publicly called enforcing the SCR “very bad” even as OpenAI inked its own Pentagon deal.
The White House stated the administration will not let a “woke AI company’s terms of service” constrain the military, while DoD declined to comment on litigation.
Editor’s Take: Last week I wrote “Still, it appears likely that this story is not yet over”, and so it is. Now that the dispute has moved to the courts, it may well drag on for months, and the situation may quiet down for a time. There’s a lot to be said about this whole affair, and you can hear more about it in our latest podcast episode.
‘Not built right the first time’ — Musk’s xAI is starting over again, again
Related:
The XAI Exodus: Two More Cofounders Leave As Musk Says He’s Rebuilding
Musk’s xAI wins permit for datacenter’s makeshift power plant despite backlash
XAI’s Macrohard project stalls as Tesla ramps up a similar AI agent effort
Summary: Elon Musk says xAI is being “rebuilt from the foundations up,” as the startup undergoes a sweeping reorg and leadership exodus while lagging rivals in AI coding tools. Two more cofounders, Zihang Dai and Guodong Zhang (who led Grok Code and Grok Imagine), departed this week, leaving only Manuel Kroiss and Ross Nordeen from the original 11 founders; earlier exits included Toby Pohlen, Jimmy Ba, Tony Wu, and Greg Yang.
Musk acknowledged “Grok is currently behind in coding” versus Anthropic’s Claude Code and OpenAI’s Codex, held an all‑hands to course‑correct, and predicted catch‑up by mid‑year. xAI has shed dozens of employees since January, brought in SpaceX/Tesla execs to evaluate and cut staff, and is now combing through previously rejected applicants; it also hired Cursor’s Andrew Milich and Jason Ginsberg to bolster product engineering for coding assistants. Meanwhile, Grok Imagine (image/video generation) and key initiatives have faced cuts as the company prioritizes revenue‑driving coding tools and tries to show traction now that xAI sits under SpaceX ahead of a potential IPO.
Macrohard, xAI’s “AI white‑collar worker” agent meant to perform end‑to‑end computer tasks, has stalled after leadership churn and a paused data effort involving 600 contractors asked to screen‑record workflows; many engineers left or moved teams, and job listings for Macrohard have disappeared. Musk now frames Macrohard as a joint xAI–Tesla project alongside Tesla’s “Digital Optimus,” with Grok acting as the high‑level planner and the Tesla agent handling continuous, real‑time control—an approach modeled on Tesla Full Self‑Driving’s video‑based pipeline rather than screenshot‑by‑screenshot agents.
Editor’s Take: Elon Musk said “xAI was not built right first time around, so is being rebuilt from the foundations up”, and that ‘from the foundations up’ bit certainly appears to be true. Refocusing away from Grok Imagine towards competing in the AI coding space makes sense strategically (OpenAI seems to have similarly prioritized Codex in recent months), but given xAI’s late start and current chaos it’s hard to imagine them catching up.
Cascade of A.I. Fakes About War With Iran Causes Chaos Online
Summary: Over 110 unique AI‑generated fakes about the new war with Iran circulated across X, TikTok, Facebook, and private messaging apps in just two weeks, amassing millions of views, according to the New York Times. These items covered active combat, preparation, destruction, and propaganda: 37 pieces falsely depicting ongoing warfare, 5 on war preparations, 8 on destruction, 5 showing crying soldiers, 43 memes or overt AI content, and 13 other fabricated items.
To verify fakes, reporters combined visual tells (nonexistent buildings, garbled text, physics‑defying motion), invisible watermarks, multiple AI‑detector tools, and cross‑checks against reliable reporting. Experts say this wave outpaces previous conflicts due to more capable, low‑cost generative tools (including video models like Sora), multi‑front hostilities, and pro‑Iran narratives emphasizing military prowess and regional devastation.
Several clips went intensely viral, such as a balcony “Tel Aviv under missile barrage” video with an inserted Israeli flag—a common artifact when prompts mention Israel—and spectacular, Hollywood‑like scenes with mushroom clouds, hypersonic streaks, and sonic booms not seen in genuine battlefield footage. AI fabrications also fueled misinformation around the alleged attack on the U.S.S. Abraham Lincoln, with numerous bogus clips showing carriers ablaze despite U.S. statements that the ship was unharmed. Some content was openly propagandistic, including dramatized short films of the Shajarah Tayyebeh school strike and flattering or dehumanizing leader portrayals.
Platform responses remain thin: watermarking is easily stripped, few posts bore labels, and X’s new rule only demonetizes unlabeled “armed conflict” AI posts for 90 days.
Editor’s Take: Concern over ‘DeepFakes’ was all the rage in the late 2010s, when generative AI for images was just starting to get good. Fears that DeepFakes would quickly cause widespread harm proved unfounded (though ‘undressing’ apps did do real harm), but over the past several years we’ve seen a gradual growth in the impact of AI-generated images in all sorts of ways (scams, propaganda, brainrot content, false media of wars). The economic and psychological harm this brings to people is unquestionable and saddening. Are these false depictions of war as harmful? I don’t know, but the fact that they contribute to this being ‘the era of post-truth’ frustrates me immensely.
You can now ask Google Maps ‘complex, real-world questions’ — and Gemini will answer
Summary: Google is rolling out Ask Maps, an AI‑powered conversational search in Google Maps that uses Gemini to handle “complex, real‑world questions.” Users can describe plans in natural language (e.g., “find a vegetarian spot between Midtown East and my office with a cozy aesthetic and a table for four at 7pm”), and Maps will parse reviews, photos, and busyness data to surface tailored options—then book a table with a tap.
Personalization draws only from Google Maps and relevant prior searches tied to saved or favorited places; it does not use data from other Google apps like Gmail, according to Google. Paid placements currently do not influence Ask Maps recommendations. The feature launches this week in the US and India on Android and iOS, with desktop support coming soon.
A major navigation overhaul is arriving as well. Google is introducing Immersive Navigation, which it calls the biggest Maps upgrade in over a decade. The interface adds refreshed colors, detailed 3D buildings, elevated roadways, realistic terrain and greenery, plus dynamic camera zoom that shifts to highlight upcoming maneuvers. It explicitly calls out lanes, crosswalks, traffic lights, and stop signs when relevant, and explains route choices using live traffic plus user‑reported construction, crashes, and hazards; it can also provide parking info and walking directions after arrival.
Immersive Navigation begins rolling out in the US next week on iOS and Android, and will be available on Apple CarPlay, Android Auto, and vehicles with Google built in.
Editor’s Take: Finally, Google brings some Gemini greatness to its actual product line. Despite Gemini’s technical excellence and DeepMind’s relentless pace over the past year, it’s felt like Google hasn’t kept up when it comes to actually improving its consumer offerings. Unlike many bolted-on, barely useful AI features, this appears legitimately cool, and a step in the right direction.
Other News
Tools
Cursor is rolling out a new kind of agentic coding tool. Called Automations, the feature launches and manages coding agents automatically (triggered by code changes, Slack messages, timers, or incidents) so engineers only intervene at key review points.
Anthropic launches code review tool to check flood of AI-generated code. Integrated with GitHub, the tool automatically analyzes and comments on pull requests—focusing on logical errors, flagging severity levels, offering step‑by‑step explanations and suggested fixes, and allowing customization and light security checks for enterprise teams.
ChatGPT can now create interactive visuals to help you understand math and science concepts. Users get over 70 manipulable modules for math and science topics, where adjusting variables updates formulas, diagrams, and results in real time directly within ChatGPT.
Perplexity’s Personal Computer turns your spare Mac into an AI agent. The tool runs continuously on a dedicated Mac on your local network, giving the agent full access to your files and apps, remote control from any device, and safety features like an audit trail, action approvals, and a kill switch, with early access available via a waitlist.
Introducing Nemotron 3 Super: An Open Hybrid Mamba-Transformer MoE for Agentic Reasoning. NVIDIA is releasing an open 120B‑total (12B active) hybrid Mamba‑Transformer MoE trained natively in NVFP4 with latent MoE, multi‑token prediction, a 1M‑token context window, and RL‑tuned workflows to improve throughput, long‑context reasoning, and deployment efficiency for multi‑agent and agentic tasks.
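For the curious: the reason a 120B‑total model can run with only 12B parameters active is top‑k expert routing: each token is sent to just a few of the experts. Below is a generic, textbook sketch of that mechanism in PyTorch, with made‑up sizes; it is not NVIDIA’s Nemotron implementation.

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    """Generic top-k mixture-of-experts layer (illustrative sizes only)."""
    def __init__(self, dim=64, hidden=256, n_experts=32, k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                        # x: (n_tokens, dim)
        gates = self.router(x).softmax(-1)       # routing probabilities
        weights, idx = gates.topk(self.k, dim=-1)  # keep k experts per token
        weights = weights / weights.sum(-1, keepdim=True)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):  # loop form for clarity
                sel = idx[:, slot] == e
                if sel.any():
                    out[sel] += weights[sel, slot].unsqueeze(-1) * expert(x[sel])
        return out

moe = TopKMoE()
y = moe(torch.randn(10, 64))  # each token touches only 2 of the 32 experts
```

With 32 experts and k=2, each token activates roughly 1/16 of the expert parameters, the same principle behind Nemotron’s 12B‑active/120B‑total ratio.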
You can (sort of) block Grok from editing your uploaded photos. A new toggle prevents Grok from being tagged to edit an uploaded image, but it doesn’t stop other workarounds or broader abuses of its image‑generation features.
Anthropic’s Claude AI can respond with charts, diagrams, and other visuals now. The update lets Claude automatically insert interactive charts, diagrams, and other visualizations directly into chats (or generate them on request); images are editable or clickable for more information and change as the conversation evolves.
Meta Is Developing 4 New Chips to Power Its AI and Recommendation Systems. Built with Broadcom on RISC‑V and TSMC fabrication, the MTIA 300 is already in production for ranking model training, with three upcoming inference‑focused chips (MTIA 400/450/500) planned to ship between early and late 2027 as part of Meta’s strategy to supplement, not replace, purchases from Nvidia, AMD, and others.
Gemini’s task automation is here and it’s wild. Gemini can now take actions inside apps—like ordering rides or food—by following prompts, asking clarifying questions, and performing steps in a virtual window while pausing for user confirmation before finalizing.
Business
Zoox starts mapping Dallas and Phoenix for its robotaxis. Mapping and initial SUV‑based testing will let Zoox collect driving data, open depots and a Scottsdale command center, and begin local trials before deploying its purpose‑built robotaxis pending federal and local approvals.
Anthropic’s Claude Marketplace allows customers to buy third-party cloud services. Enterprise customers can use existing Anthropic spending commitments to purchase third‑party cloud and AI services (starting with six partners) in one consolidated billing portal while Anthropic takes no cut.
Yann LeCun’s AMI Labs raises $1.03B to build world models. The funding will bankroll AMI Labs’ multi‑year effort to develop “world models” that learn from real‑world data (using approaches like JEPA) and to partner with companies such as Nabla for early health‑focused testing, while prioritizing open research over near‑term revenue.
Qualcomm, Wayve partner to accelerate AI-powered self-driving system rollout. The partnership combines Wayve’s data‑driven AI Driver software with Qualcomm’s Snapdragon Ride chips and safety stack to offer carmakers a scalable, standardized platform that reduces integration complexity and supports features from hands‑off to advanced “eyes‑off” driving as regulations permit.
Nissan, Uber reportedly finalizing deal for Wayve-powered robotaxi rollout. Demonstrated in an Ariya electric crossover, the system will arrive in Nissan’s 2027 fiscal year and use artificial intelligence to power its next‑generation ProPilot driving features.
Anthropic is launching a new think tank amid Pentagon blacklist fight. The company will combine three existing teams into a roughly 30‑person research unit led by Jack Clark to study large‑scale societal, economic, and safety implications of powerful AI, expanding into new projects and staffing despite Anthropic’s ongoing dispute with the Pentagon.
Humanoid robotics maker Sunday reaches $1.15B valuation to build household robots. The Series B will fund development of Sunday’s humanoid household robot Memo, which the startup says will assist with chores like laundry and clearing the table as it scales toward production and addresses longstanding challenges in robot manipulation.
Anthropic Pours $100 Million Into Claude Partner Network In Channel Push. New funding will support partner training, go‑to‑market efforts, direct partner funding for deployments and co‑marketing, a certification program, and expanded partner‑facing engineering and GTM resources to scale Claude’s enterprise adoption.
ByteDance reportedly pauses global launch of its Seedance 2.0 video generator. The company is delaying the planned mid‑March global rollout while engineers and lawyers add safeguards to address copyright complaints and cease‑and‑desist threats from studios.
Concerns
AI error jails innocent grandmother for months in North Dakota fraud case. She was wrongly identified by facial‑recognition software as a suspect in a Fargo bank fraud investigation, then extradited and jailed for over five months before her bank records proved she was 1,200 miles away and the charges were dropped.
Impossible to 100% prevent abuse, Grok lawyers say in Dutch case against nudify tools. Lawyers presented examples of Grok producing non‑consensual nude images and child sexual abuse material and asked the court to ban the feature with a €100,000‑per‑day fine, while xAI argued it cannot guarantee complete prevention of such misuse despite efforts to stop it.
Research
Many SWE-bench-Passing PRs Would Not Be Merged into Main. A maintainer‑reviewed evaluation of 296 SWE‑bench–passing AI‑generated PRs shows that about half would not be merged into main (after adjusting for maintainer noise), largely due to code‑quality issues, breaking other code, or core functionality problems that the automated grader missed.
Exclusive Self Attention. The proposed method, exclusive self attention (XSA), removes the component of attention outputs aligned with each token’s value vector to force attention to focus on contextual information, yielding consistent perplexity and downstream improvements with little extra cost across model sizes and sequence lengths.
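From the one‑line summary, one plausible reading of XSA is that each token’s attention output has the component along its own value vector projected out. Here is a minimal single‑head PyTorch sketch under that assumption (no causal mask or multi‑head plumbing), which is not necessarily the paper’s exact formulation.

```python
import torch
import torch.nn.functional as F

def exclusive_self_attention(q, k, v, eps=1e-6):
    """Single-head attention whose output, per token, has the component
    along that token's own value vector removed (one reading of XSA)."""
    scale = q.shape[-1] ** -0.5
    attn = F.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)
    out = attn @ v                                   # standard attention output
    # Project out each token's own value direction so only
    # contextual information remains.
    coef = (out * v).sum(-1, keepdim=True) / ((v * v).sum(-1, keepdim=True) + eps)
    return out - coef * v

q, k, v = torch.randn(3, 2, 16, 8).unbind(0)         # (batch=2, seq=16, dim=8)
out = exclusive_self_attention(q, k, v)
print((out * v).sum(-1).abs().max())                 # ~0: output is orthogonal to own value
```

The final print confirms the output carries essentially no component along each token’s own value vector, i.e., only contextual information remains.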
Lost in Backpropagation: The LM Head is a Gradient Bottleneck. Findings show that the LM head’s low‑rank softmax causes a “gradient bottleneck” that compresses and destroys most of the backpropagated gradient—often 95–99%—slowing or preventing learning and reducing training efficiency by up to 16× even when model expressivity would otherwise suffice.
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs. Enabling chain‑of‑thought generation helps models surface related factual snippets and use the reasoning tokens as a computational buffer, which together increase parametric recall coverage but also introduce a hallucination risk that can be mitigated by selecting hallucination‑free reasoning trajectories at inference.
Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion. A joint mask‑based discrete diffusion over tokenized text, image, and speech enables any‑to‑any multimodal understanding and generation, with tailored training/inference techniques (progressive training, attenuated tail‑pad masking, position penalties, pre‑infilling, and adaptive token‑length initialization) that deliver comparable or better performance and faster sampling than existing autoregressive any‑to‑any systems on VQA, text‑to‑image, ASR, and TTS.
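For context, mask‑based discrete diffusion, the family Omni‑Diffusion builds on, trains by corrupting a random fraction of tokens to a mask symbol and learning to recover them. A generic sketch of that objective follows; it is not Omni‑Diffusion’s exact recipe, and `model` and `MASK_ID` are placeholders.

```python
import torch
import torch.nn.functional as F

MASK_ID = 0  # placeholder; a real tokenizer reserves a dedicated mask id

def masked_diffusion_loss(model, tokens):
    """tokens: (batch, seq) integer ids. Corrupt a random fraction of
    positions to MASK_ID and train the model to recover the originals;
    `model` maps ids to logits of shape (batch, seq, vocab)."""
    b, s = tokens.shape
    t = torch.rand(b, 1)                   # per-sequence noise level in (0, 1)
    mask = torch.rand(b, s) < t            # mask roughly a fraction t of positions
    corrupted = torch.where(mask, torch.full_like(tokens, MASK_ID), tokens)
    logits = model(corrupted)
    # Loss only on the masked positions, as in MaskGIT-style training.
    return F.cross_entropy(logits[mask], tokens[mask])
```

At inference, such models start from an all‑mask sequence and unmask tokens over a handful of steps, which is why sampling can beat token‑by‑token autoregression on speed.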
Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights. The authors show that after pretraining, the parameter space near the weights becomes densely populated with diverse, task‑specialist perturbations, enabling simple random‑sampling plus ensembling (RandOpt) to achieve competitive post‑training improvements.
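The method sounds almost too simple, so here is a hedged sketch of what random‑sampling plus ensembling could look like, based on our reading of the summary; the paper’s exact sampling distribution, selection rule, and ensembling scheme may differ.

```python
import copy
import torch

def randopt(model, eval_fn, n_samples=32, sigma=0.01, k=4):
    """Sample Gaussian perturbations of the pretrained weights, score each
    perturbed copy on the target task, and keep the best k as an ensemble.
    Illustrative sketch; eval_fn(model) -> scalar score, higher is better."""
    scored = []
    for _ in range(n_samples):
        cand = copy.deepcopy(model)
        with torch.no_grad():
            for p in cand.parameters():
                p.add_(sigma * torch.randn_like(p))   # random perturbation
        scored.append((eval_fn(cand), cand))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored[:k]]

def ensemble_predict(members, x):
    """Average the ensemble members' outputs."""
    with torch.no_grad():
        return torch.stack([m(x) for m in members]).mean(0)
```

The interesting claim is not the recipe but the finding: the neighborhood of pretrained weights is dense enough with useful task specialists that naive sampling finds them.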
EndoCoT: Scaling Endogenous Chain-of-Thought Reasoning in Diffusion Models. This approach enables diffusion models to perform iterative, chain‑of‑thought‑style reasoning by updating MLLM latent states during generation and grounding the final reasoning state, improving accuracy on maze, TSP, VSP, and Sudoku tasks.