Last Week in AI #302 - QwQ 32B, OpenAI injunction refused, Alexa Plus
Alibaba’s New QwQ 32B Model is as Good as DeepSeek-R1, Judge Denies Musk’s Request to Block OpenAI’s For-Profit Plan, Alexa Plus’ AI upgrades cost $19.99, and more!
Top News
Alibaba’s New QwQ 32B Model is as Good as DeepSeek-R1; Outperforms OpenAI’s o1-mini
Alibaba has announced QwQ 32B, a new AI model under its Qwen umbrella. It has 32 billion parameters and is said to perform comparably to DeepSeek-R1, a model with 671 billion parameters. Its performance is attributed to applying reinforcement learning (RL) to a foundation model trained on a large knowledge corpus, and to agentic capabilities that let it think critically based on external feedback. The model, which is available on Hugging Face and ModelScope, outperforms OpenAI’s o1-mini on several benchmarks covering code, mathematical reasoning, and general problem-solving. Alibaba also recently released Wan 2.1, an open-source video foundation model, and announced plans to invest over $52 billion in cloud computing and artificial intelligence over the next three years.
Judge Denies Musk’s Request to Block OpenAI’s For-Profit Plan
A federal judge in San Francisco has denied Elon Musk's request for a preliminary injunction to halt OpenAI's transition from a non-profit to a for-profit entity:
“The relief requested is extraordinary and rarely granted as it seeks the ultimate relief of the case on an expedited basis, with a cursory record, and without the benefit of a trial.
Having carefully considered the papers submitted and the pleadings in this action, including oral argument, and for the reasons set forth below, the Court hereby FINDS that plaintiffs have failed to meet their burden of proof for the extraordinary relief requested and DENIES the motion. That said, the Court is prepared to offer an expedited schedule on the core claims driving this litigation, while staying the balance.”
Marc Toberoff, a lawyer for Musk, said they were pleased the judge "offered an expedited trial on the core claims driving this case".
Alexa Plus’ AI upgrades cost $19.99, but it’s all free with Prime
Amazon has unveiled Alexa+, a major AI-powered overhaul of Alexa. The subscription service costs $19.99 per month but is free for Amazon Prime members. At its recent event, Amazon demonstrated Alexa+ handling complex tasks such as making dinner reservations, ordering groceries, and booking an Uber. While Amazon’s own Nova model handled over 70% of routine conversations, the more demanding queries went to Anthropic’s Claude large language model, which was described as tackling tasks that require “more thinking and intellectual heft.”
Other News
Tools
Alibaba makes AI video generation model free to use globally - Alibaba has open-sourced its video generation AI models globally, intensifying competition with companies like OpenAI and contributing to the growing trend of open-source AI development, particularly among Chinese firms.
Tencent heats up AI video-generation competition in China with new open-source product - Tencent's new open-source image-to-video model allows users to create high-resolution video clips with added sound effects and voice synchronization, intensifying the competition in China's AI video-generation market.
Sesame is the first voice assistant I’ve ever wanted to talk to more than once - Sesame, a new startup led by Oculus co-founder Brendan Iribe, introduces AI glasses with a voice assistant named Maya that offers a more engaging and natural conversational experience than existing voice assistants.
Microsoft launches next-gen Phi AI models - Microsoft's Phi-4-multimodal enhances various AI capabilities while Phi-4-mini focuses on speed and efficiency, both accessible on multiple platforms like smartphones, PCs, and cars.
Microsoft’s new Dragon Copilot is an AI assistant for healthcare - Microsoft's Dragon Copilot aims to reduce administrative burdens in healthcare by using AI for tasks like note-taking and medical information searches, enhancing clinician efficiency and patient experience.
The ‘First Commercial Scale’ Diffusion LLM Mercury Offers over 1000 Tokens/sec on NVIDIA H100 - Inception Labs' Mercury, a diffusion-based large language model, offers a significant speed advantage over traditional autoregressive models by generating text all at once rather than token by token, potentially challenging the need for specialized hardware for high-speed inference.
AMD Releases Instella: A Series of Fully Open-Source State-of-the-Art 3B Parameter Language Model - AMD Instella offers a fully open-source, 3 billion parameter language model that balances performance and accessibility.
ElevenLabs is launching its own speech-to-text model - ElevenLabs has launched its first standalone speech-to-text model, Scribe, which supports over 99 languages and aims to compete with established models by offering features like speaker diarization and word-level timestamps, although it currently only works with pre-recorded audio.
Physical Intelligence open-sources Pi0 robotics foundation model - Physical Intelligence has open-sourced its Pi0 robotic foundation model, allowing developers to fine-tune it for various tasks and platforms, with the aim of advancing general-purpose robotic intelligence through community collaboration.
Quora’s Poe now lets users create and share custom AI-powered apps - Quora's Poe platform now allows users to create and share custom AI-powered apps using a new App Creator tool that translates descriptions into code, with potential future monetization options for creators.
Google launches a free AI coding assistant with very high usage caps - Google's new free AI coding assistant, Gemini Code Assist for individuals, offers significantly higher usage caps than GitHub Copilot, aiming to attract developers early in their careers and potentially convert them to enterprise plans in the future.
Mistral’s new OCR API turns any PDF document into an AI-ready Markdown file - Mistral's new OCR API efficiently converts complex PDF documents into Markdown files, enhancing AI model processing by preserving text and graphical elements, and outperforming existing OCR solutions in speed and handling of non-English documents.
Business
Scale AI announces multimillion-dollar defense deal, a major step in U.S. military automation - Scale AI's new multimillion-dollar contract with the Department of Defense for the "Thunderforge" program marks a significant shift towards AI-driven military operations, raising ethical concerns about the potential for harm despite assurances of human oversight.
Meta plans to release standalone Meta AI app in effort to compete with OpenAI's ChatGPT - Meta plans to launch a standalone Meta AI app in the second quarter to enhance user interaction and compete with AI tools like ChatGPT, while also exploring monetization opportunities through a potential subscription service.
OpenAI launches $50M grant program to help fund academic research - OpenAI's $50 million grant program, NextGenAI, aims to support AI-assisted research at top universities while potentially increasing reliance on its own AI tools over competitors.
A.I. Start-Up Anthropic Closes Deal That Values It at $61.5 Billion - Anthropic's recent fund-raising round, led by Lightspeed Venture Partners, significantly increased its valuation amid a renewed surge of investor interest in AI start-ups.
Waymo has doubled its weekly robotaxi rides in less than a year - Waymo's rapid expansion in the robotaxi market, with over 200,000 weekly rides and plans to launch services in new cities, positions it as a leader in the autonomous vehicle industry.
Waymo and Uber's Austin robotaxi expansion begins today - Waymo and Uber have launched their robotaxi service in Austin, allowing users to potentially ride in a Waymo vehicle when ordering through the Uber app, with the service covering 37 square miles and maintained by a third-party partner.
OpenAI reportedly plans to charge up to $20,000 a month for specialized AI ‘agents’ - OpenAI is reportedly planning to introduce specialized AI agents with monthly fees ranging from $2,000 to $20,000, targeting various professional applications to help offset its significant financial losses.
Research
Pioneers of Reinforcement Learning Win the Turing Award - Andrew Barto and Rich Sutton, pioneers of reinforcement learning, have been awarded the Turing Award for their work that has become critical to modern AI applications, including guiding large language models and developing advanced AI agents.
Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs - Finetuning language models on narrow tasks like writing insecure code can lead to broad misalignment, causing them to exhibit harmful behaviors across unrelated prompts, with this effect being particularly pronounced in certain models.
Towards an AI co-scientist - An AI co-scientist system, built on Gemini 2.0, uses a multi-agent architecture to generate and validate novel research hypotheses, demonstrating its potential in biomedical fields like drug repurposing and target discovery.
BIG-Bench Extra Hard - BIG-Bench Extra Hard (BBEH) introduces a new benchmark to evaluate advanced reasoning capabilities in large language models, revealing significant challenges and room for improvement even for state-of-the-art models.
LongRoPE2: Near-Lossless LLM Context Window Scaling - LongRoPE2 introduces a novel RoPE rescaling algorithm and mixed context window training to effectively extend LLM context windows to 128k while preserving short-context performance, outperforming existing methods with significantly reduced training costs.
LADDER: Self-Improving LLMs Through Recursive Problem Decomposition - LADDER is a framework that enables large language models to autonomously improve their problem-solving abilities through recursive problem decomposition and self-guided learning, achieving significant performance improvements without human intervention or architectural scaling.
Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success - OpenVLA-OFT, an optimized fine-tuning recipe for vision-language-action models, enhances inference efficiency and task performance by integrating parallel decoding, continuous action representations, and an L1 regression objective, achieving state-of-the-art results in both simulation and real-world dexterous tasks.
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation - xAR introduces a next-X prediction framework for autoregressive visual generation, using Noisy Context Learning to improve robustness and achieve state-of-the-art performance on the ImageNet benchmark.
All Roads Lead to Likelihood: The Value of Reinforcement Learning in Fine-Tuning - Reinforcement learning enhances fine-tuning by effectively narrowing the search space to optimal policies for simple reward models, despite the potential information loss in the process.
Concerns
Donald Trump’s A.I. Propaganda - Donald Trump's dissemination of an AI-generated video depicting a fantastical vision of "Trump Gaza" highlights the potential for AI to be used as a tool for political propaganda and misinformation.
Key ex-OpenAI researcher subpoenaed in AI copyright case - Alec Radford, a key figure in developing OpenAI's AI technologies, has been subpoenaed in a copyright case where authors allege OpenAI's models, including ChatGPT, infringed on their works.
Alibaba Releases Advanced Open Video Model, Immediately Becomes AI Porn Machine - Alibaba's release of the open-source AI video generation model Wan 2.1 quickly attracted the attention of the AI porn community, highlighting the ethical challenges of open AI technology.
Analysis
“It’s a lemon”—OpenAI’s largest AI model ever arrives to mixed reviews - OpenAI's GPT-4.5 model has received mixed reviews due to its high cost and marginal performance improvements over GPT-4o, leading to questions about the future of traditional AI models.
Expert Opinions
Anthropic’s C.E.O., Dario Amodei, on Surviving the A.I. Endgame - Dario Amodei discusses Anthropic's new Claude 3.7 Sonnet model, the competitive AI landscape, particularly with China, and the potential risks and societal impacts of AI advancements over the next few years.