Last Week in AI #289 - OpenAI's latest drama, Llama 3.2, SB 1047 vetoed, AIs get chatty
‘We Are Not a Normal Company’: OpenAI’s Latest Drama, Meta releases its first open AI model that can process images, Gov. Gavin Newsom vetoes first-in-nation AI safety bill, and more!
Top News
‘We Are Not a Normal Company’: OpenAI’s Latest Drama
According to The Times and others, OpenAI is undergoing a significant transition as it seeks to become more appealing to external investors. This includes a shift towards becoming a for-profit business and potentially raising one of the largest funding rounds in recent history, which could increase its valuation to around $150 billion. Amid this, multiple high-ranking employees resigned last week, including Chief Technology Officer Mira Murati, Chief Research Officer Bob McGrew, and VP of Research Barret Zoph. All three posted statements saying they are resigning to explore new opportunities or take a break, and are totally supportive of OpenAI.
OpenAI CFO tells investors funding round should close by next week despite executive departures
As OpenAI CTO and two others depart, Altman denies link to restructuring plans
Sam Altman denies plan to give him equity in for-profit OpenAI
Meta releases its first open AI model that can process images
Meta has released Llama 3.2, the first of its large open source models capable of processing both images and text. The model is designed to be easily implemented by developers, with minimal adjustments needed to incorporate its multimodal capabilities. Llama 3.2 includes two vision models and two lightweight text-only models, with the smaller models designed to work on Qualcomm, MediaTek, and other Arm hardware, indicating Meta's interest in mobile applications. Despite the release of Llama 3.2, the previous model, Llama 3.1, which has a larger number of parameters, will still be relevant for text generation tasks.
California Gov. Gavin Newsom vetoes first-in-nation AI safety bill
California Governor Gavin Newsom has vetoed a pioneering bill that aimed to establish safety measures for large artificial intelligence (AI) models, a move seen as a setback to efforts to regulate the rapidly evolving AI industry. The bill, which faced opposition from tech giants, startups, and several Democratic House members, would have set some of the first regulations on large-scale AI models in the nation. It proposed requirements for companies to test their models and publicly disclose their safety protocols. However, Newsom argued that the bill did not consider the context of AI deployment:
"While well-intentioned, SB 1047 does not take into account whether an AI system is deployed in high-risk environments, involves critical decision-making or the use of sensitive data … Instead, the bill applies stringent standards to even the most basic functions — so long as a large system deploys it. I do not believe this is the best approach to protecting the public from real threats posed by the technology." -Statement from Governor Newsom
Instead of the bill, Newsom announced a partnership with industry experts to develop guidelines for powerful AI models.
Conversational AI takes off: OpenAI, Google, and Meta give their chatbots new voices, increase availability.
OpenAI rolls out Advanced Voice Mode with more voices and a new look
OpenAI has announced the rollout of its Advanced Voice Mode (AVM) to a broader set of ChatGPT's paying customers, starting with Plus and Teams tiers, followed by Enterprise and Edu customers. The AVM, which enhances the naturalness of ChatGPT's speech, has been redesigned and is now represented by a blue animated sphere. The update also includes five new voices - Arbor, Maple, Sol, Spruce, and Vale - bringing the total number of voices to nine. OpenAI has also improved the voice feature's understanding of accents and made conversations smoother and faster. The rollout does not include video and screen sharing features, and AVM is not yet available in several regions, including the EU, the U.K., Switzerland, Iceland, Norway, and Liechtenstein.
Gemini’s chatty voice mode is out now for free on Android
Relatedly, Google has launched its Gemini Live voice chat mode for all Android users, which was previously only available to Gemini Advanced subscribers. The feature allows users to interact with the conversational AI chatbot through the Gemini app or its overlay. Similar to AVM, users can ask questions aloud, interrupt the chatbot mid-sentence, and choose from several different voices. Google has also expanded Gemini Live’s voice options, adding ten new styles inspired by astronomical phenomena.
These voices, named after constellations, stars, and star-related phenomena, were also initially exclusive to Gemini Advanced subscribers but are now available to all users.
Meta’s AI can now talk to you in the voices of Awkwafina, John Cena, and Judi Dench
Joining the club, Meta's AI chatbot in Instagram, WhatsApp, and Facebook now has new celebrity voices, including Awkwafina, John Cena, and Judi Dench. It can also now answer questions about photos and make changes to images, likely due to its update to Llama 3.2.
Other News
Tools
Duolingo launches AI-powered Adventures mini-games and Video Call feature - Duolingo, the world’s leading mobile learning platform, today announced new features including Adventures mini-games and Video Call to help with language learning.
OpenAI increases API limits for o1 and o1-mini - OpenAI increases API limits for o1 and o1-mini, benefiting tier 5 developers and ChatGPT Plus and Teams subscribers, with o1-mini now limited to 50 queries per day.
Perplexity introduces new 'Reasoning' focus powered by OpenAI's o1 - Perplexity AI introduces a new 'Reasoning' focus powered by OpenAI's o1 model, aimed at solving puzzles, math problems, and coding challenges, available to Pro users with a limit of 10 queries per day.
Meta’s Ray-Bans will now ‘remember’ things for you - Meta's Ray-Ban smart glasses receive software updates that enhance their AI capabilities, including features like "Reminders," real-time language translation, and the ability to scan QR codes and make calls, bringing the glasses closer to feeling smart.
Siri May Not Get Its Apple Intelligence Update Until January 2025 - Siri's Apple Intelligence update, including features like Genmoji and Image Playground, is expected to be released in stages, with the major update not arriving until January 2025.
Business
Pudu unveils super semi-humanoid robot with 8-hour battery, 10kg lift power - Pudu Robotics introduces a super semi-humanoid robot with 8-hour battery life and 10kg lift power, aiming to shape the future of the service robotics industry.
Uber will soon offer WeRide robotaxis in Abu Dhabi - Uber partners with WeRide to bring robotaxis to Abu Dhabi, expanding its autonomous vehicle offerings in the Middle East.
Google Reportedly Spent $2.7 Billion to Rehire Character.AI Founder - Google reportedly spent $2.7 billion to rehire ex-employee/AI guru Noam Shazeer, who left the company in frustration after they refused to release a chatbot he had developed, and is now one of the leaders of Google’s Gemini AI project.
Middle Eastern funds are plowing billions of dollars into hottest AI start-ups - Middle Eastern sovereign wealth funds, particularly those from Saudi Arabia, United Arab Emirates, Kuwait, and Qatar, are increasingly investing in Silicon Valley's AI companies, with funding for AI companies by Middle-Eastern sovereigns increasing fivefold in the past year.
Intel Unveils Next-Generation AI Solutions with the Launch of Xeon 6 and Gaudi 3 - Intel launches Xeon 6 with Performance-cores (P-cores) and Gaudi 3 AI accelerators, enabling an open ecosystem for implementing workloads with greater performance, efficiency, and security.
Warner Bros. Discovery Inks AI Deal With Google For Captions on Programming - Warner Bros. Discovery partners with Google Cloud to use AI-powered tool for creating captions, aiming to cut time and production costs by up to 50%.
James Cameron Joins Board of Stability AI In Coup for Tech Firm - James Cameron joins the board of Stability AI, expressing excitement about the potential of generative AI and CGI image creation in film technology.
TSMC execs allegedly dismissed OpenAI CEO Sam Altman as ‘podcasting bro’ - OpenAI CEO Sam Altman's ambitious plans for AI, involving trillions of dollars in investment and partnerships with major tech companies, have faced skepticism and dismissal from high-powered execs at TSMC, Samsung, and SK Hynix.
OpenAI Is Growing Fast and Burning Through Piles of Money - OpenAI's revenue from ChatGPT has skyrocketed, but the company is projected to lose billions this year due to high expenses, as it seeks a $7 billion investment to support its rapid growth.
Research
Autonomous robot replaces human fusion reactor inspectors in world-first trial - Researchers successfully deployed a fully autonomous robot to inspect the inside of a nuclear fusion reactor, demonstrating the potential for autonomous robots to enhance safety and cut costs in industrial facilities.
Archaeologists use AI to discover 303 unknown geoglyphs near Nazca Lines - AI and drones have helped archaeologists discover 303 previously unknown geoglyphs near the Nazca Lines in Peru, shedding light on the transition from the Paracas culture to the Nazcas and providing a new understanding of the area's ancient history.
Agent Workflow Memory - Agents can benefit from a method called Agent Workflow Memory to induce commonly reused routines and guide future actions, leading to substantial improvements in web navigation tasks.
ByteDance Researchers Release InfiMM-WebMath-40B: An Open Multimodal Dataset Designed for Complex Mathematical Reasoning - AI researchers from ByteDance and the Chinese Academy of Sciences have introduced InfiMM-WebMath-40B, a comprehensive multimodal dataset designed to enhance the mathematical reasoning capabilities of Large Language Models, demonstrating superior performance in handling complex reasoning tasks involving text and visual data.
Diagram of Thought (DoT): An AI Framework that Models Iterative Reasoning in Large Language Models (LLMs) as the Construction of a Directed Acyclic Graph (DAG) within a Single Model - The DoT framework enhances reasoning capabilities in large language models by modeling iterative reasoning as a directed acyclic graph within a single LLM, addressing limitations of previous methods and providing a sophisticated model capable of handling the complexities of human-like reasoning in a computationally efficient manner.
LLMs Still Can't Plan; Can LRMs? A Preliminary Evaluation of OpenAI's o1 on PlanBench - OpenAI's o1 model, designed as a Large Reasoning Model (LRM), shows significant improvement in planning abilities compared to traditional large language models (LLMs) on the PlanBench benchmark, raising questions about accuracy, efficiency, and guarantees for deploying such systems.
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale - A novel framework called ProX demonstrates that even small language models can refine data at scale, outperforming human-crafted rule-based methods and offering potential for efficient pre-training.
OpenAI Releases Multilingual Massive Multitask Language Understanding (MMMLU) - The dataset addresses the pressing need to evaluate language models across diverse linguistic, cognitive, and cultural contexts, offering a robust, multilingual, and multitask dataset designed to assess the performance of LLMs on various tasks.
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness - LLaVA-3D introduces a simple yet effective framework for empowering LMMs with 3D-awareness, achieving faster convergence and state-of-the-art performance in 3D tasks while maintaining strong 2D understanding capabilities.
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models - Molmo introduces state-of-the-art VLMs with open weights and open data, outperforming proprietary systems and providing detailed image caption datasets collected from human annotators.
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models - Current LLMs lack long text generation capabilities, as shown by the new Hierarchical Long Text Generation Benchmark (HelloBench). The accompanying Hierarchical Long Text Evaluation (HelloEval) method significantly reduces the time and effort required for human evaluation while maintaining a high correlation with it.
MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling - MIMO is a novel framework for synthesizing character videos with controllable attributes, achieving scalability to arbitrary characters, generality to novel 3D motions, and applicability to interactive real-world scenes in a unified framework.
JailbreakBench: An Open Sourced Benchmark for Jailbreaking Large Language Models (LLMs) - JailbreakBench is an open-source benchmark developed to standardize the assessment of jailbreak attempts and defenses on LLMs, aiming to promote advancements in protecting LLMs against adversarial manipulation and improve their dependability and safety.
Concerns
Meet MathPrompt, a way threat actors can break AI safety controls - AI safety controls can be bypassed by translating malicious requests into math equations, posing a critical vulnerability in current AI safety measures.
False memories planted in ChatGPT give hacker persistent exfiltration channel - False memories planted in ChatGPT by a security researcher allowed for persistent exfiltration of user input, prompting OpenAI to issue a partial fix for the vulnerability.
AI ban ordered after child protection worker used ChatGPT in Victorian court case - Child protection worker in Victoria banned from using generative AI services after entering personal information, including the name of an at-risk child, into ChatGPT.
Sarah Silverman Lawyers Get Judge’s Harsh Rebuke in Meta AI Case - Sarah Silverman's legal team receives harsh criticism from a federal judge for inadequately representing clients in a proposed class action against Meta Platforms Inc. over the use of copyrighted books to train its AI model.
Policy
The Secret Service Spent $50,000 on OpenAI and Won’t Say Why - The Secret Service spent $50,000 on Microsoft Azure and OpenAI cloud services, but won't disclose the use case, saying it does not discuss methods used for its "operations," amid a recent White House policy change regarding AI usage by federal agencies.
Startup behind “world’s first robot lawyer” to pay $193K for false ads, FTC says - AI startup DoNotPay, initially advertised as "the world's first robot lawyer," has been exposed by the FTC for making false claims and has agreed to pay $193,000 to settle charges.
Expert Opinions
The Intelligence Age - Sam Altman argues that in the next couple of decades, AI will enable us to accomplish tasks that were once unimaginable, leading to shared prosperity and significant improvements in people's lives.