Last Week in AI #294 - Search in ChatGPT, AI for robots, real-time Minecraft simulation
OpenAI’s search engine is now live in ChatGPT, This Is a Glimpse of the Future of AI Robots, Decart’s AI simulates a real-time, playable version of Minecraft, and more!
Top News
OpenAI’s search engine is now live in ChatGPT
OpenAI has integrated a web search feature into its AI-powered chatbot, ChatGPT, closing a competitive gap with rivals like Microsoft Copilot and Google Gemini. The feature, which can be manually triggered or activated based on queries, allows users to access real-time information from the web during conversations. The search functionality, built with a mix of technologies including Microsoft's Bing, is available across all ChatGPT platforms and is based on a fine-tuned version of GPT-4o. Despite the new feature, OpenAI will continue to update its training data to ensure users have access to the latest advancements. The launch comes amid a surge in AI-powered search technologies, with companies like Meta and Google also developing their own solutions.
This Is a Glimpse of the Future of AI Robots
San Francisco-based startup Physical Intelligence is developing an artificial intelligence model capable of performing a variety of household chores, moving the concept of a domestic robot from science fiction to reality. The company is leveraging large amounts of data to train the AI model, similar to the approach used in creating large language models (LLMs) for chatbots. The goal is to create a general-purpose learning algorithm for the physical world, capable of handling tasks across different types of robots. CEO Karol Hausman likens the process to training language models, emphasizing the generality of their approach and its ability to utilize data from various robotic embodiments.
Decart’s AI simulates a real-time, playable version of Minecraft
Decart, an Israeli AI firm, has launched Oasis, an "open-world" AI model that simulates a real-time, playable version of Minecraft. The model, which was trained on Minecraft gameplay videos, generates frames in real time based on keyboard and mouse movements, simulating the game's physics, rules, and graphics. Oasis currently runs at low resolution and tends to forget level layouts, but Decart is working on improvements, including the ability to create a custom "world" from an uploaded image. Future versions of Oasis, optimized for Etched's upcoming AI accelerator chips, could potentially generate up to 4K gameplay. However, questions about copyright implications arise, as Decart did not mention obtaining Microsoft's permission to train the model on Minecraft footage.
Other News
Tools
OpenAI Releases SimpleQA: A New AI Benchmark that Measures the Factuality of Language Models - SimpleQA focuses on short, fact-seeking questions with a single, indisputable answer and is designed to remain challenging for the latest AI models.
Meta unveils AI tools to give robots a human touch in physical world - Meta unveils AI tools for robots to interact with the physical world, including touch perception models, tactile sensors, and a benchmark for evaluating human-robot collaboration.
Google preps ‘Jarvis’ AI agent that works in Chrome - Google is developing an AI agent called Project Jarvis, which will operate in Chrome and automate everyday web-based tasks, powered by Gemini 2.0 and expected to be previewed in December.
Runway Adds Precise Camera Controls to its AI Video Editor - Runway's new Advanced Camera Control feature allows precise panning, tracking, and zooming around AI subjects, catering to filmmakers and Hollywood studios.
xAI adds image understanding capabilities to Grok - xAI, owned by Elon Musk, has integrated image-understanding capabilities into its Grok AI model, allowing paid users on the X social platform to upload images and ask the AI chatbot questions about them, with plans to further enhance its functionality.
Watch out, Midjourney — Recraft just announced new AI image generator model - Recraft has unveiled its latest AI image generation model, Recraft V3, which offers designer-centric features and sets a new benchmark for quality among AI image generators, surpassing competitors like Midjourney and OpenAI.
Claude AI can now analyze PDFs - here's how to try it - Anthropic's Claude 3.5 Sonnet AI model now has the ability to analyze PDF files, including text, images, charts, and graphs, but this feature is only available through a paid professional subscription or the API.
Business
Alphabet's Waymo Serving Over 150,000 Paid Robotaxi Rides Every Week Now, Surging 50% In 2 Months - Waymo, the robotaxi operator, has seen a 50% surge in paid rides in just two months, now providing over 150,000 trips per week and planning to expand its operations.
Waymo is now valued at a staggering $45 billion - Waymo, Alphabet's autonomous driving unit, has received a significant amount of fresh capital, leading to a valuation of over $45 billion, and plans to expand its robotaxi service in various cities.
Zoox custom robotaxis are finally coming to San Francisco and Las Vegas - Zoox, an Amazon-owned AV company, is set to launch its purpose-built autonomous vehicles in San Francisco and Las Vegas, starting with an "explorer" program for early riders and a gradual expansion of its robotaxi service.
What if A.I. Is Actually Good for Hollywood? - A Hollywood visual-effects start-up is using artificial intelligence to create seamless digital renderings of human faces, revolutionizing the industry and hinting at the potential for A.I. to accomplish high-quality visual effects at a fraction of the production cost.
Universal Music partners with AI company building an ‘ethical’ music generator - Universal Music partners with AI company Klay Vision to create an "ethical" foundational model for AI music generation, aiming to collaborate with the music industry and creators while respecting copyright and likeness rights.
Meta says it’s making its Llama models available for US national security applications - Meta is making its Llama series of AI models available to U.S. government agencies and contractors working on national security applications, in an effort to combat the perception that its "open" AI is aiding foreign adversaries.
SAG-AFTRA Inks Deal With AI Company Ethovox To Build Foundational Voice Model For Digital Replicas - SAG-AFTRA has partnered with Ethovox to create a foundational voice model for digital replicas, ensuring fair compensation and consent for voice actors involved, while also advocating for more contractual protection in the age of AI.
Microsoft just delayed Recall again - Microsoft is once again delaying the rollout of its Recall feature for Copilot Plus PCs, this time to refine the feature and ensure a secure and trusted user experience.
Meta strikes multi-year AI deal with Reuters - Meta has struck a multi-year deal with Reuters to use its news content to provide real-time answers to user queries about news and current events in its Meta AI chatbot, sources familiar with the agreement told Axios.
Perplexity CEO offers AI company’s services to replace striking NYT staff - The offer sparked controversy and criticism.
Anthropic hikes the price of its Haiku model - Anthropic's new AI model, Claude 3.5 Haiku, is pricier than its predecessor and lacks image analysis capabilities, despite outperforming the previous flagship model on certain benchmarks.
Research
Unbounded: A Generative Infinite Game of Character Life Simulation - A generative infinite game called Unbounded uses advanced AI to create a virtual world where players can interact with autonomous virtual characters through open-ended mechanics generated by a large language model.
Can Language Models Replace Programmers? REPOCOD Says 'Not Yet' - Language models have shown impressive code generation abilities, but the REPOCOD benchmark reveals that they are not yet capable of replacing human programmers in real-world software development.
Distinguishing Ignorance from Error in LLM Hallucinations - Distinguishing between two types of hallucinations in large language models is crucial for detecting and mitigating errors, and a new approach called Wrong Answer despite having Correct Knowledge (WACK) is introduced to address this issue.
Are LLMs Better than Reported? Detecting Label Errors and Mitigating Their Effect on Model Performance - Large language models (LLMs) can detect label errors in datasets, suggesting that actual model performance may be higher than reported; the authors also propose methods to mitigate the impact of mislabeled data on model training.
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark - MMAU is a benchmark designed to evaluate multimodal audio understanding models on tasks requiring expert-level knowledge and complex reasoning, challenging models to tackle tasks akin to those faced by experts.
Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse - Chain-of-thought (CoT) prompting can reduce performance on tasks where thinking makes humans worse, as shown by experiments across various settings and models.
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents - OS-ATLAS is a foundational GUI action model that excels at GUI grounding and out-of-distribution (OOD) agentic tasks through innovations in both data and modeling, providing significant performance improvements over previous state-of-the-art models.
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms - Ferret-UI 2 is a multimodal large language model designed to understand user interfaces across various platforms, offering support for multiple platform types, high-resolution perception, and advanced task training data generation.
ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting - Visual-temporal context prompting enables agents to accomplish complex tasks in Minecraft, showcasing the effectiveness of this approach in embodied decision-making.
Concerns
Anthropic warns of AI catastrophe if governments don't regulate in 18 months - AI company Anthropic warns of catastrophic AI risks and advocates for targeted regulation to mitigate these risks, emphasizing the importance of transparency, incentivizing security, and simplicity in government guidelines.
Google, Microsoft, and Perplexity Are Promoting Scientific Racism in Search Results - AI-infused search engines from Google, Microsoft, and Perplexity are surfacing debunked research promoting race science and the idea of white genetic superiority, raising concerns about potential radicalization.
Open Source Bites Back as China’s Military Makes Full Use of Meta AI - Chinese research institutions with connections to the military have developed AI systems using Meta’s open-source Llama model, training them for military applications such as intelligence analysis, strategic planning, and command decision-making, despite Meta's prohibitions.
Tesla self-driving test driver: ‘you’re running on adrenaline the entire eight-hour shift’ - Tesla's internal self-driving team pushes the limits of autonomous driving technology, with test drivers describing dangerous scenarios and risky behaviors in the pursuit of data collection.
OpenAI Research Finds That Even Its Best Models Give Wrong Answers a Wild Proportion of the Time - OpenAI's latest AI models, including its cutting-edge o1-preview model, are shockingly bad at providing correct answers, with even the best models scoring abysmally on the new SimpleQA benchmark, raising concerns about the pervasiveness of AI in everyday life.