Qualcomm has announced that on-device AI capabilities will be available on mobile devices and Windows 11 PCs through its new Snapdragon 8 Gen 3 and X Elite chips. The chips were designed specifically for generative AI and can run a range of large language models, vision-language models, and transformer-based automatic speech recognition models. Supporting models of up to 10 billion parameters on the Snapdragon 8 Gen 3 and 13 billion on the X Elite, they let users run AI models such as Baidu’s ERNIE 3.5, OpenAI’s Whisper, Meta’s Llama 2, or Google’s Gecko without an internet connection. The focus of Qualcomm’s chips is on voice, text, and image inputs.
The Qualcomm AI Engine consists of the Oryon CPU, the Adreno GPU, and the Hexagon NPU. Together, these components deliver up to 45 TOPS (trillions of operations per second) and process 30 tokens per second on laptops and 20 tokens per second on mobile devices. (Tokens are the basic units of text or data that large language models process and generate.) The chipsets use Samsung’s 4.8GHz LP-DDR5x DRAM for efficient memory allocation.
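To get a feel for what those throughput figures mean in practice, the sketch below estimates how long a chatbot-style reply would take to generate at the quoted decode rates. The token-count estimate for a 500-word reply is a common rule of thumb (roughly 0.75 words per token for English), not a Qualcomm specification.

```python
def estimate_generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Rough time to stream num_tokens at a fixed decode rate."""
    return num_tokens / tokens_per_second

# A ~500-word reply is on the order of 700 tokens with typical
# English tokenizers (assumption: ~0.75 words per token).
reply_tokens = 700

laptop_seconds = estimate_generation_seconds(reply_tokens, 30)  # X Elite laptops
phone_seconds = estimate_generation_seconds(reply_tokens, 20)   # SD8 Gen 3 phones

print(f"laptop: ~{laptop_seconds:.0f}s, phone: ~{phone_seconds:.0f}s")
```

In other words, at these rates a long reply streams in well under a minute on either class of device, which is the threshold where local generation starts to feel interactive.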
Durga Malladi, SVP & General Manager, Technology Planning & Edge Solutions at Qualcomm, highlighted the importance of heterogeneous compute for running these models efficiently. The CPU, GPU, and NPU in Qualcomm’s chips work concurrently, allowing multiple models to run at once.
Generative AI has shown great potential for handling complex tasks efficiently. Malladi mentioned potential applications such as meeting and document summarization, email drafting, prompt-based generation of computer code or music, and even enhanced photography. Qualcomm builds on its previous edge-AI work, specifically the Cognitive ISP, which lets devices with these chips edit photos in real time across up to 12 layers. They can also capture clearer images in low light, remove unwanted objects, expand image backgrounds, and even watermark shots as real rather than AI-generated.
Local on-device AI offers several advantages over cloud-based AI. Because the models live on the device itself, they can gradually build up personalized experiences over time. Eliminating the round trip to the cloud also means faster processing and image generation: the X Elite and SD8 Gen 3 chips can run Stable Diffusion on-device and generate images in less than 0.6 seconds.
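For context on that sub-0.6-second figure: Stable Diffusion builds an image over a series of denoising steps, so the per-step compute budget is tight. The step count below is an illustrative assumption (a typical fast-sampler configuration); Qualcomm has not published the exact settings behind its benchmark.

```python
def per_step_budget_ms(total_seconds: float, num_steps: int) -> float:
    """Average time available per denoising step, in milliseconds."""
    return total_seconds * 1000 / num_steps

# Assumption: a fast-sampler run of 20 denoising steps.
print(f"{per_step_budget_ms(0.6, 20):.0f} ms per step")  # prints "30 ms per step"
```

Each step involves a full pass through the denoising network, so sustaining tens of milliseconds per step on a phone-class chip is a meaningful claim.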
The ability to run larger and more capable models and interact with them through voice inputs can greatly benefit consumers. Voice interfaces provide a more natural and intuitive way to interact with devices. Qualcomm believes this can be a transformative moment in human-device interaction.
Qualcomm’s on-device AI plans extend beyond mobile devices and PCs. The company is already working on chip iterations to raise the parameter limit from 10-13 billion to potentially 20 billion or more, which would enable more sophisticated models and a wider range of use cases.
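One way to see why the parameter limit matters is memory footprint: the model’s weights have to fit in device RAM. The sketch below computes approximate weight storage at a given quantization level; the bit-widths are illustrative assumptions, since Qualcomm has not detailed its quantization scheme here.

```python
def model_size_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

# A 10B-parameter model at 4-bit quantization vs. 16-bit floats (assumed widths):
print(f"{model_size_gb(10e9, 4):.0f} GB")   # prints "5 GB"
print(f"{model_size_gb(10e9, 16):.0f} GB")  # prints "20 GB"
```

Aggressive quantization is what makes a 10-billion-parameter model plausible on a phone, and pushing toward 20 billion parameters roughly doubles the memory the device must dedicate to weights.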
Malladi highlighted the potential of on-device AI for Advanced Driver Assistance Systems (ADAS). With multiple cameras, IR sensors, radar, lidar, and voice inputs, the models used in ADAS can be quite large, reaching 30-60 billion parameters. Qualcomm’s estimates suggest that on-device models could eventually exceed 100 billion parameters.
In conclusion, Qualcomm’s introduction of on-device AI capabilities in their Snapdragon 8 Gen 3 and X Elite chips opens up new possibilities for AI-powered applications on mobile devices and PCs. With support for large language models and improved image processing capabilities, users can expect more personalized and efficient AI experiences without relying on an internet connection. Future chip iterations will further enhance the capabilities of on-device AI, making it a crucial technology in various industries, including automotive.