1-Bit LLM Commercial Milestone: Microsoft Research and Huawei Jointly Announce BitNet b1.58 Achieves Lossless 100B-Parameter Model Deployment on Kirin and Snapdragon Edge Platforms

Microsoft Research and Huawei jointly announced a key commercial milestone for 1-bit large language models (LLMs), stating that BitNet b1.58 now enables efficient deployment of large-scale models on edge devices equipped with Huawei's latest Kirin chips and Qualcomm's Snapdragon platforms. Built on BitNet's ternary weight architecture {-1, 0, +1}, the technology cuts the model's memory footprint by over 70% and consumes only 0.028 joules per inference, allowing large LLMs to run on consumer mobile hardware without relying on cloud connectivity. This breakthrough changes the economics of on-device AI, bringing high-performance model capabilities to smartphones and tablets without data center infrastructure or network round-trip latency.
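The memory and energy savings follow from the ternary weight representation itself. As a concrete illustration, here is a minimal sketch of the absmean quantization scheme described in the BitNet b1.58 paper: each weight tensor is scaled by its mean absolute value, then rounded and clipped to {-1, 0, +1}. The function name and example shapes are our own illustration, not code released by Microsoft or Huawei.

```python
import numpy as np

def absmean_ternary_quantize(w: np.ndarray, eps: float = 1e-5):
    """Quantize a weight matrix to {-1, 0, +1} with a per-tensor scale.

    A minimal sketch of the absmean quantization described in the
    BitNet b1.58 paper: scale by the mean absolute weight, then
    round and clip into the ternary set.
    """
    gamma = np.abs(w).mean()                           # per-tensor scale
    w_q = np.clip(np.rint(w / (gamma + eps)), -1, 1)   # values in {-1, 0, +1}
    return w_q.astype(np.int8), gamma

# Example: quantize a random weight matrix and measure its sparsity.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(4, 8)).astype(np.float32)
w_q, gamma = absmean_ternary_quantize(w)
print(w_q)                           # ternary matrix
print("zeros:", (w_q == 0).mean())   # fraction of weights quantized to 0
# At inference time the dequantized approximation is w ≈ gamma * w_q.
```

Because every weight lands in a three-value set, matrix multiplication reduces to additions, subtractions, and skipped zeros rather than floating-point multiplies, which is where the per-inference energy savings come from.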

Microsoft Releases BitNet b1.58 Performance Report — 1.58-bit LLMs Match Full-Precision Models While Using 71% Less Memory and Running 2.4x Faster

Microsoft Research has published comprehensive benchmarks for BitNet b1.58, its 1.58-bit quantized language model architecture that uses only three values (-1, 0, +1) for weights. The results show BitNet b1.58 matching full-precision Transformer models in perplexity and on downstream tasks while consuming 71.4% less GPU memory and delivering a 2.4x latency speedup. With the recent open-sourcing of bitnet.cpp for CPU inference, Microsoft is positioning BitNet as a practical path to deploying large models on consumer hardware and edge devices.
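The "1.58-bit" name comes from information theory: a three-valued weight carries log2(3) ≈ 1.585 bits. Practical storage can approach that floor by packing several ternary digits ("trits") into each byte; for instance, five trits fit in one byte because 3^5 = 243 ≤ 256, giving 1.6 bits per weight. The sketch below demonstrates this packing idea; it is illustrative only, and the actual layout used by bitnet.cpp's kernels may differ.

```python
import numpy as np

def pack_ternary(w_q: np.ndarray) -> np.ndarray:
    """Pack ternary weights {-1, 0, +1} five to a byte as base-3 digits.

    Illustrative only -- bitnet.cpp's on-disk/kernel layout may differ.
    """
    trits = (w_q.ravel() + 1).astype(np.uint8)       # map {-1,0,1} -> {0,1,2}
    pad = (-len(trits)) % 5                          # pad to a multiple of 5
    trits = np.concatenate([trits, np.zeros(pad, np.uint8)])
    groups = trits.reshape(-1, 5)
    powers = 3 ** np.arange(5, dtype=np.uint16)      # base-3 place values
    return (groups.astype(np.uint16) @ powers).astype(np.uint8)  # max 242 < 256

def unpack_ternary(packed: np.ndarray, n: int) -> np.ndarray:
    """Inverse of pack_ternary: recover the first n ternary weights."""
    packed = packed.astype(np.uint16)
    trits = np.stack([(packed // 3**i) % 3 for i in range(5)], axis=1)
    return trits.ravel()[:n].astype(np.int8) - 1

# Round-trip check and effective bit width.
w_q = np.array([-1, 0, 1, 1, -1, 0, 0, 1], dtype=np.int8)
packed = pack_ternary(w_q)
assert np.array_equal(unpack_ternary(packed, len(w_q)), w_q)
print(f"{8 * packed.nbytes / len(w_q):.1f} bits per weight")
# -> 2.0 here due to padding on 8 weights; approaches 1.6 for large tensors.
```

Compared with 16-bit weights, even this simple 1.6-bit packing shrinks weight storage by an order of magnitude; the 71.4% end-to-end GPU memory figure is smaller because activations, KV cache, and embeddings remain at higher precision.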
