Together AI Showcases Open Agentic Systems at GTC 2026: FlashAttention-4, ThunderAgent, Voice AI, and Production-Grade Inference — Research and Product Updates Highlight Open Source LLMs and AI Factory Capabilities

**Together AI**, as a diamond sponsor of **NVIDIA GTC 2026**, is showcasing its latest research and product innovations at Booth #1213 in San Jose from March 16 to 19. Today’s updates focus on open-source LLMs, voice AI capabilities, production-grade inference, and AI factory infrastructure. Key announcements include **FlashAttention-4** (up to 1.3× faster than cuDNN on NVIDIA Blackwell), the open-source **ThunderAgent** for agentic workloads (delivering a 3.6× throughput improvement), the **ATLAS-2** adaptive learning speculator, and a full-featured voice AI stack supporting real-time speech-to-text and text-to-speech. Together AI demonstrates how enterprises can transition from AI experiments to production deployment in minutes using its GPU clusters and inference platform.

The Voice Architect: Monetize Voice Anywhere + TalkToText by Building Custom Voice Workflows

Voice is the new keyboard, but most apps aren't listening. This guide outlines a "Voice Architect" consultancy. Use Voice Anywhere to add voice control capabilities to any existing website or SaaS platform, and TalkToText to provide high-accuracy, long-form dictation services. Learn to sell "Hands-Free Upgrades" to businesses: enabling warehouse workers to input data by voice, or helping writers draft books while walking. Includes a technical implementation plan, pricing tiers, and a strategy to find "keyboard-weary" clients.

Step-Audio 2.1 Claims Global Audio Evaluation Crown

China's Steptok AI has taken a significant step forward in voice AI: its latest model, Step-Audio 2.1, has reportedly achieved top-tier scores on multiple global audio understanding benchmarks, reflecting advances in its end-to-end architecture and reasoning capabilities.
