Canonical is planning to introduce AI-powered voice input capabilities to Ubuntu, specifically targeting text fields across the operating system. This feature, expected in Ubuntu 26.10, will allow users to simply speak into any text box instead of typing, making the interface more accessible and convenient.
At the recent Ubuntu Summit, Jon Seager, Canonical’s VP of Engineering, highlighted this initiative by stating that users will be able to "press a button and talk into any field that you could previously type in." The feature will rely on a lightweight AI speech-recognition model, such as Whisper, to transcribe speech into text.
This move is part of a broader effort to integrate AI features throughout Ubuntu. Founder Mark Shuttleworth has expressed a vision for Ubuntu to become the "OS for agentic AI." While the voice input feature aims to enhance accessibility, it is also designed for everyday desktop users who might prefer speaking to typing. Seager, with a bit of humor, remarked, "Why type like an animal to your [AI] agent when you can just talk to it?"
While the concept of voice-to-text is straightforward, integrating it seamlessly across all text input areas presents challenges. Seager acknowledged that ensuring consistent, reliable performance will be complex.
The initial goal is to ship the voice input feature in Ubuntu 26.10. However, it’s still undecided whether it will be turned on by default or introduced as an opt-in preview feature. In addition to voice input, Canonical has plans for ‘implicit’ AI features, which will operate behind the scenes to improve things such as webcam autofocus and microphone quality.
Would you find it beneficial to type with your voice? Your thoughts are welcome!
