The Web Speech API is not ready for production: its recognition accuracy is about as reliable as a coin flip, and it has critical privacy flaws.
#1 about 3 minutes
Why voice user interfaces are important for accessibility
Voice interfaces can significantly improve web accessibility for users with disabilities and provide hands-free convenience for mobile professionals.
#2 about 1 minute
Understanding the Web Speech API's core functions
The Web Speech API is a W3C specification divided into two parts: speech recognition, which converts voice to text, and speech synthesis, which converts text to voice.
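As a rough illustration of the two halves, the sketch below checks which of them a given environment exposes. The helper name `detectSpeechSupport` and its return shape are hypothetical, and the check for the `webkit`-prefixed constructor assumes a Chromium-based browser; treat it as a starting point rather than a complete compatibility test.

```javascript
// Hypothetical helper (not part of the API): reports which halves of the
// Web Speech API a window-like object exposes. Pass `window` in a browser.
function detectSpeechSupport(globalObj) {
  // Recognition: some browsers only expose the webkit-prefixed constructor.
  const Recognition =
    globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition;

  return {
    recognition: typeof Recognition === "function",
    // Synthesis: needs both the speechSynthesis controller object and the
    // SpeechSynthesisUtterance constructor.
    synthesis:
      typeof globalObj.speechSynthesis === "object" &&
      globalObj.speechSynthesis !== null &&
      typeof globalObj.SpeechSynthesisUtterance === "function",
  };
}
```

In a browser this would be called as `detectSpeechSupport(window)`, and a page could hide its microphone button when `recognition` is `false`.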
#3 about 2 minutes
Reviewing VUI research and its current limitations
Research projects such as the Conversational Web and a wheelchair VUI demonstrate the potential of voice interfaces but suffer from inconsistent accuracy, online-only operation, and a lack of wake-word support.
#4 about 3 minutes
How to implement the Web Speech API in JavaScript
Learn the step-by-step process of implementing speech recognition, including loading the class, configuring grammar with JSGF, starting the listener, and processing the results.
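The four steps named above could be sketched roughly as follows. This is a minimal sketch, not the speaker's code: the function names `buildJsgfGrammar` and `startRecognition`, the grammar name `commands`, and the `en-US` language setting are all assumptions made for illustration. Browsers may also ignore the grammar list entirely, so it is best treated as a hint.

```javascript
// Hypothetical helper: builds a JSGF grammar string that offers the given
// phrases as alternatives.
function buildJsgfGrammar(name, phrases) {
  return `#JSGF V1.0; grammar ${name}; public <${name}> = ${phrases.join(" | ")} ;`;
}

// Hypothetical helper: wires up recognition on a window-like object and
// returns the recognizer, or null when the API is unavailable.
function startRecognition(globalObj, phrases, onTranscript) {
  // Step 1: load the class (some browsers only expose the prefixed name).
  const Recognition =
    globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition;
  if (!Recognition) return null;
  const recognition = new Recognition();

  // Step 2: configure the grammar with JSGF, if a grammar list class exists.
  const GrammarList =
    globalObj.SpeechGrammarList || globalObj.webkitSpeechGrammarList;
  if (GrammarList) {
    const grammars = new GrammarList();
    grammars.addFromString(buildJsgfGrammar("commands", phrases), 1);
    recognition.grammars = grammars;
  }
  recognition.lang = "en-US"; // assumed language
  recognition.interimResults = false;
  recognition.maxAlternatives = 1;

  // Step 4 (registered before starting): process the results.
  recognition.onresult = (event) => {
    const best = event.results[event.resultIndex][0];
    onTranscript(best.transcript, best.confidence);
  };
  recognition.onerror = (event) => {
    console.error("Speech recognition error:", event.error);
  };

  // Step 3: start the listener.
  recognition.start();
  return recognition;
}
```

In a browser, a call might look like `startRecognition(window, ["red", "green", "blue"], (text, confidence) => console.log(text, confidence))`. Passing the global object as a parameter is a design choice that lets the wiring be exercised with stand-in classes outside a browser.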
#5 about 2 minutes
Navigating the Web Speech API's result data structure
The API returns a nested data structure containing a list of results, each with alternatives that include the text transcript and a confidence score.
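To make the nesting concrete, the sketch below walks a result list and keeps the highest-confidence alternative of each result. The helper name `bestAlternatives` and the shape of the returned objects are hypothetical. It relies only on the result list and each result being array-like (indexable, with a `length`), which also lets it run on plain arrays.

```javascript
// Hypothetical helper: for every result in an array-like result list,
// returns the alternative with the highest confidence score.
function bestAlternatives(results) {
  const picked = [];
  for (let i = 0; i < results.length; i++) {
    const result = results[i]; // one recognized segment
    let best = null;
    for (let j = 0; j < result.length; j++) {
      const alternative = result[j]; // carries { transcript, confidence }
      if (best === null || alternative.confidence > best.confidence) {
        best = alternative;
      }
    }
    if (best !== null) {
      picked.push({
        transcript: best.transcript,
        confidence: best.confidence,
        isFinal: Boolean(result.isFinal), // false while a result is interim
      });
    }
  }
  return picked;
}
```

Inside a recognition `onresult` handler this would be called as `bestAlternatives(event.results)`. More than one alternative per result only appears when `maxAlternatives` is set above 1.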
#6 about 3 minutes
Key challenges limiting Web Speech API adoption
The API's adoption is hindered by significant issues including poor developer experience, privacy risks from cloud processing, no offline support, and inconsistent browser implementations.
#7 about 3 minutes
A look inside the browser's implementation of speech recognition
An analysis of the Chromium source code reveals how the Web Speech API is implemented through layers that manage and dispatch recognition tasks to either remote cloud services or local OS-dependent engines.
#8 about 5 minutes
The future of VUIs with Stanford's React Genie
Stanford's React Genie project offers a new paradigm by loosely coupling a voice agent with React state, allowing for complex voice commands that can manipulate off-screen content and application logic.
#9 about 1 minute
Final verdict on the web's readiness for voice UIs
The current Web Speech API is suitable for experimentation but not reliable enough for production use. Promising research, however, points to a more capable future for web-based voice interfaces.