Author Information

Brian Kardell
  • Developer Advocate at Igalia
  • Original Co-author/Co-signer of The Extensible Web Manifesto
  • Co-Founder/Chair, W3C Extensible Web CG
  • Member, W3C (OpenJS Foundation)
  • Co-author of HitchJS
  • Blogger
  • Art, Science & History Lover
  • Standards Geek

You Don't Say: Web Speech APIs Part II

This is part of a series about making the browser speak and listen to speech. In my last post, "Greetings, Professor Falken: Web Speech APIs Part I", I talked about the existing APIs we "have" for speaking (and why the air quotes), and the many ways they are wonky today. In this post I'll share some of my own opinions on that API, as well as how I'm dealing with the speaking part IRL.

What I did

While I think we need to drill down, I do believe there is room for high-level features that are super easy to use for various cases. The original proposals from Google talked about declarative ways to integrate TTS and voice recognition into HTML. That's kind of interesting and, fortunately, Custom Elements allow us to experiment in that arena.

Simple input...


<x-voice-listener>
    <input name="input">
</x-voice-listener>
                

That's pretty easy. It uses progressive enhancement: the input element is the target of listening whenever the element happens to be listening, and the Shadow DOM provides an accessible button for toggling that. The element isn't itself an input, but rather a decoration around one.

The element has a ._voiceListener property, which exposes an API for controlling how it works. In this case, the API is called a PauseableVoiceListener, which has the methods .pause(), .unpause(), and .pauseWhile(Promise).
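
To make that API shape concrete, here is a rough sketch of how an object with those three methods might behave. This is not the actual implementation behind the element. The `paused` getter, the internal counter, and the `wait` helper are all assumptions made up for illustration, and the real recognition wiring is left out entirely.

```javascript
// Hypothetical sketch only: the method names come from the post,
// everything else (counter, `paused` getter) is an assumption.
class PauseableVoiceListener {
  constructor() {
    // Count outstanding pause requests so that overlapping
    // pauseWhile() calls don't resume listening too early.
    this._pauseCount = 0;
  }

  // True while at least one pause request is outstanding.
  get paused() {
    return this._pauseCount > 0;
  }

  pause() {
    this._pauseCount += 1;
  }

  unpause() {
    if (this._pauseCount > 0) {
      this._pauseCount -= 1;
    }
  }

  // Pause until the given promise settles (fulfilled or rejected),
  // then resume. The returned promise settles the same way.
  pauseWhile(promise) {
    this.pause();
    return Promise.resolve(promise).finally(() => this.unpause());
  }
}

// Usage: stop listening while some other asynchronous work runs.
// `wait` is a made-up helper standing in for that work.
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
const listener = new PauseableVoiceListener();
listener.pauseWhile(wait(50)); // listener.paused is true until the timer fires
```

Counting pause requests rather than flipping a boolean is just one possible design choice; it keeps two overlapping `pauseWhile()` calls from un-pausing each other.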

What about


<x-voice-listener><input name="input" ></x-voice-listener>