You Don't Say: Web Speech APIs Part II
This is part of a series about making the browser speak and listen to speech. In my last post, Greetings, Professor Falken: Web Speech APIs Part I, I talked about the existing APIs we "have" for speaking (and why the air quotes) and the many ways they are wonky today. In this post I'll share some of my own opinions on that API, as well as how I'm dealing with the speaking part IRL.
What I did
While I think we need to drill down, I do believe there is room for high-level features that are super easy to use for various cases. The original proposals from Google talked about declarative ways to integrate TTS and voice recognition into HTML. That's kind of interesting and, fortunately, Custom Elements allow us to experiment in that arena.
Simple input...
<x-voice-listener>
<input name="input">
</x-voice-listener>
That's pretty easy. It uses the concept of progressive enhancement to say that the input element is the target of listening whenever the listener happens to be listening, and, in the Shadow DOM, it provides an accessible button for toggling that. The element isn't itself an input, but rather a decoration around one.
The element has a ._voiceListener property, which exposes an API about how it works. In this case, the API is called a PauseableVoiceListener, which has the methods .pause(), .unpause(), and .pauseWhile(Promise).
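To make that shape a little more concrete, here is a minimal sketch of what a pauseable listener could look like. Only the names PauseableVoiceListener, pause, unpause and pauseWhile come from the description above. The constructor argument (any object with start() and stop() methods), the paused getter, and the idea of counting nested pauses are my assumptions for illustration, not necessarily how the real element works.

```javascript
// Hypothetical sketch. Only the class and method names come from the post;
// the recognizer shape, the pause counter, and `paused` are assumptions.
class PauseableVoiceListener {
  // `recognizer` is assumed to be any object with start() and stop() methods.
  constructor(recognizer) {
    this._recognizer = recognizer;
    this._pauseCount = 0; // how many callers currently want listening paused
  }

  // True while at least one pause is outstanding.
  get paused() {
    return this._pauseCount > 0;
  }

  // Stop listening; nested pauses only stop the recognizer once.
  pause() {
    this._pauseCount += 1;
    if (this._pauseCount === 1) {
      this._recognizer.stop();
    }
  }

  // Undo one pause; listening resumes when the last pause is released.
  unpause() {
    if (this._pauseCount === 0) {
      return; // nothing to undo
    }
    this._pauseCount -= 1;
    if (this._pauseCount === 0) {
      this._recognizer.start();
    }
  }

  // Pause until the given promise settles (fulfilled or rejected), then
  // unpause. Returns a promise that settles the same way as the input.
  pauseWhile(promise) {
    this.pause();
    return Promise.resolve(promise).finally(() => this.unpause());
  }
}
```

The obvious use for something like pauseWhile is to stop listening while the page itself is talking: pass it a promise that settles when speech output finishes, and the listener resumes afterwards without hearing itself.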
What about
<x-voice-listener><input name="input" ></x-voice-listener>