Speaking of cool Web stuff...

The Web Speech APIs

"The Plan"...

  • Efforts to bring speech to the Web / Current state of standardization
  • Code examples / walkthrough of APIs (with demos)
  • Discussion of what's good and bad about the APIs
  • What's next?
Today -26 years
  • 1991: The Web

  • October 1, 1994: W3C Founded

  • December 15, 1994: Netscape Released

  • Microsoft didn't really enter the picture in a serious way until ~1996

Today -20 years
<object ID="AgentControl" width="0" height="0" CLASSID="clsid:D45FD31B-5C6E-11D1-9EC1-00C04FD7081F" CODEBASE="http://server/path/msagent.exe#VERSION=2,0,0,0">
</object>

But wow.... 1997!!!

A lot of people were thinking about speech on the Web...

Today -19 years

1998: CSS2

Aural Stylesheets!!!

Today -18 years

Like HTML, but for voice...

  • March 1999: VoiceXML Forum
    AT&T Corporation, IBM, Lucent, and Motorola
  • Handed over to W3C in 2000
    Ask Jeeves, AT&T, Avaya, BT, Canon, Cisco, France Telecom, General Magic, Hitachi, HP, IBM, isSound, Intel, Locus Dialogue, Lucent, Microsoft, Mitre, Motorola, Nokia, Nortel, Nuance, Philips, PipeBeach, SpeechWorks, Sun, Telecom Italia, Tellme.com, and Unisys
It's happening!

OMG!

Today -17 years

So many XMLs

You get an XML, and you get an XML, and...

None of these made it to the browser...

Standards

Community Groups vs Working Groups

Today -7 years

2010: A Decade After VoiceXML...

HTML Speech Incubator Group (XG)

Some of us at Google have been working on extending HTML elements with speech...

They got a lot more than they bargained for...

But more than that: they got actual competing draft proposals from Google, Microsoft, Mozilla, and Voxeo as well

It didn't.

Today -5 years

2012: Another Community Group!

Today -4 years

June 10, 2013: Extensible Web Manifesto

OK... so let's talk about where we are now...

The Web: Present Day

  • Implementations are buggy / inconsistent
  • Implementations and bugs are low priority
  • The APIs aren't super great
  • There is no W3C standard, official draft or WG

But wait...

Let's go to the details!

window.speechSynthesis

A top-level object
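Since it hangs off `window` (and support still varies), a minimal sketch of feature-detecting it before use, assuming nothing beyond the standard global:

```javascript
// Returns true only when the Web Speech synthesis API is available.
// In non-browser environments (no window), this simply reports false.
const hasSpeechSynthesis = () =>
  typeof window !== 'undefined' && 'speechSynthesis' in window;

if (hasSpeechSynthesis()) {
  window.speechSynthesis.speak(
    new SpeechSynthesisUtterance('It exists!')
  );
}
```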

speechSynthesis.speak(...)
SpeechSynthesisUtterance

An utterance is the thing that the synthesizer "speaks"

speechSynthesis.speak(
  new SpeechSynthesisUtterance(
    'Hello Darkness, my old friend'
  )
)
.pitch, .rate, .volume

0-2, 0.1-10, 0-1

let utterance = new SpeechSynthesisUtterance(
  `dude...setting the pitch,
  rate and volume is easy,
  but really weird.`
)
utterance.pitch = 0.1  // so low
utterance.rate = 0.5   // half speed
utterance.volume = 0.9 // a little quieter than normal
speechSynthesis.speak(utterance)

Expressiveness

  • Voices
  • ...?...
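Voices are basically the one expressive knob you get. A hedged sketch of picking one by language: `speechSynthesis.getVoices()` can return an empty list until `voiceschanged` fires (a common gotcha), and `pickVoice` here is just an illustrative helper, not part of the API:

```javascript
// Illustrative helper: prefer an exact BCP 47 match ('it-IT'),
// fall back to any voice in the same language ('it-*'), else null.
const pickVoice = (voices, lang) =>
  voices.find(v => v.lang === lang) ||
  voices.find(v => v.lang.startsWith(lang.split('-')[0])) ||
  null;

// Browser usage (guarded so this is inert elsewhere):
if (typeof speechSynthesis !== 'undefined') {
  speechSynthesis.addEventListener('voiceschanged', () => {
    let utterance = new SpeechSynthesisUtterance('Ciao!');
    utterance.voice = pickVoice(speechSynthesis.getVoices(), 'it-IT');
    speechSynthesis.speak(utterance);
  });
}
```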
Today -78 years!

Bell Labs demonstrated electronic voice synthesis
(the Voder) at the World's Fair in 1939

Speech, at the end of the day, is full of really, really hard problems.

If you find this interesting, I wrote a whole
piece on The History of Speech

.lang

let apollonia = new SpeechSynthesisUtterance(
   `io so l'inglese:
      Monday Tuesday Thursday Wednesday
      Friday Sunday Saturday
    `
  )
apollonia.pitch = 1.1
apollonia.lang = 'it-US'
speechSynthesis.speak(apollonia)
Utterance events

.onstart, .onend, .onerror

let outEl = document.querySelector('#zepplin-out'),
    utteranceOne = new SpeechSynthesisUtterance(
       `We come from the land of the ice and snow`
    ),
    utteranceTwo = new SpeechSynthesisUtterance(
      `From the midnight sun where the hot springs flow`
    ),
    syncUIHandler = (event) => {
       outEl.innerText = event.target.text
    }

  utteranceOne.onstart = syncUIHandler
  utteranceTwo.onstart = syncUIHandler

  utteranceTwo.onend = () => {
    outEl.innerText = 'Ahh! Ahh!... Any questions?'
  }

speechSynthesis.speak(utteranceOne)
speechSynthesis.speak(utteranceTwo)

Err...

.speak() is async: it just queues the utterance, so calling it doesn't mean "started speaking"

// pauses processing of the queue
// utterances have a corresponding onpause
speechSynthesis.pause()

// resumes processing of the queue
// utterances have a corresponding onresume
speechSynthesis.resume()

// empties the queue, no effect on paused state
speechSynthesis.cancel()
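Since the queue exposes a `paused` flag alongside `pause()`/`resume()`, a play/pause toggle falls out naturally. A minimal sketch, written against any object with that shape (so the logic can be exercised outside the browser):

```javascript
// Toggle playback on a speechSynthesis-shaped object
// ({ paused, pause(), resume() }). Returns the new paused state.
const togglePause = (synth) => {
  if (synth.paused) {
    synth.resume();  // picks the queue back up where it left off
  } else {
    synth.pause();   // freezes the queue mid-utterance
  }
  return synth.paused;
};

// In a browser: togglePause(window.speechSynthesis)
```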

Now, speak!